2026 · Active — this is the problem I am taking on next
Experience understanding: agents as sensors
Every content-understanding system I have worked on read content that holds still. A video is the same file every time you analyze it. An ad creative is a fixed set of pixels, audio, and a destination. Interactive experiences — the unit of content on Roblox — do not hold still. An experience is a program. What it contains is not fully knowable from its assets, scripts, and metadata, because much of its content only exists when someone plays it: states that unlock, paths that open, behavior that emerges from interaction.
Static analysis therefore has a ceiling, and the decisions stacked on top of the read — ads suitability, age appropriateness, discovery, safety — inherit that ceiling. The direction I am pursuing: put the observer inside the content. Train and run agents in the same sandbox the experience runs in, let them explore — navigate, interact, trigger states, play — and collect artifacts as they go: screenshots, transcripts, event traces, interaction logs. Then distill those artifacts into what I think of as an experience understanding record: a structured, versioned description of what an experience actually contains and does, that downstream systems can consume the way they consume a video-understanding signal today.
The open problems are what make this research-shaped rather than just engineering:
- Coverage versus cost. How much play is enough? An exploration policy is a budget decision — random walks are cheap and shallow, goal-directed agents are expensive and biased toward what they were trained to find.
- Evaluating the record itself. The record is a model output, which means it needs its own evals. What is ground truth for "what this experience contains" — human playthroughs? At what sample rate, and for which slices of the catalog?
- Adversarial experiences. Content that wants to misbehave can try to detect automated observers and behave differently for them. The cat-and-mouse from ads moderation reappears one level up.
- Staleness. Experiences update continuously. A record is a snapshot; the refresh policy is as much a part of the system as the agent.
- The schema as a contract. Whatever the record contains becomes the interface every downstream team builds against. Getting it wrong is expensive in the way all platform-contract mistakes are expensive.
What makes this one different from every content-understanding system I've built before: the observer has to be an agent. An encoder can read a video. Only something that can act — navigate, interact, make choices under uncertainty — can read a world.