Paper Club

OS-Genesis: Reverse Task Synthesis

Scaling UI agents is currently throttled by the cost of human time spent recording demonstrations. OS-Genesis took a much more scalable path by using Reverse Task Synthesis. Instead of recording a user completing a task, the authors started from a terminal state and worked backwards to hypothesize the intent.

The intuition here is that it’s easier to verify a goal if you already have the state. By starting with the app state and generating the task description, they hit a 91% success rate on trajectory generation.
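To make the direction of the idea concrete, here is a minimal sketch of deriving a task from a trajectory that already exists, rather than the other way round. It is not the paper's implementation: `Transition`, `describe_step`, `synthesize_task`, and `is_consistent` are hypothetical names, and a template string stands in for whatever model would actually write the descriptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    """One recorded exploration step: screen before, action taken, screen after."""
    before: str
    action: str
    after: str


def describe_step(t: Transition) -> str:
    # Hypothetical stand-in for a model call that labels a step after the fact.
    return f"{t.action} to go from '{t.before}' to '{t.after}'"


def is_consistent(steps: list[Transition]) -> bool:
    # Cheap sanity filter: each step must start where the previous one ended.
    return all(a.after == b.before for a, b in zip(steps, steps[1:]))


def synthesize_task(steps: list[Transition]) -> dict:
    """Write the task description from the states that were actually reached."""
    if not steps:
        raise ValueError("need at least one recorded step")
    goal = f"Starting at '{steps[0].before}', reach '{steps[-1].after}'"
    return {
        "task": goal,
        "instructions": [describe_step(s) for s in steps],
        "trajectory": steps,
    }


steps = [
    Transition("Home", "tap Settings", "Settings"),
    Transition("Settings", "tap Wi-Fi", "Wi-Fi"),
]
example = synthesize_task(steps)
```

Because the goal is written from the final state that was actually observed, the task is reachable by construction, which is why checking it is cheaper than checking a task invented up front.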

The jump on AndroidWorld is the part that stood out to us the most. Going from 15.3% to 31.8%, more than double, just by augmenting with synthetic data suggests we’re nowhere near the ceiling for what small models can do if the data diversity is high enough.

For our work on computer-use agents (CUAs), the reverse synthesis logic is a potential fix for the ground truth problem. If you can automate trajectory generation, you can train on environments that humans haven’t manually mapped out yet.

At Vibrant Labs, we’re very interested in how synthetic trajectory data will define the next generation of CUAs. If you can automate the training data, you can automate the agent.
