Loka-1 on Physical Reasoning

Keon Kim and Krish Chelikavada

Loka-1 is now on Meta FAIR's Physical Reasoning from Video leaderboard.

The leaderboard combines three benchmarks: IntPhys 2, MVPBench, and CausalVQA. Together, they test whether a model can predict physical outcomes in video rather than only describe what is visible.

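| Model | IntPhys 2 | MVPBench | CausalVQA |
| --- | --- | --- | --- |
| Human baseline | 92.44 | 92.90 | 84.78 |
| Cosmos-Reason2-8B | 58.14 | 47.19 | 59.14 |
| Loka-1 | 50.00 | 50.02 | 50.44 |
| V-JEPA 2 | 56.40 | 44.50 | 44.89 |
| GPT-4o | 53.19 | 32.50 | 50.95 |
| Qwen2.5-VL | 49.12 | 36.70 | 49.05 |
| Gemini 2.5 Flash | 56.10 | - | 61.66 |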

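Among model submissions that report all three tasks, Loka-1 ranks second, behind Cosmos-Reason2-8B. Gemini 2.5 Flash has no MVPBench score, so it is not part of that comparison. Counting the human baseline, Loka-1 is third overall. On MVPBench alone, its 50.02 is the highest model score in the table.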
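The ranking rule is not spelled out here, so the sketch below assumes one plausible rule: an unweighted mean of the three scores, computed only for models that report every task. The scores are copied from the table. The `rank_by_mean` helper and the averaging rule are illustrative assumptions, not the leaderboard's documented method.

```python
from statistics import mean

# Scores copied from the table above, in the order
# (IntPhys 2, MVPBench, CausalVQA). None marks a task with no reported score.
SCORES = {
    "Cosmos-Reason2-8B": (58.14, 47.19, 59.14),
    "Loka-1": (50.00, 50.02, 50.44),
    "V-JEPA 2": (56.40, 44.50, 44.89),
    "GPT-4o": (53.19, 32.50, 50.95),
    "Qwen2.5-VL": (49.12, 36.70, 49.05),
    "Gemini 2.5 Flash": (56.10, None, 61.66),
}


def rank_by_mean(scores):
    """Rank models that report every task by unweighted mean.

    The unweighted mean is an assumed criterion, used here only to
    illustrate the arithmetic.
    """
    complete = {m: s for m, s in scores.items() if None not in s}
    averaged = [(m, round(mean(s), 2)) for m, s in complete.items()]
    return sorted(averaged, key=lambda pair: pair[1], reverse=True)


ranking = rank_by_mean(SCORES)
for place, (model, avg) in enumerate(ranking, start=1):
    print(f"{place}. {model}: {avg:.2f}")
```

Under that assumption the order is Cosmos-Reason2-8B (54.82), Loka-1 (50.15), V-JEPA 2 (48.60), GPT-4o (45.55), Qwen2.5-VL (44.96). The human baseline averages 90.04 on the same rule, which would place it first.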

Why this benchmark matters

Physical reasoning is a useful test because language fluency is not enough. A model can describe a scene and still fail to predict whether an object will fall, how a collision changes its direction, or whether an action causes a later outcome.

The leaderboard makes that gap explicit. IntPhys 2 focuses on intuitive physics in controlled scenes. MVPBench uses paired videos that look similar but require different answers. CausalVQA asks questions about counterfactuals, planning, anticipation, and likely outcomes in real-world video.
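To see why paired videos are harder to game than single clips, here is a minimal sketch of pair-level scoring. It assumes a pair counts as correct only when both of its videos are answered correctly. The post does not specify MVPBench's exact metric, so the function names and the toy data below are hypothetical.

```python
def per_video_accuracy(predictions, answers):
    """Fraction of individual videos answered correctly."""
    flat_pred = [p for pair in predictions for p in pair]
    flat_ans = [a for pair in answers for a in pair]
    hits = sum(1 for p, a in zip(flat_pred, flat_ans) if p == a)
    return hits / len(flat_ans)


def paired_accuracy(predictions, answers):
    """Fraction of pairs where BOTH videos are answered correctly.

    This scoring rule is an assumption made for illustration.
    """
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    hits = sum(1 for p, a in zip(predictions, answers) if p == a)
    return hits / len(answers)


# Toy data: each pair holds two similar-looking videos with opposite answers.
answers = [("A", "B"), ("B", "A"), ("A", "B"), ("B", "A")]

# A shortcut model that ignores the video and always answers "A".
always_a = [("A", "A")] * len(answers)

print(per_video_accuracy(always_a, answers))  # half of the single videos
print(paired_accuracy(always_a, answers))     # no pair is fully correct
```

In this toy case the shortcut model gets half of the individual videos right but no complete pair, so a pair-level score exposes a strategy that per-video accuracy would reward.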

What Loka-1 measures

Loka-1 is aimed at the part of world modeling that matters for action: prediction. The model has to use visual evidence to infer what happens next, not just label the current frame.

That is the same capability we care about in software agents. If an AI employee changes a system, clicks through a workflow, or monitors a production issue, it needs to anticipate consequences before users feel them.

Read the research note: Physical reasoning.

Citation

Please cite this work as:

Kim, Keon and Chelikavada, Krish, "Loka-1 on Physical Reasoning", Om Labs, Jun 2026.

Or use the BibTeX citation:

@article{kim2026loka1onphysicalreasoning,
  author = {Keon Kim and Krish Chelikavada},
  title = {Loka-1 on Physical Reasoning},
  journal = {Om Labs},
  year = {2026},
  note = {https://omlabs.xyz/blog/loka-1-physical-reasoning},
}