Roam Bounty Program #029 – $7000
Overview
www.roam.lol
Build an end-to-end video analysis pipeline that ingests mobile gameplay recordings and extracts comprehensive game semantics, including asset cataloging, behavior detection, and game objective categorization.
Core Challenge: Process 2-5 minute gameplay videos to create structured data about game assets, their behaviors, and overall game mechanics without manual annotation.
2. Core Task Definition
Build a Python system that:
- Ingests mobile gameplay videos (2-5 minutes each)
- Extracts and catalogs ALL game objects (excluding UI elements)
- Tracks individual assets across frames to avoid duplicate extractions
- Assigns semantic behaviors to detected assets
- Outputs structured JSON with assets, metadata, behaviors, and game analysis
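The brief does not fix an output schema, so the following is only a minimal sketch of what the structured JSON could look like. Every name here (`Asset`, `GameAnalysis`, `controller`, `objective`, `to_json`) is a hypothetical choice for illustration, not a required format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Tuple


@dataclass
class Asset:
    """One unique game object (hypothetical schema, one entry per tracked object)."""
    asset_id: int                      # persistent ID from the tracker
    label: str                         # e.g. "enemy", "coin", "platform"
    controller: str                    # assumed values: "player" or "ai"
    behaviors: List[str] = field(default_factory=list)
    first_frame: int = 0               # first frame where the asset was seen
    last_frame: int = 0                # last frame where the asset was seen
    bbox: Tuple[int, int, int, int] = (0, 0, 0, 0)  # x1, y1, x2, y2 of a sample crop


@dataclass
class GameAnalysis:
    """Top-level record for one video (hypothetical schema)."""
    video: str
    objective: str                     # e.g. "survival", "collection", "racing"
    assets: List[Asset] = field(default_factory=list)


def to_json(analysis: GameAnalysis) -> str:
    """Serialize the nested dataclasses to an indented JSON string."""
    return json.dumps(asdict(analysis), indent=2)
```

A record could then be built with `GameAnalysis(video="demo.mp4", objective="survival", assets=[...])` and written out with `to_json`. Keeping one `Asset` per tracker ID is what ties this output to the no-duplicates requirement.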
Technical Constraints:
- Must use object tracking to match asset instances across frames
- Must not create duplicate asset entries for the same object
- Must distinguish between player-controlled and AI-controlled objects
- May use any models (YOLO, SAM2, Florence, GroundingDINO, VLMs, transformers, etc.)
- Must ignore/filter UI elements (health bars, buttons, score displays)
- Free choice on deployment infrastructure
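To illustrate the tracking and no-duplicates constraints, here is a minimal sketch of a greedy IoU tracker that assigns a persistent ID to each detection, so that one object seen in many frames yields one asset entry. The brief does not prescribe a tracking method; the class name `GreedyIoUTracker`, the 0.3 threshold, and the `(x1, y1, x2, y2)` box format are all assumptions, and a real submission would likely use a stronger tracker with appearance features and track expiry.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: x1, y1, x2, y2


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


class GreedyIoUTracker:
    """Hypothetical tracker: matches detections to existing tracks by best IoU."""

    def __init__(self, iou_threshold: float = 0.3) -> None:
        self.iou_threshold = iou_threshold
        self.tracks: Dict[int, Box] = {}   # track ID -> last known box
        self._next_id = 0

    def update(self, detections: List[Box]) -> List[int]:
        """Return one track ID per detection, in the order detections were given."""
        # Score every (track, detection) pair and visit the best overlaps first.
        pairs = sorted(
            ((iou(box, det), tid, i)
             for tid, box in self.tracks.items()
             for i, det in enumerate(detections)),
            reverse=True,
        )
        assigned: Dict[int, int] = {}      # detection index -> track ID
        used_tracks = set()
        for score, tid, i in pairs:
            if score < self.iou_threshold:
                break                      # remaining pairs overlap too little
            if tid in used_tracks or i in assigned:
                continue                   # each track and detection matches once
            assigned[i] = tid
            used_tracks.add(tid)

        ids = []
        for i, det in enumerate(detections):
            tid = assigned.get(i)
            if tid is None:                # unmatched detection starts a new track
                tid = self._next_id
                self._next_id += 1
            self.tracks[tid] = det         # remember the latest position
            ids.append(tid)
        return ids
```

Calling `update` once per frame with that frame's non-UI detections gives stable IDs; the asset catalog can then be keyed on those IDs. In this simplified version tracks never expire, so an object that leaves and re-enters at a different position would get a new ID.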