Roam Bounty Program #018 – $8000

Overview

Roam is a research lab building its first social product: a mobile game builder that lets anyone create multiplayer games in minutes. We run targeted bounties to solve complex technical challenges that unlock scalability and extensibility in our stack.

This bounty focuses on building a text-to-3D placement system, with a budget of at least $8000.


Problem Statement

Modern AI models can generate assets, but the bigger challenge is placement: turning freeform user prompts into cohesive, playable, and fun 3D scenes. We want a system that can take any input prompt, from highly descriptive (“build me a Subway Surfers-like game”) to vague (“make an exploration game”), and produce a structured, well-placed scene layout.

This system should:

  1. Interpret Prompts Semantically: Understand both descriptive and vague prompts, including relative placement instructions (e.g., “stairs to the right of the tower in the northeast corner of the map”).
  2. Generate Placements: Produce a set of objects, their coordinates, sizes, and rotations in Unity’s coordinate system (Vector3 positions, Quaternion rotations, Vector3 bounds).
  3. Visualize Results: Render the placement in a lightweight Three.js engine using primitives as placeholders.
  4. Support Iteration: Allow users to reprompt or modify existing placements in real time.
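
To make the relative-placement requirement concrete, here is a minimal sketch of how a phrase like “stairs to the right of the tower in the northeast corner of the map” might be resolved into coordinates once it has been parsed. Everything in it is an assumption for illustration: the type names, the helper functions cornerPosition and placeRelative, the convention that north is +Z and east is +X, and the choice that “right” means +X. None of these are defined by the bounty.

```typescript
// Hypothetical types and helpers for illustration; not part of the bounty spec.
interface Vec3 { x: number; y: number; z: number; }

interface Placed {
  object: string;
  position: Vec3; // center of the object; +X right, +Y up, +Z forward
  size: Vec3;     // full extent along each axis
}

type Direction = "left" | "right" | "front" | "behind";
type Corner = "northeast" | "northwest" | "southeast" | "southwest";

// Assumption: north = +Z, east = +X, map is a square centered on the origin.
const CORNER_SIGNS: Record<Corner, { east: number; north: number }> = {
  northeast: { east: 1, north: 1 },
  northwest: { east: -1, north: 1 },
  southeast: { east: 1, north: -1 },
  southwest: { east: -1, north: -1 },
};

// Center position for an object tucked into a corner, `margin` units from the map edge.
function cornerPosition(corner: Corner, mapSize: number, size: Vec3, margin: number): Vec3 {
  const reach = mapSize / 2 - margin;
  const sign = CORNER_SIGNS[corner];
  return {
    x: sign.east * (reach - size.x / 2),
    y: size.y / 2, // rest the object on the ground plane y = 0
    z: sign.north * (reach - size.z / 2),
  };
}

// Place a new object beside `anchor`, leaving `gap` units between their bounds.
function placeRelative(
  anchor: Placed, direction: Direction, object: string, size: Vec3, gap: number = 0.5
): Placed {
  const offsetX = anchor.size.x / 2 + gap + size.x / 2;
  const offsetZ = anchor.size.z / 2 + gap + size.z / 2;
  const delta: Record<Direction, Vec3> = {
    right: { x: offsetX, y: 0, z: 0 },
    left: { x: -offsetX, y: 0, z: 0 },
    front: { x: 0, y: 0, z: offsetZ },
    behind: { x: 0, y: 0, z: -offsetZ },
  };
  const d = delta[direction];
  return {
    object: object,
    size: size,
    position: { x: anchor.position.x + d.x, y: size.y / 2, z: anchor.position.z + d.z },
  };
}

// "stairs to the right of the tower in the northeast corner of the map"
const towerSize: Vec3 = { x: 4, y: 12, z: 4 };
const tower: Placed = {
  object: "tower",
  size: towerSize,
  position: cornerPosition("northeast", 50, towerSize, 5),
};
const stairs = placeRelative(tower, "right", "stairs", { x: 2, y: 3, z: 4 });
```

The sketch covers only the geometric step. A real system would still need the language-understanding step that extracts the anchor, direction, and corner from the prompt, plus checks for map bounds and overlap between objects.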

Context and References

Target Audience

Our target users are Gen Z and Gen Alpha creators. They often give short, vague, or playful prompts. The system should turn even minimal input into engaging and believable game layouts.

Scene Representation

Each generated scene should output placement data in a Unity-compatible schema:

Object:
{
  object: name,
  x: ,
  y: ,
  z: ,
  size_x: ,
  size_y: ,
  size_z: ,
  rotation_x: ,
  rotation_y: ,
  rotation_z:
}
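
As an illustration, the schema above could be typed and validated along these lines. The field names come from the schema; the value types, the reading of the three rotation fields as Euler angles in degrees, the rule that sizes must be positive, and the helper name parsePlacement are all assumptions, since the schema leaves the values unspecified.

```typescript
// Field names mirror the schema; types, units, and the helper name are assumptions.
interface PlacementObject {
  object: string;     // type tag, e.g. "tree", "coin", "altar"
  x: number;
  y: number;
  z: number;
  size_x: number;
  size_y: number;
  size_z: number;
  rotation_x: number; // assumed: Euler angle in degrees
  rotation_y: number; // assumed: Euler angle in degrees
  rotation_z: number; // assumed: Euler angle in degrees
}

// Validate untrusted input (for example, model output) before it reaches the renderer.
function parsePlacement(raw: unknown): PlacementObject {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("placement must be an object");
  }
  const record = raw as Record<string, unknown>;
  const name = record["object"];
  if (typeof name !== "string" || name.length === 0) {
    throw new Error("object must be a non-empty string");
  }
  // Read one numeric field, rejecting missing, non-numeric, NaN, and infinite values.
  const num = (field: string): number => {
    const value = record[field];
    if (typeof value !== "number" || !isFinite(value)) {
      throw new Error(field + " must be a finite number");
    }
    return value;
  };
  const placement: PlacementObject = {
    object: name,
    x: num("x"), y: num("y"), z: num("z"),
    size_x: num("size_x"), size_y: num("size_y"), size_z: num("size_z"),
    rotation_x: num("rotation_x"), rotation_y: num("rotation_y"), rotation_z: num("rotation_z"),
  };
  if (placement.size_x <= 0 || placement.size_y <= 0 || placement.size_z <= 0) {
    throw new Error("sizes must be positive");
  }
  return placement;
}

const coin = parsePlacement(JSON.parse(
  '{"object":"coin","x":1,"y":0.5,"z":-2,"size_x":0.5,"size_y":0.5,"size_z":0.1,' +
  '"rotation_x":0,"rotation_y":45,"rotation_z":0}'
));
```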

This placement data will be visualized using Three.js with primitive blocks tagged by type (e.g., “tree”, “coin”, “altar”). No custom 3D assets are required; primitives are sufficient.
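
One detail to handle when visualizing Unity-style data in Three.js is handedness: Unity uses a left-handed, Y-up coordinate system, while Three.js is right-handed and Y-up. The sketch below shows one common way to bridge the two by mirroring the Z axis. It is a sketch under stated assumptions rather than a required approach: the BoxDescriptor type and toThreeBox helper are hypothetical, the rotation fields are assumed to be Euler angles in degrees, and the "YXZ" rotation order is assumed to correspond to Unity's Euler convention (Z, then X, then Y). It deliberately avoids importing Three.js so that it stays self-contained, and it should be checked against real Unity output before being relied on.

```typescript
// Field names mirror the schema; the types and units are assumptions.
interface PlacementObject {
  object: string;
  x: number; y: number; z: number;
  size_x: number; size_y: number; size_z: number;
  rotation_x: number; rotation_y: number; rotation_z: number; // assumed Euler degrees
}

// Hypothetical renderer-facing shape: plain numbers that map onto a box primitive.
interface BoxDescriptor {
  tag: string;
  position: [number, number, number];
  size: [number, number, number];
  rotation: [number, number, number]; // radians
  rotationOrder: "YXZ";               // assumed match for Unity's Euler convention
}

const DEG_TO_RAD = Math.PI / 180;

// Convert a left-handed (Unity-style) placement into right-handed values by mirroring Z.
function toThreeBox(p: PlacementObject): BoxDescriptor {
  return {
    tag: p.object,
    // Mirroring Z flips the sign of the z coordinate only.
    position: [p.x, p.y, -p.z],
    // Extents are lengths, so they are unaffected by the mirror.
    size: [p.size_x, p.size_y, p.size_z],
    // Under a Z mirror, rotations about X and Y reverse direction; rotation about Z does not.
    rotation: [
      -p.rotation_x * DEG_TO_RAD,
      -p.rotation_y * DEG_TO_RAD,
      p.rotation_z * DEG_TO_RAD,
    ],
    rotationOrder: "YXZ",
  };
}

const treeBox = toThreeBox({
  object: "tree",
  x: 3, y: 1, z: 5,
  size_x: 1, size_y: 2, size_z: 1,
  rotation_x: 0, rotation_y: 90, rotation_z: 0,
});
```

In a Three.js scene, a descriptor like this could feed a BoxGeometry built from the three size values, with the mesh position set from position and the mesh rotation set from rotation together with rotationOrder. The tag could drive a per-type color or label so that placeholder blocks remain distinguishable.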