Roam is a research lab building its first social product: a mobile game builder that lets anyone create multiplayer games in minutes. We run targeted bounties to solve complex technical challenges that unlock scalability and extensibility in our stack.
This bounty focuses on building a text-to-3D placement system, with a budget of at least $8,000.
Modern AI models can generate assets, but the bigger challenge is placement: turning freeform user prompts into cohesive, playable, and fun 3D scenes. We want a system that can take any input prompt, from highly descriptive (“build me a Subway Surfers-like game”) to vague (“make an exploration game”), and produce a structured, well-placed scene layout.
This system should:
Our target users are Gen Z and Gen Alpha creators. They often give short, vague, or playful prompts. The system should turn even minimal input into engaging and believable game layouts.
Each generated scene should output placement data in a Unity-compatible schema:
Object:
{
object: name,
x: ,
y: ,
z: ,
size_x: ,
size_y: ,
size_z: ,
rotation_x: ,
rotation_y: ,
rotation_z:
}
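As a rough illustration, the schema above could be typed and checked along the following lines. The field names come from the schema; everything else is an assumption made for this sketch: that all fields other than object are finite numbers, that sizes must be positive, and that rotations are Euler angles in degrees. The helper names validatePlacement and isPlacement are hypothetical.

```typescript
// Field names follow the schema above. The numeric types, the positive-size
// rule, and the degrees convention for rotations are assumptions.
interface Placement {
  object: string; // type tag, e.g. "tree"
  x: number;
  y: number;
  z: number;
  size_x: number;
  size_y: number;
  size_z: number;
  rotation_x: number; // assumed: Euler angle in degrees
  rotation_y: number;
  rotation_z: number;
}

const TRANSFORM_FIELDS = ["x", "y", "z", "rotation_x", "rotation_y", "rotation_z"] as const;
const SIZE_FIELDS = ["size_x", "size_y", "size_z"] as const;

function isFiniteNumber(value: unknown): value is number {
  return typeof value === "number" && isFinite(value);
}

// Returns a list of problems; an empty list means the record passed.
function validatePlacement(raw: unknown): string[] {
  if (typeof raw !== "object" || raw === null) {
    return ["placement must be an object"];
  }
  const rec = raw as Record<string, unknown>;
  const errors: string[] = [];

  const tag = rec["object"];
  if (typeof tag !== "string" || tag.length === 0) {
    errors.push("object must be a non-empty string");
  }
  for (const field of TRANSFORM_FIELDS) {
    if (!isFiniteNumber(rec[field])) {
      errors.push(field + " must be a finite number");
    }
  }
  for (const field of SIZE_FIELDS) {
    const value = rec[field];
    if (!isFiniteNumber(value) || value <= 0) {
      errors.push(field + " must be a positive number");
    }
  }
  return errors;
}

// Type guard built on the validator, for narrowing parsed JSON.
function isPlacement(raw: unknown): raw is Placement {
  return validatePlacement(raw).length === 0;
}
```

Running a check like this on every generated object before visualization would surface missing or non-numeric fields early, rather than as misplaced blocks in the preview.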
This placement data will be visualized using Three.js with primitive blocks tagged by type (e.g., “tree”, “coin”, “altar”). No custom 3D assets are required — primitives are sufficient.
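One possible way to prepare placement data for a primitive-block preview is sketched below. It converts each record into a plain descriptor (tag, color, position, size, rotation in radians) that a renderer can consume. The color palette, the fallback color, the assumption that rotations arrive in degrees, and the names BoxDescriptor and toBoxDescriptor are all hypothetical. The sketch deliberately avoids importing Three.js and does not attempt any coordinate-handedness conversion.

```typescript
interface Placement {
  object: string;
  x: number;
  y: number;
  z: number;
  size_x: number;
  size_y: number;
  size_z: number;
  rotation_x: number; // assumed: degrees
  rotation_y: number;
  rotation_z: number;
}

// Plain data describing one primitive block; no renderer types involved.
interface BoxDescriptor {
  tag: string;
  color: number; // hex color, e.g. 0x2e8b57
  position: [number, number, number];
  size: [number, number, number];
  rotationRadians: [number, number, number];
}

// Hypothetical palette: only the tag names come from the text above.
const TAG_COLORS: Record<string, number> = {
  tree: 0x2e8b57,
  coin: 0xffd700,
  altar: 0x8b8b8b,
};
const FALLBACK_COLOR = 0xcccccc;

function degToRad(deg: number): number {
  return (deg * Math.PI) / 180;
}

function toBoxDescriptor(p: Placement): BoxDescriptor {
  // Own-property check so tags like "constructor" fall back to the default.
  const known = Object.prototype.hasOwnProperty.call(TAG_COLORS, p.object)
    ? TAG_COLORS[p.object]
    : undefined;
  return {
    tag: p.object,
    color: known !== undefined ? known : FALLBACK_COLOR,
    position: [p.x, p.y, p.z],
    size: [p.size_x, p.size_y, p.size_z],
    rotationRadians: [
      degToRad(p.rotation_x),
      degToRad(p.rotation_y),
      degToRad(p.rotation_z),
    ],
  };
}
```

In a Three.js scene, each descriptor would typically become a BoxGeometry built from size, a material using color, and a mesh whose position and rotation are set from the remaining fields. Three.js expects Euler angles in radians, which is why the conversion happens here. Unity uses a left-handed coordinate system and Three.js a right-handed one, so a faithful preview may also need an axis flip, which this sketch leaves out.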