Roam Bounty Program #019 – $5000

Overview

Roam is a research lab building its first social product: a mobile game builder that lets anyone create multiplayer games in minutes. We run targeted bounties to solve complex technical challenges that unlock scalability and extensibility in our stack.

This bounty focuses on building a text-to-audio generation system for games.


Problem Statement

Modern game creation tools let anyone design visuals quickly, but audio (background music and sound effects) remains a bottleneck. Players expect audio that matches the vibe, timing, and feel of a game. Static libraries of sound files don’t provide the flexibility needed for dynamic, player-driven content.

We want a system that can generate high-quality, perfectly loopable background music and unique, optimized SFX clips directly from a text prompt. The system should:

  1. Match Vibe and Context: Understand the style and emotional tone of the game (e.g., “retro cyberpunk arcade”, “calm mountain temple”).
  2. Generate Event-Aware Sounds: Ensure timing matches gameplay events (e.g., a jump sound lasts no longer than the jump itself, and a punch impact syncs with its animation).
  3. Produce Clean Audio: Sounds must begin at the very start of the clip (no leading silence or fade-in unless explicitly requested) and be well compressed for mobile and web delivery.
  4. Handle Both Music and SFX: Support dynamic background tracks as well as individual action-based effects.
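
Requirements 2 and 3, along with the loopability goal, lend themselves to simple automated checks on generated clips. The following is a minimal sketch only, not part of the bounty specification. It assumes a clip is available as a sequence of float samples in the range -1.0 to 1.0. The function names, the silence threshold, and the seam metric are all hypothetical choices made for illustration.

```python
from typing import Sequence


def trim_leading_silence(samples: Sequence[float], threshold: float = 1e-3) -> list[float]:
    """Drop samples before the first one whose magnitude exceeds the threshold.

    The threshold value is an assumption; a real pipeline would tune it.
    """
    for i, s in enumerate(samples):
        if abs(s) > threshold:
            return list(samples[i:])
    return []  # the clip is entirely silent


def fits_event(samples: Sequence[float], sample_rate: int, event_seconds: float) -> bool:
    """Return True if the clip is no longer than the gameplay event it accompanies."""
    return len(samples) / sample_rate <= event_seconds


def loop_seam_jump(samples: Sequence[float]) -> float:
    """Amplitude jump between the last and first sample.

    A large value suggests an audible click when the track loops. This is a
    crude proxy, not a full test of seamless looping.
    """
    if not samples:
        return 0.0
    return abs(samples[0] - samples[-1])


if __name__ == "__main__":
    sample_rate = 8000
    # 0.1 s of silence followed by 0.3 s of sound (synthetic data)
    clip = [0.0] * 800 + [0.5] * 2400
    trimmed = trim_leading_silence(clip)
    print(len(trimmed) / sample_rate)             # duration after trimming, in seconds
    print(fits_event(trimmed, sample_rate, 0.35))  # does it fit a 0.35 s jump?
    print(loop_seam_jump(trimmed))
```

Checks like these could run as a post-processing gate after generation, rejecting or repairing clips before they reach the game. Sample-level comparison at the seam catches only discontinuities; musical continuity across the loop point would need a stronger measure.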

Context and References

Current Gaps in Audio Generation

Scope of Deliverable