Ray Fernando·March 29, 2026·3h 48m

The Hackathon Where Humans Can't Code

TL;DR

This hackathon penalizes humans for coding. At the San Francisco Ralphathon, hosted at the Weights & Biases office, teams spent the first hours writing specs, then had to go hands-off while AI agents coded; anyone who touched a laptop had to request a “lobster,” which bought just 10 minutes of manual intervention and came with a penalty.
The standout demos were self-monitoring agent systems, not toy apps. Benjamin’s “Agent Forge” built an agent harness and dashboard in 78 minutes, tracked 30 features, logged every commit, had Codex write the code and Sonnet score its quality, and cost just $95 total to run.
A lot of the energy came from Korea’s ultra-locked-in AI builder culture. The organizers from Team Attention started six months ago in Seoul, said their Korean event had 100% attendance, and described participants launching agents, going to sleep, and checking results 8–9 hours later.
The format flips the normal hackathon social contract. Instead of everyone silently grinding, people were expected to launch a Ralph loop and then actually talk to each other, which Ray and Alex kept calling the weirdest and coolest part of the event.
The projects showed how far looping agents can go with good specs. Teams built a Twitch clip generator for gamers that uses Gemini for transcript and sentiment analysis, a universal memory MCP server pulling from Discord, WhatsApp, and email, and an AI-agent marketplace modeled on Upwork for OpenClaw-style workers.
Alex Vulov framed the bigger trend: memeability and UX, not just model quality, are deciding what wins. He argued OpenClaw spread in China because it’s visual, sticky, and token-hungry, then tied that same “mimetic potential” to Ralph loops, while also riffing on Gemini Flash Live, AGI debates, and agent observability via Weights & Biases Weave.

The Breakdown

A San Francisco hackathon where touching your laptop gets you punished

Ray opens from the Weights & Biases office with Alex Vulov, then quickly gets to the gimmick that makes this event different: teams write specs, launch AI agents, and keep their hands off the keyboard. The organizers from Korea’s Team Attention explain that in Seoul, people literally kicked off agents, went to sleep, and checked results 8–9 hours later — which immediately sets the tone for how far this “humans can’t code” idea is being pushed.

The “lobster penalty” turns agent autonomy into a game

The scoreboard makes the whole thing feel half hackathon, half sport: every team submission is listed publicly, and anyone who touched their laptop had to request a “lobster,” got 10 minutes of manual intervention, and took a penalty. Alex ties it back to Geoffrey Huntley’s “Ralph loop” idea: write the spec, send the loop through the tasks one by one, and hope the product works without you needing to babysit it.
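
The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the harness any team at the event actually used: `Task`, `ralph_loop`, and `fake_agent` are hypothetical names, and the stand-in agent simply succeeds on its second attempt so the example runs without any model calls.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    name: str            # one line item from the spec
    done: bool = False   # set to True once the agent reports success
    attempts: int = 0    # how many times the loop has handed this task out


def ralph_loop(
    tasks: List[Task],
    run_agent: Callable[[Task], bool],
    max_iterations: int = 50,
) -> int:
    """Hand the first unfinished task to the agent, one task per iteration,
    until every task is done or the iteration budget runs out.
    Returns the number of iterations used."""
    iterations = 0
    while iterations < max_iterations:
        pending = [t for t in tasks if not t.done]
        if not pending:
            break
        task = pending[0]
        task.attempts += 1
        task.done = run_agent(task)
        iterations += 1
    return iterations


def fake_agent(task: Task) -> bool:
    # Hypothetical stand-in for a real coding-agent call:
    # it "fails" the first attempt and succeeds on the second.
    return task.attempts >= 2


if __name__ == "__main__":
    spec = [Task("scaffold app"), Task("add login"), Task("write tests")]
    used = ralph_loop(spec, fake_agent)
    print(used, [t.done for t in spec])
```

The iteration cap is the design choice worth noting: with no human at the keyboard, a budget is what keeps a stuck task from looping forever.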

Agent Forge steals the room with a dashboard that watches the agent build itself

One of the strongest demos comes from Benjamin, whose project “Agent Forge” built an autonomous coding harness plus a dashboard to visualize every iteration. It handled 30 features, logged 30 commits, scored each one with Sonnet while Codex wrote the code, tracked time and token cost per feature, showed where it got stuck, and came in at a total run cost of $95. Ray is visibly blown away because the thing didn’t just build a product — it built observability for its own process.
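
To make the per-feature tracking concrete, here is a rough sketch of the kind of record such a dashboard could keep. Everything in it is an assumption for illustration: the `FeatureRun` fields, the 0-10 score scale, the retry threshold, and the sample numbers are invented and are not taken from Agent Forge.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FeatureRun:
    feature: str          # feature name from the spec
    commit: str           # commit hash the agent produced
    minutes: float        # wall-clock time spent on the feature
    cost_usd: float       # token cost attributed to the feature
    quality_score: float  # reviewer-model score (0-10 scale is an assumption)
    retries: int = 0      # extra attempts before the feature landed


def total_cost(runs: List[FeatureRun]) -> float:
    """Sum the per-feature token cost, rounded to cents."""
    return round(sum(r.cost_usd for r in runs), 2)


def stuck_features(runs: List[FeatureRun], min_retries: int = 3) -> List[str]:
    """Names of features that needed at least `min_retries` extra attempts."""
    return [r.feature for r in runs if r.retries >= min_retries]


def summary(runs: List[FeatureRun]) -> str:
    """One-line rollup of feature count, total cost, and average score."""
    avg = sum(r.quality_score for r in runs) / len(runs)
    return f"{len(runs)} features, ${total_cost(runs):.2f} total, avg quality {avg:.1f}"


# Made-up sample data, not figures from the demo.
sample_runs = [
    FeatureRun("login", "a1b2c3", minutes=2.5, cost_usd=3.0, quality_score=8.0),
    FeatureRun("dashboard", "d4e5f6", minutes=6.0, cost_usd=4.5, quality_score=6.0, retries=4),
]

if __name__ == "__main__":
    print(summary(sample_runs))
    print("stuck:", stuck_features(sample_runs))
```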

The vibe is equal parts chaos, fandom, and SF hackathon theater

Ray and Alex keep bumping into fans, repeat attendees, and people hauling absurd setups — including one guy with a MacBook and two portable displays in a hard case. They compare the event to Cerebral Valley’s polished hackathons, joke about the slight chaos of room assignments, and keep circling back to the surreal part: in this format, you’re supposed to socialize while your machine works, not hide and grind.

Alex zooms out: Gemini Flash Live, AGI discourse, and why memeable products spread

In one of the broader sidebars, Alex gives a fast-moving mini state-of-the-union on AI. Gemini Flash Live is the launch he’s most excited about because it’s a true omni model with built-in Google Search and near-real-time latency of around 300 milliseconds. He contrasts Jensen Huang saying AGI is basically here with ARC Prize’s tougher benchmark suggesting we’re nowhere close. Then he goes even more meta and argues that products like OpenClaw win because they have “mimetic potential”: they’re visual, sticky, and culturally legible, not just technically good.

More demos: memory systems, agent marketplaces, and game-stream clipping

The stream turns into a walking demo floor. Caleb shows “Memorial,” an MCP memory server that dumps Discord, WhatsApp, email, and chat context into one place and powers three showcase apps, all built through a Ralph loop; another founder pitches an Upwork-for-agents marketplace where OpenClaw-style workers compete for tasks. Then a Korean team demoes a live Twitch clip generator for gamers that uses Gemini to watch transcripts, comments, sentiment, and in-game events in real time — and they hilariously point it at Ray’s own live stream on the spot.
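
The "one place for all context" idea behind the memory server can be illustrated with a small in-memory store. This is only a sketch under assumptions: `MemoryItem`, `MemoryStore`, and the sample messages are hypothetical, the search is a plain substring match, and no MCP protocol calls are shown. A real server would expose something like `search` to agents through the protocol.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List


@dataclass(frozen=True)
class MemoryItem:
    source: str          # e.g. "discord", "whatsapp", "email"
    author: str
    text: str
    timestamp: datetime


class MemoryStore:
    """Hypothetical unified store: every source is normalized to MemoryItem."""

    def __init__(self) -> None:
        self._items: List[MemoryItem] = []

    def ingest(self, items: Iterable[MemoryItem]) -> None:
        """Add already-normalized items from any source."""
        self._items.extend(items)

    def search(self, query: str, limit: int = 5) -> List[MemoryItem]:
        """Case-insensitive substring search, newest matches first."""
        needle = query.lower()
        hits = [i for i in self._items if needle in i.text.lower()]
        hits.sort(key=lambda i: i.timestamp, reverse=True)
        return hits[:limit]


# Invented sample messages for illustration only.
store = MemoryStore()
store.ingest([
    MemoryItem("discord", "sam", "Demo starts at 5pm", datetime(2026, 3, 28, 16, 0)),
    MemoryItem("email", "lee", "Slides for the demo attached", datetime(2026, 3, 28, 17, 30)),
    MemoryItem("whatsapp", "kim", "Lunch?", datetime(2026, 3, 28, 12, 0)),
])

if __name__ == "__main__":
    for item in store.search("demo"):
        print(item.source, item.author, item.text)
```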

Korea’s builder culture becomes a character in the video

Again and again, the human story comes back to the Korean contingent: founders flying in together, an EO House in North Beach, kimchi at the event, and repeated comments about how disciplined and committed the group feels. Alex notes that Seoul had 100% attendance versus the 42% show rate he considers “great” in San Francisco, and Ray jokes that they basically brought all of Koreatown into the office.

Ray ends by showing he’s playing the same game

Late in the stream, Ray reveals his own side project: “WatchClaw,” an open-source control layer for Claude Code, OpenClaw, Codex, Cursor, and Droid, designed to monitor and manage agent sessions from an Apple Watch, phone, web app, or Mac menu bar. It’s classic Ray energy — a giant spec-heavy architecture plan, a refusal to open Xcode, and the core lesson of the whole event made explicit: the leverage now comes from defining the system well enough that the agents can do the rest.