
I VJ three nights a week in a Vegas nightclub, and I've been trying to get real-time AI video to integrate naturally into my workflow on stage.
There are three hurdles that I felt needed to be addressed before live performance integration was feasible:
Beat synchronization. The AI output has no relationship to the music. Frames arrive whenever the model finishes generating them — completely detached from the tempo of the set. Luckily, during this cohort the Scope developers implemented Ableton Link BPM integration, which made this less of a problem. I decided to build on that by integrating AlphaTheta's ProDJ Link protocol for Pioneer CDJ player integration — which opened up a lot more than just tempo sync, including pulling prompt data straight off the DJ's USB stick. More on that in a bit.
Model latency. Generation takes time, and that time isn't consistent. Even when the input is beat-mapped, the output drifts because frames don't come back on a predictable schedule.
Prompting. Going to the keyboard to type a prompt mid-set is a flow killer. I needed a way to feed the AI relevant prompts without ever taking my hands off my controller.
The centerpiece of this project is the solution to the second problem: a dynamic timecoded buffer.
AI video generation wasn't trained to be beat aware — but matching content to the music is my job anyway. Since I know my input going into the pipeline is already mapped to the beat, I can use VACE to augment and transform my visuals while still retaining some of that beat-mapped rhythm. The problem is that the output latency is variable. The AI doesn't return frames on a consistent schedule, so even though the content carries some rhythmic DNA from the input, it arrives at unpredictable times. The sync drifts. Beat hits land late. The feel is off.
My initial solution was a timecoded barcode — a thin strip stamped at the bottom of every frame before it enters the AI pipeline, encoding the exact beat position and frame sequence with error correction. The idea was to mask that strip out of the generation step so the AI wouldn't paint over it, then decode it on the other side to know exactly where each frame belongs on the beat grid.
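The stamp-and-decode idea can be sketched in a few lines of NumPy. This is an illustration, not my production encoder: the payload layout, strip height, and repetition code here are placeholder choices (a real strip would want a stronger ECC like Reed-Solomon and a sync pattern):

```python
import numpy as np

STRIP_HEIGHT = 8   # rows reserved at the bottom of each frame
BITS = 32          # payload: 16-bit beat index + 16-bit frame sequence
REPEAT = 3         # toy repetition code standing in for real error correction

def encode_payload(beat_idx: int, frame_seq: int) -> list[int]:
    """Pack beat position and frame sequence into a repeated bit list."""
    value = ((beat_idx & 0xFFFF) << 16) | (frame_seq & 0xFFFF)
    bits = [(value >> i) & 1 for i in range(BITS)]
    return [b for b in bits for _ in range(REPEAT)]

def stamp_barcode(frame: np.ndarray, beat_idx: int, frame_seq: int) -> np.ndarray:
    """Overwrite the bottom strip of the frame with a black/white barcode."""
    h, w = frame.shape[:2]
    bits = encode_payload(beat_idx, frame_seq)
    cell_w = w // len(bits)
    strip = frame[h - STRIP_HEIGHT:, :]
    strip[:] = 0
    for i, b in enumerate(bits):
        if b:
            strip[:, i * cell_w:(i + 1) * cell_w] = 255
    return frame

def decode_barcode(frame: np.ndarray) -> tuple[int, int]:
    """Read the strip back, majority-voting each repeated bit."""
    h, w = frame.shape[:2]
    n = BITS * REPEAT
    cell_w = w // n
    strip = frame[h - STRIP_HEIGHT:, :]
    raw = [int(strip[:, i * cell_w:(i + 1) * cell_w].mean() > 127) for i in range(n)]
    bits = [int(sum(raw[i * REPEAT:(i + 1) * REPEAT]) > REPEAT // 2) for i in range(BITS)]
    value = sum(b << i for i, b in enumerate(bits))
    return (value >> 16) & 0xFFFF, value & 0xFFFF
```

Majority voting over repeated bits is what lets the strip survive mild compression or blending, as long as the generation step is masked away from it.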
On the output side, decoding the barcode tells me exactly where each frame belongs on the beat grid, no matter when the AI decided to spit it out. Frames go into a buffer, and instead of displaying them the instant they arrive, the system holds them and releases them quantized to the beat. I can dial the delay in on my APC40, or let the system use lookahead to line up a visual change with a moment it knows is coming: a phrase change, a drop. The output is finally BPM-matched, not just the input.
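The hold-and-release logic reduces to a priority queue keyed on the frame's decoded beat position plus the operator-set delay. A minimal sketch, with my own class and method names rather than the plugin's actual internals:

```python
import heapq

class BeatQuantizedBuffer:
    """Holds decoded frames and releases them on the beat grid,
    regardless of when the model returned them."""

    def __init__(self, delay_beats: float = 1.0):
        self.delay_beats = delay_beats  # dialed in from the controller
        self._heap = []                 # (release_beat, frame_seq, frame)

    def push(self, frame, beat_idx: int, frame_seq: int) -> None:
        """Schedule a frame for its barcode beat plus the configured delay."""
        heapq.heappush(self._heap, (beat_idx + self.delay_beats, frame_seq, frame))

    def pop_due(self, current_beat: float) -> list:
        """Return every frame whose release beat has arrived, in grid order."""
        due = []
        while self._heap and self._heap[0][0] <= current_beat:
            _, _, frame = heapq.heappop(self._heap)
            due.append(frame)
        return due
```

Because frames are keyed by where they belong rather than when they arrived, a frame that comes back early simply waits, and out-of-order arrivals are re-sorted for free by the heap.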
Switching to a keyboard mid-set to type a prompt completely kills the energy. Voice would be nice in theory, but have you been in a nightclub? Way too loud for that. What I needed was something watching what I'm already doing and generating relevant prompts on its own — ones I can quick-select or let auto-apply based on rules I set up. So I got Qwen 2.5 VL 3B running locally on my laptop's 8GB GPU and fed it my Resolume outputs — both the main output and my preview channel. The model sees where my visual is now and where I'm taking it next, and writes prompts that bridge the gap. No keyboard, no mic. Just context-aware prompting running in the background while I perform.
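The "bridge the gap" request is just a two-image chat call to the local model. As a sketch, assuming Qwen 2.5 VL is served behind an OpenAI-compatible endpoint (e.g. via vLLM or llama.cpp) — the model name and instruction text here are illustrative, not my exact config:

```python
import base64

def build_bridge_request(main_png: bytes, preview_png: bytes) -> dict:
    """Build an OpenAI-style chat payload asking the VLM to write one
    prompt that transitions the live output toward the previewed look."""
    def img(png: bytes) -> dict:
        return {"type": "image_url",
                "image_url": {"url": "data:image/png;base64,"
                                     + base64.b64encode(png).decode()}}
    return {
        "model": "qwen2.5-vl-3b-instruct",  # whatever name the local server exposes
        "messages": [{
            "role": "user",
            "content": [
                img(main_png),     # what the crowd sees right now
                img(preview_png),  # where the set is headed next
                {"type": "text",
                 "text": "Image 1 is the live output, image 2 is the preview. "
                         "Write one short video-generation prompt that moves "
                         "the first look toward the second."},
            ],
        }],
        "max_tokens": 60,
    }
```

The returned text can be surfaced as a quick-select option or auto-applied by rule; the payload itself is small enough to rebuild every few seconds as both channels change.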
I implemented AlphaTheta's ProDJ Link protocol for BPM sync and phrase awareness — the system knows where the music is structurally, not just rhythmically. Then I realized the protocol exposes track metadata too, including the comment field. So I built it so DJs can embed prompts directly in their track comments. When that track loads on the CDJ, my app reads the metadata and folds those prompts into its logic automatically. The DJ's creative intent travels with their library — no setup at the venue, no coordination needed.
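Once the comment string arrives over ProDJ Link metadata (via the reverse-engineered protocol implementations, e.g. the beat-link or python-prodj-link projects), extracting the prompts is a small parsing step. The `vj[...]` tag convention below is a hypothetical example of such an embedding scheme, not an established standard:

```python
import re

# Hypothetical convention: DJs embed prompts in the track comment as
#   vj[prompt one | prompt two]
# alongside whatever else the comment already says.
PROMPT_TAG = re.compile(r"vj\[(.*?)\]", re.IGNORECASE)

def prompts_from_comment(comment: str) -> list[str]:
    """Extract embedded prompts from a CDJ track comment, if any."""
    prompts = []
    for tag in PROMPT_TAG.findall(comment):
        prompts.extend(p.strip() for p in tag.split("|") if p.strip())
    return prompts
```

Keeping the tag inside the existing comment field means the scheme degrades gracefully: software that doesn't know about it just shows a slightly odd comment, and everything else in the DJ's library workflow stays untouched.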
Three problems, one app. Beat sync through ProDJ Link and Scope's native BPM clock, a controllable buffer that compensates for AI latency, and vision-driven prompting that never breaks the flow. That's what it took to make real-time AI visuals actually work for live performance.
The Scope plugin is available for download now. The rest of VJ.Tools Realtime is still being finished up — check back in a few days.