
Scope4Mac - Daydream Scope on Apple Silicon (a n00b experiment)

I just got into Daydream as part of the hackathon. What started as "let me try Daydream on my Mac before switching to a 5090" escalated into a full MPS port with some novel features we're planning to bring back to Windows.

This is a fork of the official Scope distribution running natively on Apple Silicon. No CUDA, no cloud: just an M2 Max with 96GB of unified memory and Metal. It's experimental and hacky, but it was an interesting way to explore Daydream.

50fps on M2

THE MPS PORT (THE HARD PART)

Canonical Daydream falls back to CPU whenever CUDA isn't present, so we had to surgically replace every CUDA-only path:

• torch.amp.autocast(device_type="cpu") was the single biggest perf killer: it left the VAE running on the CPU at ~800ms per frame. Switching to device_type="mps" dropped it to ~200ms.

• 28 float64 occurrences across 12 files → float32 (MPS doesn't support double)

• Flash attention → SDPA fallback at 9 call sites across 5 pipeline files

• flex_attention (CUDA-only) → conditional imports + SDPA

• grid_sample(padding_mode='border') crashes on MPS → 'zeros'

• fp8 text encoder → bf16 alternative

• EulerDiscreteScheduler works at num_inference_steps=1; EulerAncestralDiscreteScheduler doesn't — scheduler-specific workarounds needed

• Unified memory detection via os.sysconf so pipelines with VRAM requirements don't get filtered out

• Live MPS allocation tracked via torch.mps.current_allocated_memory() and displayed in a status bar gauge.

60+ files changed from upstream.
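Most of those swaps reduce to a handful of recurring patterns. A condensed sketch of those patterns (function names are mine, not the actual patch; this is illustrative, not the real diff):

```python
import os
import torch
import torch.nn.functional as F

def pick_device() -> torch.device:
    # Prefer MPS when present; CPU only as a last resort (upstream defaulted to CPU).
    return torch.device("mps") if torch.backends.mps.is_available() else torch.device("cpu")

def unified_memory_bytes() -> int:
    # Apple Silicon has no discrete VRAM; report total physical RAM so
    # pipelines with VRAM minimums aren't filtered out of the UI.
    return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")

def to_supported_dtype(t: torch.Tensor) -> torch.Tensor:
    # MPS has no float64 kernels; downcast before dispatch.
    return t.float() if t.dtype == torch.float64 else t

def attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    # flash/flex attention are CUDA-only; SDPA picks the best available backend.
    return F.scaled_dot_product_attention(q, k, v)

device = pick_device()
x = to_supported_dtype(torch.randn(4, dtype=torch.float64))
q = k = v = torch.randn(1, 2, 8, 16)  # (batch, heads, seq, head_dim)
out = attention(q, k, v)
```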

TURBO4MAC PIPELINE

Custom turbo4mac pipeline: SD-Turbo + TAESD (tiny autoencoder, 1.2M vs 49M params). GPU-native tensor path — input stays on MPS from frame ingestion through VAE encode, UNet, VAE decode, and output. No PIL round-trips. ~6-10 FPS at 256x256 on M2 Max.
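The per-frame path is roughly the following (a minimal sketch: `vae` and `unet_denoise` are stand-ins for the TAESD encode/decode and the single-step SD-Turbo UNet call, not the actual pipeline code):

```python
import torch

@torch.no_grad()
def turbo_step(frame_u8: torch.Tensor, vae, unet_denoise) -> torch.Tensor:
    """One frame through the pipeline without leaving the GPU.

    frame_u8: (H, W, 3) uint8 tensor already on the compute device,
    so there is no PIL round-trip anywhere in the loop.
    """
    # HWC uint8 -> NCHW float in [-1, 1]
    x = frame_u8.permute(2, 0, 1).unsqueeze(0).float().div_(127.5).sub_(1.0)
    latents = vae.encode(x)          # TAESD: tiny encoder (~1.2M params)
    latents = unet_denoise(latents)  # SD-Turbo at num_inference_steps=1
    y = vae.decode(latents)
    # back to HWC uint8 for the transport layer
    return y.clamp_(-1.0, 1.0).add_(1.0).mul_(127.5).to(torch.uint8).squeeze(0).permute(1, 2, 0)
```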

RIFE-BUFFERED (THE FEATURE WE'RE PORTING TO WINDOWS)

This is the main novel contribution. Two modes:

• Auto: Set a target FPS, system measures input rate and picks interpolation depth (2x/4x/8x/16x) to approximate the target.

• Manual: Pick your depth directly. Output = input × depth.

Architecture: sliding-window frame state (input_size=1), recursive midpoint subdivision (each depth level is one batched RIFE model call), output FPS hinting piped through the transport layer to WebRTC pacing. Gets us 30-50 FPS displayed from a ~5 FPS diffusion source.
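The two core pieces, auto depth selection and midpoint subdivision, can be sketched as follows (names are illustrative; frames are shown as scalars and `midpoint(a, b)` stands in for one RIFE call at t=0.5, written pairwise here for clarity where the real pipeline batches each depth level into a single model call):

```python
import math

def pick_depth(input_fps: float, target_fps: float, max_depth: int = 4) -> int:
    """Auto mode: choose depth d so input_fps * 2**d best approximates target_fps.
    Depths 1-4 correspond to the 2x/4x/8x/16x multipliers."""
    if input_fps <= 0:
        return 0
    d = round(math.log2(max(target_fps / input_fps, 1.0)))
    return min(max(d, 0), max_depth)

def subdivide(prev, nxt, depth: int, midpoint) -> list:
    """Recursive midpoint subdivision: yields the 2**depth - 1 in-between frames."""
    if depth == 0:
        return []
    mid = midpoint(prev, nxt)
    return subdivide(prev, mid, depth - 1, midpoint) + [mid] + subdivide(mid, nxt, depth - 1, midpoint)
```

With a ~5 FPS diffusion source and a 40 FPS target, `pick_depth` lands on depth 3 (8x), which is the regime behind the 30-50 FPS displayed output.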

Planning to extract this as a standalone plugin with a clean API (push_frame / pop_frames / get_output_fps_hint) for the Windows/CUDA version.
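A sketch of what that plugin surface could look like. The three method names come from the plan above; everything inside the class is a stand-in (frames as scalars for brevity, `midpoint` standing in for one RIFE call):

```python
class RifeBuffered:
    """Sliding-window (input_size=1) interpolation buffer: each pushed frame
    is paired with the previous one and subdivided to 2**depth - 1 midpoints."""

    def __init__(self, depth: int, midpoint, input_fps: float = 5.0):
        self.depth = depth
        self.midpoint = midpoint
        self.input_fps = input_fps
        self._prev = None
        self._out: list = []

    def push_frame(self, frame) -> None:
        if self._prev is not None:
            self._out.extend(self._subdivide(self._prev, frame, self.depth))
        self._out.append(frame)
        self._prev = frame

    def pop_frames(self) -> list:
        out, self._out = self._out, []
        return out

    def get_output_fps_hint(self) -> float:
        # Hint piped to WebRTC pacing: output rate = input rate * 2**depth.
        return self.input_fps * (2 ** self.depth)

    def _subdivide(self, a, b, d: int) -> list:
        if d == 0:
            return []
        m = self.midpoint(a, b)
        return self._subdivide(a, m, d - 1) + [m] + self._subdivide(m, b, d - 1)
```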

Buffered RIFE + Bloom (some fps eaten up by NDI as I couldn't get Syphon to work)

OTHER ADDITIONS

• Bloom postprocessor — fast downsample-upsample glow (no conv2d, uses Metal's optimized spatial scaling). Auto-sorted before RIFE in the chain so it runs on source frames, not interpolated ones.

• Seed LFO — auto-stepping seed at configurable rate (10-1000ms) for evolving visual texture instead of static noise.

• Transport fixes — WebRTC frame rate cap raised from 8→60 FPS, parameter updates changed from bounded queue to latest-write-wins (eliminates "Parameter queue full" drops), queue sizes reduced for lower latency, pause/resume flushes stale frames.

• All preprocessors working on MPS — scribble, depth, optical flow, gray

• LongLive (Wan2.1 1.3B) works at ~0.6 FPS
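The bloom trick from the list above can be sketched like this (a minimal version under my own parameter choices, not the shipped postprocessor): isolate highlights, blur them by round-tripping through a low resolution with bilinear resampling (which Metal accelerates), then add the blurred highlights back. No conv2d anywhere.

```python
import torch
import torch.nn.functional as F

def bloom(frame: torch.Tensor, threshold: float = 0.8,
          intensity: float = 0.6, scale: int = 4) -> torch.Tensor:
    """Fast downsample-upsample glow. frame: (1, 3, H, W) in [0, 1]."""
    # Keep only pixels above the threshold, rescaled to [0, 1].
    highlights = (frame - threshold).clamp_(min=0.0) / max(1.0 - threshold, 1e-6)
    h, w = frame.shape[-2:]
    # Downsample-upsample round trip acts as a cheap blur on the highlights.
    small = F.interpolate(highlights, size=(h // scale, w // scale),
                          mode="bilinear", align_corners=False)
    glow = F.interpolate(small, size=(h, w), mode="bilinear", align_corners=False)
    return (frame + intensity * glow).clamp_(0.0, 1.0)
```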

THE AQUA SKIN

OS X 2001 Cheetah/Puma aesthetic. Aqua gel buttons, pinstripe textures, iTunes 1.0-style LCD transport bar with green phosphor display and scanlines. Skinnable via an abstracted style spec (AquaStyles.ts). Just a bit of fun, but it lays the minimal foundations for a swappable skinning system if anyone were so inclined.

Spent way too much time on this.

WHAT DOESN'T WORK YET

• Krea (14B) and StreamDiffusionV2 load but are too slow for interactive use on MPS

• The diffusion output tracks input structure but doesn't "lock in" the way full img2img with higher step counts does — SD-Turbo's adversarial distillation is a tradeoff

• True fixed-cadence output from RIFE needs an emission scheduler we haven't built yet

Built with: Claude Opus 4.6 (architecture, pipeline, transport) + Gemini 3.1 Pro (UI/Aqua skin, architectural observation) + Codex 5.4 (transport debugging, log analysis). Three-agent collaboration via the 555n-construct.

Stack: Python 3.12, PyTorch 2.9.1 (MPS), diffusers, Electron, React/TypeScript, FastAPI, aiortc, uv

If anyone thinks it's useful or interesting, I can put it on GitHub. The app installs from a .dmg.