ALTO
MVP Sprint
49 execution cards across 7 days. Monday March 2 – Sunday March 8, 2026.
Overview
Sprint at a glance
8–10 hours daily. 40h estimated work + 16–30h buffer. Every card names its pivot, if any.
Sprint Journal
Daily planning & reflection
Plan each day the night before. Reflect after. Write freely.
Monday March 2
Voice Pipeline Foundation
Talk into phone → see transcription → hear ElevenLabs speak it back
Morning plan
Midday check-in
Evening reflection
Tuesday March 3
LLM Agent + Backend
Full voice conversation — talk to Alto, it reasons with GPT-4o-mini, it talks back
Morning plan
Midday check-in
Evening reflection
Wednesday March 4
WhatsApp Integration
"Alto, what messages do I have?" → reads real WhatsApp → dictate reply → sent
Morning plan
Midday check-in
Evening reflection
Thursday March 5
Gmail Integration
"Alto, read my latest emails" → reads real inbox → dictate reply → sent
Morning plan
Midday check-in
Evening reflection
Friday March 6
Calendar + Briefing Engine
Start drive → Alto auto-briefs you on everything → handle it all by voice
Morning plan
Midday check-in
Evening reflection
Saturday March 7
Driving Detection + UI + Onboarding
Get in car → Alto auto-activates → briefs → full voice control → onboarding works
Morning plan
Midday check-in
Evening reflection
Sunday March 8
Ship + Demo
TestFlight link live + 30-second TikTok demo filmed
Morning plan
Midday check-in
Evening reflection
Day 1 — Monday March 2
Voice Pipeline Foundation
Talk into phone → see transcription → hear ElevenLabs speak it back. The foundation of everything.
Critical risk: whisper.cpp integration — has a pivot to SFSpeechRecognizer if it fails.
Xcode Project Setup
- Who
- You (Xcode manual)
- Needs
- Nothing — this is the starting block
- Done when
- App builds and runs on device — black screen shows "ALTO" circle with "Initializing..." text
- Pivot
- None — this must work
Audio Session Manager
- Who
- Claude writes → You build + verify
- Needs
- 1.1
- Done when
- App shows "Audio ready" on screen (not "error")
- Pivot
- None — AVAudioSession is reliable
Whisper Model + SPM Package
- Who
- Both — Claude downloads model + adds SPM, You verify
- Needs
- 1.1
- Done when
- SwiftWhisper package resolves without errors AND ggml-base.bin appears in Copy Bundle Resources
- Pivot
- If SwiftWhisper SPM fails after 45min, switch to SFSpeechRecognizer
Whisper Service (On-Device STT)
- Who
- Claude writes → You build + test on device
- Needs
- 1.2, 1.3
- Done when
- Tap circle → speak "Hello Alto" → tap again → your words appear as text on screen
- Pivot
- If transcription is garbage after 2h, switch to SFSpeechRecognizer fallback
API Client + ElevenLabs TTS Service
- Who
- Claude writes → You test on device
- Needs
- 1.2
- Done when
- Hardcode a test string → hear ElevenLabs voice speak it through your phone speaker
- Pivot
- If ElevenLabs down → use AVSpeechSynthesizer (Apple TTS) as temp fallback
Voice Echo Loop Test
- Who
- Both — Claude wires it up, You test on device
- Needs
- 1.4, 1.5
- Done when
- Full loop: tap → speak "Hello Alto" → tap → hear ElevenLabs say "You said: Hello Alto". Latency <3s.
- Pivot
- If audio conflicts → check the AVAudioSession category (.playAndRecord is needed to use mic and speaker in the same session)
Git Init + First Commits
- Who
- Claude (terminal)
- Needs
- 1.6
- Done when
- git log shows 3 commits: project setup, Whisper STT, ElevenLabs TTS + echo loop
- Pivot
- None
Day 2 — Tuesday March 3
LLM Agent + Backend
Full voice conversation — talk to Alto, it reasons with GPT-4o-mini, it talks back.
Critical risk: OpenAI API key + Cloudflare deploy — both should be smooth.
Cloudflare Workers Project Setup
- Who
- Claude (terminal)
- Needs
- Nothing (parallel track)
- Done when
- alto-api/ directory with wrangler.toml, package.json, correct folder structure
- Pivot
- None — standard setup
D1 Database + Schema
- Who
- Claude (terminal)
- Needs
- 2.1
- Done when
- wrangler d1 execute returns: users, actions_log, drive_sessions
- Pivot
- None
JWT Auth Helpers
- Who
- Claude writes
- Needs
- 2.1
- Done when
- _auth.js exports signJWT() and verifyJWT() that correctly sign and verify HMAC-SHA256 tokens
- Pivot
- None — pure crypto, well-tested pattern
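For reference, a minimal sketch of what the two helpers could look like. It uses Node's `node:crypto` so that it runs as-is; that choice is an assumption, and on Cloudflare Workers the Web Crypto API (`crypto.subtle`) or the `nodejs_compat` flag would likely be needed instead. The `exp` check and the `nowSec` parameter are illustrative additions, not something the plan specifies.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Payload shape is an assumption: any claims, plus an optional expiry in epoch seconds.
type JwtPayload = { [claim: string]: unknown; exp?: number };

const toB64Url = (bytes: Buffer): string => bytes.toString("base64url");
const encodeJson = (value: unknown): string =>
  toB64Url(Buffer.from(JSON.stringify(value), "utf8"));
const hmacSha256 = (data: string, secret: string): Buffer =>
  createHmac("sha256", secret).update(data).digest();

// Signs header.payload with HMAC-SHA256 and returns the compact JWT string.
function signJWT(payload: JwtPayload, secret: string): string {
  const header = encodeJson({ alg: "HS256", typ: "JWT" });
  const body = encodeJson(payload);
  const signature = toB64Url(hmacSha256(`${header}.${body}`, secret));
  return `${header}.${body}.${signature}`;
}

// Returns the payload if the signature matches and the token is not expired, else null.
function verifyJWT(
  token: string,
  secret: string,
  nowSec: number = Math.floor(Date.now() / 1000)
): JwtPayload | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, body, signature] = parts;

  const expected = hmacSha256(`${header}.${body}`, secret);
  const actual = Buffer.from(signature, "base64url");
  // Constant-time comparison; lengths must match before timingSafeEqual is called.
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) {
    return null;
  }

  try {
    const payload = JSON.parse(
      Buffer.from(body, "base64url").toString("utf8")
    ) as JwtPayload;
    if (typeof payload.exp === "number" && payload.exp <= nowSec) return null;
    return payload;
  } catch {
    return null; // body was not valid JSON
  }
}
```

The auth endpoints would then call `signJWT` on successful register or login and `verifyJWT` on every protected route, returning 401 when it yields `null`.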
Auth Endpoints (Register + Login)
- Who
- Claude writes
- Needs
- 2.2, 2.3
- Done when
- curl register → JWT token. curl login → JWT token. Wrong password → 401.
- Pivot
- None
Agent Chat Endpoint (GPT-4o-mini + Tools)
- Who
- Claude writes
- Needs
- 2.3, 2.4
- Done when
- curl with "What messages do I have?" returns intelligent response + mock tool calls
- Pivot
- If OpenAI errors → check API key. If tool-calling broken → add few-shot examples.
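As a rough illustration of the "mock tool calls" stage, the sketch below builds a Chat Completions request body with one function tool and dispatches a tool call to a mock handler. The tool name `get_whatsapp_messages`, the system prompt, and the mock data are hypothetical; only the request and tool-call shapes follow the OpenAI Chat Completions format.

```typescript
type ToolSpec = {
  type: "function";
  function: { name: string; description: string; parameters: Record<string, unknown> };
};

type ToolCall = {
  id: string;
  type: "function";
  function: { name: string; arguments: string }; // arguments arrive as a JSON string
};

// Hypothetical tool list; real tools replace the mocks on Days 3-5.
const tools: ToolSpec[] = [
  {
    type: "function",
    function: {
      name: "get_whatsapp_messages",
      description: "List the user's recent unread WhatsApp messages.",
      parameters: {
        type: "object",
        properties: { limit: { type: "integer" } },
        required: [],
      },
    },
  },
];

// Mock handlers keyed by tool name (illustrative data only).
const mockHandlers: Record<string, (args: Record<string, unknown>) => unknown> = {
  get_whatsapp_messages: (args) =>
    [{ from: "Mike", text: "Running late?" }].slice(0, Number(args.limit ?? 5)),
};

// Builds the JSON body for POST /v1/chat/completions.
function buildChatRequest(userText: string) {
  return {
    model: "gpt-4o-mini",
    messages: [
      { role: "system", content: "You are Alto, a voice assistant for drivers. Keep answers short." },
      { role: "user", content: userText },
    ],
    tools,
    tool_choice: "auto",
  };
}

// Runs one tool call and returns the "tool" message to append to the conversation.
function runToolCall(call: ToolCall) {
  const handler = mockHandlers[call.function.name];
  if (!handler) throw new Error(`Unknown tool: ${call.function.name}`);
  const args = JSON.parse(call.function.arguments || "{}") as Record<string, unknown>;
  return {
    role: "tool" as const,
    tool_call_id: call.id,
    content: JSON.stringify(handler(args)),
  };
}
```

Keeping handlers in a name-keyed map means the later "wire real tools" cards only swap the handler bodies; the request builder and dispatch stay unchanged.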
Set Secrets + Deploy + Test Backend
- Who
- Both — Claude deploys, You verify with curl
- Needs
- 2.5
- Done when
- All 3 curl commands succeed: register, login, chat with token
- Pivot
- If deploy fails → check wrangler.toml config
iOS Agent + API Client Chat Method
- Who
- Claude writes → You build
- Needs
- Day 1 complete, 2.6
- Done when
- AltoAgent.swift + APIClient chat method compile without errors
- Pivot
- None — standard Swift networking
Full Voice Conversation Loop Test
- Who
- Both — Claude wires ContentView, You test on device
- Needs
- 2.7
- Done when
- On device: tap → say "What's on my calendar?" → tap → hear Alto respond via ElevenLabs. Latency <4s.
- Pivot
- If latency >6s → add timing logs to find bottleneck
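One way to add the timing logs from the pivot is a small per-stage timer. This is a sketch in TypeScript for consistency with the backend; on the device the same idea would be written in Swift. The stage names (`stt`, `llm`, `tts`) and the injectable clock are assumptions for illustration.

```typescript
type StageTiming = { stage: string; ms: number };

class StageTimer {
  private readonly timings: StageTiming[] = [];
  private last: number;

  // The clock is injectable so the timer can be tested without real delays.
  constructor(private readonly now: () => number = Date.now) {
    this.last = this.now();
  }

  // Records the time elapsed since the previous mark (or since construction).
  mark(stage: string): void {
    const t = this.now();
    this.timings.push({ stage, ms: t - this.last });
    this.last = t;
  }

  total(): number {
    return this.timings.reduce((sum, t) => sum + t.ms, 0);
  }

  // The stage that took longest, i.e. the first place to look for the bottleneck.
  slowest(): StageTiming | undefined {
    return this.timings.reduce<StageTiming | undefined>(
      (worst, t) => (worst === undefined || t.ms > worst.ms ? t : worst),
      undefined
    );
  }

  report(): string {
    const parts = this.timings.map((t) => `${t.stage}=${t.ms}ms`);
    return `${parts.join(" ")} total=${this.total()}ms`;
  }
}
```

Calling `mark("stt")` after transcription, `mark("llm")` after the chat response, and `mark("tts")` when audio starts gives one log line per turn that can be compared against the 4s target.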
Day 3 — Wednesday March 4
WhatsApp Integration
"Alto, read my messages" → hears real WhatsApp messages → dictates reply → sent.
Critical risk: Unipile connection stability + WhatsApp pairing. If Unipile is down, skip to Day 4 (Gmail).
Unipile Setup + WhatsApp Pairing
- Who
- You (Unipile dashboard + phone)
- Needs
- 2.6 (backend deployed)
- Done when
- Unipile dashboard shows WhatsApp "connected". Test API call returns real messages.
- Pivot
- If pairing fails → skip to Day 4 (Gmail) and come back later
WhatsApp Messages Endpoint
- Who
- Claude writes
- Needs
- 3.1
- Done when
- curl /api/whatsapp/messages returns real WhatsApp messages with sender name + text + timestamp
- Pivot
- If Unipile API format changed → check their docs. Reuse Drifo patterns.
WhatsApp Send Endpoint
- Who
- Claude writes
- Needs
- 3.1
- Done when
- curl /api/whatsapp/send → message actually appears in the contact's WhatsApp
- Pivot
- If send fails → check chat_id resolution
Wire Real WhatsApp Tools Into Agent
- Who
- Claude writes
- Needs
- 3.2, 3.3
- Done when
- curl agent/chat with "Read my WhatsApp messages" returns REAL messages, not mock data
- Pivot
- If internal calls fail → check getInternalToken() and getBaseURL()
Deploy + Test WhatsApp E2E
- Who
- Claude deploys → You verify with curl
- Needs
- 3.4
- Done when
- Backend deployed. curl chat endpoint with WhatsApp queries returns real data.
- Pivot
- None at this point
Voice Test: WhatsApp by Voice
- Who
- You (on device)
- Needs
- 3.5
- Done when
- On phone: "Alto, what WhatsApp messages do I have?" → hear real messages → "Reply to Mike saying I'm on my way" → message sent
- Pivot
- If voice works but messages wrong → check backend via curl independently
Day 4 — Thursday March 5
Gmail Integration
"Alto, read my latest emails" → reads real inbox → dictate reply → sent.
Critical risk: Google OAuth flow — needs careful setup in Google Cloud Console.
Google Cloud Console Setup
- Who
- You (browser — Google Cloud Console)
- Needs
- Nothing
- Done when
- OAuth consent screen configured + OAuth 2.0 Client ID created with correct redirect URI
- Pivot
- Use "Testing" mode (100 users max) — fine for MVP
Google OAuth Backend Endpoint
- Who
- Claude writes
- Needs
- 4.1
- Done when
- /api/auth/google endpoint accepts auth code, exchanges for tokens, stores in D1
- Pivot
- If token exchange fails → double-check redirect_uri match
Token Refresh Helper
- Who
- Claude writes
- Needs
- 4.2
- Done when
- refreshGoogleToken() uses refresh_token to get new access_token when expired
- Pivot
- None
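A possible shape for the helper, sketched under a few assumptions: tokens are stored with an `expiresAt` in epoch seconds, a 60-second safety margin is applied, and `fetch` is passed in so the function can be exercised without network access. The type names and the `isExpired` and `buildRefreshBody` helpers are hypothetical; the endpoint and form fields are Google's standard OAuth 2.0 refresh request.

```typescript
type StoredTokens = { accessToken: string; refreshToken: string; expiresAt: number };
type GoogleCreds = { clientId: string; clientSecret: string };
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

const GOOGLE_TOKEN_URL = "https://oauth2.googleapis.com/token";
const SKEW_SECONDS = 60; // assumption: refresh slightly before the real expiry

function isExpired(tokens: StoredTokens, nowSec: number): boolean {
  return tokens.expiresAt - SKEW_SECONDS <= nowSec;
}

// Form-encoded body for the refresh_token grant.
function buildRefreshBody(tokens: StoredTokens, creds: GoogleCreds): string {
  return new URLSearchParams({
    client_id: creds.clientId,
    client_secret: creds.clientSecret,
    refresh_token: tokens.refreshToken,
    grant_type: "refresh_token",
  }).toString();
}

// Returns the tokens unchanged while still valid, otherwise fetches a new access token.
async function refreshGoogleToken(
  tokens: StoredTokens,
  creds: GoogleCreds,
  fetchImpl: FetchLike,
  nowSec: number = Math.floor(Date.now() / 1000)
): Promise<StoredTokens> {
  if (!isExpired(tokens, nowSec)) return tokens;

  const res = await fetchImpl(GOOGLE_TOKEN_URL, {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: buildRefreshBody(tokens, creds),
  });
  if (!res.ok) throw new Error(`Token refresh failed: HTTP ${res.status}`);

  const data = (await res.json()) as { access_token: string; expires_in: number };
  return {
    ...tokens,
    accessToken: data.access_token,
    expiresAt: nowSec + data.expires_in,
  };
}
```

On the backend this would run before each Gmail or Calendar request, with the returned tokens written back to D1 whenever they change.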
iOS Google OAuth Flow
- Who
- Claude writes → You test on device
- Needs
- 4.2
- Done when
- On device: tap "Connect Google" → sign in → redirects back → backend stores tokens
- Pivot
- If ASWebAuthenticationSession doesn't redirect → check URL scheme in Info.plist
Gmail Messages Endpoint
- Who
- Claude writes
- Needs
- 4.3
- Done when
- curl /api/gmail/messages returns real unread emails with sender, subject, snippet, date
- Pivot
- If Gmail API returns 403 → check scopes in consent screen
Gmail Send Endpoint
- Who
- Claude writes
- Needs
- 4.3
- Done when
- curl /api/gmail/send → email actually arrives in recipient's inbox
- Pivot
- If send fails → check RFC 2822 formatting / base64url encoding
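The formatting in this pivot is easy to get subtly wrong, so here is a minimal sketch of building the `raw` field that the Gmail send API expects: an RFC 2822 message, base64url-encoded. The function names and the `OutgoingEmail` shape are hypothetical, and the sketch covers plain-text mail only.

```typescript
type OutgoingEmail = { from: string; to: string; subject: string; body: string };

// Standard base64 of the UTF-8 bytes of `text`.
function utf8ToBase64(text: string): string {
  let binary = "";
  new TextEncoder().encode(text).forEach((byte) => {
    binary += String.fromCharCode(byte);
  });
  return btoa(binary);
}

// base64url: "+" becomes "-", "/" becomes "_", trailing "=" padding is dropped.
function toBase64Url(text: string): string {
  return utf8ToBase64(text).replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

// Header values must stay on one line; this also blocks header injection.
const oneLine = (value: string): string => value.replace(/[\r\n]+/g, " ");

// Non-ASCII subjects use an RFC 2047 encoded-word. Very long subjects would need
// to be split into several encoded-words, which this sketch does not do.
function encodeSubject(subject: string): string {
  const clean = oneLine(subject);
  return /^[\x20-\x7E]*$/.test(clean) ? clean : `=?UTF-8?B?${utf8ToBase64(clean)}?=`;
}

// Returns the value for the `raw` field of the Gmail send request.
function buildRawEmail(email: OutgoingEmail): string {
  const message = [
    `From: ${oneLine(email.from)}`,
    `To: ${oneLine(email.to)}`,
    `Subject: ${encodeSubject(email.subject)}`,
    "MIME-Version: 1.0",
    'Content-Type: text/plain; charset="UTF-8"',
    "", // blank line separates headers from body
    email.body,
  ].join("\r\n");
  return toBase64Url(message);
}
```

If a send fails, decoding `raw` back to text and checking the CRLF line endings and the blank line before the body is usually a quick first check.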
Wire Gmail Tools Into Agent
- Who
- Claude writes
- Needs
- 4.5, 4.6
- Done when
- curl agent/chat with "Read my emails" returns real email data (not mock)
- Pivot
- None — same pattern as WhatsApp wiring
Deploy + Voice Test: Email by Voice
- Who
- Both — Claude deploys, You test on device
- Needs
- 4.7, 4.4
- Done when
- On phone: "Alto, read my latest emails" → hear real inbox → "Reply to Lisa saying I'll review tonight" → email sent
- Pivot
- If email reads but won't send → test send endpoint with curl
Day 5 — Friday March 6
Calendar + Briefing Engine
Start drive → Alto auto-briefs you → handle everything by voice.
Low risk — Calendar API is simpler than Gmail. The briefing engine is the creative piece.
Calendar Events Endpoint
- Who
- Claude writes
- Needs
- 4.3 (token refresh)
- Done when
- curl /api/calendar/events returns today's real calendar events
- Pivot
- If no events → check that timeMin/timeMax cover today in the user's timezone (the API is timezone-sensitive)
Calendar Create Endpoint
- Who
- Claude writes
- Needs
- 4.3
- Done when
- curl /api/calendar/create → event appears in your Google Calendar
- Pivot
- None
Wire Calendar Tools Into Agent
- Who
- Claude writes
- Needs
- 5.1, 5.2
- Done when
- ALL mock tools replaced. ZERO mock data. curl agent/chat with "What's on my calendar?" returns real events.
- Pivot
- None
Briefing Engine Endpoint
- Who
- Claude writes
- Needs
- 5.3 (all tools real)
- Done when
- curl /api/agent/briefing returns a natural, conversational 3-5 sentence summary of WhatsApp, email, and calendar
- Pivot
- If briefing sounds robotic → tune the prompt. If too long → add max_tokens=200.
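To make the prompt-tuning pivot concrete, the sketch below shows one way the briefing request could be assembled from the three data sources, with the `max_tokens` cap from the pivot applied. The `BriefingSnapshot` shape and the prompt wording are assumptions, not the plan's actual prompt.

```typescript
type BriefingSnapshot = {
  whatsapp: { from: string; text: string }[];
  emails: { from: string; subject: string }[];
  events: { title: string; startsAt: string }[];
};

// Joins items with "; " or reports "none" so the model never sees an empty section.
const listOrNone = (items: string[]): string => (items.length > 0 ? items.join("; ") : "none");

// Builds the Chat Completions body for a spoken briefing.
function buildBriefingRequest(snapshot: BriefingSnapshot) {
  const context = [
    `WhatsApp (${snapshot.whatsapp.length} unread): ` +
      listOrNone(snapshot.whatsapp.map((m) => `${m.from}: ${m.text}`)),
    `Email (${snapshot.emails.length} unread): ` +
      listOrNone(snapshot.emails.map((e) => `${e.from}: ${e.subject}`)),
    `Calendar (${snapshot.events.length} today): ` +
      listOrNone(snapshot.events.map((e) => `${e.startsAt} ${e.title}`)),
  ].join("\n");

  return {
    model: "gpt-4o-mini",
    max_tokens: 200, // cap from the pivot, keeps the briefing short
    messages: [
      {
        role: "system",
        content:
          "You are Alto, briefing a driver by voice. Summarize the data in 3-5 short, " +
          "natural sentences. No lists, no markdown, most urgent item first.",
      },
      { role: "user", content: context },
    ],
  };
}
```

Because the model's output is spoken aloud, the system prompt is the main tuning knob: asking for sentences instead of lists tends to matter more than the token cap.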
iOS Briefing Service
- Who
- Claude writes → You build
- Needs
- 5.4
- Done when
- BriefingService.swift compiles. fetchBriefing() returns the briefing text.
- Pivot
- None — simple HTTP call
Deploy + E2E Test: All Integrations
- Who
- Both — Claude deploys, You test on device
- Needs
- 5.5
- Done when
- All 6 tools work by voice: WhatsApp read/send, Gmail read/send, Calendar read/create
- Pivot
- If any integration broke → test that endpoint with curl first
Commit: All Integrations Complete
- Who
- Claude (terminal)
- Needs
- 5.6
- Done when
- Clean commits for calendar + briefing. git log tells the story.
- Pivot
- None
Day 6 — Saturday March 7
Driving Detection + UI + Onboarding
Get in car → Alto auto-activates → briefs → full voice control → onboarding works.
Low risk: the new features are all local iOS work, with no new API dependencies.
Driving Detector (CoreMotion + GPS)
- Who
- Claude writes → You test in car
- Needs
- Day 1 iOS foundation
- Done when
- On device: start driving → state changes to .driving. Park for 2min → changes to .parked.
- Pivot
- If CoreMotion unreliable → rely purely on GPS speed (>15 km/h = driving)
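The GPS-only fallback boils down to a small state machine. The sketch below shows the logic in TypeScript for consistency with the backend examples; the real detector would be Swift, where `CLLocation.speed` is reported in meters per second (15 km/h is about 4.17 m/s). The `idle` state and the type names are assumptions.

```typescript
type DrivingState = "idle" | "driving" | "parked";
type SpeedSample = { speedKmh: number; atMs: number };

const DRIVING_SPEED_KMH = 15; // threshold from the pivot
const PARKED_AFTER_MS = 2 * 60 * 1000; // "park for 2min" from the done-when

class DrivingDetector {
  state: DrivingState = "idle";
  private slowSinceMs: number | null = null;

  // Feed one GPS sample; returns the state after processing it.
  update(sample: SpeedSample): DrivingState {
    if (sample.speedKmh > DRIVING_SPEED_KMH) {
      // Above the threshold: driving, and any pending "parked" countdown is cancelled.
      this.state = "driving";
      this.slowSinceMs = null;
    } else if (this.state === "driving") {
      // Below the threshold while driving: start or continue the countdown.
      if (this.slowSinceMs === null) this.slowSinceMs = sample.atMs;
      if (sample.atMs - this.slowSinceMs >= PARKED_AFTER_MS) this.state = "parked";
    }
    return this.state;
  }
}
```

With a single threshold, a traffic jam slower than 15 km/h for two minutes would also count as parked, so the timeout may need tuning after the in-car test.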
Driving Session Manager
- Who
- Claude writes
- Needs
- 6.1
- Done when
- DrivingSession tracks isActive, startTime, actionsCount, durationFormatted
- Pivot
- None — simple state class
Main View (Pulsing Circle UI)
- Who
- Claude writes → You review + iterate
- Needs
- 6.1, 6.2, Day 5 complete
- Done when
- Beautiful minimal UI: pulsing circle changes color, status text, activity log, "End Drive" button, auto-briefing plays
- Pivot
- If animations jank → simplify to opacity changes only
Onboarding View
- Who
- Claude writes → You test on device
- Needs
- 2.4 (auth endpoints)
- Done when
- Fresh install → onboarding → enter email + password → register → lands on MainView
- Pivot
- If register fails → test backend auth with curl independently
Content View Router
- Who
- Claude writes
- Needs
- 6.3, 6.4
- Done when
- App flow: not logged in → OnboardingView → login → MainView. Token persists across restarts.
- Pivot
- None
Keychain Token Storage
- Who
- Claude writes → You test
- Needs
- 6.4, 6.5
- Done when
- Token survives app kill + restart. API keys no longer hardcoded.
- Pivot
- If Keychain flaky → use UserDefaults for now
Full Integration Test
- Who
- You (on device, ideally in car)
- Needs
- 6.6
- Done when
- Full user journey: fresh install → onboarding → briefing → voice control → all integrations → "End Drive"
- Pivot
- List all bugs, prioritize for Day 7
Day 7 — Sunday March 8
Ship + Demo
TestFlight link live + 30-second demo filmed. Focus on "good enough to show", not perfection.
This is ship day. The demo IS the product-market fit test.
Post-Drive Summary View
- Who
- Claude writes → You review
- Needs
- 6.2 (DrivingSession)
- Done when
- After "End Drive" → see summary with action count, drive duration, activity log
- Pivot
- If time is tight → skip entirely. Nice-to-have.
Error Handling + Edge Cases
- Who
- Claude writes → You test
- Needs
- Day 6 complete
- Done when
- App doesn't crash on: no internet, empty transcription, backend error, token expired, model load fail
- Pivot
- Fix crashes first. Polish error messages last.
Voice Interruption Handling
- Who
- Claude writes → You test
- Needs
- 6.3
- Done when
- While Alto is speaking, tap → TTS stops → mic starts → you can speak immediately
- Pivot
- If audio session conflicts → add a 200ms delay between stopping TTS and starting mic
App Icon
- Who
- You (design tool or Claude generates)
- Needs
- Nothing
- Done when
- App icon shows on home screen + TestFlight. Clean, monochrome, recognizable.
- Pivot
- Use a simple "A" lettermark in white on black
TestFlight Build + Upload
- Who
- You (Xcode)
- Needs
- 7.2, 7.4
- Done when
- TestFlight email arrives. Install Alto from TestFlight on a clean device and full flow works.
- Pivot
- If archive fails → check signing + capabilities
Film Demo Video
- Who
- You (in car, filming)
- Needs
- 7.5
- Done when
- 30-second TikTok-ready demo: phone on mount, auto-briefing, voice commands, hands on wheel
- Pivot
- If in-car filming doesn't work → film parked in driveway. Voice-over with screen recording as backup.
Sprint Summary
Fill Sunday evening
How did it go?
Appendix A
Pre-Sprint Checklist
Complete these by Sunday March 1 evening so Monday starts clean.
- ElevenLabs API key — sign up at elevenlabs.io, get API key
- OpenAI API key — verify your key works, has GPT-4o-mini access
- Unipile credentials — confirm UNIPILE_DSN + UNIPILE_API_KEY from Drifo
- Google Cloud Console — consider starting Task 4.1 early (OAuth setup takes about 1h)
- Physical iOS device — charged, connected to Xcode, trusted
- USB cable — for tethered debugging
- Xcode updated — latest version, iOS 17+ SDK
- wrangler CLI — npm install -g wrangler, wrangler login
- Camera/mount for demo day — phone mount for car filming
Appendix B
Secrets Required
| Secret | Where | When |
| --- | --- | --- |
| ELEVENLABS_API_KEY | iOS code + Cloudflare | Day 1 |
| OPENAI_API_KEY | Cloudflare | Day 2 |
| JWT_SECRET | Cloudflare | Day 2 |
| UNIPILE_DSN | Cloudflare | Day 3 |
| UNIPILE_API_KEY | Cloudflare | Day 3 |
| GOOGLE_CLIENT_ID | Cloudflare + iOS | Day 4 |
| GOOGLE_CLIENT_SECRET | Cloudflare | Day 4 |
Appendix C
Card Count Summary
| Day | Focus | Cards | Est. Hours | Buffer |
| --- | --- | --- | --- | --- |
| Day 1 | Voice Pipeline | 7 | 5.5h | 2.5–4.5h |
| Day 2 | LLM + Backend | 8 | 6h | 2–4h |
| Day 3 | WhatsApp | 6 | 5h | 3–5h |
| Day 4 | Gmail | 8 | 6.5h | 1.5–3.5h |
| Day 5 | Calendar + Briefing | 7 | 5.5h | 2.5–4.5h |
| Day 6 | UI + Onboarding | 7 | 6h | 2–4h |
| Day 7 | Ship + Demo | 6 | 5.5h | 2.5–4.5h |
| Total | | 49 | ~40h | 16–30h |
Appendix D
Fallback Decision Tree
When things go wrong, follow the pivot. Don't burn the day.
whisper.cpp won't compile (>1.5h)
Switch to SFSpeechRecognizer (Apple's built-in STT). Worse accuracy but zero integration friction. Can always swap back to Whisper later.
ElevenLabs API down or too slow
Use AVSpeechSynthesizer (Apple TTS) for dev/testing. Switch back when it's up. Or try OpenAI TTS API as alternative.
Unipile WhatsApp won't connect
Skip Day 3 entirely → do Day 4 (Gmail) instead. Come back on Day 6 buffer time. Worst case: ship without WhatsApp, add in v1.1.
Google OAuth consent screen blocked
Use Testing mode (100 users max) — fine for MVP. Add your email as test user. Submit for verification in parallel.
GPT-4o-mini tool-calling unreliable
Add few-shot examples to system prompt. If still bad → try gpt-4o (more expensive but better). Last resort: Claude Haiku (proven in Drifo).
Can't finish Day 6-7 features
Core loop works by Day 5. Most of Day 6-7 is polish: skip driving detection + summary, and ship with manual tap-to-activate instead of auto-detect.