Alto Feb 2026

ALTO
Vision & Roadmap

Voice-first AI driving agent. The complete product vision — from the problem we're solving to the 7-day MVP sprint and beyond.

01 — Problem

Driving time is dead time

The average person spends 1-2 hours driving every day. That's 550 hours per year — 23 full days — sitting in a box doing nothing. No product solves this without putting a screen in your face.

1.5h
avg daily drive
550h
wasted / year
23d
full days lost
0
real solutions
ProductProblem
Siri / Google AssistantReactive. No context, no memory, no follow-through.
CarPlay / Android AutoStill a screen. Still tapping. IS the problem.
Podcasts / MusicEntertainment, not productivity. Nothing gets done.
02 — Non-Negotiable

Safety is the foundation, not a feature

Everything we build sits on one absolute rule: the driver is never endangered. This is not a phone app you use while driving — it's a voice agent that replaces the need to touch your phone at all.

Zero Visual Demand

Eyes never leave the road

No screen to glance at. No notifications to read. No UI to navigate. The phone stays in your pocket or on the seat. All information is delivered by voice. All actions are confirmed by voice.

Zero Manual Input

Hands never leave the wheel

No tapping. No swiping. No buttons. Not even a wake word button — the AI listens continuously during driving mode or activates via steering wheel controls. 100% hands-free, always.

Cognitive Load Management

The AI adapts to the drive

Complex traffic? AI goes silent. Highway cruising? AI briefs you. Detects stress in your voice? Reduces information density. The driver's cognitive safety always comes first — productivity second.

Confirmation Before Action

Nothing happens without your voice

The AI suggests. You confirm. "Send this reply?" → "Yes." No auto-sending without approval. No irreversible actions without explicit voice consent. You are always in control.

This is what separates us from every other productivity tool. We don't put a screen in your car — we remove the need for one. The phone becomes invisible. That's not a limitation, it's the entire point. It's why Apple would want us in the App Store, why insurers would endorse us, and why regulators can't touch us.
03 — Product

Proactive AI co-driver

Not an assistant you command. A co-pilot that runs your life. It knows your emails, messages, calendar, tasks — and handles everything the second you start driving.

"Morning. 47 min to office. Three things — Mike replied to your proposal, he's in. Want me to confirm and loop in legal? Your 10am moved to 11, I blocked your calendar. Your mom texted about Sunday dinner." "Yes to Mike. Tell my mom I'll be there at 6." → Email sent to Mike with legal CC'd. WhatsApp to Mom: "Bin um 6 da!" "What's on my to-do list?" "Three items. Contract review due today — I can read the key changes. Investor deck needs approval — send to Lisa? And you wanted to book a restaurant for Friday." "Read the contract. Send the deck to Lisa. Book Rocca, Friday 8pm, two people." → Reading contract... Deck forwarded. Restaurant booked.

Proactive intelligence

Starts briefing when you start driving. Prioritizes by urgency. Adapts to drive length. Fills silence with value, yields when you talk.

Real execution

Doesn't just inform — it acts. Sends replies, moves meetings, creates tasks, books restaurants. You confirm by voice.

Adaptive voice

Work context: Jarvis — calm, efficient, zero fluff. Personal context: warmer, knows your people by name, casual.

Safe by design

Voice-only. You approve actions by voice. Hands on wheel, eyes on road. Always. The phone is the speaker, not the interface.

04 — Why This Wins

Feature comparison

FeatureSiriCarPlayAlto
Proactive briefingsNoNoAuto-starts
Multi-step actionsNoNoUnlimited chains
Cross-app contextNoNoEmail+Cal+Msg
Conversation memoryNoNoFull drive
Voice reply to WhatsAppBuggyRequires tapsFully hands-free
Calendar managementBasicRead onlyRead+modify+create
Zero screen interactionNoNo100% voice
05 — Business Model

How it makes money

Free Tier

$0

5 drives/month. Messaging triage only. No calendar, no tasks, no actions. Enough to feel the magic.

Pro

$13.99/mo

Unlimited drives. All integrations. Proactive briefings. Full agent power. The tier 90% land on. Covers Unipile bridge + API costs with margin.

Enterprise

Custom

Team accounts. CRM integration. Custom workflows. Admin dashboard. Field sales teams — massive ROI.

06 — MVP Sprint

7 days. 1 agent. Ship it.

A focused sprint to build the core voice agent — from mic input to executed actions. Every day has one clear deliverable.

7
Days
6
Tools
3
Integrations
<1.5s
Latency
07 — Day 1

Voice Pipeline

Microphone to speaker. The foundation of everything. Get audio flowing before anything else matters.

Capture
Mic
AVAudioEngine
Transcribe
Whisper
On-device STT
Think
GPT-4o-mini
Reasoning
Synthesize
ElevenLabs
Streaming TTS
Output
Speaker
Bluetooth/Car
This is the hardest day. If the voice loop feels natural and fast, everything else is UI on top of it. If it's laggy, nothing else matters.
08 — Day 2

LLM Agent Core

The brain. Give the LLM tools, context, and the ability to chain actions.

Framework

Tool Architecture

Each tool is a self-contained function: name, description, parameters, execute. The LLM decides which tools to call and in what order. Tools can chain — "read email, draft reply, send" is one agent turn.

Memory

Context Window

Every drive session maintains full conversation history. The agent knows what you asked 20 minutes ago. Cross-drive memory stored server-side — the agent knows your preferences, contacts, patterns.

09 — Day 3

WhatsApp via Unipile

The first real integration. Read messages, reply by voice, handle group chats. The demo moment.

"Any new WhatsApp messages?" "Three conversations. Sarah sent 2 messages about the weekend trip — she's asking about hotels. Mike shared a link in the dev group. And your mom asked if you're coming for dinner Sunday." "Reply to Sarah — tell her the Marriott downtown, I'll book it. Tell mom yes, around 6." → WhatsApp to Sarah: "Let's do the Marriott downtown — I'll book it." WhatsApp to Mom: "Ja, bin um 6 da!" "Both sent. Want me to book the Marriott for the weekend?" "Yes, Friday to Sunday, two people." → Searching hotels... Marriott Downtown available. Booking for 2 guests, Fri–Sun.
10 — Day 4

Gmail via Google API

Triage your inbox by voice. Read, reply, compose, archive — all hands-free.

ActionVoice CommandAPI Call
Read inbox"What emails do I have?"messages.list + messages.get
Summarize"Give me the highlights"Batch get + LLM summarize
Reply"Reply to Mike — sounds good, let's do Thursday"messages.send (threadId)
Compose"Email Lisa about the Q2 report"messages.send (new)
Archive"Archive everything from LinkedIn"messages.batchModify
Search"Find the contract from last week"messages.list (q param)
Label"Star the email from the investor"messages.modify (labelIds)
11 — Day 5

Calendar + Morning Briefing

Google Calendar integration plus the killer feature: proactive morning briefings when you start driving.

The morning briefing is the moment the product becomes a habit. It's not "open an app" — it's "start driving and everything you need to know is spoken to you." That's the behavior change.
12 — Day 6

Detection, Interface, Onboarding

Driving detection to auto-start. Minimal UI. First-time setup flow.

Auto-Start

Driving Detection

Bluetooth connection + GPS speed + CoreMotion accelerometer. When 2 of 3 signals confirm driving, Alto activates. No button press. No "Hey Siri." You start driving, Alto starts working.

The Screen

Minimal UI

One screen. A pulsing circle — idle, listening, thinking, speaking. That's it. No text to read. No buttons to tap. The phone sits face-down or in your pocket. The UI exists for parked mode only.

First Run

Onboarding

Connect Google account. Connect WhatsApp via Unipile. Set your name. Pick voice preference. Done in 90 seconds. First drive auto-starts a mini demo briefing to show what Alto can do.

13 — Day 7

Ship it. Film it. Share it.

Polish, test the full flow, record the demo video. If the demo is compelling, you have product-market fit signal.

The demo IS the product-market fit test. If people watch it and say "I need this" — you have something. If they say "that's cool" — iterate. The reaction to the demo determines everything that comes next.
14 — Architecture

System overview

How all the pieces connect. Voice in, actions out, everything in between.

Input
iOS Audio
AVAudioEngine
STT
Whisper
On-device
Agent
Claude / GPT
Reasoning + Tools
Tools
APIs
Gmail, Cal, Unipile
TTS
ElevenLabs
Streaming
Output
Speaker
Bluetooth
LayerTechnologyPurpose
iOS AppSwiftUI + AVAudioEngineUI + audio capture
STTWhisper (on-device)Speech-to-text, <500ms
LLMGPT-4o-mini / Claude HaikuReasoning + tool calling
TTSElevenLabs Turbo v2.5Voice synthesis, streaming
MessagingUnipile APIWhatsApp bridge
EmailGoogle Gmail APIRead, reply, compose
CalendarGoogle Calendar APIEvents, scheduling
BackendCloudflare Workers + D1User data, conversation logs
AuthGoogle OAuth2Account linking
StorageCloudflare R2Audio files, attachments
15 — V2

After the MVP ships

The features that turn users into evangelists. Ship within 4-6 weeks post-MVP.

Feature 1 Drive Report

Monthly stats card. Shareable. Your Spotify Wrapped for driving productivity. Emails handled, messages replied, time saved. The viral loop.

Viral Retention Content
Feature 2 Meeting Dial-In

Join meetings from the car. Auto-dial at meeting time. Mute in traffic. Take notes. Extract action items. "I was on the call while driving" — that's the tweet.

CallKit Notes Viral Demo
Feature 3 Emotional Intelligence

Detect stress in voice. Bad day? Slower pacing, skip non-urgent items. Energetic? Faster briefing. Nobody else does this.

Voice Analysis Adaptive UX Differentiator
Feature 4 iMessage + Telegram

Expand messaging beyond WhatsApp. iMessage via local device integration. Telegram via Bot API. Cover 90% of messaging.

iMessage Telegram Coverage
16 — V3

The moat deepens

The features that make switching impossible. Data flywheel kicks in.

Intelligence

Predictive Actions

Don't react. Predict. Gym on Tuesdays → "Heading to gym? Cleared your hour." Always email after meetings → "Draft the follow-up?" Raining + usually order food → "Your usual from the Thai place?" The AI learns your patterns and acts before you ask.

Physical

Hardware Puck

Small matte-black device. Magnetic dashboard mount. Far-field mic array, Bluetooth speaker, one button, USB-C. $49.99. Opens every car without CarPlay. Unboxing content. Review bait. Subscription Trojan horse.

Marketplace

Voice Personas

"The CEO" — ultra-efficient, zero fluff. "The Best Friend" — warm, jokes. "The Drill Sergeant" — zero tolerance. Creator-built personas. Community + content + revenue. Celebrity persona = instant virality.

17 — V4

Scale plays

B2B2C distribution. The plays that make VCs lose their minds.

B2B2C

Insurance Partnerships

Prove users aren't touching their phone while driving. That's data insurers pay for. "Use Alto → 15% off car insurance." The app becomes a money-saving tool. Insurers promote you to their customers. Free distribution at scale.

B2B

Enterprise

Field sales teams. Delivery drivers. Real estate agents. Anyone who drives for work. Team accounts, CRM integration, custom workflows, admin dashboard. Massive ROI — reclaim 500+ hours/year per employee.

18 — Integration Timeline

What ships when

Every integration mapped to a version and week.

IntegrationVersionWeekDependencies
Whisper STTMVP1iOS audio permissions
ElevenLabs TTSMVP1API key, streaming
GPT-4o-mini AgentMVP2System prompt, tools
WhatsApp (Unipile)MVP3Unipile account
GmailMVP4Google OAuth
Google CalendarMVP5Google OAuth (shared)
Driving DetectionMVP6CoreMotion, Bluetooth
Morning BriefingMVP5Calendar + Gmail
OnboardingMVP6All OAuth flows
Demo VideoMVP7Full pipeline working
Drive ReportV29Usage analytics
Meeting Dial-InV210CallKit, calendar
Emotional DetectionV211Voice analysis model
iMessageV212Local device API
TelegramV212Bot API
Predictive EngineV31830+ drives of data
Hardware PuckV322Hardware partner
Insurance APIV430Driving data pipeline
19 — Content Arsenal

The product IS the content

You don't market this app. You film it working. Every feature is a TikTok. Every demo is a viral moment. Every integration launch is a content event.

You drive 550 hours a year.
You accomplish exactly zero.
Your phone can run your entire life.
You're told not to touch it.
Siri can set a timer.
Congratulations.
What if your phone just handled everything?
While you drive. Without touching it. Ever.
Ready-to-film

Scripts that stop the scroll

Hand these to a creator or film them yourself. Each one is engineered to hook in under 2 seconds.

TIKTOK / REELS 30 SEC

"I replied to 14 emails on my commute. Without touching my phone."

0:00POV dashcam. Morning traffic. Phone sitting untouched on passenger seat. Counter appears: "0 emails handled."
0:03AI voice kicks in: "Morning. 3 urgent emails. Mike confirmed the deal — want me to loop in legal?"
0:08Driver: "Yes. Reply to Sarah too — tell her Friday works." Counter ticks up.
0:12AI: "Done. Your 2pm moved to 3. Updated your calendar. Mom texted about dinner Sunday."
0:18Driver: "Tell her I'll be there at 7." Counter keeps climbing.
0:22Montage: traffic, counter hitting 14. Phone never moves.
0:26Text overlay: "14 emails. 6 messages. 2 calendar changes. 0 screen touches."
0:28Cut to phone still on seat. Brand tagline: "Your phone does everything. You just drive."
HOOK: Counter overlay on dashcam — "14 emails" in first frame is the pattern interrupt. Nobody scrolls past that.
TIKTOK / REELS 45 SEC

"My AI joined my Zoom while I was on the highway."

0:00Dashboard POV. Highway. Clock reads 9:58 AM.
0:03AI: "Your standup starts in 2 minutes. 6 people on the call. Sarah shared an agenda — want the summary?"
0:09Driver: "Summarize it. Dial me in."
0:12AI: "Connecting... You're in. Muted during traffic. Three topics: Q1 numbers, product launch, hiring."
0:18Meeting audio fades in. Someone asks: "Can we get your take on the launch timeline?"
0:22AI whispers: "Your last update said March 15. Budget approved yesterday."
0:26Driver answers with perfect context. Confident. No fumbling.
0:32Call ends. AI: "4 action items captured. Send to the team?"
0:36Driver: "Send it."
0:38Text overlay: "Joined a meeting. Got briefed. Nailed my part. Never touched the phone."
HOOK: "My AI joined my Zoom" is inherently unbelievable. That's the hook. People watch to verify if it's real.
YOUTUBE SHORTS / TIKTOK 60 SEC

"I gave an AI my commute for 7 days. Here's what happened."

0:00Face cam. "This app claims it can run your life while you drive. I'm testing it for a week."
0:05DAY 1: "It replied to my mom. In German. Correctly. I'm slightly scared."
0:12DAY 2: "It moved my dentist appointment because it saw my calendar conflict. I didn't ask it to."
0:20DAY 3: "My boss asked why I'm suddenly so responsive. I said 'new system.' I lied. It's an AI."
0:28DAY 5: "It reminded me about my dad's birthday. I forgot. It didn't. Flowers ordered."
0:36DAY 7: Counter card — "47 emails, 23 messages, 8 calendar changes, 3 reservations. All by voice."
0:44Face cam: "I can't go back. My commute is now the most productive hour of my day."
0:52Final card: stats summary + brand logo. "Your phone does everything. You just drive."
HOOK: "I gave an AI my commute" — the challenge format is algorithm bait. Day-by-day creates narrative arc. Each day is its own shareable clip.
The moments

5 conversations that sell the product instantly

Each one is a movie scene. Each one makes someone DM the link to their group chat. These aren't hypotheticals — they're what the product does.

The Monday Morning Blitz
"Good morning. 47 minutes to office. You have 23 unread emails — 4 need replies today. Your 9:30 got pushed to 10. The design team is waiting on your feedback from Friday. And your wife asked if you can pick up Noah at 4." "Reply to the design team — approved with one change, make the header darker. Tell my wife yes to Noah. What are the 4 urgent emails?" "First: client contract. They countered at 85K instead of 90K. The delta is in the payment terms, not the rate. Want me to read the relevant clause?" "Read it." "They want net-60 instead of net-30. Everything else is identical to your last proposal." "Accept it. Send to legal for signature." → Design team email sent. WhatsApp to wife: "Klar, hole Noah um 4 ab." Contract acceptance forwarded to legal. 3 more urgent emails queued.
The Birthday Save
"Hey — your mom's birthday is in 2 days. Last year you sent tulips from Bloomon and a card. Want me to do the same? I can also draft a WhatsApp message." "Same flowers but add chocolates. And yeah, write something nice." "Done. Tulips + Belgian chocolate box, delivery Thursday morning. Here's the message: 'Alles Gute zum Geburtstag, Mama! Ich hab dich lieb. Kleines Paket kommt morgen.' Send it?" "Perfect. Send Thursday morning at 8." → Bloomon order placed: €45. WhatsApp scheduled for Thursday 8:00 AM. Calendar reminder removed — already handled.
The Double-Book Save
"Heads up — Lisa just scheduled a product review for 2pm tomorrow, but you already have the investor call at 2. Want me to move one?" "Move Lisa to 3. Tell her I have a conflict." "Lisa's calendar is open at 3 and 4. I'll book 3pm and send her a note. Also — you have 30 minutes free before the investor call. Want me to prep a summary of last month's metrics?" "Yeah, pull the numbers and have them ready." → Lisa moved to 3pm. Message sent. Investor prep doc queued for 1:30pm with key metrics from your dashboard.
The One-Command Chain
"Book dinner for Friday. Somewhere nice. Italian. Two people." "Rocca has a table at 8pm. Good reviews, 12 minutes from your office. Want me to book it and tell Sarah?" "Do it." → Reservation confirmed: Rocca, Friday 8pm, 2 people. WhatsApp to Sarah: "Freitag 20 Uhr Rocca?" Calendar event created with address and directions.
The Contract From The Car
"New email from Müller AG. Revised contract, 14 pages. Want me to compare it to your last version?" "Yes. What changed?" "Three changes. Liability cap lowered from 500K to 200K. Payment terms now net-45 instead of net-30. And they added a non-compete clause for 12 months." "Push back on the liability cap and the non-compete. Accept the payment terms." "Draft reply: 'We can agree to net-45 payment terms. However, we'd need to maintain the original 500K liability cap and cannot accept the non-compete as written. Happy to discuss scope limitations as an alternative.' Send to Müller?" "Send it." → Email sent to Müller AG. Contract comparison saved. Lawyer CC'd for visibility.
Controversy plays

Hot takes that split the internet

Controversy is distribution. These are designed to make people argue in the comments — and every argument is a view.

"Siri is a voicemail box with a personality disorder."
DEFENSE — Name one complex multi-step task Siri handles end-to-end. Read an email, draft a reply, check your calendar for conflicts, and send it? Can't. It's 2026 and Siri still asks "which email?" after you've already told it.
"CarPlay is the problem pretending to be the solution."
DEFENSE — It's a screen. In your car. That you tap while driving. Apple literally made the distraction prettier and called it safety. The solution to phone distraction isn't a bigger phone — it's no phone.
"If you're not productive while driving, you're choosing to waste 23 days a year."
DEFENSE — 550 hours. That's the math. Voice-only, 100% hands-free, eyes on road. Safer than changing a Spotify playlist. The dangerous thing isn't the AI — it's the phone you're already grabbing at red lights.
The engine

3x per week. Every week. Forever.

The content machine never stops. Each post targets a different audience on a different platform in a different format.

Monday

The Demo

30-second product demo. Real car, real AI, real tasks getting done. Always end with the phone sitting untouched.

TIKTOK / REELS / SHORTS
Wednesday

The Take

One bold thought piece. Hot take thread. Industry insight. The kind of post that gets 200 quote tweets arguing.

TWITTER / LINKEDIN
Friday

The Build

Behind the scenes. New feature preview. Integration reveal. Bug war story. Building in public creates community.

TWITTER / YOUTUBE / TIKTOK
Every new integration is a launch event. Slack? "I triaged 40 Slack messages driving to work." Restaurant booking? "I booked dinner, messaged my date, and blocked my calendar — in one sentence." Each integration = new video, new audience, new use case. Constant content. Constant momentum.
20 — The Flywheel

Every drive makes switching impossible

TimeWhat happens
Week 1AI knows nothing about you
Week 4AI knows your contacts, schedule patterns, preferences
Week 12AI predicts your needs before you think of them
Week 26AI has handled 1,000+ actions — switching feels impossible
Year 1AI knows your life better than you do
The product gets better the more you use it. Memory, predictions, emotional calibration — they all compound. This creates a switching cost no competitor can overcome by copying features. They'd need YOUR data. That's a moat you can't buy.
21 — Risks

What could go wrong

Eyes open. Plan for the worst.

RiskSeverityMitigation
Voice latency >2sHIGHOn-device STT, streaming TTS, edge computing
Unipile API instabilityHIGHQueue + retry, local message cache, fallback notification
Google OAuth rejectionMEDIUMFollow review guidelines, minimal scope request
Background audio killed by iOSHIGHAudio session category, BGTaskScheduler, CarPlay entitlement
User speaks during TTSMEDIUMBarge-in detection, immediate pause + listen
Car noise degrades STTMEDIUMNoise cancellation, confidence thresholds, clarification prompts
App Store review rejectionMEDIUMPrivacy documentation, microphone usage justification
LLM hallucination in actionsHIGHConfirmation before every action, no auto-execute
22 — KPIs

How we know it works

The numbers that matter. If these move, we're winning.

<1.5s
Voice Latency
85%+
Action Success
5+
Drives per Week
70%+
Day-7 Retention
MetricTargetMeasurement
Voice round-trip latency<1.5 secondsSpeak → first TTS byte
Action success rate>85%Completed / attempted actions
Drives per week (active user)5+Weekly active drives
Day-7 retention>70%Users active 7 days after signup
Morning briefing completion>60%Users who listen to full briefing
Messages handled per drive3+Avg messages triaged per drive
NPS score>50Monthly survey