OpenRouter Is My Primary Subscription Now

I cancelled my frontier-provider subscriptions. Not because the models got worse — they didn't — but because paying a fixed monthly fee to one lab, then a second fee to another lab, then living inside whichever chat app happened to wrap the model I wanted, stopped making sense. I route everything through OpenRouter now. One account, one key, one balance, and a model picker that spans every provider worth using.^[1]

The catch is that the moment you leave the first-party apps, you lose everything that wasn't the raw model: web search, vision, image generation, document reading, voice. The chat box stops feeling like a product and starts feeling like a completions endpoint. So I built the missing layer as a Pi extension: pi-openrouter-multimodal. Eight tools that give a coding agent the same multimodal surface the consumer apps have, all proxied through the one OpenRouter key I already pay into.^[2]

Why I left the frontier subscriptions

The pitch for a frontier subscription is simple: pay $20–$200 a month, get the best model and the polished app around it. It's a good deal right up until you use more than one lab. Then you're paying two or three subscriptions, none of which talk to each other, each of which silently rate-limits you on the model you actually wanted, and each of which decides for you which model a given feature is allowed to call.

Three things pushed me over the edge:

The model I want changes weekly. The frontier is a moving target. Best coding model, best vision model, best cheap-and-fast model, and best long-context model are rarely the same lab in any given month. A subscription locks you to one vendor's release cadence. A credit balance doesn't.
I was paying for capacity I didn't use. Flat monthly fees are a bet that you'll use enough to come out ahead. For bursty, agent-driven usage, metered credits are almost always cheaper — you pay per token, not per calendar month, and idle weeks cost nothing.
Lock-in lives in the harness, not the weights. I've written about the harness being the real unit of software now. If your tools, prompts, and workflows are wired to one provider's SDK and one app's feature set, switching models means rewriting your harness. Routing through a provider-neutral gateway makes the model a config value, not an architecture decision.

OpenRouter is the gateway. It normalizes hundreds of models from dozens of providers behind a single OpenAI-compatible API, handles failover when a provider is down, and bills it all to one balance. The model name is a string you change. That's the whole pitch, and it's enough.^[1]

The app tax

Here's what nobody tells you when you cancel: the subscription was never just the model. It was the model plus a pile of server-side machinery the app quietly called on your behalf. Ask the chat app a question about a current event and it ran a web search. Paste a screenshot and it ran a vision model. Ask for a picture and it called an image model. Drop in a PDF and something parsed it. Talk to it and a speech-to-text model transcribed you.

Strip that away and a raw completions endpoint can't do any of it. The model has no eyes, no browser, no canvas, no ears. For most of my work — coding inside an agent — that's a real regression. I want my agent to read the docs page I link, look at the failing screenshot, and read the PDF spec without me leaving the terminal. That capability gap is the app tax, and it's the actual reason people feel trapped in the first-party apps.

The fix isn't to go back. It's to rebuild the capabilities as agent tools, pointed at the same gateway. OpenRouter already exposes the pieces — server-side web search, vision-capable chat completions, image output, audio endpoints. They just need to be wired into the agent as callable tools.

pi-openrouter-multimodal

pi-openrouter-multimodal is a Pi extension that adds eight tools to the agent, every one of them proxied through OpenRouter. One install, one key, and the agent gets its multimodal surface back:

pi install npm:@dtmirizzi/pi-openrouter-multimodal

Tool	What it does
`web_search`	Server-side web search with real-time results
`web_fetch`	Fetch page content from a URL — web pages, docs, PDFs
`image_generate`	Text-to-image generation via chat completions
`image_understand`	Analyze images with a vision model
`video_understand`	Analyze video (YouTube links work with Gemini)
`pdf_read`	Extract and analyze PDF content
`tts_speak`	Text-to-speech via the `/audio/speech` endpoint
`stt_transcribe`	Speech-to-text via the `/audio/transcriptions` endpoint

Every tool is independently toggleable. If you only want web search and PDF reading, turn the other six off and they never enter the agent's tool list — which keeps the context lean and the agent from reaching for capabilities you didn't sanction. This is the same least-privilege instinct from the governance work: the tools that aren't present can't be misused.

How it actually works

There's no second account and no second SDK. Every tool resolves the OpenRouter key in priority order, so if your agent already talks to OpenRouter, the extension is configured the moment it installs:

the OPENROUTER_API_KEY environment variable
Pi's model registry, under the openrouter provider
~/.pi/agent/models.json, at providers.openrouter.apiKey

Under the hood, the tools are thin adapters over endpoints OpenRouter already exposes. Search and fetch use server-side tool execution. Image generation rides on chat completions with image output. Vision and video analysis pass media as content blocks to a vision-capable model. PDFs go through a file-parser. TTS and STT hit the dedicated /audio/speech and /audio/transcriptions endpoints. The agent sees eight clean tools; the gateway does the routing.

Picking a model per modality

The part I care most about: you choose the model for each modality independently. The extension queries OpenRouter for the live model list at startup, so the picker reflects what's actually available right now, not a hardcoded menu that rots. Two interactive overlays drive it:

/web-tools — toggle tools on or off and configure the search and fetch engines
/web-models — pick the model for each modality: image generation, vision, video, PDF, the TTS voice, and the STT model

So I can point image generation at Gemini Flash Image one week and FLUX.2 the next, run vision on whatever the strongest current model is, and parse PDFs with a free Cloudflare model or Mistral OCR depending on the document. Search and fetch can route through the native engine, Exa, Firecrawl, or parallel. None of those choices touch which model writes my code — they're orthogonal, exactly as they should be.

These settings persist. They survive session shutdown, context compaction, and tree navigation, so I configure once and the agent keeps the setup across every future session.^[2]

What this buys me

The whole stack is now one mental model: a single balance, a single key, and a model name I treat as a dial. The coding model, the vision model, and the image model can each be the best available option from whichever lab shipped it, with no extra subscription and no second app. When a new model lands on OpenRouter, it shows up in my picker the same day — no waiting for an app to wire it into a feature.

And it composes with everything else I run on Pi. The governance extension still intercepts every one of these tool calls: web_fetch hitting a sketchy URL, image_generate burning budget, an audio transcription leaking PII on output. Multimodal capability and governance are separate extensions over the same harness, which is the point — capabilities are tools, control is the layer underneath them.

The honest tradeoffs

This isn't free of friction. Going metered means you watch a balance instead of forgetting a flat fee; a runaway agent loop can spend real money, which is one more reason to run it behind budgets and governance. You give up the consumer-app polish — the streaming UI, the saved chats, the mobile client — in exchange for control. And a gateway adds a hop: you're trusting OpenRouter's routing and uptime, though in practice their failover has been more reliable than any single provider I'd otherwise depend on.

For agent-driven, multi-model work, those are easy trades. I'd rather own the harness and rent the tokens than rent the whole stack and own none of it.

Try it

If you're on Pi and you route through OpenRouter, it's one command and you keep your existing key:

pi install npm:@dtmirizzi/pi-openrouter-multimodal
# then, inside Pi:
/web-tools     # toggle the eight tools
/web-models    # pick a model per modality

The source, the full tool reference, and the configuration options are on GitHub. Issues and PRs welcome — I'm running this as my daily driver, so I'll be shipping fixes either way.

References

OpenRouter — a unified API and credit balance across hundreds of models from dozens of providers. openrouter.ai
pi-openrouter-multimodal — eight OpenRouter-backed multimodal tools for Pi coding agents (source, tool reference, and configuration). github.com/dtmirizzi/pi-openrouter-multimodal