Can AI agents run on local models with Ollama?

Yes. Local models served by Ollama handle triage, drafting, summarization, and other routine work at zero marginal cost, with data never leaving your machine. Pair them with a frontier model for deep reasoning and you get most of the quality at a fraction of the spend.

Can I mix Claude, OpenAI, and local models in one workflow?

In Action Plan, yes — routing is per task, not per app. A single workflow can draft on a local Ollama model, review on Claude, and format on a cheap API model, with every call recorded in one ledger and the routing decision visible and overridable.

Which tasks need a frontier model and which don't?

Architecture, complex multi-step reasoning, and high-stakes writing benefit from frontier models like Claude or OpenAI's best. Classification, triage, routine drafts, extraction, and summaries usually don't — they run well locally. The mistake is sending everything to one model in either direction.

A conversation with the Action Plan team

Can I choose which AI models my agents use — including local ones?

You

I want privacy-sensitive work on local models — and honestly, free is nice — but I still want Claude-level quality for the hard stuff. Every tool I try locks me to one provider and one bill.

Lexi Orchestrator

You should never be locked to one model. The right setup routes each task to the cheapest model that can do it well — local Ollama models for routine and private work, frontier models like Claude for deep reasoning — with every decision visible and overridable. Model choice is a routing policy, and you should own it.

Arjun AI Research Architect

Let me teach the principle, because once you see it you can't unsee it: model fit is per task, not per product. Sending everything to a frontier model overpays for routine work; sending everything to a local model under-delivers on hard work. Both mistakes come from treating "which AI do we use?" as one decision. It's hundreds of small ones — which is exactly why software should make them, under a policy you set.

Arjun AI Research Architect

The decision matrix I'd give anyone, for any stack:

Local models via Ollama — triage, classification, routine drafts, summaries, formatting. Zero marginal cost, and the data never leaves your machine. The default for private material.
Cheap API tiers — bulk content, structured extraction, first-pass code review. Pennies, fast.
Frontier models (Claude, OpenAI's best) — architecture, complex multi-step reasoning, high-stakes writing, final reviews. Worth every token for those tasks.

Two override rules: privacy beats cost (sensitive data stays local even when an API would be better), and error cost beats token cost (if a cheap model's mistakes are expensive to fix, it wasn't cheap).

Lexi Orchestrator

In Action Plan this matrix is live policy, not a wiki page. One workflow mixes providers freely — and the ledger shows what the routing saved:

Weekly customer report Routing

LOCAL Summarize 40 conversations — Ollama on your machine · $0.00
API Extract themes to structured data — cheap tier · $0.06
FRNT Final narrative + recommendations — Claude · est. $0.42

Est. total $0.48 · all-frontier would be $3.10 · routing visible, overridable per step

Bring your Claude or OpenAI subscription, API keys, or nothing but a machine that runs Ollama — all three work, together, and switching providers later doesn't mean rebuilding your workflows.

Arjun AI Research Architect

Honest expectation: local models have real limits. A 8–14B parameter model on your laptop is superb at routine language work and unconvincing at novel reasoning. The win isn't pretending otherwise — it's that the router knows the difference, and you can audit its choices.

📌 Pinned by Arjun — the short version

The model-routing decision matrix

Local (Ollama, zero marginal cost, private): triage, classification, routine drafts, summaries, formatting.
Cheap API tier: bulk content, structured extraction, first-pass review.
Frontier (Claude / OpenAI): architecture, complex reasoning, high-stakes writing, final review.

Two override rules

Privacy beats cost — sensitive data stays local even when an API model would do it better.
Error cost beats token cost — a cheap model whose mistakes are expensive to fix wasn't cheap.

Can I mix providers in one workflow?

In Action Plan, yes: draft locally, review on Claude, format on a cheap tier — per task, in one workflow, with one ledger and overridable routing.

Action Plan is in private preview. When your invite opens, onboarding starts from exactly this conversation.

← Previous problemMy AI forgets — and makes things up Next problem →Get email off my plate