Browser Runtime Capabilities¶

Comprehensive inventory of what runs client-side in 2026.

AI Inference¶

Models proven in-browser¶

Model	Task	Size	Speed	Library
Kokoro TTS (82M)	Text-to-speech	~160MB	Real-time	kokoro-js + ONNX RT
Whisper tiny	Speech-to-text	40MB	~1x realtime	Transformers.js
Whisper small	Speech-to-text (better)	240MB	~0.5x realtime	Transformers.js
Moonshine tiny (27M)	Speech-to-text	50MB	Fast	ONNX RT
Llama 3.2 1B	Text generation	~700MB	60 tok/s (WebGPU)	Transformers.js v4
Llama 3.2 3B	Text generation	~1.8GB	30 tok/s (WebGPU)	Transformers.js v4
Phi-3 mini	Text generation	~2GB	20-40 tok/s	WebLLM
Gemma 2B	Text generation	~1.5GB	30-50 tok/s	WebLLM
Stable Diffusion	Image generation	~2GB	10-30s/image	WebGPU
RMBG	Background removal	~50MB	<1s/image	Transformers.js
SAM (ViT-B)	Image segmentation	~200MB	<2s/image	ONNX RT
Florence-2	Image captioning, OCR	~500MB	<3s/image	Transformers.js
NLLB-200	Translation (200 langs)	~400MB	<1s/sentence	Transformers.js
Marian-MT	Translation (specific pairs)	100-200MB	Fast	Transformers.js
BART-small	Summarization	~150MB	<2s/paragraph	Transformers.js
T5-small	Summarization/Q&A	~120MB	<2s/paragraph	Transformers.js
all-MiniLM-L6	Embeddings/search	23MB	<50ms/query	Transformers.js
DistilBERT	Sentiment/classification	65MB	<100ms/text	Transformers.js
YOLO	Object detection	20-50MB	<200ms/image	ONNX RT
DPT	Depth estimation	~100MB	<1s/image	Transformers.js
AST	Audio classification	~80MB	<500ms/clip	Transformers.js

Runtime backends¶

Backend	When used	Performance	Browser support
WebGPU	GPU available	3-10x faster than WASM	Chrome/Edge 113+, Firefox 141+ (Windows first), Safari 26+
WASM (q8)	CPU fallback	Usable for <1B models	All modern browsers
WebNN	Native AI accelerator	Fastest (where available)	Chrome 130+ (limited)
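
The fallback order in the table above can be sketched as a small feature-detection helper. This is a minimal sketch rather than any library's API: `pickBackend` is a hypothetical name, and the navigator-like object is passed in as a parameter so the logic can be exercised outside a browser. WebNN is left out because its availability is still limited.

```javascript
// Hypothetical helper: choose an inference backend by feature detection.
// `nav` is a navigator-like object; in a page, call pickBackend(navigator).
async function pickBackend(nav) {
  // navigator.gpu exists only where the browser exposes WebGPU
  if (nav && nav.gpu) {
    try {
      // requestAdapter() resolves to null when no suitable GPU is available
      const adapter = await nav.gpu.requestAdapter();
      if (adapter) return 'webgpu';
    } catch (err) {
      // Treat any adapter failure as "no GPU" and fall through to WASM
    }
  }
  return 'wasm';
}
```

The returned string would typically be handed to the inference library as its device setting; the exact option name depends on the library and version in use.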

Model caching¶

Models download once from the Hugging Face CDN and are then stored via the Cache Storage API:

Survives browser restarts
Survives tab closes
Per-origin (each agent caches independently, or shared via same origin)
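Typical cache life: indefinite until the user clears browser data, although browsers may evict cached data under storage pressure unless the origin has been granted persistent storage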
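
The download-once behavior listed above can be sketched with the Cache Storage API directly. This is an illustrative sketch, not how any particular library implements its cache: `loadModel` and the cache name `models-v1` are hypothetical, and the `caches` and `fetch` objects are passed in as parameters so the function can be tested with stand-ins.

```javascript
// Hypothetical helper: return a model file from Cache Storage, downloading it only once.
// In a page, call loadModel(url, caches, fetch).
async function loadModel(url, cacheStorage, fetchFn, cacheName = 'models-v1') {
  const cache = await cacheStorage.open(cacheName);
  const hit = await cache.match(url);
  if (hit) return hit; // already downloaded on an earlier visit

  const response = await fetchFn(url);
  if (!response.ok) {
    throw new Error(`Model download failed with status ${response.status}`);
  }
  // A Response body can be read only once, so store a clone and return the original
  await cache.put(url, response.clone());
  return response;
}
```

Inference libraries usually perform this caching themselves, so an agent only needs code like this for files it fetches on its own.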

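Shared model opportunity: If multiple agents on *.freeagentstore.online need Whisper, the model could be downloaded and cached once instead of once per agent. Cache Storage is keyed by origin and each subdomain is a separate origin, so this does not happen automatically. Sharing requires the host worker to expose a model-cache service worker on one shared origin, with agents loading models through that origin rather than from their own caches.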

Node.js in Browser¶

WebContainers (StackBlitz)¶

Full Node.js runtime compiled to WebAssembly. Runs inside the browser with:

Virtual file system: read/write files
npm install: install most packages (native addons that must be compiled are not supported)
TCP networking: virtualized via ServiceWorker (local dev servers work; raw outbound sockets are not available)
Process spawning: child_process works
Build tools: webpack, vite, esbuild, tsc all run

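Requirements: SharedArrayBuffer, which is only available on cross-origin isolated pages (served with COOP/COEP headers). Works in Chrome, Edge, and Firefox. Safari support came later and has been labeled beta by StackBlitz, so verify it against the current WebContainers documentation before relying on it.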
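
The COOP/COEP requirement comes down to two response headers on the page that boots the runtime, plus a runtime check that isolation actually took effect. The sketch below assumes the host worker builds its response headers as a plain object; `withIsolationHeaders` and `canUseSharedMemory` are hypothetical helper names.

```javascript
// Headers that opt a page into cross-origin isolation, which enables SharedArrayBuffer.
const ISOLATION_HEADERS = {
  'Cross-Origin-Opener-Policy': 'same-origin',
  'Cross-Origin-Embedder-Policy': 'require-corp',
};

// Hypothetical helper: merge the isolation headers into a response's header map.
function withIsolationHeaders(headers = {}) {
  return { ...headers, ...ISOLATION_HEADERS };
}

// In the page: confirm isolation took effect before booting the runtime.
// `scope` defaults to the global object (window in a page).
function canUseSharedMemory(scope = globalThis) {
  return scope.crossOriginIsolated === true &&
    typeof scope.SharedArrayBuffer === 'function';
}
```

One consequence worth planning for: with `require-corp`, cross-origin resources such as CDN-hosted model files load only if they are served with CORS or CORP headers.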

Use cases for agents:

Agent type	What WebContainers enables
Code linter	Run ESLint/Biome on pasted code
Code formatter	Run Prettier on pasted code
Test runner	Run vitest/jest on pasted tests
Build tool	Compile TypeScript, bundle with esbuild
Package analyzer	`npm audit`, dependency graph
Markdown processor	Run remark/unified pipelines
Data transformer	Run jq-like transforms via Node
Static site generator	Run Eleventy/Astro in browser

Nodebox (CodeSandbox)¶

Alternative to WebContainers. Different tradeoffs:

	WebContainers	Nodebox
Safari support	Beta (verify current status)	Yes (beta)
SharedArrayBuffer	Required	Not required
npm install	Full	Full
Performance	Better	Good
TCP networking	Yes	Limited
Maturity	Production	Beta

Browser Automation (iframe-based)¶

Same-origin automation¶

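Agents have full DOM access to an iframe only when the framed page is same-origin (same scheme, host, and port). Two different subdomains of *.freeagentstore.online are same-site but cross-origin, so this works when the host worker serves the target under the agent's own origin; otherwise use one of the cross-origin options in the next section. For a same-origin target: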

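const iframe = document.createElement('iframe');
// Must resolve to the same origin as the parent page for DOM access to work
iframe.src = 'https://target.freeagentstore.online';
document.body.appendChild(iframe);

// Wait for the frame to finish loading before touching its DOM
iframe.addEventListener('load', () => {
  const doc = iframe.contentDocument; // null when the frame is cross-origin
  if (!doc) return;
  doc.querySelector('button')?.click();
  const input = doc.querySelector('input');
  if (input) {
    input.value = 'hello';
    // Setting .value fires no event, so notify listeners explicitly
    input.dispatchEvent(new Event('input', { bubbles: true }));
  }
  const text = doc.querySelector('.result')?.textContent;
});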

No Playwright needed. No Puppeteer needed. Just JavaScript.

Cross-origin automation (with proxy)¶

For external content, options:

CORS proxy — host worker proxies external page, serves as same-origin
postMessage bridge — if target cooperates, communicate via messages
Fetch + parse — fetch HTML, parse in-browser via DOMParser (no JS execution)

FAS Quality Reporter pattern¶

Already proven in production. The FAS Quality Reporter runs INSIDE an iframe and posts metrics to the parent via postMessage:

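// Inside iframe (agent being tested)
// '*' delivers the message to any parent; pass the dashboard's origin instead when it is known
window.parent.postMessage({
  type: 'fas:quality',
  viewport: { width: 393, height: 852 },
  document: { scrollsX: false, scrollsY: false },
  clipping: []
}, '*');

// Parent dashboard
window.addEventListener('message', (e) => {
  // Only accept reports from the iframe this dashboard created
  // (assumes `iframe` refers to that element)
  if (e.source !== iframe.contentWindow) return;
  if (e.data?.type === 'fas:quality') {
    // Process metrics
  }
});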

This pattern extends to agent-on-agent automation: one agent loads another in an iframe and orchestrates it.
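
When one agent orchestrates another, the receiving side should treat incoming messages as untrusted input. The sketch below validates the `fas:quality` message shape shown above before using it; `readQualityReport` and the allowed-origin list are hypothetical, and the field checks simply mirror the example payload rather than a published schema.

```javascript
// Hypothetical validator for the 'fas:quality' message shape.
// Returns the metrics if the sender is allowed and the payload is well-formed, else null.
function readQualityReport(event, allowedOrigins) {
  if (!allowedOrigins.includes(event.origin)) return null; // unexpected sender
  const data = event.data;
  if (!data || data.type !== 'fas:quality') return null;   // some other message

  const { viewport, document: doc, clipping } = data;
  const validViewport = Boolean(viewport) &&
    Number.isFinite(viewport.width) && Number.isFinite(viewport.height);
  const validDocument = Boolean(doc) &&
    typeof doc.scrollsX === 'boolean' && typeof doc.scrollsY === 'boolean';
  if (!validViewport || !validDocument || !Array.isArray(clipping)) return null;

  return { viewport, document: doc, clipping };
}
```

A dashboard would call this from its `message` listener and ignore any event for which it returns null.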

Puppeteer-in-browser¶

Puppeteer officially supports running from a browser context. The automation commands go to a separate browser that exposes a remote debugging endpoint. This requires the user to launch Chrome with remote debugging enabled (for example with the --remote-debugging-port flag), so it's a power-user feature, not a default path.

Local LLM Connection (Ollama)¶

How it works¶

User has Ollama installed locally (one-time setup)
Ollama runs at http://localhost:11434
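User sets OLLAMA_ORIGINS so that Ollama accepts cross-origin requests from the agent ("*" allows every origin; listing only the agent's origin is safer)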
Agent detects Ollama via fetch('http://localhost:11434/api/tags')
If available, agent uses local LLM for enhanced features

API (OpenAI-compatible)¶

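// Check Ollama availability
const available = await fetch('http://localhost:11434/api/tags')
  .then(r => r.ok).catch(() => false);

// Chat completion
const response = await fetch('http://localhost:11434/v1/chat/completions', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'llama3.2',
    messages: [{ role: 'user', content: 'Summarize this text...' }]
  })
});
// OpenAI-compatible response shape: the reply is in choices[0].message.content
const data = await response.json();
const reply = data.choices[0].message.content;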

What Ollama enables¶

Without Ollama	With Ollama
Whisper transcribes audio	+ LLM summarizes transcript
Background removal works	+ LLM describes the removed object
OCR extracts text	+ LLM answers questions about the document
Translation works	+ LLM adapts tone/formality
Code linting finds issues	+ LLM suggests fixes

Ollama is a power-user enhancement, not a requirement. Every agent must work without it.
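
One way to honor that rule is to compute the base result first and treat the LLM step as strictly optional. The sketch below assumes a hypothetical `askLLM` function (prompt in, text out) that is only defined when Ollama was detected; `withOptionalSummary` is likewise an illustrative name.

```javascript
// Hypothetical wrapper: return the base result, plus an LLM summary when one is available.
// `askLLM` is an assumed async function (prompt -> string), or null when Ollama was not detected.
async function withOptionalSummary(transcript, askLLM) {
  const result = { transcript, summary: null };
  if (typeof askLLM !== 'function') return result; // no Ollama: base result only

  try {
    result.summary = await askLLM(`Summarize this transcript:\n${transcript}`);
  } catch (err) {
    // Ollama stopped or the request failed: keep the base result
  }
  return result;
}
```

The same shape applies to the other rows of the table: the in-browser model produces the result, and the LLM only adds to it.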

Storage & Persistence¶

API	Use case	Capacity	Persistence
Cache Storage	Model files	Unlimited*	Survives restarts
IndexedDB	Structured results, history	Unlimited*	Survives restarts
OPFS	Large file processing	Unlimited*	Survives restarts
localStorage	Small settings	5-10MB	Survives restarts
File System Access	Read/write user's local files	User-granted	Session-based
Clipboard API	Copy results	N/A	N/A

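*"Unlimited" = browser-managed quota, typically several GB and often a fraction of free disk space. Writes fail once the quota is exceeded, and data may be evicted under storage pressure unless the origin requests persistence via navigator.storage.persist().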
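
Because the quota is browser-managed, an agent can check the remaining space before starting a large model download. The sketch below uses the standard `estimate()` method of the StorageManager; `canCacheModel` and the 10% headroom are hypothetical choices, and the storage object is a parameter so the arithmetic can be tested with a stand-in.

```javascript
// Hypothetical helper: check whether a model of `modelBytes` fits in the remaining quota.
// `storage` is a StorageManager-like object; in a page, pass navigator.storage.
async function canCacheModel(modelBytes, storage, headroom = 0.1) {
  const { usage = 0, quota = 0 } = await storage.estimate();
  const free = quota - usage;
  // Keep a fraction of the quota free for results and history
  return modelBytes <= free - quota * headroom;
}
```

The values from `estimate()` are approximations, so a failed check is better treated as a reason to warn the user than as a hard block.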

Offline capability¶

With Service Workers, agents can:

Cache the app shell (HTML/JS/CSS)
Cache the AI model
Run entirely offline after first load
Sync results when back online
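
The last item, syncing results once connectivity returns, can be sketched as a small outbox. This is a minimal in-memory sketch under stated assumptions: `Outbox` is a hypothetical class, `send` is an assumed async upload function, and a real agent would persist the queue (for example in IndexedDB) so it survives restarts.

```javascript
// Hypothetical in-memory outbox: queue results while offline, send them once online.
class Outbox {
  constructor(send) {
    this.send = send;   // assumed async function that uploads one result
    this.pending = [];
  }

  enqueue(result) {
    this.pending.push(result);
  }

  // Try to send everything; items that fail are re-queued for the next attempt.
  // Resolves to the number of items still waiting.
  async flush() {
    const batch = this.pending;
    this.pending = []; // results enqueued during the flush land in the new array
    for (const item of batch) {
      try {
        await this.send(item);
      } catch (err) {
        this.pending.push(item);
      }
    }
    return this.pending.length;
  }
}
```

In a page, the flush would typically be wired to the browser's `online` event, for example `window.addEventListener('online', () => outbox.flush())`.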

Cloud-based agent tools cannot match this: they stop working as soon as the connection drops.