Speed Reader AI — Overview

Speed Reader AI helps you read, comprehend, and retain information faster. It combines a rapid serial visual presentation (RSVP) reader, high‑quality text‑to‑speech, a Teleprompt mode for performance reading, and built‑in AI tools for summarization, quizzing, and structured study workflows.

Use the left sidebar to jump between topics. Type in the search box and press Enter to jump to the best matching section (e.g., “slider”, “WPM”, “quiz”, “teleprompt cadence”).

Quickstart

Open the app and paste text, upload a file, or extract from a URL.
Select Visual, Read‑to‑Me, or AI Features from the Mode selector.
Press Start (Visual) or Play (TTS) to begin. Adjust speed, chunking, and ORP color as needed.
Use Export to save progress or Export as MD to produce a Markdown study sheet.

Supported inputs

Paste text directly
Upload .txt, .pdf, .epub, .docx
Extract text from a URL
Capture text with OCR camera

Key outputs

RSVP visual reading with stats
Natural TTS audio with speed control
AI summaries, custom prompts, quizzes
Markdown export and audio packs

Controls reference

Speed slider (WPM): Sets the display rate for RSVP in Visual mode. Typical comfortable range: 250–450 WPM. Advanced readers: 600–800+ WPM. Increase gradually to maintain comprehension.
Words per chunk: Number of words shown per flash (1–5). 1–3 improves fixation and acuity at high WPM; 4–5 increases contextual coherence at lower WPM.
ORP color: Optimal Recognition Point highlight color. Increase contrast for better fixation at higher speeds.
TTS speed slider: Playback rate for Read‑to‑Me (0.25×–4.0×). For dense math/philosophy, use 0.75–1.25×; for lighter prose, 1.25–2.0×.
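
As a rough illustration of how the speed slider and chunk size interact, the sketch below converts a WPM setting into the time each flash stays on screen, assuming a uniform rate. The function name `chunk_interval_ms` is hypothetical and is not taken from the app.

```python
def chunk_interval_ms(wpm: int, words_per_chunk: int) -> float:
    """Milliseconds each chunk stays on screen at a uniform reading rate."""
    if wpm <= 0 or not 1 <= words_per_chunk <= 5:
        raise ValueError("wpm must be positive and words_per_chunk must be 1-5")
    # One minute is 60,000 ms; a chunk of n words uses n word slots.
    return 60_000 * words_per_chunk / wpm


for wpm, chunk in [(300, 1), (300, 3), (600, 1), (600, 2)]:
    ms = chunk_interval_ms(wpm, chunk)
    print(f"{wpm} WPM, {chunk} word(s) per flash: {ms:.0f} ms per flash")
```

At 300 WPM with one word per flash, each word is visible for 200 ms; doubling the speed to 600 WPM with two words per flash gives the same 200 ms per flash, which is one reason chunk size and WPM are best tuned together.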

Reading modes & cadence

RSVP minimizes saccades (eye jumps) by showing text at one fixed position and controlling its timing. Comprehension depends not only on WPM but also on cadence shaping: how pauses and bursts map to syntax.

Temporal chunking: Short function words tolerate higher cadence; content words benefit from micro‑pauses for integration.
Punctuation heuristics (guidance): Pause longer after sentence terminators (., !, ?), medium after semicolons/colons, brief after commas/parentheticals. While the app streams at uniform WPM, you can simulate cadence effects by lowering WPM for dense passages and using sentence navigation to replay.
Chunk size vs. WPM: At high WPM, use smaller chunks (1–3) to reduce masking; at lower WPM, larger chunks (4–5) restore prosodic feel.
Fixation stability: Maintain consistent ORP color/contrast to anchor micro‑saccades and reduce regressions.
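
The punctuation heuristics above can be made concrete with a small sketch. This only illustrates the idea, since the app itself streams at uniform WPM; the multipliers (2.0, 1.5, 1.25) and the function names are assumptions chosen for the example.

```python
# Hypothetical multipliers for illustration; the app itself streams at uniform WPM.
SENTENCE_END, CLAUSE_BREAK, BRIEF = 2.0, 1.5, 1.25


def pause_weight(token: str) -> float:
    """Return a delay multiplier based on the punctuation that ends a token."""
    core = token.rstrip("\"'”’)")  # look past closing quotes and parentheses
    if core.endswith((".", "!", "?")):
        return SENTENCE_END
    if core.endswith((";", ":")):
        return CLAUSE_BREAK
    if core.endswith(",") or token.endswith(")"):
        return BRIEF
    return 1.0


def schedule(text: str, wpm: int) -> list[tuple[str, float]]:
    """Pair each word with a display time in milliseconds."""
    base_ms = 60_000 / wpm
    return [(tok, base_ms * pause_weight(tok)) for tok in text.split()]


for word, ms in schedule("Wait, then read; stop.", wpm=300):
    print(f"{word:<6} {ms:.0f} ms")
```

Weighted pauses lower the effective WPM below the nominal setting, and a simple suffix check like this will also lengthen abbreviations such as "e.g.", so treat it as a starting point rather than a finished heuristic.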

Advanced practice: cycle 90–120 seconds at target WPM, then recover for 30–45 seconds at 20% below target WPM to consolidate gist without fatigue.
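
To plan a practice cycle, it can help to know roughly how many words each phase covers. The sketch below works that out from the guidance above, with the recovery rate set 20% below target; `practice_block` and its defaults are illustrative assumptions, not app settings.

```python
def practice_block(target_wpm: int, sprint_s: int = 120, recovery_s: int = 30) -> dict:
    """Approximate word counts for one sprint-plus-recovery cycle."""
    recovery_wpm = round(target_wpm * 0.8)  # 20% below the target rate
    return {
        "recovery_wpm": recovery_wpm,
        "sprint_words": target_wpm * sprint_s / 60,
        "recovery_words": recovery_wpm * recovery_s / 60,
    }


plan = practice_block(500)
print(plan)
```

With a 500 WPM target, a 120 second sprint covers about 1,000 words and a 30 second recovery at 400 WPM covers about 200, so a passage of roughly 1,200 words fills one full cycle.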

Visual speed reading

The visual reader displays one to five words at a time at a chosen WPM. Adjust chunk size, set the ORP color, and navigate sentence‑by‑sentence. A progress bar and live stats track words read, total words, time elapsed, and current WPM.

Controls: Start, Pause, Reset, Fullscreen, Teleprompt mode
Sentence navigation: Repeat current sentence, advance to next
Stats: Words Read, Total Words, Time Elapsed, Current WPM
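
The live stats can be derived from three values: words read, total words, and elapsed time. The sketch below shows one plausible calculation; how the app actually computes Current WPM is not documented here, so the averaging method and the `reading_stats` name are assumptions.

```python
def reading_stats(words_read: int, total_words: int, elapsed_s: float) -> dict:
    """Derive average WPM and progress from raw counters."""
    # Average rate over the whole session; guard against a zero-length session.
    current_wpm = words_read / (elapsed_s / 60) if elapsed_s > 0 else 0.0
    progress = words_read / total_words if total_words > 0 else 0.0
    return {"current_wpm": current_wpm, "progress": progress}


stats = reading_stats(words_read=450, total_words=1800, elapsed_s=90)
print(f"{stats['current_wpm']:.0f} WPM, {stats['progress']:.0%} complete")
```

A session average like this includes time spent paused or replaying sentences if those are counted in elapsed time, which would pull the figure below the slider setting.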

Teleprompt mode

Teleprompt presents the text as a large, smooth‑scrolling prompt for talks, presentations, or performance reading. While RSVP optimizes for speed at a fixed gaze, Teleprompt optimizes for delivery—breath, emphasis, and audience contact.

Purpose

Rehearsal: Practice scripts with consistent pacing and visual clarity.
Live delivery: Present with minimal cognitive load so attention can shift to tone and audience.

Cadence and phrasing

Cadence: Aim for 150–180 WPM conversationally; 120–140 WPM for technical/novice audiences. Use sentence boundaries and clause commas as micro‑pause anchors.
Phrasing: Break at semantic units (noun phrases, verb phrases) to maintain coherence. Avoid mid‑unit breaks that fragment meaning.
Breath: Plan inhale points every 10–15 seconds; longer lines demand earlier breaths. Balance respiratory cadence with rhetorical emphasis.
Prosody: Vary pitch and intensity at clause ends to signal structure. Technical density benefits from descending contours; persuasion can leverage rising contours on claims.
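
The cadence and breath guidance translates into simple arithmetic for planning a talk. The sketch below estimates delivery time and the number of words between breaths; both helper names are hypothetical, and the 1,500 word script is an example value.

```python
def talk_minutes(word_count: int, wpm: int) -> float:
    """Estimated delivery time in minutes at a steady speaking rate."""
    return word_count / wpm


def words_per_breath(wpm: int, seconds: float) -> float:
    """Approximate words spoken between inhale points."""
    return wpm * seconds / 60


script_words = 1500  # example script length
for label, wpm in [("conversational", 150), ("technical audience", 120)]:
    minutes = talk_minutes(script_words, wpm)
    low, high = words_per_breath(wpm, 10), words_per_breath(wpm, 15)
    print(f"{label}: {minutes:.1f} min, breath every {low:.0f}-{high:.0f} words")
```

Knowing the words-per-breath range makes it easier to place inhale points at clause boundaries when marking up a script.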

Typography and visibility

Use large high‑contrast fonts and generous line height for peripheral readability.
Keep line length 45–70 characters to reduce return‑sweep errors.
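
If you prepare a script outside the app, you can pre-wrap it to the recommended line length. The sketch below uses Python's standard `textwrap` module; the `wrap_for_prompt` helper and its 60 character default are assumptions, and the app's own rendering may wrap differently.

```python
import textwrap


def wrap_for_prompt(script: str, max_chars: int = 60) -> list[str]:
    """Wrap a script so that no line exceeds max_chars characters."""
    if not 45 <= max_chars <= 70:
        raise ValueError("max_chars should stay within the 45-70 range")
    return textwrap.wrap(script, width=max_chars)


script = (
    "Teleprompt presents the text as a large prompt for talks, "
    "presentations, or performance reading."
)
for line in wrap_for_prompt(script):
    print(f"{len(line):>2} | {line}")
```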

Workflow: draft → summarize with AI for talk outline → convert to Teleprompt → rehearse at 130–160 WPM → export notes or key terms.

Read‑to‑Me (TTS)

Generate high‑quality speech from your text using selectable voices and adjustable playback speed. The player supports play/pause/stop, sentence navigation, and track downloads.

Enter an API key for your selected provider
Choose a voice and speed (0.25× – 4.0×)
Style prompt: guide tone, pacing, and pronunciation
Export: Full Track and Audio Pack (zip)

AI features (state‑of‑the‑art study tooling)

Speed Reader AI integrates with multiple providers to synthesize, structure, and test knowledge.

Summarize Text

Abstractive focus: Produces concise, sectioned summaries capturing gist, claims, and caveats.
Structure: Uses headings, bullet hierarchies, and key takeaways for rapid review.

Custom Summary

Accepts style instructions (e.g., “for executives”, “Feynman explanation”, “compare X vs Y”).
Useful frameworks: SCQA, Claim‑Evidence‑Reasoning, Problem‑Approach‑Result‑Implications.

Quiz Generator

Active recall: Generates questions to retrieve knowledge, improving retention.
Cognitive coverage: Mix factual, conceptual, procedural, and conditional questions across Bloom’s taxonomy levels from Remember through Evaluate.
Calibration: Start with open‑ended prompts; add distractor choices to assess discrimination.

Configure provider, model, and API key in the AI Configuration panel within the app.

Document inputs

Paste text directly into the line‑numbered editor
Upload supported files: .txt, .pdf, .epub, .docx
Extract from a URL via the URL form
OCR camera for scanned pages and photos

OCR camera

Built on Tesseract.js, the OCR camera captures text from images with adjustable levels to improve recognition. Use it to ingest printed material or whiteboard notes quickly.

Key terms

Double‑click words to collect them into your Key Terms panel. Print the list or clear it as needed. For best retention, convert terms into Q/A pairs (define, compare, apply) and interleave across sessions.
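
One way to turn a collected term list into Q/A prompts is sketched below. The question templates and the `term_prompts` helper are illustrative assumptions; the app's Key Terms panel is not described as generating these itself.

```python
# Hypothetical prompt templates for the define / compare / apply pattern.
TEMPLATES = {
    "define": "Define {term}.",
    "compare": "Compare {term} with a related concept.",
    "apply": "Give an example where {term} applies.",
}


def term_prompts(terms: list[str]) -> list[str]:
    """Build prompts grouped by question type so consecutive items switch terms."""
    return [
        template.format(term=term)
        for template in TEMPLATES.values()
        for term in terms
    ]


for prompt in term_prompts(["saccade", "fixation"]):
    print(prompt)
```

Ordering by question type rather than by term gives a simple form of interleaving: each term comes back several prompts later instead of being drilled in one block.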

Chapter selection

For long documents, select chapters to process. Choose all or only the sections you want, then process the selected subset to focus your reading.

Export / Import

Export progress to JSON; import later to resume
Export as Markdown with headings, summaries, and notes
Audio Pack export/import for TTS assets

AI providers & models

Speed Reader AI supports multiple providers. The app populates model lists dynamically from configuration. The sections below summarize the options present in your configuration, including strengths, ideal use cases, and a pricing snapshot.

Pricing evolves quickly and may differ by account, region, or via aggregators like OpenRouter. Treat the costs below as a qualitative guide: “Free” means no per‑token API charge (e.g., OpenRouter free tier or local inference). “Provider‑priced” means billed by the provider; follow the link to confirm current rates.

OpenAI

Frontier multimodal models with strong instruction‑following, tool use, and reasoning. Choose smaller variants for lower latency and cost.

Model	Profile & Benefits	Pricing
`gpt-5`	Frontier general model for complex reasoning, synthesis, and agentic workflows. Best for highest‑quality summaries and nuanced analysis.	Standard per 1M tokens — Input $1.25, Cached $0.125, Output $10.00. Source: OpenAI
`gpt-5-chat`	Chat‑optimized variant of GPT‑5 tuned for dialogue, safety, and helpfulness. Ideal for interactive study assistants and Q/A.	Standard per 1M tokens — Input $1.25, Cached $0.125, Output $10.00. Source: OpenAI
`gpt-5-mini`	Latency/throughput‑oriented. Balanced quality for everyday summarization and quiz generation at lower cost.	Standard per 1M tokens — Input $0.25, Cached $0.025, Output $2.00. Source: OpenAI
`gpt-5-nano`	Cost‑efficient micro‑model for fast drafts, boilerplate, or light classification. Use when volume matters more than marginal quality.	Standard per 1M tokens — Input $0.05, Cached $0.005, Output $0.40. Source: OpenAI
`gpt-4.1-mini`	Well‑known value model: strong instruction following at low latency. Great for summaries and quizzes with predictable behavior.	Standard per 1M tokens — Input $0.40, Cached $0.10, Output $1.60. Source: OpenAI
`gpt-4o-mini`	Multimodal‑capable “omni” mini; excellent price/performance for text tasks. Recommended default for summaries/quiz when cost sensitive.	Standard per 1M tokens — Input $0.15, Cached $0.075, Output $0.60. Source: OpenAI
`gpt-4.1-nano`	Ultra‑light footprint for high‑volume requests with basic accuracy. Use for simple extractions and scaffolding.	Standard per 1M tokens — Input $0.10, Cached $0.025, Output $0.40. Source: OpenAI

Anthropic (Claude)

Strong at harmlessness and long‑form analysis with high factuality. Good for structured study materials and safer content generation.

Model	Profile & Benefits	Pricing
`claude-3-5-haiku-latest`	Fast, low‑cost. Solid quality for summarization and short quizzes. Ideal for large batch workloads.	Per 1M tokens — Input $0.80, Output $4.00; Prompt caching: Write $1.00, Read $0.08. Source: Anthropic
`claude-3-7-sonnet-latest`	Balanced flagship with excellent instruction following and helpful tone. Great for study guides and explanations.	Per 1M tokens — Input $3.00, Output $15.00 (typical Sonnet tier). Source: Anthropic
`claude-opus-4-20250514`	Top‑tier reasoning and nuanced synthesis. Best for complex, high‑stakes content.	Per 1M tokens — Input $15.00, Output $75.00; Prompt caching: Write $18.75, Read $1.50. Source: Anthropic
`claude-sonnet-4-20250514`	Updated Sonnet (Claude 4 family). Strong overall quality at lower cost than Opus.	Per 1M tokens — ≤200K: Input $3.00, Output $15.00; >200K: Input $6.00, Output $22.50; Caching ≤200K: Write $3.75, Read $0.30. Source: Anthropic

DeepSeek

Competitive performance with a reasoner variant for chain‑of‑thought and tool‑use heavy tasks.

Model	Profile & Benefits	Pricing
`deepseek-chat`	General chat model with good price/perf. Suitable for summaries and Q/A.	Standard per 1M tokens — Input (cache hit) $0.07, Input (cache miss) $0.27, Output $1.10. Off‑peak (UTC 16:30–00:30): hit $0.035, miss $0.135, output $0.550. Source: DeepSeek
`deepseek-reasoner`	Reasoning‑forward variant; better for multi‑step solutions and rubric‑aligned quizzes.	Standard per 1M tokens — Input (cache hit) $0.14, Input (cache miss) $0.55, Output $2.19. Off‑peak discounts apply as documented. Source: DeepSeek
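
Because the off‑peak window listed above (UTC 16:30–00:30) crosses midnight, checking it takes slightly more care than a simple range test. The sketch below shows one way to do that; treating the start as inclusive and the end as exclusive is an assumption, and `is_off_peak` is a hypothetical helper.

```python
from datetime import datetime, time, timezone

OFF_PEAK_START = time(16, 30)  # UTC, from the pricing snapshot
OFF_PEAK_END = time(0, 30)     # UTC, on the following day


def is_off_peak(moment: datetime) -> bool:
    """True if a timezone-aware datetime falls inside the off-peak window."""
    if moment.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    t = moment.astimezone(timezone.utc).time()
    # The window wraps past midnight, so it is the union of two ranges.
    return t >= OFF_PEAK_START or t < OFF_PEAK_END


print(is_off_peak(datetime(2025, 8, 15, 17, 0, tzinfo=timezone.utc)))
print(is_off_peak(datetime(2025, 8, 15, 12, 0, tzinfo=timezone.utc)))
```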

OpenRouter (aggregator)

Access many providers/models through a single API. Your config includes the following free or general routes:

Route	Profile & Benefits	Pricing
`openai/gpt-oss-20b:free`	Open‑source 20B class model via OpenRouter. Good for baseline summaries and lightweight tasks.	Free via OpenRouter
`google/gemini-2.0-flash-exp:free`	Gemini Flash experimental route. Fast responses; excellent for quick summaries and outline generation.	Free via OpenRouter
`moonshotai/kimi-k2:free`	Kimi K2 route for general chat and synthesis.	Free via OpenRouter
`tencent/hunyuan-a13b-instruct:free`	Instruction‑tuned mixture‑of‑experts model with about 13B active parameters; efficient for simple tasks and bulk processing.	Free via OpenRouter
`qwen/qwen3-235b-a22b-07-25:free`	Large Qwen variant via free route; strong general capabilities for summaries and Q/A.	Free via OpenRouter
`openai/gpt-oss-120b`	120B‑class OSS route. Higher quality than 20B; better long‑form synthesis.	Provider‑priced via OpenRouter (non‑free). See OpenRouter pricing.

Ollama (local)

Run local models with zero per‑token API cost and maximum data privacy. Performance depends on your CPU/GPU and chosen model size.

Cost: $0 per API call (local compute only).
Best when offline, cost‑sensitive, or requiring data residency.

Pricing snapshot as of 2025-08-15 (per 1M tokens, USD). Sources: OpenAI, Anthropic, DeepSeek, OpenRouter. Verify current rates before production. Tiers (Flex/Standard/Priority), prompt caching, and time‑of‑day discounts (DeepSeek) affect effective cost.
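
To compare options, you can estimate the cost of a single request from the per‑1M‑token rates. The sketch below uses three rows from the snapshot above (standard input and output rates only, with the cache‑miss rate for DeepSeek); the `estimate_cost` helper and the example token counts are assumptions, and it ignores caching, tiers, and off‑peak discounts.

```python
# (input USD, output USD) per 1M tokens, copied from the 2025-08-15 snapshot.
PRICES = {
    "gpt-5-mini": (0.25, 2.00),
    "claude-3-5-haiku-latest": (0.80, 4.00),
    "deepseek-chat": (0.27, 1.10),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Approximate USD cost of one request at standard, uncached rates."""
    input_rate, output_rate = PRICES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Example: summarizing a document of about 20,000 tokens into 1,000 tokens.
for model in PRICES:
    print(f"{model}: ${estimate_cost(model, 20_000, 1_000):.4f}")
```

For summarization the input side usually dominates token count, so the input rate often matters more than the headline output rate.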

// In-app steps:
// 1) Open AI Configuration
// 2) Select Provider (e.g., OpenAI)
// 3) Choose Model from the populated list
// 4) Paste API Key (if required)
// 5) Use Summarize / Custom Summary / Quiz in AI Features

Voices & speed

The TTS engine offers multiple voice options and playback speed control. Style prompt lets you specify tone, pacing, and pronunciation hints (e.g., acronyms, names).

Markdown exporter

Export your study materials in Markdown. The exporter structures headings and preserves lists for easy review in any Markdown viewer or notes app.
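
For a sense of what a structured Markdown study sheet can look like, the sketch below assembles one from a title, a summary, and a list of key terms. The layout and the `to_markdown` helper are assumptions made for illustration; the app's actual export format may differ.

```python
def to_markdown(title: str, summary: str, key_terms: list[str]) -> str:
    """Assemble a small study sheet with headings and a bullet list."""
    lines = [f"# {title}", "", "## Summary", "", summary, "", "## Key terms", ""]
    lines += [f"- {term}" for term in key_terms]
    return "\n".join(lines) + "\n"


sheet = to_markdown(
    title="RSVP basics",
    summary="RSVP shows words at a fixed position to reduce eye movement.",
    key_terms=["saccade", "fixation", "ORP"],
)
print(sheet)
```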

Readwise

Readwise integration helps consolidate your highlights and notes. Use the Readwise panel to sync summaries or key insights from your reading sessions.

Performance & accuracy

Use chunks of 1–3 words for maximum recognition at high WPM
Increase ORP contrast to maintain fixation
For OCR, ensure sharp images, good lighting, and straight alignment
Alternate sprint and recovery blocks to maintain comprehension at higher speeds

Shortcuts

ESC: exit fullscreen
Space: pause/resume (visual)
Arrow keys (where applicable): sentence navigation

Troubleshooting

If TTS won’t start, verify that a provider and model are selected and that the API key is valid
For PDFs that fail to parse, try OCR camera or export to .txt
If sidebar highlighting desyncs, refresh once to reset anchors

Privacy

API keys are entered locally in the app. Review your provider’s data retention policy. Exported files remain on your device unless you choose to share them.

Changelog (high‑level)

Visual reader with stats, sentence nav, fullscreen
Read‑to‑Me with voices, speed, and style prompts
AI features: summarize, custom summaries, quizzes
OCR camera with adjustable levels
Key terms, teleprompt mode, chapter selection
Markdown and audio pack export/import