Speed Reader AI — Overview
Speed Reader AI helps you read, comprehend, and retain information faster. It combines a rapid serial visual presentation (RSVP) reader, high‑quality text‑to‑speech (TTS), a Teleprompt mode for performance reading, and built‑in AI tools for summarization, quizzing, and structured study workflows.
Quickstart
- Open the app and paste text, upload a file, or extract from a URL.
- Select Visual, Read‑to‑Me, or AI Features from the Mode selector.
- Press Start (Visual) or Play (TTS) to begin. Adjust speed, chunking, and ORP color as needed.
- Use Export to save progress or Export as MD to produce a Markdown study sheet.
Supported inputs
- Paste text directly
- Upload .txt, .pdf, .epub, .docx
- Extract text from a URL
- Capture text with OCR camera
Key outputs
- RSVP visual reading with stats
- Natural TTS audio with speed control
- AI summaries, custom prompts, quizzes
- Markdown export and audio packs
Controls reference
- Speed slider (WPM): Sets the display rate for RSVP in Visual mode. Typical comfortable range: 250–450 WPM. Advanced readers: 600–800+ WPM. Increase gradually to maintain comprehension.
- Words per chunk: Number of words shown per flash (1–5). Chunks of 1–3 words improve fixation and acuity at high WPM; chunks of 4–5 words improve contextual coherence at lower WPM.
- ORP color: Optimal Recognition Point highlight color. Increase contrast for better fixation at higher speeds.
- TTS speed slider: Playback rate for Read‑to‑Me (0.25×–4.0×). For dense math/philosophy, use 0.75–1.25×; for lighter prose, 1.25–2.0×.
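The relationship between the speed slider and the chunk setting can be sketched with simple arithmetic. The function below is an illustration only, assuming a uniform display rate; `chunk_interval_ms` is a hypothetical name, not part of the app.

```python
def chunk_interval_ms(wpm: int, words_per_chunk: int = 1) -> float:
    """Milliseconds each chunk stays on screen at a uniform reading rate.

    Assumption: every chunk is shown for the same duration, so the
    interval is simply (60,000 ms per minute * words per chunk) / WPM.
    """
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    if not 1 <= words_per_chunk <= 5:
        raise ValueError("words_per_chunk must be between 1 and 5")
    return 60_000 * words_per_chunk / wpm


if __name__ == "__main__":
    # One word at 300 WPM and two words at 600 WPM flash at the same pace.
    print(chunk_interval_ms(300, 1))
    print(chunk_interval_ms(600, 2))
```

Doubling the chunk size at a fixed WPM doubles the time each flash stays visible, which is why larger chunks feel calmer at lower speeds.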
Reading modes & cadence
RSVP minimizes saccades (eye jumps) by keeping the text in a stable locus and controlling temporal cadence. Comprehension depends not only on WPM but on cadence shaping—how pauses and bursts map to syntax.
- Temporal chunking: Short function words tolerate higher cadence; content words benefit from micro‑pauses for integration.
- Punctuation heuristics (guidance): Pause longer after sentence terminators (., !, ?), medium after semicolons/colons, brief after commas/parentheticals. While the app streams at uniform WPM, you can simulate cadence effects by lowering WPM for dense passages and using sentence navigation to replay.
- Chunk size vs. WPM: At high WPM, use smaller chunks (1–2) to reduce masking; at lower WPM, larger chunks (3–5) restore prosodic feel.
- Fixation stability: Maintain consistent ORP color/contrast to anchor micro‑saccades and reduce regressions.
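The punctuation heuristics above can be expressed as per-word pause multipliers. This is a sketch of the idea only: the app streams at uniform WPM, and the multiplier values and function names here are illustrative assumptions.

```python
# Hypothetical multipliers; the guidance only says "longer", "medium", "brief".
SENTENCE_END = {".", "!", "?"}
CLAUSE_BREAK = {";", ":"}
MINOR_BREAK = {",", ")"}


def pause_multiplier(word: str) -> float:
    """Return how much longer than the base interval a word should be held."""
    last = word.rstrip("\"'")[-1:]  # ignore closing quotes; "" for empty input
    if last in SENTENCE_END:
        return 2.0
    if last in CLAUSE_BREAK:
        return 1.5
    if last in MINOR_BREAK:
        return 1.25
    return 1.0


def cadence_schedule(words: list[str], wpm: int) -> list[tuple[str, float]]:
    """Pair each word with a display time in milliseconds."""
    base_ms = 60_000 / wpm
    return [(w, base_ms * pause_multiplier(w)) for w in words]


if __name__ == "__main__":
    for word, ms in cadence_schedule("Wait, listen: it works.".split(), 300):
        print(f"{word:10s} {ms:6.1f} ms")
```

Note that shaping cadence this way lowers the effective WPM slightly, since every pause adds time without adding words.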
Visual speed reading
The visual reader displays one to five words at a time at a chosen WPM. Adjust chunk size, set the ORP color, and navigate sentence‑by‑sentence. A progress bar and live stats track words read, total words, time elapsed, and current WPM.
- Controls: Start, Pause, Reset, Fullscreen, Teleprompt mode
- Sentence navigation: Repeat current sentence, advance to next
- Stats: Words Read, Total Words, Time Elapsed, Current WPM
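The live stats can be derived from three numbers. The sketch below assumes that Current WPM is the measured average (words read divided by elapsed minutes); the app may instead report the slider setting, and `ReadingStats` is a hypothetical name.

```python
from dataclasses import dataclass


@dataclass
class ReadingStats:
    words_read: int
    total_words: int
    elapsed_s: float

    @property
    def current_wpm(self) -> float:
        """Measured average reading rate so far (0.0 before the timer starts)."""
        if self.elapsed_s <= 0:
            return 0.0
        return self.words_read / (self.elapsed_s / 60)

    @property
    def progress(self) -> float:
        """Fraction of the text completed, from 0.0 to 1.0."""
        if self.total_words <= 0:
            return 0.0
        return min(self.words_read / self.total_words, 1.0)


if __name__ == "__main__":
    stats = ReadingStats(words_read=150, total_words=600, elapsed_s=30)
    print(stats.current_wpm, stats.progress)
```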
Teleprompt mode
Teleprompt presents the text as a large, smooth‑scrolling prompt for talks, presentations, or performance reading. While RSVP optimizes for speed at a fixed gaze, Teleprompt optimizes for delivery—breath, emphasis, and audience contact.
Purpose
- Rehearsal: Practice scripts with consistent pacing and visual clarity.
- Live delivery: Present with minimal cognitive load so attention can shift to tone and audience.
Cadence and phrasing
- Cadence: Aim for 150–180 WPM conversationally; 120–140 WPM for technical/novice audiences. Use sentence boundaries and clause commas as micro‑pause anchors.
- Phrasing: Break at semantic units (noun phrases, verb phrases) to maintain coherence. Avoid mid‑unit breaks that fragment meaning.
- Breath: Plan inhale points every 10–15 seconds; longer lines demand earlier breaths. Balance respiratory cadence with rhetorical emphasis.
- Prosody: Vary pitch and intensity at clause ends to signal structure. Technical density benefits from descending contours; persuasion can leverage rising contours on claims.
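The cadence and breath guidance above translates into quick planning arithmetic. This is a minimal sketch with hypothetical helper names; it only restates the WPM and 10–15 second figures as word counts.

```python
def speaking_time_s(word_count: int, wpm: int) -> float:
    """Estimated delivery time in seconds for a script at a steady pace."""
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    return word_count / wpm * 60


def breath_window_words(wpm: int, min_s: int = 10, max_s: int = 15) -> tuple[int, int]:
    """Approximate number of words spoken between planned inhale points."""
    return (wpm * min_s // 60, wpm * max_s // 60)


if __name__ == "__main__":
    # A 900-word talk at a conversational 150 WPM.
    print(speaking_time_s(900, 150) / 60, "minutes")
    print(breath_window_words(150), "words between breaths")
```

Marking the script every 25 to 37 words at 150 WPM gives a starting point; actual breath points should still fall on clause or sentence boundaries.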
Typography and visibility
- Use large high‑contrast fonts and generous line height for peripheral readability.
- Keep line length 45–70 characters to reduce return‑sweep errors.
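To preview how a script breaks at a given line length, Python's standard `textwrap` module is enough. This is a sketch for preparing text outside the app; `wrap_for_prompt` is a hypothetical helper, and the 60-character default is just a midpoint of the 45–70 range.

```python
import textwrap


def wrap_for_prompt(script: str, width: int = 60) -> list[str]:
    """Split a script into lines no longer than `width` characters."""
    if not 45 <= width <= 70:
        raise ValueError("width should stay within 45-70 characters")
    return textwrap.wrap(script, width=width)


if __name__ == "__main__":
    sample = (
        "Teleprompt presents the text as a large, smooth-scrolling prompt "
        "for talks, presentations, or performance reading."
    )
    for line in wrap_for_prompt(sample):
        print(line)
```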
Read‑to‑Me (TTS)
Generate high‑quality speech from your text using selectable voices and adjustable playback speed. The player supports play/pause/stop, sentence navigation, and track downloads.
- Enter an API key for your selected provider
- Choose a voice and speed (0.25× – 4.0×)
- Style prompt: guide tone, pacing, and pronunciation
- Export: Full Track and Audio Pack (zip)
AI features (state‑of‑the‑art study tooling)
Speed Reader AI integrates with multiple providers to synthesize, structure, and test knowledge.
Summarize Text
- Abstractive focus: Produces concise, sectioned summaries capturing gist, claims, and caveats.
- Structure: Uses headings, bullet hierarchies, and key takeaways for rapid review.
Custom Summary
- Accepts style instructions (e.g., “for executives”, “Feynman explanation”, “compare X vs Y”).
- Useful frameworks: SCQA, Claim‑Evidence‑Reasoning, Problem‑Approach‑Result‑Implications.
Quiz Generator
- Active recall: Generates questions to retrieve knowledge, improving retention.
- Cognitive coverage: Mix factual, conceptual, procedural, and conditional questions (Bloom’s taxonomy from Remember→Evaluate).
- Calibration: Start with open‑ended prompts; add distractor choices to assess discrimination.
Document inputs
- Paste text directly into the editor with line numbers
- Upload supported files: .txt, .pdf, .epub, .docx
- Extract from a URL via the URL form
- OCR camera for scanned pages and photos
OCR camera
Built on Tesseract.js, the OCR camera captures text from images with adjustable levels to improve recognition. Use it to ingest printed material or whiteboard notes quickly.
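The text does not say exactly what the adjustable levels do. One common interpretation is a black point, white point, and gamma adjustment applied to grayscale pixels before recognition. The sketch below shows that technique on plain integers, as an assumption rather than the app's actual preprocessing.

```python
def apply_levels(pixels: list[int], black: int = 0, white: int = 255,
                 gamma: float = 1.0) -> list[int]:
    """Remap 8-bit grayscale values so `black` maps to 0 and `white` to 255.

    Values outside [black, white] are clipped; gamma > 1 brightens midtones.
    """
    if not 0 <= black < white <= 255:
        raise ValueError("require 0 <= black < white <= 255")
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    span = white - black
    out = []
    for value in pixels:
        t = min(max((value - black) / span, 0.0), 1.0)  # normalize and clip
        out.append(round(255 * t ** (1.0 / gamma)))
    return out


if __name__ == "__main__":
    # Raising the black point and lowering the white point boosts contrast.
    print(apply_levels([0, 50, 128, 200, 255], black=50, white=200))
```

Stretching contrast this way tends to help with faint print or grey whiteboard marker, at the cost of clipping detail in very dark or very bright regions.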
Key terms
Double‑click words to collect them into your Key Terms panel. Print the list or clear it as needed. For best retention, convert terms into Q/A pairs (define, compare, apply) and interleave across sessions.
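Turning collected terms into define/compare/apply questions and mixing them can be sketched as follows. The templates and function names are hypothetical, and the round-robin ordering is one simple way to keep questions about the same term apart.

```python
from itertools import zip_longest

# Hypothetical question templates for the define / compare / apply pattern.
QUESTION_TEMPLATES = [
    "Define: {term}",
    "Compare {term} with a closely related idea.",
    "Apply {term} to a concrete example.",
]


def questions_for(term: str) -> list[str]:
    """Build the three question prompts for one key term."""
    return [template.format(term=term) for template in QUESTION_TEMPLATES]


def interleave(decks: list[list[str]]) -> list[str]:
    """Round-robin across decks so consecutive questions change topic."""
    missing = object()
    ordered = []
    for round_ in zip_longest(*decks, fillvalue=missing):
        ordered.extend(q for q in round_ if q is not missing)
    return ordered


if __name__ == "__main__":
    decks = [questions_for(t) for t in ("saccade", "fixation")]
    for question in interleave(decks):
        print(question)
```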
Chapter selection
For long documents, select chapters to process. Choose all or only the sections you want, then process the selected subset to focus your reading.
Export / Import
- Export progress to JSON; import later to resume
- Export as Markdown with headings, summaries, and notes
- Audio Pack export/import for TTS assets
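The JSON schema used by the app is not documented here. As an illustration of the export-then-resume round trip, the sketch below uses a hypothetical schema (`version`, `word_index`, `wpm`, `words_per_chunk`); real exported files may use different fields.

```python
import json

# Hypothetical field names; the app's actual export format may differ.
REQUIRED_KEYS = {"version", "word_index", "wpm", "words_per_chunk"}


def export_progress(word_index: int, wpm: int, words_per_chunk: int) -> str:
    """Serialize the reading position and settings to a JSON string."""
    state = {
        "version": 1,
        "word_index": word_index,
        "wpm": wpm,
        "words_per_chunk": words_per_chunk,
    }
    return json.dumps(state, indent=2)


def import_progress(raw: str) -> dict:
    """Parse a saved state and reject files with missing fields."""
    state = json.loads(raw)
    if not isinstance(state, dict):
        raise ValueError("progress file must contain a JSON object")
    missing = REQUIRED_KEYS - set(state)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return state


if __name__ == "__main__":
    saved = export_progress(word_index=1240, wpm=350, words_per_chunk=2)
    print(import_progress(saved)["word_index"])
```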
AI providers & models
Speed Reader AI supports multiple providers. The app populates model lists dynamically from configuration. The tables below summarize the options present in your configuration: strengths, ideal use cases, and a pricing snapshot.
OpenAI
Frontier multimodal models with strong instruction‑following, tool use, and reasoning. Choose smaller variants for lower latency and cost.
| Model | Profile & Benefits | Pricing |
|---|---|---|
| gpt-5 | Frontier general model for complex reasoning, synthesis, and agentic workflows. Best for highest‑quality summaries and nuanced analysis. | Standard per 1M tokens: Input $1.25, Cached $0.125, Output $10.00. Source: OpenAI |
| gpt-5-chat | Chat‑optimized variant of GPT‑5 tuned for dialogue, safety, and helpfulness. Ideal for interactive study assistants and Q/A. | Standard per 1M tokens: Input $1.25, Cached $0.125, Output $10.00. Source: OpenAI |
| gpt-5-mini | Latency/throughput‑oriented. Balanced quality for everyday summarization and quiz generation at lower cost. | Standard per 1M tokens: Input $0.25, Cached $0.025, Output $2.00. Source: OpenAI |
| gpt-5-nano | Cost‑efficient micro‑model for fast drafts, boilerplate, or light classification. Use when volume matters more than marginal quality. | Standard per 1M tokens: Input $0.05, Cached $0.005, Output $0.40. Source: OpenAI |
| gpt-4.1-mini | Well‑known value model: strong instruction following at low latency. Great for summaries and quizzes with predictable behavior. | Standard per 1M tokens: Input $0.40, Cached $0.10, Output $1.60. Source: OpenAI |
| gpt-4o-mini | Multimodal‑capable “omni” mini; excellent price/performance for text tasks. Recommended default for summaries/quiz when cost sensitive. | Standard per 1M tokens: Input $0.15, Cached $0.075, Output $0.60. Source: OpenAI |
| gpt-4.1-nano | Ultra‑light footprint for high‑volume requests with basic accuracy. Use for simple extractions and scaffolding. | Standard per 1M tokens: Input $0.10, Cached $0.025, Output $0.40. Source: OpenAI |
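Per-1M-token prices convert to a per-request estimate with one line of arithmetic. The sketch below copies a few input and output prices from the OpenAI entries above; it ignores cached-input discounts, and `estimate_cost_usd` is a hypothetical helper.

```python
# (input, output) USD per 1M tokens, copied from the pricing snapshot above.
PRICES_PER_1M = {
    "gpt-5": (1.25, 10.00),
    "gpt-5-mini": (0.25, 2.00),
    "gpt-4o-mini": (0.15, 0.60),
}


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request, without cached-input discounts."""
    input_price, output_price = PRICES_PER_1M[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Summarizing a 20,000-token chapter into a 1,500-token summary.
    print(f"${estimate_cost_usd('gpt-5-mini', 20_000, 1_500):.4f}")
```

Because output tokens cost several times more than input tokens on these models, shorter summaries reduce cost more than trimming the source text does.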
Anthropic (Claude)
Strong at harmlessness and long‑form analysis with high factuality. Good for structured study materials and safer content generation.
| Model | Profile & Benefits | Pricing |
|---|---|---|
| claude-3-5-haiku-latest | Fast, low‑cost. Solid quality for summarization and short quizzes. Ideal for large batch workloads. | Per 1M tokens: Input $0.80, Output $4.00; Prompt caching: Write $1.00, Read $0.08. Source: Anthropic |
| claude-3-7-sonnet-latest | Balanced flagship with excellent instruction following and helpful tone. Great for study guides and explanations. | Per 1M tokens: Input $3.00, Output $15.00 (typical Sonnet tier). Source: Anthropic |
| claude-opus-4-20250514 | Top‑tier reasoning and nuanced synthesis. Best for complex, high‑stakes content. | Per 1M tokens: Input $15.00, Output $75.00; Prompt caching: Write $18.75, Read $1.50. Source: Anthropic |
| claude-sonnet-4-20250514 | Updated Sonnet (Claude 4 family). Strong overall quality at lower cost than Opus. | Per 1M tokens: ≤200K: Input $3.00, Output $15.00; >200K: Input $6.00, Output $22.50; Caching ≤200K: Write $3.75, Read $0.30. Source: Anthropic |
DeepSeek
Competitive performance with a reasoner variant for chain‑of‑thought and tool‑use heavy tasks.
| Model | Profile & Benefits | Pricing |
|---|---|---|
| deepseek-chat | General chat model with good price/perf. Suitable for summaries and Q/A. | Standard per 1M tokens: Input (cache hit) $0.07, Input (cache miss) $0.27, Output $1.10. Off‑peak (UTC 16:30–00:30): hit $0.035, miss $0.135, output $0.550. Source: DeepSeek |
| deepseek-reasoner | Reasoning‑forward variant; better for multi‑step solutions and rubric‑aligned quizzes. | Standard per 1M tokens: Input (cache hit) $0.14, Input (cache miss) $0.55, Output $2.19. Off‑peak discounts apply as documented. Source: DeepSeek |
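The off‑peak window listed above (UTC 16:30–00:30) crosses midnight, so a simple "start <= t < end" comparison does not work. The sketch below shows one way to check it, assuming the end of the window is exclusive; `is_off_peak` is a hypothetical helper.

```python
from datetime import datetime, time, timezone

OFF_PEAK_START = time(16, 30)  # UTC
OFF_PEAK_END = time(0, 30)     # UTC, on the following day


def is_off_peak(moment: datetime) -> bool:
    """True if a timezone-aware datetime falls in the UTC 16:30-00:30 window."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    t = moment.astimezone(timezone.utc).time()
    # The window wraps past midnight, so either side of the wrap qualifies.
    return t >= OFF_PEAK_START or t < OFF_PEAK_END


if __name__ == "__main__":
    print(is_off_peak(datetime.now(timezone.utc)))
```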
OpenRouter (aggregator)
Access many providers/models through a single API. Your config includes the following free or general routes:
| Route | Profile & Benefits | Pricing |
|---|---|---|
| openai/gpt-oss-20b:free | Open‑source 20B class model via OpenRouter. Good for baseline summaries and lightweight tasks. | Free via OpenRouter |
| google/gemini-2.0-flash-exp:free | Gemini Flash experimental route. Fast responses; excellent for quick summaries and outline generation. | Free via OpenRouter |
| moonshotai/kimi-k2:free | Kimi K2 route for general chat and synthesis. | Free via OpenRouter |
| tencent/hunyuan-a13b-instruct:free | Instruction‑tuned mixture‑of‑experts model with about 13B active parameters; efficient for simple tasks and bulk processing. | Free via OpenRouter |
| qwen/qwen3-235b-a22b-07-25:free | Large Qwen variant via free route; strong general capabilities for summaries and Q/A. | Free via OpenRouter |
| openai/gpt-oss-120b | 120B‑class OSS route. Higher quality than 20B; better long‑form synthesis. | Provider‑priced via OpenRouter (non‑free). See OpenRouter pricing. |
Ollama (local)
Run local models with zero per‑token API cost and maximum data privacy. Performance depends on your CPU/GPU and chosen model size.
- Cost: $0 per API call (local compute only).
- Best when offline, cost‑sensitive, or requiring data residency.
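For reference, a local Ollama server exposes an HTTP API on port 11434 by default, and its `/api/generate` endpoint accepts a JSON body with `model`, `prompt`, and `stream`. The sketch below calls it directly from Python; how the app itself talks to Ollama is not documented here, and the model name `llama3.2` is only an example of a model you might have pulled.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address


def build_request(prompt: str, model: str = "llama3.2") -> urllib.request.Request:
    """Create a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize_locally(text: str, model: str = "llama3.2") -> str:
    """Send the text to the local model and return the generated summary."""
    request = build_request(f"Summarize the following text:\n\n{text}", model)
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read())["response"]


if __name__ == "__main__":
    # Requires a running Ollama server with the chosen model available.
    print(summarize_locally("RSVP minimizes saccades by keeping text in place."))
```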
In‑app steps
1. Open AI Configuration
2. Select Provider (e.g., OpenAI)
3. Choose Model from the populated list
4. Paste API Key (if required)
5. Use Summarize / Custom Summary / Quiz in AI Features
Voices & speed
The TTS engine offers multiple voice options and playback speed control. Style prompt lets you specify tone, pacing, and pronunciation hints (e.g., acronyms, names).
Markdown exporter
Export your study materials in Markdown. The exporter structures headings and preserves lists for easy review in any Markdown viewer or notes app.
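The kind of output described above can be approximated in a few lines. The layout below (one top-level title, second-level headings, bullet lists) is an assumed structure for illustration; the app's actual export may be organized differently, and `to_markdown` is a hypothetical helper.

```python
def to_markdown(title: str, sections: dict[str, list[str]]) -> str:
    """Render a title plus named sections of bullet points as Markdown."""
    lines = [f"# {title}", ""]
    for heading, items in sections.items():
        lines.append(f"## {heading}")
        lines.extend(f"- {item}" for item in items)
        lines.append("")  # blank line between sections
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    sheet = to_markdown(
        "Chapter 3 study sheet",
        {
            "Summary": ["RSVP keeps text at a fixed position."],
            "Notes": ["Review key terms tomorrow."],
        },
    )
    print(sheet)
```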
Readwise
Readwise integration helps consolidate your highlights and notes. Use the Readwise panel to sync summaries or key insights from your reading sessions.
Performance & accuracy
- Use chunks of 1–3 words for maximum recognition at high WPM
- Increase ORP contrast to maintain fixation
- For OCR, ensure sharp images, good lighting, and straight alignment
- Alternate sprint and recovery blocks to maintain comprehension at higher speeds
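A sprint and recovery session can be planned as a simple alternating schedule. The block lengths and the 25% speed increase below are illustrative assumptions, not recommendations from the app; `interval_plan` is a hypothetical helper.

```python
def interval_plan(base_wpm: int, rounds: int, sprint_factor: float = 1.25,
                  sprint_s: int = 120, recovery_s: int = 60) -> list[tuple[str, int, int]]:
    """Return (label, wpm, seconds) blocks alternating sprint and recovery."""
    plan = []
    for _ in range(rounds):
        plan.append(("sprint", round(base_wpm * sprint_factor), sprint_s))
        plan.append(("recovery", base_wpm, recovery_s))
    return plan


if __name__ == "__main__":
    for label, wpm, seconds in interval_plan(base_wpm=400, rounds=3):
        print(f"{label:8s} {wpm} WPM for {seconds} s")
```

Set the speed slider manually at each block boundary, and drop the sprint factor if comprehension falls during recovery blocks.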
Shortcuts
- ESC: exit fullscreen
- Space: pause/resume (visual)
- Arrow keys (where applicable): sentence navigation
Troubleshooting
- If TTS won’t start, verify provider, model, and a valid API key
- For PDFs that fail to parse, try OCR camera or export to .txt
- If sidebar highlighting desyncs, refresh once to reset anchors
Privacy
API keys are entered locally in the app. Review your provider’s data retention policy. Exported files remain on your device unless you choose to share them.
Changelog (high‑level)
- Visual reader with stats, sentence nav, fullscreen
- Read‑to‑Me with voices, speed, and style prompts
- AI features: summarize, custom summaries, quizzes
- OCR camera with adjustable levels
- Key terms, teleprompt mode, chapter selection
- Markdown and audio pack export/import