Building a Word-Level Quran Viewer with Layered Data: How open-quran-view Enables Rich Quranic Experiences
The Quran is text. But the Quran is also recitation, interpretation, and lived practice. A static page displaying Arabic text misses all of that.
I wanted to build something that kept the Arabic text front and center while layering word-level recitation timing and word-level tafseer on top. The result is a multi-layer mushaf architecture in which the view and the recitation stay synchronized: as the audio plays, the word being recited is highlighted in real time, and tapping any word surfaces its tafsir.
Here's how it works.
The Base: open-quran-view
The foundation is open-quran-view, an npm package I built that renders Quran pages in React and on the web. It takes care of the hard parts: positioning Arabic text across a page, handling different mushaf layouts (Hafs QCF V2, Hafs QCF V4, Hafs Unicode), and exposing word-level interactions via callbacks like onWordClick.
The component API is minimal:
    interface OpenQuranViewProps {
      page: number;
      width: number;
      height: number;
      theme: "light" | "dark";
      mushafLayout?: "hafs-v2" | "hafs-v4" | "hafs-unicode";
      onPageChange?: (page: number) => void;
      onWordClick?: (word: {
        id: number;
        surahNumber?: number;
        ayahNumber?: number;
      }) => void;
      onLoad?: (layout: unknown) => void;
    }
It renders the page. That's the base layer.
Layer 1: Word-Level Recitation from Quran Foundation API
The Quran Foundation API (api.quran.com) supports a segments=true parameter that returns word-level timing data alongside audio recitations.
Call https://api.quran.com/api/qdc/audio/reciters/{id}/audio_files?chapter={surah}&segments=true and you get back timestamps for every word in the chapter. Each word has a start time and end time in milliseconds.
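As a rough sketch of consuming that endpoint, the snippet below builds the request URL and flattens the parsed JSON into one sorted list of word timings. The response field names (audio_files, verse_timings, verse_key, segments) and the [wordPosition, startMs, endMs] order inside each segment are assumptions on my part, not something this article specifies, so check them against a live response before relying on them. The helper names are hypothetical.

```typescript
// ASSUMED response shape: verify the field names against a real response.
interface VerseTiming {
  verse_key: string;    // assumed format "surah:ayah", e.g. "1:1"
  segments: number[][]; // assumed: [wordPosition, startMs, endMs]
}

interface AudioFilesResponse {
  audio_files: { audio_url: string; verse_timings: VerseTiming[] }[];
}

interface WordTiming {
  surah: number;
  ayah: number;
  position: number; // word position within the ayah
  startMs: number;
  endMs: number;
}

// Builds the URL quoted in the text for one reciter and one surah.
function buildSegmentsUrl(reciterId: number, surah: number): string {
  return `https://api.quran.com/api/qdc/audio/reciters/${reciterId}/audio_files?chapter=${surah}&segments=true`;
}

// Flattens per-verse segments into a single list sorted by start time.
function flattenTimings(response: AudioFilesResponse): WordTiming[] {
  const words: WordTiming[] = [];
  for (const file of response.audio_files) {
    for (const verse of file.verse_timings) {
      const parts = verse.verse_key.split(":");
      const surah = Number(parts[0]);
      const ayah = Number(parts[1]);
      for (const segment of verse.segments) {
        if (segment.length < 3) continue; // skip incomplete segments
        words.push({
          surah,
          ayah,
          position: Number(segment[0]),
          startMs: Number(segment[1]),
          endMs: Number(segment[2]),
        });
      }
    }
  }
  return words.sort((a, b) => a.startMs - b.startMs);
}
```

In an app, the argument to flattenTimings would be the JSON body fetched from buildSegmentsUrl(reciterId, surah). Sorting once up front keeps later lookups by playback time cheap.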
With this, you can:
Highlight the current word during recitation playback
Tap any word and jump to its position in the audio
Display word-by-word as the reciter moves through the text
This is the first layer on top of the base. The Arabic stays the same. The audio becomes interactive at the word level.
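The highlight and tap-to-seek behaviors both reduce to lookups over the timing list. Here is a minimal sketch, assuming the words are sorted by start time and do not overlap; the TimedWord type and both function names are made up for illustration and are not part of open-quran-view.

```typescript
interface TimedWord {
  id: number;      // hypothetical: the same id the viewer reports on click
  startMs: number;
  endMs: number;
}

// Binary search for the word whose [startMs, endMs) interval contains timeMs.
// Returns -1 when playback is in a gap between words or outside the list.
function findActiveWordIndex(words: readonly TimedWord[], timeMs: number): number {
  let lo = 0;
  let hi = words.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const word = words[mid];
    if (word === undefined) break; // defensive; unreachable while lo <= hi holds
    if (timeMs < word.startMs) {
      hi = mid - 1;
    } else if (timeMs >= word.endMs) {
      lo = mid + 1;
    } else {
      return mid;
    }
  }
  return -1;
}

// Returns the position to seek to, in seconds, for a tapped word,
// or undefined when the word has no timing data.
function seekSecondsForWord(words: readonly TimedWord[], wordId: number): number | undefined {
  const word = words.find((w) => w.id === wordId);
  return word === undefined ? undefined : word.startMs / 1000;
}
```

In a browser, findActiveWordIndex could be called from an audio element's timeupdate handler with currentTime * 1000, since currentTime is expressed in seconds while the segment data is in milliseconds.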
Layer 2: Word-Level Tafseer
Tafseer (interpretation) is typically served at the verse level. But interpretation often depends on understanding specific word choices in context.
My app links tafseer references at the word level. When you tap a word, the tafsir for that specific term is retrieved (see word-level tafseer design and community discussion). The word-to-tafseer mapping comes from /data/tafseers/quran-words.json, generated during the issue discussion with help from MostafaOsmanFathi. The challenge is that the Quran Foundation API and the tafsir data don't use matching word indices, so I had to build an alignment index for the words to reconcile them.
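One way to picture such an alignment index is to map both sources onto a shared "surah:ayah:position" key. The sketch below is illustrative only: the TafseerEntry shape is a guess rather than the actual schema of quran-words.json, and it assumes the viewer's words arrive in reading order with non-word glyphs (such as ayah-end markers) already filtered out. It is not the mapping logic the app itself uses.

```typescript
// Hypothetical shapes, for illustration only.
interface ViewerWord {
  id: number;          // id reported by the viewer on click
  surahNumber: number;
  ayahNumber: number;
}

interface TafseerEntry {
  surah: number;
  ayah: number;
  position: number;    // 1-based word position within the ayah
  text: string;
}

const wordKey = (surah: number, ayah: number, position: number): string =>
  `${surah}:${ayah}:${position}`;

// Derives each viewer word's position within its ayah by counting in reading order.
function buildAlignment(viewerWords: readonly ViewerWord[]): Map<number, string> {
  const seenPerVerse = new Map<string, number>();
  const alignment = new Map<number, string>();
  for (const word of viewerWords) {
    const verse = `${word.surahNumber}:${word.ayahNumber}`;
    const position = (seenPerVerse.get(verse) ?? 0) + 1;
    seenPerVerse.set(verse, position);
    alignment.set(word.id, wordKey(word.surahNumber, word.ayahNumber, position));
  }
  return alignment;
}

// Indexes tafseer entries by the same shared key.
function indexTafseer(entries: readonly TafseerEntry[]): Map<string, TafseerEntry> {
  const index = new Map<string, TafseerEntry>();
  for (const entry of entries) {
    index.set(wordKey(entry.surah, entry.ayah, entry.position), entry);
  }
  return index;
}

// Resolves a clicked viewer word id to its tafseer entry, if one exists.
function lookupTafseer(
  wordId: number,
  alignment: Map<number, string>,
  tafseerIndex: Map<string, TafseerEntry>,
): TafseerEntry | undefined {
  const key = alignment.get(wordId);
  return key === undefined ? undefined : tafseerIndex.get(key);
}
```

Keeping the mapping explicit means a mismatch shows up as a missing lookup rather than as a tafsir silently attached to the wrong word.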
The tafsir layer doesn't replace the verse-level commentary. It supplements it at the granularity where it matters: the individual word.
The Three Layers Together
| Layer | Source | What It Adds |
|---|---|---|
| Quran text | open-quran-view | Uthmani script page, word-level click detection |
| Tarteel (recitation) | Quran Foundation API | Word-level audio sync, pronunciation timing |
| Tafseer | Word-level tafseer JSON | Word-level interpretation, contextual meaning |
When all three are active, the experience is: you're reading the Arabic, the audio highlights each word as it's recited, and tapping any word surfaces its tafsir, all without leaving the page.
Why This Architecture
Traditional Quran apps treat these as separate screens. You go to the recitation tab to hear audio, the tafsir tab to read interpretation, and the text tab to read. The reference points between them are loose: you have to manually match verse numbers across views.
Layering everything on the same page means the context stays synchronized. The word you're reading is the word being recited is the word being interpreted.
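To make the idea of a shared context concrete, here is a small sketch of a single state object keyed by word id that all three layers could read from. The state shape, the event names, and the reducer are hypothetical and are not taken from the Mubin codebase.

```typescript
// One state shared by the text, recitation, and tafseer layers.
interface LayerState {
  activeWordId: number | null;   // word currently being recited
  selectedWordId: number | null; // word whose tafsir is open
  seekToMs: number | null;       // pending seek request for the audio player
}

type LayerEvent =
  | { type: "audio-time"; wordId: number | null }
  | { type: "word-click"; wordId: number; startMs: number | null };

const initialLayerState: LayerState = {
  activeWordId: null,
  selectedWordId: null,
  seekToMs: null,
};

// Pure reducer: every layer reacts to the same word id, so they cannot drift apart.
function reduceLayers(state: LayerState, event: LayerEvent): LayerState {
  switch (event.type) {
    case "audio-time":
      // Playback moved: update the highlight and clear any handled seek request.
      return { ...state, activeWordId: event.wordId, seekToMs: null };
    case "word-click":
      // A tap opens the tafsir for the word and asks the player to jump to it.
      return { ...state, selectedWordId: event.wordId, seekToMs: event.startMs };
  }
}
```

Because the reducer is a pure function, it can be dropped into React's useReducer or tested on its own without rendering a page.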
Potential Uses
This architecture opens up:
Tajweed learning tools: Highlight mispronounced words, show elongation and noon rules visually
Adaptive recitation apps: Speed up or slow down individual words for practice
Tarteel AI voice models: Use word-level timing data to train or evaluate AI recitation models; the segment timestamps provide reference word boundaries for timing and rhythm
Word-level search across tafsir: Find every instance where a particular Arabic root appears in interpretation
Comparative tafsir: Layer multiple tafsir sources side by side, each anchored to the same word
Accessibility: Word-level audio sync helps readers with visual impairments follow along
What I Learned
The Quran Foundation API's segment data is reliable and well-structured. The harder problem was data alignment: different APIs use different indexing schemes, and reconciling them requires a proxy layer with explicit mapping logic.
The open-quran-view package handles the rendering. The layered data approach handles the richness. Together they make it possible to build Quranic experiences that were previously locked in native apps or separate tools.
The code for the Mubin app is on GitHub. The open-quran-view package is on npm. If you're building something in this space, the package handles the rendering so you can focus on the data layer.
This article documents the architecture behind the Mubin Quran app. The open-quran-view package is MIT licensed and works with any React or Web project.



