# Quran Search Engine v0.3.x-(athar) Released: What's New and How to Migrate from v0.1.5

Building search for Arabic text, especially the Quran, comes with unique challenges, from morphology and lemmatization to phonetic matching and performance bottlenecks. The quran-search-engine library was built to address them by providing a stateless, UI-agnostic search engine written entirely in TypeScript.

With the release of v0.3.x-(athar), the library has undergone a massive architectural shift to prioritize speed, scalability, and developer experience. If you are currently using v0.1.5 (or v0.2.x), you will notice significant breaking changes designed to make your application much faster.

In this article, we will break down what's new in version 0.3.x-(athar) and provide a comprehensive, step-by-step guide on how to migrate your codebase from v0.1.5 to v0.3.x-(athar).

## What is `quran-search-engine`?

For those new to the library, quran-search-engine is a deterministic search engine for the Quran that runs on both the client and the server. Unlike traditional solutions that tie you to a specific UI or require a dedicated backend server, this library runs anywhere JavaScript runs (vanilla JavaScript, React, Vue, React Native, Node.js, and so on).

It features:

Advanced Linguistic Search: Lemma and root matching.
Semantic Search: Concept-based mapping (e.g., searching for English concepts to find Arabic verses).
Boolean Operators: AND(+), OR(|), NOT(-), and grouped queries.
Regex Search: Pattern-based queries with built-in ReDoS safety validation.
Range Queries: Verse-coordinate lookups like 2:255 or 1:1-7.
Phonetic Search (Pronunciation): Latin-to-Arabic transliteration for non-Arabic queries (e.g., searching "Alhamdulillah" to find "الحمد لله").
Cross-Language Search: English-to-Arabic search out of the box, which can be easily extended or replaced with other languages (e.g., French, Urdu).
Fuzzy Search: Fallback matching for misspelled words.
UI-Agnostic Highlighting: Returns precise match offsets (HighlightRanges) instead of raw HTML, pushing the display responsibility to your framework (React, Next.js, React Native, Terminals, etc.).
Stateless Architecture: You control the data loading, caching, and state management.
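
To make the offset-based highlighting idea concrete, here is a minimal sketch of how a renderer might consume match offsets. The `Range` shape (`start`, `end`) and the `splitByRanges` helper are assumptions made for this example; they are not the library's actual HighlightRanges type or API.

```typescript
// Hypothetical range shape: [start, end) offsets into the verse text.
interface Range {
  start: number;
  end: number;
}

interface Segment {
  text: string;
  highlighted: boolean;
}

// Split text into plain and highlighted segments.
// Ranges are sorted first and overlapping ranges are merged.
function splitByRanges(text: string, ranges: Range[]): Segment[] {
  const sorted = ranges
    .filter((r) => r.end > r.start)
    .slice()
    .sort((a, b) => a.start - b.start);

  const merged: Range[] = [];
  for (const r of sorted) {
    const last = merged[merged.length - 1];
    if (last !== undefined && r.start <= last.end) {
      last.end = Math.max(last.end, r.end);
    } else {
      merged.push({ start: r.start, end: r.end });
    }
  }

  const segments: Segment[] = [];
  let cursor = 0;
  for (const r of merged) {
    // Clamp offsets so out-of-bounds ranges cannot break slicing.
    const start = Math.max(0, r.start);
    const end = Math.min(text.length, r.end);
    if (start > cursor) {
      segments.push({ text: text.slice(cursor, start), highlighted: false });
    }
    if (end > start) {
      segments.push({ text: text.slice(start, end), highlighted: true });
    }
    cursor = Math.max(cursor, end);
  }
  if (cursor < text.length) {
    segments.push({ text: text.slice(cursor), highlighted: false });
  }
  return segments;
}
```

Each segment can then be mapped to whatever the target platform needs: a `<mark>` element in React, a styled `Text` in React Native, or ANSI colors in a terminal.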

## The New Layered Search Architecture

To ensure the engine remains highly performant, maintainable, and extensible, the search logic has been refactored into a Layered Search Pipeline. The orchestrator (search()) routes each query through a strict priority cascade, where each layer is an isolated, independently testable module located in core/layers/:

Range Layer: Verse-coordinate queries (2:255) short-circuit all linguistic processing.
Boolean Layer: Parses complex logic (AND, OR, NOT, ()) into an AST.
Regex Layer: Executes pattern queries in isolation (skipping linguistic layers).
Simple Layer: Exact token matching in normalized Arabic text.
Linguistic Layer: Lemma and root matching via morphology data.
Fuse Layer: Fuzzy fallback matching (Fuse.js) for unmatched tokens.
Semantic Layer: Concept expansion and cross-language mappings.
Phonetic Layer: Latin→Arabic transliteration processing.

This modular structure means that adding or updating a search feature is a matter of inserting or modifying one isolated layer in the pipeline, without touching the other layers or growing a monolithic search function.
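
The cascade idea can be sketched in a few lines. This is a simplified illustration, not the library's orchestrator: the `Layer` type, the two toy layers, and the "first layer that answers wins" rule are all assumptions made for the example, and the real pipeline may combine layers differently.

```typescript
// Hypothetical layer contract: return verse keys, or null to pass the query on.
type Layer = {
  name: string;
  run: (query: string, verses: Map<string, string>) => string[] | null;
};

// Range layer: handles "2:255" or "1:1-7" and nothing else.
const rangeLayer: Layer = {
  name: "range",
  run(query, verses) {
    const m = /^(\d+):(\d+)(?:-(\d+))?$/.exec(query.trim());
    if (m === null) return null;
    const surah = Number(m[1]);
    const from = Number(m[2]);
    const to = m[3] !== undefined ? Number(m[3]) : from;
    const keys: string[] = [];
    for (let ayah = from; ayah <= to; ayah++) {
      const key = `${surah}:${ayah}`;
      if (verses.has(key)) keys.push(key);
    }
    return keys; // A range query short-circuits even when nothing matches.
  },
};

// Simple layer: plain substring matching on the verse text.
const simpleLayer: Layer = {
  name: "simple",
  run(query, verses) {
    const needle = query.trim();
    if (needle === "") return null;
    const keys: string[] = [];
    verses.forEach((text, key) => {
      if (text.indexOf(needle) !== -1) keys.push(key);
    });
    return keys.length > 0 ? keys : null;
  },
};

// Orchestrator: try each layer in priority order, stop at the first answer.
function runPipeline(
  query: string,
  verses: Map<string, string>,
  layers: Layer[],
): { layer: string; keys: string[] } {
  for (const layer of layers) {
    const keys = layer.run(query, verses);
    if (keys !== null) return { layer: layer.name, keys };
  }
  return { layer: "none", keys: [] };
}
```

Adding a new capability in this sketch means writing one more `Layer` object and placing it at the right position in the array passed to `runPipeline`.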

## What’s New in Version 0.3.x-(athar)?

The jump from v0.1.5 to v0.3.x-(athar) brings game-changing performance optimizations. Here are the major highlights:

### 1. Architectural Shift: Arrays to Maps for O(1) Performance

In previous versions, the Quran data was stored and searched as an Array (QuranText[]), so finding a verse by its ID meant scanning the array, and those scans became a bottleneck as queries grew more complex. In v0.3.x, the data structures have been migrated to Maps (Map<number, QuranText>). A Map lookup by key takes O(1) time on average, which cuts the latency of verse lookups and filtering.
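
The difference between the two lookups is easy to see in a small sketch. The `Verse` shape and its `gid` key are assumptions made for this example; the library's QuranText type may carry different fields.

```typescript
// Hypothetical verse shape; using `gid` as the numeric key is an assumption.
interface Verse {
  gid: number;
  text: string;
}

// One-time conversion from the old array layout to a keyed Map.
function toVerseMap(verses: Verse[]): Map<number, Verse> {
  const map = new Map<number, Verse>();
  for (const v of verses) map.set(v.gid, v);
  return map;
}

// Array layout: a linear scan, O(n) per lookup.
function findInArray(verses: Verse[], gid: number): Verse | undefined {
  for (const v of verses) {
    if (v.gid === gid) return v;
  }
  return undefined;
}

// Map layout: a hash lookup, O(1) on average per lookup.
function findInMap(map: Map<number, Verse>, gid: number): Verse | undefined {
  return map.get(gid);
}
```

The conversion itself costs O(n), so the benefit comes from paying it once at load time and then doing many keyed lookups during searches.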

### 2. The Context Object Pattern

The search() function signature has been revamped. Instead of passing an endless list of positional arguments, v0.3.x-(athar) introduces a clean Search Context object. This makes your code more readable and future-proof.

### 3. Mandatory Inverted Index

To achieve lightning-fast lookups for morphological and semantic searches, v0.3.x-(athar) introduces a mandatory Inverted Index. Instead of computing search paths on the fly, you now build an inverted index once during initialization (buildInvertedIndex) and pass it into the search context.
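
For readers unfamiliar with the technique, here is a minimal sketch of what an inverted index is: a map from each token to the set of verses that contain it. The `buildTokenIndex` and `lookupToken` helpers are hypothetical and index plain whitespace-separated tokens only; the library's buildInvertedIndex works on morphology and semantic data and will differ in shape.

```typescript
// Build a token -> verse-id index in a single pass over the text.
function buildTokenIndex(verses: Map<number, string>): Map<string, Set<number>> {
  const index = new Map<string, Set<number>>();
  verses.forEach((text, id) => {
    for (const token of text.split(/\s+/)) {
      if (token === "") continue;
      let ids = index.get(token);
      if (ids === undefined) {
        ids = new Set<number>();
        index.set(token, ids);
      }
      ids.add(id);
    }
  });
  return index;
}

// Query time: one Map lookup instead of scanning every verse.
function lookupToken(index: Map<string, Set<number>>, token: string): number[] {
  const ids = index.get(token);
  return ids === undefined ? [] : Array.from(ids).sort((a, b) => a - b);
}
```

The build step is the expensive part, which is why it makes sense to run it once during initialization and reuse the result for every search.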

### 4. Enhanced Web Worker Support & Error Handling

Moving search operations off the main thread is crucial for a smooth UI. Version 0.3.x-(athar) simplifies worker initialization and introduces the WorkerError class. This provides structured, type-safe error codes when dealing with web workers.
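
To illustrate why structured error codes help, here is a sketch of an error class carrying a typed `code` and a handler that maps codes to user-facing messages. The `SketchWorkerError` class and its three codes are invented for this example; consult the library's documentation for the codes WorkerError actually exposes.

```typescript
// Hypothetical error codes; the real library's codes may differ.
type SketchErrorCode = "INIT_FAILED" | "SEARCH_FAILED" | "TIMEOUT";

class SketchWorkerError extends Error {
  readonly code: SketchErrorCode;

  constructor(code: SketchErrorCode, message: string) {
    super(message);
    this.name = "SketchWorkerError";
    this.code = code;
    // Keeps `instanceof` working when compiled down to ES5.
    Object.setPrototypeOf(this, SketchWorkerError.prototype);
  }
}

// Map a caught value to a message the UI can show.
function describeError(error: unknown): string {
  if (error instanceof SketchWorkerError) {
    switch (error.code) {
      case "INIT_FAILED":
        return "Search data could not be loaded.";
      case "SEARCH_FAILED":
        return "The search could not be completed.";
      case "TIMEOUT":
        return "The search took too long. Please try again.";
    }
  }
  return "Unexpected error.";
}
```

Because the code is a union type, the compiler can flag a `switch` that forgets to handle a newly added code, which is harder to achieve when matching on error message strings.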

### 5. Deprecations and Cleanups

To keep the library lightweight and focused, several legacy features and data files from previous versions have been removed:

Pre-built Index Files: Static JSON files (e.g., lemma-index.json, root-index.json, word-index.json) are no longer distributed. Instead, these indices are now built dynamically in-memory via buildInvertedIndex(), reducing bundle size and ensuring data consistency.

## Data Modularity

One of the most powerful aspects of quran-search-engine is how it handles data ingestion. The library does not force heavy data files into your bundle; instead, it empowers you with dynamic loading and strict validation.

### 1. Dynamic Data Loading

All core datasets—including the Quran text, morphology maps, semantic words, and phonetic transliterations—can be dynamically loaded from the package on demand. This ensures your initial application bundle size remains incredibly small. You only load the datasets (like English semantic data or Latin transliteration data) when the user actually requests those features.
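
One way to load a dataset only when a feature is first used is to wrap its loader in a small memoizing helper. The sketch below is an assumption about how an application might do this; `createLazyLoader` and `fakeLoadSemanticData` are hypothetical names, and the fake loader stands in for whichever real loader your app calls.

```typescript
// Wrap an async loader so it runs at most once and the result is shared.
function createLazyLoader<T>(load: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | null = null;
  return () => {
    if (pending === null) {
      pending = load().catch((err: unknown) => {
        pending = null; // Allow a retry after a failed load.
        throw err;
      });
    }
    return pending;
  };
}

// Stand-in for a real dataset loader; it counts how often it is invoked.
let fakeLoadCalls = 0;
function fakeLoadSemanticData(): Promise<Map<string, string[]>> {
  fakeLoadCalls += 1;
  return Promise.resolve(new Map<string, string[]>([["mercy", ["رحمة"]]]));
}

// Nothing is loaded until getSemanticData() is called for the first time.
const getSemanticData = createLazyLoader(fakeLoadSemanticData);
```

Calling `getSemanticData()` from several components at once returns the same promise, so the dataset is fetched a single time even under concurrent requests.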

### 2. "Bring Your Own Language"

You are not locked into our default English semantic indices. The engine allows you to easily load another language by replacing the default English semantic map with your own (e.g., French, Urdu, or Turkish mappings). This gives you complete flexibility to build multi-lingual Quran applications without being bound to English as the only bridge for semantic search.

### 3. Custom Data Validation

Beyond just semantics, you can also inject custom datasets. The library exposes robust data validation functions to ensure your custom payloads are correctly structured.

As long as your custom dataset passes the engine's rigorous validation schema, the library guarantees the search engine won't crash when running advanced queries on them.
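
As a rough picture of what validating a custom payload involves, the sketch below checks a French-to-Arabic semantic payload before turning it into a Map. The payload shape (concept mapped to an array of Arabic words) and both helper names are assumptions for illustration; use the validation functions the library exports for real data.

```typescript
// Type guard: a plain object whose values are non-empty arrays of non-blank strings.
function isSemanticPayload(value: unknown): value is Record<string, string[]> {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return false;
  }
  const record = value as Record<string, unknown>;
  return Object.keys(record).every((key) => {
    const entry = record[key];
    return (
      Array.isArray(entry) &&
      entry.length > 0 &&
      entry.every((w) => typeof w === "string" && w.trim() !== "")
    );
  });
}

// Convert a validated payload into a Map with lowercased concept keys.
function toSemanticMap(value: unknown): Map<string, string[]> {
  if (!isSemanticPayload(value)) {
    throw new Error("Invalid semantic payload: expected { concept: string[] }");
  }
  const map = new Map<string, string[]>();
  for (const key of Object.keys(value)) {
    const words = value[key];
    if (words !== undefined) map.set(key.toLowerCase(), words);
  }
  return map;
}

// Example custom payload: French concepts mapped to Arabic words.
const frenchPayload: unknown = {
  "miséricorde": ["رحمة"],
  "lumière": ["نور"],
};
const frenchSemanticMap = toSemanticMap(frenchPayload);
```

Rejecting a malformed payload at load time gives a clear error at the boundary, instead of a confusing failure deep inside a search.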

## Highly Typed (TypeScript First)

The library is written entirely in TypeScript and is strictly typed from the ground up. Every data model, search context, configuration object, and search response (like HighlightRanges) is backed by explicit interfaces.

This robust typing provides:

Excellent IDE autocomplete: Discover available search options and response fields seamlessly.
Runtime safety: Prevents errors during data hydration by enforcing strict contracts on custom datasets.
Absolute confidence: Know exactly what properties exist on a verse or search result when manipulating them in your UI.

## Production Ready & Heavily Tested

quran-search-engine is not just an experimental library; it is heavily tested in production environments. It powers the search infrastructure behind Open Mushaf Native, a modern, fully-featured offline Quran application built with React Native and Expo. The library handles complex morphological queries seamlessly on mobile devices without relying on any backend API.

## Included Playground & Examples

To help you get started quickly, the repository ships with a comprehensive playground featuring multiple real-world integration examples. These examples demonstrate how to architect stateless searching across different frameworks:

Vanilla TypeScript (examples/vanilla-ts): Pure browser implementation without any VDOM overhead.
React + Vite (examples/vite-react): A feature-rich web app showcasing a full search UI with highlighted components.
Node.js (examples/nodejs): Bare-metal server-side searching via CLI.
Angular (examples/angular): Dedicated Angular component highlighting implementation.

You can easily spin up these examples locally using the provided yarn scripts (e.g., yarn playground:react or yarn playground:node).

## Offline & AI-Friendly Documentation

We understand that modern development workflows often involve AI pair programmers. That is why quran-search-engine ships with a comprehensive docs/ folder directly inside the package.

This offline documentation is split into practical Guides and deep-dive References covering everything from search syntax to the core API. Whether you are reading it locally in your IDE or providing it as context to an AI agent (like GitHub Copilot or Trae), the markdown files are structured to be easily digestible and highly accessible without needing an internet connection.

## Powered by the ITQAN Community

The architectural leap in version 0.3.x-(athar) was heavily inspired and shaped by the brilliant developers in the ITQAN Community.

Their participation in the Athar Initiative (a collaborative effort to build advanced, open-source Quranic technologies) directly influenced the strict typings, the dynamic data loading approach, and the robust layered search architecture you see today. The name of this release, athar (meaning "impact" or "trace"), is dedicated to their ongoing contributions to the Quran technology ecosystem.

You can read more about the initial ideas and discussions that led to this release in the official Athar Initiative article on the ITQAN Community Forum.

## Step-by-Step Migration Guide: From v0.1.5 to v0.3.x-(athar)

Because v0.3.x introduces breaking changes, your v0.1.5 code will not work out of the box. Follow these steps to migrate your application safely.

### Step 1: Update Your State Definitions

Since quranData is now a Map, and you need to store the new Inverted Index and Semantic Map, update your state hooks (if using React).

Before (v0.1.5):


const [quranData, setQuranData] = useState<QuranText[]>([]);

After (v0.3.x):

import { useState } from 'react';
import type { InvertedIndex, QuranText } from 'quran-search-engine';

const [quranData, setQuranData] = useState<Map<number, QuranText> | null>(null);
const [invertedIndex, setInvertedIndex] = useState<InvertedIndex | null>(null);
const [semanticMap, setSemanticMap] = useState<Map<string, string[]> | null>(null);
const [phoneticMap, setPhoneticMap] = useState<Map<string, string[]> | null>(null);

### Step 2: Load Data and Build the Inverted Index

You must now load the semantic data and explicitly build the inverted index before allowing users to search.

Before (v0.1.5):

const [data, morphology, dictionary] = await Promise.all([
  loadQuranData(),
  loadMorphology(),
  loadWordMap(),
]);

After (v0.3.x-(athar)):

import {
  loadQuranData,
  loadMorphology,
  loadWordMap,
  loadSemanticData,
  loadPhoneticData,
  buildInvertedIndex,
} from 'quran-search-engine';

const [data, morphology, dictionary, semantic, phonetic] = await Promise.all([
  loadQuranData(),
  loadMorphology(),
  loadWordMap(),
  loadSemanticData(), // New requirement for v0.3.x
  loadPhoneticData(), // New requirement for phonetic search
]);

setQuranData(data);
setMorphologyMap(morphology);
setWordMap(dictionary);
setSemanticMap(semantic);
setPhoneticMap(phonetic);

// Build the inverted index synchronously
const index = buildInvertedIndex(morphology, data, semantic);
setInvertedIndex(index);

### Step 3: Refactor the Search Function Call

The search() function no longer accepts quranData, morphologyMap, and wordMap as separate positional arguments. They must be grouped into a context object alongside the invertedIndex.

Before (v0.1.5):

const response = search(
  query,
  quranData,
  morphologyMap,
  wordMap,
  options,
  { page: 1, limit: 10 },
  undefined,
  searchCache
);

After (v0.3.x-(athar)):

const response = search(
  query,
  {
    quranData,
    morphologyMap,
    wordMap,
    invertedIndex,
    semanticMap, // Used for semantic search (cross-language)
    phoneticMap, // Used for phonetic search (pronunciation)
  },
  {
    ...options,
    isRegex: false, // New regex layer option
    isBoolean: false, // New boolean parsing option
    phonetic: false, // New phonetic matching option
  },
  { page: 1, limit: 10 },
  undefined,
  searchCache
);
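
Because the state values from Step 1 start out as null, it helps to assemble the context only once every dataset has loaded. The helper below is one possible way to do that; `requireAll` and the `Loaded` type are hypothetical names for this sketch and are not part of the library.

```typescript
// Maps every property of T to its non-nullable version.
type Loaded<T> = { [K in keyof T]: NonNullable<T[K]> };

// Returns the same object typed as fully loaded, or null if any part is missing.
function requireAll<T extends Record<string, unknown>>(parts: T): Loaded<T> | null {
  const record: Record<string, unknown> = parts;
  for (const key of Object.keys(record)) {
    if (record[key] === null || record[key] === undefined) return null;
  }
  return parts as unknown as Loaded<T>;
}

// Example with stand-in values: the index has not been built yet.
const sketchQuranData: Map<number, string> | null = new Map([[1, "sample"]]);
const sketchInvertedIndex: Map<string, number[]> | null = null;

const sketchContext = requireAll({
  quranData: sketchQuranData,
  invertedIndex: sketchInvertedIndex,
});
// sketchContext is null here, so the search handler should return early.
```

In a search handler, a null result is the signal to show a loading state instead of calling search() with incomplete data.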

### Step 4: Update Web Worker Initialization (If Applicable)

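If you were offloading search to a Web Worker, initialization has been simplified by dynamically importing the worker URL through your bundler. The ?url suffix used in the example is Vite's convention; with Webpack or another bundler you may need that bundler's equivalent way of resolving an asset URL.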

Before (v0.1.5):

const client = createSearchWorker({
  workerUrl: new URL('quran-search-engine/worker', import.meta.url),
});

await client.initData();

After (v0.3.x):

// Dynamically import the worker URL
const mod = await import('quran-search-engine/worker?url');
const client = createSearchWorker({ workerUrl: mod.default });

await client.initData();

You can now also gracefully catch worker errors:


import { WorkerError } from 'quran-search-engine';

try {
  await client.runSearch(query, options, pagination);
} catch (error) {
  if (error instanceof WorkerError) {
    console.error(`Worker failed with code: ${error.code}`);
  }
}

## Complete Before & After Comparison

To see the big picture, here is how a standard React component initialization looks before and after the migration.

### ❌ The Old Way (v0.1.5)


useEffect(() => {
  async function init() {
    const [data, morphology, dictionary] = await Promise.all([
      loadQuranData(),
      loadMorphology(),
      loadWordMap(),
    ]);

    setQuranData(data);
    setMorphologyMap(morphology);
    setWordMap(dictionary);
  }

  init();
}, []);

// Later in the search handler...
search(query, quranData, morphologyMap, wordMap, options, pagination);

### ✅ The New Way (v0.3.x-(athar))


useEffect(() => {
  async function init() {
    const [data, morphology, dictionary, semantic, phonetic] = await Promise.all([
      loadQuranData(),
      loadMorphology(),
      loadWordMap(),
      loadSemanticData(),
      loadPhoneticData(),
    ]);

    setQuranData(data);
    setMorphologyMap(morphology);
    setWordMap(dictionary);
    setSemanticMap(semantic);
    setPhoneticMap(phonetic);

    // Build the inverted index once
    const index = buildInvertedIndex(morphology, data, semantic);
    setInvertedIndex(index);
  }

  init();
}, []);

// Later in the search handler...
search(
  query,
  { quranData, morphologyMap, wordMap, invertedIndex, semanticMap, phoneticMap },
  options,
  pagination
);

## Conclusion

The release of v0.3.x marks a maturity milestone for quran-search-engine. By shifting from Arrays to Maps and making the Inverted Index mandatory, the library replaces linear scans with O(1) average-time key lookups, which keeps complex searches across the entire Quran text fast. Combined with the Web Worker support, this lets you run those searches without freezing the main thread.

While migrating from v0.1.5 requires adjusting your data loading logic and function signatures, the performance gains are well worth the effort.

Ready to upgrade? Run yarn add quran-search-engine@latest today! If you encounter any issues during your migration, feel free to check out the official repository and open an issue. Happy coding!
