The numbers
Plainly stated.
77,430 · Total words · with every repetition counted
3,680 · Unique lemmas · dictionary entries, before grammar
1,685 · Three-letter roots · the wells from which lemmas spring
~50% · Reached with ~88 words · the connective core, repeated thousands of times
Why repetition is the whole story.
A learner reads the figure of seventy-seven thousand words and despairs. They should not. The Qurʼān is repetitive on purpose — its rhetorical structure depends on it. The same particles thread together every verse. The same nouns name the same realities, again and again. The same verbs return as the same warnings, the same promises, the same consolations.
This is why a small vocabulary goes a very long way. If a hundred words are each repeated hundreds or thousands of times, learning those hundred words already covers a large share of the page. A careful count of the highest-frequency lemmas shows that fewer than ninety of them account for roughly half of all the running words in the Qurʼān. Extend the list to about three hundred, chosen by frequency rather than by alphabet, and you reach somewhere near 85% of the running text.
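The arithmetic behind such a count can be sketched in a few lines of Python. This is a minimal illustration, not the method used to produce the figures above: it assumes the text has already been reduced to a list of lemmas, and the function names and the toy data are invented for the example.

```python
from collections import Counter

def coverage_curve(lemmas):
    """Return (rank, lemma, cumulative share) rows, most frequent lemma first."""
    counts = Counter(lemmas)
    total = sum(counts.values())
    curve = []
    running = 0
    for rank, (lemma, n) in enumerate(counts.most_common(), start=1):
        running += n
        curve.append((rank, lemma, running / total))
    return curve

def words_needed(lemmas, target):
    """Smallest number of top-frequency lemmas whose combined share reaches target."""
    for rank, _, share in coverage_curve(lemmas):
        if share >= target:
            return rank
    return len(set(lemmas))

# Toy data with made-up counts (placeholders, not real Qur'anic frequencies).
toy = ["w1"] * 50 + ["w2"] * 30 + ["w3"] * 15 + ["w4"] * 5
print(words_needed(toy, 0.50))  # 1: the single most frequent lemma is half the text
print(words_needed(toy, 0.85))  # 3: two lemmas reach 80%, the third passes 85%
```

Run over a real lemmatised corpus, the same loop would show how many lemmas are needed for any target share of the running text.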
Tier I · First ~50%
The connective core
The smallest words do the heaviest lifting. Pronouns, prepositions, particles, and a handful of demonstratives are together repeated tens of thousands of times across the muṣḥaf. Master a few dozen of them and half of every page becomes legible.
Tier II · 50% → 65%
The recurring nouns
The second band is made up of nouns and adjectives that name the recurring concerns of the Book: belief, mercy, the Day, the Garden, the Fire, light, the Book, the people. Adding these to the connective core takes a learner to about 65% of the running text, just short of two-thirds.
Tier III · 65% → 85%
The verbs and their patterns
Arabic verbs do not stand alone: each root can flower into a small family of related verbs through the ten classical patterns. Once a learner sees the pattern, they have learned not one verb but a family of them, up to ten in principle, though most roots occur in only a few of the forms. This is where the curve bends sharply upward.
Tier IV · The last 15%
The long tail
The remaining vocabulary is a long tail of words that appear only a few times in the entire Qurʼān — a name, a one-off image, a singular phrase. They are precious but not the foundation. They are best learned in the verses where they live.
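As a rough sketch, the four bands can be expressed as cut-offs on the cumulative frequency curve. The 50%, 65% and 85% boundaries come from the tiers above. Everything else here is an assumption made for illustration: the names are hypothetical, and sorting by frequency alone is a simplification, since the tiers as described also group words by kind (particles, nouns, verbs).

```python
from bisect import bisect_right

# Upper edges of Tiers I, II and III, as stated in the text.
TIER_BOUNDS = [0.50, 0.65, 0.85]
TIER_NAMES = ["Tier I", "Tier II", "Tier III", "Tier IV"]

def tier_of(share_before):
    """Tier for a lemma, given the share of running text already covered
    by all lemmas that rank above it in frequency (hypothetical helper)."""
    return TIER_NAMES[bisect_right(TIER_BOUNDS, share_before)]

print(tier_of(0.00))  # Tier I: the most frequent lemma of all
print(tier_of(0.58))  # Tier II
print(tier_of(0.70))  # Tier III
print(tier_of(0.92))  # Tier IV: the long tail
```

A lemma that begins exactly at a boundary is placed in the next tier, which is an arbitrary choice for the sketch.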
What this means for a learner.
The mistake most courses make is teaching Modern Standard Arabic to someone who only ever wanted to understand the Qurʼān. The curricula then carry the learner through hundreds of words they will never meet again outside a newspaper. The path is long because the path is wrong.
The shorter path is to follow the frequency curve. Begin with the connective core that opens half of every page. Layer the recurring nouns. Add the verb patterns and watch comprehension widen with each new pattern, not each new word. Reach the long tail last, and only as the verses themselves require it.
This is the entire premise of the Quran85 curriculum. The path is short on purpose, because the Qurʼān's vocabulary is finite on purpose, and the highest-frequency words — by the design of the Book itself — do most of the work.
وَلَقَدْ يَسَّرْنَا الْقُرْآنَ لِلذِّكْرِ
And We have certainly made the Qurʼān easy for remembrance.
Al-Qamar · 54 : 17
A note on the 85%.
Eighty-five percent is a comprehension threshold, not a finish line. It is the share of running words a learner will recognise on the page after completing the first three tiers above. The remaining 15% is the rare and the singular: words that need their own moment, in their own verse. The Qurʼān rewards every revisit, and a lifetime of reading is the real curriculum.
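The difference between running words and dictionary entries is worth making concrete. By the figures above, about three hundred of 3,680 lemmas is roughly 8% of the dictionary, yet it reaches about 85% of the running text. A small sketch with invented toy data and hypothetical function names shows how the two measures come apart:

```python
def recognised_share(lemmas, known):
    """Fraction of running words whose lemma is in the known set."""
    if not lemmas:
        return 0.0
    return sum(1 for lemma in lemmas if lemma in known) / len(lemmas)

def vocabulary_share(lemmas, known):
    """Fraction of distinct lemmas that are in the known set."""
    distinct = set(lemmas)
    if not distinct:
        return 0.0
    return len(distinct & set(known)) / len(distinct)

# Toy text: one very common lemma, one common, one rare (made-up counts).
toy = ["a"] * 6 + ["b"] * 3 + ["c"]
known = {"a", "b"}
print(recognised_share(toy, known))   # 0.9: nine of ten running words
print(vocabulary_share(toy, known))   # two of three distinct lemmas
```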
Sources for the figures cited. Word, lemma, and root counts are derived from the Quranic Arabic Corpus (corpus.quran.com) maintained at the University of Leeds, and from equivalent published statistical analyses of the 'Uthmānī muṣḥaf. The tier framework above is the Quran85 curriculum's own pedagogical reading of the resulting frequency curve.