A Statistical Analysis of Genesis 1-2

Dr. Steven Boyd has authored an essay to purportedly demonstrate—through a statistical analysis of linguistic features—that Genesis 1:1-2:3 is a work of historical narrative, not poetry or anything like it. He concludes:

Thus, we conclude with statistical certainty that this text is narrative, not poetry. It is therefore statistically indefensible to argue that this text is poetry. The hermeneutical implication of this finding is that this text should be read as other historical narratives, whose authors evinced supererogatory concern with the past and staunchly upheld the historicity of their accounts even to the point of challenging their contemporaries to prove or disprove their documented historical references.

What are our thoughts on the merits of Boyd’s methodology and/or findings?

Skipping ahead to the conclusion, I find that Genesis is not only historical narrative but true historical narrative. First, because it was intended to be read that way, but most importantly because to treat it as mistaken is to reject God. Will anyone here accept that reasoning?


Right away there seems a set of false dichotomies and equations: a text must be poetry or prose, figurative or literal, fiction or history, and the corresponding terms of each set are equated, so that there can be no such thing as historical narrative poetry or prose fiction. Further, prose apparently can contain no metaphor. Not a good start.


But on the plus side, he uses some very fancy words to dress it up.


Good point! The author seems to be precluding from the outset the possibility that a text may have multiple meanings, and discourses that cut through each other, for a variety of purposes. I’m reminded of C. John Collins’ exegesis of Genesis 1-11, which incorporates both broadly historical and poetic/polemical readings at once.


It is an interesting article, though it seems like it could be far, far shorter.

The key point seems to be that he chose particular features that skew the analysis to prose vs. poetry. For example, he entirely neglects parallelism, giving this argument against it:

Although parallelism (phonological, morphological, syntactic, the infrequent lexical, semantic, and merely formal) is the main structural feature of Biblical Hebrew poetry—in particular the poetic line—not all would agree that it rigorously distinguishes poetry from prose. Kugel [1981]—who calls parallelism a “seconding sequence”—argues that this feature occurs in both poetry and prose (albeit, I would argue, blatantly and almost always in the former and more subtly and rarely in the latter).23

They point to methodological reasons:

The discussion above shows that characteristic features of poetry are also found in narrative. The converse is also true: characteristic features of narrative are also present in poetry. We conclude that qualitative descriptions of poetry and prose—although helpful in identifying their genre—do not rigorously distinguish them. We turn instead therefore to examine countable features of texts, which admit a statistical analysis.

As argued above, parallelism is not easily quantifiable, and, therefore, this most prevalent linguistic feature of Biblical Hebrew poetry cannot be used to distinguish prose from poetry. But Biblical Hebrew has other linguistic features, which are easily counted, measurable characteristics, such as morphological distribution, word order, and clause length.37

Except, it is actually quite easy to quantify parallelism :).

Instead he focuses on the proportion of perfect vs. imperfect verb tenses. This is the difference (Learn Biblical Hebrew: Lesson 13 | AHRC):

Each Hebrew verb also identifies the tense of the verb. In English a verb can have three tenses - past, present or future. Examples of these would be “You cut a tree” (past), “You are cutting a tree” (present) and “You will cut a tree” (future). Biblical Hebrew only has two tenses - perfect and imperfect. While the three verb tenses in English are related to time, Biblical Hebrew verb tenses are related to action. The perfect tense is a completed action while the imperfect tense is an incomplete action.

As we have learned, the verb קצרתי identifies the subject of the verb as first person - “I” but, it also identifies the verb as “perfect tense,” a completed action - “I cut.” When the verb is written as אקצר the subject of the verb is also first person - “I” but, the tense is now “imperfect tense,” an incomplete action and can be translated as “I am cutting a tree” (an action that has begun but not yet completed) or “I will cut a tree” (an action that has not yet begun).

They also look at preterites vs. finite verbs. Perhaps @deuteroKJ can further enlighten us.

I don’t see much reason to think that poetry, as a general class, should be skewed towards imperfect or finite verbs.

Parallelism seems to be a much better metric, and on that count Genesis 1 would be off the charts. It does seem that his analysis could be corrected by including some good measures of parallelism (it is not hard at all to do), and by looking at particular cases.
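To make "not hard at all" a bit more concrete, here is a minimal sketch of one possible measure. It is my own assumption, not anything taken from Boyd's chapter: it scores a passage by the fraction of word n-grams that recur somewhere in the passage. That only picks up repeated formulae (the refrain-like, large-scale kind of patterning), not semantic parallelism between paired lines, and a real analysis would run on the Hebrew text rather than on the toy English lines used here.

```python
from collections import Counter

PUNCT = ".,;:!?\"'()"

def tokenize(line):
    """Lowercase a line and split it into words, stripping edge punctuation."""
    words = (w.strip(PUNCT).lower() for w in line.split())
    return [w for w in words if w]

def repeated_ngram_fraction(lines, n=3):
    """Fraction of word n-grams in the passage that occur more than once.

    A crude, hypothetical proxy for formulaic repetition: 0.0 means no
    n-gram ever recurs, 1.0 means every n-gram appears at least twice.
    N-grams are taken within a line and never span a line break.
    """
    grams = []
    for line in lines:
        words = tokenize(line)
        grams.extend(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    if not grams:
        return 0.0
    counts = Counter(grams)
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(grams)

# Toy English lines, for illustration only (not a real sample).
formulaic = [
    "And God said let there be light",
    "And God said let there be a vault",
]
plain = [
    "The king went up to the city",
    "His servants buried him in a tomb",
]
print(repeated_ngram_fraction(formulaic), repeated_ngram_fraction(plain))
```

A score like this could sit next to the verb ratios as one more countable feature. Whether it separates the sample passages any better is an empirical question the sketch does not answer.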

Though, I’m not really sure that this sort of quantitative analysis is very helpful at all.


And here is one of the most telling results:

Of the 97 sample narrative and poetry passages, only two were misclassified: Ezekiel 19 was classified as narrative and Exodus 33 was classified as poetry. These misclassifications inform us about the quality of our model and about the nature of Biblical Hebrew narrative. Our model caught an incorrect analysis of Ezekiel 19 but was tripped up by Exodus 33, a narrative that largely recounts habitual action.

On the one hand, our model misclassified Ezekiel 19, because it was incorrectly included in the poetry population from which the random sample was drawn. Ezekiel 19 was assigned to the poetry population, because it has an elaborate, extended metaphor: two of the last four kings of Judah are portrayed as lions. Jehoahaz (third son of Josiah) and Jehoiachin (grandson of Josiah), whom Neco II of Egypt and Nebuchadnezzar II of Babylon deposed, deported to Egypt and Babylon respectively, and replaced with puppet kings, are pictured as lion cubs, reared by a lioness (Judah). They became young lions. They learned to hunt and became man-eaters. The nations heard about them, trapped them in pits, and brought them to Egypt and Babylon by hooks.

This is highly symbolic language. Most of the specifics of the text did not happen: kings are not lions, no lioness reared them, and they were not caught in a pit. But, they were taken off by hooks into captivity to Egypt and Babylon.

The identification of kings and kingdoms with animals (or trees) reminds us of portions of Daniel, Zechariah, other passages in Ezekiel and even the vine imagery in Isaiah 5:1–7. Ezekiel 19 therefore belongs to neither genre tested: it is neither historical narrative nor poetry but rather, apocalyptic.

So prose can make use of highly symbolic language that should not be taken literally. And there are other categories than poetry and prose. In fact many scholars think that Genesis 1 falls into a different category too!

You can see Ezekiel 19 as the rightmost red dot, at the bottom of the graph.

Both Ezekiel 19 and Genesis 1 have a high ratio of perfect and preterites, and both are classified as prose. But both are outliers, which are not well characterized by the prose-poetry dichotomy.
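A rough sketch of how a single verb ratio can turn into a "probability of narrative" may help here. It assumes a one-feature logistic model, which this thread does not confirm is what the chapter used, and the coefficients and verb counts are made up purely for illustration.

```python
import math

def preterite_ratio(preterites, finite_verbs):
    """Share of a passage's finite verbs that are preterites."""
    if finite_verbs <= 0:
        raise ValueError("need at least one finite verb")
    return preterites / finite_verbs

def narrative_probability(ratio, intercept, slope):
    """Logistic curve mapping a ratio in [0, 1] to P(label = narrative)."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * ratio)))

# Hypothetical coefficients, chosen only so the curve crosses 0.5 at ratio 0.5.
INTERCEPT, SLOPE = -5.0, 10.0

# Made-up verb counts for two imaginary passages (not real data).
for name, pret, finite in [("passage A", 40, 60), ("passage B", 5, 50)]:
    r = preterite_ratio(pret, finite)
    p = narrative_probability(r, INTERCEPT, SLOPE)
    print(f"{name}: ratio={r:.2f}, P(narrative)={p:.3f}")
```

With a steep enough slope, any passage with a high ratio lands very close to 1. The number only says that the passage's verb profile resembles the narrative sample. It cannot separate an outlier like Ezekiel 19 from ordinary narrative, and it says nothing about historicity.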

As I said before, there are some simple ways of measuring parallelism. It would be interesting to see how that changes the distribution. I am fairly certain we’d see that Genesis 1 is a wild outlier, with more parallelism than the vast majority of other passages. And parallelism is an indicator of “poetry”, so now what?


What is the full citation? What book is it in?

Seems to be from 2005, in the conference proceedings of RATE II?


Like everyone else at AIG, Boyd has an a priori commitment to a literalist, historical reading of Genesis, as a blow-by-blow chronicle of what happened in the past. Thus, despite all the pretense of mathematical and scientific objectivity (graphs, charts, equations, expressions such as “confidence level”, etc.), Boyd never would have drawn any conclusion other than the one he drew. If the quasi-scientific analysis he presents had gone the other way, i.e., had shown that Genesis 1-11 were not to be read historically, he would not have accepted that result.

It is not at all surprising to me that a literalist would turn to statistics, etc. – to a mechanical approach to the text. Literalists are almost always mechanical thinkers. But far more useful, and far more sensitive to the character of Genesis (and the rest of Hebrew narrative) is the approach of people like Alter, who, with their background in English and other literature, understand that narrative must be approached as story (which does not necessarily mean fiction, or false) in order to be understood. The Bible writers teach by telling stories. The stories may have historical elements in them, maybe many historical elements. But their character as story must first be grasped; premature attempts to prove that all or this or that part of the story is “history” risk denaturing the character of the text. If this author could first understand the nature of “story”, it might save him from muddles about prose vs. poetry, which don’t really get at the heart of things.

I have no sympathy at all for what this author is trying to do, and the attempt to dress up a conclusion he would hold in any case with the trappings of objective science is, to me, even worse than the old-fashioned fundamentalism which made no such pretenses, but simply affirmed a literalist reading. At least with the old-fashioned fundamentalist, you knew you were dealing with a grassroots commitment to a literalist reading of the Bible, coming from people who frankly admitted that if science or secular history clashed with the Bible, they would take their literal reading of the Bible and reject the other. But this new breed of fundamentalist, which pretends to a clinical objectivity about the text, and that it believes what it believes as the result of open-minded sciencey-sounding investigations, is disingenuous.

The pattern of “sevenness” (words occurring seven times or in multiples of seven times) in Genesis 1.1-2.4a certainly does not suggest “history” in the normal sense of the word; nor does the neat way Genesis 1 falls into two columns suggest “history”; nor does the chiastic structure found in the Flood story (cf. Lamoureux, though my teacher of OT noticed it 30 years before Lamoureux) suggest “history” as that term is normally used. Genesis 1-11 is narrative, yes, but it is narrative that teaches through story. It might well contain many historical elements, but it is not a history text. No fancy numerical analysis can change this. Reading a text like Genesis requires literary sensitivity and literary judgment, not the sledgehammer tools of statistical analysis.


Statistics isn’t necessarily a sledgehammer. :)

But I largely agree with you. I don’t think this is the right way to approach the question. Perhaps statistics can provide more information to supplement literary criticism, but it can’t be taken as the final word in any sense.


Yeah, all of this suggests to me that Boyd, intentionally or otherwise, constrained the rules of his analysis to brute-force a desired conclusion. Witness the refusal to consider nuanced genres, prose that makes use of metaphor and symbolism, parallelisms, numerical themes, and chiasms, along with the overemphasis on peculiar grammatical features without much by way of motivation.


There is truth in these words.

…and the old-fashioned fundamentalism, more often than not, allowed for interpretation of Genesis one in light of its evident poetic quality. This was the basis of the day age or progressive creation approaches, which while not necessarily held by most, were generally accepted alternatives.


Fair enough. Statistics, in the sense of noticing certain distributions of words and expressions, can be helpful to exegesis. But Bible scholars have already been doing that sort of “statistical” analysis for years (minus the fancy graphs and equations and jargon). They notice that certain words, tenses of verbs, stylistic devices, etc. tend to appear more often in some texts than others, in some authors than others, etc. And they take such data up in their interpretations. But what they do is much less crude than saying, “My mathematical analysis proves that text X is prose and not poetry” or “My statistical analysis proves with hard science that this text was meant to read as literal history.” That sort of mechanical approach to determining things does not take us into the spirit and depths of the Biblical texts.


I wonder how Job would be counted. Is it poetry or is it narrative history? (He does mention a third thing, apocalyptic prophecy, but I don’t think Job is that either.)

I have to say that I am in awe of his vocabulary.

The Documentary Hypothesis has long been a playground for textual criticism of the Pentateuch, dedicated to distinguishing, by particular divine name preferences, style, and Hebrew word forms, a mashup of contributors of which none were directly held to be Moses. This is anathema to literalist commentators such as would seek to establish Genesis One as narrative. Boyd does not mention the Documentary Hypothesis anywhere in his chapter, and given the commonality with his proposed statistical approach to the same literature, this to me is glaring.

But wow, what a lot of work when it is so obvious that parameters were selected which will result in the desired outcome. May I suggest a similar exercise where a chapter of a standard biology textbook is statistically analyzed against Genesis to determine which might be characterized as science.


Hi @swamidass, @John_Harshman, @Eddie, @Chris_Matthew and @RonSewell,

I just dug up this article by creationist Jonathan Sarfati, which I remember reading several years ago: Genesis is history! (6 July 2015). It succinctly summarizes Boyd’s research. Here’s what Sarfati has to say about parallelism in Genesis:

So what would Genesis look like if it were poetry? Hebrew poetry, such as the Psalms, has a different style.(10) The defining characteristic of Hebrew poetry is not rhyme or metre, but parallelism. That is, the statements in two or more consecutive lines are related in some way. For example, in synonymous parallelism there is one statement, then it is immediately followed by another statement saying the same thing in different words. Psalm 19:1–2 nicely illustrates this:

The heavens declare the glory of God,
and the sky above proclaims his handiwork.
Day to day pours out speech,
and night to night reveals knowledge.

In antithetical parallelism, the first statement is followed by a statement of the opposite, as in Proverbs 28:1 and 7:

The wicked flee when no one pursues,
but the righteous are bold as a lion.
The one who keeps the law is a son with understanding,
but a companion of gluttons shames his father.

In synthetic or constructive parallelism, the first statement is extended by the next one, e.g. Psalm 24:3–4:

Who shall ascend the hill of the Lord?
And who shall stand in his holy place?
He who has clean hands and a pure heart,
who does not lift up his soul to what is false,
and does not swear deceitfully.

However, parallelism is absent from Genesis, except where people are quoted, e.g. Genesis 4:23–24. However, they stand out from the rest of Genesis—if Genesis were truly poetic, it would use parallelisms throughout.(11) In fact, the Bible has a poetic celebration of God’s creative work of Genesis: Psalm 104, so if we want to see what a poetic account of creation looks like, that’s where to look. For example, Psalm 104:7, 11 illustrates parallelism perfectly:

At your rebuke they fled;
at the sound of your thunder they took flight.
They give drink to every beast of the field;
the wild donkeys quench their thirst.

Also, Hebrew scholar Dr Steven Boyd has shown that different types of verb (perfect and imperfect) are frequent in Hebrew poetry, but not in historical books. So from his verb analysis, he found that the probability that Genesis 1:1–2:3 is narrative (not poetry) is 0.99997.(12)

I must say that I think Boyd has a point, on purely literary grounds. Whatever else Genesis 1 may be, it’s not poetry. Nor is Genesis 2. There certainly seems to be a kind of structural parallel between the first three days of creation and the last three in Genesis 1, but that doesn’t make the chapter a poem. Genesis 2 looks even less like one.

As for the patterns of sevenness: it’s reasonable to infer that they serve a didactic purpose, spelt out in Exodus 20:11:

For in six days the Lord made the heavens and the earth, the sea, and all that is in them, but he rested on the seventh day. Therefore the Lord blessed the Sabbath day and made it holy.

You could, if you like, see Genesis 1 as a piece of propaganda dressed up as history: God rested on the seventh day, and so should you. But poetry it ain’t. It doesn’t read like a poem, but a story. The real question is: to what degree did the writer intend his story to be historical? I don’t know the answer to that question. All I know is that even those ancient commentators who favored an allegorical interpretation of Genesis 1 and 2 also believed that it contained a historical kernel. Perhaps “mytho-history” might be the best term. Thoughts?


Here’s a few quickies:

  1. Hebrew verbs are marked more for aspect than tense, and it’s not always easy to tell which one is being emphasized in a given text. So any simple equation of perfect/preterite = past and imperfect = non-past is, well, too simplistic. However, Boyd is correct that narratives (which are largely about the past) are known for the high use of the so-called waw-consecutive + imperfect (or preterite)…better called a wayyiqtol (the wa and doubling of first letter of root is normally the sequential “and then…”). Formal poetry, in contrast, is known for only rare occurrences of the wayyiqtol.

  2. Boyd’s first major problem is only speaking of poetry/prose at the formal level (i.e., syntax between clauses). In other words, he does not understand that poetry and poetic(s) (both as adjective and noun) are two different things. Thus Boyd assumes a false this-or-that bifurcation, when the reality of texts is much more complex. Even in a formally prose text (like Gen 1, other than formal poetry in v. 27; cf. 2:4 and 2:23), there is a range/continuum, where the build-up of poetic devices calls for different genre (and thus interpretive) implications. Genesis 1 is well known for an amazing number of poetic devices, the combination of which is found in no other (especially prosaic) text, so most scholars classify its genre as in a pretty unique category (sui generis). (Examples include special numbers, repetition, unexpected vocabulary, and large-scale parallelism [as opposed to between two corresponding lines].)

  3. Another problem is the assumption behind Boyd’s use of “historical narrative” as a genre label (to be fair, he’s not alone in this). History is not a literary genre, but a way of referring to the past. But there’s a range of things that can be called historical but have different intentions on precision (i.e., historicity). For example, consider the difference between a documentary, a movie based on historical events, a movie inspired by historical events, and historical fiction. Each of these is historical. So, the genre label “historical narrative” is too broad to be useful.

  4. Related: the historical value would need to be determined by factors that have nothing to do with syntax. A parable, for example, is written in prose narrative, full of wayyiqtol verbs. Yet, Boyd’s analysis would conclude these obviously-non-literal texts as “historical narrative.”


The beginning and end of Job (chs. 1-2, 42:7ff) are prose; the rest is poetry. The book is a great counterexample to Boyd’s methodology.


I agree in the main here with the formal distinction between prose and poetry. However, Sarfati’s use of 18th-century categories of parallelism is quite outdated. This shows he’s not really acquainted with current studies in Hebrew poetry.

The other type of parallelism, which is evident in Gen 1, is the larger panel parallels between Days 1-3 // Days 4-6, and then within each of the triads (1//4; 2//5; 3//6). This is one of many examples that are not common in OT prose. Actually, I can’t think of another text like it.