A LARGE-SCALE STUDY OF WORLD MYTHS.

Author: Thuillard, Marc
Position: Critical essay
  1. Introduction

    We consider here myths as narratives explaining and justifying the present state of the world. They are always regarded as telling the truth in the societies where they are told. The scientific study of this type of story is fairly recent, and it is the subject of a particular discipline: comparative mythology. To facilitate comparisons, the thousands of myths known in all documented societies have been classified into several fundamental types: cosmogonic myths set out the origin of the universe, anthropogonic myths explain the appearance of mankind, ethnogonic myths tell how humanity was divided into different peoples speaking different languages, etc. Many other myths set out the origin of this or that natural or cultural phenomenon: sun, fire, sexuality, domestication, writing, etc. (Le Quellec and Sergent 2017). Each myth can be deconstructed into 'motifs', defined here as "any features or combinations of features in folklore texts (images, episodes, sequences of episodes) which are subject to replication and found in different traditions" (Berezkin 2015a).

    Take, for example, the many myths of the origin of fire (Frazer 1930). It will be easier to compare them if we take into account the presence/absence of motifs such as these: A Living Creature Personifies Fire, Woman Gives Birth to Fire, First Fire is Stolen from Original Owner, Original Owner is a Jaguar, Original Owner is a Toad, etc. The advantage of choosing such motifs as units of analysis is the degree of abstraction it implies. Even if the superficial details of the story have been mistranslated or partially forgotten, the motif is still easy to identify.

    First, many mythological motifs remain stable over time and are easily identifiable in similar complex stories at long distances (e.g. Gouhier 1892, Bogoras 1902, Jochelson 1905, Hatt 1949, Levi-Strauss 2002). Second, the distribution of myths seems to be geographically stable over very long periods of time, as shown for instance by the contrasting worldwide distributions of the 'emergence' motif (i.e. the appearance of the first humans from under the earth) and the 'earth diver' motif (Berezkin 2007, Le Quellec 2014). As another example, the myth of the Frog/Toad in the Moon is already documented during the Han dynasty (Dai Lin and Cai Yun-zhang 2005), and it propagated over very large distances, from Asia to the northwest coast of North America where it is widespread, without much change. Other motifs are found in similar complex stories and are widespread on either side of the Bering Strait (for numerous instances, see Hatt (1949), Berezkin (2013)).

    In their studies, folklorists and folk tale specialists generally use Thompson's repertoire of motifs (1955-1958), but this tool is poorly suited to global comparative studies because, for example, Eurasia and North America are over-represented in relation to Africa and Oceania. Thompson has a total of 639 bibliographic sources, and Berezkin no less than 7456, of which more than 2484 are original sources in Russian, mostly about Siberian, Altaic and Finno-Ugric peoples rarely or never mentioned in Thompson's motif index. As far as Africa is concerned, Berezkin uses 469 sources, whereas Thompson used only 58.

    We therefore use here the considerable database of myths compiled by Berezkin (2015b), which is more comprehensive and better suited to mythological studies. This corpus contains over 2264 motifs from over 934 different peoples from all over the world. It was compiled manually and is based on the reading of some 10,000 books, papers and various reports in multiple languages.

    A particular myth can be studied in all its details and versions to identify its transformations. This approach allows integrating information from different disciplines, for instance linguistics, anthropology, astronomy, or from ancient written sources. Such work has been done, for example, with the myth of the bird-nester in America (Levi-Strauss 1964-1971) and Eurasia (Sergent 2009). Alternatively, one may consider a very large corpus of myths or mythological motifs and extract general trends. This last approach has the advantage of facilitating a global analysis. The difference between the two approaches is equivalent, in the field of genetics, to the difference between the study of a particular gene and a whole genome analysis. A global study of Berezkin's corpus was previously done (Korotayev and Khaltourina 2011, Berezkin 2013, 2017) using Principal Components Analysis (PCA). PCA is a method well adapted to big data, but it provides a limited amount of information in comparison to the methods used in this study. The analysis showed that the different peoples are grouped within clusters corresponding to well-defined geographical regions. It is one of the goals of this work to verify the existence of these clusters with an independent method and to analyse in more detail the proximity relationships between the different clusters.

  2. Methods

    2.1. Phylogenetic approaches

    The study of myths using mathematical methods has its roots in their formalization, allowing a structural analysis. The use of biological metaphors (for review, see Hafstein 2001) and of statistics (e.g. Boas 1895:341-347) is very old in comparative mythology. Adler (1987) was the first person to apply phylogenetic tools to classify myths and folktales, followed by Oda (2001) and Tehrani (2013). The phylogenetic method was also used to reconstruct the evolution of myths and traditions (d'Huy 2012, Le Quellec 2015) and to study the ecotypification of many variants of the same myth (d'Huy 2013, Ross, Greenhill and Atkinson 2013), and it seems compatible, at least in part, with the structural approach (Thuillard and Le Quellec 2017). This summary is given for the record, and it is important to note that our own paper moves away from these classical phylogenetic approaches.

    After coding, typically with binary characters, the different versions of a myth can be analysed using mathematical or computational methods. The data are coded so that a motif takes state '1' if it is present and state '0' if it is absent. In this sense, each motif can be interpreted as a binary character and each entry (people) as a taxon. The distance between two taxa is computed by summing the distances over all motifs: for a given motif, the distance is zero if the two taxa have the same state and one otherwise. The distances between all pairs of taxa form the distance matrix.
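
    The coding and distance computation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name and the toy presence/absence table are hypothetical, and the distance is the plain count of motifs on which two taxa differ (the Hamming distance).

```python
from typing import List


def hamming_distance_matrix(presence: List[List[int]]) -> List[List[int]]:
    """Pairwise distances between taxa coded as 0/1 motif vectors.

    presence[t][m] is 1 if motif m is recorded for taxon t, else 0.
    The distance between two taxa is the number of motifs on which
    their states differ.
    """
    n_taxa = len(presence)
    dist = [[0] * n_taxa for _ in range(n_taxa)]
    for a in range(n_taxa):
        for b in range(a + 1, n_taxa):
            # Each motif contributes 0 if the states match, 1 otherwise.
            d = sum(x != y for x, y in zip(presence[a], presence[b]))
            dist[a][b] = dist[b][a] = d
    return dist


# Hypothetical toy data: 3 peoples (rows) coded on 4 motifs (columns).
toy = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
]
D = hamming_distance_matrix(toy)
for row in D:
    print(row)
```

    The resulting matrix is symmetric with zeros on the diagonal, which is the form expected as input by distance-based methods.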

    The representation of the different motifs on a phylogenetic tree is based on the following assumptions:

    i) Motifs are transmitted unchanged over time and space except for minor transformations that may be compared to mutations. A mutation is defined as the appearance or disappearance of a given motif.

    ii) A new motif appears only once.

    Condition ii) is a mathematical condition that is seldom perfectly satisfied. A central result in phylogenetics, applied here to myths, is that motifs transforming according to i)-ii) can be exactly represented by a phylogenetic tree (Semple and Steele 2003). Figure 1a illustrates this result. In real applications, if the probability that a motif appears twice is very low, then a phylogenetic tree is often a good representation of the data.
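
    One way to check whether a presence/absence table could be represented exactly by a tree is the classical pairwise compatibility test for binary characters, often called the four-gamete test. It is not described in the text above and is given here only as an illustrative sketch. It assumes a stricter reading of i)-ii), namely that each motif changes state at most once along the tree; under that assumption, two motifs conflict when all four state combinations (0,0), (0,1), (1,0) and (1,1) occur among the taxa, and a set of binary characters fits a tree exactly when no pair conflicts. The function name and the toy tables are hypothetical.

```python
from itertools import combinations
from typing import List, Tuple


def incompatible_motif_pairs(presence: List[List[int]]) -> List[Tuple[int, int]]:
    """Return the pairs of motifs (column indices) that fail the test.

    presence[t][m] is 1 if motif m is recorded for taxon t, else 0.
    A pair of motifs fails when all four state combinations are observed.
    """
    n_motifs = len(presence[0])
    conflicts = []
    for i, j in combinations(range(n_motifs), 2):
        observed = {(row[i], row[j]) for row in presence}
        if len(observed) == 4:
            conflicts.append((i, j))
    return conflicts


# Hypothetical table that fits a tree: no pair of motifs conflicts.
tree_like = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
    [0, 0, 0],
]

# Adding a taxon that carries motifs 1 and 2 but not motif 0 creates
# conflicts, as would happen if a motif appeared twice or was borrowed.
with_conflict = tree_like + [[0, 1, 1]]

print(incompatible_motif_pairs(tree_like))
print(incompatible_motif_pairs(with_conflict))
```

    An empty list means the table is consistent with a tree under the stated assumption; a non-empty list points to the motifs for which a tree alone is too restrictive.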

    Unlike genes, cultural elements can be acquired both from other members of the same group of peoples and from outside that group, i.e. they can move from people to people without the need for those peoples to be genetically related. Thus, the distribution of cultural elements and genetic markers will not necessarily co-occur across different populations. Transmission may occur within a population or through cultural interaction between different populations.

    In recent years, it has become increasingly clear that a phylogenetic tree is often too crude a representation of the relationships between motifs. A distinct group (i.e. taxon) may inherit motifs from several other groups. Figure 1b shows an example in which a motif is inherited both by direct descent and through interaction with a distant taxon. This latter process is named, by analogy with genetics, a lateral transfer. As long as lateral transfers are between adjacent nodes, the different motifs can be represented by a phylogenetic network (Thuillard and Moulton 2011). Phylogenetic analysis of data proceeds in two steps.

    ii) Order the different taxa. Figure 2 shows, using a simple example, the action of the NeighborNet ordering algorithm.

    iii) Validate the data to find out if they fit well to a...
