Welcome!

This website lets you easily search the Manulex databases, which provide a comprehensive description of the written French addressed to children as they learn to read in the primary grades (1st to 5th grade).

The databases make it possible to manipulate and control experimental variables in empirical studies on the basis of objective data, and to develop instructional methods in keeping with the distributional characteristics of French orthography.

Searches can be made on any available criterion, and the results can be exported as files compatible with Excel and OpenOffice Calc. The request wizard simplifies complex queries, especially those involving text criteria.

Manulex

Manulex is based on a corpus of 1.9 million words extracted from 54 readers used in French primary schools between the first and fifth grades. The readers cover a range of topic areas, each with an appreciable amount of data coming from different types of texts (from novels to various kinds of fiction, from newspaper reporting to technical writing, and from poetry to theater plays) written by different authors from a variety of backgrounds.

The database contains two lexicons: the wordform lexicon (48,886 entries) and the lemma lexicon (23,812 entries). Each lexicon provides a grade-level-based list of words found in first-grade, second-grade, and third-to-fifth-grade readers (hereafter called levels G1, G2, and G3-5, respectively). A fourth level (G1-5) was generated by combining all readers.

Manulex-infra

Manulex-infra is an extension of Manulex. It was developed to describe the distributional characteristics of the sublexical and lexical units in Manulex.

All entries in the Manulex-wordform lexicon were used for the computations, except abbreviations, interjections, and compound entries. This left a total of 45,080 entries. Among these, 10,861 were in G1, 18,131 were in G2, and 42,422 were in G3-5.
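The selection step described above can be sketched as follows. This is only an illustration: the `Entry` class, the category labels, and the sample entries are invented for the example and do not reflect the actual coding of the Manulex files.

```python
from dataclasses import dataclass

# Hypothetical category labels; the real lexicon may code these differently.
EXCLUDED_CATEGORIES = {"abbreviation", "interjection", "compound"}
LEVELS = ("G1", "G2", "G3-5")


@dataclass(frozen=True)
class Entry:
    wordform: str
    category: str      # hypothetical label, e.g. "noun" or "interjection"
    levels: frozenset  # grade levels in which the wordform occurs


def select_entries(entries):
    """Drop abbreviations, interjections, and compound entries."""
    return [e for e in entries if e.category not in EXCLUDED_CATEGORIES]


def count_by_level(entries):
    """Count how many entries occur at each grade level."""
    return {level: sum(level in e.levels for e in entries) for level in LEVELS}


# Invented sample entries, not taken from the database.
sample = [
    Entry("chat", "noun", frozenset({"G1", "G2", "G3-5"})),
    Entry("grandeur", "noun", frozenset({"G3-5"})),
    Entry("oh", "interjection", frozenset({"G1", "G2"})),
    Entry("arc-en-ciel", "compound", frozenset({"G2", "G3-5"})),
]

kept = select_entries(sample)
print(len(kept))             # 2
print(count_by_level(kept))  # {'G1': 1, 'G2': 1, 'G3-5': 2}
```

As in the sketch, the per-level counts reported above add up to more than the total number of entries, since the same wordform can occur at several grade levels.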

At each grade level, quantitative estimates were computed for several infralexical variables, such as grapheme-to-phoneme mappings, bigrams, and syllables, and for lexical variables, such as lexical neighborhood, homophony, and homography.
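To give a concrete idea of one infralexical variable, the sketch below counts letter bigrams over a list of wordforms. It is a simplified illustration, not the procedure used to build Manulex-infra: it counts each wordform once, whereas the database estimates may also take word frequency or bigram position into account. The function name and the word list are invented for the example.

```python
from collections import Counter


def bigram_counts(words):
    """Count letter bigrams over a list of wordforms (one count per occurrence)."""
    counts = Counter()
    for word in words:
        for i in range(len(word) - 1):
            counts[word[i:i + 2]] += 1
    return counts


# Toy word list, not taken from the database.
counts = bigram_counts(["chat", "chien", "chaud"])
print(counts.most_common(2))  # [('ch', 3), ('ha', 2)]
```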

Manulex-morpho

Manulex-morpho (Peereman, Sprenger-Charolles, & Messaoud-Galusi, 2013) extends the description of grapheme-phoneme (GP) and phoneme-grapheme (PG) associations begun in Manulex-infra by considering the contribution of morphological information (essentially inflectional) to the consistency of these associations.

The lexical corpus corresponds approximately to the 10,000 most frequent words of the Manulex database. Four categories of morphological cues, the vast majority of which relate to word endings, were used to mark graphemes and phonemes: 1) gender and number inflections; 2) verbal inflections; 3) the grapheme -ent and its corresponding pronunciation when it occurs in adverbs ending with the derivational suffix -ment (e.g., rarement, "rarely"); and 4) final consonants that can be silent in the word itself but are pronounced in inflected or derived words (e.g., -d in grand-grande-grandeur).
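The effect of morphological marking on consistency can be illustrated with a small sketch. It assumes a common definition of consistency, namely the share of a grapheme's occurrences that receive a given pronunciation; the exact computation used for Manulex-morpho may differ. The function name, the phoneme labels, the "+adverb" and "+verb" tags, and the verb example "parlent" are all invented for the illustration.

```python
from collections import Counter


def gp_consistency(occurrences):
    """Map each (grapheme, phoneme) pair to its share of the grapheme's occurrences."""
    pair_counts = Counter(occurrences)
    grapheme_totals = Counter()
    for (grapheme, _phoneme), n in pair_counts.items():
        grapheme_totals[grapheme] += n
    return {
        pair: n / grapheme_totals[pair[0]]
        for pair, n in pair_counts.items()
    }


# Toy data with hypothetical phoneme labels: "an" for the nasal vowel heard in
# "rarement", "#" for the silent verbal ending of "parlent".
unmarked = [("ent", "an"), ("ent", "#")]
marked = [("ent+adverb", "an"), ("ent+verb", "#")]

print(gp_consistency(unmarked))  # {('ent', 'an'): 0.5, ('ent', '#'): 0.5}
print(gp_consistency(marked))    # {('ent+adverb', 'an'): 1.0, ('ent+verb', '#'): 1.0}
```

Without the morphological tag, -ent looks inconsistent because it has two competing pronunciations. Once the grapheme is marked by category, each marked form maps to a single pronunciation in this toy data.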

The online searchable version of Manulex-morpho is not currently available. Statistical descriptions by word or by GP/PG association (Excel files) are available on the Manulex-morpho website.

Authors

These are the people who have been involved in creating the databases and this website. You can find more information about our work on our respective pages. You can also get the original papers from the download page.

Website

The electronic eManulex website was created by:

Bernard Lété, Professor of Cognitive Psychology, Université Lumière Lyon 2, Laboratoire d’Étude des Mécanismes Cognitifs (EA 3082)
Éric Ortega

Manulex

The Manulex database was developed by:

Bernard Lété, Professor of Cognitive Psychology, Université Lumière Lyon 2, Laboratoire d’Étude des Mécanismes Cognitifs (EA 3082)
Liliane Sprenger-Charolles, Senior Researcher (Emeritus), Aix-Marseille Université, Laboratoire de Psychologie Cognitive (UMR 7290)
Pascale Colé, Professor of Cognitive Psychology, Aix-Marseille Université, Laboratoire de Psychologie Cognitive (UMR 7290)

Manulex-infra

The Manulex-infra database was developed by:

Ronald Peereman, CNRS Researcher (Chargé de Recherche), Université Grenoble Alpes, Laboratoire de Psychologie et NeuroCognition (UMR 5105)
Bernard Lété, Professor of Cognitive Psychology, Université Lumière Lyon 2, Laboratoire d’Étude des Mécanismes Cognitifs (EA 3082)
Liliane Sprenger-Charolles, Senior Researcher (Emeritus), Aix-Marseille Université, Laboratoire de Psychologie Cognitive (UMR 7290)

Citations

For Manulex:

Lété, B., Sprenger-Charolles, L., & Colé, P. (2004). Manulex: A grade-level lexical database from French elementary-school readers. Behavior Research Methods, Instruments, & Computers, 36, 156-166.

For Manulex-infra:

Peereman, R., Lété, B., & Sprenger-Charolles, L. (2007). Manulex-infra: Distributional characteristics of grapheme-phoneme mappings, infra-lexical and lexical units in child-directed written material. Behavior Research Methods, 39, 593-603.

For Manulex-morpho:

Peereman, R., Sprenger-Charolles, L., & Messaoud-Galusi, S. (2013). The contribution of morphology to the consistency of spelling-to-sound relations: A quantitative analysis based on French elementary school readers. L'Année Psychologique, 113, 3-33.

For the electronic version eManulex:

Ortega, É., & Lété, B. (2010). eManulex: Electronic version of Manulex and Manulex-infra databases. Retrieved from http://www.manulex.org.

Note: any reference to the electronic version must always be accompanied by references to the corresponding papers.