There are three major ways in which the CLLD project helps to publish cross-linguistic datasets:
The following datasets are maintained by the CLLD project, i.e. they fall into categories 1 and 2 above:
Name | Description | Editors | CLDF dataset on Zenodo |
---|---|---|---|
WALS Online | The World Atlas of Language Structures | Matthew Dryer & Martin Haspelmath | |
WOLD | The World Loanword Database | Martin Haspelmath & Uri Tadmor | |
APiCS Online | The Atlas of Pidgin and Creole Language Structures | Susanne Maria Michaelis, Philippe Maurer, Martin Haspelmath, and Magnus Huber | |
ValPaL | Valency Patterns Leipzig | Iren Hartmann, Martin Haspelmath & Bradley Taylor | |
eWAVE | The Electronic World Atlas of Varieties of English | Bernd Kortmann & Kerstin Lunkenheimer | |
AfBo | A world-wide survey of affix borrowing | Frank Seifart | |
IDS | The Intercontinental Dictionary Series | Bernard Comrie & Hans-Jörg Bibiko | |
ASJP | The database of the Automated Similarity Judgement Program | Søren Wichmann et al. | |
Numerals | Numerals in the World’s Languages | Eugene Chan | |
Glottolog | Catalog of all languages, families and dialects, with comprehensive reference information | Harald Hammarström, Martin Haspelmath, Robert Forkel & Sebastian Bank | |
SAILS Online | The South American Indigenous Language Structures Online | Harald Hammarström | |
PHOIBLE Online | The world's largest database of phonological inventories | Steven Moran, Daniel McCloy and Richard Wright | |
Tsammalex | A multilingual lexical database on plants and animals | Christfried Naumann & Steven Moran & Guillaume Segerer & Robert Forkel | |
CSD | The Comparative Siouan Dictionary | Rankin, Robert L. & Carter, Richard T. & Jones, A. Wesley & Koontz, John E. & Rood, David S. & Hartmann, Iren | |
Concepticon | The Concepticon | List, Johann Mattis & Rzymski, Christoph & Greenhill, Simon & Schweikhard, Nathanael & Pianykh, Kristina & Tjuka, Annika & Wu, Mei-Shin & Forkel, Robert | |
Dogonlanguages | Dogon and Bangime Linguistics | Moran, Steven & Forkel, Robert & Heath, Jeffrey | |
Dictionaria | Open-access journal publishing dictionaries from all over the world | Chief editors: Haspelmath, Martin & Stiebels, Barbara; Managing editor: Hartmann, Iren | |
LDH | The Language Description Heritage library | Managing editor: Robert Forkel | Community on Zenodo |
TuLeD | Tupían Lexical Database | Fabrício Ferraz Gerardi and Stanislav Reichert | |
Among datasets in category 3 above, the following have come to our attention:
Name | Description | Editors |
---|---|---|
NorthEuraLex | Lexicostatistical Database of Northern Eurasia | Johannes Dellert and Gerhard Jäger |
MosLex | Moscow Lexical Database | Alexei Kassian |
DoReCo | Language DOcumentation REference COrpus | Frank Seifart et al. |
CLLD datasets follow the update model of traditional publications: errata and additions are collected until a new edition of the dataset is released. We typically aim to release no more than one edition per year.
However, because online publications can be updated continuously and we still want to take advantage of this, we distinguish two categories of data:
CLLD data is meant to be easy to reuse, and we would love to hear about cases where it has been reused, be it in research or teaching.