CLLD – Cross-Linguistic Linked Data


posted Monday, September 15, 2014 by Robert Forkel

While checking whether the newest CLLD archive had been registered successfully with OLAC (the Open Language Archives Community), I thought it was time for some bragging.

OLAC uses the OAI-PMH protocol to harvest metadata for language resources from participating archives. Since each application built with the clld framework comes with an implementation of the data provider side of this protocol, registration of a new archive is simple. By default, the details pages of each language identified with an ISO 639-3 code will be disseminated as language resources.
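To make the harvesting step a little more concrete, here is a minimal sketch in Python of what a harvester does: it builds an OAI-PMH request URL and reads record identifiers from the XML response. The base URL `https://example.org/oai`, the identifiers in the sample response, and the helper names are hypothetical and chosen for illustration only; this is not the clld implementation. The verb `ListIdentifiers`, the `metadataPrefix` parameter, and the XML namespace come from the OAI-PMH 2.0 specification, and `olac` is assumed to be the metadata prefix an OLAC data provider exposes.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"


def build_request(base_url, verb, **params):
    """Return the GET URL for an OAI-PMH request (hypothetical helper)."""
    return f"{base_url}?{urlencode({'verb': verb, **params})}"


def record_identifiers(xml_text):
    """Extract the identifier of every record header in a response."""
    root = ET.fromstring(xml_text)
    path = f".//{{{OAI_NS}}}header/{{{OAI_NS}}}identifier"
    return [el.text for el in root.iterfind(path)]


# Hand-written sample response; a real harvester would fetch this over HTTP.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>oai:example.org:deu</identifier>
      <datestamp>2014-09-15</datestamp>
    </header>
    <header>
      <identifier>oai:example.org:fra</identifier>
      <datestamp>2014-09-15</datestamp>
    </header>
  </ListIdentifiers>
</OAI-PMH>
""".strip()

url = build_request(
    "https://example.org/oai", "ListIdentifiers", metadataPrefix="olac"
)
print(url)
print(record_identifiers(SAMPLE_RESPONSE))
```

In this sketch each record corresponds to one language details page, which mirrors the default described above; how identifiers are actually formed in a given clld application is not specified here.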

Now on to the bragging. As can be seen on OLAC's archive metrics page:

  • Three of the top ten archives ranked by number of distinct languages (including number one, Glottolog) are CLLD databases.
  • When ranked by metadata quality, 7 of the 20 five-star archives are CLLD databases.

It has to be noted, though, that OLAC only indexes resources for languages identified by ISO 639-3 codes, which makes it difficult to register the resources provided by APiCS Online or eWAVE. Hopefully it will someday be possible to also register resources tagged with a Glottocode (Glottolog's languoid identifier).