We are happy to announce the release of PHOIBLE 2.0. Compared to PHOIBLE 1.0, this release adds almost 1,000 new phoneme inventories for almost 500 new languages. It also includes information about allophones for many inventories.
Since PHOIBLE 1.0 was published well before the CLDF specification, a CLDF
representation of the data was only added last year, created from the database
underlying the web application.
With PHOIBLE 2.0 we turned this process around. So now, the database for the
clld app is initialized with data from the CLDF dataset.
This new workflow is in line with our strategy of moving the curation of our
datasets to a more homogeneous environment, using formats like CLDF and
collaboration platforms like GitHub. It also helps us make the code that
initializes clld databases simpler and more transparent, because CLDF provides a
lot more uniformity than the diverse “raw” data formats we had before.
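
To give a rough idea of why a uniform format simplifies initialization, here is a minimal sketch of loading a CLDF-style value table into a database. It is not the actual clld initialization code: it uses only the Python standard library, the `init_db` and `inventory_sizes` helpers are hypothetical, and the sample rows are made up. Only the column names (`ID`, `Language_ID`, `Parameter_ID`, `Value`) follow the CLDF defaults for a value table.

```python
import csv
import io
import sqlite3

# Made-up rows in the shape of a CLDF ValueTable (values.csv).
# Column names follow CLDF defaults; the content is illustrative only.
VALUES_CSV = """ID,Language_ID,Parameter_ID,Value
1,lang1,p1,a
2,lang1,p2,t
3,lang2,p1,a
"""


def init_db(csv_text):
    """Create an in-memory SQLite database and fill it from CSV text."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE value ("
        "id TEXT PRIMARY KEY, language_id TEXT, parameter_id TEXT, value TEXT)"
    )
    # Each CSV row becomes a dict keyed by the header, which maps
    # directly onto the named placeholders in the INSERT statement.
    rows = csv.DictReader(io.StringIO(csv_text))
    conn.executemany(
        "INSERT INTO value VALUES (:ID, :Language_ID, :Parameter_ID, :Value)",
        rows,
    )
    conn.commit()
    return conn


def inventory_sizes(conn):
    """Return the number of value rows per language as a dict."""
    query = "SELECT language_id, COUNT(*) FROM value GROUP BY language_id"
    return dict(conn.execute(query))


conn = init_db(VALUES_CSV)
print(inventory_sizes(conn))
```

Because every table has the same predictable layout, the loading step stays the same from dataset to dataset, and only the schema mapping needs attention.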
This process also has advantages for long-term access to our datasets: with CLDF data archived on ZENODO, we are implementing current best practices of research data management. This also means that the main citable unit is no longer the web application but the CLDF dataset, identified by the DOI minted by ZENODO. Fortunately, ZENODO allows adding alternative identifiers for datasets, such as the URL of the web application. So hopefully, this gain in long-term data access does not come at the price of fractured citations of essentially the same dataset.