CLDR-based language data and utilities
The language data is populated from the current version of the CLDR supplemental data and includes the following details:
- The script in which a language is written
- The script code
- The language code
- The regions in which the language is spoken
- The autonym, that is, the language name written in its own script
- The directionality of the text
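As a rough illustration of the fields listed above, one language record could be modeled as in the following sketch. The `LanguageRecord` class, its field names, the `RTL_SCRIPTS` set, and the rule that derives directionality from the script are all hypothetical and are not this library's API. The region codes in the sample values are placeholders.

```python
from dataclasses import dataclass, field

# Hypothetical subset of right-to-left ISO 15924 script codes, for illustration only.
RTL_SCRIPTS = {"Arab", "Hebr", "Syrc", "Thaa"}


@dataclass
class LanguageRecord:
    code: str                                     # language code, e.g. "he"
    script: str                                   # ISO 15924 script code, e.g. "Hebr"
    regions: list = field(default_factory=list)   # region codes (placeholders here)
    autonym: str = ""                             # language name in its own script

    @property
    def direction(self) -> str:
        """Return "rtl" or "ltr" based on the script (an assumed rule)."""
        return "rtl" if self.script in RTL_SCRIPTS else "ltr"


# Sample records; "ME" and "EU" are placeholder region codes.
hebrew = LanguageRecord("he", "Hebr", ["ME"], "עברית")
english = LanguageRecord("en", "Latn", ["EU"], "English")
```

Here `hebrew.direction` is derived from the script code rather than stored, which keeps the record limited to the fields listed above.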
Adding languages
New languages must be added to the data/langdb.yaml file.
The file format is:
ISO 639 code: [writing system code, [regions list], autonym]
The writing system is indicated using ISO 15924 codes. Make sure that the code appears in the scriptgroups section towards the end of the file, and add it if it doesn't.
The list of region codes appears at the end of data/langdb.yaml.
The autonym is the name of the language in the language itself. In some cases, for example for extinct languages such as Jewish Babylonian Aramaic (tmr), the name can be something that is useful for modern users, but in most cases it should be the natural name in the language itself. Please do your best to verify the spelling against reliable sources.
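Before adding an entry, it can help to check it against the format described above. The following is a minimal sketch, not a tool shipped with this repository: `check_entry`, `known_scripts`, and `known_regions` are hypothetical names, and the entry is assumed to be already parsed from YAML into a Python list. The sets of known script and region codes would come from the corresponding sections of data/langdb.yaml.

```python
import re

# ISO 15924 script codes are four letters with an initial capital (e.g. "Latn").
SCRIPT_CODE = re.compile(r"[A-Z][a-z]{3}")


def check_entry(code, entry, known_scripts, known_regions):
    """Return a list of problems found in one [script, [regions], autonym] entry."""
    if not isinstance(entry, list) or len(entry) != 3:
        return [f"{code}: expected [script, [regions], autonym]"]

    problems = []
    script, regions, autonym = entry

    if not isinstance(script, str) or not SCRIPT_CODE.fullmatch(script):
        problems.append(f"{code}: '{script}' is not shaped like an ISO 15924 code")
    elif script not in known_scripts:
        problems.append(f"{code}: script '{script}' is missing from scriptgroups")

    if not isinstance(regions, list):
        problems.append(f"{code}: regions must be a list")
    else:
        for region in regions:
            if region not in known_regions:
                problems.append(f"{code}: unknown region '{region}'")

    if not isinstance(autonym, str) or not autonym.strip():
        problems.append(f"{code}: autonym is empty")

    return problems


# Placeholder script and region sets, for illustration only.
ok = check_entry("he", ["Hebr", ["ME"], "עברית"], {"Hebr"}, {"ME"})
bad = check_entry("xx", ["hebr", [], ""], {"Hebr"}, {"ME"})
```

In this sketch `ok` is an empty list, while `bad` reports the lowercase script code and the empty autonym. A check like this cannot confirm that the autonym is spelled correctly, so verifying it by hand is still needed.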
After adding a language to data/langdb.yaml, run php ulsdata2json.php in the data/ directory to regenerate the language-data.json file. Don't edit language-data.json manually; it is a generated file.