Move documentation to Read the Docs
Read the Docs is used to host the documentation: https://language-data.readthedocs.io/en/latest/

The following documentation has been moved:
* Introduction
* Using the PHP / Node.js libraries
* Adding new languages
* PHP API documentation

Updated README.md to point to the new documentation.

Doxygen is used to extract the PHPDoc comments into XML. doxyphp2sphinx converts that XML into reStructuredText for Sphinx, which Read the Docs then uses to generate the documentation.

Read the Docs has been configured to regenerate the code documentation under the docs/api folder automatically whenever a commit is made, so no manual work is needed.

Bug: T218639
This commit is contained in:
8
.gitignore
vendored
@@ -15,4 +15,10 @@ yarn-error.log*
*.sln

# PHP
vendor/
vendor/

# Docs
docs/_build/
docs/warnings.log
docs/xml/
docs/html/
46
README.md
@@ -1,41 +1,35 @@
|
||||
# CLDR based language data and utilities
|
||||
|
||||
[![npm][npm]][npm-url]
|
||||
[![node-build][node-build]][node-build-url]
|
||||
[![php-build][php-build]][php-build-url]
|
||||
|
||||
The language data with following details are populated from the current version of [CLDR supplemental data](http://unicode.org/repos/cldr/trunk/common/supplemental/supplementalData.xml)
|
||||
This library contains language related data, and utility libraries written in PHP and Node.js to
|
||||
interact with that data.
|
||||
|
||||
1. The script in which a language is written.
|
||||
The language-related data comprises the following:
|
||||
|
||||
1. The script in which a language is written
|
||||
2. The script code
|
||||
3. The language code
|
||||
4. The regions in which the language is spoken
|
||||
5. The autonym - language name written in its own script
|
||||
6. The directionality of the text
|
||||
|
||||
## Adding languages
|
||||
This data is populated from the current version of
|
||||
[CLDR supplemental data](http://unicode.org/repos/cldr/trunk/common/supplemental/supplementalData.xml).
|
||||
|
||||
New languages must be added to the `data/langdb.yaml` file.
|
||||
## Documentation
|
||||
|
||||
The file format is:
|
||||
|
||||
ISO 639 code: [writing system code, [regions list], autonym]
|
||||
|
||||
The writing system is indicated using ISO 15924 codes. Make sure that the code appears in the scriptgroups section towards the end of the file, and add it if it doesn't.
|
||||
|
||||
The list of region codes appears at the end of `data/langdb.yaml`.
|
||||
|
||||
The autonym is the name of the language in the language itself. In some cases, for example for extinct languages such as Jewish Babylonian Aramaic (tmr), the name can be something that is useful for modern users, but in most cases it should be the natural name in the language itself. Please do your best to verify that it's spelled correctly in reliable sources.
|
||||
|
||||
After adding a language to `data/langdb.yaml`, run `php src/util/ulsdata2json.php` in the base directory to generate the language-data.json file. Don't edit language-data.json manually.
|
||||
|
||||
Example:
|
||||
`myv: [Cyrl, [EU], эрзянь]`
|
||||
|
||||
This is the [Erzya language](https://en.wikipedia.org/wiki/Erzya_language). Its writing system is Cyrillic (ISO 15924: Cyrl). It's spoken in Europe (EU). Its autonym is "эрзянь".
|
||||
|
||||
Some languages are listed as redirects. In this case, the only value in the square brackets is the target language code. For example:
|
||||
`fil: [tl]`
|
||||
|
||||
This is the Filipino language, which is a redirect to Tagalog (tl).
|
||||
1. [Full documentation](https://language-data.readthedocs.io/en/latest/index.html)
|
||||
2. [Using the PHP library](https://language-data.readthedocs.io/en/latest/index.html#using-the-php-library)
|
||||
* [PHP API documentation](https://language-data.readthedocs.io/en/latest/api/languagedata.html)
|
||||
3. [Using the Node.js library](https://language-data.readthedocs.io/en/latest/index.html#using-the-node-js-library)
|
||||
4. [Adding Languages](https://language-data.readthedocs.io/en/latest/user/adding_new_language.html)
|
||||
|
||||
[npm]: https://img.shields.io/npm/v/@wikimedia/language-data.svg
|
||||
[npm-url]: https://npmjs.com/package/@wikimedia/language-data
|
||||
[npm-url]: https://npmjs.com/package/@wikimedia/language-data
|
||||
[node-build]: https://github.com/Abijeet/language-data/workflows/Node.js%20build/badge.svg
|
||||
[node-build-url]: https://github.com/Abijeet/language-data/actions?query=workflow%3A%22Node.js+build%22
|
||||
[php-build]: https://github.com/Abijeet/language-data/workflows/PHP%20build/badge.svg
|
||||
[php-build-url]: https://github.com/Abijeet/language-data/actions?query=workflow%3A%22PHP+build%22
|
||||
2494
docs/Doxyfile
Normal file
File diff suppressed because it is too large
19
docs/Makefile
Normal file
@@ -0,0 +1,19 @@
|
||||
# Minimal makefile for Sphinx documentation
|
||||
#
|
||||
|
||||
# You can set these variables from the command line.
|
||||
SPHINXOPTS =
|
||||
SPHINXBUILD = sphinx-build
|
||||
SOURCEDIR = .
|
||||
BUILDDIR = _build
|
||||
|
||||
# Put it first so that "make" without argument is like "make help".
|
||||
help:
|
||||
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
|
||||
|
||||
.PHONY: help Makefile
|
||||
|
||||
# Catch-all target: route all unknown targets to Sphinx using the new
|
||||
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
|
||||
%: Makefile
|
||||
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
|
||||
5
docs/_templates/breadcrumbs.html
vendored
Normal file
@@ -0,0 +1,5 @@
|
||||
<!-- Remove the Edit on Github link -->
|
||||
{%- extends "sphinx_rtd_theme/breadcrumbs.html" %}
|
||||
|
||||
{% block breadcrumbs_aside %}
|
||||
{% endblock %}
|
||||
10
docs/api.rst
Normal file
@@ -0,0 +1,10 @@
|
||||
API documentation
|
||||
=================
|
||||
|
||||
.. toctree::
|
||||
:glob:
|
||||
|
||||
api/*
|
||||
|
||||
|
||||
|
||||
172
docs/api/languagedata.rst
Normal file
@@ -0,0 +1,172 @@
|
||||
LanguageData
|
||||
============
|
||||
|
||||
A singleton utility class to query the language data.
|
||||
|
||||
:Qualified name: ``Wikimedia\LanguageData``
|
||||
|
||||
.. php:class:: LanguageData
|
||||
|
||||
.. php:method:: addLanguage (string $languageCode, array $options)
|
||||
|
||||
Adds a language in run time and sets its options as provided. If the target option is provided, the language is defined as a redirect. Other possible options are script (string), regions (array) and autonym (string).
|
||||
|
||||
:param string $languageCode:
|
||||
New language code.
|
||||
:param array $options:
|
||||
Language properties.
|
||||
|
||||
.. php:method:: getAutonym (string $languageCode)
|
||||
|
||||
Returns the autonym of the language
|
||||
|
||||
:param string $languageCode:
|
||||
Language code
|
||||
:returns: string|bool Autonym of the language or false if the language is unknown
|
||||
|
||||
.. php:method:: getAutonyms () -> array
|
||||
|
||||
Returns all language codes and corresponding autonyms
|
||||
|
||||
:returns: array -- The key is the language code, and the value is the corresponding autonym
|
||||
|
||||
.. php:method:: getDir (string $languageCode)
|
||||
|
||||
Return the direction of the language
|
||||
|
||||
:param string $languageCode:
|
||||
Language code
|
||||
:returns: string|bool Returns 'rtl' or 'ltr'. If the language code is unknown, returns false.
|
||||
|
||||
.. php:method:: getGroupOfScript (string $script) -> string
|
||||
|
||||
Returns the script group of a script or "Other" if it doesn't belong to any group
|
||||
|
||||
:param string $script:
|
||||
Name of the script
|
||||
:returns: string -- Script group name or "Other" if the script doesn't belong to any group
|
||||
|
||||
.. php:method:: getLanguages ()
|
||||
|
||||
Get all the languages. The properties in the returned object are ISO 639 language codes. The value of each property is an array of the form [writing system code, [regions list], autonym].
|
||||
|
||||
:returns: object
|
||||
|
||||
.. php:method:: getLanguagesByScriptGroup (array $languageCodes) -> array
|
||||
|
||||
Return the list of languages passed, grouped by their script group
|
||||
|
||||
:param array $languageCodes:
|
||||
List of language codes to group
|
||||
:returns: array -- List of language codes grouped by script group
|
||||
|
||||
.. php:method:: getLanguagesByScriptGroupInRegion (string $region) -> LanguageData::getLanguagesByScriptGroupInRegions
|
||||
|
||||
Returns an associative array of languages in a region, grouped by their script
|
||||
|
||||
:param string $region:
|
||||
Region code
|
||||
:returns: :class:`LanguageData::getLanguagesByScriptGroupInRegions` --
|
||||
|
||||
.. php:method:: getLanguagesByScriptGroupInRegions (array $regions) -> array
|
||||
|
||||
Returns an associative array of languages in several regions, grouped by script group
|
||||
|
||||
:param array $regions:
|
||||
List of strings representing region codes
|
||||
:returns: array -- Returns an associative array. The key is the script group name, and the value is a list of language codes in those regions.
|
||||
|
||||
.. php:method:: getLanguagesInScript (string $script) -> array
|
||||
|
||||
Returns all languages written in the given script
|
||||
|
||||
:param string $script:
|
||||
Name of the script
|
||||
:returns: array --
|
||||
|
||||
.. php:method:: getLanguagesInScripts (array $scripts) -> array
|
||||
|
||||
Returns all languages written in the given scripts
|
||||
|
||||
:param array $scripts:
|
||||
List of strings, each being the name of a script
|
||||
:returns: array --
|
||||
|
||||
.. php:method:: getLanguagesInTerritory (string $territory)
|
||||
|
||||
Returns the languages spoken in a territory
|
||||
|
||||
:param string $territory:
|
||||
Territory code
|
||||
:returns: array|bool List of language codes in the territory, or else false if invalid territory is passed
|
||||
|
||||
.. php:method:: getRegions (string $languageCode)
|
||||
|
||||
Returns the regions in which a language is spoken
|
||||
|
||||
:param string $languageCode:
|
||||
Language code
|
||||
:returns: array|bool List of regions or false if language is unknown
|
||||
|
||||
.. php:method:: getScript (string $languageCode)
|
||||
|
||||
Returns the script of the language
|
||||
|
||||
:param string $languageCode:
|
||||
Language code
|
||||
:returns: string|bool Language script or false if the language is unknown
|
||||
|
||||
.. php:method:: getScriptGroupOfLanguage (string $languageCode) -> string
|
||||
|
||||
Returns the script group of a language. Language belongs to a script, and the script belongs to a script group
|
||||
|
||||
:param string $languageCode:
|
||||
Language code
|
||||
:returns: string -- script group name
|
||||
|
||||
.. php:method:: isKnown (string $languageCode) -> bool
|
||||
|
||||
Checks if a language code is valid
|
||||
|
||||
:param string $languageCode:
|
||||
Language code
|
||||
:returns: bool --
|
||||
|
||||
.. php:method:: isRedirect (string $languageCode)
|
||||
|
||||
Checks if the language is a redirect and returns the target language code
|
||||
|
||||
:param string $languageCode:
|
||||
Language code
|
||||
:returns: string|bool Target language code if it's a redirect or false if it's not
|
||||
|
||||
.. php:method:: isRtl (string $languageCode) -> bool
|
||||
|
||||
Check if a language is right-to-left
|
||||
|
||||
:param string $languageCode:
|
||||
Language code
|
||||
:returns: bool -- true if it is an RTL language, else false. Returns false if an unknown language code is passed.
|
||||
|
||||
.. php:method:: sortByAutonym (array $languageCodes) -> array
|
||||
|
||||
Sort languages by their autonym
|
||||
|
||||
:param array $languageCodes:
|
||||
List of language codes to sort
|
||||
:returns: array -- List of language codes sorted by their autonym
|
||||
|
||||
.. php:method:: sortByScriptGroup (array $languageCodes) -> array
|
||||
|
||||
Return the list of languages sorted by their script groups
|
||||
|
||||
:param array $languageCodes:
|
||||
List of language codes to sort
|
||||
:returns: array -- Sorted list of strings containing language codes
|
||||
|
||||
.. php:staticmethod:: get () -> LanguageData
|
||||
|
||||
Returns an instance of the class that can be used to then call the other methods in the class.
|
||||
|
||||
:returns: :class:`LanguageData` --
|
||||
|
||||
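The grouping behaviour documented for `getGroupOfScript` and `getLanguagesByScriptGroup` can be illustrated with a short Python sketch. This is not the library's code: the data subset, the script-group mapping, the region codes and the function names are assumptions made for illustration, and silently skipping unknown language codes is an assumed behaviour rather than a documented one.

```python
# Illustrative sketch only: a tiny, hypothetical subset of the language data.
OTHER_SCRIPT_GROUP = "Other"

# Assumed shape: language code -> [script, [regions], autonym]
LANGUAGES = {
    "en": ["Latn", ["EU", "AM"], "English"],
    "myv": ["Cyrl", ["EU"], "эрзянь"],
    "he": ["Hebr", ["ME"], "עברית"],
}

# Assumed shape: script group name -> list of script codes.
# "Hebr" is deliberately left out so that it falls back to "Other".
SCRIPT_GROUPS = {
    "Latin": ["Latn"],
    "Cyrillic": ["Cyrl"],
}


def get_group_of_script(script):
    """Return the script group of a script, or "Other" if it has none."""
    for group, scripts in SCRIPT_GROUPS.items():
        if script in scripts:
            return group
    return OTHER_SCRIPT_GROUP


def get_languages_by_script_group(language_codes):
    """Group the given language codes by the script group of their script."""
    grouped = {}
    for code in language_codes:
        entry = LANGUAGES.get(code)
        if entry is None:
            # Assumption: unknown codes are skipped.
            continue
        grouped.setdefault(get_group_of_script(entry[0]), []).append(code)
    return grouped
```

Calling `get_languages_by_script_group(["en", "myv", "he"])` on this sample data puts `he` under the fallback group, because its script is not listed in the hypothetical `SCRIPT_GROUPS` mapping.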
190
docs/conf.py
Normal file
@@ -0,0 +1,190 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
#
|
||||
# Configuration file for the Sphinx documentation builder.
|
||||
#
|
||||
# This file does only contain a selection of the most common options. For a
|
||||
# full list see the documentation:
|
||||
# http://www.sphinx-doc.org/en/master/config
|
||||
|
||||
# -- Path setup --------------------------------------------------------------
|
||||
|
||||
# If extensions (or modules to document with autodoc) are in another directory,
|
||||
# add these directories to sys.path here. If the directory is relative to the
|
||||
# documentation root, use os.path.abspath to make it absolute, like shown here.
|
||||
#
|
||||
# import os
|
||||
# import sys
|
||||
# sys.path.insert(0, os.path.abspath('.'))
|
||||
|
||||
|
||||
# -- Project information -----------------------------------------------------
|
||||
|
||||
project = u'LanguageData'
|
||||
copyright = u'2020, Wikimedia Foundation'
|
||||
author = u'Wikimedia Foundation'
|
||||
|
||||
# The short X.Y version
|
||||
version = u''
|
||||
# The full version, including alpha/beta/rc tags
|
||||
release = u''
|
||||
|
||||
|
||||
# -- General configuration ---------------------------------------------------
|
||||
|
||||
# If your documentation needs a minimal Sphinx version, state it here.
|
||||
#
|
||||
# needs_sphinx = '1.0'
|
||||
|
||||
# Add any Sphinx extension module names here, as strings. They can be
|
||||
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
|
||||
# ones.
|
||||
extensions = [
|
||||
"sphinxcontrib.phpdomain"
|
||||
]
|
||||
|
||||
# Add any paths that contain templates here, relative to this directory.
|
||||
templates_path = ['_templates']
|
||||
|
||||
# The suffix(es) of source filenames.
|
||||
# You can specify multiple suffix as a list of string:
|
||||
#
|
||||
# source_suffix = ['.rst', '.md']
|
||||
source_suffix = '.rst'
|
||||
|
||||
# The master toctree document.
|
||||
master_doc = 'index'
|
||||
|
||||
# The language for content autogenerated by Sphinx. Refer to documentation
|
||||
# for a list of supported languages.
|
||||
#
|
||||
# This is also used if you do content translation via gettext catalogs.
|
||||
# Usually you set "language" from the command line for these cases.
|
||||
language = None
|
||||
|
||||
# List of patterns, relative to source directory, that match files and
|
||||
# directories to ignore when looking for source files.
|
||||
# This pattern also affects html_static_path and html_extra_path.
|
||||
exclude_patterns = [u'_build', 'Thumbs.db', '.DS_Store']
|
||||
|
||||
# The name of the Pygments (syntax highlighting) style to use.
|
||||
pygments_style = None
|
||||
|
||||
|
||||
# -- Options for HTML output -------------------------------------------------
|
||||
|
||||
# The theme to use for HTML and HTML Help pages. See the documentation for
|
||||
# a list of builtin themes.
|
||||
#
|
||||
html_theme = 'sphinx_rtd_theme'
|
||||
|
||||
# Theme options are theme-specific and customize the look and feel of a theme
|
||||
# further. For a list of options available for each theme, see the
|
||||
# documentation.
|
||||
#
|
||||
# html_theme_options = {}
|
||||
|
||||
# Add any paths that contain custom static files (such as style sheets) here,
|
||||
# relative to this directory. They are copied after the builtin static files,
|
||||
# so a file named "default.css" will overwrite the builtin "default.css".
|
||||
html_static_path = ['_static']
|
||||
|
||||
# Custom sidebar templates, must be a dictionary that maps document names
|
||||
# to template names.
|
||||
#
|
||||
# The default sidebars (for documents that don't match any pattern) are
|
||||
# defined by theme itself. Builtin themes are using these templates by
|
||||
# default: ``['localtoc.html', 'relations.html', 'sourcelink.html',
|
||||
# 'searchbox.html']``.
|
||||
#
|
||||
# html_sidebars = {}
|
||||
|
||||
|
||||
# -- Options for HTMLHelp output ---------------------------------------------
|
||||
|
||||
# Output file base name for HTML help builder.
|
||||
htmlhelp_basename = 'LanguageDatadoc'
|
||||
|
||||
|
||||
# -- Options for LaTeX output ------------------------------------------------
|
||||
|
||||
latex_elements = {
|
||||
# The paper size ('letterpaper' or 'a4paper').
|
||||
#
|
||||
# 'papersize': 'letterpaper',
|
||||
|
||||
# The font size ('10pt', '11pt' or '12pt').
|
||||
#
|
||||
# 'pointsize': '10pt',
|
||||
|
||||
# Additional stuff for the LaTeX preamble.
|
||||
#
|
||||
# 'preamble': '',
|
||||
|
||||
# Latex figure (float) alignment
|
||||
#
|
||||
# 'figure_align': 'htbp',
|
||||
}
|
||||
|
||||
# Grouping the document tree into LaTeX files. List of tuples
|
||||
# (source start file, target name, title,
|
||||
# author, documentclass [howto, manual, or own class]).
|
||||
latex_documents = [
|
||||
(master_doc, 'LanguageData.tex', u'LanguageData Documentation',
|
||||
u'Wikimedia Foundation', 'manual'),
|
||||
]
|
||||
|
||||
|
||||
# -- Options for manual page output ------------------------------------------
|
||||
|
||||
# One entry per manual page. List of tuples
|
||||
# (source start file, name, description, authors, manual section).
|
||||
man_pages = [
|
||||
(master_doc, 'languagedata', u'LanguageData Documentation',
|
||||
[author], 1)
|
||||
]
|
||||
|
||||
|
||||
# -- Options for Texinfo output ----------------------------------------------
|
||||
|
||||
# Grouping the document tree into Texinfo files. List of tuples
|
||||
# (source start file, target name, title, author,
|
||||
# dir menu entry, description, category)
|
||||
texinfo_documents = [
|
||||
(master_doc, 'LanguageData', u'LanguageData Documentation',
|
||||
author, 'LanguageData', 'One line description of project.',
|
||||
'Miscellaneous'),
|
||||
]
|
||||
|
||||
|
||||
# -- Options for Epub output -------------------------------------------------
|
||||
|
||||
# Bibliographic Dublin Core info.
|
||||
epub_title = project
|
||||
|
||||
# The unique identifier of the text. This can be a ISBN number
|
||||
# or the project homepage.
|
||||
#
|
||||
# epub_identifier = ''
|
||||
|
||||
# A unique identification for the text.
|
||||
#
|
||||
# epub_uid = ''
|
||||
|
||||
# A list of files that should not be packed into the epub file.
|
||||
epub_exclude_files = ['search.html']
|
||||
|
||||
# PHP Syntax
|
||||
from sphinx.highlighting import lexers
|
||||
from pygments.lexers.web import PhpLexer
|
||||
lexers["php"] = PhpLexer(startinline=True, linenos=1)
|
||||
lexers["php-annotations"] = PhpLexer(startinline=True, linenos=1)
|
||||
|
||||
# Set domain
|
||||
primary_domain = "php"
|
||||
|
||||
# Regenerate API docs via doxygen + doxyphp2sphinx
|
||||
import subprocess, os
|
||||
read_the_docs_build = os.environ.get('READTHEDOCS', None) == 'True'
|
||||
if read_the_docs_build:
|
||||
subprocess.call(['doxygen', 'Doxyfile'])
|
||||
subprocess.call(['doxyphp2sphinx', 'Wikimedia'])
|
||||
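A slightly more defensive variant of the regeneration hook in docs/conf.py, as a sketch: it wraps the two commands in a function and uses `subprocess.check_call`, so a failing `doxygen` or `doxyphp2sphinx` run raises an error instead of being ignored. The function name `regenerate_api_docs` and its `env` and `run` parameters are hypothetical, introduced only to make the behaviour easy to exercise; the two commands and the `READTHEDOCS` check are taken from the configuration in this commit.

```python
import os
import subprocess


def regenerate_api_docs(env=None, run=subprocess.check_call):
    """Run doxygen and doxyphp2sphinx, but only on a Read the Docs build.

    Returns True if the commands were run, False otherwise.
    """
    env = os.environ if env is None else env
    if env.get("READTHEDOCS") != "True":
        return False
    # check_call raises CalledProcessError on a non-zero exit status,
    # which would make a broken API docs build fail visibly.
    run(["doxygen", "Doxyfile"])
    run(["doxyphp2sphinx", "Wikimedia"])
    return True
```

In a configuration file this would be called once, as `regenerate_api_docs()`. Whether a failed regeneration should abort the whole documentation build is a design choice; the committed code uses `subprocess.call`, which lets the build continue.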
113
docs/index.rst
Normal file
@@ -0,0 +1,113 @@
|
||||
.. LanguageData documentation master file, created by
|
||||
sphinx-quickstart on Thu Jan 30 13:47:31 2020.
|
||||
You can adapt this file completely to your liking, but it should at least
|
||||
contain the root `toctree` directive.
|
||||
|
||||
CLDR based language data and utilities
|
||||
========================================
|
||||
|
||||
This library contains language related data, and utility libraries written in PHP and Node.js to
|
||||
interact with that data.
|
||||
|
||||
The language-related data comprises the following:
|
||||
|
||||
1. The script in which a language is written
|
||||
2. The script code
|
||||
3. The language code
|
||||
4. The regions in which the language is spoken
|
||||
5. The autonym - language name written in its own script
|
||||
6. The directionality of the text
|
||||
|
||||
This data is populated from the current version of
|
||||
`CLDR supplemental data <http://unicode.org/repos/cldr/trunk/common/supplemental/supplementalData.xml>`_.
|
||||
|
||||
Using the PHP library
|
||||
----------------------------
|
||||
|php-build|
|
||||
|
||||
.. |php-build| image:: https://github.com/Abijeet/language-data/workflows/PHP%20build/badge.svg
|
||||
:target: https://github.com/Abijeet/language-data/actions?query=workflow%3A%22PHP+build%22
|
||||
|
||||
Installation
|
||||
^^^^^^^^^^^^^
|
||||
You can add this library to your project by running:
|
||||
|
||||
.. code-block:: bash

    composer require wikimedia/language-data
|
||||
|
||||
Basic usage
|
||||
^^^^^^^^^^^^^
|
||||
The basic usage is like this:
|
||||
|
||||
.. code-block:: php

    <?php
    use Wikimedia\LanguageData;

    $languageData = LanguageData::get();
    // Returns English
    $languageData->getAutonym( 'en' );
|
||||
|
||||
For a full list of methods see the documentation for the `LanguageData <api/languagedata.html>`_ class.
|
||||
|
||||
Using the Node.js library
|
||||
------------------------------
|
||||
|
||||
|npm| |npm-build|
|
||||
|
||||
.. |npm| image:: https://img.shields.io/npm/v/@wikimedia/language-data.svg
|
||||
:target: https://npmjs.com/package/@wikimedia/language-data
|
||||
|
||||
|
||||
.. |npm-build| image:: https://github.com/Abijeet/language-data/workflows/Node.js%20build/badge.svg
|
||||
:target: https://github.com/Abijeet/language-data/actions?query=workflow%3A%22Node.js+build%22
|
||||
|
||||
Installation
|
||||
^^^^^^^^^^^^^
|
||||
You can add this library to your project by running:
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
npm i @wikimedia/language-data
|
||||
|
||||
Basic usage
|
||||
^^^^^^^^^^^^^
|
||||
The basic usage is like this:
|
||||
|
||||
.. code-block:: js

    const languageData = require('@wikimedia/language-data');

    // Returns English
    languageData.getAutonym( 'en' );
|
||||
|
||||
The exposed methods are similar to the methods present in the PHP `LanguageData <api/languagedata.html>`_ class.
|
||||
|
||||
Contribute
|
||||
----------
|
||||
|
||||
- Issue Tracker: https://github.com/wikimedia/language-data/issues
|
||||
- Source Code: https://github.com/wikimedia/language-data
|
||||
|
||||
Navigation
|
||||
==========
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
:caption: User Documentation
|
||||
|
||||
Adding new languages <user/adding_new_language.rst>
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
:caption: PHP API Documentation
|
||||
|
||||
LanguageData class <api/languagedata.rst>
|
||||
|
||||
|
||||
Indices and tables
|
||||
==================
|
||||
|
||||
* :ref:`genindex`
|
||||
* :ref:`search`
|
||||
35
docs/make.bat
Normal file
@@ -0,0 +1,35 @@
|
||||
@ECHO OFF
|
||||
|
||||
pushd %~dp0
|
||||
|
||||
REM Command file for Sphinx documentation
|
||||
|
||||
if "%SPHINXBUILD%" == "" (
|
||||
set SPHINXBUILD=sphinx-build
|
||||
)
|
||||
set SOURCEDIR=.
|
||||
set BUILDDIR=_build
|
||||
|
||||
if "%1" == "" goto help
|
||||
|
||||
%SPHINXBUILD% >NUL 2>NUL
|
||||
if errorlevel 9009 (
|
||||
echo.
|
||||
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
|
||||
echo.installed, then set the SPHINXBUILD environment variable to point
|
||||
echo.to the full path of the 'sphinx-build' executable. Alternatively you
|
||||
echo.may add the Sphinx directory to PATH.
|
||||
echo.
|
||||
echo.If you don't have Sphinx installed, grab it from
|
||||
echo.http://sphinx-doc.org/
|
||||
exit /b 1
|
||||
)
|
||||
|
||||
%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS%
|
||||
goto end
|
||||
|
||||
:help
|
||||
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS%
|
||||
|
||||
:end
|
||||
popd
|
||||
4
docs/requirements.txt
Normal file
@@ -0,0 +1,4 @@
|
||||
Sphinx==1.8.4
|
||||
sphinx-rtd-theme==0.4.2
|
||||
sphinxcontrib-phpdomain==0.6.3
|
||||
doxyphp2sphinx>=1.0.1
|
||||
31
docs/user/adding_new_language.rst
Normal file
@@ -0,0 +1,31 @@
|
||||
Adding new languages
|
||||
=========================
|
||||
|
||||
New languages must be added to the ``data/langdb.yaml`` file.
|
||||
|
||||
The file format is: `ISO 639 <https://en.wikipedia.org/wiki/ISO_639>`_ code:
|
||||
``[writing system code, [regions list], autonym]``
|
||||
|
||||
The writing system is indicated using `ISO 15924 <https://en.wikipedia.org/wiki/ISO_15924>`_
|
||||
codes. Make sure that the code appears in the ``scriptgroups`` section towards the end of
|
||||
the file, and add it if it doesn't.
|
||||
|
||||
The list of region codes appears at the end of ``data/langdb.yaml``.
|
||||
|
||||
The autonym is the name of the language in the language itself. In some cases, for example for
|
||||
extinct languages such as Jewish Babylonian Aramaic (tmr), the name can be something that is
|
||||
useful for modern users, but in most cases it should be the natural name in the language itself.
|
||||
Please do your best to verify that it's spelled correctly in reliable sources.
|
||||
|
||||
Example: ``myv: [Cyrl, [EU], эрзянь]``
|
||||
|
||||
This is the `Erzya language <https://en.wikipedia.org/wiki/Erzya_language>`_. Its writing system
|
||||
is Cyrillic (ISO 15924: Cyrl). It's spoken in Europe (EU). Its autonym is "эрзянь".
|
||||
|
||||
Some languages are listed as redirects. In this case, the only value in the square brackets is
|
||||
the target language code. For example: ``fil: [tl]``
|
||||
|
||||
This is the Filipino language, which is a redirect to Tagalog (tl).
|
||||
|
||||
After adding a language to ``data/langdb.yaml``, run ``php src/util/ulsdata2json.php`` in the
|
||||
base directory to generate the ``language-data.json`` file. Don't edit ``language-data.json`` manually.
|
||||
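To make the entry format concrete, here is a minimal Python sketch of how such entries could be read once the YAML has been parsed. The dictionary is a hand-written stand-in for the parsed data, the values of the `tl` entry are invented for illustration, and the helper names are hypothetical; none of this is the library's own code.

```python
# Hand-written stand-in for parsed entries; the real data lives in
# data/langdb.yaml.
ENTRIES = {
    "myv": ["Cyrl", ["EU"], "эрзянь"],  # example entry from the text
    "fil": ["tl"],                      # redirect example from the text
    "tl": ["Latn", ["AS"], "Tagalog"],  # hypothetical values, illustration only
}


def redirect_target(code):
    """Return the target code if `code` is a redirect, else None."""
    entry = ENTRIES.get(code)
    if entry is not None and len(entry) == 1:
        return entry[0]
    return None


def describe(code):
    """Return (script, regions, autonym), following one level of redirect."""
    target = redirect_target(code)
    entry = ENTRIES.get(target if target is not None else code)
    if entry is None or len(entry) != 3:
        raise KeyError(code)
    script, regions, autonym = entry
    return script, regions, autonym
```

A single-element list marks a redirect, so `describe("fil")` resolves to the `tl` entry, while `describe("myv")` reads the three fields directly.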
@@ -2,7 +2,7 @@
|
||||
"name": "@wikimedia/language-data",
|
||||
"version": "0.1.2",
|
||||
"description": "Language data and utilities",
|
||||
"homepage": "https://github.com/wikimedia/language-data",
|
||||
"homepage": "https://language-data.readthedocs.io/en/latest/index.html",
|
||||
"keywords": [
|
||||
"cldr",
|
||||
"internationalization",
|
||||
|
||||
@@ -1,7 +1,6 @@
|
||||
<?php
|
||||
/**
|
||||
* Contains a utility class to query the language data.
|
||||
*
|
||||
* @file
|
||||
* @license GPL-2.0-or-later
|
||||
*/
|
||||
@@ -9,19 +8,36 @@
|
||||
namespace Wikimedia;
|
||||
|
||||
/**
|
||||
* Utility class to query the language data.
|
||||
* A singleton utility class to query the language data.
|
||||
*/
|
||||
class LanguageData {
|
||||
/**
|
||||
* Instance of the class.
|
||||
* @var LanguageData
|
||||
*/
|
||||
private static $instance;
|
||||
|
||||
/**
|
||||
* If language does not belong to a script group, this is returned instead.
|
||||
* @var string
|
||||
*/
|
||||
public const OTHER_SCRIPT_GROUP = 'Other';
|
||||
|
||||
/**
|
||||
* Path of the language data file
|
||||
* @var string
|
||||
*/
|
||||
private const LANGUAGE_DATA_PATH = '../data/language-data.json';
|
||||
|
||||
/**
|
||||
* Cached language data object
|
||||
* @var object
|
||||
*/
|
||||
private $data;
|
||||
|
||||
/**
|
||||
* Returns an instance of the class
|
||||
* Returns an instance of the class that can be used to then call the other methods in the
|
||||
* class.
|
||||
* @return LanguageData
|
||||
*/
|
||||
public static function get(): LanguageData {
|
||||
@@ -39,7 +55,7 @@ class LanguageData {
|
||||
|
||||
/**
|
||||
* Checks if a language code is valid
|
||||
* @param string $languageCode
|
||||
* @param string $languageCode Language code
|
||||
* @return bool
|
||||
*/
|
||||
public function isKnown( string $languageCode ): bool {
|
||||
@@ -47,7 +63,7 @@ class LanguageData {
|
||||
}
|
||||
|
||||
/**
|
||||
* Is this language a redirect to another language?
|
||||
* Checks if the language is a redirect and returns the target language code
|
||||
* @param string $languageCode Language code
|
||||
* @return string|bool Target language code if it's a redirect or false if it's not
|
||||
*/
|
||||
@@ -63,7 +79,9 @@ class LanguageData {
|
||||
}
|
||||
|
||||
/**
|
||||
* Get all the languages
|
||||
* Get all the languages. The properties in the returned object are ISO 639 language codes
|
||||
* The value of each property is an array that has,
|
||||
* [writing system code, [regions list], autonym]
|
||||
* @return object
|
||||
*/
|
||||
public function getLanguages() {
|
||||
@@ -71,9 +89,9 @@ class LanguageData {
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns the script of the language or false
|
||||
* @param string $languageCode
|
||||
* @return string|bool Language script if its a known language, else false.
|
||||
* Returns the script of the language
|
||||
* @param string $languageCode Language code
|
||||
* @return string|bool Language script or false if the language is unknown
|
||||
*/
|
||||
public function getScript( string $languageCode ) {
|
||||
if ( !$this->isKnown( $languageCode ) ) {
|
||||
@@ -89,9 +107,9 @@ class LanguageData {
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns the regions in which a language is spoken.
|
||||
* @param string $languageCode
|
||||
* @return string[]|bool Array of regions or false if language is unknown.
|
||||
* Returns the regions in which a language is spoken
|
||||
* @param string $languageCode Language code
|
||||
* @return array|bool List of regions or false if language is unknown
|
||||
*/
|
||||
public function getRegions( string $languageCode ) {
|
||||
if ( !$this->isKnown( $languageCode ) ) {
|
||||
@@ -107,11 +125,11 @@ class LanguageData {
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns the autonym of the language.
|
||||
* Returns the autonym of the language
|
||||
* @param string $languageCode Language code
|
||||
* @return string|bool
|
||||
* @return string|bool Autonym of the language or false if the language is unknown
|
||||
*/
|
||||
public function getAutonym( $languageCode ) {
|
||||
public function getAutonym( string $languageCode ) {
|
||||
if ( !$this->isKnown( $languageCode ) ) {
|
||||
return false;
|
||||
}
|
||||
@@ -127,7 +145,8 @@ class LanguageData {
|
||||
|
||||
/**
|
||||
* Returns all language codes and corresponding autonyms
|
||||
* @return array
|
||||
* @return array The key is the language code, and the values are corresponding
|
||||
* autonym
|
||||
*/
|
||||
public function getAutonyms(): array {
|
||||
$languages = $this->getLanguages();
|
||||
@@ -143,9 +162,9 @@ class LanguageData {
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns all languages written in the given scripts.
|
||||
* @param string[] $scripts
|
||||
* @return string[]
|
||||
* Returns all languages written in the given scripts
|
||||
* @param array $scripts List of strings, each being the name of a script
|
||||
* @return array
|
||||
*/
|
||||
public function getLanguagesInScripts( array $scripts ): array {
|
||||
$languages = $this->getLanguages();
|
||||
@@ -165,19 +184,18 @@ class LanguageData {
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns all languages written in the given script.
|
||||
* @param string $script
|
||||
* @return string[]
|
||||
* Returns all languages written in the given script
|
||||
* @param string $script Name of the script
|
||||
* @return array
|
||||
*/
|
||||
public function getLanguagesInScript( string $script ): array {
|
||||
return $this->getLanguagesInScripts( [ $script ] );
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns the script group of a script or 'Other' if it doesn't
|
||||
* belong to any group.
|
||||
* @param string $script Script code
|
||||
* @return string script group name
|
||||
* Returns the script group of a script or "Other" if it doesn't belong to any group
|
||||
* @param string $script Name of the script
|
||||
* @return string Script group name or "Other" if the script doesn't belong to any group
|
||||
*/
|
||||
public function getGroupOfScript( string $script ): string {
|
||||
$scriptGroups = $this->data->scriptgroups;
|
||||
@@ -191,7 +209,8 @@ class LanguageData {
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns the script group of a language.
|
||||
* Returns the script group of a language. Language belongs to a script, and the script
|
||||
* belongs to a script group
|
||||
* @param string $languageCode Language code
|
||||
* @return string script group name
|
||||
*/
|
||||
@@ -200,9 +219,9 @@ class LanguageData {
|
||||
}
|
||||
|
||||
/**
|
||||
* Return the list of languages passed, grouped by script.
|
||||
* @param string[] $languageCodes Array of language codes to group
|
||||
* @return array Array of language codes grouped by script
|
||||
* Return the list of languages passed, grouped by their script group
|
||||
* @param array $languageCodes List of language codes to group
|
||||
* @return array List of language codes grouped by script group
|
||||
*/
|
||||
public function getLanguagesByScriptGroup( array $languageCodes ): array {
|
||||
$languagesByScriptGroup = [];
|
||||
@@ -230,9 +249,10 @@ class LanguageData {
|
||||
|
||||
/**
|
||||
* Returns an associative array of languages in several regions,
|
||||
* grouped by script group.
|
||||
* @param string[] $regions array of region codes
|
||||
* @return array
|
||||
* grouped by script group
|
||||
* @param array $regions List of strings representing region codes
|
||||
* @return array Returns an associative array. The key is the script group name,
|
||||
* and the value is a list of language codes in that region.
|
||||
*/
|
||||
public function getLanguagesByScriptGroupInRegions( array $regions ): array {
|
||||
$languagesByScriptGroupInRegions = [];
|
||||
@@ -262,20 +282,21 @@ class LanguageData {
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns an associative array of languages in a region, grouped by their script.
|
||||
* Returns an associative array of languages in a region, grouped by their script
|
||||
* @see LanguageData#getLanguagesByScriptGroupInRegions
|
||||
* @param string $region Region code
|
||||
* @return array
|
||||
*/
|
||||
public function getLanguagesByScriptGroupInRegion( $region ): array {
|
||||
public function getLanguagesByScriptGroupInRegion( string $region ): array {
|
||||
return $this->getLanguagesByScriptGroupInRegions( [ $region ] );
|
||||
}
|
||||
|
||||
/**
|
||||
* Return the list of languages sorted by script groups.
|
||||
* @param string[] $languageCodes Array of language codes to sort
|
||||
* @return string[] Array of language codes
|
||||
* Return the list of languages sorted by their script groups
|
||||
* @param array $languageCodes List of language codes to sort
|
||||
* @return array Sorted list of strings containing language codes
|
||||
*/
|
||||
public function sortByScriptGroup( array $languageCodes ) {
|
||||
public function sortByScriptGroup( array $languageCodes ): array {
|
||||
$groupedLanguageData = $this->getLanguagesByScriptGroup( $languageCodes );
|
||||
ksort( $groupedLanguageData, SORT_STRING | SORT_FLAG_CASE );
|
||||
|
||||
@@ -288,9 +309,9 @@ class LanguageData {
|
||||
}
|
||||
|
||||
/**
|
||||
* Sort languages by their autonym.
|
||||
* @param string[] $languageCodes
|
||||
* @return string[]
|
||||
* Sort languages by their autonym
|
||||
* @param array $languageCodes List of language codes to sort
|
||||
* @return array List of language codes sorted by their autonym
|
||||
*/
|
||||
public function sortByAutonym( array $languageCodes ): array {
|
||||
$sortedLanguages = [];
|
||||
@@ -307,9 +328,10 @@ class LanguageData {
|
||||
}
|
||||
|
||||
/**
|
||||
* Check if a language is right-to-left.
|
||||
* Check if a language is right-to-left
|
||||
* @param string $languageCode Language code
|
||||
* @return bool
|
||||
* @return bool true if it is an RTL language, else false. Returns false if an
|
||||
* unknown language code is passed.
|
||||
*/
|
||||
public function isRtl( string $languageCode ): bool {
|
||||
$script = $this->getScript( $languageCode );
|
||||
@@ -317,9 +339,10 @@ class LanguageData {
|
||||
}
|
||||
|
||||
/**
|
||||
* Return the direction of the language. Returns false if the direction is unknown.
|
||||
* Return the direction of the language
|
||||
* @param string $languageCode Language code
|
||||
* @return string|bool
|
||||
* @return string|bool Returns 'rtl' or 'ltr'. If the language code is unknown,
|
||||
* returns false.
|
||||
*/
|
||||
public function getDir( string $languageCode ) {
|
||||
if ( $this->isKnown( $languageCode ) ) {
|
||||
@@ -330,9 +353,10 @@ class LanguageData {
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns the languages spoken in a territory.
|
||||
* Returns the languages spoken in a territory
|
||||
* @param string $territory Territory code
|
||||
* @return string[]|bool list of language codes
|
||||
* @return array|bool List of language codes in the territory, or false if an invalid
|
||||
* territory is passed
|
||||
*/
|
||||
public function getLanguagesInTerritory( string $territory ) {
|
||||
if ( isset( $this->data->territories->$territory ) ) {
|
||||
@@ -345,7 +369,7 @@ class LanguageData {
|
||||
/**
|
||||
* Adds a language in run time and sets its options as provided.
|
||||
* If the target option is provided, the language is defined as a redirect.
|
||||
* Other possible options are script, regions and autonym.
|
||||
* Other possible options are `script` (string), `regions` (array) and `autonym` (string).
|
||||
* @param string $languageCode New language code.
|
||||
* @param array $options Language properties.
|
||||
*/
|
||||
@@ -361,7 +385,7 @@ class LanguageData {
|
||||
|
||||
/**
|
||||
* Return the language data based on language code. Performs no check, meant for
|
||||
* internal use only.
|
||||
* internal use only
|
||||
* @param string $languageCode
|
||||
* @return array
|
||||
*/
|
||||
|
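The doc comments in this diff state that several lookups (`getAutonym`, `getDir`, `getRegions`) return a value, or `false` when the language code is unknown. A minimal sketch of how a caller might handle that contract follows. `LanguageDataSketch`, `describeLanguage` and the two-entry sample table are hypothetical stand-ins written for illustration, not the library's class or data; only the method names and the documented return types are taken from the diff.

```php
<?php
declare( strict_types=1 );

// Hypothetical stand-in for illustration only; this is NOT the library's LanguageData class.
class LanguageDataSketch {
	/** @var array Made-up sample data: language code => [ autonym, direction ] */
	private $languages = [
		'en' => [ 'English', 'ltr' ],
		'he' => [ 'עברית', 'rtl' ],
	];

	public function isKnown( string $languageCode ): bool {
		return isset( $this->languages[$languageCode] );
	}

	/** @return string|bool Autonym of the language or false if the language is unknown */
	public function getAutonym( string $languageCode ) {
		if ( !$this->isKnown( $languageCode ) ) {
			return false;
		}
		return $this->languages[$languageCode][0];
	}

	/** @return string|bool 'rtl' or 'ltr', or false if the language code is unknown */
	public function getDir( string $languageCode ) {
		if ( !$this->isKnown( $languageCode ) ) {
			return false;
		}
		return $this->languages[$languageCode][1];
	}
}

// Hypothetical helper: compares against false with ===, so that a
// falsy-looking string is never mistaken for the "unknown language" result.
function describeLanguage( LanguageDataSketch $data, string $code ): string {
	$autonym = $data->getAutonym( $code );
	if ( $autonym === false ) {
		return "$code: unknown language";
	}
	return "$code: $autonym (" . $data->getDir( $code ) . ")";
}

$data = new LanguageDataSketch();
echo describeLanguage( $data, 'he' ), "\n";
echo describeLanguage( $data, 'xx' ), "\n";
```

Because these methods are documented as `string|bool` (or `array|bool`) and declare no return type, a strict `=== false` check is the safer way to detect an unknown code than a loose truthiness test.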
||||