258 Commits

Author SHA1 Message Date
Niklas Laxström
e87dd20cdd Improve ULS language search api
* Store prefixes and infixes separately in the data
* First match language code, then prefixes, then infixes
* Try to use suggestion either in user language or autonym first
* use formatversion=2 to avoid escaping Unicode

Using Language::fetchLanguageName might have a small
performance impact. On the other hand, there is now a check
to skip languages we already found, avoiding some fuzzy
matching.

This is in preparation for a change in jquery.uls to use
the search API more, while trying to reduce the amount of
weird autocompletion suggestions we show to the user.

Bug: T73891
Change-Id: Id94c5352d9a591969bf90144d1d2d5e758d08301
2017-11-27 14:57:42 +01:00
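
The matching order described in commit e87dd20cdd (language code first, then prefixes, then infixes, skipping languages already found) can be sketched as follows. This is a minimal illustration, not the extension's actual code: `search_languages` and the `names` mapping are hypothetical, prefix and infix matches are computed at query time rather than stored separately in the data, and fuzzy matching is left out.

```python
def search_languages(query, names):
    """Return language codes matching `query`, best matches first.

    `names` is a hypothetical mapping of language code -> list of
    searchable names; it stands in for the real language name index.
    """
    query = query.lower()
    results = []  # ordered list of codes, no duplicates

    # 1. An exact language code match ranks first.
    if query in names:
        results.append(query)

    # 2. Then names that start with the query (prefix matches).
    for code, labels in names.items():
        if code in results:
            continue  # skip languages we already found
        if any(label.lower().startswith(query) for label in labels):
            results.append(code)

    # 3. Finally names that merely contain the query (infix matches).
    for code, labels in names.items():
        if code in results:
            continue  # skip languages we already found
        if any(query in label.lower() for label in labels):
            results.append(code)

    return results


names = {
    "de": ["German", "Deutsch"],
    "ang": ["Anglo-Saxon"],
    "an": ["Aragonese"],
}
print(search_languages("an", names))  # ['an', 'ang', 'de']
```

Skipping codes that are already in `results` is what keeps a language from being examined again in a later, more expensive stage.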
Niklas Laxström
a353c5ab65 Perform search on every word of language name
See e.g. T132021. This favours coverage over quality.

Change-Id: I3fc8fb1702802bc002c3d7e2941563840914f325
2017-11-23 09:14:10 +00:00
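
Searching on every word of a language name, as commit a353c5ab65 describes, amounts to indexing each word in addition to the full name. The sketch below assumes a hypothetical `build_word_index` helper and a simple code-to-name mapping; it only illustrates why coverage goes up while some matches become less specific.

```python
from collections import defaultdict


def build_word_index(names):
    """Index every language under its full name and under each word of it.

    `names` is a hypothetical mapping of language code -> display name.
    """
    index = defaultdict(set)
    for code, label in names.items():
        lowered = label.lower()
        index[lowered].add(code)       # the full name
        for word in lowered.split():   # every individual word
            index[word].add(code)
    return index


index = build_word_index({
    "pt": "Portuguese",
    "pt-br": "Brazilian Portuguese",
})
print(sorted(index["portuguese"]))  # ['pt', 'pt-br']
```

A query for "portuguese" now also reaches "Brazilian Portuguese", which a lookup on the start of the full name alone would miss.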
Santhosh Thottingal
dc84413373 Remove Madan font for ne
* Unknown upstream
* Not updated for years
* ne has better support in operating systems
* Non-default font for ne

Bug: T180422

Change-Id: Ife0b81e4db3bc069752d89c53f4690ddcfad7ef3
2017-11-14 15:49:35 +01:00
Santhosh Thottingal
d5f0666025 Remove non-default Gubby font for Kannada and Tulu
Bug: T180422
Change-Id: I78af0a3889e48625ebb38b1b212cb8b454a5639c
2017-11-14 15:23:09 +01:00
Santhosh Thottingal
146426ffe7 Remove fonts for Odia(or) from fontrepo
Remove the non-default Utkal font. Its upstream is now unknown.

Change-Id: Iefa9eeaf953d87d4a5c8766fa575d61f9bd96d2b
2017-11-13 15:31:27 +05:30
Santhosh Thottingal
c0bbd9efc1 Remove Tamil fonts from fontrepo
* These fonts have no upstream now and are little known among
  Tamil users.
* Tamil is very well supported in all operating systems
  nowadays.
* Also reduce the metadata size for fontrepo

Change-Id: I4e7afb6476a4714f8d87bd2a048309b732883b2f
2017-11-13 15:26:59 +05:30
Santhosh Thottingal
28c0ba6bca Remove fonts for Malayalam from font repo
* I maintain these fonts upstream, and there were several releases
  since these fonts were added.
* Malayalam has better support in operating systems compared to 2012
  when these fonts were added.
* Reduce font metadata size for wikipedia pages when webfonts are
  enabled.

Change-Id: Ie5b54cc866b1c67849b094a9701b2c80d876b55f
2017-11-13 09:16:58 +00:00
Santhosh Thottingal
6bddc79773 Remove Lohit family of fonts from fontrepo
* The languages covered by these fonts are now available in all
operating systems.
* These fonts have not been updated for years in our repo
* Reduces the amount of font repo data we deliver for *every wikipedia
  page* when webfonts is enabled

Change-Id: Ia0f1b6acc4cf8b7a354671bea47b58425ab8c08e
2017-11-13 09:16:22 +00:00
Santhosh Thottingal
4f3461a9aa Remove autonym font and its usage
I no longer maintain the Autonym font.
Also remove the tofu detection.

Bug: T135464
Bug: T135465
Change-Id: I103aab40ea5f5fc403a7ee5b23d1b634cc9c6ee1
2017-11-13 08:03:12 +00:00
Niklas Laxström
56d3f2af43 Make output of LanguageNameIndexer more consistent
Change-Id: I13f06b9b1c65068206f1728f8a427c4ca46f28ec
2017-10-31 16:25:01 +01:00
Amire80
101532cfa6 Add special language names to facilitate searching
This adds several custom languages.

The addition of Punjabi addresses Bug T178070.

The addition of Chinese addresses Bug T73891.

Georgian and Catalan (Valencian) variant spellings
are added because these are the most frequent languages
that are not found in the ULS search box.

Bug: T73891
Bug: T178070
Change-Id: Ifbb08b560e454643d246379c19f725bde61917e9
2017-10-25 13:50:12 +05:30
Santhosh Thottingal
18c09bc6d3 Update language name data index with CLDR 31
Change-Id: I7c7b26a01b5c5780cbf7a19983388e16b4e97cc1
2017-10-24 17:52:29 +05:30
Umherirrender
7761a9e60b Improve some parameter docs
Change-Id: Icd8fd55cf1a4a83a6f674038e098b9be8257dc0c
2017-10-07 16:54:28 +02:00
Umherirrender
1a4ac5a6d6 build: Updating mediawiki/mediawiki-codesniffer to 0.10.1
Change-Id: Ib7a361cf2973bf0bba0fb8944762216f44c226a8
2017-07-26 23:22:26 +02:00
Kartik Mistry
eb8eed98e9 Add Sundanese font
Bug: T162221
Change-Id: Iabf1a22838bd4375be9c8ed3aabad9205523ef8e
2017-04-19 08:42:19 +00:00
Niklas Laxström
55b68c329d LanguageNameSearch: do not mix different scripts in same buckets
To keep the average and maximum bucket size low, I made code points
< 4000 more granular and code points >= 4000 less granular. This
could surely be tweaked further to reach more evenly sized buckets.

Bucket stats before:
 - 773 buckets
 - smallest has 1 entries
 - largest has 1804 entries
 - median size is 66 entries
 - average size is 45.394566623545 entries

Bucket stats after:
 - 698 buckets
 - smallest has 1 entries
 - largest has 1792 entries
 - median size is 16 entries
 - average size is 50.272206303725 entries

Change-Id: Id62d93658117564b05294c2fe36ca7c182784859
2016-08-08 16:21:52 +02:00
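
The bucketing idea in commit 55b68c329d can be sketched as below. Only the threshold of 4000 comes from the commit message, and it is read here as a decimal number, which is an assumption. The function name `bucket_for` and the bucket widths `fine` and `coarse` are hypothetical. The point is that a bucket covers a contiguous code point range, so names in unrelated scripts do not share a bucket, and ranges are narrow below the threshold and wide above it.

```python
def bucket_for(name, threshold=4000, fine=16, coarse=1000):
    """Return a bucket number for a non-empty name, based on its first character.

    Code points below `threshold` get narrow buckets (`fine` code points
    each); code points at or above it get wide buckets (`coarse` code
    points each). The widths are illustrative assumptions.
    """
    codepoint = ord(name[0])
    if codepoint < threshold:
        return codepoint // fine
    # Offset so that coarse bucket numbers never collide with fine ones.
    fine_bucket_count = (threshold + fine - 1) // fine
    return fine_bucket_count + (codepoint - threshold) // coarse


print(bucket_for("arabic") == bucket_for("bengali"))  # True: 'a' and 'b' are neighbours
print(bucket_for("arabic") == bucket_for("\u044f\u0437\u044b\u043a"))  # False: Latin vs Cyrillic
```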
Niklas Laxström
f73f9a8b5d LanguageNameIndexer: print bucket stats
Change-Id: If50b65b1bbda010f0dbde7d344edcb5bdcd382df
2016-08-08 13:38:53 +00:00
Niklas Laxström
bc7ee1ed19 LanguageNameIndexer: sort buckets
Change-Id: Ib33bc432d5f61de2fbb6e83f3566baebb184c441
2016-08-08 13:18:30 +00:00
Niklas Laxström
42f4f9650b LanguageNameIndexer: Remove directionality chars that cannot be typed
Change-Id: I8e5b9f300a3307a90054e4e759279f91594a2fa3
2016-08-08 10:56:39 +00:00
Niklas Laxström
b3ba423354 LanguageNameIndexer: Generate PHP file instead of serialized file.
Serialized format is no longer in style for data. PHP files can
take advantage of AutoLoader and caching so they can even be faster
than serialized files. As side bonus we can have readable diffs
for updates.

Only downside is that the file generation takes about ten lines of
ugly string manipulation.

Change-Id: If09704d1172daa13c72a308814534cac1fe9899f
2016-08-08 07:55:42 +00:00
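
The kind of string manipulation that commit b3ba423354 refers to can be illustrated with a small generator that writes a nested mapping out as a PHP file returning an array. This is a sketch in Python with hypothetical helper names (`php_quote`, `render_php_array`, `render_php_file`); it is not the LanguageNameIndexer code, and it only handles string keys with string or nested-mapping values.

```python
def php_quote(value):
    """Render a string as a single-quoted PHP string literal."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def render_php_array(data, indent=1):
    """Render a (possibly nested) dict as a PHP short-array literal."""
    pad = "\t" * indent
    lines = ["["]
    for key, value in data.items():
        if isinstance(value, dict):
            rendered = render_php_array(value, indent + 1)
        else:
            rendered = php_quote(str(value))
        lines.append(f"{pad}{php_quote(str(key))} => {rendered},")
    lines.append("\t" * (indent - 1) + "]")
    return "\n".join(lines)


def render_php_file(data):
    """Return the source of a PHP file that returns `data` as an array."""
    return "<?php\nreturn " + render_php_array(data) + ";\n"


print(render_php_file({"s": {"suomi": "fi"}}))
```

Because the output has one entry per line, an update to the data shows up as a readable line-by-line diff, which is the side bonus the commit message mentions.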
jenkins-bot
8bb0c2f683 Merge "LanguageNameIndexer/Search: use unicode aware lowercasing"
2016-08-08 07:06:57 +00:00
Niklas Laxström
9daeacf1c5 LanguageNameIndexer/Search: use unicode aware lowercasing
With this MEÄNKELI with typos=1 finds results.

Updated test case for lowercased result. Renamed variables in test
file for clarity. Updated the default value for MW_INSTALL_PATH to
work with the default layout.

Change-Id: Id93c84d308705f55b4d2378fc8c7b7f243e1b53f
2016-08-08 08:43:15 +02:00
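
Why Unicode-aware lowercasing matters for a query such as MEÄNKELI (commit 9daeacf1c5) can be shown with a short comparison. The sketch is in Python rather than the extension's PHP, and the helper names are hypothetical. `bytes.lower()` converts only ASCII letters, so it stands in for a byte-oriented lowercasing routine, while `str.lower()` is Unicode-aware.

```python
def ascii_only_lower(text):
    """Lowercase only ASCII letters, leaving non-ASCII characters untouched."""
    return text.encode("utf-8").lower().decode("utf-8")


def unicode_lower(text):
    """Lowercase using Unicode case mappings."""
    return text.lower()


query = "ME\u00c4NKELI"  # MEÄNKELI with a precomposed Ä
indexed_name = "me\u00e4nkieli"  # an index entry stored in lowercase

print(ascii_only_lower(query))  # meÄnkeli (the Ä stays uppercase)
print(unicode_lower(query))     # meänkeli
```

With ASCII-only lowercasing the query keeps an uppercase Ä, which costs an extra edit against the lowercase index entry on top of the missing letter, so a search limited to one typo cannot find it.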
jenkins-bot
40f695b6b5 Merge "LanguageNameIndexer: Simplify code"
2016-08-08 05:08:43 +00:00
jenkins-bot
c9a32d17c1 Merge "Update Skeirs II font"
2016-07-28 13:47:22 +00:00
Kartik Mistry
d4ea8550fa Update Skeirs II font
This fixes display of Latin text in Gothic script.

Bug: T124785
Change-Id: Iaf2cc6b05591368356c241e7b65ce4a8e33c24e0
2016-07-28 15:32:27 +02:00
Kartik Mistry
1fe7d5bcf4 Remove reference to removed eot fonts
Change-Id: I0be9dae3433afe9868672d7fc45f48e15aba9e98
2016-07-27 20:51:06 +05:30
Niklas Laxström
920155fb18 LanguageNameIndexer: Simplify code
Inline one loop and remove autonym handling since getting the
translation of a language name in its own language gives the
autonym.

Change-Id: I0c8ff8e3ce0c7f23d123656b091df37aa71b2cd7
2016-06-15 11:22:02 +02:00
Niklas Laxström
9ac675811b LanguageNameIndexer: Add new language names
Change-Id: I5ffbad77a30b10b0d677cdb8b109f61d0c10ee05
2016-06-15 11:21:41 +02:00
Niklas Laxström
c37be51fa0 LanguageNameIndexer: Rename variables to make this code more understandable
Change-Id: Ib328bf49b46b222b04fcfe60359baff2fbbc3a7f
2016-06-15 11:21:34 +02:00
Niklas Laxström
b2b75b15eb Drop eot from supported webfont formats
IE9+, which is the lowest version that MediaWiki still supports, is
supposed to support both ttf and woff, bar some exceptions.

This reduces the uncompressed repo size by almost 4000 bytes.

Change-Id: If80f4ec898d86d5fd4cf873d0d86245e66da2f0b
2016-05-24 08:06:11 +02:00
jenkins-bot
d6af3112ff Merge "First attempt at font test page generation"
2016-05-17 11:34:00 +00:00
Santhosh Thottingal
a2cfd0287f Add WOFF2 version of fonts
Modern browsers will use woff2, which has smaller size than woff

Updated the README in the font repository explaining how different
file formats are produced.

Bug: T128291
Change-Id: I81c5380fdbf0ff76142b67cf8fce9db20e8164fa
2016-05-11 10:31:01 +05:30
Niklas Laxström
71133ffdfc First attempt at font test page generation
Does not include all variants and not all languages have example content.

Change-Id: If5b759f2ed6e8e487f73ea7a88be5cc6b741b356
2016-05-10 07:33:47 +02:00
Kunal Mehta
6b8c33e763 build: Updating mediawiki/mediawiki-codesniffer to 0.7.1
Also added "composer fix" command.

Change-Id: I6f3f29f03abb607fbca9cec6f140875f2a3468a0
2016-05-09 18:30:34 -07:00
Niklas Laxström
651f8bc1c3 Refactor font repo compiler so it can be reused
Includes changes to the generated repository file
because the script had not been run for last update.

Change-Id: I6b5d1ce980c6e5b42e36c0044729536b6b0ae4dc
2016-04-14 10:12:20 +02:00
Siebrand Mazeland
f8487a54eb Use single quotes where possible
Change-Id: I8c0098e4840d7eff16cf5818f2247b134946d77b
2016-03-07 07:13:10 +00:00
Siebrand Mazeland
f03d5659fd Move assignment of $dir outside of loop
Change-Id: I24009db72d0afbb47532eb2c00329e488461066f
2016-03-05 16:25:23 +01:00
Siebrand Mazeland
53fd91f3c0 Update newlines
Change-Id: I147338b76d3c9b1a34de51978dbfdebd17026bc8
2016-03-05 16:23:39 +01:00
Siebrand Mazeland
49b4cc0028 Declare functions with access modifiers
Change-Id: I047d3dc6642de07130a43ad4c2fd4a8106450aac
2016-03-05 16:10:52 +01:00
jenkins-bot
6171eaba10 Merge "Update Amiri fonts"
2016-02-13 20:29:03 +00:00
Reedy
7c5336df8a Update OpenDyslexic from 2.1.0+git060dc841 to 2.1.0+git03aa683
Change-Id: Ide39c565d03fac70805ada3399c6010e17540fe4
2015-12-05 20:24:34 +05:30
Reedy
d533dc0c13 Fix syntax error in Akkadian font.ini.
~ in a string must be quoted in an ini file

Change-Id: I92548cf5bc35259f6bcb37456967f480eac76bc8
2015-12-05 01:28:51 +00:00
Niklas Laxström
f3a05e271b Remove not useful comments
Change-Id: I99c30a6c6d6b59e86e5e3efe31f38439c5e95095
2015-08-28 18:24:49 +02:00
Kartik Mistry
5d9a884243 Update Amiri fonts
Version: 0.107
URL: www.amirifont.org
Bug: T43940
Change-Id: I8421c85d590ad271ca40302b17325cafd5e5caeb
2015-07-22 20:48:58 +00:00
Kartik Mistry
c3f3f496f6 Add Gothic font
Bug: T52901
Change-Id: I85d35d813b0d0515ff86d05cad93fb42558ac195
2015-07-10 23:44:07 +05:30
Thiemo Mättig
bdece37fe3 Simplify LanguageNameSearch code
A more efficient "startsWith" lookup.
And simplifying redundant modulo code.

Change-Id: I87ab0ff70d26fc058fc40bd20745b9b6effb7ddd
2014-12-08 19:00:32 +01:00
Niklas Laxström
314e1c8c28 Update bugzilla references to phabricator and remove some excess links
Change-Id: I2cb920fd084a1ab333678e1e3c8f4524b39cc6cd
2014-12-06 22:09:23 +00:00
Niklas Laxström
be3f8f1435 Remove ComicNeue for languages which it does not support
List of characters coming from a different font (not exhaustive):
af: ê
bk: unknown language
ca: óàéí
da: æåø
de: üäö
es: áóñ
et: üä
fi: äö
fo: ø
fr: èé
fy: úêâûô
ga: íóáúé
gd: òèà
gl: áó
hu: öáóűé
is: Þýðí
it: à
lb: ëäé
li: äöèó
mi: ā
nb: åø
oc: çè
pl: ęą
pt: úêáçã
sq: ë
sv: äöå
tr: ğıçİşöú
wa: é

Change-Id: I115724d644bf2efddfec2558dd386d24f6238f24
2014-11-29 10:14:10 +01:00
Santhosh
3ff6bfa5e6 Revert "Update Malayalam fonts"
This reverts commit d546f6e71f.
Noted line height variations between WOFF and WOFF2.
Reverting till we investigate this.
Similar to https://gerrit.wikimedia.org/r/#/c/174081/

Change-Id: I31e1a2dfe643fcd979953acc5cb6f93e872a2a87
2014-11-18 10:46:26 +00:00
Santhosh
9b396918b7 Revert "Add WOFF2 version of all fonts"
This reverts commit 90519fa8db.

Noted line height variations between WOFF and WOFF2. Reverting till we investigate this

Change-Id: I706c6b552f9045a4f36dd947d6339840b6d2665c
2014-11-18 09:43:33 +00:00