29 Commits

Author SHA1 Message Date
Thiemo Kreuz
f2e2e640c6 Avoid counting where not necessary
Change-Id: Iaae939780df26066de40e1584492865cb0ac80a7
2018-08-10 14:42:41 +00:00
libraryupgrader
38b449ceef build: Updating mediawiki/mediawiki-codesniffer to 20.0.0
Change-Id: I02db015a34f67a8b268feead090d2f6be5658935
2018-05-26 07:31:25 +00:00
Kunal Mehta
10a5865f9a Fix MediaWiki.Commenting.LicenseComment.InvalidLicenseTag errors
Change-Id: I2c868006d108b35adaa19d179bc6ebe95e29d0ef
2018-05-23 23:02:54 -07:00
libraryupgrader
f97802e4e2 build: Updating mediawiki/mediawiki-codesniffer to 18.0.0
The following sniffs are failing and were disabled:
* MediaWiki.VariableAnalysis.UnusedGlobalVariables.UnusedGlobal$wgWBClientSettings

Change-Id: Ia5423c3d7ea419b3f073f35736de7a9379d4429a
2018-04-14 07:39:31 +00:00
libraryupgrader
858ebd5552 build: Updating mediawiki/mediawiki-codesniffer to 17.0.0
The following sniffs are failing and were disabled:
* MediaWiki.Commenting.LicenseComment.InvalidLicenseTag

The following sniffs now pass and were enabled:
* MediaWiki.Commenting.FunctionComment.MissingParamComment

Change-Id: I06e0542d737cec5e2500aad6d85f72951f8b584d
2018-03-29 06:53:52 +00:00
Niklas Laxström
1e15341fd1 Use dash as separator for non-prefix matches in language name search
Bug: T186480
Change-Id: Ib785e2b070e0c5a218b236be194417f0b1fbd102
2018-02-06 17:26:21 +01:00
jenkins-bot
603cfea7d0 Merge "Improve ULS language search api"
2017-12-01 04:47:30 +00:00
Santhosh Thottingal
3bf7361262 LanguageNameSearch: Optimize levenshteinDistance
1. Do string comparison for equality early in the method so that we can
   do early return if it passes.
2. Move the zero length check for string up for early return. This may
   not have any significant change in performance though.

Change-Id: I86bdd612a4a31c5ebfac6bcd7687b829acc69cda
2017-11-30 16:38:41 +05:30
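
The two early returns described in 3bf7361262 can be illustrated with a minimal sketch. This is not the extension's PHP code; the function name and the row-by-row dynamic programming below are assumptions chosen for illustration only.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Edit distance between a and b, with cheap checks done first."""
    # Early return 1: identical strings need no edits at all.
    if a == b:
        return 0
    # Early return 2: if one string is empty, the distance is the
    # length of the other (all insertions or all deletions).
    if not a:
        return len(b)
    if not b:
        return len(a)

    # Standard dynamic programming, keeping only the previous row.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + substitution_cost,   # substitution
            ))
        previous = current
    return previous[-1]
```

Both checks only skip work the full computation would otherwise do, so the result is unchanged; as the commit message notes, the zero-length check may not yield a measurable gain.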
Niklas Laxström
e87dd20cdd Improve ULS language search api
* Store prefixes and infixes separately in the data
* First match language code, then prefixes, then infixes
* Try to use suggestion either in user language or autonym first
* use formatversion=2 to avoid escaping Unicode

Using Language::fetchLanguageName might have a small
performance impact. On the other hand, there is now a check
to skip languages we already found, avoiding some fuzzy
matching.

This is in preparation for a change in jquery.uls to use
the search API more, while trying to reduce the amount of
weird autocompletion suggestions we show to the user.

Bug: T73891
Change-Id: Id94c5352d9a591969bf90144d1d2d5e758d08301
2017-11-27 14:57:42 +01:00
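
The match order described in e87dd20cdd (language code first, then prefixes, then infixes, skipping languages already found) can be sketched as follows. The data layout here is hypothetical: `codes`, `prefixes`, and `infixes` are illustrative stand-ins, not the extension's actual index format, and matching is reduced to a plain `startswith` check.

```python
def search_languages(query, codes, prefixes, infixes):
    """Return matching language codes, best matches first.

    codes:    set of known language codes (hypothetical layout)
    prefixes: list of (search_key, code) pairs for name beginnings
    infixes:  list of (search_key, code) pairs for later words in a name
    """
    query = query.lower()
    found = []

    # 1. An exact language code match ranks first.
    if query in codes:
        found.append(query)

    # 2. Then prefixes, 3. then infixes. Languages that were already
    #    found are skipped, so no further matching is spent on them.
    for entries in (prefixes, infixes):
        for key, code in entries:
            if code not in found and key.startswith(query):
                found.append(code)

    return found
```

With `codes = {"fi", "fr"}`, `prefixes = [("finnish", "fi"), ("filipino", "fil")]` and `infixes = [("finnish", "fit")]`, the query `"fi"` yields `["fi", "fil", "fit"]`: the exact code, then the one prefix match not already found, then the infix match.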
Niklas Laxström
a353c5ab65 Perform search on every word of language name
See e.g. T132021. This favours coverage over quality.

Change-Id: I3fc8fb1702802bc002c3d7e2941563840914f325
2017-11-23 09:14:10 +00:00
Umherirrender
7761a9e60b Improve some parameter docs
Change-Id: Icd8fd55cf1a4a83a6f674038e098b9be8257dc0c
2017-10-07 16:54:28 +02:00
Umherirrender
1a4ac5a6d6 build: Updating mediawiki/mediawiki-codesniffer to 0.10.1
Change-Id: Ib7a361cf2973bf0bba0fb8944762216f44c226a8
2017-07-26 23:22:26 +02:00
Niklas Laxström
55b68c329d LanguageNameSearch: do not mix different scripts in same buckets
To keep the average and maximum bucket size low, I made code points
< 4000 more granular and code points >= 4000 less granular. This
could certainly be tweaked further to reach more evenly sized buckets.

Bucket stats before:
 - 773 buckets
 - smallest has 1 entry
 - largest has 1804 entries
 - median size is 66 entries
 - average size is 45.394566623545 entries

Bucket stats after:
 - 698 buckets
 - smallest has 1 entry
 - largest has 1792 entries
 - median size is 16 entries
 - average size is 50.272206303725 entries

Change-Id: Id62d93658117564b05294c2fe36ca7c182784859
2016-08-08 16:21:52 +02:00
Niklas Laxström
b3ba423354 LanguageNameIndexer: Generate PHP file instead of serialized file.
Serialized format is no longer in style for data. PHP files can
take advantage of AutoLoader and caching, so they can even be faster
than serialized files. As a side bonus, we get readable diffs
for updates.

The only downside is that the file generation takes about ten lines of
ugly string manipulation.

Change-Id: If09704d1172daa13c72a308814534cac1fe9899f
2016-08-08 07:55:42 +00:00
Niklas Laxström
9daeacf1c5 LanguageNameIndexer/Search: use unicode aware lowercasing
With this MEÄNKELI with typos=1 finds results.

Updated test case for lowercased result. Renamed variables in test
file for clarity. Updated the default value for MW_INSTALL_PATH to
work with the default layout.

Change-Id: Id93c84d308705f55b4d2378fc8c7b7f243e1b53f
2016-08-08 08:43:15 +02:00
Kunal Mehta
6b8c33e763 build: Updating mediawiki/mediawiki-codesniffer to 0.7.1
Also added "composer fix" command.

Change-Id: I6f3f29f03abb607fbca9cec6f140875f2a3468a0
2016-05-09 18:30:34 -07:00
Siebrand Mazeland
49b4cc0028 Declare functions with access modifiers
Change-Id: I047d3dc6642de07130a43ad4c2fd4a8106450aac
2016-03-05 16:10:52 +01:00
Thiemo Mättig
bdece37fe3 Simplify LanguageNameSearch code
A more efficient "startsWith" lookup.
And simplifying redundant modulo code.

Change-Id: I87ab0ff70d26fc058fc40bd20745b9b6effb7ddd
2014-12-08 19:00:32 +01:00
Siebrand Mazeland
b19b89374d Refactor getCodepoint() to more consistently handle return values
Change-Id: Ida90e6c78be41e8527eaefd14feb45c57413945e
2013-08-05 09:29:41 +02:00
jenkins-bot
3e54cd9de8 Merge "Refactor complex ternary operation"
2013-08-05 07:22:36 +00:00
Siebrand Mazeland
622e388a6a Refactor complex ternary operation
Change-Id: I1b6cc1cf0348bc7e19f9f327c7a3d6d936cfaaf2
2013-08-05 09:06:19 +02:00
Siebrand Mazeland
102f257427 Fix CodeSniffer errors and warnings
More fixes will be submitted upstream.

Change-Id: Ib22997f8756537b063fd6eed3f1f74f3eda315d7
2013-08-05 05:55:03 +02:00
Niklas Laxström
da255cdc77 Fix Undefined offset notice
Also removed some dead code that never ran, there is no variable named
"$buckets" so it'll never have an offset.

Bug: 45327
Change-Id: I1f70ef0ec4f2434f9f072e718140ff8050b81ba3
2013-04-26 14:29:08 +00:00
Siebrand Mazeland
e1a4f7f0cb After training the PHPStorm code formatter.
See https://github.com/siebrand/MediaWiki-PHPStorm

Issue remains with anonymous functions in JavaScript.

Change-Id: I2b520f8df127452acf02deb659277a6465e6ca59
2012-09-17 17:10:59 -07:00
Reedy
0f0732f865 Fixup a few minor documentation issues
Added some newlines

Left a FIXME in LanguageNameIndexer.php

Losslessly compressed display.png

Change-Id: I884b423d3812ddb964a6a70f75a6331a73371165
2012-08-19 01:19:46 +01:00
Siebrand Mazeland
72d2519c4d Fix some issues pointed out by IDE.
* Update .gitignore to ignore .idea.
* Removed unused local variables.
* Use local context and Message class instead of deprecated wfMsg* methods.
* Remove redundant px in CSS where possible.
* Combine CSS statements where possible.
* Replace b by strong.

Change-Id: I9d5ed7b7ce585a1c101044254bcbdfc33d42afc1
2012-08-15 17:32:49 +02:00
Santhosh Thottingal
76f9038aff Allow typo in search key
* Introduce Levenshtein algorithm
* New API param 'typos' to give number of typos allowed
* test cases

Change-Id: I22bf34d08a910d1509d7eab5adc292eadc9a7c7d
2012-08-03 07:33:12 +00:00
Santhosh Thottingal
3d9807e7f2 Fix php warnings.
Change-Id: Icd1302f7db425157def4771ffe0d7c816164eb23
2012-07-31 16:03:52 +05:30
Santhosh Thottingal
08c14dafa4 Cross-language language name search
Implementation of Also Written As language name
search algorithm.
See http://etherpad.wikimedia.org/l10n-uls-language-search

Change-Id: Iff84408c531b650a44d031b63d5c823737cceafc
2012-07-30 14:08:26 +05:30