Thijs/wiki.techinc.nl

Author	SHA1	Message	Date
Bartosz Dziewoński	0313128b10	Use PHP 7 "\u{NNNN}" Unicode codepoint escapes in string literals In cases where we're operating on text data (and not binary data), use e.g. "\u{00A0}" to refer directly to the Unicode character 'NO-BREAK SPACE' instead of "\xc2\xa0" to specify the bytes C2h A0h (which correspond to the UTF-8 encoding of that character). This makes it easier to look up those mysterious sequences, as not all are as recognizable as the no-break space. This is not enforced by PHP, but I think we should write those in uppercase and zero-padded to at least four characters, like the Unicode standard does. Note that not all "\xNN" escapes can be automatically replaced: * We can't use Unicode escapes for binary data that is not UTF-8 (e.g. in code converting from legacy encodings or testing the handling of invalid UTF-8 byte sequences). * '\xNN' escapes in regular expressions in single-quoted strings are actually handled by PCRE and have to be dealt with carefully (those regexps should probably be changed to use the /u modifier). * "\xNN" referring to ASCII characters ("\x7F" and lower) should probably be left as-is. The replacements in this commit were done semi-manually by piping the existing "\xNN" escapes through the following terrible Ruby script I devised: chars = eval('"' + ARGV[0] + '"').force_encoding('utf-8') puts chars.split('').map{|char| '\\u{' + char.ord.to_s(16).upcase.rjust(4, '0') + '}' }.join('') Change-Id: Idc3dee3a7fb5ebfaef395754d8859b18f1f8769a	2018-06-04 16:20:13 +00:00
Amir Sarabadani	5a21de8abb	Remove everything related to CollationFa This workaround was needed when ICU in production was broken but after T189295 this is not needed anymore and we switched off this collation from all Persian Wikis already Bug: T139110 Change-Id: Ifad89555b6ac96a3eb36ca24b55e1f8ee57a1f05	2018-05-18 18:33:25 +02:00
jenkins-bot	1a21a63d52	Merge "Add collation for Abkhaz (ab)"	2018-01-23 18:42:29 +00:00
Thiemo Mättig	ef470ebf7f	Remove @param comments that literally repeat what the code says These comments do not add anything. I argue they are worse than having no comments, because I have to read them first to understand they actually don't explain anything. Removing them makes room for actual improvements in the future (if needed). Change-Id: Iee70aad681b3385e9af282d5581c10addbb91ac4	2018-01-10 14:14:26 +01:00
Kunal Mehta	a9960b53be	tests: Use checkPHPExtension() instead of re-implementing it Change-Id: I7f5e8684d556befc0aefa302187c573e7a3cff62	2017-12-25 22:06:37 +00:00
Kunal Mehta	222afabc80	Add @covers tags for Collation tests Change-Id: I8b0623a6b716acdc9d369349fd4e306dbdc91d18	2017-12-25 21:59:01 +00:00
Bartosz Dziewoński	e94587dfbb	Add collation for Abkhaz (ab) * Adding new class AbkhazUppercaseCollation, mapped to 'uppercase-ab'. * Extended CustomUppercaseCollation with support for sorting digraphs and for alphabets larger than 64 letters (up to 4096). Bug: T183430 Change-Id: I16d44568e44d7ef5b39c38b1a6257b9fe10a34d4	2017-12-25 14:37:14 +00:00
Huji Lee	f05821e87c	Do not run CollationFaTest if 'intl' is not loaded Bug: T176040 Change-Id: I6b19bf1123d4dca5a1c8e002c0de65bab2138180	2017-09-16 16:30:18 -04:00
Brian Wolff	d16c26fd2c	Unit tests for CollationFa (`0bfcbd724`) Change-Id: I8286244cc1a61f34a3599c4f2e6201ba91c5e79a	2017-05-30 19:05:16 +00:00
Brian Wolff	73f5937047	Add collation for Bashkir (ba) This is based on a numeric uppercase collation. Bashkir characters will be remapped to the private use area for the purpose of sorting. Bug: T162823 Change-Id: I65f1af0b57ff6ded7d464e39efd401f178a3519e	2017-05-10 04:17:46 +00:00

10 commits
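
The conversion described in commit 0313128b10 (UTF-8 byte escapes such as "\xc2\xa0" rewritten as codepoint escapes such as "\u{00A0}") can be sketched without `eval`. This is a minimal illustration, not the script used for the commit: the helper name `hex_escapes_to_unicode` is hypothetical, and it assumes the input consists only of `\xNN` escapes that together form valid UTF-8. It follows the commit's suggested style of uppercase hex, zero-padded to at least four digits.

```ruby
# Hypothetical helper (not part of the commit): turns a string of "\xNN"
# escapes into "\u{NNNN}" escapes, assuming the bytes are valid UTF-8.
def hex_escapes_to_unicode(escaped)
  # Collect the byte value of each \xNN escape; other text is ignored.
  bytes = escaped.scan(/\\x([0-9A-Fa-f]{2})/).map { |(hex)| hex.to_i(16) }

  # Reinterpret the raw bytes as UTF-8 and refuse anything that is not.
  text = bytes.pack('C*').force_encoding('UTF-8')
  raise ArgumentError, 'bytes are not valid UTF-8' unless text.valid_encoding?

  # One escape per codepoint: uppercase hex, at least four digits.
  text.each_char.map { |ch| format('\\u{%04X}', ch.ord) }.join
end

# Example: the no-break space from the commit message.
puts hex_escapes_to_unicode('\xc2\xa0')
```

Rejecting invalid UTF-8 mirrors the commit's caveat that binary data and deliberately malformed byte sequences cannot be expressed as Unicode escapes. The other caveats (PCRE-handled escapes in single-quoted regexps, ASCII-range escapes) still need manual review and are outside this sketch.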