Follows-up I301f471f86ba2.
For ease of navigation, move Converter subclasses to a group called
"Languages", which for documentation purposes is a subgroup of
"Language". The next commit does the same for Messages* files,
and Language subclasses (done separately for ease of review).
Change-Id: If1cef9aa15f536ebaedd4477ad7453426e7f3b85
In cases where we're operating on text data (and not binary data),
use e.g. "\u{00A0}" to refer directly to the Unicode character
'NO-BREAK SPACE' instead of "\xc2\xa0" to specify the bytes C2h A0h
(which correspond to the UTF-8 encoding of that character). This
makes it easier to look up those mysterious sequences, as not all
are as recognizable as the no-break space.
This is not enforced by PHP, but I think we should write those in
uppercase and zero-padded to at least four characters, like the
Unicode standard does.
Note that not all "\xNN" escapes can be automatically replaced:
* We can't use Unicode escapes for binary data that is not UTF-8
(e.g. in code converting from legacy encodings or testing the
handling of invalid UTF-8 byte sequences).
* '\xNN' escapes in regular expressions in single-quoted strings
are actually handled by PCRE and have to be dealt with carefully
(those regexps should probably be changed to use the /u modifier).
* "\xNN" referring to ASCII characters ("\x7F" and lower) should
probably be left as-is.
The replacements in this commit were done semi-manually by piping
the existing "\xNN" escapes through the following terrible Ruby
script I devised:
chars = eval('"' + ARGV[0] + '"').force_encoding('utf-8')
puts chars.split('').map{|char|
'\\u{' + char.ord.to_s(16).upcase.rjust(4, '0') + '}'
}.join('')
Change-Id: Idc3dee3a7fb5ebfaef395754d8859b18f1f8769a
* Fix errors spotted by new release
* Introduce "composer fix", which uses phpcbf to automatically fix some
errors spotted by phpcs.
* Drop $PHPCS_ARGS variable that didn't work on Windows, and add -s flag
* Remove rules from phpcs.xml that are now in MW-CS ruleset.
Change-Id: I13e2155695918c918b67497ac65b85a03897095e
* Interface strings are now elsewhere
* MessagesQQQ no longer exists
* Prefer https for translatewiki.net
Change-Id: I76652ea94cca80441cd5d978029e4707ee41c4fd
Format the three messages in header as one paragraph with three
sentences, instead of a paragraph and two split unordered lists with
one item each and inconsistent full stops.
Message changes: In 'wlheader-enotif' and 'wlheader-showupdated',
remove initial bullet point if present and add final full stop if
missing. First used the regexes below, then went through each language
file and manually changed the messages if applicable (e.g., Thai not
using full stops at all, Asian languages using '。', Devanagari
languages using '।' etc.)
Find: ('wlheader-[^']+'\s*=>\s*)(['"])(?:\*\s*)?([\s\S]+?)\.?\2,\n
Replace with: $1$2$3.$2,\n
Bug: 48615
Change-Id: I856f71f36d7f4b4baff5e968d88e4d3f7aeecce2