fed an invalid sequence in UtfNormal::cleanUp
normalizer_normalize seems to return false if fed an invalid unicode sequence (Which is quite different
from what our built in normalization functions do). So use quickIsNFC if it returns false.
(Noticed when investigating bug 28541).
* Work around HipHop issue 314 (volatile broken) and issue 308 (no compilation detection) by adding some large and ugly compilation detection code to WebStart.php and doMaintenance.php.
* Provide an MW_COMPILED constant which can be used to detect compiled mode throughout the codebase.
* Introduced wfIsHipHop(), which detects either compiled or interpreted mode. Used this to work around unusual eval() return value in eval.php.
* Work around lack of ini_get() in Maintenance.php, by duplicating wfIsHipHop().
* In Maintenance::shouldExecute(), accept "include" as an inclusion function name, since all kinds of inclusion give this string in HipHop.
* Introduced new class MWInit, which provides some static functions in the pre-autoloader environment.
* Introduced MWInit::compiledPath(), which provides a relative path for invoking a compiled file, and MWInit::interpretedPath(), which provides an absolute path for interpreting a PHP file. Used these new functions in the appropriate places.
* When we are running compiled code, don't include files which would generate duplicate class, function or constant definitions. Documented the new requirements on the contents of Defines.php and UtfNormalDefines.php.
* In HipHop compiled mode, it's not possible to have executable code in the same file as a class definition.
** Moved MimeMagic initialisation to the constructor.
** Moved Namespace.php global variable initialisation to Setup.php.
** Moved MemcachedSessions.php initialisation to the caller in GlobalFunctions.php.
** Moved Sanitizer.php constants and global variables to static class members. Introduced an accessor function for the attribs regex, as a new place to put code formerly at file level.
** Moved Language.php initialisation of $wgLanguageNames to Language::getLanguageNames(). Removed the global variable, marked "private" since forever.
* In two places: don't use error_log() with type=3 to append to a file, HipHop doesn't support it. Use file_put_contents() with FILE_APPEND instead.
* Work around the terrible breakage of class_exists() by using MWInit::classExists() instead in various places. In WebInstaller::getPageByName(), the class_exists() was marked with a fixme comment already, so I replaced it with an autoloader solution.
This pulls the source texts used for the UtfNormalBench.php benchmarks, repeats them into much larger strings, and attempts to reproduce an out-of-memory error in the middle of UtfNormal::cleanUp():
$ php
Testing testdata/washington.txt (English text)...
quickIsNFCVerify 1078.3ms 14,969,652 bytes/s (changed)
cleanUp 992.1ms 16,270,594 bytes/s (no change)
Testing testdata/berlin.txt (German text)...
PHP Fatal error: Allowed memory size of 136314880 bytes exhausted (tried to allocate 71 bytes) in /var/www/trunk/includes/normal/UtfNormal.php on line 285
Exact failure point may require adjustment depending on platform, etc.
* This involved updating of autogenerated Unicode normalisation tables, which are compliant with Unicode 5.2.0 now. All normalization tests pass except for RandomTest.php which have bitrotten to its death from PHP fatals.
* Added a line about r59036 to release notes
Doxygen documentation update:
* Changed alls @addtogroup to @ingroup. @addtogroup adds the comment to the group description, but doesn't add the file, class, function, ... to the group like @ingroup does. See for example http://svn.wikimedia.org/doc/group__SpecialPage.html where it's impossible to see related files, classes, ... that should belong to that group.
* Added @file to file description, it seems that it should be explicitely decalred for file descriptions, otherwise doxygen will think that the comment document the first class, variabled, function, ... that is in that file.
* Removed some empty comments
* Removed some ?>
Added following groups:
* ExternalStorage
* JobQueue
* MaintenanceLanguage
One more thing: there are still a lot of warnings when generating the doc.
Note that case mappings will only be used if mbstring extension is not present.
Normalization data files updated to Unicode 5.1.0; passes the automated tests.
Seem to have long since lost the script I originally used to generate the Utf8Case.php mapping file, which appears not to have been updated since 2002 or so. :)
Made a new one and moved it into the UtfNormal sub-library.
Note a couple limitations:
* Case mapping (still) uses only the 1:1 simple mappings. Any full or locale-specific mappings are ignored.
* These case mappings are not used anyway when the PHP mbstring extension is available; mbstring's case conversion functions are used instead, with whatever version of Unicode support and whatever complex mapping support they may or may not have.
* The generated Utf8Case.php file is not used directly -- you must also regenerate the serialized version in the 'serialized' directory after updating it to a new Unicode version.
Moved them out to a separate file in the library which can be cleanly included from both places for transparent happiness.
A fresh rebuild & test of UtfNormal data via 'make test' now works fine.
* Added support for configuration of an arbitrary number of commons-style file repositories.
* Split Image.php into filerepo/File.php and filerepo/LocalFile.php
* Renamed Image::getImagePath() to File::getPath()
* Added initial support for timestamp-based file fetching (OldLocalFile), to be expanded upon by aaron.
* Changed the interface for Image/File object creation: use wfFindFile() or wfLocalFile() depending on semantics
* ImageGallery::add() now accepts a title object as the first parameter
* Moved file handling operations on upload from SpecialUpload to File
* Removed path-related functions from ImageFunctions.php. Removed static path accessors from File.
* Added a Content-Disposition header to thumb.php output
* Improved thumb.php error handling
* Updated the unit test suite to kind of partially work with modern computers. RunTests.php doesn't work just yet. Fixed an actual regression that the test suite detected -- moved some defines to Defines.php where they will be loaded consistently.
* Seems like an opportune time to introduce "@addtogroup Media" documentation tags.
* Merge "@addtogroup Metadata" (used by Exif.php) into "@addtogroup Media".
* Few more moving comment blocks to above classes.
This can be done either by:
* Using explicit full paths, using the $IP global for the installation directory full path, and then working down the tree from there.
* Using explicit full paths, using the "dirname(__FILE__)" directive to get a full directory path for the includer file.
* Occasionally removing the line altogether, and then for some files the inclusion is handled by the autoloader.
For example, if the "extensions/wikihiero/wh_main.php" file does an include or require on "wh_list.php", then PHP does the following:
* tries to open "wiki/wh_list.php", and fails.
* tries to open "wiki/includes/wh_list.php", and fails.
* tries to open "wiki/languages/wh_list.php", and fails.
* tries to open "wiki/extensions/wikihiero/wh_list.php", and succeeds.
So in this example, the first 3 calls can be prevented if PHP is told where the file is.
Testing Method: On a Linux box, run these commands to attach strace to all the apache2 processes, and log their system calls to a temporary file, then generate some activity, and then stop the strace:
-----------------------------------
rm /tmp/strace-log.txt
strace -tt -o /tmp/strace-log.txt -p `pidof apache2 | sed 's/ / -p /g'` &
php maintenance/fuzz-tester.php --keep-passed-tests --include-binary --max-runtime=3 > /tmp/strace-tests.txt
killall -9 strace
grep "No such file or directory" /tmp/strace-log.txt | sort -u
-----------------------------------
Any failed file stats will be marked with: "-1 ENOENT (No such file or directory)".
Also:
* Strict Standards: Undefined offset: 230 in includes/normal/UtfNormal.php on line 637
* Strict Standards: iconv() [<a href='function.iconv'>function.iconv</a>]: Detected an illegal character in input string in languages/Language.php on line 776
[Note: Partial only - despite adding "//IGNORE", it still seems to be possible with some
messed- up binary input to cause PHP 5.1.2's iconv() function to squeal like a stuck pig].
* Update one $fname variable (method belongs to HistoryBlobStub class).
* use diffchange class alone for backwards compatibility with old renderings and diff plugins
* set text-decoration: none in diffs in RSS/Atom feeds
* fix bad diff regex in UTF-8 RandomTest script
* removing some unused global declarations.
* removing or commenting out or adding comments for unused local vars.
* Adding one or two local var declarations.
* Declaring $matches array passed to preg_match() / preg_match_all() as array() before using [not required, just have a slight preference for the explicitness].
* remove one or two pass-by-reference function declarations where the value is not modified.
* Adding some braces to if-else blocks.
* In Parser.php, stripstrate is now an object rather than an array as per r17820, so we no longer need ask for a reference to it (as in "$x =& $this->mStripState;"), and in fact it's probably just simpler to get rid of $x altogether.
* Moving some preg regexes from "" quoting to '' quoting to stop static analyzer whinging about bad escape sequences.
... up to "LinksUpdate.php" in the includes/ directory.