wiki.techinc.nl/serialized
Brion Vibber c012a63d95 * (bug 13615) Update case mappings and normalization to Unicode 5.1.0
Note that case mappings will only be used if the mbstring extension is not present.

Normalization data files updated to Unicode 5.1.0; passes the automated tests.

Seem to have long since lost the script I originally used to generate the Utf8Case.php mapping file, which appears not to have been updated since 2002 or so. :)
Made a new one and moved it into the UtfNormal sub-library.

Note a couple of limitations:
* Case mapping (still) uses only the 1:1 simple mappings. Any full or locale-specific mappings are ignored.
* These case mappings are not used anyway when the PHP mbstring extension is available; mbstring's case conversion functions are used instead, with whatever version of Unicode support and whatever complex mapping support they may or may not have.
* The generated Utf8Case.php file is not used directly -- you must also regenerate the serialized version in the 'serialized' directory after updating it to a new Unicode version.
2008-05-08 06:28:50 +00:00
.htaccess Don't allow access from the web to serialized/ directory 2008-04-16 09:41:49 +00:00
Makefile * (bug 13615) Update case mappings and normalization to Unicode 5.1.0 2008-05-08 06:28:50 +00:00
README * Updated numbers 2007-01-14 15:45:53 +00:00
serialize-localisation.php Remove ?>'s from files. They're pointless, and just asking for people to mess with the files and add trailing whitespace. (Yes, I looked over every one and reverted those that were bogus. Slash-enter a million times in less worked well enough, although it was a bit mind-numbing.) 2007-06-29 01:19:14 +00:00
serialize.php Remove ?>'s from files. They're pointless, and just asking for people to mess with the files and add trailing whitespace. (Yes, I looked over every one and reverted those that were bogus. Slash-enter a million times in less worked well enough, although it was a bit mind-numbing.) 2007-06-29 01:19:14 +00:00
Utf8Case.ser * (bug 13615) Update case mappings and normalization to Unicode 5.1.0 2008-05-08 06:28:50 +00:00

This directory contains data files in the format of PHP's serialize() function. 
The source data are typically array literals in PHP source files. We have 
observed that unserialize(file_get_contents(...)) is faster than executing such 
a file from an oparray cache like APC, and very much faster than loading it by 
parsing the source file without such a cache. It should also be faster than 
loading the data across the network with memcached, as long as you are careful 
to put your MediaWiki root directory on a local hard drive rather than on NFS. 
This is a good idea for performance in any case.

To generate all data files:

   cd /path/to/wiki/serialized
   make

This requires GNU Make. At present, the only serialized data file which is 
strictly required is Utf8Case.ser. This contains UTF-8 case conversion tables, 
which have essentially never changed since MediaWiki was invented. 

The Messages*.ser files are localisation files, containing user interface text 
and various other data related to language-specific behaviour. Because they 
are merged with the fallback language (usually English) before caching, they 
are all quite large, about 140 KB each at the time of writing. If you generate 
all of them, they take up about 20 MB. Hence, I don't expect we will include 
all of them in the release tarballs. However, to obtain optimum performance, 
YOU SHOULD GENERATE ALL THE LOCALISATION FILES THAT YOU WILL BE USING ON YOUR 
WIKIS.

You can generate individual files by typing a command such as:

   cd /path/to/wiki/serialized
   make MessagesAr.ser
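
If you need several languages, the same command can take several targets at 
once. The following is a minimal sketch, not part of the Makefile: ser_targets 
is a hypothetical helper that turns language suffixes such as Ar or De into 
Messages*.ser target names, assuming the targets follow the MessagesAr.ser 
pattern shown above.

```shell
# Hypothetical helper: print one Messages<suffix>.ser target per argument.
# Assumes each argument is a capitalised language suffix such as Ar or De.
ser_targets() {
    for suffix in "$@"; do
        printf 'Messages%s.ser\n' "$suffix"
    done
}

# Usage, from the serialized directory:
#   make $(ser_targets Ar De Fr)
```

With no arguments the helper prints nothing, so check that you passed at least 
one suffix before handing its output to make.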

If you change a Messages*.php source file, you must regenerate any serialized 
data files which are present. If you change MessagesEn.php, this will 
invalidate *all* Messages*.ser files, because each one is merged with it. 
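
To see which files are out of date, you can compare modification times. The 
following is a rough sketch under stated assumptions: stale_ser_files is a 
hypothetical helper, both directory arguments depend on your local layout, and 
it relies on the -nt file test, which bash and dash support but POSIX does not 
require.

```shell
# Hypothetical helper: list Messages*.ser files in directory $1 that are
# older than the matching Messages*.php in directory $2, or older than
# MessagesEn.php (a change to which invalidates every .ser file).
# Both directories are assumptions about the local layout.
stale_ser_files() {
    ser_dir=$1
    src_dir=$2
    for ser in "$ser_dir"/Messages*.ser; do
        [ -e "$ser" ] || continue    # the glob matched nothing
        name=$(basename "$ser" .ser)
        if [ "$src_dir/$name.php" -nt "$ser" ] ||
           [ "$src_dir/MessagesEn.php" -nt "$ser" ]; then
            printf '%s\n' "$ser"
        fi
    done
}

# Usage (the source path here is only an example):
#   stale_ser_files . /path/to/messages/source
```

The helper only reports files; it does not rebuild anything. A missing source 
file is never treated as newer, so a .ser file without a matching .php file is 
reported only when MessagesEn.php is newer than it.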

I think we should distribute a few Messages*.ser files in the release tarballs,
specifically the ones created by "make dist".