Thijs/wiki.techinc.nl

Author	SHA1	Message	Date
Nick Jenkins	baaee13afc	Prevent some unnecessary lstat system calls, generated by include or require directives. This can be done either by: * Using explicit full paths, using the $IP global for the installation directory full path, and then working down the tree from there. * Using explicit full paths, using the "dirname(__FILE__)" directive to get a full directory path for the includer file. * Occasionally removing the line altogether, and then for some files the inclusion is handled by the autoloader. For example, if the "extensions/wikihiero/wh_main.php" file does an include or require on "wh_list.php", then PHP does the following: * tries to open "wiki/wh_list.php", and fails. * tries to open "wiki/includes/wh_list.php", and fails. * tries to open "wiki/languages/wh_list.php", and fails. * tries to open "wiki/extensions/wikihiero/wh_list.php", and succeeds. So in this example, the first 3 calls can be prevented if PHP is told where the file is. Testing Method: On a Linux box, run these commands to attach strace to all the apache2 processes, and log their system calls to a temporary file, then generate some activity, and then stop the strace: ----------------------------------- rm /tmp/strace-log.txt strace -tt -o /tmp/strace-log.txt -p `pidof apache2 | sed 's/ / -p /g'` & php maintenance/fuzz-tester.php --keep-passed-tests --include-binary --max-runtime=3 > /tmp/strace-tests.txt killall -9 strace grep "No such file or directory" /tmp/strace-log.txt | sort -u ----------------------------------- Any failed file stats will be marked with: "-1 ENOENT (No such file or directory)".
Also: * Strict Standards: Undefined offset: 230 in includes/normal/UtfNormal.php on line 637 * Strict Standards: iconv() [<a href='function.iconv'>function.iconv</a>]: Detected an illegal character in input string in languages/Language.php on line 776 [Note: Partial only - despite adding "//IGNORE", it still seems to be possible with some messed-up binary input to cause PHP 5.1.2's iconv() function to squeal like a stuck pig]. * Update one $fname variable (method belongs to HistoryBlobStub class).	2007-02-09 05:36:56 +00:00
Brion Vibber	8fab89a6c4	Cleanup from r19742: * use diffchange class alone for backwards compatibility with old renderings and diff plugins * set text-decoration: none in diffs in RSS/Atom feeds * fix bad diff regex in UTF-8 RandomTest script	2007-02-04 18:42:07 +00:00
Antoine Musso	fe7d2d15d4	Fix #6844 : Semantically correct tags for diffchanges (<ins> && <del>) Bumps wgStyleVersion to 55. Patch by Messi <messias+spam@gmail.com>	2007-02-03 21:47:53 +00:00
Antoine Musso	c771fc9c96	Use Doxygen @addtogroup instead of phpdoc @package && @subpackage	2007-01-20 15:09:52 +00:00
Brion Vibber	e398816be7	use number_format on bytes/sec in output to make it easier to read	2007-01-13 04:22:47 +00:00
Brion Vibber	f15f0e05bb	fix benchmark test data downloads; fix link for english text; find another page for korean text (page was deleted)	2007-01-13 02:57:58 +00:00
Brion Vibber	161d9aee1f	* (bug 7250) Updated Unicode normalization tables to Unicode 5.0	2007-01-13 02:30:59 +00:00
Brion Vibber	25a2f1b60a	adjust CleanUpTest to run with PHPUnit 3	2007-01-13 02:15:19 +00:00
Nick Jenkins	14c53b728f	Code housekeeping stuff (and barring any stuff-ups on my behalf, there should be no changes in behaviour whatsoever after this) - * removing some unused global declarations. * removing or commenting out or adding comments for unused local vars. * Adding one or two local var declarations. * Declaring $matches array passed to preg_match() / preg_match_all() as array() before using [not required, just have a slight preference for the explicitness]. * remove one or two pass-by-reference function declarations where the value is not modified. * Adding some braces to if-else blocks. * In Parser.php, stripstate is now an object rather than an array as per r17820, so we no longer need ask for a reference to it (as in "$x =& $this->mStripState;"), and in fact it's probably just simpler to get rid of $x altogether. * Moving some preg regexes from "" quoting to '' quoting to stop static analyzer whinging about bad escape sequences. ... up to "LinksUpdate.php" in the includes/ directory.	2006-11-23 08:25:56 +00:00
Yuri Astrakhan	7b49a7bdda	Marked all functions as static	2006-10-21 08:30:48 +00:00
Tim Starling	f3ce9d418d	Use absolute path in require_once, errors reported in some configurations due to odd include_path.	2006-10-03 13:06:39 +00:00
Antoine Musso	93154120cc	Remove forced dereferencements (new() returns a reference in PHP5)	2006-07-11 14:11:23 +00:00
Antoine Musso	473cd5cbcc	unused variables as per #3692	2006-05-01 10:53:59 +00:00
Antoine Musso	69689725c1	Switching from phpdoc to doxygen (use less than 32MB of memory). Run maintenance/mwdocgen.php to generate doc in ./docs/html/ .	2006-04-19 15:46:24 +00:00
Brion Vibber	3bbf7dcbd2	Remove .cvsignore files	2006-04-05 08:23:27 +00:00
Brion Vibber	f2c29baf9f	Update the FSF's address in all these GPL stub headers	2006-04-05 07:43:17 +00:00
Tim Starling	11f0b952f6	Replaced codepointToUtf8 calls with string literals, should save a few milliseconds according to xdebug. Ran unit test.	2006-03-05 03:03:03 +00:00
Brion Vibber	266d41f165	* Added wfDie() wrapper, and some manual die(-1), to force the return code to the shell to return nonzero when we crap out with an error.	2006-01-14 02:49:43 +00:00
Ævar Arnfjörð Bjarmason	a26d5a49d7	* s~\t+$~~	2006-01-07 13:31:29 +00:00
Ævar Arnfjörð Bjarmason	7bbe971aec	* s~ +$~~	2006-01-07 13:09:30 +00:00
Brion Vibber	af2177edfd	Code cleanup: normalize case for intval(), strval(), floatval() calls.	2005-08-16 23:36:16 +00:00
Brion Vibber	f77b1cbbf3	Update files as currently generated.	2005-05-18 09:18:07 +00:00
Antoine Musso	2104f62734	fix phpdoc comment	2005-01-27 19:51:47 +00:00
Brion Vibber	9f963dfac7	notes	2004-12-03 10:41:57 +00:00
Brion Vibber	11e0f6ecff	Require running from command line	2004-12-03 10:30:50 +00:00
Brion Vibber	727e4d1aab	Fix composition bug: completed hangul syllable should not be merged with another following final jamo	2004-11-15 00:59:40 +00:00
Brion Vibber	deb0452649	Add a utf-8 to hex sequence function for debugging	2004-11-15 00:58:36 +00:00
Brion Vibber	66e64d98d2	Test: feeds random strings to both pure PHP and ICU code paths looking for differences.	2004-11-14 21:40:44 +00:00
Brion Vibber	c6340de5b3	Fix regression in ICU-mode UTF-8 verification: U+FFFF is forbidden	2004-11-14 21:36:43 +00:00
Brion Vibber	e4e75a58a6	Support using ICU to do most of the heavy lifting in cleanUp() if the extension is loaded. Modestly faster for roman text (1-2x), 16-20x faster than the PHP looping for already normalized Russian, Japanese, and Korean text.	2004-11-14 05:17:29 +00:00
Brion Vibber	4a4f248655	Fix regression: surrogate half followed by extra tail bytes	2004-11-14 04:27:03 +00:00
Brion Vibber	9535fc035b	Fix UTF-8 validation regression: well-formed but forbidden UTF-8 sequence followed by bogus tail bytes	2004-11-14 04:07:28 +00:00
Brion Vibber	dd69eb14f5	Fix UTF-8 validation regression where a bad head byte is followed by ascii, then bad tail byte.	2004-11-14 03:48:49 +00:00
Brion Vibber	dec06744da	Ignore some Mac-related files	2004-11-14 02:25:44 +00:00
Brion Vibber	7bf6095d73	Fix UTF-8 validation bug where some cases didn't get replacement chars inserted correctly	2004-11-14 02:24:44 +00:00
Brion Vibber	b108d98286	Add a Russian test file to the benchmark (2-byte characters, using ASCII spacing and punctuation)	2004-11-11 07:05:21 +00:00
Brion Vibber	961187ba17	Tweak benchmark a bit; display times in milliseconds instead of seconds for legibility.	2004-11-07 22:01:57 +00:00
Brion Vibber	eae361e2f0	cleanUp() optimization: speed up Japanese, Korean tests by another 15% by rearranging the loop and avoiding rebuilding the string if there are no illegal characters. Removed restrictions on U+FDD0 and friends; these do seem to be allowed by XML, though they 'recommend' you avoid them.	2004-11-07 11:28:00 +00:00
Brion Vibber	8efe66008c	Don't run the control characters through the invariant test, as they are stripped by cleanUp() for XML safety.	2004-11-06 03:00:29 +00:00
Brion Vibber	7434438b98	Don't forget to actually _make_ the replacements for illegal chars. :P	2004-11-06 02:52:25 +00:00
Brion Vibber	93c098dfb7	Adding some extra tests for the cleanUp() function	2004-11-06 02:51:43 +00:00
Brion Vibber	51dd271399	Shave off a few more milliseconds from cleanUp() inner loop.	2004-11-05 09:13:02 +00:00
Brion Vibber	97f577163c	Shave a few more percentage points from times on cleanUp() on unicode text by building a combined NFC-check hash.	2004-11-05 08:22:56 +00:00
Brion Vibber	0db79dbed6	More incremental optimization on cleanUp(): * when splitting ascii vs non-ascii chunks, don't split punctuation and control chars as aggressively; this benefits the Korean test data * use output buffer and echo; it's _slightly_ faster than string concatenation. * Separate the surrogate check from the others; many Korean letters fall in the adjacent area with the same head byte, so this gives a small speed boost on Korean text	2004-11-05 04:07:04 +00:00
Brion Vibber	874f8b48c6	cleanUp() optimization: about 1/8 speed boost on unicode-dominant text (Japanese, Korean test data)	2004-11-05 00:47:03 +00:00
Brion Vibber	9ba6a6c74a	cleanUp() optimization: split the string into pure ASCII chunks and chunks which need to be checked byte by byte. Over 5x speedup for German text sample.	2004-11-05 00:26:09 +00:00
Brion Vibber	48cb181bd2	Optimization on cleanUp(): roughly 1/3 speed boost on ascii-dominant but not ascii-pure text (eg German)	2004-11-04 23:53:44 +00:00
Brion Vibber	5f530ba1f3	Optimize inner loop in cleanUp(): boosts performance on non-ASCII text by about 20%. Also, trim the XML-illegal control characters from pure ASCII as well as non-ASCII strings.	2004-11-04 11:44:45 +00:00
Brion Vibber	1897c54f2a	The pass-by-reference on the string on fastCompose() really slows things down sometimes in PHP4. Taking it out speeds up processing of Japanese text significantly.	2004-10-30 12:35:37 +00:00
Brion Vibber	286dd13042	More inlining; fastCompose() is now twice as fast on hangul chars, which cuts down the NFC() time on Korean text a fair chunk.	2004-10-30 12:06:31 +00:00
Brion Vibber	dafeb1fe3b	Work through the NFC substeps with the actual data to make the substep times more meaningful	2004-10-30 10:20:19 +00:00
Brion Vibber	711899c70d	Benchmark was pulling the wrong Tokyo article (shorter than the others)	2004-10-30 06:47:36 +00:00
Brion Vibber	959f097c2d	Add some sub-functions back to the benchmark	2004-10-30 06:42:39 +00:00
Brion Vibber	de3549d9e9	Optimize inner loops a bit.	2004-10-30 06:02:30 +00:00
Brion Vibber	5cf94de93f	Subject UtfNormal::cleanUp() to the same tests as UtfNormal::toNFC()	2004-10-30 05:24:24 +00:00
Brion Vibber	d2e152e6de	Munge doc comments. Mark as its own package for docs.	2004-10-28 02:56:13 +00:00
Brion Vibber	6377e82b76	Load form C data on demand; if we are dealing in all-ASCII text we can save some memory and time by not loading it.	2004-10-09 08:08:26 +00:00
Brion Vibber	0824182956	Add support for using ICU to perform normalization, which is much much faster than the PHP code! Still need to add support for cleanup/verification.	2004-10-07 05:59:10 +00:00
Brion Vibber	bcd1e9e844	Fetch test data for the benchmark	2004-10-07 03:40:06 +00:00
Brion Vibber	f0610d0f67	Doc comments	2004-09-27 02:59:24 +00:00
Brion Vibber	106d11a197	Add remotely fetched files to .cvsignore to reduce screen pollution	2004-09-23 07:29:25 +00:00
Brion Vibber	dd195aa594	Some more phpdoc bits	2004-09-04 09:35:01 +00:00
Antoine Musso	ba2afcd9fa	Split files and classes in different packages for phpdocumentor. I probably changed some double quotes to single and used function foo () { shema	2004-09-03 23:00:01 +00:00
Antoine Musso	705bb88da0	Change the way comments are generated so they are compatible with phpdocumentor. Changes already existing files as well.	2004-09-03 22:52:28 +00:00
Brion Vibber	9857a47c3f	Correction to the \r stripping	2004-09-03 06:44:57 +00:00
Brion Vibber	ed46bd50fe	Add UtfNormal::cleanUp() function: strips XML-unsafe characters and illegal UTF-8 sequences, then normalizes to form C.	2004-09-03 05:39:30 +00:00
Brion Vibber	53e71c1702	Split the data arrays for form KC, KD to a separate include file and load it on demand. These are less likely to be used, so save the memory and parse time...	2004-09-02 07:39:06 +00:00
Brion Vibber	a5cfdf0360	Unicode normalization routines. See: http://www.unicode.org/reports/tr15/	2004-08-29 10:30:23 +00:00

118 commits