Commit graph

12 commits

Author SHA1 Message Date
Ævar Arnfjörð Bjarmason
a26d5a49d7 * s~\t+$~~ 2006-01-07 13:31:29 +00:00
Ævar Arnfjörð Bjarmason
7bbe971aec * s~ +$~~ 2006-01-07 13:09:30 +00:00
Antoine Musso
2104f62734 fix phpdoc comment 2005-01-27 19:51:47 +00:00
Brion Vibber
727e4d1aab Fix composition bug: completed hangul syllable should not be merged with another following final jamo 2004-11-15 00:59:40 +00:00
Brion Vibber
c6340de5b3 Fix regression in ICU-mode UTF-8 verification: U+FFFF is forbidden 2004-11-14 21:36:43 +00:00
Brion Vibber
e4e75a58a6 Support using ICU to do most of the heavy lifting in cleanUp() if the extension is loaded.
Modestly faster for roman text (1-2x), 16-20x faster than the PHP looping for already normalized Russian, Japanese, and Korean text.
2004-11-14 05:17:29 +00:00
Brion Vibber
4a4f248655 Fix regression: surrogate half followed by extra tail bytes 2004-11-14 04:27:03 +00:00
Brion Vibber
9535fc035b Fix UTF-8 validation regression: well-formed but forbidden UTF-8 sequence followed by bogus tail bytes 2004-11-14 04:07:28 +00:00
Brion Vibber
dd69eb14f5 Fix UTF-8 validation regression where a bad head byte is followed by ASCII, then bad tail byte. 2004-11-14 03:48:49 +00:00
Brion Vibber
7bf6095d73 Fix UTF-8 validation bug where some cases didn't get replacement chars inserted correctly 2004-11-14 02:24:44 +00:00
Brion Vibber
eae361e2f0 cleanUp() optimization: speed up Japanese, Korean tests by another 15% by rearranging the loop and avoiding rebuilding the string if there are no illegal characters.
Removed restrictions on U+FDD0 and friends; these do seem to be allowed by XML, though they 'recommend' you avoid them.
2004-11-07 11:28:00 +00:00
Brion Vibber
93c098dfb7 Adding some extra tests for the cleanUp() function 2004-11-06 02:51:43 +00:00