Why:

- DeduplicateStyles runs as a default post-cache output transformation for every backend pageview. It tokenizes the article HTML via Remex to deduplicate style nodes within.
- This is expensive for large pages. On the Barack Obama page, the transform takes 350+ ms on a parser cache hit.
- Some other transforms, like HandleSectionLinks, already use regexes to only run Remex-driven transforms on relevant elements to avoid a potentially expensive tokenization of the whole page.

What:

- Use a regular expression to limit this transform so that it only tokenizes potential `<style>` nodes. This takes ~2 ms to execute on a large page[1], compared to ~166 ms currently.
- Restrict this optimization to legacy parser output transformations, since the naïve regex used might otherwise match encoded style tags within data-parsoid attribute values, as described in I32d3d1772243c3819e1e1486351d16871b6e21c4. Add a test for this.

[1] https://en.m.wikipedia.org/wiki/Democratic_Party_(United_States)?action=render

Bug: T394059
Change-Id: I33ebcc2da7685b4b6dafdad3ed3ef2a9edea9a00
(cherry picked from commit 02f69d5dc99a964981c57b597eedffa1f253a14c)

| | | |
|---|---|---|
| .phan | ||
| cache | ||
| docs | ||
| extensions | ||
| images | ||
| includes | ||
| languages | ||
| maintenance | ||
| mw-config | ||
| resources | ||
| skins | ||
| tests | ||
| vendor@d9b7761127 | ||
| .dockerignore | ||
| .editorconfig | ||
| .eslintignore | ||
| .eslintrc.json | ||
| .fresnel.yml | ||
| .git-blame-ignore-revs | ||
| .gitattributes | ||
| .gitignore | ||
| .gitmessage | ||
| .gitmodules | ||
| .gitreview | ||
| .mailmap | ||
| .phpcs.xml | ||
| .stylelintrc.json | ||
| .svgo.config.js | ||
| .vsls.json | ||
| api.php | ||
| autoload.php | ||
| CODE_OF_CONDUCT.md | ||
| composer.json | ||
| composer.local.json-sample | ||
| COPYING | ||
| CREDITS | ||
| DEVELOPERS.md | ||
| docker-compose.yml | ||
| FAQ | ||
| Gruntfile.js | ||
| HISTORY | ||
| img_auth.php | ||
| index.php | ||
| INSTALL | ||
| jsdoc.json | ||
| load.php | ||
| opensearch_desc.php | ||
| package-lock.json | ||
| package.json | ||
| phpunit.xml.dist | ||
| README.md | ||
| RELEASE-NOTES-1.43 | ||
| rest.php | ||
| SECURITY | ||
| thumb.php | ||
| thumb_handler.php | ||
| UPGRADE | ||
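
The commit message at the top of this page describes pre-filtering with a regular expression so that only potential `<style>` nodes are handed to the tokenizer. The following is a minimal sketch of that idea in Python, not the MediaWiki PHP implementation: the function name `deduplicate_styles`, the helper class, and the regex are hypothetical, and the `data-mw-deduplicate` key and `mw-deduplicated-inline-style` placeholder are assumptions about the markup involved. Python's `html.parser` stands in for Remex here.

```python
import re
from html import escape
from html.parser import HTMLParser

# Cheap pre-filter: find candidate <style>...</style> spans without
# tokenizing the whole page. Deliberately naive (hypothetical pattern).
STYLE_RE = re.compile(r"<style\b[^>]*>.*?</style\s*>", re.IGNORECASE | re.DOTALL)


class _StyleAttrs(HTMLParser):
    """Parses one candidate span and records the attributes of its <style> tag."""

    def __init__(self):
        super().__init__()
        self.attrs = None

    def handle_starttag(self, tag, attrs):
        if tag == "style" and self.attrs is None:
            self.attrs = dict(attrs)


def deduplicate_styles(html: str) -> str:
    """Keep the first <style> per deduplication key; replace repeats with a <link>."""
    seen = set()

    def replace(match):
        # Only the matched span is parsed, never the full document.
        parser = _StyleAttrs()
        parser.feed(match.group(0))
        parser.close()
        # Assumption: styles carry their dedup key in data-mw-deduplicate.
        key = (parser.attrs or {}).get("data-mw-deduplicate")
        if not key:
            return match.group(0)  # no key: leave the node untouched
        if key in seen:
            # Assumption: repeats become a lightweight placeholder link.
            href = escape("mw-data:" + key, quote=True)
            return f'<link rel="mw-deduplicated-inline-style" href="{href}">'
        seen.add(key)
        return match.group(0)

    return STYLE_RE.sub(replace, html)
```

Because the pattern is matched against raw text, it can also hit style-like text that sits inside an attribute value, which is the reason the commit restricts the optimization to legacy parser output.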
MediaWiki
MediaWiki is a free and open-source wiki software package written in PHP. It serves as the platform for Wikipedia and the other Wikimedia projects, used by hundreds of millions of people each month. MediaWiki is localised in over 350 languages and its reliability and robust feature set have earned it a large and vibrant community of third-party users and developers.
MediaWiki is:
- feature-rich and extensible, both on-wiki and with hundreds of extensions;
- scalable and suitable for both small and large sites;
- simple to install, working on most hardware/software combinations; and
- available in your language.
For system requirements, installation, and upgrade details, see the files RELEASE-NOTES, INSTALL, and UPGRADE.
- Ready to get started?
- Setting up your local development environment?
- Looking for the technical manual?
- Seeking help from a person?
- Looking to file a bug report or a feature request?
- Interested in helping out?
MediaWiki is the result of global collaboration and cooperation. The CREDITS file lists technical contributors to the project. The COPYING file explains MediaWiki's copyright and license (GNU General Public License, version 2 or later). Many thanks to the Wikimedia community for testing and suggestions.