Commit graph

105 commits

Author SHA1 Message Date
Derk-Jan Hartman
b00f7237ee Searchindex primary key, title length and utf8mb4
- Make si_page primary key
- Allow si_title to be larger
- Switch from utf8 to utf8mb4
- Remove default of empty string for titles

Sqlite is not migrated, as it has it's own overrides
Postgres is migrated, but is not in actual use

This is mostly from I273e3a7715abf97d2889904642c7c375e76de4f6

Bug: T249976
Bug: T231827
Change-Id: I12adff3e6ca6a9986ff207bef16272195c3a6a48
(cherry picked from commit a2ba7ee14d6b38a5e89bcc63c3bd0ca8b5107702)
2024-11-02 11:14:37 +00:00
Umherirrender
9854946391 schema: Remove allowInfinite from page_links_updated
This timestamp is to track the last refreshlinks run and is always now
or in the past, there is no need to hold the infinite value.

This only affects mysql

Bug: T298317
Change-Id: I16978e076a795258639591a2fbfe353f67d0ec64
2024-07-29 22:48:07 +02:00
Amir Sarabadani
bb6c6e4174 schema: Drop old pagelinks columns
It has been dropped in production already.

Bug: T299947
Change-Id: I8ec1e7d9224c81d6494c39c78df9e4bdac38d377
2024-06-18 21:13:38 +02:00
Tim Starling
2998d9bd47 block: Migrate to the new block schema on non-WMF wikis
Migrate from ipblocks to block/block_target and drop the ipblocks
table. Update tests.

In PostgresUpdater, change some schema update functions to skip field
updates if the table doesn't exist, by analogy with
DatabaseUpdater::modifyField.

Bug: T346293
Change-Id: Icf91b35f7f729cead7c800429653eb30731762a1
2024-05-09 10:14:43 -04:00
Alexander Vorwerk
28bdd790ee Make rc_id a bigint
Wikidata is at 51.3% of its maximum value (see [1]), so we still have
comfortable time to make this change, but it will inevitably be
necessary and there is no point in postponing this change into the
future. The autoincrement value will not get smaller. ¯\_(ツ)_/¯

Also since rc only stores stuff for 30 days, the table is not that
big.

[1] - https://grafana.wikimedia.org/d/79S1Hq9Mz/wikidata-reliability-metrics?viewPanel=29&orgId=1

Bug: T63111
Change-Id: Icf3dc9815814ef73aa6a39f1c221a349e6b76872
2024-05-04 21:14:51 +02:00
Alexander Vorwerk
b2d44a162c Cleanup revision table schema
* Drop default value from rev_actor and rev_comment_id
* Make rev_id a bigint

Bug: T215466
Depends-On: I88318d7bcc063bc86a56eeb5f00048ea6e81964b
Change-Id: Id0a3d920e8b2dc8643fa3c0341b34ab3ed5761dc
2024-05-04 20:22:36 +02:00
Amir Sarabadani
f0bfc3d433 Schema: Drop iwl_prefix_from_title from iwlinks
After exhuastive research, we concluded that iwl_prefix_from_title is
not used and in case it's actually used, other indexes provide enough
cardinality.

This table is about to grow quite large in Commons, let's avoid making
it bigger than it needs to be.

Bug: T343131
Change-Id: I89e40dff384291968d2465e4109a3d212ae2f8c7
2024-02-08 17:01:26 +01:00
Amir Sarabadani
616744db1d Schema: Drop unused and useless indexes of sites table
This table has eight indexes plus PK. It has around 1000 rows only. Even
if it needs these indexes (which it doesn't), they are still useless.

Looking at the code, the only potential useful index is the one on
site_global_key, they are showing up in the report of unused indexes in
the db and I checked with Fandom (which might benefit from an index on
this table) and they said they don't use sites table.

Bug: T342856
Change-Id: I06b3db0f33bd35bfa68f4b418d8c2f4b9b988409
2024-02-01 15:32:05 +01:00
Amir Sarabadani
f056a26242 Schema: Drop cl_collation_ext index
Unused since Ie4dd91ee29308c980e

Bug: T342854
Change-Id: I3acf563c64ff176ade3e0c6745839a168e92473b
2024-01-26 16:16:41 +01:00
Tim Starling
fd9c7c2d3e Add the new block and block_target tables
Bug: T346293
Change-Id: I3822ad03227405a608dea1d788bcdb8321b95bb3
2024-01-08 14:29:59 +11:00
Thalia
caf9912323 Use year in temporary user names and restart index each year
Why:

* Part of a temporary user name is generated from an index that
  increments, which is stored in the database.
* As specified in T345855, the index will be restarted each year.
* Also specified in T345855, the year will be included in
  generated temporary user names.

What:
* Since the year must be included in the name in order to avoid
  naming conflicts if the index is restarted each year, both are
  implemented together and controlled by a single config.
* Add a new config option that, when true, restarts the name
  generation index at the start of each year and add the year into
  the user name: $wgAutoCreateTempUser['serialProvider']['useYear']
* Add a uas_year column to the user_autocreate_serial table, which
  is unique in combination with uas_shard, so the index can be
  stored for each shard/year combination.
* The year is added into the username just after the prefix, as
  specified in T345855. This is based on research that having the
  year near the start of the name aids understanding that the
  names are not IP addresses. The position of the year within the
  name is therefore not configurable (though whether to include
  it is). See T345855 for the research.

Bug: T349494
Bug: T349501
Depends-On: I6b3c640a4e74f52fd4a4f46de5a2cbe80fe3b665
Change-Id: If51acb3f4efa361ce36d919c862a52501a5a7d24
2024-01-05 17:14:19 +00:00
Brian Wolff
86608dfd4f Store image sizes as 64-bit bigint instead of 32-bit integers
This is meant in preparation for MediaWiki supporting files
larger than 4gb.

Bug: T191805
Change-Id: Ie67dd01aa0a8b28d9afc1805243e711fcadbc0f8
2023-10-04 08:01:25 -07:00
Amir Sarabadani
e5eda1c358 Schema: Drop old externallinks columns and indexes
Already dropped from production

Also dropping FixExtLinksProtocolRelative as it's not useful anymore and
it has been run in previous releases so it's not worth fixing.

Bug: T312666
Change-Id: I1dd6e704b34e685ada6e316da11243d10827d769
2023-09-05 15:32:23 +02:00
Amir Sarabadani
dae881296d Schema: Add pl_target_id column to pagelinks
Similar to templatelinks

Bug: T342689
Change-Id: I2ed692d7d0cdf756d29618363bec7fc761ff3df1
2023-07-25 21:03:45 +02:00
Amir Sarabadani
080883da09 Schema: Set default or nullable to three columns of externallinks
We need to drop these columns and we need to make them take default
values so we can issue write queries without these columns.

Also noting that MySQL prior to 8.0 can't set default values to blob
columns making this way more complicated than it should be but MariaDB
can set them (https://mariadb.com/kb/en/blob/). We also made these
columns nullable to make this work in MySQL.

Bug: T341828
Change-Id: I0d60742b6ce7adf642393ee00b66aa539b76dfc1
2023-07-18 11:59:09 +02:00
Alexander Vorwerk
b3611755d7 Drop revision_comment_temp
Bug: T299954
Change-Id: I85d21b1eff70a7d70e8ce14f25d66f7e7c76e5fe
2023-06-07 15:34:57 +00:00
Thalia
f283c0e990 schema: Add user_is_temp column to the user table
This allows temporary users to be identified from applications
external to MediaWiki, by the user table alone (without referring
to the $AutoCreateTempUser['matchPattern'] config).

Bug: T333223
Change-Id: I83c5ff42654164590fb0361c84e65a5315ddbda8
2023-05-10 19:18:39 +01:00
Amir Sarabadani
080c70879a schema: Add new fields for externallinks so we can reduce duplication
Bug: T318604
Change-Id: I217817bc518eaa86c9952187c6f1a861f480ccaf
2022-10-18 16:18:54 +00:00
Amir Sarabadani
6c4194e23e schema: Drop tl_title and tl_namespace fields from templatelinks
The day has gone. Still keeping the code as the schema changes are not
done in production but the data migration has been finished.

Bug: T299417
Change-Id: I906e069a63d1dae14924c72318b22b16244371d6
2022-09-06 19:53:15 +02:00
Amir Sarabadani
692dde00df Add support for write new for templatelinks migration
- schema change to allow tl_namespace and tl_title being empty
   This is done by removing them from primary key. They don't need to be
   nullable as they have default value.
 - Make sure with WRITE_NEW, updater avoids writing to the old columns

Bug: T306674
Change-Id: I2b8a29043e952060e7a79b6a7a3d647d48cd16fb
2022-07-12 14:46:54 +02:00
Amir Sarabadani
ce0b6f40fa schema: Drop legacy page_restrictions in page table
Bug: T35334
Change-Id: I17e73d5ef165481a5dd4c210da933b99c65ff79c
2022-05-26 13:59:57 +02:00
Amir Sarabadani
24115a8f4c Start clean up of revision_actor_temp table
It is being dropped in production

Bug: T215466
Change-Id: I66b2cb8653252e720c897351065978119f040ba7
2022-05-23 15:37:42 +00:00
diesel kapasule
6e97f2491b Schema: Updating user_editcount field to unsigned
I have updated user_editcount field to unsigned in
tables.json

This patch doesn't apply to Postgres.

Bug: T305340
Change-Id: I07a360944a10be9cc8ed8731c6286412294413a3
2022-05-09 21:34:11 +00:00
Umherirrender
52aebe227b schema: Make ipblocks.ipb_id unsigned
It makes it more consistent with AUTO_INCREMENT PRIMARY KEY columns on
other tables.

This includes ipblocks.ipb_parent_block_id and
ipblocks_restrictions.ir_ipb_id as well

This patch doesn't apply to Postgres.

Bug: T297208
Change-Id: I950aba63b2b226abbaf4010fbb51415bbff117f4
2022-04-14 17:45:41 +02:00
Tim Starling
e8dbf5f80c TempUser infrastructure and services
Add services and utilities for automatic creation of temporary user
accounts on page save, in order to avoid exposing the user's IP
address.

* Add $wgAutoCreateTempUser, for configuring the system
* Add TempUserConfig service, which interprets the config.
* Add TempUserCreator service, which creates users during page save as
  requested by EditPage. With proxy methods to TempUserConfig for
  convenience.
* Add table user_autocreate_serial. Table creation is necessary before
  the feature is enabled but is not necessary before deployment of this
  commit.

Bug: T300263
Change-Id: Ib14a352490fc42039106523118e8d021844e3dfb
2022-04-14 09:23:55 +10:00
Amir Sarabadani
9137db1d8a Add tl_target_id to templatelinks
Part of normalizing it, note that the field must be nullable now and
that will change later.

Bug: T299418
Change-Id: Id543dfa20a153312f66d2f45a64ac23e7272dabe
2022-01-28 17:41:01 +01:00
Umherirrender
fa40110671 schema: Make page_id references unsigned
This includes:
page_props.pp_page
page_restrictions.pr_page
ipblocks_restrictions.ir_value

These columns must hold all possible values from page.page_id,
which is an unsigned integer.

These patches doesn't apply to Postgres.

Bug: T297212
Change-Id: I789f19f4d52daeab08f3090771404d078f86d0b3
2022-01-27 20:20:00 +01:00
Amir Sarabadani
203d422dc4 Drop rev_page_id index on revision
This is already applied in production and known to be safe.

Bug: T163532
Change-Id: Ief29372f13b2d7cdb19395dcda6eb15e9a53efca
2022-01-27 09:43:27 +01:00
Amir Sarabadani
4a6e986141 Add linktarget table
Bug: T299416
Change-Id: Icae4513dd99635335857100d8a0c7102986933e5
2022-01-21 15:52:58 +01:00
Umherirrender
45c84bf927 schema: Make filearchive.fa_id unsigned
It makes it more consistent with AUTO_INCREMENT PRIMARY KEY columns on
other tables.

This patch doesn't apply to Sqlite or Postgres.

Bug: T297207
Change-Id: I5e58f23ea2441ef93ce7254210158d8094bd9010
2021-12-07 15:25:13 +01:00
Amir Sarabadani
ede308ed55 Drop pr_user from page_restrictions
It's not used for 13 years, safe to say it won't be used anytime soon

Bug: T199377
Change-Id: Iecf4fb6046699a1758ad2d1dc55a3ee8eb4b0389
2021-12-06 14:15:39 +01:00
Ammar Abdulhamid
1adaca51c3 Rename change_tag indexes to have ct_ prefix
Bug: T270033
Change-Id: I8a429726c99f6cadea0d671fd871f66b5611c856
2021-06-08 17:57:15 +01:00
Ammarpad
a8c01d7726 Rename name_title index to have page_ prefix
Bug: T270033
Change-Id: Id70d0e0a37dd0d000079820d51cef2791f5ec42e
2021-06-05 20:21:07 +02:00
Amir Sarabadani
7fdca7ce51 Add index on oldimage.oi_timestamp
Mirroring what happens on image table.

Bug: T279982
Change-Id: Ib67c32b10d3b88e09514d8fbb5dcb4f977108ba2
2021-05-29 20:44:32 +02:00
Amir Sarabadani
acd5076099 Migrate searchindex to abstract schema
For MySQL/Sqlite:
 - Nothing

For Postgres:
 - Introduce the table. It's not used but it'll be used in the future (T220450)

Bug: T164898
Bug: T230428
Bug: T220450
Change-Id: I9f33132676344f8cd813f7a438b3a6a078fd281c
2021-05-27 15:01:24 +02:00
Ammarpad
c397868e3d Remove custom table options from 'revision' table
This is not useful anymore.

Follow-up: Ia07dd52e43123473a1728523a3f863280537db8e
Change-Id: Ib903c69e3e331e96783c78b2ec0ab32e05f7f0d5
2021-05-23 21:57:04 +01:00
Ammarpad
5954b380fc Migrate revision table to abstract schema
Postgres:
 - Drop foreign key from rev_page
 - Make rev_page not nullable
 - Change rev_comment_id from int to bigint
 - Change rev_actor from int to bigint
 - Sync rev_page_id index with MySQL

MySQL/SQlite:
 - Drop default from rev_timestamp

Additional changes in the generator script to handle more
formatting issues due to use of additional custom table options

Bug: T230428
Bug: T164898
Change-Id: Ia07dd52e43123473a1728523a3f863280537db8e
2021-05-20 14:52:15 -07:00
Aaron Schulz
12923c1cd5 objectcache: add last-modified token field to objectcache table
Also added token and flags fields. The token field can
be used as a tie-breaker for modtime and also for faster
cas() operations. The flags field makes serialization and
compression format changes easier in the future.

Bug: T274174
Change-Id: I45731a877b21835652993c2d285165a76eeae3e9
2021-05-18 20:06:37 -07:00
Amir Sarabadani
80c309788b Make page_is_redirect and page_is_new unsigned
Mistake during the abstraction (Ibdaf332ea1d)

It doesn't need schema change since 1.36 is not released yet.

Bug: T230428
Change-Id: I071c39dfadf185b01e9109085907d6e3deb1f02f
2021-05-12 10:52:32 +02:00
Ammarpad
88bfadf5ed Migrate user table to abstract schema
Postgres:
 - Rename the table from `mwuser` to `user`
 - Transition user_id from int to serial
 - Make user_token not nullable and add default
 - Make user_real_name not nullable and add default
 - Make user_email not nullable
 - Make user_newpassword not nullable
 - Make user_password not nullable
 - Drop UNIQUE contraint on user_name and add default

MySQL/SQLite:
 * No changes

Bug: T164898
Bug: T230428
Change-Id: I746714f7b3ae16f9625f97bcca84280fcd8b61a0
2021-04-24 21:44:26 +01:00
Ammar Abdulhamid
6a3aa5b5a2 Migrate page to abstract schema
Postgres:
 - Change page_namespace from smallint to int
 - Change page_random from numeric with arbitrary precision to float
 - Make page_touched not nullable

MySQL/SQLite:
 - Change datatype of page_title from varchar (with binary collation)
   to varbinary(255)
 - Drop default empty string from timestamp field of page_touched

Bug: T230428
Bug: T164898
Change-Id: Ibdaf332ea1da309d31d35a6ebbc1b8fefced335e
2021-03-21 12:07:12 +01:00
Amir Sarabadani
2cc79854e8 Migrate archive table to abstract schema
One of the last ones left.

For MySQL/Sqlite:
 - Dropping default of ar_timestamp, empty string is not a valid
   timestamp.
 - Changing ar_title from "varchar() binary" to varbinary

for Postgres:
 - Set default for ar_namespace and ar_title
 - Change datatype of ar_comment_id, ar_actor, ar_namespace
The indexes were fixed separately.

Bug: T230428
Bug: T164898
Bug: T42626
Depends-On: I83cf1cd51ac9cf933c9175cefd6e38a6914f3494
Change-Id: Ic1d13a82b27f7fa39a0f0ea9c5b7b193b007e4ab
2021-03-13 21:51:16 +01:00
Amir Sarabadani
1053405664 Rename new_name_timestamp on recentchanges to rc_new_name_timestamp
To make it have a uniform prefix for index and column names

Bug: T270033
Change-Id: I8eb600416913092bd5aeb70389bba6e8a54d1d57
2021-03-01 19:00:52 +00:00
Amir Sarabadani
da74e2139a Make rc_id unsigned
On very heavily edited wikis, it is possible for the value of signed
int to be too small. This increases the max value of rc_id from
2,147,483,647 to 4,294,967,295. It also makes it more consistent with
AUTO_INCREMENT PRIMARY KEY columns on other tables.

This patch doesn't apply to Sqlite or Postgres.

Bug: T62962
Change-Id: I3e2c1cf20004dcdeaf195611f762a8f6ffd2bff2
2021-03-01 16:57:41 +00:00
Amir Sarabadani
4a6d4baaed Migrate recentchanges table to abstract schema
This table is massive but thankfully we fixed most of its complexities
in previous patches.

For MySQL/Sqlite:
 - Change type of rc_title and rc_source from "varchar binary" to
   "varbinary"
 - Drop default of rc_timestamp
 - Change rc_timestamp from varbinary(14) to binary(14) to standardize
   timestamp datatypes

One index doesn't follow the uniform prefix rule but since it's in a
maintenance script, will fix that in a follow up.

Bug: T230428
Bug: T42626
Change-Id: I13994e02ad3a2293148346ef7be96746578ad854
2021-02-26 13:56:56 +01:00
Reedy
2290df4b6b Extend iwlinks.iwl_prefix to VARBINARY(32) on MySQL
Bug: T275242
Change-Id: I19252d2cd167832f7c429c31374e5e2dd3b89726
2021-02-25 01:49:36 +00:00
Ammar Abdulhamid
dbd4dd19f8 Rename site_identifiers indexes to have si_ prefix
Bug: T270033
Change-Id: I6751f0fd992054b61222ece55c83d05d24af9000
2021-01-30 12:19:55 +01:00
Amir Sarabadani
fe6ff91134 Migrate image table to abstract schema
For MySQL/Sqlite:
 - Change datatype of img_name from "varchar() binary" to varbinary.
 - Drop default of img_timestamp

For Postgres:
 - Adding two missing indexes.
 - Renaming two indexes
 - Setting default value for five fields
 - Fix data type of four fields
 - Drop default of img_metadata
 - Make three fields not nullable

Bug: T230428
Bug: T164898
Change-Id: I237af3558b0e1c1fecd874c3c90ba6780e50aaa4
2021-01-28 20:13:03 +01:00
Amir Sarabadani
6e53acb7cc Migrate ipblocks to abstract schema
For MySQL/Sqlite:
 - Drop default values of ipb_timestamp and ipb_expiry as part of
   standardizing timestamp columns

For Postgres:
 - Drop foreign key on two columns as approved by the RFC
 - Set default of ipb_user
 - Make three columns not nullable to sync with MySQL
 - Change data type of two columns to BIGINT to be the same as MySQL

Bug: T230428
Bug: T164898
Bug: T42626
Change-Id: I2c5303d76c6ce059d7fef324a4521c6336c5b1f3
2021-01-23 18:21:24 +00:00
Ammar Abdulhamid
20d1849b53 Migrate objectcache to abstract schema
For Postgres:
 - Drop Unique constraint on `keyname` and make primary key
 - Change type of `value` from BYTEA to TEXT and drop its default
 - Make `value` nullable to sync with MySQL/SQLite

MySQL:
 - Change exptime from DATETIME to TIMESTAMP

MySQL/SQLite:
 - Make 'exptime' not nullable

Bug: T230428
Bug: T164898
Change-Id: Iab9de8a1bb2cb01b6e3e69e66f1bbe089d53d0a7
2021-01-20 13:50:28 +01:00