WikiTextStructure: Also extract figcaption elements as captions

The figcaption element appears to be many years old, but was not
previously treated as a caption by the search engine. Some integration
tests recently started failing because content that was previously
wrapped in an element with the thumbcaption CSS class is now inside a
figcaption element.

Update the code that extracts captions from the text into a separate
field to also extract the figcaption, as it has the same purpose as
thumbcaption.
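
As a rough illustration of the idea, here is a minimal sketch of splitting
auxiliary text out of rendered HTML using a list of selectors. It is not
the actual WikiTextStructure code: the helper names selectorToXPath() and
splitAuxiliaryText() are hypothetical, and it assumes only bare tag
names and single ".class" selectors need to be supported.

```php
<?php
// Hypothetical sketch; not the real WikiTextStructure implementation.

// Translate a very small subset of CSS (".class" or "tag") to XPath.
function selectorToXPath( string $selector ): string {
	if ( $selector[0] === '.' ) {
		$class = substr( $selector, 1 );
		return "//*[contains(concat(' ', normalize-space(@class), ' '), ' $class ')]";
	}
	return '//' . $selector;
}

// Remove every element matching a selector from the document and
// return the remaining main text plus the extracted auxiliary text.
function splitAuxiliaryText( string $html, array $selectors ): array {
	$doc = new DOMDocument();
	// libxml reports HTML5 tags such as <figure> as errors; collect them quietly.
	libxml_use_internal_errors( true );
	$doc->loadHTML( '<div>' . $html . '</div>' );
	libxml_clear_errors();

	$xpath = new DOMXPath( $doc );
	$auxiliary = [];
	foreach ( $selectors as $selector ) {
		$expr = selectorToXPath( $selector );
		// Re-query after each removal so nested matches are not counted twice.
		while ( ( $node = $xpath->query( $expr )->item( 0 ) ) !== null ) {
			$auxiliary[] = trim( preg_replace( '/\s+/', ' ', $node->textContent ) );
			$node->parentNode->removeChild( $node );
		}
	}

	$body = $xpath->query( '//body' )->item( 0 );
	$main = trim( preg_replace( '/\s+/', ' ', $body->textContent ) );
	return [ 'main' => $main, 'auxiliary' => $auxiliary ];
}

$html = '<p>Main text.</p>'
	. '<figure><img src="x.png"><figcaption>A caption</figcaption></figure>'
	. '<div class="thumb"><div class="thumbcaption">Old caption</div></div>';
$result = splitAuxiliaryText( $html, [ '.thumbcaption', 'figcaption' ] );
```

With 'figcaption' in the selector list, captions from both the older
thumbcaption markup and the newer figure markup end up in the auxiliary
field rather than in the main text.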

Change-Id: I2a4a309e58602281d6cca65744036efb4a5ce5b5
Erik Bernhardson 2023-03-23 08:54:29 -07:00
parent 2034de32ed
commit 44c3a886af

@@ -52,6 +52,7 @@ class WikiTextStructure {
 	private $auxiliaryElementSelectors = [
 		// Thumbnail captions aren't really part of the text proper
 		'.thumbcaption',
+		'figcaption',
 		// Neither are tables
 		'table',
 		// Common style for "See also:".