TJones, Sep 11 2023, 3:16 PM

Description

T342444 was halted because the reindexing was too slow.

  • Update config with a more efficient interim analysis chain in case any reindexing needs to be done.
  • Refactor recent analysis upgrades (acronyms and camelCase) to be acceptably efficient as custom filters in a new plugin (originally planned for the extra plugin)
    • Enable plugin version-checking in analysis config (so we know we have the new extra plugin)
    • Enable less expensive fallback versions of camelCase and acronym processing for 3rd party users without the new plugin
  • Possibly investigate other slow points in global configs (implement immediately or open new tickets)
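
The plugin check and fallback items in the list above could look roughly like the following sketch. It is written in Python purely for illustration (CirrusSearch itself is PHP), and every name in it (`build_filter_chain`, `NEW_PLUGIN`, and the filter identifiers) is a hypothetical placeholder, not the real config builder or real filter names.

```python
from typing import Iterable, List

# Hypothetical names throughout: the plugin and filter identifiers below are
# placeholders for illustration, not the real CirrusSearch configuration.
NEW_PLUGIN = "new-analysis-plugin"


def build_filter_chain(installed_plugins: Iterable[str]) -> List[str]:
    """Return an ordered list of analysis component names for the chain."""
    plugins = set(installed_plugins)
    if NEW_PLUGIN in plugins:
        # Efficient custom filters provided by the new plugin.
        upgrades = ["acronym_plugin_filter", "camelcase_plugin_filter"]
    else:
        # Less expensive fallback versions for installs without the plugin.
        upgrades = ["acronym_fallback_filter", "camelcase_fallback_filter"]
    # Append a stand-in for the rest of the chain.
    return upgrades + ["lowercase"]
```

The point of the sketch is only that the same upgrades stay enabled either way; installs without the new plugin get a different implementation of them.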

New dependency: We can/should link this with T332337, which also needs a new filter, and put everything in one new plugin.

Details

Repo: search/extra, Branch: master, Lines +/-: +859 -825
Repo: mediawiki/extensions/CirrusSearch
TJones updated the task description.
Gehel set the point value for this task to 13. (Sep 11 2023, 3:39 PM)

Change 957806 had a related patch set uploaded (by Tjones; author: Tjones):

[mediawiki/extensions/CirrusSearch@master] Refactor and Revert Analysis Harmonization

Change 957806 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@master] Refactor and Revert Analysis Harmonization

We previously discussed how to bundle the new filters, but talked about it again today.

Since acronym and camelCase processing aren't language-specific, creating a separate plugin isn't an obvious requirement or even desirable. Moving them into the extra plugin made sense from an architectural point of view, but the added complexity for our own deployment, 3rd party users, and even developers is undesirable. Trying to resolve everything in the config builder by testing for specific WMF versions of plugins (e.g., "v7.10.2-wmf5 or newer") is possibly more complexity than it is worth at the moment.

OTOH, creating and checking for the presence of a new plugin is easy. It is possible that the overhead of many plugins will become a problem in the future, but at the moment there is no evidence of that. For now, our standard operating procedure will be to create a new plugin when we have a batch of new filters to create.
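
To make the trade-off concrete, here is a rough sketch comparing the two kinds of check. It is in Python for illustration only. The version format is an assumption inferred solely from the example string "v7.10.2-wmf5", and the function names and the plugin name used in the usage note are hypothetical.

```python
import re
from typing import Iterable, Optional, Tuple

# Assumed version shape, based only on the example "v7.10.2-wmf5":
# an optional "v", three dotted numbers, then "-wmf" and a build number.
_WMF_VERSION = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)-wmf(\d+)$")


def parse_wmf_version(version: str) -> Optional[Tuple[int, ...]]:
    """Parse a version string into comparable integers, or None if it does not match."""
    match = _WMF_VERSION.match(version)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


def is_at_least(version: str, minimum: str) -> bool:
    """Version check: needs parsing, numeric comparison, and a policy for odd input."""
    parsed = parse_wmf_version(version)
    required = parse_wmf_version(minimum)
    if parsed is None or required is None:
        # Unrecognized versions (e.g. non-WMF builds) are treated as too old.
        return False
    return parsed >= required


def has_plugin(installed_plugins: Iterable[str], name: str) -> bool:
    """Presence check: a single membership test."""
    return name in set(installed_plugins)
```

For example, `is_at_least("7.10.2-wmf12", "v7.10.2-wmf5")` has to compare 12 and 5 as numbers, not as strings, and must decide what to do with a plain "7.10.2", whereas `has_plugin(plugins, "some-new-plugin")` has no such edge cases.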

As a big-picture compromise, it makes sense to work on T332337 (ICU tokenizer repair) before returning to T332342 (folding), and bundle the new filter there with the two here, so that all three new filters can be in one plugin.

Change 965602 had a related patch set uploaded (by Tjones; author: Tjones):

[search/extra@master] Refactor Acronym Fixer Analysis into New Textify Plugin

Change 965603 had a related patch set uploaded (by Tjones; author: Tjones):

[search/extra@master] Refactor CamelCase Analysis into Textify Plugin

Change 965793 had a related patch set uploaded (by Tjones; author: Tjones):

[search/extra@master] Add limited_mapping to Textify Plugin

Change 965575 had a related patch set uploaded (by Tjones; author: Tjones):

[mediawiki/extensions/CirrusSearch@master] Allow Fallback Filters, Config CamelCase Plugin

Change 965576 had a related patch set uploaded (by Tjones; author: Tjones):

[mediawiki/extensions/CirrusSearch@master] Config Acronym Fixer Plugin

Change 967912 had a related patch set uploaded (by Tjones; author: Tjones):

[mediawiki/extensions/CirrusSearch@master] Allow limited_mapping when textify plugin is present

Change 965575 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@master] Allow Fallback Filters, Config CamelCase Plugin

Change 965576 merged by Tjones:

[mediawiki/extensions/CirrusSearch@master] Config Acronym Fixer Plugin

Mediawiki.

Highlights:

Change 967912 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@master] Allow limited_mapping when textify plugin is present

Change 965602 merged by jenkins-bot:

[search/extra@master] Refactor Acronym Fixer Analysis into New Textify Plugin

Change 965603 merged by jenkins-bot:

[search/extra@master] Refactor CamelCase Analysis into Textify Plugin

Change 965793 merged by jenkins-bot:

[search/extra@master] Add limited_mapping to Textify Plugin