From https://www.w3.org/International/questions/qa-bidi-controls.en.html
Unicode control codes are not useful for bidi formatting when working with structural or paragraph-level markup.
For inline content we recommend that, wherever possible, you use markup in HTML and XML, rather than the Unicode control characters.
One practical reason is that these control characters end up in text that users copy and paste, usually without noticing, and they then break string matching logic, e.g. T27277.
Bidi controls include many different characters (LRI, RLI, FSI, LRE, RLE, LRO, RLO, PDI, PDF), all of which should be avoided. The main places that generate them in MediaWiki are:
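To illustrate the string matching problem, here is a minimal Python sketch. The names `BIDI_CONTROLS` and `strip_bidi_controls` are hypothetical helpers made up for this example and are not part of MediaWiki. It shows that a copied string carrying an invisible LRM no longer compares equal to the plain string until the controls are removed.

```python
# Unicode bidi control characters: the three implicit marks plus the
# embedding, override and isolate controls. The dict and the helper
# below are illustrative assumptions, not MediaWiki code.
BIDI_CONTROLS = {
    "\u200e": "LRM",  # LEFT-TO-RIGHT MARK
    "\u200f": "RLM",  # RIGHT-TO-LEFT MARK
    "\u061c": "ALM",  # ARABIC LETTER MARK
    "\u202a": "LRE",  # LEFT-TO-RIGHT EMBEDDING
    "\u202b": "RLE",  # RIGHT-TO-LEFT EMBEDDING
    "\u202c": "PDF",  # POP DIRECTIONAL FORMATTING
    "\u202d": "LRO",  # LEFT-TO-RIGHT OVERRIDE
    "\u202e": "RLO",  # RIGHT-TO-LEFT OVERRIDE
    "\u2066": "LRI",  # LEFT-TO-RIGHT ISOLATE
    "\u2067": "RLI",  # RIGHT-TO-LEFT ISOLATE
    "\u2068": "FSI",  # FIRST STRONG ISOLATE
    "\u2069": "PDI",  # POP DIRECTIONAL ISOLATE
}


def strip_bidi_controls(text: str) -> str:
    """Return text with every bidi control character removed."""
    return "".join(ch for ch in text if ch not in BIDI_CONTROLS)


# What a user might end up with after copying a title that had an LRM appended.
copied = "title\u200e"

print(copied == "title")                       # False
print(strip_bidi_controls(copied) == "title")  # True
```

Stripping on input is only a workaround; not emitting the characters in the first place avoids the problem entirely.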
- https://codesearch.wmcloud.org/deployed/?q=getDirMarkEntity
- https://codesearch.wmcloud.org/deployed/?q=getDirMark
We should reduce their use and eventually, if no plain-text uses of these methods remain, deprecate and remove them from MediaWiki, or at least add to their documentation that they should be avoided most of the time.
To help developers who are not familiar with bidi: what is the purpose of such things in MediaWiki in the first place? Consider the following, where $title can be a username or a page title:
abc $title (123)
If $title = 'title'; this will be turned into,
abc title (123)
but if $title = '1شسی'; the very same code results in
abc 1شسی (123)
Note where 123 has gone: between the parts of the title string! It is still logically after $title, but it is displayed inside the title because the bidi algorithm is misled. To solve this in plain text, one can add an LRM, which results in,
abc ۱شسی (123)
This contains the hidden character, and if one copies the title, the copy will contain that character too, which is undesirable. A better solution is to use <bdi>, which results in,
abc <bdi>1شسی</bdi> (123)
This is good, but the placement of the 1 has changed, so an even better solution is to add dir="ltr" (the LTR or RTL here should match the site or user language direction depending on the context: for page titles on monolingual wikis, the wiki's content direction is the better choice; for usernames, however, not enforcing any direction and letting the default direction apply can be better):
abc <bdi dir="ltr">1شسی</bdi> (123)
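The markup above can be produced by a small helper. The following is a sketch in Python rather than MediaWiki's PHP, and `wrap_bdi` is a hypothetical name chosen for this example. It escapes the user-supplied text and optionally sets the dir attribute.

```python
from html import escape
from typing import Optional


def wrap_bdi(text: str, direction: Optional[str] = None) -> str:
    """Wrap user-supplied text in <bdi>, optionally forcing a direction.

    Hypothetical helper for illustration only. The text is HTML-escaped
    so that wrapping it does not open an XSS hole.
    """
    if direction not in (None, "ltr", "rtl", "auto"):
        raise ValueError(f"unsupported direction: {direction!r}")
    attr = f' dir="{direction}"' if direction else ""
    return f"<bdi{attr}>{escape(text)}</bdi>"


title = "1شسی"
print(f"abc {wrap_bdi(title)} (123)")         # abc <bdi>1شسی</bdi> (123)
print(f"abc {wrap_bdi(title, 'ltr')} (123)")  # abc <bdi dir="ltr">1شسی</bdi> (123)
```

Unlike the LRM approach, the isolation lives in the markup, so copying the rendered title yields the bare string.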
For example, in https://gerrit.wikimedia.org/g/mediawiki/extensions/ProofreadPage/+/c30b1e384ab694161fa10665636eb3f2f4c4349a/includes/Special/SpecialProofreadPages.php#357 just wrapping $plink with <bdi> should be enough.
Generally, to understand and review these changes, keep this rule of thumb in mind: whenever user-generated content such as a username, page title, summary or page content appears on the same line as text and messages that are part of MediaWiki's software UI, that user-generated content should be wrapped with <bdi>. It is like XSS and SQL injection, but for bidi, and the antidote is appropriate wrapping with the HTML <bdi> tag. (This is a simplification that leaves some details out.)
Also related are the W3C's suggestions to prefer HTML tags and attributes over CSS styles, from https://drafts.csswg.org/css-writing-modes-3/:
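The analogy with parameterized SQL queries can be sketched as follows: the UI message is trusted text, and every user-supplied value is escaped and isolated before substitution. This is a Python illustration only; `render_message` and its `$name` placeholder syntax are assumptions for this example and do not reflect MediaWiki's message system.

```python
from html import escape
from string import Template


def render_message(template: str, **user_values: str) -> str:
    """Substitute user-generated values into a trusted UI template.

    Hypothetical helper: each value is HTML-escaped (the XSS antidote)
    and wrapped in <bdi> (the bidi antidote) before substitution.
    The template itself is assumed to be trusted software UI text.
    """
    wrapped = {
        name: f"<bdi>{escape(value)}</bdi>"
        for name, value in user_values.items()
    }
    return Template(template).substitute(wrapped)


print(render_message("abc $title (123)", title="1شسی"))
# abc <bdi>1شسی</bdi> (123)
```

Doing the wrapping in one place, at substitution time, means individual call sites cannot forget it, which is the same reason parameterized queries are preferred over manual escaping.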
Because HTML UAs can turn off CSS styling, we recommend HTML authors to use the HTML dir attribute and <bdo> element to ensure correct bidirectional layout in the absence of a style sheet. Authors should not use direction in HTML documents.
Because HTML UAs can turn off CSS styling, we recommend HTML authors to use the HTML dir attribute, <bdo> element, and appropriate distinction of text-level vs. grouping-level HTML element types to ensure correct bidirectional layout in the absence of a style sheet. Authors should not use unicode-bidi in HTML documents.
Change #1077067 had a related patch set uploaded (by Ebrahim; author: Ebrahim):
[mediawiki/core@master] Mark now unused getDirMarkEntity as deprecated
Change #1077072 had a related patch set uploaded (by Ebrahim; author: Ebrahim):
[mediawiki/core@master] Replace use of getDirMark with <bdi> tag
Change #1077078 had a related patch set uploaded (by Ebrahim; author: Ebrahim):
[mediawiki/core@master] Replace use of getDirMark with <bdi> in Special:DoubleRedirects
Change #1077078 merged by jenkins-bot:
[mediawiki/core@master] Replace use of getDirMark with <bdi> in Special:DoubleRedirects
Change #1077072 merged by jenkins-bot:
[mediawiki/core@master] Replace use of getDirMark with <bdi> tag in ProtectedPagesPager
Change #1077445 had a related patch set uploaded (by Ebrahim; author: Ebrahim):
[mediawiki/core@master] Use a better bidi aware in CommentParser
Change #1077705 had a related patch set uploaded (by Ebrahim; author: Ebrahim):
[mediawiki/core@master] Use HTML markup instead of bidi control chars in Special:NewPages
https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1077705 and am writing again here,
Generally, to understand and review these changes, keep this rule of thumb in mind: whenever user-generated content such as a username, page title, summary or page content appears on the same line as text and messages that are part of MediaWiki's software UI, that user-generated content should be wrapped with <bdi>. It is like XSS and SQL injection, but for bidi, and the antidote is appropriate wrapping with the HTML <bdi> tag. (This is a simplification that leaves some details out.)
Change #1077718 had a related patch set uploaded (by Ebrahim; author: Ebrahim):
[mediawiki/core@master] Use HTML markup instead of bidi control chars in Special:ShortPages
Change #1077739 had a related patch set uploaded (by Ebrahim; author: Ebrahim):
[mediawiki/core@master] Use HTML markup instead of bidi control chars in action=history
Change #1077739 merged by jenkins-bot:
[mediawiki/core@master] Use HTML markup instead of bidi control chars in action=history
Change #1077718 merged by jenkins-bot:
[mediawiki/core@master] Use HTML markup instead of bidi control chars in Special:ShortPages
Change #1077705 merged by jenkins-bot:
[mediawiki/core@master] Use HTML markup instead of bidi control chars in Special:NewPages
Change #1077067 merged by jenkins-bot:
[mediawiki/core@master] Mark now unused getDirMarkEntity as deprecated
Change #1077771 had a related patch set uploaded (by Ebrahim; author: Ebrahim):
[mediawiki/core@master] Use HTML markup instead of bidi control chars in Special:WhatLinksHere
Change #1077775 had a related patch set uploaded (by Ebrahim; author: Ebrahim):
[mediawiki/core@master] Use HTML markup instead of bidi control chars in Special:ListRedirects
Change #1077778 had a related patch set uploaded (by Ebrahim; author: Ebrahim):
[mediawiki/core@master] Use HTML markup instead of bidi control chars in revision delete
Change #1077790 had a related patch set uploaded (by Ebrahim; author: Ebrahim):
[mediawiki/core@master] Use HTML markup instead of bidi control chars in wiki changes
Change #1077796 had a related patch set uploaded (by Ebrahim; author: Ebrahim):
[mediawiki/core@master] Mark getDirMark as deprecated
Change #1077771 merged by jenkins-bot:
[mediawiki/core@master] Use HTML markup instead of bidi control chars in Special:WhatLinksHere
Change #1077775 merged by jenkins-bot:
[mediawiki/core@master] Use HTML markup instead of bidi control chars in Special:ListRedirects
Change #1077778 merged by jenkins-bot:
[mediawiki/core@master] Use HTML markup instead of bidi control chars in revision delete
Change #1077790 merged by jenkins-bot:
[mediawiki/core@master] Use HTML markup instead of bidi control chars in wiki changes
Change #1077796 merged by jenkins-bot:
[mediawiki/core@master] Mark getDirMark as deprecated
Change #1077445 merged by jenkins-bot:
[mediawiki/core@master] Use a better bidi aware markup in CommentParser
Change #1077945 had a related patch set uploaded (by Ebrahim; author: Ebrahim):
[mediawiki/core@master] Remove getDirMark the same way as RevDelLogItem
Change #1077945 merged by jenkins-bot:
[mediawiki/core@master] Remove getDirMark the same way as RevDelLogItem
Change #1077995 had a related patch set uploaded (by Ebrahim; author: Ebrahim):
[mediawiki/core@master] Wrap category link around <bdi>
Change #1077997 had a related patch set uploaded (by Ebrahim; author: Ebrahim):
[mediawiki/extensions/CategoryTree@master] Replace use of getDirMark with HTML markup
Change #1077995 merged by jenkins-bot:
[mediawiki/core@master] Wrap category links around <bdi>
Change #1077997 merged by jenkins-bot:
[mediawiki/extensions/CategoryTree@master] Replace use of getDirMark with HTML markup
Change #1078045 had a related patch set uploaded (by Ebrahim; author: Ebrahim):
[mediawiki/core@master] Use <bdi> in Language::specialList
Change #1078099 had a related patch set uploaded (by Ebrahim; author: Ebrahim):
[mediawiki/core@master] Deprecate embedBidi the same way as getDirMark
Change #1078100 had a related patch set uploaded (by Ebrahim; author: Ebrahim):
[mediawiki/extensions/GrowthExperiments@master] Replace embedBidi with <bdi> HTML tag
Change #1078101 had a related patch set uploaded (by Ebrahim; author: Ebrahim):
[mediawiki/core@master] Aggregate unicode directional formatting characters
Change #1078045 merged by jenkins-bot:
[mediawiki/core@master] Use <bdi> in Language::specialList
Change #1078099 merged by jenkins-bot:
[mediawiki/core@master] Deprecate embedBidi the same way as getDirMark
Change #1078100 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] Replace embedBidi with <bdi> HTML tag
Change #1078348 had a related patch set uploaded (by Ebrahim; author: Ebrahim):
[mediawiki/core@master] Use content dir in Special:DoubleRedirects
Change #1078101 merged by jenkins-bot:
[mediawiki/core@master] Move definition of all bidi control characters to one place
Change #1078348 merged by jenkins-bot:
[mediawiki/core@master] Use content dir in DoubleRedirects and ShortPages special pages
Change #1079218 had a related patch set uploaded (by Hashar; author: Hashar):
[mediawiki/core@wmf/1.43.0-wmf.26] Revert "Use HTML markup instead of bidi control chars in wiki changes"
Change #1079218 merged by Aklapper:
[mediawiki/core@wmf/1.43.0-wmf.26] Revert "Use HTML markup instead of bidi control chars in wiki changes"
Mentioned in SAL (#wikimedia-operations) [2024-10-10T09:07:49Z] <aklapper@deploy2002> Finished scap sync-world: Backport for [[gerrit:1079218|Revert "Use HTML markup instead of bidi control chars in wiki changes" (T375975 T376814)]] (duration: 12m 09s)