Best practice document: Extracting data for TTS and a "reader mode" #69

Text-to-speech (TTS) is among the most popular features in reading apps and is slowly becoming a must-have feature in Web browsers as well.

Despite the popularity and usefulness of TTS, there is no best practice document that guides developers on how to implement this feature.
The group working on accessibility for fixed-layout (FXL) publications has also identified a second use for extracting text from an FXL resource: providing a "reader mode" of the current page/spread that lets users adjust the text and layout to their needs.

For both TTS and a reader mode, reading systems need guidance about the way they should extract data from XHTML to build these alternate renderings:

- using accessibility metadata to infer what might be possible (`accessModeSufficient`, `readingOrder`, `alternativeText`, `longDescription`)
- walking the DOM to create an alternate tree-like structure
- rules to extract context (language for example) and semantics (HTML and ARIA) that will be relevant for these alternate renderings
- recommendations for building the utterances passed to the TTS engine: either breaking longer text down into multiple utterances (a paragraph split into sentences) or merging multiple text nodes to re-create a full utterance (a single sentence divided into multiple strings in an FXL resource)
- skippability and escapability rules
- building a reader mode view from that tree-like structure
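
To make the DOM-walking, language-context, and utterance steps above more concrete, here is a minimal sketch in Python. It is an illustration only, not a proposed algorithm: the `Utterance` type, the `extract_utterances` function, the `BLOCKS` and `IGNORED` sets, and the naive sentence-splitting regex are all assumptions made for this example. It also assumes well-formed XHTML, that adjacent inline fragments already carry their own whitespace, and that language only needs to be tracked at block level.

```python
import re
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional

XHTML_NS = "{http://www.w3.org/1999/xhtml}"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Assumption: a simplified set of elements treated as utterance containers.
BLOCKS = {"p", "h1", "h2", "h3", "h4", "h5", "h6", "li", "blockquote", "figcaption"}
# Assumption: elements whose content is never read aloud.
IGNORED = {"script", "style"}


@dataclass
class Utterance:
    text: str            # one sentence, ready to hand to a TTS engine
    lang: Optional[str]  # language inherited from the nearest ancestor
    tag: str             # local name of the block the sentence came from


def extract_utterances(xhtml: str) -> List[Utterance]:
    """Walk the document and return sentence-level utterances in DOM order."""
    root = ET.fromstring(xhtml)
    utterances: List[Utterance] = []

    def walk(element: ET.Element, inherited_lang: Optional[str]) -> None:
        # xml:lang takes priority over lang; otherwise inherit from the parent.
        lang = element.get(XML_LANG) or element.get("lang") or inherited_lang
        name = element.tag.replace(XHTML_NS, "")
        if name in IGNORED:
            return
        if name in BLOCKS:
            # Merge every descendant text node, then normalize whitespace, so a
            # sentence split across several spans becomes one string again.
            merged = " ".join("".join(element.itertext()).split())
            # Naive split after ., ! or ? followed by whitespace.
            for sentence in re.split(r"(?<=[.!?])\s+", merged):
                if sentence:
                    utterances.append(Utterance(sentence, lang, name))
            return
        for child in element:
            walk(child, lang)

    walk(root, None)
    return utterances
```

Calling `extract_utterances` on a page yields a flat, ordered list that could feed a TTS queue or serve as the starting point for a reader mode view. A real implementation would need much more than this sketch covers: inline language changes, ARIA roles, skippability and escapability, image alternatives, and proper sentence segmentation for languages that do not use `.`, `!` or `?`.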
