How It Works

DECtalk Software converts ASCII English-language text into speech output by means of a speech synthesizer. There are two ways to feed text into the speech synthesizer:

·      By calling DECtalk Software API functions from your own application program.

·      By using the DECtalk Dtsample applet user interface (Windows only).

The flow of the DECtalk Software Text-to-Speech Conversion Process is shown below:

[Figure: Flow of the DECtalk Software Text-To-Speech Conversion Process]

Legend

Text is selected for processing by DECtalk Software.

A sentence parser breaks the input stream into separate words and locates some clause boundaries, which are indicated by commas and other punctuation marks as well as by special words loaded in the DECtalk Software internal dictionary. The sentence parser also recognizes and handles any phonemic symbols and commands that you have added to the input text.
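As a rough illustration of the splitting described above, the following Python sketch breaks an input string into words and starts a new clause at each punctuation mark. The function name split_clauses and the punctuation set are assumptions made for this example only. The DECtalk Software sentence parser also uses special dictionary words and handles embedded phonemic symbols and commands, none of which is modeled here.

```python
import re

# Punctuation treated as clause boundaries in this sketch.
# This set is an assumption, not DECtalk Software's actual list.
CLAUSE_MARKS = ",.;:!?"

def split_clauses(text):
    """Return a list of clauses, where each clause is a list of words."""
    clauses = []
    current = []
    # A token is either a run of non-space, non-boundary characters (a word)
    # or a single boundary mark.
    for token in re.findall(r"[^\s,.;:!?]+|[,.;:!?]", text):
        if token in CLAUSE_MARKS:
            # A boundary mark closes the clause collected so far.
            if current:
                clauses.append(current)
                current = []
        else:
            current.append(token)
    # Keep any trailing words that were not followed by punctuation.
    if current:
        clauses.append(current)
    return clauses
```

Because every period is treated as a boundary, this simplified version would also split a decimal number such as 3.14; a real parser needs additional rules for such cases.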

A word parser breaks words into their component parts, dividing them according to their final pronounceable form. Strings of text that do not form pronounceable English words are spelled out letter by letter. A number formatter is used if the text contains numerals. The number formatter applies the rules for many common number formats and converts the numbers into English words.
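The two behaviors mentioned above, converting numerals to words and spelling out unpronounceable strings, can be sketched in Python as follows. The names number_to_words and spell_out, the supported range (0 to 999,999), and the exact wording of the output are assumptions for this example; the number formats that DECtalk Software actually recognizes are not reproduced here.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

def number_to_words(n):
    """Convert an integer in the range 0..999999 to English words."""
    if not 0 <= n <= 999_999:
        raise ValueError("number is outside the range handled by this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        tail = " " + number_to_words(rest) if rest else ""
        return ONES[hundreds] + " hundred" + tail
    thousands, rest = divmod(n, 1000)
    tail = " " + number_to_words(rest) if rest else ""
    return number_to_words(thousands) + " thousand" + tail

def spell_out(text):
    """Spell a string letter by letter, as for an unpronounceable word."""
    return " ".join(text.upper())
```

For example, number_to_words(42) yields "forty-two" and spell_out("xkcd") yields "X K C D".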

A dictionary lookup routine searches the pronunciation dictionaries. DECtalk Software has a built-in dictionary of many commonly used words. It also has a user dictionary that programmers and general users can fill with words specific to an application. This dictionary and how to load it are described in Chapter 3.
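A two-level lookup of this kind might be sketched in Python as shown below. The dictionary contents, the phoneme spellings, and the choice to consult the user dictionary before the built-in one are all assumptions made for illustration; they do not describe the actual DECtalk Software dictionary format or search order.

```python
# Hypothetical dictionaries mapping a word to a phonemic transcription.
# The transcriptions are placeholders, not DECtalk phonemic symbols.
BUILTIN_DICT = {
    "the": "dh ax",
    "speech": "s p iy ch",
}
USER_DICT = {
    "dectalk": "d eh k t ao k",
}

def lookup(word, user_dict=USER_DICT, builtin_dict=BUILTIN_DICT):
    """Return the transcription for a word, or None if it is not listed.

    This sketch assumes that user entries take precedence over built-in ones.
    """
    key = word.lower()
    if key in user_dict:
        return user_dict[key]
    return builtin_dict.get(key)
```

A result of None signals that the word was found in neither dictionary and must be handled by some other means.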

A letter-to-sound module uses a set of English pronunciation rules to assign a phonemic form and a lexical stress pattern to each word not found in the dictionary. See Chapter 3 for more information on modifying the phonemic form of words, and see the DECtalk Software Reference Guide for information on enhancing special voice qualities, such as emphasis and singing.
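The general idea of rule-based letter-to-sound conversion can be shown with a toy Python sketch that scans a word from left to right and applies the longest matching rule. The rule table, the phoneme spellings, and the function name letter_to_sound are invented for this example. Real English pronunciation rules are context-sensitive and far more numerous, and this sketch does not assign lexical stress.

```python
# Toy rule table: letter sequences mapped to placeholder phoneme symbols.
# Two-letter rules are tried before single-letter rules.
RULES = {
    "sh": "sh", "ch": "ch", "th": "th", "ee": "iy", "oo": "uw",
    "a": "ae", "b": "b", "c": "k", "d": "d", "e": "eh", "f": "f",
    "g": "g", "h": "hh", "i": "ih", "j": "jh", "k": "k", "l": "l",
    "m": "m", "n": "n", "o": "aa", "p": "p", "q": "k", "r": "r",
    "s": "s", "t": "t", "u": "ah", "v": "v", "w": "w", "x": "ks",
    "y": "y", "z": "z",
}

def letter_to_sound(word):
    """Return a list of placeholder phoneme symbols for a word."""
    word = word.lower()
    phonemes = []
    i = 0
    while i < len(word):
        for size in (2, 1):  # prefer the longest matching rule
            chunk = word[i:i + size]
            if chunk in RULES:
                phonemes.append(RULES[chunk])
                i += len(chunk)
                break
        else:
            i += 1  # no rule for this character; skip it
    return phonemes
```

With this table, letter_to_sound("sheep") produces the symbols sh, iy, p.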

A phrase structure module recombines all phonemic output from the dictionary search and other modules. Phoneme durations and pitch commands are computed for the clause, and appropriate sound variants are selected for those phonemes that can be pronounced in more than one way.

The phoneme-to-voice module processes clauses passed from the phrase structure module and converts them to control signals for the speech synthesizer. This module converts the phonemes and allophones in each clause into parameters that determine the natural resonant frequencies of the vocal tract (formants) and the sound source amplitudes. The control parameters are sent to the speech synthesizer for output.

The DECtalk Software speech synthesizer computes a speech waveform with acoustic characteristics that are determined by the synthesizer control commands.
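To give a sense of how formant parameters can shape a waveform, the following Python sketch passes an impulse train through a cascade of second-order digital resonators, a standard technique in formant synthesis. It is a generic illustration under stated assumptions, not the DECtalk Software synthesizer: the function names, the sample rate, and the formant values in the usage note are all chosen for the example.

```python
import math

def resonator_coefficients(freq_hz, bandwidth_hz, sample_rate):
    """Coefficients for y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = (2.0 * math.exp(-math.pi * bandwidth_hz * t)
         * math.cos(2.0 * math.pi * freq_hz * t))
    a = 1.0 - b - c  # normalizes the gain at 0 Hz to one
    return a, b, c

def synthesize(formants, pitch_hz, duration_s, sample_rate=11025,
               amplitude=1.0):
    """Return samples for a steady voiced sound.

    formants is a list of (frequency_hz, bandwidth_hz) pairs.
    """
    n = int(duration_s * sample_rate)
    period = max(1, round(sample_rate / pitch_hz))
    # Sound source: one impulse per pitch period.
    signal = [amplitude if i % period == 0 else 0.0 for i in range(n)]
    # Each resonator in the cascade imposes one formant on the signal.
    for freq_hz, bandwidth_hz in formants:
        a, b, c = resonator_coefficients(freq_hz, bandwidth_hz, sample_rate)
        y1 = y2 = 0.0
        filtered = []
        for x in signal:
            y = a * x + b * y1 + c * y2
            filtered.append(y)
            y2, y1 = y1, y
        signal = filtered
    return signal
```

For instance, synthesize([(700, 130), (1220, 70), (2600, 160)], pitch_hz=110, duration_s=0.05) returns a short list of samples for a vowel-like sound. The formant frequencies and bandwidths in that call are illustrative values only.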