Speech Processing

Speech Processing:
 Review of DSP Concepts
 Review of Probability and Stochastic Processes
 Anatomy and Physiology of the Speech Production System
 Phonemics and Phonetics
 Spectrogram Reading
 Linear Prediction Analysis
 Speech Coding and Compression
 Speech Synthesis (Text to Speech)
 Speech Quality Assessment (Subjective and Objective)
 Speech Recognition (Speech to Text)
 Speech Enhancement

Speech Processing:

Marking Scheme:
 Homework: 10%
 Projects: 15%
 Quizzes: 20%
 Midterm: 25%
 Final Exam: 30%

Speech Processing:

Text:
 Spoken Language Processing
 Huang, Acero, Hon, 2000
 Introduction to Digital Speech Processing
 Lawrence R. Rabiner and Ronald W. Schafer, 2007
 Discrete-Time Processing of Speech Signals
 Deller, Proakis, Hansen, 1993
 Fundamentals of Speech Recognition
 Rabiner, Juang, 1993

Password for any documents for the course:
 40967fall95

Aristotle:
Man is a speaking animal.

Old Speech Synthesizers
– Speech organ of Wheatstone, based on a system proposed by Wolfgang
von Kempelen in 1791
Old Speech Synthesizers
(cont’d)
– Speech organ of Joseph Faber (1830-40)
Old Speech Synthesizers
(cont’d)
– Voder demonstrated in 1939
Source: http://www.ling.su.se/staff/hartmut/kemplne.htm
More modern labs
(ICP lab in Grenoble, France)
– Study of facial movements for use in speech synthesis (and recognition).
Communication via Spoken Language
Virtues of Spoken Language
Natural: Requires no special training
Flexible: Leaves hands and eyes free
Efficient: Has a high data rate
Economical: Communicated inexpensively
Expressive: Conveys more than just words
Popular/preferred: Verbal-acoustic problem solving
Much longer evolution, compared to written language

Virtues of Spoken Language

Speech interfaces are ideal for information access and management when:
 The information space is broad and complex,
 The users cannot, may not, or prefer not to use their eyes to read text messages,
 The users are technically naive, or
 Only telephones are available.

Diverse Sources of Constraint for Spoken Language Communication
Acoustic: human vocal tract
Phonetic: let us pray / lettuce spray
Phonological: gas shortage / fish sandwich
Phonotactic: sprachst (German)
Syntactic: I am flying to Chicago tomorrow / tomorrow I flying Chicago am to
Semantic: Is the baby crying / Is the bay bee crying
Contextual: It is easy to recognize speech / It is easy to wreck a nice beach
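
To make the syntactic constraint concrete, here is a minimal sketch of how word-order knowledge can separate "I am flying to Chicago tomorrow" from "tomorrow I flying Chicago am to". The slides do not say how such constraints are modeled; the add-one smoothed bigram model below is one common choice, swapped in purely for illustration. The three-sentence corpus and the names `CORPUS`, `bigrams`, and `log_prob` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy corpus, invented for this illustration only.
CORPUS = [
    "i am flying to chicago tomorrow",
    "i am flying to boston today",
    "tomorrow i am going to chicago",
]


def bigrams(words):
    """Return adjacent word pairs, with sentence start/end markers added."""
    padded = ["<s>"] + words + ["</s>"]
    return list(zip(padded, padded[1:]))


# Count how often each word pair, and each left-hand word, occurs.
bigram_counts = Counter()
context_counts = Counter()
for sentence in CORPUS:
    pairs = bigrams(sentence.split())
    bigram_counts.update(pairs)
    context_counts.update(prev for prev, _ in pairs)

# Words that can follow a context: every corpus word plus the end marker.
VOCAB = {word for sentence in CORPUS for word in sentence.split()} | {"</s>"}


def log_prob(sentence):
    """Add-one smoothed bigram log-probability of a sentence."""
    total = 0.0
    for prev, word in bigrams(sentence.lower().split()):
        numerator = bigram_counts[(prev, word)] + 1
        denominator = context_counts[prev] + len(VOCAB)
        total += math.log(numerator / denominator)
    return total


good = log_prob("I am flying to Chicago tomorrow")
bad = log_prob("tomorrow I flying Chicago am to")
print("grammatical order:", good)
print("scrambled order:  ", bad)
```

Both sentences contain the same words, so the acoustic evidence for each word would be similar; only the word-order scores differ, and the grammatical order receives the higher score. A recognizer could combine such a score with acoustic scores to prefer the well-formed hypothesis.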
A Conversational System Architecture
Demo: Conversational Interface

Jupiter weather information system
 Accessed by telephone
 Covers 500 cities worldwide
 Harvests weather information from the Web several times daily