- GTV Creations


GUIDE
Mr.R.Balamurugan.M.E.,
Assistant Professor,
Dept of CSE, APEC.
TEAM
Suresh.C
Tamilarasan.C
Thiruvikraman.G
Speech recognition is a challenging problem in
Artificial Intelligence. It is a solved problem in some
restricted settings, for example when spoken words are limited
to a small vocabulary. In such cases speech recognition
systems are already in common use, e.g. for command and
name recognition. Error-free recognition of unrestricted
continuous speech remains a difficult and unsolved
problem. To provide flexibility in recognition, the system
needs an internal self-learning program that adapts to
voice modulation by listening to and updating different
pronunciations.
Today speech recognition technology finds broad
applications in telephone and software-based systems.
Research and industrial sectors credit rapid advances in
the technology for increasingly robust speech recognition
products. Popular search engines can search the text of
millions of Web pages to produce results; it would be far
better if they could run a similar search over the millions of
recorded interviews and conversations that frequently appear
on news channels and talk shows.
Although speech recognition technology has met
with much success, companies like Kyocera have repeatedly
highlighted its failures in this area, and providing
customer service through a speech recognition system
remains a very challenging task.
1. Artificial Intelligence : the study and design of intelligent
agents; the term also describes a property of machines or programs.
2. Speech Recognition : the process of converting a speech
signal to a sequence of words by means of an algorithm
implemented as a computer program.
3. Natural Language Processing : an area of artificial
intelligence concerned with the processing of natural language.
4. Machine Learning : a branch of artificial intelligence
concerned with the construction and study of systems that can
learn from data.
5. Neural Networks : models that simulate the behavior of the
brain using interconnected abstractions of real neurons.
1. Recognition is limited to a small vocabulary
(recognition is restricted to avoid confusion).
2. Functions are static.
3. The system may get confused and process multiple commands
when it is placed in a noisy area.
1. To provide a customized speech recognition
system.
2. To provide the system with manual learning to avoid
vocabulary limitations and constraints.
3. To recognize speech by differentiating it from background
noise based on frequency depth.
PROBLEM DEFINITION
The problem is how to provide flexibility in recognition,
so as to obtain reliable information, using a learning
mechanism based on perception.
PROBLEMS TO SOLVE
1. Homonyms : words that sound the same but have different
meanings, for example “hear” and “here”.
2. Context : on what topic is the speaker focusing?
3. Accents : spoken languages differ regionally at a global
level, and the same person may pronounce the same words
differently in different emotional states.
4. Environmental interference : it is difficult to
communicate effectively in a noisy room or a busy office.
Hardware Requirements
Computer connected with Speaker and Microphone
160GB Hard Disk
2GB RAM
2GHz Clock Speed
Intel® Core (TM) 2 Duo
14 Inch color monitor
Software Requirements
Windows XP /7
C#.NET & MS-Speech SDK 5.1
SQL 2005 Database
This section discusses the diagrams used for the
implementation of the project.
The diagrams used here are
Use Case diagram
Sequence diagram
Activity diagram
Data Flow diagram
The system contains the following modules
Login and Register Users
Task and Data Set
Data Representation
Initializing Recognition & Processing
In computer security, a login or logon refers to the
credentials required to obtain access to a computer system
or other restricted area.
Logging in (or on) and signing in (or on) is the process by
which individual access to a computer system is controlled
by identifying and authenticating the user through the
credentials the user presents in order to access that
system (e.g., a computer or a website).

It is an integral part of computer security procedures.
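As a minimal sketch of the register-then-login flow, the Python below stores a salted password hash at registration and verifies it at login. The function names, the in-memory users dictionary, and the iteration count are assumptions for illustration only; the project itself is built on C#.NET with a SQL 2005 database.

```python
import hashlib
import hmac
import os

# In-memory stand-in for the user table (hypothetical; the project uses a database).
users = {}

def _hash_password(password, salt):
    # PBKDF2 with SHA-256; the iteration count is an illustrative choice.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

def register(username, password):
    """Create a new account; return False if the name is already taken."""
    if username in users:
        return False
    salt = os.urandom(16)
    users[username] = (salt, _hash_password(password, salt))
    return True

def login(username, password):
    """Authenticate by recomputing the hash and comparing in constant time."""
    record = users.get(username)
    if record is None:
        return False
    salt, stored = record
    return hmac.compare_digest(stored, _hash_password(password, salt))
```

Storing only the salt and the derived hash means the plain password never has to be kept, which is the usual reason for this design.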

In mathematical logic and theoretical computer
science a register machine is a generic class of abstract
machines used in a manner similar to a Turing machine.
SCREENSHOTS
LOGIN
REGISTRATION
The two tasks provide the basis for the design of an information
technology framework that is capable of identifying and
disseminating information.
SENTENCE IDENTIFICATION - The first task identifies
and extracts informative topics.
RELATION IDENTIFICATION - The second task performs
a faster, fine-grained classification of these sentences
according to the semantic relations that exist between
words and their occurrences.
Sentence identification is similar to scanning the
sentences in the abstract of an article in order to present
to the user only the sentences that are identified
as containing relevant information.
Relation identification has a deeper semantic dimension and
focuses on complex relations in the sentences already selected
as informative. The approach used to solve the two proposed
tasks is based on NLP and ML techniques. In a standard
supervised ML setting, a training set and a test set are
required. The training set is used to train the ML algorithm
and the test set to test its performance.
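The training/test division can be sketched as follows. This is only an illustration: the helper name, the 25% test fraction, and the labeled sentences are made up for the example and are not taken from the project's data set.

```python
import random

def train_test_split(samples, test_fraction=0.25, seed=0):
    """Shuffle labeled samples and split them into (train, test) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical labeled sentences: (text, 1 if informative else 0)
data = [
    ("open the file", 1),
    ("um well you know", 0),
    ("close the window", 1),
    ("so anyway", 0),
    ("play the next song", 1),
    ("right okay", 0),
    ("search for speech recognition", 1),
    ("hmm let me think", 0),
]
train, test = train_test_split(data)
```

Keeping the test sentences out of training is what makes the measured performance an estimate for unseen input.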
BAG-OF-WORDS REPRESENTATION
The bag-of-words (BOW) representation is
commonly used for text classification tasks. It is a
representation in which features are chosen among the
words that are present in training data.
Selection techniques are used in order to identify the
most suitable words as features.
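A small sketch of the bag-of-words idea follows. Selecting features by raw frequency is an assumption made here for brevity; the slides do not say which selection technique the project uses, and the training sentences are invented.

```python
from collections import Counter

def build_vocabulary(training_texts, max_features=None):
    """Pick features among the words present in the training data,
    most frequent first (a simple frequency-based selection)."""
    counts = Counter(word for text in training_texts for word in text.lower().split())
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return ranked[:max_features]

def bag_of_words(text, vocabulary):
    """Map a text to a vector of word counts over the vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

training = ["open the file", "close the file", "open the window"]
vocab = build_vocabulary(training, max_features=4)
vector = bag_of_words("open the file and the window", vocab)
```

Words outside the selected vocabulary ("and", "window" in this case) are simply ignored, and word order is lost, which is what the name "bag" refers to.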
NLP CONCEPT REPRESENTATION
A lemma, in the NLP sense, is the base (dictionary) form
of a word. We choose to use lemmas because a word can have many
inflected forms, and lemmatization gives
the same base form for all of them. Another
reason is to reduce the data sparseness problem.
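The effect of mapping inflected forms to one base form can be sketched with a lookup table. The table and function names below are hypothetical; a real system would rely on a full morphological dictionary rather than a handful of entries.

```python
# Hypothetical lookup table from inflected forms to base forms.
LEMMAS = {
    "likes": "like", "liked": "like", "liking": "like",
    "dogs": "dog", "men": "man",
    "is": "be", "are": "be", "was": "be",
}

def lemmatize(word):
    """Return the base form of a word, or the lowercased word itself if unknown."""
    word = word.lower()
    return LEMMAS.get(word, word)

def lemmatize_text(text):
    """Lemmatize every whitespace-separated word of a text."""
    return [lemmatize(w) for w in text.split()]
```

Because "likes", "liked" and "liking" all collapse to "like", they count as one feature instead of three, which is how lemmatization eases data sparseness.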
Set your command
The user can initialize recognition and perform processing
based on the predefined data or queries that the user has
manually assigned in the database.
This module is based on neural networks, which are used for
preprocessing of features, such as transformation and
dimensionality reduction.
Neural networks emerged as an attractive acoustic
modeling approach in speech recognition (SR). They have been
used in many aspects of speech recognition such as phoneme
classification, isolated word recognition, and speaker adaptation.
In contrast to HMMs, neural networks make no
assumptions about the statistical properties of features and
have several qualities that make them attractive models
for recognizing speech in a natural and efficient manner.
SPEECH ANALYSIS
1. Morphological analysis
2. Syntactic analysis
3. Semantic analysis
The morphological analysis splits a sentence into
words. Each word is looked up in a dictionary in order to
determine its word class and inflection. This information is
used as input for the syntactic analysis.
The syntactic analysis checks that the words form a
legal sentence. The rules for which sentences are legal in the
language are expressed in the form of a grammar. In this
project, so-called extended state diagrams are used for this
purpose. They are also called ATN grammars (ATN is an
abbreviation of Augmented Transition Network). The result
of the syntactic analysis is input for the semantic analysis.
The semantic analysis determines the meaning of a
sentence, among other things by looking up the meaning of
the individual words of the sentence.
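The three analysis stages can be sketched on the sentence "The dog likes a man". Everything named here is an assumption for illustration: the mini-lexicon, the state names, and the plain transition network, which is a simplification and not a full ATN grammar (it has no registers or recursive sub-networks).

```python
# Hypothetical mini-lexicon: word -> (word class, base form)
LEXICON = {
    "the": ("DET", "the"), "a": ("DET", "a"),
    "dog": ("NOUN", "dog"), "man": ("NOUN", "man"),
    "likes": ("VERB", "like"),
}

# Transition network for sentences of the form DET NOUN VERB DET NOUN.
# Keys are (state, word class); values are the next state.
TRANSITIONS = {
    ("S0", "DET"): "S1", ("S1", "NOUN"): "S2",
    ("S2", "VERB"): "S3",
    ("S3", "DET"): "S4", ("S4", "NOUN"): "END",
}

def morphological_analysis(sentence):
    """Split the sentence into words and look each one up in the lexicon."""
    tokens = []
    for word in sentence.lower().split():
        if word not in LEXICON:
            raise KeyError(f"unknown word: {word}")
        word_class, base = LEXICON[word]
        tokens.append((word, word_class, base))
    return tokens

def syntactic_analysis(tokens):
    """Return True if the word classes trace a path from S0 to END."""
    state = "S0"
    for _, word_class, _ in tokens:
        state = TRANSITIONS.get((state, word_class))
        if state is None:
            return False
    return state == "END"

def semantic_analysis(tokens):
    """Build a crude subject/predicate/object reading from the base forms."""
    nouns = [base for _, cls, base in tokens if cls == "NOUN"]
    verbs = [base for _, cls, base in tokens if cls == "VERB"]
    return {"subject": nouns[0], "predicate": verbs[0], "object": nouns[1]}
```

Each stage consumes the output of the previous one, mirroring the chain described above: the token list from the morphological step feeds the syntax check, and only a legal sentence should be passed on to the semantic step.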
Process happens in normal speech chain
Speech Chain Process in system
Example : The dog likes a man
INPUT PERCEPTION & TRANSLATION
Perception (from the Latin perceptio, percipio) is the
organization, identification and interpretation of sensory information in
order to represent and understand the environment. All perception involves
signals in the nervous system, which in turn result from physical stimulation
of the sense organs.
Machine translation is translation from one natural language to
another.
Initializing Engine
Result after processing
HMM - EXISTING SYSTEM
The observation is a probabilistic function of the state.
Situation : User State
State : bad, neutral, or good
Viterbi algorithm - Searching for the most probable path
Forward algorithm - Probability of a sequence
A continuous observation probability density function is used.
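The two algorithms named above can be sketched for the three user states. Two simplifications are assumed here: observations are discrete symbols ("low", "mid", "high") rather than the continuous density mentioned above, and every probability in the tables is a made-up illustrative value.

```python
STATES = ["bad", "neutral", "good"]

# All probabilities below are made-up illustrative values.
START = {"bad": 0.2, "neutral": 0.6, "good": 0.2}
TRANS = {
    "bad":     {"bad": 0.6, "neutral": 0.3, "good": 0.1},
    "neutral": {"bad": 0.2, "neutral": 0.6, "good": 0.2},
    "good":    {"bad": 0.1, "neutral": 0.3, "good": 0.6},
}
EMIT = {
    "bad":     {"low": 0.7, "mid": 0.2, "high": 0.1},
    "neutral": {"low": 0.2, "mid": 0.6, "high": 0.2},
    "good":    {"low": 0.1, "mid": 0.2, "high": 0.7},
}

def forward(observations):
    """Forward algorithm: total probability of the observation sequence."""
    alpha = {s: START[s] * EMIT[s][observations[0]] for s in STATES}
    for obs in observations[1:]:
        alpha = {
            s: EMIT[s][obs] * sum(alpha[p] * TRANS[p][s] for p in STATES)
            for s in STATES
        }
    return sum(alpha.values())

def viterbi(observations):
    """Viterbi algorithm: most probable state path and its probability."""
    best = {s: (START[s] * EMIT[s][observations[0]], [s]) for s in STATES}
    for obs in observations[1:]:
        step = {}
        for s in STATES:
            prob, path = max(
                ((best[p][0] * TRANS[p][s], best[p][1]) for p in STATES),
                key=lambda item: item[0],
            )
            step[s] = (prob * EMIT[s][obs], path + [s])
        best = step
    prob, path = max(best.values(), key=lambda item: item[0])
    return path, prob
```

The forward pass sums over all state paths while Viterbi keeps only the best one, so the Viterbi probability can never exceed the forward probability for the same observations.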
LIMITATIONS
Data intensive
Computationally intensive
DEANNT+ - PROPOSED SYSTEM
It is derived from the Differential Evolution (DE) algorithm.
Using the DE algorithm for ANN training without adaptive selection
of control parameters is problematic.
To overcome this, DE is modified with adaptive control
parameters and multiple trial vectors, and the result is named
DEANNT+.
In ANN training, it is helpful in classifying parity-p problems.
ADVANTAGES
Efficient memory utilization
Lower computational complexity
Lower computational effort
Effective in nonlinear constraint optimization
Optimizing multimodal problems
The multiple trial vectors technique increases the probability of
generating a better solution because a greater number of temporary
solutions are generated around the existing solutions.
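The multiple trial vectors idea can be sketched with a basic differential evolution loop. This is not the DEANNT+ update rule itself: the control parameters F and CR are simply redrawn at random for each trial vector as a stand-in for adaptive selection, the objective is a toy sphere function rather than an ANN training error, and all names and parameter ranges are assumptions.

```python
import random

def sphere(x):
    """Toy objective to minimize (stand-in for an ANN training error)."""
    return sum(v * v for v in x)

def de_multi_trial(objective, dim, pop_size=12, n_trials=3,
                   generations=100, bounds=(-5.0, 5.0), seed=0):
    rng = random.Random(seed)
    low, high = bounds
    pop = [[rng.uniform(low, high) for _ in range(dim)] for _ in range(pop_size)]
    cost = [objective(ind) for ind in pop]
    history = [min(cost)]  # best cost before and after every generation

    for _gen in range(generations):
        for i in range(pop_size):
            best_trial, best_cost = None, float("inf")
            for _trial in range(n_trials):
                # Control parameters drawn per trial (illustrative ranges).
                f = rng.uniform(0.4, 0.9)
                cr = rng.uniform(0.5, 1.0)
                # Three distinct individuals other than the target.
                a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
                forced = rng.randrange(dim)  # at least one mutated component
                trial = [
                    pop[a][d] + f * (pop[b][d] - pop[c][d])
                    if (rng.random() < cr or d == forced) else pop[i][d]
                    for d in range(dim)
                ]
                trial_cost = objective(trial)
                if trial_cost < best_cost:
                    best_trial, best_cost = trial, trial_cost
            # Greedy selection: the best trial replaces the target if it is no worse.
            if best_cost <= cost[i]:
                pop[i], cost[i] = best_trial, best_cost
        history.append(min(cost))

    best = min(range(pop_size), key=lambda j: cost[j])
    return pop[best], cost[best], history
```

Generating several trial vectors per target and keeping only the best one costs more objective evaluations per generation, in exchange for a higher chance that each target is improved.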
Analysis of the algorithm's efficiency is based on the parameters
where,
N - number of samples in the analysis frame
M - number of samples shift between frames
P - LPC analysis order
Q - dimension of LPC derived cepstral vector
K - number of frames over which cepstral time derivatives are computed
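The roles of N and M can be sketched by splitting a sampled signal into overlapping analysis frames. The sketch covers only these two framing parameters, not P, Q or K; the function name and the ten-sample signal are hypothetical.

```python
def frame_signal(samples, n, m):
    """Split samples into frames of n samples, advancing m samples each time.
    Trailing samples that do not fill a whole frame are dropped."""
    if n <= 0 or m <= 0:
        raise ValueError("n and m must be positive")
    return [samples[start:start + n] for start in range(0, len(samples) - n + 1, m)]

signal = list(range(10))          # stand-in for ten audio samples
frames = frame_signal(signal, n=4, m=2)
```

For a signal of L samples with L >= N this yields 1 + (L - N) // M frames, so a smaller shift M means more frames and more computation per second of speech.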
Comparison of Existing Vs. Proposed System
Military sector : High performance fighter aircraft,
Helicopters, Battle management, Training air traffic
controllers, Telephony and other domains, people with
disabilities.
Education Sector : Enabling students who are physically
handicapped and unable to use a keyboard to enter text
verbally.
Outside education sector : Computer and video games,
Gambling, Precision surgery.
Domestic sector : Ovens, refrigerators, dishwashers and
washing machines.
Dictation : Dictation systems on the market accept
continuous speech input, which replaces menu systems.
Speech recognition technology is expected to improve in three
areas in the future.
The first is the improvement of computer interaction
interfaces. Today, hardly any computer comes pre-loaded
with a speech recognition system, yet the day is not far off
when speech recognition systems will be common in most
computer systems.
The second is as a common proxy to current GUI systems;
in the future this by itself would revolutionize the way we
interact with computers in our everyday work.
The third potential application is speech recognition
systems as aids for visually challenged and writing-impaired
people, to help them express themselves and obtain an
education.
We have also encountered a number of practical
limitations which hinder the widespread deployment of
applications and services.
There is now increasing interest in finding ways to
bridge such a performance gap.
Although these areas of investigations are important,
the significant advances will come from studies in
acoustic-phonetics, speech perception, linguistics, and
psychoacoustics.
Future systems need to have an efficient way of
representing, storing, and retrieving knowledge required
for natural conversation.