Notes/Discussions on T-ASL EDICS Revision

Summary:
1) In Language: From SLP to HLT, and adding Machine Learning for Language Processing
2) In Speech: Adding “Deep Learning”
3) In Audio: Adding Music IR and “Semantics” in Audio SP
Details in the following slides (starting with “Language”)
The Old SLP EDICS
1. SLP-UNDE - Spoken Language Understanding
Paralinguistic (emotion, age, gender, rate, etc.) information; nonlinguistic (meaning external to language)
information, gestures, etc.; semantic classification; question/answering from speech; entity extraction from
speech; spoken document summarization; detecting linguistic/discourse structure (e.g., disfluencies,
sentence/topic boundaries, speech acts); relation to and interpretation of sign language.
2. SLP-LADL - Human Language Acquisition, Development and Learning
Language acquisition, development, and learning models; computer aids for language learning; attributes and
modeling techniques for assessment of language fluency.
3. SLP-SMMD - Spoken and Multimodal Dialog Systems
Spoken and multimodal dialog systems, applications, and architectures; stochastic learning for dialog
modeling; response generation; technologies for the aged; evaluations and standardizations;
speech/voice-based human-computer interfaces (HCI); speech HCI for individuals with impairments and
universal access (UA); other applications.
4. SLP-SMIR - Speech Data Mining and Document Retrieval
Analysis and evaluations for mining spoken data; search/retrieval of speech documents; mining
heterogeneous speech and multimedia data; speech data mining theory, algorithms, and methods; core
machine learning algorithms for data mining; topic spotting and classification; pattern discovery and
prediction from data; applications and tools for speech data mining.
5. SLP-SSMT - Machine Translation of Speech
Semi-automatic and data driven methods; speech processing for MTS; corpora, annotation, and other
resources; interlingua and transfer approaches; integration of speech and linguistic processing; machine
transliteration for named entity; evaluation metrics (e.g., BLEU); systems and applications for MTS.
6. SLP-LANG - Language Modeling (for Speech and SLP)
N-grams, their generalizations and smoothing methods; language model adaptation; grammar based
language modeling; maxent and feature based language modeling; dialect, accent, and idiolect at the
language level; discriminative LM training methods; other approaches to LMs; structured classification
approaches.
7. SLP-REAN - Spoken Language Resources and Annotation
General corpora, annotation, and other resources.
The Task
Objectives:
• To facilitate a possible merger of IEEE T-ASL and ACM T-SLP
• To cover both ‘spoken language processing’ and selected topics in ‘natural language processing’
(computational linguistics), with a focus on ‘processing’ and ‘computational’ for linguistic topics
Considerations
• To cover mainly both EDICS of IEEE T-ASL and ACM T-SLP
• To reflect emerging new areas and the increasing interest in HLT from the IEEE community
• To use the technical areas of established journals and conferences as a reference
Contributors
• Haizhou Li, IEEE T-ASLP AE, ACM T-SLP AE
• Li Deng, EiC, IEEE T-ASLP
• Pascale Fung, IEEE T-ASLP AE, ACM T-SLP AE, TACL AE
• Dilek Hakkani-Tur, IEEE T-ASLP AE
• Jian Su, ACL Executive Board Member; TACL AE
• Gokhan Tur, IEEE T-ASLP AE
EDICS Review
• December 2012 – June 2013
Summary of Changes
– From Spoken Language Processing to Human Language Processing
– Increase 7 subsections to 9 subsections
– Re-organize 9 subsections as follows
• HLT-LANG (Language Modeling, add computational phonology and phonetics)
• HLT-MTSW (Machine Translation for Spoken and Written Language, add ‘text’ translation topics)
• HLT-UNDE (Spoken Language Understanding and Computational Semantics)
• HLT-DIAL (Discourse and Dialog)
• HLT-SDTM (Spoken Document Retrieval and Text Mining, add NLP topics related to text mining and IR)
• HLT-STPA (Segmentation, Tagging, and Parsing, new topic to cover core sentence-level language
processing topics: word segmentation, tagging and parsing)
• HLT-HLLI (Human Language Learning and Interface)
• HLT-MLMD (Machine Learning Methods, new topic to reflect increasing interest)
• HLT-LRSE (Language Resources and System Evaluation)
The HLT New Section
HLT-LANG (Language Modelling)
N-grams, their generalizations and smoothing methods; language model adaptation; grammar-based and structured
language modelling; discriminative, maximum-entropy and feature-based language modelling; computational
phonology and phonetics; dialect, accent, and idiolect at the language level.
HLT-MTSW (Machine Translation for Spoken and Written Language)
Example/phrase/syntax/semantics-based machine translation; hybrid machine translation; word/sentence/document
alignments; synchronous grammar induction; decoding; system combination; post-editing; machine transliteration and
transcription; spoken language translation; speech processing for machine translation.
HLT-UNDE (Spoken Language Understanding and Computational Semantics)
Spoken language understanding; paralinguistic (emotion, age, gender, etc.) and non-linguistic (gesture, sign, etc.)
information processing; semantic role labelling; multiword expressions; word sense disambiguation; representation of
meaning; lexical semantics; distributional semantics; text entailment; ontology.
HLT-DIAL (Discourse and Dialog)
Learning of linguistic/discourse structure (e.g., disfluencies, sentence/topic boundaries, speech acts); co-reference and
anaphora resolution; dialog management/generation/analysis; semantic analysis for discourse and dialog; intent
determination; dialog act tagging.
HLT-SDTM (Spoken Document Retrieval and Text Mining)
Spoken document retrieval; linguistic pattern discovery and prediction from data; spoken term detection; named entity
recognition; question answering; document summarization and generation; spoken document summarization;
information extraction and retrieval; subjectivity and sentiment analysis; text and spoken document classification;
spam detection; topic detection and tracking; trend detection.
HLT-STPA (Segmentation, Tagging, and Parsing)
Morphological analysis; word segmentation; part-of-speech tagging, chunking and supertagging; models and algorithms
for parsing; grammar induction; dependency parsing; multilingual parsing.
HLT-HLLI (Human Language Learning and Interface)
Language acquisition, development, and learning models; computer aids for language learning; assessment of
language fluency; human-computer interface; assistive technology for the aged, universal access and individuals with
impairments.
HLT-MLMD (Machine Learning Methods)
Supervised, unsupervised, semi-supervised learning; statistical methods; symbolic learning methods; biologically
inspired and neural networks; reinforcement learning; active learning; online learning; deep learning; recursive and
structured models; graphical and latent variable models; kernel methods; domain adaptation.
HLT-LRSE (Language Resources and System Evaluation)
Annotation and evaluation of corpora; linguistic resources development methodologies, standards, tools and
evaluations; crowd-sourcing; evaluations, systems and applications of human language technology.
Speech: Adding two items
SPE-RECO Acoustic Modeling for Automatic Speech Recognition
Acoustic feature extraction; low-level feature modeling (Gaussians and beyond); statistical and neural network models;
deep learning models; pronunciation modeling; state clustering
and novel state definitions; prosody and other speech
characteristics; dialect, accent, and idiolect at the acoustic level;
discriminative acoustic training methods for ASR; articulatory and
physiological modeling; non-acoustic microphones for ASR;
feature transformation and normalization; sparse models and
regularization methods.
Audio: Main Changes
After long discussions within the AATC, with their final approval in June 2013:
Expanding “Music IR” and Symbolic Processing
AUD-MIR Music Information Retrieval and Music Language Processing
Content-based processing; discrimination; classification;
structure analysis; content-based retrieval; fingerprinting; data
mining; symbolic music processing; grammar-based models;
music composition and improvisation; score following and
music accompaniment; music annotation and metadata;
symbolic music corpora.
Audio: Old EDICS
AUDIO AND ELECTROACOUSTICS
AUD-ROOM
Room Acoustics and Acoustic System Modeling
Room acoustics and acoustic system modeling; room response measurement, modeling, simulation and compensation;
architectural and physical acoustics; physical modeling of musical instruments; room acoustics for music performance and
reproduction.
AUD-TRAN
Transducers
Transducer modeling and design; transducer calibration and compensation; novel transducers.
AUD-LMAP
Loudspeaker and Microphone Array Signal Processing
Far-field and near-field beamforming and array processing; source localization and tracking; time-delay estimation; audio
enhancement using transducer arrays; wavefield synthesis; sound field analysis and synthesis.
AUD-ANCO
Active Noise Control
Acoustic noise cancellation and suppression; adaptive techniques for feedforward control; feedback control algorithms;
multichannel systems.
AUD-ECHO
Echo Cancellation
Single-channel and multichannel acoustic echo cancellation; echo path estimation and modeling; echo suppression and
dereverberation; double-talk detection; adaptive filter theory for audio applications.
AUD-AUDI
Auditory Modeling and Hearing Aids
Human audition and psychoacoustics; computational auditory scene analysis; perceptual and psychophysical models of
audio algorithms and systems; perceptual measures of audio quality; aids for the handicapped; medical aids (cochlear
implants, hearing aids); binaural hearing.
AUD-SSEN
Audio Source Separation and Enhancement
Single-channel and multichannel source separation; blind deconvolution; noise reduction, compensation, and equalization;
audio denoising and restoration.
AUD-SMCA
Spatial and Multichannel Audio
Spatial sound analysis and reproduction; spatialization and virtualization; measurement, modeling, and use of head-related
transfer functions; crosstalk cancellation and binaural synthesis; artificial reverberation algorithms.
AUD-ACOD
Audio Coding
Low bit-rate and high-quality audio coding; scalable and lossless audio coding; spatial audio coding; joint source-channel
coding; signal representations for coding; parametric and structured audio coding; psychoacoustic models for coding;
objective and subjective quality assessment; error detection, correction, and concealment.
AUD-ANSY
Audio Analysis and Synthesis
Music analysis, modification, and synthesis; models and representations for musical signals; pitch and multi-pitch
estimation; audio feature analysis and extraction; melody, note, chord, key, and rhythm estimation and detection; automatic
transcription.
Audio: New EDICS
AUD-MAAE
Modeling, Analysis and Synthesis of Acoustic Environments
Acoustic system modeling; room response measurement, modeling and simulation; room geometry inference; reflector localization;
reverberation time estimation; direct-to-reverberation ratio estimation.
AUD-AMHA
Auditory Modeling and Hearing Aids
Human audition and psychoacoustics; computational auditory scene analysis; perceptual and psychophysical models of audio algorithms and
systems; cochlear implants; hearing aids; binaural hearing; signal processing in hearing aids.
AUD-ASAP
Acoustic Sensor Array Processing
Far-field and near-field beamforming; acoustic sensor array processing; source localization and tracking; time-delay estimation; speech
enhancement using acoustic sensor arrays; distributed and ad-hoc microphone arrays.
AUD-NEFR
Active Noise Control, Echo Reduction and Feedback Reduction
Active noise cancellation and suppression; single-channel and multichannel acoustic echo cancellation; echo path estimation and modeling;
echo suppression; nonlinear echo reduction; double-talk detection; adaptive filter theory for audio applications; adaptive techniques for
feedforward control; feedback cancellation; feedback suppression.
AUD-SIRR
System Identification and Reverberation Reduction
SIMO and MIMO identification; reverberation cancellation and suppression; blind deconvolution; channel-shortening; channel equalization.
AUD-SEP
Audio and Speech Source Separation
Single-channel and multichannel source separation; computational acoustic scene analysis.
AUD-SEN
Signal Enhancement and Restoration
Noise reduction; noise estimation, compensation, and equalization; audio de-noising and restoration; bandwidth expansion; clipping
restoration; near-end listening enhancement.
AUD-QIM
Quality and Intelligibility Measures
Perceptual measures of audio quality; objective and subjective quality assessment; network audio quality assessment; speech intelligibility
measures.
AUD-SARR
Spatial Audio Recording and Reproduction
Analysis and synthesis of sound fields; wave-field synthesis; loudspeaker array processing; Ambisonics; panning; multipoint synthesis and
binaural synthesis; crosstalk cancellation; virtual auditory environments; auralization, spatialization and virtualization; measurement and
modeling of head-related transfer functions; binaural rendering; artificial reverberation algorithms.
AUD-AMCT
Audio and Speech Modeling, Coding and Transmission
Sparse representations; probabilistic modeling; low bit-rate and high-quality audio coding; scalable and lossless audio coding; spatial audio
coding; joint source-channel coding; signal representations for coding; parametric and structured audio coding; psychoacoustic models for
coding; low-delay audio coding; error detection, correction, and concealment.
AUD-MSP
Music Signal Analysis, Processing and Synthesis
Analysis; modification; synthesis; models and representations for musical signals; pitch and multi-pitch estimation; audio feature
extraction; melody, note, chord, key, and rhythm estimation and detection; automatic transcription; musical voice separation; instrument
modeling.
AUD-MIR
Music Information Retrieval and Music Language Processing
Content-based processing; discrimination; classification; structure analysis; content-based retrieval; fingerprinting; data mining; symbolic
music processing; grammar-based models; music composition and improvisation; score following and music accompaniment; music
annotation and metadata; symbolic music corpora.
AUD-AUMM
Audio for Multimedia
Audio watermarking and data hiding; data encryption, security, and privacy; digital rights management; joint processing of audio and video;
human-machine audio interfaces; auditory displays; distance learning and virtual reality.
AUD-SYST
Audio Processing Systems and Transducers
Hardware and software systems and implementations; consumer and professional audio; transducer modeling and design; transducer
calibration and compensation; novel transducers.
AUD-BIO
Bioacoustics and Medical Acoustics
Breathing and snoring analysis; investigation of sound production and reception in animals; echolocation.