Sonority as a Basis for Rhythmic Class Discrimination
Antonio Galves, USP.
Jesus Garcia, USP.
Denise Duarte, USP and UFGo.
Charlotte Galves, UNICAMP.
The starting point:
Ramus, Nespor & Mehler (1999), henceforth RNM
What we do
• Our goal: a new approach to the problem of
finding acoustic correlates of the rhythmic
classes.
• Main ingredient: a rough measure of sonority
defined directly from the spectrogram of the
signal.
• Major advantage: it can be implemented in an
entirely automatic way, with no need for prior
hand-labelling of the acoustic signal.
Our main result
Applied to the same linguistic
samples considered in RNM, our
approach produces the same
clusters corresponding to the
three conjectured rhythmic
classes.
RNM revisited
Striking features
• Linear correlation
between ΔC and %V
(-0.93).
• Clustering into three
groups.
A parametric probabilistic model for RNM
Duarte et al. (2001) propose a parametric family
of probability distributions that closely fit the data
in RNM.
This has two advantages:
• It provides deeper insight into the phenomenon.
• It makes it possible to perform statistical
inference, i.e., to extend results from the sample
(data set) to the population (the set of all
potential sentences).
The probabilistic model
• The durations of the successive consonantal intervals are
independent and identically distributed random variables.
• The duration of each consonantal interval is distributed
according to a Gamma distribution.
• Different languages have Gamma distributions with
different standard deviations.
• The standard deviation is constant for all languages
belonging to the same rhythmic class.
• The standard deviations of different classes are different.
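The model above can be illustrated with a minimal sketch. It draws i.i.d. Gamma durations with a chosen mean and standard deviation, then recovers the standard deviation with a method-of-moments fit. The function names, the numeric values, and the choice of method of moments are assumptions made for illustration; the slides do not say how Duarte et al. (2001) estimate the parameters.

```python
import random
import statistics

def sample_consonantal_durations(mean, sd, n, rng):
    """Draw n i.i.d. Gamma durations with the given mean and standard deviation."""
    shape = (mean / sd) ** 2   # k = mean^2 / variance
    scale = sd ** 2 / mean     # theta = variance / mean
    return [rng.gammavariate(shape, scale) for _ in range(n)]

def fit_gamma_moments(durations):
    """Method-of-moments estimates (shape, scale) of a Gamma distribution."""
    m = statistics.fmean(durations)
    v = statistics.variance(durations)
    return m * m / v, v / m

rng = random.Random(0)
# Hypothetical mean and sd in seconds, not taken from RNM.
durations = sample_consonantal_durations(mean=0.10, sd=0.05, n=20000, rng=rng)
shape_hat, scale_hat = fit_gamma_moments(durations)
sd_hat = (shape_hat ** 0.5) * scale_hat   # sd of a Gamma = sqrt(k) * theta
```

Under the model, languages of the same rhythmic class would yield values of `sd_hat` that agree up to sampling error, while languages of different classes would not.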
Statistical evidence for the clustering
• The model enables testing the hypothesis that
the eight languages are clustered in three
groups.
• The hypothesis that the standard deviations of
the Gamma distributions are constant within
classes and differ among classes is compatible
with the data presented in RNM.
Estimated standard deviations of the Gamma
distribution for the consonantal intervals
Problems for RNM (1)
• RNM is based on a hand-labelled
segmentation, which is time-consuming
and depends on decisions that are
difficult to reproduce in a homogeneous
way.
• This is a problem for linguists.
Problems for RNM (2)
• Newborn babies discriminate rhythmic
groups from signal filtered at 400 Hz
(Mehler et al. 1996). At this frequency, it is
impossible to fully discriminate consonants
and vowels.
• ΔC depends on a complex computation.
• This is a problem for babies!
Sonority as a basis for rhythmic
class discrimination
• The results of Mehler et al. (1996) strongly suggest
that the discrimination of rhythmic classes by
babies relies not on a fine-grained distinction
between vowels and consonants, but on a
coarse-grained perception of sonority as
opposed to obstruency.
• A natural conjecture is that the identification of
rhythmic classes must be possible using a rough
measure of sonority.
A rough measure of sonority
Goal: to define a function that maps local
windows of the signal to the interval [0, 1].
This function should assign
• values close to 1 for spans displaying
regular patterns, characteristic of the
sonorant regions of the signal,
• values close to 0 for regions characterized
by high obstruency.
Technical specifications
• The function s(t) is based on the
spectrogram of the signal.
• Values of the spectrogram are estimated
with a 25ms Gaussian window.
• The step unit of the function is 2ms.
• Computations are made with Praat
(http://www.praat.org)
Definition of the function s(t)
p_t(f) = re-normalized power spectrum for frequency f around time t.
This re-normalization makes p_t a probability measure.
A regular pattern characteristic of sonorant spans will produce a
sequence of probability measures which are close in the sense of
relative entropy.
This suggests defining the sonority function as
$$
s(t) = 1 - \min\left\{ 1,\ \frac{1}{27} \sum_{u=t-4}^{t+4} \sum_{i=1}^{3} h(p_u \mid p_{u-i}) \right\}
$$
where h(p | q) is the relative entropy of p with respect to q.
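A minimal sketch of this computation, assuming each spectrogram column is given as a list of non-negative power values: every column is re-normalized into a probability measure, the relative entropy between each frame in a 9-frame window and its 3 predecessors is averaged (27 terms), and the result is clipped at 1 and subtracted from 1. The function names, the small `eps` floor that avoids log(0), the natural logarithm, and the backward direction of the lag are assumptions of this sketch, not details confirmed by the slides.

```python
import math

def normalize(power, eps=1e-12):
    """Turn one spectrogram column into a probability measure over frequency bins."""
    floored = [max(x, 0.0) + eps for x in power]
    total = sum(floored)
    return [x / total for x in floored]

def relative_entropy(p, q):
    """h(p | q) = sum over f of p(f) * log(p(f) / q(f))."""
    return sum(pf * math.log(pf / qf) for pf, qf in zip(p, q))

def sonority(frames, t, half_width=4, max_lag=3):
    """s(t) for a frame index t with half_width + max_lag <= t < len(frames) - half_width."""
    total = 0.0
    for u in range(t - half_width, t + half_width + 1):
        for i in range(1, max_lag + 1):
            total += relative_entropy(frames[u], frames[u - i])
    n_terms = (2 * half_width + 1) * max_lag   # 9 * 3 = 27
    return 1.0 - min(1.0, total / n_terms)

# Toy input: a perfectly steady spectrum versus two spectra that alternate.
steady = [normalize([1, 8, 1, 1])] * 16
a, b = normalize([100, 1, 1, 1]), normalize([1, 1, 1, 100])
alternating = [a if k % 2 == 0 else b for k in range(16)]

s_steady = sonority(steady, 8)            # identical frames: entropy 0, so s = 1
s_alternating = sonority(alternating, 8)  # large entropy, clipped: s = 0
```

The two toy signals hit the extremes of the range: a stable spectral pattern gives the maximal value, while abrupt frame-to-frame changes drive the value to zero.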
Values of 1- s(t) on a Japanese example
Estimators
$$
\bar{S} = \frac{1}{T} \sum_{t=1}^{T} s(t)
$$
$$
\delta S = \frac{1}{T} \sum_{t=1}^{T} \left| s(t) - s(t-1) \right|
$$
Explaining the estimators
• S̄ is the sample mean of the function s(t).
• δS measures the weight of the high-obstruency
regions in the sample. This is because the
values of p_t, and consequently of s(t),
typically vary widely when t belongs to an
interval with high obstruency.
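The two estimators can be sketched as follows, assuming the sonority values are stored in a list `s` holding s(0), s(1), ..., s(T). The function names and the toy values are hypothetical.

```python
def mean_sonority(s):
    """Sample mean: (1/T) * sum of s(t) for t = 1..T, with s = [s(0), ..., s(T)]."""
    T = len(s) - 1
    return sum(s[1:]) / T

def delta_sonority(s):
    """Mean absolute increment: (1/T) * sum of |s(t) - s(t-1)| for t = 1..T."""
    T = len(s) - 1
    return sum(abs(cur - prev) for prev, cur in zip(s, s[1:])) / T

# Toy trajectory: a sonorant stretch, a dip into obstruency, and a recovery.
s = [1.0, 1.0, 0.5, 0.0, 0.5, 1.0]
s_bar = mean_sonority(s)     # (1 + 0.5 + 0 + 0.5 + 1) / 5
delta_s = delta_sonority(s)  # (0 + 0.5 + 0.5 + 0.5 + 0.5) / 5
```

A signal that stays near 1 contributes almost nothing to the second estimator; only the stretches where s(t) moves, typically the obstruent ones, add to it.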
Distribution of the eight languages on the
(S̄, δS) plane
Extra statistical features
Language    P(s<0.3)    Q3-Q1
Japanese    0.170       0.47
Catalan     0.177       0.47
Italian     0.182       0.48
Spanish     0.205       0.54
French      0.210       0.53
Polish      0.218       0.58
English     0.234       0.60
Dutch       0.254       0.64
• The distance between the first and
third quartile increases from Japanese
to Dutch. In other terms, the
dispersion of sonority increases from
mora-timed to stress-timed languages.
• The empirical probability of having
sonority smaller than 0.3 also
increases from Japanese to Dutch.
• This reinforces the idea present in
Duarte et al. (2001) that the relevant
information to discriminate among
rhythmic classes is contained in the
less sonorant part of the signal.
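Both statistics are easy to compute from a sequence of sonority values. The sketch below uses hypothetical function names and toy values, and it relies on the default quartile convention of Python's `statistics.quantiles`; the slides do not say which convention was used for Q1 and Q3.

```python
import statistics

def prob_below(s, threshold=0.3):
    """Empirical probability that the sonority falls below the threshold."""
    return sum(1 for x in s if x < threshold) / len(s)

def interquartile_range(s):
    """Q3 - Q1 of the sonority values."""
    q1, _median, q3 = statistics.quantiles(s, n=4)
    return q3 - q1

# Toy sonority values, chosen only for illustration.
s = [0.0, 0.1, 0.2, 0.4, 0.6, 0.8, 0.9, 1.0]
p_low = prob_below(s)          # 3 of the 8 values lie below 0.3
iqr = interquartile_range(s)
```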
Distribution of the eight considered
languages on the (S̄, %V) plane
Distribution of the eight languages
on the (δS, ΔC) plane
Conclusions
• The main purpose of this presentation was
to show that the relevant evidence about
rhythmic classes can be automatically
retrieved from the acoustic signal, through
a rough measure of sonority.
• In addition, our statistics are based on a
coarse-grained treatment of the speech
signal, which is likely to be closer to the
linguistic reality of early acquisition.
• This work is part of the Project
RHYTHMIC PATTERNS, PARAMETER
SETTING AND LANGUAGE CHANGE,
funded by Fapesp (grant no 98/03382-0).
• http://www.ime.usp.br/~tycho
Values of 1- s(t) on a Dutch example
Values of 1- s(t) on a French example