
Speech Prosody - Science topic

Questions related to Speech Prosody
  • asked a question related to Speech Prosody
Question
7 answers
For an upcoming study, I am in search of a quick Spanish placement test that can be taken by L2 learners (preferably online) to determine their L2 Spanish proficiency level.
Ideally, the test would take no longer than 10 minutes and be free to use, but please also contact me with recommendations for longer or paid tests; these could still be a useful starting point for us.
Thank you in advance!
Lieke
Relevant answer
Answer
Hi, Lieke!
Regarding your question about a quick Spanish placement test, I recommend that you search the following link: https://www.tiatula.com/spanish-placement-tests/
Best of luck!
  • asked a question related to Speech Prosody
Question
9 answers
Hello,
We are working on a review of the relationship between language and the multiple-demand network. You would be responsible for addressing the reviewers' criticisms. Please leave your email address if you are interested.
Best,
W
Relevant answer
Answer
I hope you get it... Best of luck. Commenting for better reach for you...
  • asked a question related to Speech Prosody
Question
1 answer
Hello everyone,
For my thesis I want to extract voice features from audio data recorded during psychotherapy sessions, using the openSMILE toolkit. For the fundamental frequency and jitter I already get good results, but the extraction of center frequencies and bandwidths of formants 1-3 is puzzling me. For some reason there appears to be just one formant (the first one), with a frequency range up to 6 kHz; formants 2 and 3 get values of 0. I expected the formants to lie within a range of roughly 500 to 2000 Hz.
I tried to fix the problem myself but could not find the issue. Does anybody have experience with openSMILE, especially formant extraction, and could help me out?
For testing purposes I am using various audio files recorded by myself or extracted from YouTube. My config file looks like this:
///////////////////////////////////////////////////////////////////////////
// openSMILE configuration template file generated by SMILExtract binary //
///////////////////////////////////////////////////////////////////////////
[componentInstances:cComponentManager]
instance[dataMemory].type = cDataMemory
instance[waveSource].type = cWaveSource
instance[framer].type = cFramer
instance[vectorPreemphasis].type = cVectorPreemphasis
instance[windower].type = cWindower
instance[transformFFT].type = cTransformFFT
instance[fFTmagphase].type = cFFTmagphase
instance[melspec].type = cMelspec
instance[mfcc].type = cMfcc
instance[acf].type = cAcf
instance[cepstrum].type = cAcf
instance[pitchAcf].type = cPitchACF
instance[lpc].type = cLpc
instance[formantLpc].type = cFormantLpc
instance[formantSmoother].type = cFormantSmoother
instance[pitchJitter].type = cPitchJitter
instance[lld].type = cContourSmoother
instance[deltaRegression1].type = cDeltaRegression
instance[deltaRegression2].type = cDeltaRegression
instance[functionals].type = cFunctionals
instance[arffSink].type = cArffSink
printLevelStats = 1
nThreads = 1
[waveSource:cWaveSource]
writer.dmLevel = wave
basePeriod = -1
filename = \cm[inputfile(I):name of input file]
monoMixdown = 1
[framer:cFramer]
reader.dmLevel = wave
writer.dmLevel = frames
copyInputName = 1
frameMode = fixed
frameSize = 0.0250
frameStep = 0.010
frameCenterSpecial = center
noPostEOIprocessing = 1
buffersize = 1000
[vectorPreemphasis:cVectorPreemphasis]
reader.dmLevel = frames
writer.dmLevel = framespe
k = 0.97
de = 0
[windower:cWindower]
reader.dmLevel=framespe
writer.dmLevel=winframe
copyInputName = 1
processArrayFields = 1
winFunc = ham
gain = 1.0
offset = 0
[transformFFT:cTransformFFT]
reader.dmLevel = winframe
writer.dmLevel = fftc
copyInputName = 1
processArrayFields = 1
inverse = 0
zeroPadSymmetric = 0
[fFTmagphase:cFFTmagphase]
reader.dmLevel = fftc
writer.dmLevel = fftmag
copyInputName = 1
processArrayFields = 1
inverse = 0
magnitude = 1
phase = 0
[melspec:cMelspec]
reader.dmLevel = fftmag
writer.dmLevel = mspec
nameAppend = melspec
copyInputName = 1
processArrayFields = 1
htkcompatible = 1
usePower = 0
nBands = 26
lofreq = 0
hifreq = 8000
inverse = 0
specScale = mel
[mfcc:cMfcc]
reader.dmLevel=mspec
writer.dmLevel=mfcc1
copyInputName = 0
processArrayFields = 1
firstMfcc = 0
lastMfcc = 12
cepLifter = 22.0
htkcompatible = 1
[acf:cAcf]
reader.dmLevel=fftmag
writer.dmLevel=acf
nameAppend = acf
copyInputName = 1
processArrayFields = 1
usePower = 1
cepstrum = 0
acfCepsNormOutput = 0
[cepstrum:cAcf]
reader.dmLevel=fftmag
writer.dmLevel=cepstrum
nameAppend = acf
copyInputName = 1
processArrayFields = 1
usePower = 1
cepstrum = 1
acfCepsNormOutput = 0
oldCompatCepstrum = 1
absCepstrum = 1
[pitchAcf:cPitchACF]
reader.dmLevel=acf;cepstrum
writer.dmLevel=pitchACF
copyInputName = 1
processArrayFields = 0
maxPitch = 500
voiceProb = 0
voiceQual = 0
HNRdB = 0
F0 = 1
F0raw = 0
F0env = 1
voicingCutoff = 0.550000
[lpc:cLpc]
reader.dmLevel = fftc
writer.dmLevel = lpc1
method = acf
p = 8
saveLPCoeff = 1
lpGain = 0
saveRefCoeff = 0
residual = 0
forwardFilter = 0
lpSpectrum = 0
[formantLpc:cFormantLpc]
reader.dmLevel = lpc1
writer.dmLevel = formants
copyInputName = 1
nFormants = 3
saveFormants = 1
saveIntensity = 0
saveNumberOfValidFormants = 1
saveBandwidths = 1
minF = 400
maxF = 6000
[formantSmoother:cFormantSmoother]
reader.dmLevel = formants;pitchACF
writer.dmLevel = forsmoo
copyInputName = 1
medianFilter0 = 0
postSmoothing = 0
postSmoothingMethod = simple
F0field = F0
formantBandwidthField = formantBand
formantFreqField = formantFreq
formantFrameIntensField = formantFrameIntens
intensity = 0
nFormants = 3
formants = 1
bandwidths = 1
saveEnvs = 0
no0f0 = 0
[pitchJitter:cPitchJitter]
reader.dmLevel = wave
writer.dmLevel = jitter
writer.levelconf.nT = 1000
copyInputName = 1
F0reader.dmLevel = pitchACF
F0field = F0
searchRangeRel = 0.250000
jitterLocal = 1
jitterDDP = 1
jitterLocalEnv = 0
jitterDDPEnv = 0
shimmerLocal = 0
shimmerLocalEnv = 0
onlyVoiced = 0
inputMaxDelaySec = 2.0
[lld:cContourSmoother]
reader.dmLevel=mfcc1;pitchACF;forsmoo;jitter
writer.dmLevel=lld1
writer.levelconf.nT=10
writer.levelconf.isRb=0
writer.levelconf.growDyn=1
nameAppend = sma
copyInputName = 1
noPostEOIprocessing = 0
smaWin = 3
[deltaRegression1:cDeltaRegression]
reader.dmLevel=lld1
writer.dmLevel=lld_de
writer.levelconf.isRb=0
writer.levelconf.growDyn=1
nameAppend = de
copyInputName = 1
noPostEOIprocessing = 0
deltawin=2
blocksize=1
[deltaRegression2:cDeltaRegression]
reader.dmLevel=lld_de
writer.dmLevel=lld_dede
writer.levelconf.isRb=0
writer.levelconf.growDyn=1
nameAppend = de
copyInputName = 1
noPostEOIprocessing = 0
deltawin=2
blocksize=1
[functionals:cFunctionals]
reader.dmLevel = lld1;lld_de;lld_dede
writer.dmLevel = statist
copyInputName = 1
frameMode = full
// frameListFile =
// frameList =
frameSize = 0
frameStep = 0
frameCenterSpecial = left
noPostEOIprocessing = 0
functionalsEnabled=Extremes;Moments;Means
Extremes.max = 1
Extremes.min = 1
Extremes.range = 1
Extremes.maxpos = 0
Extremes.minpos = 0
Extremes.amean = 0
Extremes.maxameandist = 0
Extremes.minameandist = 0
Extremes.norm = frame
Moments.doRatioLimit = 0
Moments.variance = 1
Moments.stddev = 1
Moments.skewness = 0
Moments.kurtosis = 0
Moments.amean = 0
Means.amean = 1
Means.absmean = 1
Means.qmean = 0
Means.nzamean = 1
Means.nzabsmean = 1
Means.nzqmean = 0
Means.nzgmean = 0
Means.nnz = 0
[arffSink:cArffSink]
reader.dmLevel = statist
filename = \cm[outputfile(O):name of output file]
append = 0
relation = smile
instanceName = \cm[inputfile]
number = 0
timestamp = 0
frameIndex = 1
frameTime = 1
frameTimeAdd = 0
frameLength = 0
// class[] =
printDefaultClassDummyAttribute = 0
// target[] =
// ################### END OF openSMILE CONFIG FILE ######################
Relevant answer
Answer
Hi,
Please pay attention to these parameters:
...
nFormants = 3
formants = 1
bandwidths = 1
...
Change the 1s to 3s.
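One thing worth checking (an assumption on my part, not a confirmed diagnosis) is the LPC order: p = 8 in the config leaves only about four pole pairs, which is often too few if the audio is sampled at 16 kHz or above; a common rule of thumb is the sampling rate in kHz plus 2. As a sanity check independent of openSMILE, formant candidates can be computed directly from LPC roots. A minimal Python sketch (the function name and defaults are mine, not openSMILE's API):

```python
import numpy as np
from scipy.linalg import solve_toeplitz

def lpc_formants(frame, sr, order=10):
    """Estimate formant frequencies (Hz) of one pre-emphasized,
    windowed frame via the autocorrelation LPC method."""
    frame = np.asarray(frame, dtype=float)
    # Autocorrelation sequence r[0..order], then solve the Yule-Walker
    # equations R a = r for the predictor coefficients a.
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    a = solve_toeplitz((r[:order], r[:order]), r[1:order + 1])
    # Prediction-error polynomial A(z) = 1 - a1 z^-1 - ... - ap z^-p.
    roots = np.roots(np.concatenate(([1.0], -a)))
    roots = roots[np.imag(roots) > 0]            # one root per conjugate pair
    freqs = np.angle(roots) * sr / (2 * np.pi)   # pole angle -> Hz
    # Drop near-DC and near-Nyquist poles; a fuller picker would also
    # discard poles with overly wide bandwidths (-sr/pi * ln|root|).
    return sorted(f for f in freqs if 90 < f < sr / 2 - 50)
```

On a synthetic frame built from damped sinusoids at known frequencies, the returned list should contain values close to those frequencies; if openSMILE's output disagrees badly on the same frame, the config (order, pre-emphasis, frequency bounds) is the first place to look.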
  • asked a question related to Speech Prosody
Question
3 answers
Hello,
there are a lot of papers on differences between read and spontaneous speech, even detailed comparisons. However, is anyone aware of attempts to automatically detect whether a given recording contains read or spontaneous speech?
Thank you for any helpful suggestions.
Relevant answer
Answer
Thank you, Saeid. Actually, before asking this question, we had done something quite similar to your suggestion :) Except for the features - instead of MFCCs we used pause-related parameters (recently I have been interested in using information on pauses in different applications in speech technology). It is a good idea to also try MFCCs in further work, as you suggested.
We obtained an accuracy of 70-80% with the UBM-GMM approach. I was surprised that I couldn't find any prior research on recognition of these two classes to compare our results to. However, there has been research on automatic recognition of 3 levels of spontaneity: http://www.sciencedirect.com/science/article/pii/S0167639313000976
Regards,
Magda
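For anyone wanting to replicate the classification step described above, the per-class GMM scoring can be sketched with scikit-learn. This is a simplified stand-in for a full UBM-GMM system (no universal background model or MAP adaptation), and the function names and example pause features are illustrative:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def train_class_gmms(read_feats, spon_feats, n_components=2, seed=0):
    """Fit one GMM per class on per-recording pause-feature vectors,
    e.g. [pause rate, mean pause duration, std of pause duration]."""
    gmm_read = GaussianMixture(n_components=n_components, random_state=seed).fit(read_feats)
    gmm_spon = GaussianMixture(n_components=n_components, random_state=seed).fit(spon_feats)
    return gmm_read, gmm_spon

def classify(gmm_read, gmm_spon, feats):
    """Assign each feature vector to the class whose GMM gives it
    the higher log-likelihood."""
    return np.where(gmm_read.score_samples(feats) > gmm_spon.score_samples(feats),
                    "read", "spontaneous")
```

In a real system the two GMMs would be MAP-adapted from a shared UBM rather than trained independently, but the likelihood-ratio decision rule is the same.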
  • asked a question related to Speech Prosody
Question
7 answers
I need some literature about speech prosody intervention in children.
Relevant answer
Answer
Hello Zohre!
I am sending you an article on the relationship between prosody and intelligibility in children with CP that may be useful for intervention.
Regards,
Jonathan
  • asked a question related to Speech Prosody
Question
6 answers
I am looking for recent work dealing with the relation between accentual/metrical patterns and acoustic strengthening in Spanish (Castilian or Mexican). In particular, I would like to find research analyzing the relation between the acoustic realization of Spanish vowels and prosodic structure. References for other languages are welcome as well. Thanks.
  • asked a question related to Speech Prosody
Question
9 answers
I developed a few tests but they are too long. I am looking at alternative solutions.
Relevant answer
Answer
My Z-Bell Testing is a quick screening that we use with brain-injured people: six minutes for the "quick and dirty" pass/fail to determine which category (autonomic or central nervous system) is the culprit, and ten more minutes for a more detailed evaluation of how to remediate auditory filtering ability.
  • asked a question related to Speech Prosody
Question
8 answers
For instance, I wanted to mark vocalic intervals with quarter-, eighth-, or sixteenth-note notation, but can't find a good contrast for consonantal intervals. They are not really rests, but they wouldn't be "musical" in the same sense that vowels are. I had thought about using percussion notation, which seems like it would work for stops/obstruents, but I don't know how to mark continuant consonantal intervals (like fricatives) - maybe some sort of cymbal notation? Thoughts or ideas greatly appreciated.
Relevant answer
Answer
From a musician's standpoint, have you considered trying Finale software? You can download "Finale Notepad" for free which will allow you to place items on a staff, including percussion and hear the sounds played back. I don't know that it would be as helpful as other software you have mentioned. You could also try to work with a percussionist to see if they can help. Best of luck!
  • asked a question related to Speech Prosody
Question
34 answers
I am looking for any theoretical and fieldwork studies dealing with the relation between linguistic rhythm (primary and secondary stress, syllable constituency, segment prosodic distribution, metrical feet, prosodic phrasing, intonational contours, etc.) and rhythmic patterns in poetry and folk music, so that evidence of a link between prosody and several types of codified oral production could be eventually found. I am very interested in data covering as many typologically different languages as possible.
Relevant answer
Answer
Joao, this week I published a book comparing the whole of Spanish poet Federico Garcia Lorca's poetic corpus to the whole of Spanish composer Manuel de Falla's music: "Lorca in Tune with Falla" (Toronto: U. Toronto Press, 2014), 300 pp. They knew each other intimately and shared the same portion of the audible universe, Granada, Spain, from 1920 to 1936. To make my poetry-music comparisons (which include theatre-music comparisons), I rely on Juan David Garcia Bacca's "Filosofía de la música" (Barcelona: Anthropos, 1990). Garcia Bacca holds that music employs a type of language composed of notes, and it uses organs of expression, the instruments. In the same way, literature, including philosophy and poetry, makes use of words and of conventional syntax. Garcia Bacca analyzes patterns of musical motifs, and recommends the same analysis of motifs in poetry. In addition, I occasionally employ stylistic analysis of Lorca's sounds, comparable to guitar-playing, when he resorts to onomatopoeia. Falla too would use the orchestra as a giant flamenco guitar.