Speech Technologies - Science topic
Questions related to Speech Technologies
Hi all,
I am interested in estimating the age of users from their speech signals. Could you suggest some freely available corpora for this research?
Thanks
Hi everyone. I have been conducting a few experiments with simultaneous speech, but in all of them I have used recorded speech (.wav, .ogg or .mp3 files). I would now like to play the simultaneous speech directly through Text-to-Speech, instead of saving it to a file first (mainly to avoid the delay, but also so that it works across operating systems and devices).
All my attempts to play two TTS voices simultaneously (separate threads, separate processes, ...) have failed. It seems that speech synthesis uses a single audio channel, so the voices are played one after the other.
Do you know of any alternatives that would make this work (independent of the OS/device, although Windows and Android are preferred)? Also, can you point me to additional information or references on why it doesn't work, so I can try to find a workaround?
Thanks in advance.
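One workaround that may be worth trying, assuming your TTS engine can synthesize into an in-memory PCM buffer rather than straight to the speaker (an assumption; not every engine exposes this): mix the two buffers sample by sample yourself and play the result as a single stream. The sketch below only shows the mixing step. The `tone` helper is a hypothetical stand-in for real TTS output, and the 16-bit mono format and 16 kHz rate are assumed for illustration.

```python
import math

def mix_pcm16(a, b):
    """Mix two lists of signed 16-bit samples; the shorter one is padded with silence."""
    n = max(len(a), len(b))
    mixed = []
    for i in range(n):
        s = (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
        # Clip to the 16-bit range so loud overlaps do not wrap around.
        mixed.append(max(-32768, min(32767, s)))
    return mixed

def tone(freq_hz, n_samples, rate=16000, amplitude=8000):
    """Hypothetical placeholder for a synthesized voice: a plain sine tone."""
    return [int(amplitude * math.sin(2 * math.pi * freq_hz * i / rate))
            for i in range(n_samples)]

voice_a = tone(220, 1600)   # stands in for TTS voice 1
voice_b = tone(330, 800)    # stands in for TTS voice 2 (shorter)
mixed = mix_pcm16(voice_a, voice_b)
```

The mixed buffer can then be handed to whatever audio output API you already use for your recorded files, so only one playback channel is needed.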
Hi all,
I would like to ask the experts here for a clearer view on the use of cleaned signals, i.e. signals from which the echo has already been removed by acoustic echo cancellation (AEC) using several types of adaptive algorithms.
How significant are MSE and PSNR for improving the classification process? Normally we evaluate with WER, accuracy and perhaps EER. Is there any connection between the MSE and PSNR values and improvements in those classification metrics?
I would appreciate clarification on this.
Thanks very much.
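For reference, here is a minimal sketch of how the two measures are usually computed. Both compare the processed waveform against a clean reference sample by sample, so they describe signal fidelity rather than recognizer output. Whether they track WER or EER on your data would have to be checked empirically. The `peak` value of 1.0 assumes samples normalized to [-1, 1].

```python
import math

def mse(reference, processed):
    """Mean squared error between two equal-length sample sequences."""
    if len(reference) != len(processed):
        raise ValueError("signals must have the same length")
    return sum((r - p) ** 2 for r, p in zip(reference, processed)) / len(reference)

def psnr_db(reference, processed, peak=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    err = mse(reference, processed)
    if err == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / err)

clean = [0.0, 0.5, -0.5, 0.25]
after_aec = [0.1, 0.4, -0.4, 0.15]   # made-up values for illustration
print(mse(clean, after_aec), psnr_db(clean, after_aec))
```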
I recently started working on speaker/language recognition using i-vectors, and after consulting with researchers on ResearchGate, I arrived at the following steps:
1) Database
i) Development dataset (for UBM and T training and, if labeled, for LDA and PLDA as well)
ii) Training dataset (for speaker/language enrollment, i.e. the modeled speakers). If the development dataset is not labeled, I train LDA and PLDA on the training dataset (comments on this are welcome).
iii) Testing dataset (for testing the modeled speakers/languages)
About Language Detection:
If I have a lot of speech samples but no labels for those utterances, how can I train LDA/PLDA for languages? Or can I train them on the language training data?
What about gender? How much will the results be affected if we use different versus shared UBM and T? Is it OK to have a single UBM and T for both genders?
Is there any way to do i-vector detection without LDA and PLDA, for example by applying an SVM to the i-vectors without any dimensionality reduction?
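On the last point, one commonly used baseline that needs neither LDA nor PLDA is cosine similarity scoring directly on the i-vectors. A minimal sketch follows. The speaker names and 3-dimensional vectors are made up for illustration (real i-vectors typically have a few hundred dimensions), and how well this works without any channel compensation depends on the data.

```python
import math

def cosine_score(w1, w2):
    """Cosine similarity between two i-vectors of equal dimension."""
    dot = sum(x * y for x, y in zip(w1, w2))
    n1 = math.sqrt(sum(x * x for x in w1))
    n2 = math.sqrt(sum(y * y for y in w2))
    if n1 == 0 or n2 == 0:
        raise ValueError("zero-length i-vector")
    return dot / (n1 * n2)

def identify(test_ivec, enrolled):
    """Return the enrolled label whose i-vector scores highest against the test one."""
    return max(enrolled, key=lambda name: cosine_score(test_ivec, enrolled[name]))

# Hypothetical enrolled models (e.g. the mean i-vector per speaker or language).
enrolled = {"spk_a": [1.0, 0.0, 0.2], "spk_b": [0.0, 1.0, 0.1]}
best = identify([0.9, 0.1, 0.2], enrolled)
print(best)
```

For verification rather than identification, the same score would be compared against a threshold tuned on held-out trials.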
Hi,
I need to write a tool that, given some keywords, searches a database of articles and recommends to users the article most likely to contain the relevant information. I was thinking of using the following search heuristics:
1. If the keywords appear in close proximity in the text (near each other), the article is more likely to be on topic.
2. If I find an article on some topic from, say, 2008 and a newer one on the same topic from 2012 with many negations in the text, I could assume that the older research was wrong and prioritise the newer article.
3. I should allow queries in which the user can specify whether they are looking for an exact number of the keywords in one text or just one or more of them, or, for example, that some keywords must be found and others must not.
Are my assumptions correct? Do you have any better ideas for returning more accurate results?
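Heuristics 1 and 3 can be sketched fairly directly. The code below is a minimal illustration, not a full search engine: `smallest_window` scores proximity as the shortest run of tokens that contains every keyword, and `matches` implements must / must-not / any-of filtering. The function names, the simple tokenizer and the two sample texts are all assumptions made for the example.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def smallest_window(tokens, keywords):
    """Length in tokens of the shortest span containing every keyword, or None."""
    wanted = {k.lower() for k in keywords}
    if not wanted:
        return None
    counts, best, left = {}, None, 0
    for right, tok in enumerate(tokens):
        if tok in wanted:
            counts[tok] = counts.get(tok, 0) + 1
        # Shrink from the left while the window still covers all keywords.
        while len(counts) == len(wanted):
            span = right - left + 1
            if best is None or span < best:
                best = span
            lt = tokens[left]
            if lt in counts:
                counts[lt] -= 1
                if counts[lt] == 0:
                    del counts[lt]
            left += 1
    return best

def matches(tokens, must=(), must_not=(), any_of=()):
    """Boolean filter: all of `must`, none of `must_not`, at least one of `any_of`."""
    present = set(tokens)
    if any(k not in present for k in must):
        return False
    if any(k in present for k in must_not):
        return False
    if any_of and not any(k in present for k in any_of):
        return False
    return True

doc_near = tokenize("Speech recognition for Arabic uses acoustic models")
doc_far = tokenize("Speech is studied widely; many years later, recognition improved")
query = ["speech", "recognition"]
print(smallest_window(doc_near, query), smallest_window(doc_far, query))
```

A smaller window suggests the keywords are used together, so articles could be ranked by ascending window size after the boolean filter has been applied.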
I'm looking for a free automatic speech recognition (ASR) system for the Arabic language. Can you help me find one?