Audio Analysis - Science topic

Explore the latest questions and answers in Audio Analysis, and find Audio Analysis experts.
Questions related to Audio Analysis
  • asked a question related to Audio Analysis
Question
2 answers
Transana is software for qualitative research that enables researchers to work on authentic data that is audio- or video-recorded. If anyone has employed this tool, I would be interested in learning more about their experience with it and its effectiveness in managing and analyzing complex qualitative data. Furthermore, the software comes in several versions, so I am confused about which one to use.
Relevant answer
Answer
Dear Mohammed Azam,
Have you employed Transana in your research? If so, which version, please? You mentioned a free version that I have not yet found; do you have access to it? Many thanks, dearest friend.
  • asked a question related to Audio Analysis
Question
11 answers
I would like to get them to make a vocal sound related to a texture. This would have them use their voice to answer the question and I would collect the audio recording to use as research in my paper.
Thank you,
Colm
Relevant answer
Answer
Has anyone used Phonic? https://www.phonic.ai/product/surveys . I would like to use it for speech research, but I don't know anyone who has yet.
  • asked a question related to Audio Analysis
Question
12 answers
Hello everyone,
I am looking for links to Mexican datasets that can be used for classification tasks in machine learning. Preferably, the datasets should have been published in scientific journals.
Thank you for your attention and valuable support.
Regards,
Cecilia-Irene Loeza-Mejía
Relevant answer
Answer
Perhaps this related thread on Vietnamese datasets can serve as a starting point for Mexican datasets: https://www.researchgate.net/post/Can_you_show_me_Vietnamese_datasets
  • asked a question related to Audio Analysis
Question
3 answers
I have a project in which I have been given a dataset (more than enough) of 10-20 second audio files (singing these "swar" / "ragas": "sa re ga ma pa") without any labels. I have to create a deep learning model that recognises which swar is sung and for how long it is present in the audio clip (the time range of a particular swar: sa, re, ga, ma).
The questions I am looking to answer are:
1. How can I achieve my goal? Should I use an RNN, CNN, LSTM, or hidden Markov model, or something else such as unsupervised learning for speech recognition?
2. How do I get the correct speech tone for an Indian language, as most acoustic speech recognition models are tuned for English?
3. How do I find the time range, i.e., for what range a particular sound with a particular swar is present in the music clip, and how do I combine that time-range recognition with the speech recognition model?
4. Are there any existing music recognition models that resemble my research topic? If yes, please tag them.
I am looking for a full guide for this project as it is completely new to me, and people who are interested in working with me or guiding me are also welcome.
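Not a full answer, but a minimal sketch of one possible baseline for questions 1 and 3, using frame-level pitch tracking (librosa's pyin) and mapping each voiced frame to the nearest swar. The tonic (Sa) frequency and the just-intonation ratios are assumptions you would need to calibrate for your recordings:
```python
# Sketch: frame-level pitch -> nearest swar -> contiguous time ranges.
# Assumes librosa; tonic_hz and the interval ratios are placeholders.
import numpy as np
import librosa

SWAR_RATIOS = {"sa": 1.0, "re": 9/8, "ga": 5/4, "ma": 4/3, "pa": 3/2}

def swar_segments(path, tonic_hz=240.0):
    y, sr = librosa.load(path, sr=None)
    f0, voiced, _ = librosa.pyin(y, fmin=80, fmax=800, sr=sr)
    hop_s = 512 / sr                          # librosa's default hop length
    names = list(SWAR_RATIOS)
    ratios = list(SWAR_RATIOS.values())
    labels = []
    for v, f in zip(voiced, f0):
        if not v or np.isnan(f):
            labels.append(None)
            continue
        dists = [abs(np.log2(f / (tonic_hz * r))) for r in ratios]
        labels.append(names[int(np.argmin(dists))])
    # collapse consecutive identical labels into (swar, start_s, end_s)
    segments, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            if labels[start] is not None:
                segments.append((labels[start], start * hop_s, i * hop_s))
            start = i
    return segments
```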
  • asked a question related to Audio Analysis
Question
3 answers
Hello everyone,
I am looking for links to audio datasets of indigenous Mexican languages that can be used for classification tasks in machine learning.
Thank you for your attention and valuable support.
Regards,
Cecilia-Irene Loeza-Mejía
Relevant answer
  • asked a question related to Audio Analysis
Question
7 answers
Hello everyone,
I am looking for links to audio datasets that can be used for classification tasks in machine learning. Preferably, the datasets should have been published in scientific journals.
Thank you for your attention and valuable support.
Regards,
Cecilia-Irene Loeza-Mejía
Relevant answer
Answer
Hi, I would recommend this website for checking emotional stimuli datasets https://airtable.com/shrnVoUZrwu6riP9b/tbljKUnVvikhzaNvF/viwlo7OvlHBG2q88P?blocks=hide
"KAPODI - the searchable database of free emotional stimuli sets."
Best regards,
Diogo Branco
  • asked a question related to Audio Analysis
Question
55 answers
This Project continues as an examination of ALL-PASS Band-Pass circuits.
Project Paper : updated Feb 01, 2022
"Analog Phase-Filtering
in Active-Band-Pass Circuits"
emphasizing the use of All-Pass filters.
- - - Here, we continue our earlier "AFX" Project, which was presented in RGN at :
.......
Introduction for the "AFC" project :
...
We examine the "ALL-PASS-FILTER" and develop an Analog Narrow-Band-Pass Audio Filter, which has immediate application in receiving Morse Code signals in an Amateur Radio Station.
...
Our resulting model is an experiment to gather this data.
A Proper Analysis of this design may aid in understanding the nature of All-Pass Filtering. Once an adequate system equation is achieved, then resulting models may be useful in designing Band-Pass Filters for Audio applications which can be based on Non-Resonant Phase-Filtered circuits, similar to our "AFX" design.
...
Theory:
All-Pass (phase-shifting) filters have frequency responses which must be "zero at w=0 and at w=pi". From the research, this means that All-Pass Filters cannot be used for (1) Low-Pass, (2) High-Pass, or (3) Band-Pass designs. This is because the resulting combinations of waveforms are homogeneous; i.e., the combinations are always simple phase shifts,
producing no frequency or amplitude changes. ... *** The authors have developed working Dual-Notch Band-Pass circuits which (1) perform a BAND-PASS function peaked at f(0) = 700 Hz, and (2) generate DUAL-NOTCHES around f(0) at plus/minus approx. 200 Hz. The current All-Pass project is titled : "AFC"
...
*** First Experimental Target : (1) Utilize All-Pass stages to replace resonance tuned Active-BandPass stages.
(2) Reduce Number of MFB active filter stages required to Align Signal Phases (a) in order to support Dual-Notch Generation around f(0) ; (b) in support of our previous project "AFX" "AFV-3RL-v4F-D-vQ-Man".
... Continued Project now uses the Schematic in the groups:
AFC_1R-1A-12A-2F-Sum-S-451 and AFC-3R-2F-8A-Dif-S-451 .
The Bode plot and Magnitude plot are in the pre-paper.
...
The Problem to be resolved is why this design, (1) using one All-Pass Lo-Pass paralleled with twelve All-Pass Hi-Pass Filters, (2) produces the Wave-Form Output seen in the Bode plot; that is, why do one APF Lo-Pass and twelve APF Hi-Pass in parallel interact in such an unfamiliar manner?
...
This "AFC" project is derived from our previous "AFX" project
...
Our long series of projects in Analog Narrow Band-Pass Filters has been presented on our website at : http://www.geocities.ws/glene77is/
...
2021 Oct 12 ...This Project continues as an examination of ALL-PASS Band-Pass circuits. ...This "AFC" project is derived from our previous "AFX" project https://www.researchgate.net/post/Are-there-any-Analog-Active-Audio-Filters-that-match-any-Digital-Signal-Processing-filters. ...
Latest upload: 2021 Oct26
We have a paper attached : "AFC_All-Pass_Phase-Filter_Paper.pdf"
...
Latest upload: 2021 Nov 29
"AFC_All-Pass_Phase-Filter_Proj-211129-0502"
...
Relevant answer
Answer
2021 Oct 04 ...
This Project continues as an examination of All-Pass Band-Pass circuits.
.....We have a Pre-Print attached :
AFC_All-Pass-Phase-Filter-project-211004-1558.pdf
  • asked a question related to Audio Analysis
Question
4 answers
Hi everyone,
My teammates and I want to find out whether there is a way to do (remote) scientific collaboration in the field of Machine Learning/Deep Learning on speech recognition and audio analysis. The goal is simply to learn and to become members of a project together.
Thanks in advance.
Relevant answer
Answer
Please have a look at our (Eminent Biosciences (EMBS)) collaborations, and let me know if you are interested in associating with us.
Our recent publications are in collaboration with industries and academia in India and worldwide.
EMBS publication In association with Universidad Tecnológica Metropolitana, Santiago, Chile. Publication Link: https://pubmed.ncbi.nlm.nih.gov/33397265/
EMBS publication In association with Moscow State University , Russia. Publication Link: https://pubmed.ncbi.nlm.nih.gov/32967475/
EMBS publication In association with Icahn Institute of Genomics and Multiscale Biology,, Mount Sinai Health System, Manhattan, NY, USA. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/29199918
EMBS publication In association with University of Missouri, St. Louis, MO, USA. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/30457050
EMBS publication In association with Virginia Commonwealth University, Richmond, Virginia, USA. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/27852211
EMBS publication In association with ICMR- NIN(National Institute of Nutrition), Hyderabad Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/23030611
EMBS publication In association with University of Minnesota Duluth, Duluth MN 55811 USA. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/27852211
EMBS publication In association with University of Yaounde I, PO Box 812, Yaoundé, Cameroon. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/30950335
EMBS publication In association with Federal University of Paraíba, João Pessoa, PB, Brazil. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/30693065
Eminent Biosciences(EMBS) and University of Yaoundé I, Yaoundé, Cameroon. Publication Link: https://pubmed.ncbi.nlm.nih.gov/31210847/
Eminent Biosciences(EMBS) and University of the Basque Country UPV/EHU, 48080, Leioa, Spain. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/27852204
Eminent Biosciences(EMBS) and King Saud University, Riyadh, Saudi Arabia. Publication Link: http://www.eurekaselect.com/135585
Eminent Biosciences(EMBS) and NIPER , Hyderabad, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/29053759
Eminent Biosciences(EMBS) and Alagappa University, Tamil Nadu, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/30950335
Eminent Biosciences(EMBS) and Jawaharlal Nehru Technological University, Hyderabad , India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/28472910
Eminent Biosciences(EMBS) and C.S.I.R – CRISAT, Karaikudi, Tamil Nadu, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/30237676
Eminent Biosciences(EMBS) and Karpagam academy of higher education, Eachinary, Coimbatore , Tamil Nadu, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/30237672
Eminent Biosciences(EMBS) and Ballets Olaeta Kalea, 4, 48014 Bilbao, Bizkaia, Spain. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/29199918
Eminent Biosciences(EMBS) and Hospital for Genetic Diseases, Osmania University, Hyderabad - 500 016, Telangana, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/28472910
Eminent Biosciences(EMBS) and School of Ocean Science and Technology, Kerala University of Fisheries and Ocean Studies, Panangad-682 506, Cochin, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/27964704
Eminent Biosciences(EMBS) and CODEWEL Nireekshana-ACET, Hyderabad, Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/26770024
Eminent Biosciences(EMBS) and Bharathiyar University, Coimbatore-641046, Tamilnadu, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/27919211
Eminent Biosciences(EMBS) and LPU University, Phagwara, Punjab, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/31030499
Eminent Biosciences(EMBS) and Department of Bioinformatics, Kerala University, Kerala. Publication Link: http://www.eurekaselect.com/135585
Eminent Biosciences(EMBS) and Gandhi Medical College and Osmania Medical College, Hyderabad 500 038, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/27450915
Eminent Biosciences(EMBS) and National College (Affiliated to Bharathidasan University), Tiruchirapalli, 620 001 Tamil Nadu, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/27266485
Eminent Biosciences(EMBS) and University of Calicut - 673635, Kerala, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/23030611
Eminent Biosciences(EMBS) and NIPER, Hyderabad, India. ) Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/29053759
Eminent Biosciences(EMBS) and King George's Medical University, (Erstwhile C.S.M. Medical University), Lucknow-226 003, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/25579575
Eminent Biosciences(EMBS) and School of Chemical & Biotechnology, SASTRA University, Thanjavur, India Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/25579569
Eminent Biosciences(EMBS) and Safi center for scientific research, Malappuram, Kerala, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/30237672
Eminent Biosciences(EMBS) and Dept of Genetics, Osmania University, Hyderabad Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/25248957
EMBS publication In association with Institute of Genetics and Hospital for Genetic Diseases, Osmania University, Hyderabad Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/26229292
Sincerely,
Dr. Anuraj Nayarisseri
Principal Scientist & Director,
Eminent Biosciences.
Mob :+91 97522 95342
  • asked a question related to Audio Analysis
Question
3 answers
I am trying to build a voice cloning model. Is there some scripted text I should use for the purpose, or should I record arbitrary speech?
What should the length of the audio be, and are there any model suggestions that are fast or accurate?
Relevant answer
Answer
Text-to-speech synthesis is a problem that has applications in a wide range of scenarios. It can be used to read PDFs aloud, help the visually impaired interact with text, make chatbots more interactive, etc. Historically, many systems were built to tackle this task using signal processing and deep learning approaches. In this article, let's explore a novel approach to synthesizing speech from text presented by Ye Jia, Yu Zhang, Ron J. Weiss, Quan Wang, Jonathan Shen, Fei Ren, Zhifeng Chen, Patrick Nguyen, Ruoming Pang, Ignacio Lopez Moreno and Yonghui Wu, researchers at Google, in a paper published on 2nd January 2019.
Regards,
Shafagat
  • asked a question related to Audio Analysis
Question
18 answers
The work of George and Shamir describes a method that uses the spectrogram as an image for classifying audio records. The method described is interesting, but the results seemed to me somewhat fitted to the chronology rather than to the spectrogram properties themselves. The spectrogram gives limited information about the audio signal, but is it enough for a classification method?
Relevant answer
Answer
A spectrogram presents not only the frequency content of the signal but also its energy. The spectrogram's vertical axis represents frequency, with the lowest frequencies at the bottom and the highest at the top, while the horizontal axis represents time, running from left to right. The colors enrich the spectrogram as its third dimension: different colors represent different energy levels. So if you combine it with a classifier such as a CNN, which is mainly an image classifier, it might give good results.
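For illustration, a minimal Python sketch of this spectrogram-as-image pipeline, assuming librosa and PyTorch are available; the file paths, clip length and class count are placeholders:
```python
# Sketch: classify audio clips by treating log-mel spectrograms as images.
import librosa
import numpy as np
import torch
import torch.nn as nn

def clip_to_melspec(path, sr=22050, n_mels=64, duration=2.0):
    """Load a fixed-length clip, return a log-mel spectrogram (1, n_mels, T)."""
    y, _ = librosa.load(path, sr=sr, duration=duration)
    y = librosa.util.fix_length(y, size=int(sr * duration))   # pad/trim
    S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    S_db = librosa.power_to_db(S, ref=np.max)                 # the "image"
    return torch.tensor(S_db, dtype=torch.float32).unsqueeze(0)

class SpecCNN(nn.Module):
    """Tiny image-style CNN over the spectrogram."""
    def __init__(self, n_classes):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        self.head = nn.Linear(32 * 4 * 4, n_classes)

    def forward(self, x):                     # x: (batch, 1, n_mels, T)
        return self.head(self.features(x).flatten(1))
```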
  • asked a question related to Audio Analysis
Question
5 answers
Hi guys,
is there any option in the AVISOFT SASLab Pro software that enables you to eliminate unwanted noise from a digital recording without affecting your original sound? In my case, sounds are recorded in experimental tanks with a hydrophone connected to a digital audio recorder. The lab is full of low-frequency noise, which in some proportions disrupts my sound of interest. If I high-pass filter the recording, there is still noise that is not eliminated and that overlaps with the frequency spectrum of the sound.
Any advice would be helpful.
Relevant answer
Answer
Avisoft SASLab Pro 5.2 has built-in low-pass, high-pass, notch and band-pass filters, which are critical for eliminating unwanted noise. Open the software, then choose the Edit menu. Under the Edit menu, select Filter, and under Filter choose the Time Domain IIR or FIR Filter. If it is background noise, you may record the room tone and filter it out. Similarly, the Avisoft UltraSoundGate allows you to attenuate unwanted signals.
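If you also want to experiment outside Avisoft, here is a minimal SciPy sketch of a zero-phase high-pass filter; the cutoff is a placeholder you would set just above the lab's low-frequency noise band:
```python
# Sketch: zero-phase Butterworth high-pass with SciPy.
import numpy as np
from scipy.io import wavfile
from scipy.signal import butter, sosfiltfilt

def highpass_wav(path_in, path_out, cutoff_hz=500.0, order=6):
    sr, x = wavfile.read(path_in)
    x = x.astype(np.float64)
    sos = butter(order, cutoff_hz, btype="highpass", fs=sr, output="sos")
    y = sosfiltfilt(sos, x, axis=0)           # zero-phase: no smeared onsets
    y = np.clip(y, -32768, 32767)
    wavfile.write(path_out, sr, y.astype(np.int16))
```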
  • asked a question related to Audio Analysis
Question
9 answers
[Tell us about the issues that you had while developing impedance tubes]
[The issue that I had has been solved, but I did not manage to fully understand why, since I changed both my measurement system and my data analysis script.]
[Nevertheless, I would like you to tell us about the issues that arose while developing an impedance tube, since this could provide reference information for other researchers.]
I developed an impedance tube to measure the sound absorption coefficient with the ISO 10534-2 method. While processing the reflection coefficient and absorption coefficient from transfer function data or audio files (obtained using ARTA, or audio recordings of white noise, sine sweeps, or MLS inside the tube; the transfer function is in dB), I obtained negative reflection coefficients or data out of common bounds (see images). Any ideas on possible sources of error, or on necessary preprocessing of the signals or the transfer function?
Complex reflection coefficient: R = [(H - e^(-j k s)) / (e^(j k s) - H)] * e^(2 j k (L+s))
Absorption coefficient: alpha = 1 - |R|^2
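For reference, a minimal NumPy sketch of evaluating these two formulas; it assumes H is the complex (linear, not dB) transfer function between the microphones, s the microphone spacing and L the distance from the sample to the nearer microphone. A dB magnitude alone is not sufficient here, since the phase of H is required:
```python
# Sketch of the ISO 10534-2 transfer-function evaluation with NumPy.
import numpy as np

def reflection_absorption(H, f, s, L, c=343.0):
    k = 2 * np.pi * np.asarray(f) / c                 # wavenumber per bin
    R = (H - np.exp(-1j * k * s)) / (np.exp(1j * k * s) - H) \
        * np.exp(2j * k * (L + s))
    alpha = 1.0 - np.abs(R) ** 2                      # should lie in [0, 1]
    return R, alpha
```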
Relevant answer
Answer
I have developed an impedance tube based on ASTM E1050.
The design parameters and MATLAB code are attached as JPGs. Hope this will help you.
  • asked a question related to Audio Analysis
Question
9 answers
It is well known that audio compression (e.g., MP3, AAC) usually processes the audio data frame by frame. However, I am curious about the feasibility of single-frame processing.
A commonly accepted notion is that frame-based processing preserves time resolution of the audio data while single-frame processing does not. This is similar to comparing the DFT and the STFT.
However, why do we need time resolution of the audio signal during compression? For a given audio clip, its single-frame FFT has very fine frequency resolution (a huge number of points) and no time resolution. We can still calculate tonal and non-tonal elements and masking curves, and generate quantization indices, etc. In this way, the modification of any frequency bin will be reflected throughout the time domain, wherever that frequency appears along the time axis in the compressed time-domain audio samples.
I personally do not see any potential problems with performing single-frame compression as described above. The only problem I can imagine concerns the hardware implementation of huge DCT sizes. But the computational complexity of the FFT is O(n log n), which approaches a linear function of n when n is large, so I do not see this as a big problem given rapidly developing computer capabilities.
Please help to point out my mistakes in the above statements.
Relevant answer
You want to increase the size of the audio frame to contain the whole audio message. At first sight, I think such a question needs investigation to answer properly.
But I think the frame size is determined by the latency of the transmission system: all the signal processing must be carried out within that latency time. Another important point is that the transmission medium may be dynamic, such that decoding may become very complicated or even impossible if we increase the frame size. I think the size of the frames is dictated more by the more complex decoding process than by the coding process. The transmission medium also imposes restrictions on the frame size.
The cost of the processing, even if it increases only proportionally to the size of the frame, must also be considered, as one cannot use powerful computing platforms for all communication equipment.
Another important point is that an audio signal is not continuous in nature but contains interruptions.
I think the optimum frame size varies among applications.
Best wishes
  • asked a question related to Audio Analysis
Question
5 answers
Hello everyone. I am working on audio analysis for emotion classification. I am using Parselmouth (a Praat integration in Python) to extract features. I am not well versed in audio analysis; I am a beginner. After reading many papers and forums, I see that MFCCs are used for this, and I have also discovered some other features (jitter, shimmer, HNR, F0, zero-crossing rate): are they used for this work?
What do I have to do with the audio files before extracting MFCCs and these features?
After getting these features, I have to predict emotion using machine learning.
It will involve:
- The algorithm must be able to make predictions in real time or near real time
- Taking into account the sex and the neutral voice of each person (for example, by reducing and centering the model variables to consider only their variations with respect to the mean, an average that will change value as the sequential analysis proceeds, since it will first be calculated between 0 and 1 second, then 0 and 2 seconds, etc.)
Any help and suggestion for best practice are welcome.
Thanks
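For reference, a minimal Parselmouth sketch for extracting the features mentioned above; the file name and the pitch bounds (75-600 Hz) are placeholders, and the commands are the standard Praat ones:
```python
# Sketch: jitter, shimmer, HNR, F0 and MFCCs with Parselmouth.
import parselmouth
from parselmouth.praat import call

snd = parselmouth.Sound("clip.wav")
pitch = snd.to_pitch()
f0_mean = call(pitch, "Get mean", 0, 0, "Hertz")

points = call(snd, "To PointProcess (periodic, cc)", 75, 600)
jitter = call(points, "Get jitter (local)", 0, 0, 0.0001, 0.02, 1.3)
shimmer = call([snd, points], "Get shimmer (local)",
               0, 0, 0.0001, 0.02, 1.3, 1.6)
hnr = call(snd.to_harmonicity(), "Get mean", 0, 0)

mfcc = snd.to_mfcc(number_of_coefficients=12).to_array()   # (13, n_frames)
print(f0_mean, jitter, shimmer, hnr, mfcc.shape)
```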
Relevant answer
Answer
On the recognition of emotions from audio recordings:
it is necessary to identify the measurable cognitive precursors that can induce emotion. They are essential, since the innocence with which the subject faces the stimulus carries the emotional potential.
"To be or not to be, that is the question," says Shakespeare in the famous monologue of Hamlet.
As for the emotions, one could add:
knowing or not knowing, that is the emotion.
  • asked a question related to Audio Analysis
Question
3 answers
I would like to outsource the transcription of interviews (around 20-30h of audio recording/IDI). Do you have any experience with Polish companies in that field? Can you recommend any?
I would be really grateful :)
Relevant answer
Answer
It's a bit late but you may wish to use
<info@e-transkrypcje.pl> in the future - very quick and efficient service.
  • asked a question related to Audio Analysis
Question
2 answers
I came across a few links that looked promising, but they are no longer active.
Relevant answer
Answer
Thank you so much, Diana! It's difficult to know how these things are done when you are just getting started. I was able to get in contact with him. I appreciate your answer!
  • asked a question related to Audio Analysis
Question
3 answers
What types of models, analytics, and data science are used by call center companies where a large number of calls are made every day?
Does it involve audio analytics, or speech-to-text conversion followed by text analysis? Which approach is better, and what are the pros and cons of each?
Any suggestion, discussion or reply is appreciated. Thanks in advance!
Relevant answer
Answer
Hi Mayur, here is an article that might serve as a starting point:
  • asked a question related to Audio Analysis
Question
1 answer
I have two frequency spectra as shown in the attached picture. The shapes of the peaks are similar, but they are slightly shifted in frequency. I want to match the frequencies with similar peaks.
I tried the DP matching algorithm and backtraced the least-cost path to find the frequencies that are most similar; I have attached that image too. I was intending to insert/delete/replace the amplitudes of these matching frequencies so that a score could be calculated between the two spectra that is not influenced by these peaks. But looking at the least-cost-path output, there are some one-to-many mappings between the features (especially from the reference to the test pattern) which I do not know how to interpret.
Is it possible to extract features that have similar peaks, as in the first picture? If DP matching can do it, how do I apply the method?
Thank you.
Relevant answer
Answer
Call the frequencies f for the test pattern and g for the reference pattern.
You have a fundamental peak (absolute max) at f(0), resp. g(0). Then you have a second peak at f(1), g(1), a third at f(2), g(2), a fourth at f(3), g(3).
I would study df(i) = f(i+1) - f(i) and dg(i) = g(i+1) - g(i).
It would also be interesting to see the energies under each peak. Define f'(i) the local minimum following f(i) and preceding f(i+1) and g'(i) similarly.
Then you define S(i) as the surface under the curve between f(i) (the max) and f'(i) (the min), and T(i) the surface below the curve between g(i) and g'(i).
Is there anything which one can say about S(i)/T(i), as i varies?
Is there anything interesting for the values of S(i+1)/S(i) and those of T(i+1)/T(i)?
More pragmatically, looking at the f(i), i = 0, 1, 2, 3, it looks like a harmonic spectrum, and the same for g(i).
Is it perfectly harmonic, drifting away, or oscillating around harmonic?
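A minimal SciPy sketch of this peak-spacing analysis, assuming both spectra are NumPy arrays on a shared frequency axis; the prominence threshold is a placeholder:
```python
# Sketch: extract peak frequencies and their spacings df(i) = f(i+1) - f(i).
import numpy as np
from scipy.signal import find_peaks

def peak_spacings(freqs, magnitude, prominence=0.1):
    idx, _ = find_peaks(magnitude, prominence=prominence)
    f = freqs[idx]                  # f(0), f(1), ... in the notation above
    return f, np.diff(f)

# f_test, df_test = peak_spacings(freqs, test_spectrum)
# f_ref,  df_ref  = peak_spacings(freqs, ref_spectrum)
# A near-constant df suggests a harmonic series; compare df_test with df_ref.
```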
  • asked a question related to Audio Analysis
Question
3 answers
I want to compare “neutral” baseline data with data recorded in a test session to finally be able to evaluate arousal/affect of the infant.
Which software would you recommend? Do you have any literature advice?
Any advice would be appreciated!
All the best
Sam
Relevant answer
Answer
Do you know this paper?
Shigeaki Amano, Tadahisa Kondo, and Sachiyo Kajikawa (2001). Analysis on infant speech with longitudinal recordings. The Journal of the Acoustical Society of America, 110, 2703.
  • asked a question related to Audio Analysis
Question
1 answer
Hi,
Does anyone have experience in synchronizing audio and video recording using a single DAQ device?
In my setup I am using two devices, one for video recording (a Point Grey camera, FL3-U3-13S2M-CS, https://www.ptgrey.com/flea3-13-mp-color-usb3-vision-sony-imx035-camera) and one for audio recording of rat USVs (UltraSoundGate 416H, http://www.avisoft.com/usg/usg416h.htm, 4 microphone channels). They are connected to the same computer but started by two different software programs. What I am trying to do is find a way to start both recordings at the same time so as to synchronize the two data streams (video and sound). The goal is to know precisely when sounds occur during the video.
I am completely new to this kind of task and to the field of data acquisition, so any help is truly appreciated.
- Do you think that connecting both recording systems to the same DAQ device will allow me to solve this issue? If so, once both systems are connected to the DAQ, can I start simultaneous recording? How?
- What type of DAQ device would be better for this task? Can you give me some suggestions?
- What method of synchronization should be performed? I read about start-trigger synchronization and sample-clock synchronization, but I am not sure which of them I need to use.
- Once the recording has been done, will I have two different files as output (one for the audio and one for the video)?
Please tell me if you need more information.
Thank you very much.
Relevant answer
Answer
NB: I have not done video recording, but... if you did connect them to the same DAQ device, the clocking should be synchronized anyway. You could just queue and start both when you start your session in the DAQ; then they should be synchronized, and you just collect data (in whatever format you specify) from the session. Have a look at the DAQmx and DAQ toolbox documentation in MATLAB for more ideas.
  • asked a question related to Audio Analysis
Question
5 answers
I have experimentally recorded the sound pressure levels (SPLs) of a horn; the SPLs have also been obtained through simulation in LMS. But the output of LMS is a spectrum in an Excel file. I want to convert this Excel data into an audible sound to carry out psychoacoustic characterisation. How can I do this in MATLAB or with any other available resource?
Relevant answer
Answer
Sorry if I wasn't clear enough. On one hand, what I was saying (which is pretty much what Amaya said again at the beginning of her answer) is that using only sound pressure level data, it is not possible to reconstruct the original sound signal.
On the other hand, if you extract h(t) (the impulse response) using the LMS technique, what you get is the response of the system in the frequency domain; there is no original audio per se. You would have to create a signal based on the response of the system: frequency components, relative amplitudes and, just as important, phase information, which is not present in the magnitude spectrum. The result would be very similar to the sound produced by the horn; differences will be observed according to the excitation used to produce the sound.
Regards
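As a rough illustration of that last point, here is a minimal NumPy sketch that makes a magnitude spectrum audible by assuming phases (random here), which is exactly the information the SPL data does not contain:
```python
# Sketch: synthesize a WAV from (frequency, level-in-dB) pairs with assumed phases.
import numpy as np
from scipy.io import wavfile

def spectrum_to_wav(freqs_hz, levels_db, sr=44100, seconds=3.0, out="horn.wav"):
    t = np.arange(int(sr * seconds)) / sr
    amps = 10.0 ** (np.asarray(levels_db) / 20.0)     # dB -> linear amplitude
    rng = np.random.default_rng(0)
    y = np.zeros_like(t)
    for f, a in zip(freqs_hz, amps):
        y += a * np.sin(2 * np.pi * f * t + rng.uniform(0, 2 * np.pi))
    y /= np.max(np.abs(y))                            # normalize, avoid clipping
    wavfile.write(out, sr, (y * 32767).astype(np.int16))
```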
  • asked a question related to Audio Analysis
Question
12 answers
I am seeking your responses for my research project. I am interested in your voice as it relates to Indigenous communities. Any help will be greatly appreciated.
Relevant answer
Answer
Knowledge is never neutral. An individual's location within the social structure conditions his/her access to knowledge. This implies that there is a knowledge hierarchy and what you are told is what they want you to know. The custodians of local knowledge can grant access and therefore consent only if they want.
  • asked a question related to Audio Analysis
Question
2 answers
LibSVM
Relevant answer
Answer
Your question is missing details of your task. What are you planning to do with the SVM; what are your targets?
If your goal is to classify the whole audio file, then an SVM cannot be used directly, because the number of frames differs from one file to another. You could start by choosing sliding windows of fixed size composed of several frames, and then either do majority voting or use a sequence classifier like an HMM.
If your goal is sequence labelling (each frame has some class associated with it), then you directly define your feature matrix composed of the frame MFCCs, possibly with some context.
Please provide more details if you need further help.
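A minimal sketch of the fixed-window plus majority-voting idea, using librosa and scikit-learn rather than LibSVM itself; the window size and file names are placeholders:
```python
# Sketch: fixed-size windows of MFCC frames -> SVM -> majority vote per file.
import numpy as np
import librosa
from sklearn.svm import SVC

def file_to_windows(path, n_mfcc=13, frames_per_window=50):
    y, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)        # (n_mfcc, T)
    T = mfcc.shape[1] // frames_per_window * frames_per_window
    return mfcc[:, :T].T.reshape(-1, frames_per_window * n_mfcc)  # fixed size

# clf = SVC().fit(np.vstack(train_windows), train_window_labels)
# votes = clf.predict(file_to_windows("clip.wav"))
# file_label = np.bincount(votes.astype(int)).argmax()            # majority vote
```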
  • asked a question related to Audio Analysis
Question
4 answers
Dear all,
I need to automatically analyse conversational features such as the amount of time each person speaks, the amount of overlapping speech, the number of interruptions, who speaks louder, and so on.
I have a separate audio file for each participant (containing only his/her voice). How can I analyse these features automatically? Is there any tool that eases such analysis?
Thanks in advance
Relevant answer
Answer
If you have separate files for each participant, you may want to perform Voice Activity Detection (VAD) on each signal. This will tell you when each participant is speaking. You would then look for overlaps and the other information you need by comparing the active times.
For the loudness of each speaker, you could analyze the energy contour of each signal.
If the signals were recorded in a quiet environment and there is not much interference between speakers, this should not be that hard.
A good tool to perform these tasks is Praat.
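A minimal energy-based sketch of this idea in Python, assuming one time-aligned mono file per participant and librosa available; the -35 dB threshold is a placeholder:
```python
# Sketch: energy-based VAD per speaker, then speaking time and overlap.
import numpy as np
import librosa

def active_mask(path, sr=16000, hop=512, db_floor=-35.0):
    y, _ = librosa.load(path, sr=sr)
    rms = librosa.feature.rms(y=y, hop_length=hop)[0]
    return librosa.amplitude_to_db(rms, ref=np.max) > db_floor   # True = speech

a, b = active_mask("speaker_a.wav"), active_mask("speaker_b.wav")
n, hop_s = min(len(a), len(b)), 512 / 16000
print("A speaks:", a[:n].sum() * hop_s, "s")
print("B speaks:", b[:n].sum() * hop_s, "s")
print("overlap:", (a[:n] & b[:n]).sum() * hop_s, "s")
```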
  • asked a question related to Audio Analysis
Question
5 answers
Should the mean amplitude be the average of SPL values taken at regular intervals?
Relevant answer
Answer
In noise control it is usual to take the (energy) equivalent sound level (Leq) as the most important variable. Most sound level meters can determine this value directly: in effect, they integrate p^2 (sound pressure squared) over time and divide by the total measurement time.
If you have a series of sound level samples Lsi, you can determine the energetic average by summing 10^(Lsi/10) over all samples, dividing by the number of samples, taking log10 and multiplying by 10: Leq = 10*log10((1/N) * sum_i 10^(Lsi/10)).
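A minimal NumPy sketch of this energetic average; the level samples are placeholders:
```python
# Sketch: energy-equivalent level from sampled SPL values.
import numpy as np

Lsi = np.array([62.0, 65.5, 70.1, 58.3])           # sampled levels in dB
Leq = 10 * np.log10(np.mean(10 ** (Lsi / 10)))     # energetic average
print(round(Leq, 1), "dB")   # ~66.0 dB, higher than the arithmetic mean (64.0)
```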
  • asked a question related to Audio Analysis
Question
7 answers
I am wondering if there is a comprehensive review on feature construction, selection and classification for audio classification tasks (not necessarily music classification).
I am more interested in a problem where I am recording audio from a machine; if there is a fault in the machine, can I pick that up automatically using audio classification?
Thanks!
Sumeet
Relevant answer
Answer
Hi,
I have done some research in this area with my M.Sc. students. My research is divided into two main topics: condition monitoring and Music Information Retrieval (MIR).
Usually the phrase "acoustic condition monitoring" is used, but in reviews I received, some reviewers considered "acoustic" to refer to ultrasound signals only! The term "audio" makes them more comfortable if you are considering signals within the human hearing range. My reported work on condition monitoring is mainly focused on automobile engine fault diagnosis.
You may find a number of papers among my contributions on my ResearchGate page if you are interested.
Regards
Peyman
  • asked a question related to Audio Analysis
Question
3 answers
Compression has been used for audio and video files since the early days. How can it be handled efficiently in a database?
Relevant answer
Answer
Compressing a file without changing its bit rate is better.
  • asked a question related to Audio Analysis
Question
1 answer
The following are the details of the wave file,
sampling rate : 48 kHz
16-bit, 10-second duration, 480,000 samples, encoded in PCM format.
Relevant answer
Answer
48 kHz is way above the auditory threshold. I have no experience with ultrasound.
  • asked a question related to Audio Analysis
Question
15 answers
I am trying to reconcile some issues about the sounds bumblebees make while flying and while sonicating pollen from anthers. This link is interesting, with good recording quality: https://www.youtube.com/watch?v=yrjLZ_UYUl4 Now the quandary: it is sometimes said, sometimes with great authority, that the sound bumblebees make while sonicating anthers is middle C (C4) at 262 Hz. It is also said, again sometimes with great authority, that the wing-beat frequency of a bumblebee is 200 Hz. That would translate to a sound of 400 Hz (one compression on each of the upstroke and downstroke of the wing), which is close to A4 (440 Hz) on a piano. The sonication vibration from the thorax of a worker of Bombus impatiens has been recorded by vibrometer at about 350 Hz, but does that translate to an F4 as a sound? It is clear, even to my ear, that the flight sound is at a much lower pitch than the sonication sound. Thus, there is something wrong with some of the conventional ideas about the sounds that bumblebees make. Perhaps one of our musically adept entomologists can listen to the sounds on the link and suggest clarifications as to the sounds (notes and Hz) and wing and/or thoracic vibrations. Thank you, all. Peter
Relevant answer
Answer
Dear Peter,
A few years ago, Andreas Burkart, Clemens Schlindwein and I did a comparative study of buzzing frequency in relation to flight frequency (PDF attached) in neotropical bees, including one bumblebee. The rule that the buzzing frequency is twice as high as the flight frequency holds only for large bees: the smaller the bee, the higher the buzzing frequency and the more similar the buzzing and flight frequencies.
Klaus
  • asked a question related to Audio Analysis
Question
4 answers
I am working in steganography and I want the top algorithms in the audio domain. I want to improve security; capacity is not important.
Relevant answer
Answer
study this
  • asked a question related to Audio Analysis
Question
3 answers
I want to know how to measure the frequency and intensity of cricket chirps efficiently. What do you think is the best way to do so, and is a sound meter enough?
Relevant answer
Answer
This is a very interesting area of research. My neighbours at the University of Bristol (not far from Bath) have a complete research unit working on this; they have published several articles explaining how to do different types of measurements, and their web page will surely be of help:
I visited their research facilities a few years ago and was impressed with the quality of their experimental setup and what they could get in terms of results. 
Have a look at their publications, and all the best with your research!
Philippe
  • asked a question related to Audio Analysis
Question
12 answers
I want to select an optimal window for the STFT for different audio signals. For a signal with frequency content from 10 Hz to 300 Hz, what would be an appropriate window size? Similarly, for a signal with frequency content from 2000 Hz to 20000 Hz, what would be the optimal window size?
I know that a window size of 10 ms gives a frequency resolution of about 100 Hz. But if the frequency content of the signal lies between 100 Hz and 20000 Hz, is 10 ms an appropriate window size, or should we go for some other window size because of the 20000 Hz frequency content in the signal?
I know the classic "uncertainty principle" of the Fourier Transform. You can either have high resolution in time or high resolution in frequency but not both at the same time. The window lengths allow you to trade off between the two.
Relevant answer
Hi, Shibli Nisar.
If you want to detect 10 Hz in your audio signal, you need at least one period of the sinusoid to fit fully in the selected window, i.e., 100 ms. All frequencies above 10 Hz will also be detected with that window size. For a 2000 Hz signal you would need only a 0.5 ms window.
If your problem is resolution, you will need to increase the number of points used to compute the STFT. The STFT is a discrete representation: the frequency bins are uniformly distributed over the bandwidth of your signal, and the distance between two adjacent bins is Fs/N, where N is the number of FFT points and Fs is the sampling frequency. If Fs is fixed (as usual), to decrease the bin spacing you need more FFT points; with a fixed temporal window size you can achieve this by zero-padding the signal before performing the FFT.
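A minimal NumPy sketch of these trade-offs; the signal and sizes are placeholders, and note that zero-padding refines the bin spacing by interpolation, it does not add true resolution:
```python
# Sketch: window length vs. bin spacing for a 10 Hz component at Fs = 48 kHz.
import numpy as np

Fs, f_low = 48000, 10.0
win_len = int(Fs / f_low)            # one full period of 10 Hz -> 100 ms
N = 2 ** 16                          # zero-padded FFT size
x = np.sin(2 * np.pi * f_low * np.arange(win_len) / Fs)
X = np.fft.rfft(x * np.hanning(win_len), n=N)
print("window:", win_len / Fs, "s; bin spacing:", Fs / N, "Hz")
print("peak near", np.argmax(np.abs(X)) * Fs / N, "Hz")   # ~10 Hz
```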
  • asked a question related to Audio Analysis
Question
3 answers
As monaural auditory thresholds may be different, how can we measure binaural loudness discomfortable levels by presenting pure tones through earphones?
Relevant answer
Answer
Not with pure tones, but very interesting: the Loudness Scaling test!
Both monaural and binaural testing are possible.
  • asked a question related to Audio Analysis
Question
7 answers
Hello everyone. I am doing my thesis, entitled "Filterless Class-D Amplifier". First I used the simple PWM scheme, and I found my output glitchy and out of phase with the input audio signal. When I measured the output frequency, it was smaller than the input signal's frequency (<20 kHz). My goal is to develop a filterless class-D amplifier that amplifies the amplitude while preserving the input frequency (up to 20 kHz).
Relevant answer
Answer
Dear Kevin,
No matter how you do it, you need some filter. It may be acceptable to use the speaker as a filter, but this has implications:
- You need to measure how the speaker filters and behaves for the high frequencies. (impedance and losses)
- You need to take care of EMC. I think the only way this can work, is to mount the amp on the speaker, so there are no radiating wires.
- On the modulation itself, you need to try to have as few unwanted high-frequency components as possible. Some possible techniques here:
  1. Use PWM modulation. PDM is easier (less high frequency needed), but produces more unwanted frequencies.
  2. Use a modulation that generates a lot of common-mode high-frequency noise but relatively little differential-mode noise (as the speaker is, to first order, insensitive to the common mode).
Cheers,
Henri.
  • asked a question related to Audio Analysis
Question
6 answers
I am able to access the transcripts, but I am unable to access the audio files, even on free online corpora webpages. Could anyone tell me how to access both the transcripts and the audio files together?
Relevant answer
Answer
Sir, you can write to John M. Swales, who was instrumental in developing MICASE; he responds to queries. Generally we get access to transcripts only; the audio databases are not shared. There is also Dr. Claudia from Dresden, Germany, who collected a lot of samples from Indian users of English. Her contact is also useful.
  • asked a question related to Audio Analysis
Question
7 answers
I am working with an audio sound profile and I want to analyse the frequency content of that sound. I am using WavePad sound editing software for the frequency analysis, and I have generated a frequency-versus-time graph. But the sound shows multiple frequencies at a time, so I am not able to generate a clean graph, nor to find the frequency range of the sound. Can you tell me how I can analyse these frequencies?
Relevant answer
If what you want is pitch tracking (a graph that shows which note is being played at a particular time), you need something like Melodyne. All the frequency representations mentioned above will give you all the frequency components present in the signal, which is what I think you are getting with WavePad.
  • asked a question related to Audio Analysis
Question
2 answers
Thanks in advance for your replies.
  • asked a question related to Audio Analysis
Question
3 answers
For example, if we want to transmit streaming audio or video, how can we calculate the signal and channel bandwidth?
Relevant answer
Answer
The usual definition of bandwidth is a difference between two frequencies chosen to encompass some frequency region of interest.  For example, a band pass filter has a bandwidth typically described as follows.  Define the center frequency, Fc, as the frequency most readily transmitted by the band pass filter.  Now locate two frequencies, as close as possible to Fc, where the response of the filter is 3 decibels (dB) less than the response at Fc (the so-called "half-power points").  The bandwidth is then the difference between these two frequencies.
Another definition is the difference between the highest and lowest frequency used anywhere in a given signal.
In digital communications, bandwidth is also related to the speed of bit transmission. Whether these bits represent physical computer memory or Shannon/Weaver-style information measures is a matter of choice. For example, a standard audio CD uses 44,100 samples per second to represent audio. Since there are 2 audio channels (for stereo) and 16 bits per sample, the total bandwidth is 2 x 16 x 44,100 = 1,411,200 bits per second. Real communication channels typically have additional requirements for headers and "side information" such as lyrics. This is also called the data transmission rate.
But if the data bits are all zero, the amount of information transmitted through a channel of any capacity can be zero as well.
  • asked a question related to Audio Analysis
Question
2 answers
I want to extract the pitch of many files (<100) using Wavesurfer and the RAPT method. I know it is possible to generate a file with the pitch information by opening the audio file and choosing the Save Data File. But I want to perform that automatically. Does anyone know how to perform this?
Thank you very much.
Relevant answer
Answer
Thank you very much for your suggestion mr. Koch. I'm going to try YAAPT as well, but I still need to test RAPT. For now, I'm going to test it in a few signals.
  • asked a question related to Audio Analysis
Question
14 answers
I am considering doing discourse analysis of presidential speeches as part of a larger research project. I need a transcription from audio files to code the text in Discourse Network Analyzer for later Social Network Analysis.
Which software could do the transcription as accurately as possible? I am a Mac user, by the way. Has anyone experimented with oTranscribe?
Relevant answer
Answer
Hi Paul,
Dragon NaturallySpeaking is suitable software for voice recognition and speech-to-text conversion; however, I am not certain whether it functions well with prerecorded audio.
Please let me know if you could find a better solution.
  • asked a question related to Audio Analysis
Question
6 answers
I have an audio file and also the text data for that audio. I want to map the text to the audio, i.e., to highlight the text in sync with the audio stream.
I do not want to use text-to-speech, as my audio has some background music. (Android)
Relevant answer
Answer
I understand that you are trying to identify speech against a music background. See:
Autocorrelation-Based Features for Speech Representation. Acta Acustica united with Acustica, Vol. 101 (2015), to be published.
With best wishes,
Yoichi Ando
  • asked a question related to Audio Analysis
Question
8 answers
I am currently working on a project which requires me to characterise deviations from a baseline using acoustics. I have already generated the frequency spectra of both signals, but I am having trouble comparing them to see if there is any difference.
Any help?
Relevant answer
Answer
Hi Ambarish,
spectral means and spectral moments, respectively, MIGHT be an idea. What does your baseline look like, and what are your signal's properties? An FFT (spectrogram, e.g. with Praat or MATLAB) of two representative example sounds (I assume you have time-continuous signals, don't you?) might be helpful for advising you further (where are the differences). Best, X.
  • asked a question related to Audio Analysis
Question
9 answers
Because until recently many scientists have not fully appreciated how widespread and important fish sounds are in the marine soundscape, I wonder if sounds produced by fishes that are being preyed upon by cetaceans could be mistaken for cetacean sounds in some, probably rare, cases. Fish often make sounds only under particular conditions, such as when attacked by a predator, so you would only hear that sound in that circumstance, hence the possibility of mistaken identification. To be sure, most fish sounds have much more limited detection ranges than cetacean sounds. But shouldn't scientists reporting new sounds at least consider the possibility?
Relevant answer
Answer
Long ago, sperm whale clicks were called "carpenter fish" sounds.
  • asked a question related to Audio Analysis
Question
2 answers
Hi all,
I need some help using WEKA. The study I am conducting investigates whether musical features of a song (such as tempo or key) can predict whether the song will end up high or low in the charts. A logistic regression and discriminant analyses were conducted. In the next part of my study, I want to split the file on key (major vs. minor) and see whether the other musical features can predict chart position when the data are split on key. In SPSS, splitting the file on key was easy, but how can I also do this in WEKA? What I am trying to figure out with this analysis is how, for songs in a major key, the other musical features predict whether a song ends up high or low in the charts, and the same for songs in a minor key. Thanks for your help!
Relevant answer
Answer
Just curious how you define "the key of a song", if it is "more complex" than 4-5-1...
What are the other musical features?
  • asked a question related to Audio Analysis
Question
10 answers
Not freeware like Praat. For sleep deprivation studies.
Relevant answer
Answer
What is the problem with Praat? It is scriptable, so if you know what you want to do, you can do it. Of course, MDVP offers some nice routines. If you want to work on the signal itself, MATLAB might also be an option (e.g. the VOICEBOX toolbox for speech signal analysis: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html).
  • asked a question related to Audio Analysis
Question
3 answers
Are there any java wrappers available for Praat? If so, which is the best one in terms of speed and functionality? I ask this because the Praat scripts that I have written take far too long to execute over my entire dataset of several audio files, and I was therefore wondering whether there may be a java wrapper for Praat which would allow me to execute all the functions of Praat through Java in a shorter time.
Relevant answer
Answer
I'd recommend openSMILE then: http://opensmile.sourceforge.net/. There is a very complete manual (the openSMILE book) and many configuration examples included. It works a bit differently from Praat; instead of writing a script, you write a configuration file that describes the processing chain used to extract your features. There are blocks for all the features you described, so I believe you may find this tool useful. Feature extraction, in my experience, is very fast.
  • asked a question related to Audio Analysis
Question
20 answers
I'm looking for a good tool to extract audio features like Mel-frequency, energy, etc. from a sound file. As my final aim is to extract the emotion of the speaker in the audio, it would be most preferable if I could have a tool that already does basic emotion extraction. I have come across some tools like:
and
which could be useful for this task, but I have found that their user base is not large, so the tools themselves do not seem very user-friendly. Also, since I have not yet started working with them, I wanted to know whether there are any better tools available that do the same task in a better or easier way.
Relevant answer
Answer
Hi Tahir, some people whose work on speech emotion you may want to check include Eduardo Coutinho (e.g. https://www.academia.edu/1087299/Psychoacoustic_cues_to_emotion_in_speech_prosody_and_music), Petri Laukka (http://w3.psychology.su.se/staff/pela/), and Klaus Scherer (http://www.affective-sciences.org/user/scherer).
I will also mention one work of mine even though it is more towards art/design rather than generalisable analysis (https://www.academia.edu/842766/About_TreeTorika_Rhetorics_CAAC_and_Mao._book_chapter_) and perhaps of lesser interest to you. Have fun researching!
  • asked a question related to Audio Analysis
Question
3 answers
I understand that the autocorrelation of white noise looks like an impulse in the time domain.
What do the autocorrelation functions of colored noise look like?
Relevant answer
Answer
The most defined Noise after White Noise is Red (Brownian) Noise. For Which you can easily find out its statistical properties by some predefined process such as "Ornstein-Uhlenbeck process" that is also called "OU" Process .
For the Detail Description of OU process i have attached one snapshot which will hopefully let u to visualize the colored noise in time domain.
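A minimal NumPy sketch comparing the autocorrelation of white noise (impulse-like) with that of Brownian noise (slowly decaying), purely for illustration:
```python
# Sketch: autocorrelation of white vs. Brownian (red) noise.
import numpy as np

rng = np.random.default_rng(0)
white = rng.standard_normal(4096)
brown = np.cumsum(white)                      # integrated white noise

def autocorr(x, max_lag=6):
    x = x - x.mean()
    c = np.correlate(x, x, mode="full")[len(x) - 1:]
    return c[:max_lag] / c[0]

print(autocorr(white))   # ~[1, 0, 0, ...]   -> impulse-like
print(autocorr(brown))   # ~[1, .99, .98, ...] -> long memory, slow decay
```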
  • asked a question related to Audio Analysis
Question
3 answers
Is there any method or software for cleaning signals of music, or for separating speech from music?
Relevant answer
Answer
This kind of task is not trivial, and several techniques have been proposed in the literature, even though I personally don't know any commercial or research software that can accomplish it efficiently. The best method to apply usually depends on the type of a priori knowledge you have about the mixed audio signal; the working domain is usually a time-frequency representation of the mixed signal. In the field of Blind Audio Source Separation (BASS), where we assume no knowledge about the input sources, techniques such as Independent Component Analysis (ICA), Non-negative Matrix Factorization (NMF), or statistical approaches like Hidden Markov Models (HMMs) are generally employed. If you have some knowledge of the recorded voice signal, this can help in developing specific spectral masking/subtraction techniques that separate the spectral content of the recorded voice in the time-frequency representation of the mixed signal, which is modeled as a sparse matrix of pitched events and related harmonic patterns. However, since real sounds are composed of a comb pattern of harmonics, it is (still) almost impossible to resolve the harmonic-overlap problem in the mixed signal, so some distortion will inevitably be introduced in the separated signals after processing.
  • asked a question related to Audio Analysis
Question
6 answers
Do we need to address each and every byte of the file, or only the starting and ending addresses? How do we read/write an MP3 file into memory?
Relevant answer
Answer
If the microSD card uses the FAT32 file system (I do not know about other file systems), the audio file can be saved like any other file; there is no special format for audio files, and you do not have to address each byte.
The memory is organized as 512/1024/2048 bytes per sector and 8 sectors per cluster. You have to save/retrieve data as whole sectors, and you have to read the boot sector to learn the available memory, the file entry table, free clusters, occupied clusters, etc.
I have done projects that use microSD cards for data storage. My advice is: study the file system.
Regards
Arun
  • asked a question related to Audio Analysis
Question
3 answers
There are a few good spectral editing programs available for Windows, but the process of isolating the different constituent sounds is strictly manual, and can be extremely tedious and difficult, often with disappointing results. So, is anyone aware of any software or plugin that can analyse a recording and automatically identify and isolate all of the various unique waveforms (i.e., the vocals and individual instruments) with a high degree of accuracy, so that they could then be placed on separate tracks, for example, in order to enable subsequent mixing into stereo? (I would imagine that this would be analogous to the "edge detect" effect in graphics editing software. Is this analogy correct?)
Most attempts to produce pseudo-stereo from mono recordings have historically used various tricks such as time delays, EQ adjustments, reverb, comb filters, etc., invariably with unsatisfactory results such as clearly noticeable artifacts and phase errors. This kind of pseudo-stereo is totally unrealistic, and could never be mistaken for true stereo.
However, utilizing spectral editing software, with accurate isolation and rendering of all constituent sonic waveforms, one can theoretically produce a result which is indistinguishable from true stereo, because it is in actuality a true stereo mix, having been constructed from individual tracks that each contain only one isolated component sound. These tracks are equivalent to the output of a multi-track machine.
The automation of the process of sonic detection and isolation would enable and greatly facilitate the production of these virtually flawless mixes.
Relevant answer
Answer
Algorithms for the automatic extraction of individual sound sources or instruments from audio mixtures are the topic of a research area known as sound source separation. This has been the focus of a large amount of research activity over the past decade and considerable progress has been made in improving the results obtained. These techniques can automate the process of extracting the sources to a considerable degree, but there will still be artifacts which require manual cleaning/editing. At present, most of these techniques have not been incorporated into publicly available software, but these techniques have been used to generate stereo mixes from mono recordings on a number of commercial reissues.
  • asked a question related to Audio Analysis
Question
2 answers
I have a huge audio dataset (m x n), with m instances and n features, on which I would like to perform Principal Component Analysis. Is there any method that uses a separate validation dataset to choose the number of PCs, k? The training reconstruction error monotonically decreases and reaches 0 when all the PCs are taken (k = n). I know that we can set a cut-off for the percentage of variance explained, but is there any other way, using a validation dataset?
Relevant answer
Answer
You might also consider dividing your data into two or more parts, equal in size or not, transforming the different sets using the same procedure, and then analyzing the differences you find in structure as a function of the number of components extracted.
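One common recipe along these lines is to pick k by held-out reconstruction error. A minimal scikit-learn sketch, where X_train and X_val are placeholder arrays; note that held-out error often still decreases with k, so people usually look for an elbow rather than a strict minimum:
```python
# Sketch: choose k by reconstruction error on a held-out set.
import numpy as np
from sklearn.decomposition import PCA

def val_error(X_train, X_val, k):
    pca = PCA(n_components=k).fit(X_train)
    X_hat = pca.inverse_transform(pca.transform(X_val))
    return np.mean((X_val - X_hat) ** 2)

# errors = [val_error(X_train, X_val, k) for k in range(1, 51)]
# pick k at the elbow of the error curve
```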
  • asked a question related to Audio Analysis
Question
4 answers
I want to extract the temporal envelope of a speech signal.
Relevant answer
Answer
If it is based on EEG, you can extract it in MATLAB after converting the EEG signals into waveform data.
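For an acoustic speech signal, a common approach is the Hilbert-transform envelope. A minimal SciPy sketch, where the file name and the smoothing cutoff are placeholders:
```python
# Sketch: temporal envelope via the Hilbert transform, smoothed below 30 Hz.
import numpy as np
from scipy.io import wavfile
from scipy.signal import hilbert, butter, sosfiltfilt

sr, x = wavfile.read("speech.wav")
x = x.astype(np.float64)
envelope = np.abs(hilbert(x))                        # analytic-signal magnitude
sos = butter(4, 30.0, btype="lowpass", fs=sr, output="sos")
envelope_smooth = sosfiltfilt(sos, envelope)         # keep slow modulations
```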
  • asked a question related to Audio Analysis
Question
2 answers
I am working on a chapter about the importance of high-quality sound in virtual restorative environments (for healthcare applications), and I am looking for studies that investigate the effect of audio quality on relaxation or any other dependent variable. Is anyone aware of any such studies? Note that the VR factor is not important at the moment, just the effect of low/high audio quality on perception. If it is in a healthcare or relaxation field, that's great, but not essential. Any leads would be greatly appreciated.
Relevant answer
Answer
There are some nice studies on the effect of sound in a virtual environment by Robert Riener et al., e.g. a rowing simulator: "Virtual competitors influence rowers", Presence, August 2010, Vol. 19, No. 4, pp. 313-330.
They also built a virtual "Somnomat".
  • asked a question related to Audio Analysis
Question
6 answers
Two omnidirectional microphones are used to form a gradient microphone (Faller 2010) to obtain a stereo signal. When I use two good omnidirectional microphones, I get a good stereo effect, but with two other microphones the stereo effect is much worse. Maybe this is because of the difference between the frequency responses of the two microphones. I compensated the spectra of the two microphones so that they have the same frequency response, but the effect is not as good as expected. Are there any other reasons for this problem, and are there any methods to enhance the stereo effect?
Relevant answer
Answer
What you need for creating a pressure-gradient microphone is a matched pair of pressure (omni) capsules, such as those employed in sound intensity microphones. The two capsules should be placed face-to-face with a solid spacer of 10-15 mm in between.
The "velocity" signal derived by processing the signals from these two capsules is indeed quite noisy at low frequencies, and the polar pattern is irregular at high frequencies. Definitely, for getting a microphone with a figure-of-eight pattern, there are better and cheaper methods.
Then, for stereo recording (I assume X-Y, Blumlein, or the like), you need TWO pressure-gradient microphones, so in the end you need two pairs of matched capsules. In total this will be more expensive than buying a TetraMic from Core Sound, which allows you to derive ANY kind and aiming of "virtual microphones" as you like...
  • asked a question related to Audio Analysis
Question
2 answers
Similar to the NU-6 auditory test.
If anyone has any information it would be greatly appreciated
Relevant answer
Answer
Thanks you very much for your help
  • asked a question related to Audio Analysis
Question
6 answers
A vocal tract replica has been excited with a sine sweep in an anechoic chamber in order to compute the transfer function of the tract. Does the response recorded by the mic need to be convolved with an inverse filter of the sine sweep to obtain the impulse response?
Relevant answer
Answer
Thinking of the impulse response h as the output to a delta-function input, and staying within the linear approximation, we get the following simple equations in the time and frequency domains (here * denotes convolution):
b(t) = a(t) * h(t), or B(f) = A(f) H(f).
Convolution with the time-reversed sweep a(-t) gives
a(-t) * b(t) = a(-t) * a(t) * h(t), or A*(f) B(f) = A*(f) A(f) H(f).
If a(t) is a sweep, then a(-t) * a(t) approximates a delta function to some extent, and h(t) can be found as h(t) ~ a(-t) * b(t). However, the latter is an approximation that depends on the sweep, so regularization methods in the frequency domain should work better.
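A minimal NumPy sketch of this regularized frequency-domain deconvolution; the regularization constant eps is a placeholder:
```python
# Sketch: recover h(t) from a sweep measurement by regularized division.
import numpy as np

def impulse_response(recorded, sweep, eps=1e-8):
    n = len(recorded) + len(sweep) - 1
    A = np.fft.rfft(sweep, n)                       # excitation spectrum A(f)
    B = np.fft.rfft(recorded, n)                    # measured spectrum B(f)
    H = np.conj(A) * B / (np.abs(A) ** 2 + eps)     # ~ B/A, but noise-safe
    return np.fft.irfft(H, n)                       # h(t)
```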