Science topic
Feature Extraction - Science topic
Explore the latest questions and answers in Feature Extraction, and find Feature Extraction experts.
Questions related to Feature Extraction
Hi, I'm Prithiviraja. I'm currently building a deep learning model to colorize SAR images. Most resources I've come across use only ASPP for feature extraction from SAR images. I'm planning to use both FPN and ASPP for that process, although FPN is mostly used for object detection. I would appreciate your suggestions.
Sign language is a visual language that uses hand shapes, facial expressions, and body movements to convey meaning. Each country or region typically has its own unique sign language, such as American Sign Language (ASL), British Sign Language (BSL), or Indian Sign Language (ISL). The use of AI models to understand and translate sign language is an emerging field that aims to bridge the communication gap between the deaf community and the hearing world. Here’s an overview of how these AI models work:
Overview
AI models for sign language recognition and translation use a combination of computer vision, natural language processing (NLP), and machine learning techniques. The primary goal is to develop systems that can accurately interpret sign language and convert it into spoken or written language, and vice versa.
Components of a Sign Language AI Model
1. Data Collection and Preprocessing:
• Video Data: Collecting large datasets of sign language videos is crucial. These datasets should include diverse signers, variations in signing speed, and different signing environments.
• Annotation: Annotating the data with corresponding words or phrases to train the model.
2. Feature Extraction:
• Hand and Body Tracking: Using computer vision techniques to detect and track hand shapes, movements, and body posture (a minimal tracking sketch follows this list).
• Facial Expression Recognition: Identifying facial expressions that are integral to conveying meaning in sign language.
3. Model Architecture:
• Convolutional Neural Networks (CNNs): Often used for processing video frames to recognize hand shapes and movements.
• Recurrent Neural Networks (RNNs) / Long Short-Term Memory (LSTM): Useful for capturing temporal dependencies in the sequence of signs.
• Transformer Models: Increasingly popular due to their ability to handle long-range dependencies and parallel processing capabilities.
4. Training:
• Training the AI model on the annotated dataset to recognize and interpret sign language accurately.
• Fine-tuning the model using validation data to improve its performance.
5. Translation and Synthesis:
• Sign-to-Text/Speech: Converting recognized signs into written or spoken language.
• Text/Speech-to-Sign: Generating sign language from spoken or written input using avatars or video synthesis.
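To make the hand and body tracking step above more concrete, here is a minimal, illustrative Python sketch using MediaPipe Hands to extract 21 hand landmarks per video frame; the video file name and the idea of feeding the landmark arrays to a downstream sign classifier are assumptions for illustration, not part of the description above.

# Minimal sketch: per-frame hand landmark extraction with MediaPipe Hands.
# "signing_clip.mp4" is a placeholder file name.
import cv2
import mediapipe as mp
import numpy as np

hands = mp.solutions.hands.Hands(static_image_mode=False, max_num_hands=2)
cap = cv2.VideoCapture("signing_clip.mp4")
frame_features = []

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB images; OpenCV delivers BGR.
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        landmarks = results.multi_hand_landmarks[0]
        coords = np.array([[p.x, p.y, p.z] for p in landmarks.landmark])  # shape (21, 3)
        frame_features.append(coords)

cap.release()
hands.close()
print(f"Extracted landmarks for {len(frame_features)} frames")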
Challenges
• Variability in Signing: Different individuals may sign differently, and the same sign can have variations based on context.
• Complexity of Sign Language: Sign language involves complex grammar, facial expressions, and body movements that are challenging to capture and interpret.
• Data Scarcity: There is a limited amount of annotated sign language data available for training AI models.
Applications
• Communication Tools: Development of real-time sign language translation apps and devices to assist deaf individuals in communicating with non-signers.
• Education: Providing educational tools for learning sign language, improving accessibility in classrooms.
• Customer Service: Implementing sign language interpretation in customer service to enhance accessibility.
Future Directions
• Improved Accuracy: Enhancing the accuracy of sign language recognition and translation through better models and larger, more diverse datasets.
• Multilingual Support: Developing models that can handle multiple sign languages and dialects.
• Integration with AR/VR: Leveraging augmented reality (AR) and virtual reality (VR) to create more immersive and interactive sign language learning and communication tools.
The development of AI models for sign language holds great promise for improving accessibility and communication for the deaf and hard-of-hearing communities, fostering inclusivity and understanding in a diverse society.
Existing Sign Language AI Models
1. DeepASL
• Description: DeepASL is a deep learning-based system for translating American Sign Language (ASL) into text or speech. It uses Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) to process video frames and capture the temporal dynamics of sign language.
• Notable Feature: DeepASL incorporates a sign language dictionary to improve translation accuracy and can handle continuous sign language sequences.
2. Google AI - Hand Tracking
• Description: Google has developed a hand-tracking technology that can detect and track 21 key points on a hand in real-time. While not specifically designed for sign language, this technology can be used as a foundation for sign language recognition systems.
• Notable Feature: It offers real-time hand tracking using a single camera, which can be integrated into mobile devices and web applications.
3. SignAll
• Description: SignAll is a comprehensive sign language translation system that uses multiple cameras to capture hand movements and body posture. It translates ASL into English text and can be used for various applications, including education and customer service.
• Notable Feature: SignAll uses a combination of computer vision, machine learning, and NLP to achieve high accuracy in sign language translation.
4. Microsoft Azure Kinect
• Description: Microsoft’s Azure Kinect is a depth-sensing camera that can be used to capture detailed hand and body movements. It provides an SDK for developers to build applications that include sign language recognition capabilities.
• Notable Feature: The depth-sensing capability of Azure Kinect allows for precise tracking of 3D movements, which is essential for accurate sign language interpretation.
5. Sighthound
• Description: Sighthound is a company that develops computer vision software, including models for gesture and sign language recognition. Their software can detect and interpret hand gestures in real-time.
• Notable Feature: Sighthound’s software is highly customizable and can be integrated into various platforms and devices.
6. Kinect Sign Language Translator
• Description: This was an early project by Microsoft Research that used the Kinect sensor to capture and translate ASL. The project demonstrated the feasibility of using depth-sensing technology for sign language recognition.
• Notable Feature: It was one of the first systems to use depth sensors for sign language translation, paving the way for future developments.
7. AI4Bharat - Indian Sign Language
• Description: AI4Bharat, an initiative by IIT Madras, has developed models for recognizing Indian Sign Language (ISL). They aim to create an accessible communication platform for the deaf community in India.
• Notable Feature: Focuses on regional sign languages, which are often underrepresented in AI research.
Academic and Research Projects
• IBM Research: IBM has been involved in developing AI models for sign language recognition and translation, often publishing their findings in academic journals and conferences.
• University of Surrey - SLR Dataset: The University of Surrey has created large datasets for Sign Language Recognition (SLR) and developed models that are trained on these datasets.
Online Tools and Apps
• SignAll Browser Extension: A browser extension that translates ASL into text in real-time.
• ASL Fingerspelling Game: An online game that helps users learn ASL fingerspelling through AI-driven recognition and feedback.
These models and systems demonstrate the progress being made in the field of sign language recognition and translation, and they provide valuable tools for enhancing communication and accessibility for the deaf and hard-of-hearing communities.
Working in hyperspectral remote sensing? You can publish your work in this special issue! Discounts are available!
Extended deadline for manuscript submissions is 15 March 2024.
Understanding and predicting the complex or chaotic behavior of nonlinear systems is both interesting and challenging, especially in practical settings such as smart cities or energy systems. Complexity and chaos theories may provide a framework for exploring this. Complexity in energy systems involves multiple factors, such as weather changes, unseen failures, demand, energy prices, and consumer behavior, which can produce unforeseen behaviors, such as emergence, that are difficult to predict.
What is your opinion on the link between complexity and chaos theories and energy systems, in particular regarding the following questions:
- Can complexity and chaos theories be used to better understand and predict the behavior of energy systems?
- What challenges and limitations do we face when applying complexity and chaos theories to energy systems?
Recent studies have explored the use of computational techniques, such as machine learning and entropy-based information theoretical methods, to analyze complex systems and identify chaotic behavior. For example, one study proposed using complex system characteristics as features for time series data to identify chaotic behavior, achieving high precision on test data [1]. Another study explored the use of deep learning methods for classifying chaotic time series, achieving an impressive accuracy of 99.45 percent on unseen samples [2]. A third study utilized entropy-based methods to develop an effective approach for classifying time-series data [3].
Join us in this discussion and share your thoughts on how we can use complexity and chaos theories, as well as other computational techniques, to better understand and predict the behavior of complex systems such as energy systems. Let's explore the potential of these methods and discuss the challenges and limitations we face in applying them to real-world energy systems.
References:
[1] Conference Paper: A Complex Systems Approach To Feature Extraction For Chaotic...
[3] Conference Paper: A Joint-Entropy Approach To Time-series Classification
Considering that deep learning models can automatically learn features from data, do they still need special feature engineering techniques to attain high accuracy, given that it is challenging to extract relevant features with most feature engineering tools and that deep neural networks need more processing power and time to learn well on complex data?
If needed, in what context of application will doing this be required and how would this impact the model performance?
Conversely, for better model performance, which type of deep learning algorithm would you recommend for image recognition, natural language processing (NLP), and various predictive modeling projects on complex data, without applying additional feature engineering to the dataset?
What is the main advantage of autoencoder networks versus CNNs?
Is an autoencoder network better than a convolutional neural network in terms of execution time and computational complexity?
What deep learning models can I use for pre-processing and feature extraction of partially occluded faces (nose masks) of individuals in night mode?
I want to learn how to include the temporal relationship alongside my variables at each time series instance.
I have read that, previously, models were used to represent all the variables by a single variable, or to reduce the dimensionality of the variables.
I also read that a time-indexing variable was created containing the first differences of the time values, e.g. T2 - T1 for the first instance. I would like to know more techniques for including the temporal relationship as a variable, or by any other means, before feeding my dataset to clustering algorithms. Do you know other techniques to represent this as a feature, or to transform existing features to include temporal/spatial patterning (what is sometimes called inter/intra patterns)?
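For what it's worth, here is a minimal pandas sketch of the first-difference idea described above, on a made-up multivariate time series (the column names and values are illustrative assumptions): the time delta and per-variable first differences are appended as extra features before clustering.

import pandas as pd

# Hypothetical multivariate time series; column names are illustrative only.
df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2021-01-01 00:00", "2021-01-01 00:05", "2021-01-01 00:12"]),
    "var1": [1.0, 1.4, 0.9],
    "var2": [10.0, 9.5, 11.2],
})

# Time delta between consecutive instances (T2 - T1, T3 - T2, ...), in seconds.
df["dt_seconds"] = df["timestamp"].diff().dt.total_seconds()

# First differences of each variable capture short-term temporal change.
for col in ["var1", "var2"]:
    df[f"{col}_diff"] = df[col].diff()

# Rows with NaN from differencing (the first instance) can be dropped before clustering.
features = df.drop(columns=["timestamp"]).dropna()
print(features)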
Dear researchers,
My name is Abdelkbir Ouisaadane. I am a doctor in computer science at USMS (usms.ma), Morocco. I write and publish research articles on noisy speech recognition, signal processing, artificial intelligence, and feature extraction.
I would like to partner with researchers in this field to exchange experience and collaborate. If you are interested in collaborating with us, please let us know.
Thank you
Email: abdelkbir.ouisaadane@usms.ma
I'm building an image retrieval algorithm in which the features of the database images are stored in a separate database and the features of a query image are compared against them (I skip the indexing/hashing step). I need to retrieve the image that matches the query image. Is there an algorithm or approach that solves this problem?
I have been working on classifying ECG signals, and for feature extraction I am going to use AR modelling with Burg's method. After reading a few papers I learned that the features are extracted after splitting the ECG signal into blocks of different durations. My question is: why is it necessary to do so, and how can a suitable duration be fixed? For instance, I have a signal with 50,000 samples at fs = 256 Hz, so what could be the duration of each block?
It would also be really helpful if someone could help me understand Burg's method. There are videos for learning the Yule-Walker equations, but I didn't find any for Burg's method.
Thank you in advance
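Not an answer on how to choose the block duration, but here is a minimal sketch of the block-wise AR feature idea, assuming librosa is available (its lpc function implements linear prediction via Burg's method); the 2-second block length, the AR order of 8, and the random placeholder signal are illustrative assumptions.

import numpy as np
import librosa  # librosa.lpc computes linear prediction coefficients via Burg's method

fs = 256                       # sampling rate of the ECG signal (Hz)
ecg = np.random.randn(50000)   # placeholder; replace with the real ECG samples
block_len = 2 * fs             # e.g. 2-second blocks (an arbitrary choice here)
order = 8                      # AR model order, also an illustrative choice

features = []
for start in range(0, len(ecg) - block_len + 1, block_len):
    block = ecg[start:start + block_len]
    # librosa.lpc returns [1, a1, ..., ap]; drop the leading 1 and keep the AR coefficients.
    coeffs = librosa.lpc(block, order=order)[1:]
    features.append(coeffs)

features = np.array(features)  # shape: (n_blocks, order)
print(features.shape)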
Dear community, I'm facing a serious issue in a project: I need to find an equation explaining the relation between the input features and the output of a dataset. The equation should have the form Y = a*x1 + b*x2 + c*x3 + ... + k*xn, where Y is the output, a, b, c, ..., k are coefficients, and x1, x2, x3, ..., xn are the input features of my dataset. The goal is to forecast the output Y from this equation using Python, so that if we want to reach a given value of Y we have an idea of how to change the input variables.
Thank you.
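A minimal sketch of one way to obtain such an equation with scikit-learn, assuming the dataset is already split into a feature matrix X and a target vector y (the data below are random placeholders): an ordinary least-squares fit directly yields the coefficients a, b, c, ... and an intercept.

import numpy as np
from sklearn.linear_model import LinearRegression

# Placeholder data: 100 samples, 3 input features x1, x2, x3.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=100)

model = LinearRegression()
model.fit(X, y)

# model.coef_ holds the coefficients a, b, c, ...; model.intercept_ the constant term.
print("coefficients:", model.coef_)
print("intercept:", model.intercept_)

# Forecast Y for a new combination of input variables.
print("prediction:", model.predict([[1.0, 0.0, -1.0]]))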
I am working on a classification task and I used the 2D DWT as a feature extractor. I would like more detail on why I can concatenate the 2D-DWT coefficients to form an image of features. I am thinking of concatenating the horizontal, vertical and diagonal coefficients into an image of features and then feeding it to a CNN, but I would like convincing evidence that this new approach is sound.
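Here is a minimal sketch of the concatenation idea, using PyWavelets on a random placeholder image: the horizontal, vertical and diagonal detail sub-bands of a single-level 2D DWT are stacked into one "feature image" that could be fed to a CNN. Whether this is the best arrangement is exactly the open question; the sketch only shows that the shapes work out.

import numpy as np
import pywt

image = np.random.rand(128, 128)          # placeholder input image

# Single-level 2D DWT: cA is the approximation, (cH, cV, cD) the detail sub-bands.
cA, (cH, cV, cD) = pywt.dwt2(image, "haar")

# Concatenate the three detail sub-bands horizontally into one feature image.
feature_image = np.concatenate([cH, cV, cD], axis=1)    # shape (64, 192)

# Alternatively, stack them as channels, which is often more natural as CNN input.
feature_stack = np.stack([cH, cV, cD], axis=-1)          # shape (64, 64, 3)

print(feature_image.shape, feature_stack.shape)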
I am searching for feature extraction algorithms for images that I want to classify using machine learning. I have only heard of SIFT, and I have images of buildings and flowers to classify. Other than SIFT, what are some good algorithms?
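As one freely usable alternative to SIFT, here is a minimal OpenCV sketch using ORB keypoints and descriptors; the random placeholder image stands in for a real photo, and the descriptors would typically be aggregated (e.g. with a bag-of-visual-words step) before training a classifier.

import cv2
import numpy as np

# Placeholder image; in practice use img = cv2.imread("building.jpg", cv2.IMREAD_GRAYSCALE)
img = (np.random.rand(256, 256) * 255).astype(np.uint8)

# ORB: a fast, patent-free alternative to SIFT/SURF.
orb = cv2.ORB_create(nfeatures=500)
keypoints, descriptors = orb.detectAndCompute(img, None)

# descriptors is an (n_keypoints, 32) uint8 array, or None if no keypoints were found.
print(len(keypoints), None if descriptors is None else descriptors.shape)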
I'm using Mutual Information (MI) for Multivariate Time Series data as a feature selection method.
MI is nonnegative (MI >= 0), where 0 indicates that the two variables are strictly independent and larger values mean the variables share a useful amount of information.
After computing the MI between 8 features and the target variable I got the following values:
[0.19, 0.23, 0.34, 0.19, 0.19, 0.12, 0.21, 0.071]
and when computing MI between some of the Input features, and another input feature to search for redundancy, I got the following values:
[4.0056916 , 1.58167938, 1.20578024, 1.0273157 , 0.93991675,0.9711158 ]
The values are not bounded between -1 and 1 or between 0 and 1, so they are not easy to compare or to draw conclusions from.
My question is:
Is there a way to set a threshold value (t, for example) such that if MI >= t the variables share a good amount of information, and if MI < t they do not? How do I know whether the MI is high enough?
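One common workaround, sketched below under the assumption that discretizing the continuous features is acceptable: bin each variable and use scikit-learn's normalized mutual information, which is bounded in [0, 1], so a fixed threshold (the hypothetical t = 0.3 below) becomes easier to interpret. The binning scheme and the threshold are illustrative choices, not recommendations.

import numpy as np
import pandas as pd
from sklearn.metrics import normalized_mutual_info_score

rng = np.random.default_rng(0)
x = rng.normal(size=1000)                    # placeholder input feature
y = x + rng.normal(scale=0.5, size=1000)     # placeholder, correlated second variable

# Discretize both variables into quantile bins (10 bins is an arbitrary choice).
x_binned = pd.qcut(x, q=10, labels=False)
y_binned = pd.qcut(y, q=10, labels=False)

# Normalized MI lies in [0, 1]: 0 means independent, 1 means perfectly dependent.
nmi = normalized_mutual_info_score(x_binned, y_binned)
print(f"normalized MI = {nmi:.3f}")

t = 0.3  # hypothetical threshold; it must be chosen and validated for the task at hand
print("share a good amount of information" if nmi >= t else "weakly related")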
Any short introductory document from image domain please.
Hi everybody,
Given the different methods of speech feature extraction (IS09, IS10,...), which one do you suggest for EMODB and IEMOCAP datasets?
I'm using a T-Scan device to record occlusal contacts in a patient's mouth and would like to use this to aid diagnosis. However, the variation of the occlusal force distribution over time is essentially spatio-temporal data, and it is difficult to extract features from such data. At the same time, some quantitative indicators have been proposed in clinical dental research, such as occlusion time, disclusion time, and contact areas. It is unclear whether some of them differ significantly between normal and malocclusion populations, because they are usually derived from clinicians' experience and some conflicting views appear in the literature. Can I use meta-analysis to evaluate these indicators, e.g. whether there is a significant difference, and select the useful ones as input for machine learning?
Dear community, in the dataset below I'm trying to study how the listed features affect the feature 'conversation par message', which would be the output of a machine learning forecasting model. I have several questions; could you please guide me:
- How can I apply exploratory data analysis to my dataset, knowing that many cells of my output are empty (meaning no lead was reached)?
- How can I apply a forecasting (regression) model to predict the output?
- How can I determine which features affect my output the most?
I know it's a lot of questions, but I really need some help because I have tried my best and am still struggling.
Thank you so much
Hi There!
My data has a number of features (which contain continuous data) and a response feature (class label) with categorical, binary data. My intention is to study the variation of the response feature (class) as a function of all the other features using a variety of feature selection techniques. Kindly help by pointing out the right techniques for this purpose. The data look like this:
------------------------------------------------------------------
f1 f2 f3 f4 ... fn class
------------------------------------------------------------------
0.2 0.3 0.87 0.6 ... 0.7 0
0.2 0.3 0.87 0.6 ... 0.7 1
0.2 0.3 0.87 0.6 ... 0.7 0
0.2 0.3 0.87 0.6 ... 0.7 1
-------------------------------------------------------------------
Hi there. I'm a medical student and I'm now working on survival prediction models, but I have encountered some difficult problems with feature selection. The original sequencing data were huge, so I relied on univariate Cox regression to obtain a subset (subset A), and I'd like to perform Lasso regression to further select features for the final survival prediction model (built with multivariate Cox regression). However, subset A is still larger than I expected. Can I further reduce it by limiting the range of the hazard ratio (HR) before the Lasso regression? Or could I use a random survival forest to obtain subsets from subset A for the final survival prediction model? Is there anything I need to pay special attention to during these steps?
Hello everyone,
For my thesis I want to extract some voice features from audio data recorded during psychotherapy sessions. For this I am using the openSMILE toolkit. For the fundamental frequency and jitter I already get good results, but the extraction of the center frequencies and bandwidths of formants 1-3 is puzzling me. For some reason there appears to be just one formant (the first one), with a frequency range up to 6 kHz; formants 2 and 3 get values of 0. I expected the formants to lie within a range of roughly 500 to 2000 Hz.
I tried to fix the problem myself but could not find the issue here. Does anybody have experience with openSMILE, especially formant extraction, and could help me out?
For testing purposes I am using various audio files recorded by myself or extracted from youtube. My config file looks like this:
///////////////////////////////////////////////////////////////////////////
// openSMILE configuration template file generated by SMILExtract binary //
///////////////////////////////////////////////////////////////////////////
[componentInstances:cComponentManager]
instance[dataMemory].type = cDataMemory
instance[waveSource].type = cWaveSource
instance[framer].type = cFramer
instance[vectorPreemphasis].type = cVectorPreemphasis
instance[windower].type = cWindower
instance[transformFFT].type = cTransformFFT
instance[fFTmagphase].type = cFFTmagphase
instance[melspec].type = cMelspec
instance[mfcc].type = cMfcc
instance[acf].type = cAcf
instance[cepstrum].type = cAcf
instance[pitchAcf].type = cPitchACF
instance[lpc].type = cLpc
instance[formantLpc].type = cFormantLpc
instance[formantSmoother].type = cFormantSmoother
instance[pitchJitter].type = cPitchJitter
instance[lld].type = cContourSmoother
instance[deltaRegression1].type = cDeltaRegression
instance[deltaRegression2].type = cDeltaRegression
instance[functionals].type = cFunctionals
instance[arffSink].type = cArffSink
printLevelStats = 1
nThreads = 1
[waveSource:cWaveSource]
writer.dmLevel = wave
basePeriod = -1
filename = \cm[inputfile(I):name of input file]
monoMixdown = 1
[framer:cFramer]
reader.dmLevel = wave
writer.dmLevel = frames
copyInputName = 1
frameMode = fixed
frameSize = 0.0250
frameStep = 0.010
frameCenterSpecial = center
noPostEOIprocessing = 1
buffersize = 1000
[vectorPreemphasis:cVectorPreemphasis]
reader.dmLevel = frames
writer.dmLevel = framespe
k = 0.97
de = 0
[windower:cWindower]
reader.dmLevel=framespe
writer.dmLevel=winframe
copyInputName = 1
processArrayFields = 1
winFunc = ham
gain = 1.0
offset = 0
[transformFFT:cTransformFFT]
reader.dmLevel = winframe
writer.dmLevel = fftc
copyInputName = 1
processArrayFields = 1
inverse = 0
zeroPadSymmetric = 0
[fFTmagphase:cFFTmagphase]
reader.dmLevel = fftc
writer.dmLevel = fftmag
copyInputName = 1
processArrayFields = 1
inverse = 0
magnitude = 1
phase = 0
[melspec:cMelspec]
reader.dmLevel = fftmag
writer.dmLevel = mspec
nameAppend = melspec
copyInputName = 1
processArrayFields = 1
htkcompatible = 1
usePower = 0
nBands = 26
lofreq = 0
hifreq = 8000
inverse = 0
specScale = mel
[mfcc:cMfcc]
reader.dmLevel=mspec
writer.dmLevel=mfcc1
copyInputName = 0
processArrayFields = 1
firstMfcc = 0
lastMfcc = 12
cepLifter = 22.0
htkcompatible = 1
[acf:cAcf]
reader.dmLevel=fftmag
writer.dmLevel=acf
nameAppend = acf
copyInputName = 1
processArrayFields = 1
usePower = 1
cepstrum = 0
acfCepsNormOutput = 0
[cepstrum:cAcf]
reader.dmLevel=fftmag
writer.dmLevel=cepstrum
nameAppend = acf
copyInputName = 1
processArrayFields = 1
usePower = 1
cepstrum = 1
acfCepsNormOutput = 0
oldCompatCepstrum = 1
absCepstrum = 1
[pitchAcf:cPitchACF]
reader.dmLevel=acf;cepstrum
writer.dmLevel=pitchACF
copyInputName = 1
processArrayFields = 0
maxPitch = 500
voiceProb = 0
voiceQual = 0
HNRdB = 0
F0 = 1
F0raw = 0
F0env = 1
voicingCutoff = 0.550000
[lpc:cLpc]
reader.dmLevel = fftc
writer.dmLevel = lpc1
method = acf
p = 8
saveLPCoeff = 1
lpGain = 0
saveRefCoeff = 0
residual = 0
forwardFilter = 0
lpSpectrum = 0
[formantLpc:cFormantLpc]
reader.dmLevel = lpc1
writer.dmLevel = formants
copyInputName = 1
nFormants = 3
saveFormants = 1
saveIntensity = 0
saveNumberOfValidFormants = 1
saveBandwidths = 1
minF = 400
maxF = 6000
[formantSmoother:cFormantSmoother]
reader.dmLevel = formants;pitchACF
writer.dmLevel = forsmoo
copyInputName = 1
medianFilter0 = 0
postSmoothing = 0
postSmoothingMethod = simple
F0field = F0
formantBandwidthField = formantBand
formantFreqField = formantFreq
formantFrameIntensField = formantFrameIntens
intensity = 0
nFormants = 3
formants = 1
bandwidths = 1
saveEnvs = 0
no0f0 = 0
[pitchJitter:cPitchJitter]
reader.dmLevel = wave
writer.dmLevel = jitter
writer.levelconf.nT = 1000
copyInputName = 1
F0reader.dmLevel = pitchACF
F0field = F0
searchRangeRel = 0.250000
jitterLocal = 1
jitterDDP = 1
jitterLocalEnv = 0
jitterDDPEnv = 0
shimmerLocal = 0
shimmerLocalEnv = 0
onlyVoiced = 0
inputMaxDelaySec = 2.0
[lld:cContourSmoother]
reader.dmLevel=mfcc1;pitchACF;forsmoo;jitter
writer.dmLevel=lld1
writer.levelconf.nT=10
writer.levelconf.isRb=0
writer.levelconf.growDyn=1
nameAppend = sma
copyInputName = 1
noPostEOIprocessing = 0
smaWin = 3
[deltaRegression1:cDeltaRegression]
reader.dmLevel=lld1
writer.dmLevel=lld_de
writer.levelconf.isRb=0
writer.levelconf.growDyn=1
nameAppend = de
copyInputName = 1
noPostEOIprocessing = 0
deltawin=2
blocksize=1
[deltaRegression2:cDeltaRegression]
reader.dmLevel=lld_de
writer.dmLevel=lld_dede
writer.levelconf.isRb=0
writer.levelconf.growDyn=1
nameAppend = de
copyInputName = 1
noPostEOIprocessing = 0
deltawin=2
blocksize=1
[functionals:cFunctionals]
reader.dmLevel = lld1;lld_de;lld_dede
writer.dmLevel = statist
copyInputName = 1
frameMode = full
// frameListFile =
// frameList =
frameSize = 0
frameStep = 0
frameCenterSpecial = left
noPostEOIprocessing = 0
functionalsEnabled=Extremes;Moments;Means
Extremes.max = 1
Extremes.min = 1
Extremes.range = 1
Extremes.maxpos = 0
Extremes.minpos = 0
Extremes.amean = 0
Extremes.maxameandist = 0
Extremes.minameandist = 0
Extremes.norm = frame
Moments.doRatioLimit = 0
Moments.variance = 1
Moments.stddev = 1
Moments.skewness = 0
Moments.kurtosis = 0
Moments.amean = 0
Means.amean = 1
Means.absmean = 1
Means.qmean = 0
Means.nzamean = 1
Means.nzabsmean = 1
Means.nzqmean = 0
Means.nzgmean = 0
Means.nnz = 0
[arffSink:cArffSink]
reader.dmLevel = statist
filename = \cm[outputfile(O):name of output file]
append = 0
relation = smile
instanceName = \cm[inputfile]
number = 0
timestamp = 0
frameIndex = 1
frameTime = 1
frameTimeAdd = 0
frameLength = 0
// class[] =
printDefaultClassDummyAttribute = 0
// target[] =
// ################### END OF openSMILE CONFIG FILE ######################
Dear community, after using the wavelet transform to extract the important features from my EEG signals, I'm wondering how to calculate the Shannon entropy of each of my coefficient arrays (cD1, cD2, ..., cA6). Another question: how can the Shannon entropy be used for dimensionality reduction?
Thank you.
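A minimal sketch of one common way to compute a Shannon (sub-band) entropy per coefficient array from a wavelet decomposition, assuming an energy-based probability normalization; other definitions (e.g. histogram-based) exist, so treat this as one option rather than the standard. The random signal, wavelet, and decomposition level are placeholders.

import numpy as np
import pywt

def shannon_entropy(coeffs, eps=1e-12):
    """Entropy of the normalized energy distribution of one coefficient array."""
    energy = coeffs ** 2
    p = energy / (energy.sum() + eps)
    return -np.sum(p * np.log2(p + eps))

signal = np.random.randn(4096)                        # placeholder EEG segment
coeff_arrays = pywt.wavedec(signal, "db4", level=6)   # [cA6, cD6, ..., cD1]

# One entropy value per sub-band: a compact, low-dimensional feature vector.
entropies = [shannon_entropy(c) for c in coeff_arrays]
print(np.round(entropies, 3))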
How can various features (including texture, color and shape) of different components or objects in an image be extracted/selected for a multi-label learning task?
In AI-based fault classification methods, how can I justify a particular feature extraction method (such as frequency filtering) over other methods, and how can I justify the use of a particular feature selection method (in my case Binary Particle Swarm Optimization, BPSO) over other methods like PCA, ReliefF, etc.?
Based on what I found, I should always split the data into train/test sets first and then perform feature selection, to prevent information leakage. Here's the part that I don't understand:
If I only remove the low-predictive column from the train set, then my test set has one more column than my train set. It doesn't make sense to me to build a model on n-1 variables and then test it on a dataset with n variables. Shouldn't I remove the column before splitting into train/test?
Any help is appreciated.
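For what it's worth, here is a minimal scikit-learn sketch of the usual pattern, with placeholder data: the decision of which columns to drop is learned from the training split only, and the same (already learned) selection is then applied to the test split, so both sets end up with the same columns.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The selector learns which columns to keep from the training data only;
# transform() then drops the same columns from any later data, including the test set.
pipe = make_pipeline(SelectKBest(f_classif, k=10), LogisticRegression(max_iter=1000))
pipe.fit(X_train, y_train)            # selection + model fit use only the training split
print("test accuracy:", pipe.score(X_test, y_test))   # test data pass through the same selection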
Dear Researchers,
I would like to apply an optimization algorithm to the MSER (Maximally Stable Extremal Regions) feature detector. The MSER detector has four parameters; can I use mono-objective or multi-objective optimization to find the best combination of the four parameters?
In which cases must I apply multi-objective optimization?
Thank you for your responses
Best Regards
Khadidja BELATTAR
I want to build a hierarchical clustering algorithm designed to cluster product aspects into different groups, in which aspect similarity is computed with respect to the relevant aspect set. For example, groups of laptop aspects:
group 1 (screen & display): screen-size, resolution.
group 2 (battery): battery-life, battery-type, weight
group 3 (processor & CPU): performance, speed, CPU model manufacturer, processor count
Here is the situation: I am trying to predict the energy consumption (load) of households using artificial intelligence (machine learning) techniques.
Problem: the data are only available for 40% of the households. Is it possible to predict the energy consumption of the remaining 60% of households based on the available data (features) of the 40%?
Dear community, my model is based on feature extraction from non-stationary signals using the discrete wavelet transform, followed by statistical features and machine learning classifiers to train the model. I achieved a maximum accuracy of 77% for 5 classes. How can I increase it? The size of my data frame is X = (335, 48), y = (335, 1).
Thank you
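One routine step worth trying before anything more exotic is a cross-validated hyperparameter search over the classifier, sketched below for a random forest on a placeholder (335, 48) feature matrix; the parameter grid and the choice of random forest are illustrative assumptions, not a claim that they will beat 77%.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

X = np.random.rand(335, 48)              # placeholder for the DWT + statistical features
y = np.random.randint(0, 5, size=335)    # placeholder for the 5 class labels

param_grid = {
    "n_estimators": [200, 500],
    "max_depth": [None, 10, 20],
    "max_features": ["sqrt", 0.5],
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=cv, scoring="accuracy", n_jobs=-1)
search.fit(X, y)

print("best CV accuracy:", search.best_score_)
print("best parameters:", search.best_params_)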
Hi,
I want to classify time series of varying length in order to identify the riders of a bike from the torque signal. I was planning to divide the signal into lengths of, let's say, 5 rotations, so the length of each time series would vary with the rotation speed. Do I need to extract features such as the mean value and the FFT, or is it enough to simply feed the filtered signal to the classifier?
Thanks in advance
Hi,
I am working on a brain-computer interface application. Is there any possibility of extracting features through reinforcement learning? Could you please guide me to some tutorials and materials?
Will applying feature extraction before classifying multispectral data with an SVM improve the result significantly, given that the SVM already uses margin maximization to cope with high-dimensional problems?
I have a 3D image of size 100x100x100. I have features (feature dimension 25) extracted for each pixel by considering a small patch around it. So the input to the 1D CNN is 100x100x100x25, reshaped to 1000000x25 in the form n x f, where n is the number of samples (in this case pixels) and f is the feature dimension. Is a CNN ideal for this scenario, or is a different deep learning model necessary?
Hello all,
I want to use a method that detects edges in unorganized point clouds. The cloud scenes contain objects such as buildings, roads, streets, ditches, etc. Still, I would like the method to be as universal as possible.
So far I have read that good results have been obtained through eigenvalue analysis of the covariance matrix of a point's neighborhood. Some also use principal component analysis to process the eigenvalues further. [
Conference Paper Fast and Robust Edge Extraction in Unorganized Point Clouds
] Another paper uses plane fitting and some threshold parameters for edge detection. The results seem promising, but on the other hand there are some parameters to tune [https://www.mdpi.com/2072-4292/8/9/710]
I was thinking of trying the eigen analysis first.
Any opinions or other suggestions?
Dear Researchers,
Since I am new to the field of internet security, I need your suggestions regarding the meaning of the following features.
We have a DNS name such as google.com or youtube.com, and I want to extract different features, both lexical and web-scraped.
Lexical Features:
What is the meaning of the following features? Please explain each with an example.
1) different ratios (number to length, alphabet to length)?
2) hash?
3) distance between a number and an alphabet character? (You can find the meaning of these features in the paper Feature Extraction Approach to Unearth Domain Generating Algorithms (DGAs) - Page 401)
4) English domain names, non-English yet pronounceable domain names, uni-gram?
Web Scraping:
We extract information about the queried domain name from the web using Python (you can find the meaning of these features in the paper Feature Extraction Approach to Unearth Domain Generating Algorithms (DGAs) - Page 403).
1) Levenshtein distance (seq1, seq2): what is seq2?
2) Typosquatting process?
Thanks
Dear community, I tried to extract features using the continuous wavelet transform in Python on my data, but I ran into some problems. My dataset consists of sleep recordings for 10 patients (the PhysioNet sleep dataset). After selecting a patient randomly, I kept just 2 EEG channels and dropped the other channels (EOG, ECG, EMG), then extracted the epochs (channel, time, event). How can I do my feature extraction?
Thank you
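A minimal sketch of the step after epoching, using synthetic 2-channel epochs as a stand-in for the real Sleep-EDF epochs (channel names, epoch length, and the wavelet settings are illustrative assumptions): get_data() returns an array of shape (n_epochs, n_channels, n_times), and per-epoch, per-channel wavelet energies can then be stacked into a feature matrix.

import numpy as np
import mne
import pywt

# Synthetic stand-in for 2-channel sleep EEG epochs (3 s at 100 Hz, 20 epochs).
sfreq = 100.0
info = mne.create_info(ch_names=["EEG Fpz-Cz", "EEG Pz-Oz"], sfreq=sfreq, ch_types="eeg")
rng = np.random.default_rng(0)
epoch_data = rng.normal(size=(20, 2, 300))                   # (n_epochs, n_channels, n_times)
events = np.column_stack([np.arange(20) * 300,
                          np.zeros(20, dtype=int),
                          rng.integers(1, 6, 20)])           # event codes 1..5 as labels
epochs = mne.EpochsArray(epoch_data, info, events=events, verbose=False)

data = epochs.get_data()          # (n_epochs, n_channels, n_times)
labels = epochs.events[:, -1]     # event codes, usable as class labels

features = []
for epoch in data:                # epoch shape: (n_channels, n_times)
    per_channel = []
    for channel in epoch:
        coeffs = pywt.wavedec(channel, "db4", level=4)
        # Simple per-sub-band feature: log energy of each coefficient array.
        per_channel.extend(np.log(np.sum(c ** 2) + 1e-12) for c in coeffs)
    features.append(per_channel)

X = np.array(features)            # shape: (n_epochs, n_channels * (level + 1))
print(X.shape, labels.shape)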
Dear community, I am currently working on emotion recognition, and as a first step I'm trying to extract features. While checking some resources, I found work that used the SEED dataset. It contains EEG signals of 15 subjects recorded while they were watching emotional film clips. Each subject carried out the experiment in 3 sessions, so there are 45 experiments in this dataset in total. Different film clips (positive, neutral, and negative emotions) were chosen to obtain the highest agreement across participants. The length of each film clip is about 4 minutes. The EEG signals of each subject were recorded in separate files named after the subject and the date. These files contain a preprocessed, down-sampled, and segmented version of the EEG data: the data were down-sampled to 200 Hz, a bandpass filter from 0–75 Hz was applied, and the EEG segments associated with each movie were extracted. There are 45 .mat files in total, one per experiment; every person carried out the experiment three times within a week. Every subject file includes 16 arrays: 15 arrays contain the preprocessed and segmented EEG data of the 15 trials of one experiment, and an array named LABELS contains the corresponding emotional labels (−1 for negative, 0 for neutral, and +1 for positive). I found that the authors loaded each class separately (negative, neutral, positive), fixed the signal length at 4096 and the number of signals per class at 100, and fixed the number of features extracted by wavelet packet decomposition at 83. My question is: why did they choose exactly 83, 4096, and 100?
I know my question is a bit long, but I tried to explain the situation clearly. I appreciate your help, thank you.
Hello RG,
I have been working with EMG signals, and a common approach to reduce dimensionality (in order to classify events, for example) is to extract a set of features (in the time and frequency domains) from time segments or epochs. I wonder whether the same feature sets apply to EEG? How do the feature extraction methods for EMG and EEG signals diverge (or converge)?
Dear community, I want to load my dataset from the PhysioNet Sleep-EDF database and separate the list of signals from the list of labels so I can apply feature extraction. I used MNE-Python, but it only lets me create epochs for one subject at a time. Any help, please.
I want to fine-tune deep CNN models like VGG-19, ResNet101, DenseNet, etc. using transfer learning with feature extraction techniques in MATLAB for infrared-band (thermal) image data. Which algorithm should I use, and how do I implement it?
Can anyone help with a comprehensive suggestion?
I am interested in using MATLAB to extract texture features with LBP for each pixel in an image and to cluster them using the K-means algorithm. Anyone with relevant knowledge or MATLAB code, please assist.
Traditional target detection or scene segmentation models can extract video features, but the obtained features cannot restore the pixel information of the original video (if my understanding is off, please correct me). Could you point me to articles that introduce methods for extracting video features and using those features for video reconstruction (that is, achieving a feature-to-pixel mapping)? In other words, can the extracted video features contain semantic information and also be used for video reconstruction?
I plan to implement 3D image segmentation. I have features extracted with an unsupervised feature extraction method; the features have a lower dimension than the input image. Which segmentation method suits this use case best? I plan to implement it in Python.
I'm looking for a traffic dataset for the city of Pasadena, California. I'm interested in hourly data, or even parts of the day (a few-hour interval), indicating the traffic count for the year 2019. I've looked at the data from https://data.cityofpasadena.net/ but it is sparse and inconsistent. Are there any alternative ways to get the historical traffic data?
Hi everyone. I am doing a feature comparison and want to compare my work with others' work.
Let's say I am using features f1, f2, f3, f4, f5, and f6.
And another researcher is using features f3, f4, f5, f6, f7, and f8.
It means we have some common features i.e. f3, f4, f5, and f6.
Now I want to compare my work fairly with theirs. Can anybody tell me what the right method of comparison would be? Please share your experience; any reference will be highly appreciated.
Thanks in Advance.
Hello to everybody,
I am new to the field of gait analysis, but I have studied the topic extensively as it is the first part of my project proposal. The second part consists of developing a system that can extract gait features (like cadence, stride time, etc.) from a video, but I have no idea where to start because I have not found any source code yet. The main idea is to extract gait features related to Parkinson's disease (PD), first from healthy people (training set) and, at a later stage, from PD patients (test set), so that new gaits can be classified as abnormal.
Basically, the main idea is this: a classifier takes as input a video of one healthy person performing his/her gait and extracts relevant features; this is repeated with different people (generating the training set). After the supervised learning phase, the classifier takes as input a new, unseen person's gait and has to evaluate whether the person is healthy or ill (and, if ill, which disease); this is repeated with different people (generating the test set).
Do you know of any available projects that can carry out my task or that suit my situation? Any programming language is accepted (Python and MATLAB are very welcome), as are any datasets (either existing or self-made).
Thank you for your availability and patience.
Kind regards,
Luigi Manosperta
Hi,
I am working on a project whose ultimate goal is emotion classification from speech, and I want to try several approaches. One of them is training a convolutional neural network on MFCC coefficients extracted from audio. I can easily extract them, since several Python libraries can do so, but I am not quite sure how to use them: I get a 13xN matrix whose size depends on the length of the audio input, and that obviously is not suitable as direct input to the neural network.
I understand that the coefficients are calculated for very short frames, which gives me N frames, and I could feed the network frame after frame. But since emotions do not change within milliseconds, I'd like to work with a wider context, say a 13x1 vector for every 3-4 seconds. Suppose I can isolate the coefficients for a given time span (e.g. a 13x200 matrix = 3 seconds of audio): how do I reduce it to 13x1, keeping in mind that this vector is intended for emotion recognition? Am I supposed to calculate, e.g., the mean and use the 13 means as the neural network input? Or standard deviations, or something else, or a combination of several statistics? What about normalisation or other preprocessing of the coefficients?
Most papers covering this issue are very vague about this part of the process, usually saying only something like "we used 13 MFCC coefficients as neural network input", with no details about how to actually use them.
Can someone tell me what the best practices are for MFCCs in emotion recognition, or recommend some papers covering this problem?
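A minimal sketch of the common statistics-pooling approach, using librosa on a synthetic placeholder signal (the sample rate, segment length, and hop size are illustrative assumptions): MFCCs are computed framewise, the frames are grouped into roughly 3-second segments, and each segment is summarized by the per-coefficient mean and standard deviation, giving a fixed-length 26-dimensional vector per segment. Whether mean+std is sufficient for emotion recognition is exactly the open question; the sketch only shows the mechanics.

import numpy as np
import librosa

sr = 16000
# Placeholder audio; in practice use y, sr = librosa.load("speech_sample.wav", sr=sr)
y = np.random.randn(10 * sr).astype(np.float32)

mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)       # shape: (13, n_frames)

hop_length = 512                                         # librosa's default hop
frames_per_segment = int(3.0 * sr / hop_length)          # ~3 seconds of frames

segment_vectors = []
for start in range(0, mfcc.shape[1] - frames_per_segment + 1, frames_per_segment):
    segment = mfcc[:, start:start + frames_per_segment]  # (13, frames_per_segment)
    # Pool over time: 13 means + 13 standard deviations -> one 26-dim vector.
    vec = np.concatenate([segment.mean(axis=1), segment.std(axis=1)])
    segment_vectors.append(vec)

X = np.array(segment_vectors)     # one row per ~3 s segment, ready for a classifier
print(X.shape)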
If I want to implement signature texture feature extraction using the GLCM approach in Android Studio:
1- Why use this method in Android Studio? What are its features that would make me choose this method on Android?
I've implemented a neural network similar to the one in the research paper mentioned below for unsupervised feature extraction. It consists of 2 layers stacked one after another, and each layer has two sub-layers. The first sub-layer captures square nonlinearity relationships in the input image patch; the second groups the responses with respect to the subspace size. It is very similar to independent component analysis, except that it groups dependent sources together and these dependent groups are independent of each other. PCA is used at the beginning of each layer for dimensionality reduction.
My goal is to visualize the features (weights at each layer) and the feature maps. Since the dimensions are reduced, the visualization is not straightforward. What is the best way for me to visualize the feature maps/features?
Also, how was figure 4 achieved in this paper?
Conference Paper Representation Learning: A Unified Deep Learning Framework f...
I am using autoencoder networks for deep-learning-based feature extraction from spectral images. For the time being, the number of hidden nodes in the network is chosen randomly.
Is there any way to optimize this parameter so that the best feature representation is achieved?
What are the best techniques for geospatial datasets? Also, are there techniques that are better suited to stacking of models than to using a single model?
Hi All,
I have an audio database consisting of various types of signals, and I'm planning to extract features from the audio. I would like to know whether it's a good idea to extract basic audio features (e.g. MFCC, energy) from the audio signal with a large window (let's say 5 s width with 1 s overlap) rather than the conventional small frame size (in ms). I know that the audio signal exhibits homogeneous behavior over a 5 s duration.
Thanks in advance
I am new to machine learning and I am currently doing research on speech emotion recognition (SER) using deep learning. I found that recent studies mostly use CNNs and that only a few use RNNs for SER. I also found that most approaches use MFCCs.
My questions are:
- Is it true that CNN has been proved to outperform RNN in SER?
- If yes, what are the limitations that RNN have compared with CNN?
- Also, what are the limitations of the existing CNN approaches in SER?
- Why is MFCC used the most in SER? Does MFCC have any limitations?
Any help or guidance would be appreciated.
I'm working on an activity recognition project using smartphone sensor data, and I want to implement the feature extraction phase.
Assume the data collection frequency is 20 Hz and the sliding window is 2 s long. That means we have 40 data samples in each window. So, when we do feature extraction, we obtain just one feature-extracted data point per sliding window. For example, if we use the mean as a feature, we calculate the mean of all 40 samples in each window. In effect, we have a kind of data reduction here: from 40 raw samples to 1 feature value.
I would really appreciate it if anyone could tell me which parts of my understanding are right and which are wrong.
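For what it's worth, here is a minimal NumPy sketch of the window-wise reduction described above, with placeholder 3-axis accelerometer data, 20 Hz sampling, and non-overlapping 2 s windows of 40 samples; in practice several statistics per window and per axis are usually combined rather than a single mean.

import numpy as np

fs = 20                      # sampling frequency (Hz)
window = 2 * fs              # 40 samples per 2 s window (non-overlapping here)

# Placeholder 3-axis accelerometer stream: (n_samples, 3), one minute of data.
signal = np.random.randn(20 * 60, 3)

n_windows = signal.shape[0] // window
windows = signal[:n_windows * window].reshape(n_windows, window, 3)

# Each window of 40 raw samples is reduced to a few statistics per axis.
mean_feat = windows.mean(axis=1)              # (n_windows, 3)
std_feat = windows.std(axis=1)                # (n_windows, 3)
features = np.hstack([mean_feat, std_feat])   # (n_windows, 6)

print(features.shape)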
Features extracted using MFCC give a 13*n matrix, whereas those extracted using GFCC give a 64*n matrix.
What is the reason for this?
https://github.com/mvansegbroeck/featxtra/issues/11 contains an image of features extracted using GFCC.
https://github.com/mvansegbroeck/featxtra/blob/master/feat/feature-gtf.cc This link shows the method used to evaluate GFCC features. This is a module which is integrated with kaldi toolkit used for Automatic Speech Recognition (ASR).
Can somebody verify the correctness of the GFCC module? Is it really supposed to give a 64*n matrix?
This is mainly addressed to biologists, especially those studying sensory perception in animals and/or human beings. People from other areas, such as medicine or engineering, who have turned their attention to neuroscience are also welcome to contribute. I want to know how our nervous system extracts pertinent features, such as frequency-selective transduction in the cochlea by the hair cells and orientation-selective neurons in the brain. What other such processing is known that performs dedicated feature extraction from sensory inputs?
Queries to search engines (such as Google) contain some information about the intent and interests of the user, which can be used for user profiling, recommendation, etc. As far as I know, there are already many methods for dealing with relatively long texts, such as news, articles, and essays, and extracting useful features from them. However, queries are usually very short and may relate to many different areas. I wonder if there are advanced methods (beyond simple word embeddings) that have already been verified to be effective for extracting information from query texts? Thanks!
Data file -> contains some tabular data. It can be JSON or any other file type, but it holds structured data.
Config file -> contains the configuration of the project, in any file type.
We are looking for a solution that uses the contents of a file to classify it as a data file or a config file.
*we currently do not have any available dataset for this.
I am looking at different types of excitation methods applied to a structure:
- Shaker
- Impact Hammer
I would like to know what types of features can be extracted from the sensor for these types of excitation. I also have 2 side questions:
- When using, for example, the wavelet domain or time series models where coefficients are the output of the model, how can the coefficients be used as features in machine learning?
- What other excitation methods can be applied to a structure besides the methods mentioned above?
Hello Everyone!
I am working on an application to which I am adding facial recognition. The application will send the picture captured by a user to Firebase ML Kit, where the image classification algorithm will check whether the captured image is of the same person as labeled.
The user of the application might be wearing a veil (most of the face covered except the eyes).
What I want to know is which algorithm I should use so that the model can handle veiled and non-veiled cases equally well.
I have never dealt with image processing before, so I cannot decide whether I should use a face recognition algorithm or some feature extraction algorithm. Any help would be greatly appreciated. Thanks for your cooperation.
I have a sequence of data vectors taken from a periodic space. I'm looking for a shift-invariant transformation. I tried the magnitudes of the FFT, but it did not work well. Do you have other ideas? The shift in the periodic space that I need to handle is constant for the entire sequence.
I would like to know a detailed, step-by-step method for feature extraction from an ECG signal.
Dear All,
I want to extract facial features from images. Which is the best algorithm for this? Would anybody please point me to a survey paper, if one is available? Also, please give me some input on creating an emotion classification dataset, and on which features should be considered as attributes of the dataset (facial expression recognition).
I've been working with my project partner on developing various scikit-learn IDS models that take some pcap traffic data as input and output the traffic classification (benign or attack name). Now we want to deploy our models in production, so the input will be real-time traffic. What tools do we have to consider in order to make this work on a real-time network interface?
Thanks!
If we have multiple classifiers and we need to know which one is under-fitting and which one is over-fitting based on performance factors (classification accuracy and model complexity):
Is there a method to select the dominant classifier (optimal fit) that balances the two factors mentioned above?
Hello everyone, I'm working on audio analysis for emotion classification. I'm using Parselmouth (a Praat integration in Python) to extract features. I'm not well versed in audio analysis; I'm just starting. After reading many papers and forums, I see that MFCCs are used for this. I've also discovered some other features (jitter, shimmer, HNR, F0, zero crossings): are they also used for this task?
What do I have to do with the audio files before extracting the MFCCs and these features?
After obtaining these features, I have to predict the emotion using machine learning.
It will involve:
- The algorithm must be able to make predictions in real time or near real time
- Taking into account the sex and the neutral voice of each person (for example, by centering and scaling the model variables to consider only their variations with respect to the mean, a mean which will change value as the sequential analysis proceeds, since it will first be calculated between 0 and 1 second, then 0 and 2 seconds, etc.)
Any help and suggestions on best practice are welcome.
Thanks
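A minimal Parselmouth sketch of the pitch/jitter/HNR side mentioned above, using a synthetic sine tone as a stand-in for a real recording; the Praat command names and parameter values follow common Praat recipes, but please double-check them against the Praat manual for your use case.

import numpy as np
import parselmouth
from parselmouth.praat import call

# Synthetic 120 Hz tone as a stand-in; in practice use parselmouth.Sound("utterance.wav").
t = np.arange(0, 2.0, 1 / 44100)
snd = parselmouth.Sound(0.3 * np.sin(2 * np.pi * 120 * t), sampling_frequency=44100)

# Fundamental frequency (F0): mean over voiced frames.
pitch = snd.to_pitch()
mean_f0 = call(pitch, "Get mean", 0, 0, "Hertz")

# Jitter needs a PointProcess of glottal pulses (75-500 Hz is a common F0 search range).
point_process = call(snd, "To PointProcess (periodic, cc)", 75, 500)
jitter_local = call(point_process, "Get jitter (local)", 0, 0, 0.0001, 0.02, 1.3)

# Harmonics-to-noise ratio via harmonicity (cc) with typical default parameters.
harmonicity = call(snd, "To Harmonicity (cc)", 0.01, 75, 0.1, 1.0)
hnr = call(harmonicity, "Get mean", 0, 0)

print(f"F0 = {mean_f0:.1f} Hz, jitter(local) = {jitter_local:.4f}, HNR = {hnr:.1f} dB")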
(I cannot use logistic regression or any other model due to the problem situation.) I have 5 continuous variables as independent variables and 1 categorical target variable. I am trying to find out the correlation / degree of association / amount of variance explained between the independent continuous variables and the dependent categorical variable. Which test should I use? Which effect size should I use? I am using the ANOVA test (I feel ANOVA can also be applied the other way around, i.e. not only category-to-continuous, but generalized to continuous-to-category). Is my interpretation correct?
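A minimal sketch of the one-way ANOVA reading of this setup (one continuous variable grouped by the categorical target) together with eta-squared as an effect size, using random placeholder data; whether ANOVA's assumptions hold for the real data is something to check separately.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(size=300)                       # one continuous independent variable
target = rng.integers(0, 3, size=300)          # categorical target with 3 classes

groups = [x[target == g] for g in np.unique(target)]
f_stat, p_value = stats.f_oneway(*groups)

# Eta-squared = SS_between / SS_total: proportion of variance in x associated with the category.
grand_mean = x.mean()
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_total = ((x - grand_mean) ** 2).sum()
eta_squared = ss_between / ss_total

print(f"F = {f_stat:.2f}, p = {p_value:.3f}, eta^2 = {eta_squared:.3f}")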
I am trying to figure out ways to label the click logs from advertising log data for ad-fraud detection. The labels are: fraud, not fraud.
One way that comes to mind is rule-based: I manually analyze a few hundred records and flag them as fraud if certain conditions are met.
Secondly, I could use clustering algorithms. I tried this but couldn't get it right because I have mixed categorical and numeric data (I tried K-means and K-medoids), the data are very sparse, and the categorical features tend to expand over time when using one-hot encoding.
Any ideas for labeling the data automatically/algorithmically?
Thanks,