# KNN - Science method

Explore the latest questions and answers in KNN, and find KNN experts.
Questions related to KNN
• asked a question related to KNN
Question
I am getting an extremely low value of the piezoelectric coefficient d33 for Li-doped KNN ceramic compared with the values reported in the literature at the MPB. What could be the reason?
Piezoelectric coefficient
• asked a question related to KNN
Question
I'm running a binomial GWR in both MGWR and GWR4, and sometimes models get a high number of neighbors, sometimes even close to N-1. Is that a problem?
Should I interpret the number of neighbors relative to number of observations as a metric of model quality?
Also, should I be worried when GWR requires a large number of neighbors to have invertible matrices?
In view of your description, if I understand it correctly, you can certainly use the GWLR model if the AICc value of your local model differs from that of the global one by a certain margin.
• asked a question related to KNN
Question
Hi, I have been working on some Natural Language Processing research, and my dataset has several duplicate records. Should I delete those duplicate records to improve the performance of the algorithms on test data?
I'm not sure whether duplication has a positive or negative impact on the test or training data. I found some conflicting answers online, which left me confused!
For reference, I'm using ML algorithms such as Decision Tree, KNN, Random Forest, Logistic Regression, and MNB, as well as DL algorithms such as CNN and RNN.
Hi Abdus,
I would suggest you check the performance of your model with and without the duplicate records. Generally, duplication may increase the bias of the data, which may lead to a biased model. To address this you can use a data augmentation approach.
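A quick way to run that comparison is to score the same classifier on the data with and without duplicates; a minimal sketch in Python (assuming scikit-learn, with a made-up toy dataset):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Simulate duplication by stacking an exact copy of the first 50 rows.
X_dup = np.vstack([X, X[:50]])
y_dup = np.concatenate([y, y[:50]])

# Deduplicate: keep only unique (row, label) pairs.
_, idx = np.unique(np.column_stack([X_dup, y_dup]), axis=0, return_index=True)
X_uni, y_uni = X_dup[np.sort(idx)], y_dup[np.sort(idx)]

def score(X, y):
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)
    return KNeighborsClassifier(5).fit(Xtr, ytr).score(Xte, yte)

print(X_dup.shape[0], X_uni.shape[0])          # 250 vs 200 rows
print(score(X_dup, y_dup), score(X_uni, y_uni))
```

Comparing the two scores on your own data is the most direct evidence of whether duplicates help or hurt.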
• asked a question related to KNN
Question
I made KNN nanofibers using the following recipe: TCD 5 cm, voltage 11 kV, flow rate 0.75 ml/h, at around 22.6 °C and 16% humidity.
But the rainy season is coming, and the ambient conditions are now 28 °C and 45% humidity.
When I electrospin the same KNN solution under the same electrospinning conditions, the nanofibers solidify early and hang on the collector, as seen in the picture.
I tried changing the voltage to 7–11 kV and the TCD to 3–5 cm, but the results were the same.
How can I solve this problem? I can't wait until the rainy season is over, and I can't change the solution's concentration.
Humidity can discharge your jet, so in addition to what Marc Simonet proposed above, I would also try increasing the voltage instead of decreasing it.
• asked a question related to KNN
Question
Hello Friends,
I am applying ML algorithms (DT, RF, ANN, SVM, KNN, etc.) in Python to my dataset, which has continuous features and target variables. For example, when I use DecisionTreeRegressor I get an R² of 0.977. However, I would like to use classification metrics such as the confusion matrix and accuracy score. For this, I converted the continuous target values into categorical ones. Now when I apply DecisionTreeClassifier, I get an accuracy of 1.0, which I think indicates overfitting. I then applied normality checks and correlation techniques (Spearman), but the accuracy remained the same.
My question is: am I right to convert the numeric data into categorical data?
Secondly, if both a regressor and a classifier are used on the same dataset, will the accuracy change?
For details, please see the attached files.
Thanks for the time
I think there are two misconceptions here.
1) There is no reason to expect similar accuracies for regression and classification on the same data set. Turning a regression problem into a classification problem is tricky, and essentially pointless.
2) R² is definitely not a valid index of the quality of a regression model. Imagine a model that systematically predicts 10 times the observed value: the squared correlation between predictions and observations will equal 1, although the model is obviously very poor. For regression, the most useful quality index is the root mean squared error, computed on a test set, i.e. on data that have never been used for designing the model, neither for training nor for model selection.
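The 10x-model point can be checked numerically: the squared correlation is perfect while the RMSE exposes the error. A small sketch with made-up numbers:

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_pred = 10 * y_true  # model systematically predicts 10x the observation

r = np.corrcoef(y_true, y_pred)[0, 1]
r_squared = r ** 2                                # squared Pearson correlation
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))   # root mean squared error

print(r_squared)  # 1.0 -- "perfect" correlation despite a useless model
print(rmse)       # a large error reveals the problem
```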
• asked a question related to KNN
Question
As a student who wants to design a chip for processing CNN algorithms, I ask my question. If we want to design an NN accelerator architecture with RISC-V for a custom ASIC or FPGA, which problems or algorithms do we aim to accelerate? It is clear that we accelerate the MAC (multiply-accumulate) operations with parallelism and other methods, but aiming for MLPs or CNNs makes a considerable difference in the architecture.
As I have read, CNNs are mostly used for image processing, so anything involving images is usually related to CNNs. Is it an acceptable idea to design an architecture to accelerate MLP networks? For MLP acceleration, which hardware blocks should I additionally work on? Or is it better to focus on CNNs and understand and work on them more?
As I understand your question, you want to design a chip for your NN. There are two different worlds. One is developing an NN and converting it into an RTL description; concerning this, if your design is solely for an ASIC, then you have to take care of the memories and their sizes, and you can use pipelining and other architectural techniques to design a robust architecture. The other is implementing it on an ASIC with a commercial library of choice; this is the job of the design engineer who takes care of the physical implementation. Lastly, if you want to target an FPGA, then you should take care to exploit the DSPs and BRAMs in your design to get the maximum performance from the NN.
• asked a question related to KNN
Question
I am trying to do this as my undergrad research topic: a better kNN algorithm. My model DOES require training, so the "training-free" advantage of traditional kNN does not apply here. What are some other advantages of kNN?
Dear Wenqi Guo ,
What are the advantages of KNN?
Simple to implement and intuitive to understand.
Can learn non-linear decision boundaries when used for classification and regression.
No training time for classification/regression: the KNN algorithm has no explicit training step, and all the work happens during prediction.
Regards,
Shafagat
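The non-linear-boundary point can be illustrated with a small sketch (assuming scikit-learn; the XOR-style toy data are made up for illustration):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# XOR-style labels: not linearly separable, but KNN handles them.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(400, 2))
y = ((X[:, 0] > 0) ^ (X[:, 1] > 0)).astype(int)

clf = KNeighborsClassifier(n_neighbors=5).fit(X, y)  # "fit" just stores the data
acc = clf.score(X, y)
print(acc)  # high accuracy on a boundary no linear model could learn
```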
• asked a question related to KNN
Question
Hello to everyone. I am trying to use a KNN analysis to fix minPts in the DBSCAN clustering algorithm. My dataset is composed of only 4 variables and 935 observations. I have found that with k = 5 (no. of variables + 1) the output of DBSCAN is 2 clusters: one of 911 observations and one of 8 observations. If I use a larger k, such as sqrt(no. of observations) as suggested in many papers, I get 909 observations in a single cluster and the rest are classified as noise points.
Both could be plausible results, but their meanings are fundamentally different. How can I get rid of this arbitrary choice of minPts, and hence of k?
Thanks!
The boundary becomes smoother with increasing values of K. The training error rate and the validation error rate are the two quantities we need in order to assess different K values.
To get the optimal value of K, segregate training and validation sets from the initial dataset, then plot the validation error curve to find the optimal K. This value of K should be used for all predictions.
The optimal K value is often found near the square root of N, where N is the total number of samples. Use an error plot or accuracy plot to find the most favorable K value. KNN performs well with multi-label classes, but you must be aware of outliers.
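The validation-error-versus-K procedure described above can be sketched as follows (assuming scikit-learn; the synthetic dataset and the K range 1–25 are arbitrary choices for illustration):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Made-up dataset standing in for the real one.
X, y = make_classification(n_samples=300, n_features=4, random_state=0)

# Cross-validated error for each candidate K; the minimum gives the K to use.
ks = range(1, 26)
errors = [1 - cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
          for k in ks]
best_k = ks[int(np.argmin(errors))]
print(best_k, min(errors))
```

Plotting `errors` against `ks` gives the validation error curve mentioned above.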
• asked a question related to KNN
Question
I have received comments from reviewers on a paper applying ML techniques (RF, KNN, NNet, etc.) to data with 384 samples. One of the reviewers is extremely concerned about using machine learning techniques on 384 samples, as this seemed very small to him/her. If there is any paper where ML is used with 350 samples or fewer, it would be greatly helpful to me; I could cite the paper as evidence that ML can be used with relatively small samples.
TIA.
• asked a question related to KNN
Question
I need the MATLAB code of the above-mentioned models. How can I find it?
Please provide me with KNN MATLAB code.
• asked a question related to KNN
Question
I want to apply a machine learning algorithm such as KNN, SVM, or Decision Trees to a MIDI music dataset. Any implementation ideas in Python or MATLAB? Thank you so much in advance.
• asked a question related to KNN
Question
Meta-heuristics can be used in feature selection as wrapper methods. Many works use a KNN classifier to evaluate the selected subsets. But if I want to apply FS to regression models, what kinds of machine learning algorithms can I use in MATLAB?
• asked a question related to KNN
Question
I am working on binary classification. I have 2k tweets in my dataset. I applied NB, SVM, LR, DT, RF, and KNN, and got the best result with LR. How do I justify this result? How can I explain why Logistic Regression works well?
Stay Happy Stay Healthy
• asked a question related to KNN
Question
Problem: "Semantic segmentation of humans and vehicles in images".
Following are the given information related to solve this problem;
Experimental study:
using a machine learning model: SVM, KNN, or another model
Using a deep learning model:
either semi-DL: ResNet, VGG, Inception (GoogLeNet), or others
full DL: YOLO, U-Net, the CNN family (CNN, R-CNN, Faster R-CNN), or others
Evaluation of the two models in the learning phase
Evaluation of both models with test data
Exploration & descriptions & analysis of the results obtained (confusion matrix, specificity, accuracy, FNR)
Thanks to all
• asked a question related to KNN
Question
machine learning
K-means clustering represents an unsupervised algorithm, mainly used for clustering, while KNN is a supervised learning algorithm used for classification.
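The contrast can be sketched in a few lines (assuming scikit-learn; the two-blob toy data are made up): K-means is fit on X alone, while KNN needs the labels y at fit time.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# Unsupervised: K-means sees only X and discovers groupings on its own.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Supervised: KNN requires y at fit time and predicts labels for new points.
pred = KNeighborsClassifier(n_neighbors=3).fit(X, y).predict([[3.1, 2.9]])
print(pred)  # [1]
```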
• asked a question related to KNN
Question
For a dataset with 5 categorical classes, what would be the best classifiers?
Size of data: (670, 49).
Use ECOC-based SVM in MATLAB.
• asked a question related to KNN
Question
I have prepared a relaxor ferroelectric (KNN–BZZ solid solution) bulk sample. During P–E loop measurements, we observed a saturation electric field of about 100 kV/cm before dielectric breakdown. However, people have reported saturation fields of about 350 kV/cm. How is this possible? I thought it might be related to the density of the bulk samples, but we also used a cold isostatic press and the results are the same.
@Junjie Li, you provide a good supplement. Generally, a larger thickness requires a higher electric field. The electrode also has an influence; for example, Ti and Ag electrodes are better than Pt electrodes because of their lower work function. Besides, the volume of the sample (or the electrode area) also plays a role.
• asked a question related to KNN
Question
I am currently trying to classify images using a HOG descriptor and a KNN classifier in C++.
The size of the feature vectors obtained with HOG depends on the dimensions of my images, and they all have different dimensions!
How can I make the size of the feature vector independent of the image dimensions (without adapting the number of cells and blocks), or how can I use the KNN classifier with differently sized feature vectors?
You can resize all the images to the same size and then apply the HOG descriptor.
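A minimal illustration of the idea, using a hand-rolled nearest-neighbour resize so that every image yields a feature vector of the same length (pure NumPy; the 64x64 target size is an arbitrary choice):

```python
import numpy as np

def resize_nn(img, out_h, out_w):
    """Nearest-neighbour resize: sample a fixed grid of rows/columns so every
    image, whatever its original size, maps to the same output shape."""
    h, w = img.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[np.ix_(rows, cols)]

imgs = [np.random.rand(37, 52), np.random.rand(120, 80)]  # different dimensions
features = [resize_nn(im, 64, 64).ravel() for im in imgs]
print({f.shape for f in features})  # every descriptor now has the same length
```

After this step, HOG (or any descriptor) produces equal-length vectors, so KNN can compare them directly.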
• asked a question related to KNN
Question
I am performing a KNN classification on a sample with n = 5 and m = 200 (n: # features, m: # samples). To find the optimum value of K, I calculated an error rate for each K (1:50). However, I found that the error-rate vector, as well as the classification report, is not fixed; it changes with each run.
It looks like the dataset is shuffled differently each time. You need to fix the training and testing datasets, so the data are the same each time you change the value of k.
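In scikit-learn this amounts to fixing `random_state` in the split; a small sketch (the toy data are made up for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Fixing random_state makes the split -- and hence the error for each k -- reproducible.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=42)

scores = {k: KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr).score(X_te, y_te)
          for k in range(1, 11)}
scores2 = {k: KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr).score(X_te, y_te)
           for k in range(1, 11)}
print(scores == scores2)  # True: identical results on every run
```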
• asked a question related to KNN
Question
If the dataset has mixed (numerical, nominal, and binary) features, then to classify such data we need to define a new measure able to handle all three types.
Generally, a combined approach is widely used in this case.
If the new measure is a combination of different measures, it is very difficult to fulfill the triangle inequality, so it can be a similarity measure or a distance but not a metric.
My question is:
Is it possible to use such a measure with KNN,
given that KNN is usually assumed to require a metric distance?
In general:
Can I use KNN with a non-metric measure for classifying the data?
Dear Najat,
This is truly an interesting question.
I would add that, if the data are even more complex, also containing missing values, for instance, some “metrics” also break the reflexivity property. This happens in fact with some well-known heterogeneous distance metrics (HEOM and HVDM), and yet they are still the current state-of-the-art distance metrics for handling mixed (heterogeneous) data.
Now, to answer your question directly, it seems that KNN has been previously used with pseudo-metrics. However, for heterogeneous data, if you don’t have missing values, HEOM and HVDM respect the metricity (they are metrics) and are perhaps a suitable option in your case.
You may find these papers worthy of consideration (heterogeneous distance functions and some considerations regarding their metricity, properties and behavior):
Wilson, D. R., & Martinez, T. R. (1997). Improved heterogeneous distance functions. Journal of artificial intelligence research, 6, 1-34.
Juhola, M., & Laurikkala, J. (2007). On metricity of two heterogeneous measures in the presence of missing values. Artificial Intelligence Review, 28(2), 163-178.
Santos, M. S., Abreu, P. H., Wilk, S., & Santos, J. (2020). How Distance Metrics influence Missing Data Imputation with k-Nearest Neighbours. Pattern Recognition Letters.
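For what it's worth, scikit-learn's KNN accepts a user-defined distance (with brute-force search), so a simplified HEOM-style measure can be plugged in directly. A sketch under strong assumptions: no missing values, numeric columns already rescaled to [0, 1], categoricals integer-encoded; the data are made up:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy mixed data: column 0 numeric (rescaled to [0, 1]), column 1 categorical.
X = np.array([[0.1, 0], [0.2, 0], [0.8, 1], [0.9, 1], [0.15, 1], [0.85, 0]])
y = np.array([0, 0, 1, 1, 0, 1])

NUMERIC = [0]      # indices of numeric columns
CATEGORICAL = [1]  # indices of categorical columns

def heom(a, b):
    """Simplified HEOM: normalised absolute difference for numeric columns,
    0/1 overlap for categorical columns (missing values not handled here)."""
    d_num = np.abs(a[NUMERIC] - b[NUMERIC])
    d_cat = (a[CATEGORICAL] != b[CATEGORICAL]).astype(float)
    return np.sqrt(np.sum(d_num ** 2) + np.sum(d_cat ** 2))

clf = KNeighborsClassifier(n_neighbors=3, metric=heom, algorithm="brute").fit(X, y)
pred = clf.predict([[0.12, 0.0]])
print(pred)  # [0]
```

Note that sklearn does not verify metricity of a callable, so a non-metric measure will also run; whether the resulting neighbourhoods are meaningful is then your responsibility.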
• asked a question related to KNN
Question
I am a beginner with neural networks.
I have 6 classes (1; 2; 3; 4; 5; 6) with 8 samples in each class, so 48 time series in all.
The features are the peaks visible after plotting the data.
Which classification program can I use here?
And how can I do it?
• This is the data
• Each class has 8 columns
• Each column contains 1000 values
• The feature for categorizing here is the peaks after plotting each column
• I made an attempt but it was not successful
• Can it be converted to SVM?
• asked a question related to KNN
Question
Hello dear friends
I need to run the K-nearest neighbor (KNN) model to predict flood risk areas using R software. Does anyone have this model code in the R software?
Just check the function “knn” in R; it is provided by the “class” package, which you may have to install first.
You may also want to use the “caret” package. It contains different functions (such as knn) for modelling complex regression and classification problems.
• asked a question related to KNN
Question
Hello,
I'm currently working on my MSc thesis. One of the tasks I should do is classifying a crash dataset with different methods with R.
The dataset has 4 labels (fatal, major injury, minor injury, and PDO) and 12 predictors. The classes have an equal number of rows (about 600 each). I applied classification to the data with the ANN, SVM, KNN, CART, and Random Forest methods. Unfortunately, all methods give me very low kappa and overall accuracy. I tried using just some of the predictors, tuning the parameters, and both scaling and not scaling the data, but nothing changed. My data is attached, and I would appreciate anyone's opinion on this problem.
Thank you.
As all intelligent systems are sensitive to the initial state, change the domain of the initial state (the weights in an artificial neural network, for example).
Otherwise,
try min-max normalization on your dataset rather than plain scaling,
or use z-score normalization instead as a pre-processing step.
Besides, if you use artificial neural networks, make sure your input vectors do not force your activation (transfer) function into saturation regions, resulting in a static response of 1 or 0 for all data items.
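For reference, the two normalizations mentioned above look like this in NumPy (toy numbers made up for illustration):

```python
import numpy as np

X = np.array([[10.0, 200.0], [12.0, 180.0], [8.0, 220.0]])

# Min-max normalization: rescales each column to [0, 1].
X_mm = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Z-score standardization: each column gets mean 0 and std 1.
X_z = (X - X.mean(axis=0)) / X.std(axis=0)

print(X_mm.min(axis=0), X_mm.max(axis=0))  # columns span exactly [0, 1]
print(X_z.mean(axis=0).round(10))          # column means are ~0
```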
• asked a question related to KNN
Question
Hello everyone,
I am currently trying to optimize a pulsed laser deposition process for the growth of (K0.5Na0.5)NbO3 piezoelectric perovskite on Pt(111), and I've been struggling to figure out the origin of a phenomenon I see in the film morphology: the film presents a predominant (100) texture, but displays what look like unoriented cubic crystallites or terraces (see picture).
Has anyone seen anything like this before, and does anyone know what it could be due to and/or how to remove these defects?
First of all, thanks for the replies.
Jürgen Weippert, the image I posted represents the best result I have obtained so far, and both the growth parameters and the substrate cleaning seem to influence their appearance. Also, with increasing thickness they seem to grow in size, but not significantly in number.
I am aware that unoriented grains can be quite common in film growth, but their perfect cubic shape seems quite unusual, and I wonder if this is something standard for these kinds of systems, because I haven't been able to find references in the literature presenting anything similar so far.
Vladimir Dusevich it will be in a couple of months and I will most certainly give it a try, thanks for the advice!
• asked a question related to KNN
Question
Hello everyone;
I'm preparing my master's thesis on influenza prediction using deep learning. The data I have are the rates of dangerous and suspicious cases per week and per region around the country.
So far I have implemented Random Forest, KNN, DNN, LSTM, CNN, CNN-LSTM, and Deep Belief Network. I concluded that I am facing a time-series forecasting problem, so I used the window method to make it supervised, with window_size = 3, 2, and 1. Calculating the r2_score, I got 10 regions under 70% (which I have read is the acceptable threshold).
So I'm writing this question hoping to find a solution, an idea, or another deep learning technique (perhaps a special architecture of the techniques used above) to improve my predictions in these regions.
(Attached are some pictures of the regions I want to improve and of my LSTM model.)
Spatial autoregressive models are statistical models. A good paper is: ‘‘About predictions in spatial autoregressive models: Optimal and almost optimal strategies’’.
The following link gives the R code used to obtain the simulation results included in that paper:
• asked a question related to KNN
Question
I am developing an Android application for human activity recognition. The classification results are satisfactory with other classifiers like KNN and Decision Tree. But do you think I should implement a Convolutional Neural Network on the mobile phone to recognize the activities? I am sure it will increase the accuracy, but what about the computational complexity and resource utilization?
Yes. Deep learning is an incredibly versatile and powerful technology, but running neural networks can be pretty demanding in terms of computational power, energy consumption, and disk space. This is usually not a problem for cloud applications, which run on servers with large hard drives and multiple GPUs.
Unfortunately, running a neural network on a mobile device isn’t that easy. In fact, even though smartphones are becoming more and more powerful, they still have limited computational power, battery life, and available disk space, especially for apps that we like to keep as light as possible. Doing so allows for faster downloads, smaller updates, and longer battery life, all of which users appreciate.
In order to perform image classification, portrait mode photography, text prediction, and dozens of other tasks, smartphones need to use tricks to run neural networks fast, accurately, and without using too much disk space.
In this article, we’ll look at a few of the most powerful techniques to enable neural networks to run in real time on mobile phones.
Refer to this link to get some ideas.
• asked a question related to KNN
Question
thanks a lot Owais Bhat
• asked a question related to KNN
Question
I will be using different kinds of algorithms, like SVM, KNN, Naive Bayes, and Random Forest. How will I know which algorithm to use when for increasing crop productivity?
You need to understand the exact nature of the problem you want to solve, and then understand the requirements, efficiency, and results of each algorithm. Then you can choose the appropriate algorithm for your problem.
• asked a question related to KNN
Question
I would like to know which method, SVD or KNN, will yield better prediction accuracy in recommendation systems. Has anyone done a comparative study that I can refer to?
The accuracy depends on the source of the data and the target objective.
In most cases, kNN is better on datasets with a low proportion of missing data, while SVD is better on very large datasets.
• asked a question related to KNN
Question
I am working on PSO for feature selection. I use the KNN algorithm with 10-fold cross-validation for the evaluation. Before using 10-fold CV the algorithm was quite cheap, with no high computational cost, but after switching to 10-fold CV the code runs too slowly, sometimes for days. Is there a problem with how I am performing the 10-fold CV? I have attached my code. Please help me! Thanks a million.
10-fold CV means you are repeating the process 10 times, so you should expect the cost to rise roughly 10-fold. Additionally, some partitions of the data may not include all classes, leading to difficulty in reaching an acceptable rate; this may appear if your code expects a minimum accuracy to be reached.
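Regarding partitions that miss a class: stratified folds avoid that problem by preserving the class proportions in every fold. A small sketch (assuming scikit-learn; the dataset and parameters are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
clf = KNeighborsClassifier(n_neighbors=5)

# StratifiedKFold keeps the class proportions in every fold,
# so no fold ends up missing an entire class.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv)
print(len(scores), scores.mean())  # 10 fold scores and their average
```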
• asked a question related to KNN
Question
I am looking for articles related to the optimization of SVM, KNN, ensemble, and decision-tree classifiers.
I have proposed a new classification technique based on KNN.
• asked a question related to KNN
Question
For my thesis, I am researching different classifier algorithms to detect people from a top-down view, so that it is possible to detect and count people in a real-time video feed. I am keeping the research pre-neural-network era.
Another thing is that I am completely new to object detection and tracking, and also new to machine learning.
I have learned a lot about background subtraction, HOG + SVM (or the classifier SVC), and I have also learned about Haar features and the Haar classifier.
But my real question is why feature-extraction algorithms are so often paired with one specific classifier (like HOG with SVC, and not with KNN or forests). I have not been able to figure this out.
P.S. I'm stuck with the practical part of my thesis, so I could use some guidance; if you want to give me some, please contact me.
The best features for detection in images are corner points.
• asked a question related to KNN
Question
Hi, I am trying to solve the problem of an imbalanced dataset using SMOTE in text classification, while using TfidfTransformer and k-fold cross-validation. I want to solve this problem with Python code. It has actually taken me over two weeks, and I couldn't find any clear and easy way to solve it.
Do you have any suggestions on where exactly to look?
After implementing SMOTE, is it normal to get different accuracy results on the dataset?
You need to fix the seed number so that you can replicate the result each time you perform the task.
HTH.
Dr. Samer Sarsam
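To see why a fixed seed makes oversampling reproducible, here is a minimal SMOTE-style interpolation sketch written from scratch (NumPy only; this is a simplified stand-in for a real SMOTE implementation, not the imblearn API):

```python
import numpy as np

def smote_like(X_minority, n_new, k=3, seed=0):
    """Minimal SMOTE-style oversampling: interpolate between a minority sample
    and one of its k nearest minority neighbours. Fixing `seed` makes the
    synthetic points -- and hence downstream accuracy -- reproducible."""
    rng = np.random.default_rng(seed)
    new = []
    for _ in range(n_new):
        i = rng.integers(len(X_minority))
        d = np.linalg.norm(X_minority - X_minority[i], axis=1)
        neighbours = np.argsort(d)[1:k + 1]       # skip the point itself
        j = rng.choice(neighbours)
        lam = rng.random()                        # interpolation factor in [0, 1)
        new.append(X_minority[i] + lam * (X_minority[j] - X_minority[i]))
    return np.array(new)

X_min = np.random.default_rng(42).normal(size=(10, 4))
a = smote_like(X_min, 5, seed=0)
b = smote_like(X_min, 5, seed=0)
print(np.allclose(a, b))  # True: same seed, same synthetic samples
```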
• asked a question related to KNN
Question
~specifically using horizontal quartz tube furnace
Certainly yes, try to optimize the distance
• asked a question related to KNN
Question
I have a dataset with 750 cases and five features, with an unbalanced distribution of the target class. I applied SMOTE to solve the class-imbalance problem, then applied ANN, Naive Bayes, LR, KNN, and Random Forest. I used cross-validation to evaluate model performance and learning curves (error rate as a function of sample size) to evaluate overfitting. I got an AUC of 89% to 92% and an accuracy of 85% to 88%.
I got a reply from a reviewer that the sample size is not enough and that learning curves are not evidence against overfitting!
For the class balance issue, you also have to be careful how you do the final test. If your original data are imbalanced and you believe they are representative of the world at large, then you have to keep that in mind during the testing step and actually test your final model on imbalanced data; otherwise you wouldn't know how your model would actually perform. Here, I agree with Sergey Porotsky that overall accuracy is not a good measure of your model's performance. As an alternative, you could compute the error rate individually for each class; you would want a similar error rate across all classes. Better yet, examine the confusion matrix and compute other performance measures.
Additionally, I would refrain from using synthetic data during the final test, since it is possible that some of the synthetically generated data points would never occur in the wild and your algorithm could be learning wrong. Besides, since you're using cross-validation, you will end up testing every one of your available data points.
Also, because you're using cross-validation, then by "final test" I mean the test at the end of each cross-validation step, the one you sum and divide by the number of folds. By the way, another helpful evaluation measure is the variance in your performance metrics across all of the folds. If variance is high, then your model is not doing well on some regions in the feature space.
When you don't have much data and computation time allows for this, leave-one-out cross-validation is probably the best way to go.
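The per-class error, pooled confusion matrix, and across-fold variance mentioned above can be computed as follows (assuming scikit-learn; the imbalanced toy dataset is made up):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_predict, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=300, weights=[0.8, 0.2], random_state=0)
clf = KNeighborsClassifier(5)

# Across-fold variance: a high std means the model is unstable in some regions.
scores = cross_val_score(clf, X, y, cv=10)
print(scores.mean(), scores.std())

# Per-class error from the pooled cross-validated confusion matrix.
y_hat = cross_val_predict(clf, X, y, cv=10)
cm = confusion_matrix(y, y_hat)
per_class_err = 1 - cm.diagonal() / cm.sum(axis=1)
print(per_class_err)  # one error rate per class; they should be comparable
```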
• asked a question related to KNN
Question
• Actually, I tried to make KNN-xBTO nanoparticles by the solid-state reaction technique, firing the precursors at high temperature (1200 °C) for 2 h at a heating rate of 5 °C (also 10 °C) min-1 inside an alumina crucible.
• After firing, the as-prepared final product becomes solidified and is almost impossible to remove from the bottom surface of the crucible.
• Can anyone please give me a suggestion on what to do in order to make this kind of nano-composite by solid-state reaction?
Hi Toyabur,
I have faced the same problem you describe. Please use some hot DI water once you remove the crucible after the reaction is complete; the solidified material will then be easy to remove. Then centrifuge it to collect your material, and sinter it again at 1200 °C.
good luck
• asked a question related to KNN
Question
I am doing my thesis on baby-cry detection. I built models with CNN and KNN: the CNN has 99% train accuracy and 98% test accuracy, and the KNN has 98% train accuracy and 98% test accuracy.
Please suggest which of the two algorithms I should choose, and why.
I would suggest you use the simpler one, which is by itself a good reason (easier to code, fewer hyperparameters to adjust, reduced computational complexity). Another important justification is that you compared it with more powerful techniques, such as CNN, and obtained similar accuracy.
• asked a question related to KNN
Question
Hello,
I have a historical time series of 72-year monthly inflows. I need to generate, say 100, synthetic scenarios using the historical data.
Simple resampling (by reordering annual blocks of inflows) is not the goal and not accepted. I want synthetic scenarios to have different monthly values, but all summing up to the same value of the annual inflow as in the historical one (e.g. if in the historical data 2015 inflow is 2400 MCM, I want the synthetic scenarios to have the same value for 2015 inflow but with different monthly values). Surely these synthetic scenarios must satisfy some statistical metrics such as autocorrelation etc.
Does anyone know of an existing code/script for this, preferably written in Excel, Python, or MATLAB? I've heard of ARMA, KNN, etc., but none suits my purpose.
Thanks
Dear Majed
I just came across your question; in case you are still struggling with this, I can help you. I have developed a novel model for simulating synthetic flow sequences and would be very happy to collaborate with you and share my algorithms.
Please get back to me at s.patidar@hw.ac.uk if you have any questions.
Cheers
Sandhya
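One simple starting point for the monthly-disaggregation part (not a full answer, since it does not by itself enforce autocorrelation) is to draw monthly fractions from a Dirichlet distribution centred on the historical monthly pattern and rescale them to the historical annual total; a hedged NumPy sketch with made-up data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical historical record: 3 years x 12 monthly inflows (MCM).
historical = rng.uniform(50, 400, size=(3, 12))
annual_totals = historical.sum(axis=1)

def synthetic_scenario(historical, rng, concentration=50.0):
    """One scenario: for each year, draw new monthly fractions centred on the
    historical monthly pattern (Dirichlet), then rescale so the year's total
    matches the historical annual total exactly."""
    out = np.empty_like(historical)
    for i, year in enumerate(historical):
        fractions = rng.dirichlet(concentration * year / year.sum())
        out[i] = fractions * year.sum()
    return out

scenarios = [synthetic_scenario(historical, rng) for _ in range(100)]
print(np.allclose(scenarios[0].sum(axis=1), annual_totals))  # True
```

The `concentration` parameter controls how far the synthetic monthly pattern may stray from the historical one; checking autocorrelation and other statistics afterwards would still be necessary.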
• asked a question related to KNN
Question
I am working on a dataset in which almost every feature has missing values. I want to impute the missing values with the KNN method. But as KNN works on distance metrics, it is advised to normalize the dataset before use. I am using the scikit-learn library for this.
But how can I perform normalization when there are missing values?
Thank you everyone for your valuable suggestions. I will work on these points.
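One workable pattern in scikit-learn is to standardize with NaN-aware statistics first and then apply `KNNImputer`, which uses a NaN-tolerant Euclidean distance internally; a small sketch with made-up data:

```python
import numpy as np
from sklearn.impute import KNNImputer

X = np.array([[1.0, 2.0], [2.0, np.nan], [3.0, 6.0], [np.nan, 8.0]])

# Standardize with NaN-aware statistics: missing entries are simply ignored
# when computing the mean/std, and stay NaN after scaling.
mu, sigma = np.nanmean(X, axis=0), np.nanstd(X, axis=0)
X_scaled = (X - mu) / sigma

# KNN imputation on the scaled data.
X_imputed = KNNImputer(n_neighbors=2).fit_transform(X_scaled)
print(np.isnan(X_imputed).any())  # False: all gaps are filled
```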
• asked a question related to KNN
Question
My KNN model has 80% accuracy and 85% precision, and my SVC has 79% accuracy and 85% precision. What is the reason for this? And are both of these models stable enough to use?
Dear Habib,
It is a nice question. To answer it, you need to understand the theoretical as well as the physical interpretation of the confusion matrix. Additionally, you need to understand the meanings of accuracy and precision. Only then will you be able to make sense of the difference.
Thanks,
Sobhan.
• asked a question related to KNN
Question
I have applied trainCascadeObjectDetector, KNN, feature matching, and estimateGeometricTransform in MATLAB, OpenCV, and Python.
Can anyone suggest another method to detect the symbol?
Thank you, ma'am @Veronica Biga
• asked a question related to KNN
Question
I am using the KNN algorithm with the sklearn library for authentication purposes.
When I train with one training sample per user I get 99% AUC, but when I increase the number of training samples per user, the AUC decreases.
Can someone explain what is going on?
Dear Eman
I think this is because of underfitting. You should try k-fold validation
• asked a question related to KNN
Question
I am trying to fabricate KNN thin films via RF magnetron sputtering using a lab-made target. However, there is a great discrepancy in the literature regarding the sintering temperature of the target, ranging from 650 °C to 1100 °C. Bulk KNN ceramics are usually sintered above 1000 °C. A target sintered at 650 °C has about 50% density, whereas the literature also states that targets should be dense. I am trying to find out the reason for the low-temperature sintering of targets.
One thing I have noticed is that a target sintered at higher temperature fractured due to plasma impingement, whereas a low-temperature-sintered target was intact after usage. Could this be the reason?
Potassium-based materials create such problems. Optimization is required with extra K.
• asked a question related to KNN
Question
Hi,
I would like suggestions for any paper/reference that discusses the KD-tree with the fixed-radius algorithm.
I found numerous references, but most of them are extensions/improvements of the KD-tree with fixed radius. My focus is on the fundamental concept.
I found this book, which discusses the KD-tree but does not include the fixed radius in NN searching.
I would appreciate it if anyone can help me.
Dear Aparna Sathya Murthy,
Thank you very much for your suggestion.
Since you are experienced with NNs, can I ask you about the KD-tree with a fixed radius in this video?
At minute 4:57, the purpose is to find the neighbors of the new point (7,4). When the circle is drawn, there should, by rights, be a few points in it. Because only the two points (9,6) and (7,2) lie within both the circle and the rectangular region, they are considered the nearest neighbors of (7,4). Is (5,4) excluded from the nearest neighbors of (7,4) because it is located outside the rectangle boundary?
• asked a question related to KNN
Question
I have training data of 1599 samples from 5 different classes with 20 features. I trained them using KNN, BNB, RF, and SVM (different kernels and decision functions), and used RandomizedSearchCV with 5-fold CV.
I get a training accuracy of no more than 60%, and the test accuracy is almost the same. I used class weights, as 2 classes have more samples than the others. I used PCA, which reduced my feature size to 12 while retaining 95% of the variance. None of this helped increase the accuracy of the SVM and RF classifiers.
Can anyone suggest other ways to improve the accuracy or F-score on my training data?
Dear,
You over-sampled the minority class, and that may influence the results of cross-validation. You should consider a combination of over-sampling and under-sampling.
SVM - Usually a strong classifier, therefore difficult to keep from overfitting, especially depending on whether and how you removed outliers/noise;
RF - Pruning is a good way to avoid overfitting, as is a minimum leaf size to some extent;
KNN - Since it is a deterministic algorithm, it seems to be affected by the over-sampling.
FFNN - You should look at the training/test curves; you may be able to control the overfitting with an early training stop. A dropout layer may also improve the generalisation of your algorithm.
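The combined over/under-sampling suggestion can be sketched in plain Python: randomly duplicate small classes and randomly subsample large ones toward a common target (the function name and toy data are hypothetical; libraries such as imbalanced-learn offer more principled variants like SMOTE):

```python
import random

def resample_to_balance(X, y, target_per_class=None, seed=0):
    """Over-sample small classes (with replacement) and under-sample large
    ones (without replacement) so every class ends up with the same count."""
    rng = random.Random(seed)
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    if target_per_class is None:
        sizes = [len(v) for v in by_class.values()]
        target_per_class = sum(sizes) // len(sizes)  # mean class size
    new_X, new_y = [], []
    for label, items in by_class.items():
        if len(items) >= target_per_class:
            chosen = rng.sample(items, target_per_class)                   # under-sample
        else:
            chosen = [rng.choice(items) for _ in range(target_per_class)]  # over-sample
        new_X.extend(chosen)
        new_y.extend([label] * target_per_class)
    return new_X, new_y

X = [[i] for i in range(12)]
y = [0] * 9 + [1] * 3          # imbalanced: 9 vs 3
Xb, yb = resample_to_balance(X, y)
```

Note that any resampling must happen inside each cross-validation training fold, never before the split, or the duplicated minority samples leak into the test folds.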
• asked a question related to KNN
Question
I am a bit confused about how to arrange the data.
I have 2 classes and if i arrange the data like
Case1
Feature1 feature2 feature 3 class 1
Feature1 feature2 feature 3 class 1
Feature1 feature2 feature 3 class 1
Feature1 feature2 feature 3 class 2
Feature1 feature2 feature 3 class 2
Feature1 feature2 feature 3 class 2
the k-fold accuracy is lower than the single train/test split accuracy.
Case2
and if I arrange the data like this
Feature1 feature2 feature 3 class 1
Feature1 feature2 feature 3 class 2
Feature1 feature2 feature 3 class 1
Feature1 feature2 feature 3 class 2
the k-fold accuracy is higher than the single train/test split accuracy.
So which one is the right approach?
thanks
----------------------------------------------------
edit
Let's discuss the results. In case 1,
the train/test split accuracy with a 25% test size is 90%,
and the k-fold CV accuracy is 69% with std 0.07%.
In case 2,
the train/test split accuracy with a 25% test size is 54%,
and the k-fold CV accuracy is 78% with std 0.08%.
In all cases, data is shuffled and randomized
The more randomized your data, the better the training and testing results. In MATLAB, you can randomize your feature vectors using the randperm function for random permutation.
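To see why the arrangement matters: with class-sorted data and unshuffled contiguous folds, a fold can hold out an entire class, so the model never trains on it. A small pure-Python illustration (toy labels, not the asker's features):

```python
import random

labels = [1] * 6 + [2] * 6          # Case 1: data sorted by class
k = 2

def contiguous_folds(n, k):
    """Split indices 0..n-1 into k contiguous test folds."""
    size = n // k
    return [list(range(i * size, (i + 1) * size)) for i in range(k)]

def train_classes(fold, labels):
    """Classes actually seen in training when `fold` is held out."""
    return {labels[i] for i in range(len(labels)) if i not in fold}

# Unshuffled: the first fold holds out all of class 1, so training sees only class 2.
bad = train_classes(contiguous_folds(len(labels), k)[0], labels)

# Shuffled indices: each fold's training set almost surely contains both classes.
idx = list(range(len(labels)))
random.Random(0).shuffle(idx)
shuffled_folds = [[idx[j] for j in fold] for fold in contiguous_folds(len(idx), k)]
good = train_classes(shuffled_folds[0], labels)
```

Stratified k-fold (e.g. scikit-learn's StratifiedKFold) goes one step further and guarantees every fold preserves the class proportions, which is the usual remedy here.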
• asked a question related to KNN
Question
Like KNN (potassium sodium niobate).
Hi Vineetha,
Refractive index of any semiconducting material can be approximated from its band gap using the following relationship:
n² = 1 + [A/(Eg + B)]²
where n is the refractive index, Eg is the band gap (eV), and A and B are constants equal to 13.6 and 3.4 eV, respectively.
The band gap itself can be determined either through diffuse reflectance spectroscopy using Kubelka-Munk theory (approximately) or Ultraviolet Photoelectron spectroscopy (accurate).
See the attached link for more details.
I hope this helps.
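As a quick numerical sketch of the relation above (constants as stated; the band-gap value is just an illustrative input):

```python
import math

A, B = 13.6, 3.4  # constants in eV, as given above

def refractive_index(eg_ev):
    """Approximate refractive index from the band gap via n^2 = 1 + (A/(Eg+B))^2."""
    return math.sqrt(1.0 + (A / (eg_ev + B)) ** 2)

n = refractive_index(3.0)  # e.g. a wide-gap material with Eg = 3.0 eV gives n ~ 2.35
```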
• asked a question related to KNN
Question
I am performing a classification task related to intrusion detection (binary classification, i.e., normal and attack). Accuracy and FAR are considered for comparison of the results of various classifiers like KNN, SVM, etc. I have done that over a dataset created from a wireless network (IoT). As there are no other datasets available for IoT (RPL-6LoWPAN), against which dataset should I compare the performance results? Should I use KDD99, UNSW-NB15, ISCX, etc.?
I think it depends on the main goal of the comparison process.
What is your goal for the comparison? What are you trying to demonstrate and prove? Is your goal to build an IDS for an IoT environment?
• asked a question related to KNN
Question
In a number of research papers the sintering temperature given for KNN is around 1050 °C. But when I heat the system in the muffle furnace at a heating rate of 5 °C/min, the material in the crucible evaporates or sticks to the crucible, which makes it almost impossible to remove.
If you perform the sintering heat treatment in a muffle furnace you will never get rid of oxygen. A sealed crucible means that you should use a lid on top of the crucible to minimize volatilisation. This does not prevent oxygen from entering the crucible. Try it and check whether it works or not.
Regards,
F.A. Costa Oliveira
Lisboa, 12th February 2018
• asked a question related to KNN
Question
The melting points of the KNN precursors, i.e., sodium carbonate and potassium carbonate, are below 900 °C, but during sintering we heat the system to around 1050-1100 °C. If I heat the system above 900 °C, won't the potassium carbonate and sodium carbonate evaporate? Please explain this.
Thoroughly mixed powders of Na and K carbonate with Nb2O5 are first made to react at 800 to 850 °C, where these carbonates dissociate into oxides and CO2. The reaction product of the precursors is KNN, and the perovskite structure thus formed is stable up to high temperatures, which makes sintering at 1050 to 1100 °C feasible. I would suggest you go through some review papers to gain a thorough understanding of the reaction schemes and crystal structures.
Never hesitate to contact me if you have any queries at bhaskar_dmt@yahoo.co.in
• asked a question related to KNN
Question
My new field of interest is evaluating the accuracy of fault diagnostics based on artificial-intelligence classification techniques like SVM, ANN, KNN, PSO, etc., applied to Dissolved Gas Analysis (DGA) of power transformers.
I can't find a standard DGA dataset for this purpose.
Any help with this issue will be appreciated.
The data are available in print in the attached article; they must have been uploaded somewhere in digital form, however...
• asked a question related to KNN
Question
I am working on brain MRI image classification using a hybrid SVM and KNN algorithm.
Training is done using SVM, and at testing time it checks for the nearest distance to a particular class.
Dear Vaibhavi Solanki ,
Regards, Shafagat
• asked a question related to KNN
Question
Does parallel KNN algorithm exist?
• asked a question related to KNN
Question
Please also explain what quantities of K2CO3, Na2CO3, and Nb2O5 we need to prepare 1 mole of KNN.
(x/2) K2CO3 + ((1-x)/2) Na2CO3 + (1/2) Nb2O5 (milled in ethanol or water) → KxNa(1-x)NbO3 + (1/2) CO2
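Since each carbonate supplies two alkali atoms, 1 mol of KxNa1-xNbO3 needs x/2 mol K2CO3, (1-x)/2 mol Na2CO3, and 1/2 mol Nb2O5. A small sketch of the batch calculation for x = 0.5, using standard atomic weights (a back-of-the-envelope check, not a tested recipe):

```python
# Standard atomic weights (g/mol)
W = {"K": 39.098, "Na": 22.990, "Nb": 92.906, "C": 12.011, "O": 15.999}

M_K2CO3 = 2 * W["K"] + W["C"] + 3 * W["O"]    # ~138.20 g/mol
M_Na2CO3 = 2 * W["Na"] + W["C"] + 3 * W["O"]  # ~105.99 g/mol
M_Nb2O5 = 2 * W["Nb"] + 5 * W["O"]            # ~265.81 g/mol

x = 0.5  # K0.5Na0.5NbO3
# Per mole of KNN: x/2 mol K2CO3, (1-x)/2 mol Na2CO3, 1/2 mol Nb2O5
m_K2CO3 = (x / 2) * M_K2CO3      # ~34.6 g
m_Na2CO3 = ((1 - x) / 2) * M_Na2CO3  # ~26.5 g
m_Nb2O5 = 0.5 * M_Nb2O5          # ~132.9 g
```

In practice a few mol% excess alkali carbonate is often added to compensate for volatilization during calcination, but the exact amount is process-specific.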
• asked a question related to KNN
Question
Hi
I am working on a classification task; my data contains numeric and categorical features (mixed). To classify such data, we need a transformation method to convert one type of data into another so that existing classification algorithms such as SVM, Naive Bayes, or KNN can be applied.
My question: is there any classification algorithm that can handle mixed data without using any transformation method?
In my view (though I am not sure), Random Forest may be helpful in this case.
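For most library implementations the practical route is still to one-hot encode the categorical columns first; a minimal stdlib sketch of that transformation (the records and field names are made up):

```python
def one_hot_encode(rows, categorical_keys):
    """Turn mixed dict records into fixed-length numeric vectors:
    numeric fields pass through, each category value becomes a 0/1 column."""
    categories = {k: sorted({r[k] for r in rows}) for k in categorical_keys}
    numeric_keys = [k for k in rows[0] if k not in categorical_keys]
    vectors = []
    for r in rows:
        vec = [float(r[k]) for k in numeric_keys]
        for k in categorical_keys:
            vec.extend(1.0 if r[k] == v else 0.0 for v in categories[k])
        vectors.append(vec)
    return vectors

rows = [
    {"age": 34, "income": 52.0, "city": "paris"},
    {"age": 29, "income": 48.5, "city": "tokyo"},
    {"age": 41, "income": 61.2, "city": "paris"},
]
X = one_hot_encode(rows, categorical_keys={"city"})
```

Some gradient-boosted tree implementations (LightGBM, CatBoost, and, I believe, scikit-learn's HistGradientBoostingClassifier via its categorical_features option) can consume categorical features natively, so they come closest to "no transformation required".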
• asked a question related to KNN
Question
We used many classifiers, such as naive Bayes, ANN, linear and quadratic discriminants, decision tree, KNN, and support vector machine. Which one is the best for fault diagnosis using machine learning?
It depends on the domain to which you are applying and the nature of the data.
A combination will certainly lead to promising results.
• asked a question related to KNN
Question
I am going to use the following MATLAB code, but I can't understand what options means in this code; should I set all the fields of the options struct or just choose some?
the code is
function [eigvector, eigvalue] = LPP(W, options, data)
% LPP: Locality Preserving Projections
%
% [eigvector, eigvalue] = LPP(W, options, data)
%
% Input:
% data - Data matrix. Each row vector of fea is a data point.
% W - Affinity matrix. You can either call "constructW"
% to construct the W, or construct it by yourself.
% options - Struct value in Matlab. The fields in options
% that can be set:
%
% Please see LGE.m for other options.
%
% Output:
% eigvector - Each column is an embedding function, for a new
% data point (row vector) x, y = x*eigvector
% will be the embedding result of x.
% eigvalue - The sorted eigvalue of LPP eigen-problem.
%
%
% Examples:
%
% fea = rand(50,70);
% options = [];
% options.Metric = 'Euclidean';
% options.NeighborMode = 'KNN';
% options.k = 5;
% options.WeightMode = 'HeatKernel';
% options.t = 5;
% W = constructW(fea,options);
% options.PCARatio = 0.99
% [eigvector, eigvalue] = LPP(W, options, fea);
% Y = fea*eigvector;
%
%
% fea = rand(50,70);
% gnd = [ones(10,1);ones(15,1)*2;ones(10,1)*3;ones(15,1)*4];
% options = [];
% options.Metric = 'Euclidean';
% options.NeighborMode = 'Supervised';
% options.gnd = gnd;
% options.bLDA = 1;
% W = constructW(fea,options);
% options.PCARatio = 1;
% [eigvector, eigvalue] = LPP(W, options, fea);
% Y = fea*eigvector;
%
%
% Note: After applying some simple algebra, the smallest eigenvalue problem:
% data^T*L*data = \lambda data^T*D*data
% is equivalent to the largest eigenvalue problem:
% data^T*W*data = \beta data^T*D*data
% where L=D-W; \lambda = 1 - \beta.
% Thus, the smallest eigenvalue problem can be transformed to a largest
% eigenvalue problem. Such tricks are adopted in this code for the
% consideration of calculation precision of Matlab.
%
%
%
%Reference:
% Xiaofei He, and Partha Niyogi, "Locality Preserving Projections"
% Advances in Neural Information Processing Systems 16 (NIPS 2003),
%
% Xiaofei He, Shuicheng Yan, Yuxiao Hu, Partha Niyogi, and Hong-Jiang
% Zhang, "Face Recognition Using Laplacianfaces", IEEE PAMI, Vol. 27, No.
% 3, Mar. 2005.
%
% Deng Cai, Xiaofei He and Jiawei Han, "Document Clustering Using
% Locality Preserving Indexing" IEEE TKDE, Dec. 2005.
%
% Deng Cai, Xiaofei He and Jiawei Han, "Using Graph Model for Face Analysis",
% Technical Report, UIUCDCS-R-2005-2636, UIUC, Sept. 2005
%
% Xiaofei He, "Locality Preserving Projections"
% PhD's thesis, Computer Science Department, The University of Chicago,
% 2005.
%
% version 2.1 --June/2007
% version 2.0 --May/2007
% version 1.1 --Feb/2006
% version 1.0 --April/2004
%
% Written by Deng Cai (dengcai2 AT cs.uiuc.edu)
%
if (~exist('options','var'))
options = [];
end
[nSmp,nFea] = size(data);
if size(W,1) ~= nSmp
error('W and data mismatch!');
end
%==========================
% If data is too large, the following centering codes can be commented
% options.keepMean = 1;
%==========================
if isfield(options,'keepMean') && options.keepMean
;
else
if issparse(data)
data = full(data);
end
sampleMean = mean(data);
data = (data - repmat(sampleMean,nSmp,1));
end
%==========================
D = full(sum(W,2));
if ~isfield(options,'Regu') || ~options.Regu
DToPowerHalf = D.^.5;
D_mhalf = DToPowerHalf.^-1;
if nSmp < 5000
tmpD_mhalf = repmat(D_mhalf,1,nSmp);
W = (tmpD_mhalf.*W).*tmpD_mhalf';
clear tmpD_mhalf;
else
[i_idx,j_idx,v_idx] = find(W);
v1_idx = zeros(size(v_idx));
for i=1:length(v_idx)
v1_idx(i) = v_idx(i)*D_mhalf(i_idx(i))*D_mhalf(j_idx(i));
end
W = sparse(i_idx,j_idx,v1_idx);
clear i_idx j_idx v_idx v1_idx
end
W = max(W,W');
data = repmat(DToPowerHalf,1,nFea).*data;
[eigvector, eigvalue] = LGE(W, [], options, data);
else
options.ReguAlpha = options.ReguAlpha*sum(D)/length(D);
D = sparse(1:nSmp,1:nSmp,D,nSmp,nSmp);
[eigvector, eigvalue] = LGE(W, D, options, data);
end
eigIdx = find(eigvalue < 1e-3);
eigvalue (eigIdx) = [];
eigvector(:,eigIdx) = [];
Thank you Mohammed for your help.
• asked a question related to KNN
Question
Hi,
Please find attachment of my dataset with labels.
I applied pre-processing techniques to my dataset, like stop-word removal, removing weblinks and punctuation marks, and finally lemmatization. Now I think my dataset is fully tuned, so I am applying different feature extraction techniques to extract features, and then I'll classify them using some classifier.
Please recommend some feature extraction techniques.
Previously I used lexicon-based techniques, a bag-of-words model, and KNN to generate features. Now I am looking for something to improve my results.
Regards
You may also be interested in this paper:
Irony detection in Twitter: The role of affective content DIH Farías, V Patti, P Rosso - ACM Transactions on Internet Technology (TOIT), 2016
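Beyond the bag-of-words and lexicon features you mention, TF-IDF weighting is a common next step. A minimal from-scratch sketch (toy corpus; smoothed IDF in the style of common implementations):

```python
import math
from collections import Counter

def tfidf(corpus):
    """Compute TF-IDF vectors (as dicts) with smoothed IDF:
    idf(t) = ln((1 + N) / (1 + df(t))) + 1."""
    n_docs = len(corpus)
    tokenized = [doc.lower().split() for doc in corpus]
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))  # document frequency counts each doc once
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in df}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: (tf[t] / len(tokens)) * idf[t] for t in tf})
    return vectors

docs = ["the movie was great", "the movie was terrible", "great acting"]
vecs = tfidf(docs)
```

In practice scikit-learn's TfidfVectorizer does the same with n-gram support and normalization, and its output feeds directly into the classifiers you listed.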
• asked a question related to KNN
Question
Using the HoG transform I obtained a feature vector for each image. Now, how do I classify these images with a scikit-learn classification algorithm (KNN) using the obtained feature vectors?
Same thing!
You have to take some images that contain circle shapes, and do the same for all the other shapes. Then extract the features of these images, for example HoG features, and give the feature vectors to the classifier in the training phase. A tool for KNN is available.
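Assuming scikit-learn is installed, the HoG vectors can be fed to KNeighborsClassifier directly; here random vectors stand in for real HoG descriptors (3780 dimensions is a typical HoG length, but any fixed length works):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
# Stand-ins for HoG descriptors: one row per training image, two toy classes.
X_train = np.vstack([rng.normal(0.0, 1.0, (10, 3780)),   # class "circle"
                     rng.normal(3.0, 1.0, (10, 3780))])  # class "square"
y_train = ["circle"] * 10 + ["square"] * 10

clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X_train, y_train)

X_test = rng.normal(3.0, 1.0, (1, 3780))  # a vector resembling the "square" class
pred = clf.predict(X_test)[0]
```

The only requirement is that every image yields a feature vector of the same length, so compute HoG with fixed image size and cell/block parameters.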
• asked a question related to KNN
Question
What are the CAESAR, ISS, SarPy, and KNN software classifications as per ICH M7?
The models available in VEGA (www.vegahub.eu/portfolio-item/vega-qsar/) are only trained on bacterial mutagenicity. They have been developed with a dataset of about six thousand data points and using different software tools, including SarPy, SVM, and domain knowledge. The result is classification only, as the data are obtained from the Ames test.
For ICH M7, databases with doses should be used instead, either in regression or to build a predictor of multiple classes of concern. We have done some work using the Gold database for carcinogenicity.
• asked a question related to KNN
Question
I'm studying SVM.
There is one sentence,
"Support Vector Machines are very specific class of algorithms, characterized by usage of kernels, absence of local minima, sparseness of the solution and capacity control obtained by acting on the margin, or on number of support vectors, etc."
In that sentence, what does "sparseness of the solution" mean?
Hi,
The SVM's solution is a set of support vectors. And "sparseness of the solution" means that the "number of support vectors increases more slowly than linearly" when the problem size increases.
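Concretely, the trained model is represented only by its support vectors. With scikit-learn (assumed installed; its SVC wraps LIBSVM), you can check that only a fraction of the training points become support vectors:

```python
from sklearn.svm import SVC

# Two well-separated groups: most points sit far from the margin.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5],
     [5, 5], [5, 6], [6, 5], [6, 6], [5.5, 5.5]]
y = [0] * 5 + [1] * 5

clf = SVC(kernel="linear", C=1.0).fit(X, y)
n_support = len(clf.support_)  # indices of the support vectors
```

Points far from the decision boundary get zero dual coefficients and drop out of the solution entirely; this is the sparseness the sentence refers to.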
• asked a question related to KNN
Question
Further, how do we use these variables to train the model in LIBSVM?
Kindly elaborate in a clear and simple way and provide relevant links.
Thanks!
But how do we use these parameters in LIBSVM?
• asked a question related to KNN
Question
I used LibSVM for two-class data, but I need a plot showing that my data are classified correctly, and especially the margin. How can I do this with LibSVM?
I have also used LibSVM for data classification; I think you need to plot a ROC curve using gnuplot.
• asked a question related to KNN
Question
In order to make the system more robust to changes in human orientation, I need a pose normalization method, but I don't know the related work, or which work is better than the others.
I Need Help
Regards
First do limb normalization, since different skeletons have different body lengths; this can be done by making the center of the body (the sacrum) the origin of the coordinate axes.
Then, to get a view-invariant relative position vector for every node, multiply the rotation matrix (R) with the limb-normalized vectors. The rotation matrix can be found with the Gram-Schmidt orthonormalization process.
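The Gram-Schmidt orthonormalization step mentioned above can be sketched in a few lines of pure Python (the input vectors are arbitrary examples, not real skeleton axes):

```python
import math

def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent vectors
    (modified Gram-Schmidt: subtract projections onto earlier basis vectors)."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            proj = sum(wi * bi for wi, bi in zip(w, b))  # <w, b>
            w = [wi - proj * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        basis.append([wi / norm for wi in w])
    return basis

# e.g. a body frame built from two skeleton direction vectors plus a third vector
R = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
```

The rows of R form an orthonormal frame, so multiplying joint positions by it rotates them into a view-invariant body coordinate system.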
• asked a question related to KNN
Question
I used some feature selection methods that require discretization prior to feature selection, and I found that some of the literature uses a method based on three category values (−1, 0, and 1) or (−2, 0, and 2) using the mean (μ) and standard deviation (σ) of the feature values. Could someone help me with how to do that?
You just need to decide the criteria. I'll give you an example. Let's say x is the mean and SD is the standard deviation. You can say that values within x±SD are 0 and all others are 1 or -1, depending on which side of the scale they are. You can decide on different criteria, for example x±2SD or x±0.5SD, etc. It should be relevant to your data.
If you do this programmatically I would first do a zero-mean-unit-variance normalisation, which means that you subtract mean from each value and then divide each value by its SD. Then you set all values between -1 and 1 to 0 (so within 1 SD) and all other to -1 or 1, depending on which side of the scale they are. an R example:
data # this is the vector of your numbers
data.scaled<-scale(data)[,1] # this does zero-mean-unit-scale normalisation
data.discrete<-cut(data.scaled,breaks=c(-Inf,-1,1,Inf),labels=c(-1,0,1))
Good luck!
• asked a question related to KNN
Question
thanks
From what I understand of your question, you have a multimodal biometric solution. Suppose we face x systems. Then, for each of those systems, you predict new data with a KNN classifier, and you want to determine the score-level fusion for each recognition, so that x_1 = score_1, x_2 = score_2, ..., x_n = score_n. Calculating the EER metric of each biometric system will let you determine the global EER and assess the reliability of your entire multimodal biometric system.
• asked a question related to KNN
Question
I am looking forward to learning ML, and I had an idea to implement kNN or a decision tree and train it to understand ISL. However, I am unable to find an open-source dataset to train and test on. Is there any such dataset available?
Hello Jahangir,
Check the attached websites;
I hope you find the dataset you want.
Regards.
• asked a question related to KNN
Question
I have a dataset consisting of different characters. I need an approach that combines the two classifiers KNN and SVM to classify these characters, instead of using one in isolation.
Why not try local (linear and nonlinear) SVM (LL-SVM) classifiers? In short, they design an SVM for a local neighborhood of k data points. Alternatively, try the ALH approach, but LL-SVM is more accurate, although slower. You can download my papers on ALH and LL-SVM from my ResearchGate page here. Good luck.
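If a simpler combination than a local-SVM scheme is acceptable, the two classifiers can also be combined by voting; a sketch with scikit-learn (assumed installed; toy two-class points standing in for character features):

```python
from sklearn.ensemble import VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Toy feature vectors for two character classes.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0], [0, 0.5],
     [5, 5], [5, 6], [6, 5], [6, 6], [5.5, 5], [5, 5.5]]
y = ["a"] * 6 + ["b"] * 6

ensemble = VotingClassifier(
    estimators=[("knn", KNeighborsClassifier(n_neighbors=3)),
                ("svm", SVC(probability=True))],
    voting="soft",  # average the predicted class probabilities
)
ensemble.fit(X, y)
pred = ensemble.predict([[5.5, 5.5]])[0]
```

Soft voting averages the class-probability estimates of both models, so each classifier can compensate for the other's weak regions of the feature space.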
• asked a question related to KNN
Question
Ultrasound transducers using lead-free piezoelectric materials have been studied.
BNT-BT, KNN, etc. ceramics are available, but...
lead-free single crystals, for which I am asking for help, are hard to find.
If a BaTiO3 single crystal is good for your application, please see this website: www.ceracomp.com
• asked a question related to KNN
Question
When investigating objects erroneously classified by the SVM classifier, it was found that most of them fall into the strip separating the classes. The classification decision for such objects can be refined using an approach based on the combined application of the SVM classifier and the nearest-neighbors (kNN) algorithm.
Irina,
Optimizing the parameters improves the accuracy.
HTH.
Samer
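A minimal sketch of that combined scheme with scikit-learn (assumed installed): trust the SVM when a point lies outside the margin strip (|decision function| above a threshold) and defer to kNN inside it. The function name, threshold, and toy data are illustrative:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

def svm_with_knn_fallback(X_train, y_train, X_test, strip=1.0, k=5):
    """Use the SVM label when a point is outside the margin strip
    (|decision_function| >= strip); otherwise defer to kNN."""
    svm = SVC(kernel="linear").fit(X_train, y_train)
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    scores = svm.decision_function(X_test)
    svm_pred = svm.predict(X_test)
    knn_pred = knn.predict(X_test)
    return np.where(np.abs(scores) >= strip, svm_pred, knn_pred)

rng = np.random.default_rng(1)
X_train = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(4, 1, (20, 2))])
y_train = np.array([0] * 20 + [1] * 20)
X_test = np.array([[0.0, 0.0], [4.0, 4.0], [2.0, 2.0]])  # last point near the boundary
pred = svm_with_knn_fallback(X_train, y_train, X_test)
```

The strip width is a tuning knob: strip = 0 reduces to plain SVM, while a large strip hands most decisions to kNN.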
• asked a question related to KNN
Question
Hello everyone. I have a training set of 17 observations. The dimensions of the training and testing data are:
• X= 17x7660 (features, each row is feats from each obs)
• Y= 17 x 1 (label for each row) and
• Test feat is Z=1x7660
I'm using knnclassify, and every time I run different test samples I end up with the same prediction result, which is the first training label (Y(1,:)).
I'm not sure whether my training has an error or where the problem is; I am urgently waiting for help.
Maybe it's normal to get the same prediction result (since it concerns one test observation); maybe you got it by chance. I am not sure, but try running the test on a set of observations at a time.
Good luck.
• asked a question related to KNN
Question