Ehsan Sadrfaridpour
Clemson University | CU · School of Computing

PhD

About

5 Publications
5,407 Reads
66 Citations
Introduction
Ehsan Sadrfaridpour currently works at the School of Computing, Clemson University. Ehsan does research in Machine Learning and Bioinformatics. His current project is 'Fast multilevel support vector machines.'

Publications (5)
Conference Paper
Full-text available
The support vector machine (SVM) is one of the most widely used and practical optimization-based classification models in machine learning because of its interpretability and its flexibility to produce high-quality results. However, big data poses a particular difficulty for the most sophisticated but relatively slow versions of SVM, namely, the non...
Preprint
Full-text available
The support vector machine (SVM) is one of the most widely used and practical optimization-based classification models in machine learning because of its interpretability and its flexibility to produce high-quality results. However, big data poses a particular difficulty for the most sophisticated but relatively slow versions of SVM, namely, the non...
Article
Full-text available
The computational complexity of solving the nonlinear support vector machine (SVM) is prohibitive on large-scale data. In particular, this issue becomes very sensitive when the data presents additional difficulties such as highly imbalanced class sizes. Typically, nonlinear kernels produce significantly higher classification quality than linear kernels...
Article
Full-text available
Bariatric surgery (BAR) has become a popular treatment for type 2 diabetes mellitus, which is among the most critical obesity-related comorbidities. Patients who undergo bariatric surgery are exposed to complications after surgery. Furthermore, the mid- to long-term complications after bariatric surgery can be deadly and increase the complexity of man...
Conference Paper
Full-text available
The support vector machine is a flexible optimization-based technique widely used for classification problems. In practice, its training phase becomes computationally expensive on large-scale data sets for reasons such as the complexity and number of iterations in parameter-fitting methods, the underlying optimization solvers, and the nonlinearity of...

Questions (1)
Usually, when both train and test data are available at the beginning, a dimensionality reduction such as Singular Value Decomposition (SVD) can be applied to both of them as one matrix. The reduced-dimension data is then divided into train and test sets without any further consideration.
When the test data is not available at the beginning, how can the same dimensionality reduction be applied to it? In other words, how can the test data be transformed into the same feature space as the reduced-dimension train data? One reason for not having the test data at the beginning is that collecting or processing it takes time, or more test data may be generated over time. Another reason is deploying a model trained on the train data in an application, where it must be used on unseen data.
My goal is to train a classifier such as SVM on the train data now and save all the information required to transform future test data.
I can save the U, S, Vᵀ matrices resulting from the SVD of the train data, and I can reduce the dimension of the data using U·S.
My question is: how can I reduce the dimension of the test data? I think there are two approaches, and I want to know which one is valid.
  • Reduce the dimension of the test data to the same number of dimensions as the train data without reusing the U, S, Vᵀ matrices from the train SVD, i.e., compute a fresh decomposition on the test data alone.
  • Reuse the U, S (or Vᵀ) matrices from the SVD of the train data to transform the test data to the lower dimension, analogous to reusing the same (mean, std) or (min, max) for data normalization. However, since U is calculated from the data fed into the SVD, applying it to different data (the test data) without recalculating the eigenvalues and eigenvectors seems wrong.
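The second approach is the standard one, with one refinement: it is the right singular vectors Vᵀ (not U) that are saved and reused, since the reduced train representation U·S equals X_train·Vₖ, and the same projection X·Vₖ can be applied to any new data. A minimal NumPy sketch with made-up random matrices (shapes and the variable k are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.standard_normal((100, 20))  # 100 samples, 20 features
X_test = rng.standard_normal((30, 20))    # arrives later, same 20 features

# SVD of the train data: X_train = U @ diag(S) @ Vt
U, S, Vt = np.linalg.svd(X_train, full_matrices=False)

k = 5  # target reduced dimension

# Reduced train data: X_train @ V_k, which equals U[:, :k] * S[:k]
X_train_k = X_train @ Vt[:k].T

# Project the test data with the SAME Vt from the train SVD -- no new SVD
X_test_k = X_test @ Vt[:k].T

# The projection reproduces the U*S representation of the train data
assert np.allclose(X_train_k, U[:, :k] * S[:k])
```

So only Vt (and the choice of k) needs to be saved alongside the classifier; U and S are specific to the train samples and are not needed to transform future test data.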

Projects (1)
Developing fast multilevel SVM