Science topic

Image Recognition - Science topic

Explore the latest questions and answers in Image Recognition, and find Image Recognition experts.
Questions related to Image Recognition
  • asked a question related to Image Recognition
Question
3 answers
For research purposes, I am looking for good-quality images of male and female faces which vary in masculinity and femininity. Preferably, the faces are of Caucasian people around 45 years old.
Thank you in advance,
Judith
Relevant answer
Answer
The Chicago Face Database has many male and female faces to select from and, from my perspective, they do show variation in masculinity and femininity. The database was not designed for that purpose, but you might select some subjectively and pilot the images by having participants rate the faces on your variables of interest.
  • asked a question related to Image Recognition
Question
6 answers
I'm new to CNNs. After discovering pre-trained deep CNNs, which can be used for feature extraction, I wonder whether there are still application areas in image recognition where classical classification methods (such as SVMs) are preferred over CNNs.
Relevant answer
Answer
A big yes: SVMs, and classical non-deep methods in general, are still used. They make sense when you are confident that the extracted features are already highly informative. You can also use a CNN as a feature generator for a downstream SVM-based classifier.
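The CNN-features-into-SVM pipeline mentioned in the answer can be sketched with scikit-learn; here random vectors stand in for the CNN features, which in practice would come from a pre-trained network's penultimate layer:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-ins for deep CNN features: in practice these would come from a
# pre-trained network's penultimate layer (e.g. one 512-dim vector per image).
X0 = rng.normal(0.0, 1.0, size=(50, 512))   # class 0 features
X1 = rng.normal(3.0, 1.0, size=(50, 512))   # class 1 features
X = np.vstack([X0, X1])
y = np.array([0] * 50 + [1] * 50)

# The classical SVM consumes the deep features directly.
clf = SVC(kernel="rbf").fit(X, y)
print(clf.score(X, y))  # 1.0 on these well-separated toy features
```

The same two-stage idea works with any classical classifier in place of the SVC.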
  • asked a question related to Image Recognition
Question
3 answers
I am trying to read water meter values through OCR; my first step is to find the ROI. I found a dataset on Kaggle with labelled ROIs, but they are not rectangles: they are polygons, some with 5 points and some with 8, depending on the image. How do I convert this to YOLO format?
For example: file name | value | coordinates
id_53_value_595_825.jpg 595.825 {'type': 'polygon', 'data': [{'x': 0.30788, 'y': 0.30207}, {'x': 0.30676, 'y': 0.32731}, {'x': 0.53501, 'y': 0.33068}, {'x': 0.53445, 'y': 0.33699}, {'x': 0.56529, 'y': 0.33741}, {'x': 0.56697, 'y': 0.29786}, {'x': 0.53501, 'y': 0.29786}, {'x': 0.53445, 'y': 0.30417}]}
id_553_value_65_475.jpg 65.475 {'type': 'polygon', 'data': [{'x': 0.26133, 'y': 0.24071}, {'x': 0.31405, 'y': 0.23473}, {'x': 0.31741, 'y': 0.26688}, {'x': 0.30676, 'y': 0.26763}, {'x': 0.33985, 'y': 0.60851}, {'x': 0.29386, 'y': 0.61449}]}
id_407_value_21_86.jpg 21.86 {'type': 'polygon', 'data': [{'x': 0.27545, 'y': 0.19134}, {'x': 0.37483, 'y': 0.18282}, {'x': 0.38935, 'y': 0.76071}, {'x': 0.28185, 'y': 0.76613}]}
Relevant answer
Answer
Muhammad Ali Thank you.
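One way to convert these polygon labels (a sketch, not taken from the thread): take the min/max of the polygon points to get an axis-aligned box, then express it as YOLO's normalized `class x_center y_center width height`:

```python
def polygon_to_yolo(points, class_id=0):
    """Convert a list of normalized {'x','y'} points to a YOLO label line.

    YOLO format: class x_center y_center width height, all in [0, 1].
    """
    xs = [p["x"] for p in points]
    ys = [p["y"] for p in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    return (f"{class_id} {(x_min + x_max) / 2:.6f} {(y_min + y_max) / 2:.6f} "
            f"{x_max - x_min:.6f} {y_max - y_min:.6f}")

# First example from the question:
poly = [{'x': 0.30788, 'y': 0.30207}, {'x': 0.30676, 'y': 0.32731},
        {'x': 0.53501, 'y': 0.33068}, {'x': 0.53445, 'y': 0.33699},
        {'x': 0.56529, 'y': 0.33741}, {'x': 0.56697, 'y': 0.29786},
        {'x': 0.53501, 'y': 0.29786}, {'x': 0.53445, 'y': 0.30417}]
print(polygon_to_yolo(poly))
```

Note that this loses any rotation information in the polygon; for strongly tilted meters an oriented-box label format may be a better fit.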
  • asked a question related to Image Recognition
Question
5 answers
I recently came to know about the commercial service https://mathpix.com/ which claims to convert mathematical formulas from (scanned) pdf or even handwritten text to LaTeX.
I have no experience with this. I am interested whether there is an open source solution which solves the same (or a similar) problem.
Relevant answer
Answer
@Knoll, I did it for my research paper, which involved mathematical formulas and equations for a panel-data econometric model. To open a PDF with MS Word: first create a blank Word file, then go to the menu in the top-left corner of the screen, where you will see "Open" alongside options such as "Save" and "Save As". Click "Open", select the specific PDF file, and it will be opened in MS Word.
  • asked a question related to Image Recognition
Question
5 answers
What about the manual skills?
There are a lot of electronic devices that help us adapt to the computer interface, but what about the human interface?
In many cases the computer interface is not well suited to the human one. For this reason, controllers such as the keyboard, mouse and game controllers have had a long journey in our lives; they have become extensions of our own bodies.
Do you know of other systems that let humans interact with electronic devices with more freedom? And what kinds of problems do these systems or methods have?
For example, I could mention voice recognition, image recognition, gesture recognition and brain-computer interfaces, but there are surely many more.
What projects or research lines do you know of that currently aim to improve the human interface with real adaptation to the human body?
For example, what about manual skills? Why not use all the skills of our hands, as a magician does? Illusionists spend many hours training their hands as an essential part of their tricks. So why not give the hands another chance to change the way we interact with electronic devices?
The COVID period has been really hard for the nurses and doctors who form the first barrier against this horrific situation, and it has also been a real mental burden for the whole human species.
I was thinking about a way to mitigate that. For example, avoiding touch is one of the rules for limiting the spread of the virus.
Therefore, clothing and accessories such as gloves can play a key role.
I suggest smart gloves. And you?
Relevant answer
Answer
Our team worked hard to share our article about smart gloves with the scientific community. I am really proud to share this systematic review with all of you. Here you are.
Please ask whatever you want and send us comments so we know whether it is useful. The number of possibilities is increasing nowadays; the first step is to know what is being done and in which real applications. Enjoy!
  • asked a question related to Image Recognition
Question
5 answers
Hello fellow researchers! I am currently doing my final-year project, which involves image recognition with a supervised machine learning algorithm. I have datasets comprised of images (without labels, of course) obtained from Kaggle. The project is to inspect the quality of green coffee beans and classify them as either healthy or defective. I am new to this field, and I find the Python work and the model training hard. Am I on the right course to train my model with just images? Isn't that unsupervised learning, since my current datasets are labelled but without input variables?
Relevant answer
Answer
The question is not entirely clear to me. First, you say that you are working on a project that aims to classify the quality of coffee beans into two categories using images, right? Then that is a binary classification task, and it is supervised learning because you have the labels (i.e. healthy or defective) and you have the images (the inputs). Something like a CNN would process the image and output a single value in [0, 1], which is compared to the target (label).
As a reference, I think this paper (https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8090980) is doing what you want.
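To make the supervised (X, y) setup concrete, here is a minimal sketch with random arrays standing in for the bean images and a simple linear classifier in place of a CNN; the class means and image size are made up for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Stand-ins for 32x32 grayscale bean images: "healthy" darker, "defective" lighter.
healthy = rng.normal(0.3, 0.05, size=(40, 32, 32))
defective = rng.normal(0.7, 0.05, size=(40, 32, 32))

# Supervised learning needs (input, label) pairs: the images are X,
# the healthy/defective tags are y. In practice y often comes from
# folder names or a label file, not from the pixels themselves.
X = np.vstack([healthy, defective]).reshape(80, -1)
y = np.array([0] * 40 + [1] * 40)  # 0 = healthy, 1 = defective

clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.score(X, y))  # trivially separable toy data
```

If your Kaggle images really have no labels at all, you would first need to label them (or use an unsupervised method); with labels, the setup above is exactly supervised learning.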
  • asked a question related to Image Recognition
Question
15 answers
Dear all, 
currently I am working on content-based image classification. Can you point me to suitable image recognition algorithms?
Thanks,
Relevant answer
Answer
Some of the algorithms used in image recognition (Object Recognition, Face Recognition) are SIFT (Scale-invariant Feature Transform), SURF (Speeded Up Robust Features), PCA (Principal Component Analysis), and LDA (Linear Discriminant Analysis).
Regards
  • asked a question related to Image Recognition
Question
4 answers
Dear all, I am trying to implement pose normalization for face images using piece-wise affine warping. I am using delaunayTriangulation to construct face mesh based on detected 68 landmarks for two images: one with frontal face and the other with non-frontal face. The resulted meshes do not have the same number of triangles and also have triangles that are different in direction and location.
Could anyone help please? Thanks.
------------------------------------------------------------
% Construct mesh for frontal face image
filename1 = '0409';
img1 = imread([filename1 '.bmp']);
figure, imshow(img1); hold on;
pts1 = load([filename1 '.mat']); % Load 68-landmarks
DT1 = delaunayTriangulation(pts1.pts);
triplot(DT1,'cyan');
% Construct mesh for non-frontal face image
filename2 = '0411';
img2 = imread([filename2 '.bmp']);
figure, imshow(img2); hold on;
pts2 = load([filename2 '.mat']); % Load 68-landmarks
DT2 = delaunayTriangulation(pts2.pts);
triplot(DT2,'cyan');
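A common remedy (a sketch in Python with SciPy, not taken from the thread) is to triangulate only one landmark set and reuse its connectivity for the other, so both meshes share identical, corresponding triangles; in MATLAB the analogous step would be building the second mesh from DT1's connectivity list instead of calling delaunayTriangulation twice:

```python
import numpy as np
from scipy.spatial import Delaunay

rng = np.random.default_rng(2)

# Stand-ins for 68 facial landmarks in two poses.
pts_frontal = rng.random((68, 2))
pts_rotated = pts_frontal + rng.normal(0, 0.01, size=(68, 2))

# Triangulate ONCE, on the frontal landmarks only...
tri = Delaunay(pts_frontal)

# ...then reuse the same connectivity for the second landmark set.
# Both meshes now contain the same triangles over corresponding points,
# which is what piecewise affine warping needs.
triangles_frontal = pts_frontal[tri.simplices]
triangles_rotated = pts_rotated[tri.simplices]
print(triangles_frontal.shape == triangles_rotated.shape)  # True
```

Running Delaunay independently on each pose gives different triangulations (as observed in the question), because the triangulation depends on the point geometry.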
  • asked a question related to Image Recognition
Question
11 answers
Hi,
as deep learning is a data-driven approach, having quality data is crucial. There exist a lot of free datasets, but they differ in the quality of their labels.
I'm now working on an index that can tell a researcher the quality of the labels, so the researcher may decide whether such a dataset is useful or not. I have established a pipeline to produce such an index in a fully autonomous way. Note that I'm focusing on object detection tasks only, i.e., labels given as bounding boxes.
The question is: does such an index already exist? I googled a lot and found nothing. It would be nice to compare our approach with existing ones.
  • asked a question related to Image Recognition
Question
1 answer
Several attempts of image matching using INPHO and Agisoft software were not successful. Any helpful suggestions or related papers would be highly appreciated.
Relevant answer
Answer
Hello. Can you explain this result? What are your criteria for calling the matching a failure? Please explain further.
  • asked a question related to Image Recognition
Question
4 answers
I have data for image recognition using neural networks. The images are in PGM format. How do I pre-process that data into a suitable matrix in C++?
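The parsing logic is sketched below in Python; the question asks for C++, but the PGM layout is the same in either language: a magic number, width, height, maximum value, then pixel data, with optional '#' comment lines in the header. Only the ASCII 'P2' variant is handled here; binary 'P5' needs byte-level reading.

```python
def read_pgm_ascii(text):
    """Parse an ASCII ('P2') PGM into a list of rows (a sketch)."""
    tokens = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]      # strip comments
        tokens.extend(line.split())
    magic, width, height = tokens[0], int(tokens[1]), int(tokens[2])
    maxval = int(tokens[3])               # noqa: kept for completeness
    assert magic == "P2", "only ASCII PGM handled in this sketch"
    pixels = [int(t) for t in tokens[4:]]
    return [pixels[r * width:(r + 1) * width] for r in range(height)]

sample = """P2
# a 3x2 test image
3 2
255
0 128 255
64 32 16
"""
print(read_pgm_ascii(sample))  # [[0, 128, 255], [64, 32, 16]]
```

In C++ the same token-by-token reading works with an ifstream and operator>>, storing the pixels in a vector of vectors or a flat buffer.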
  • asked a question related to Image Recognition
Question
5 answers
As a recognition method, the neural network is powerful, especially in image recognition. But can neural networks, as well known as they are, perform causal reasoning? If not, why not?
Relevant answer
Answer
Hi Yaozhi,
one requirement for causality is the existence of a temporal sequence because in order to be recognizable, the cause has to appear before the effect.
So, during the recognition of single static images there can be no causality involved. You could argue that there is still room for reasoning like "Because a tree is in front of it there are not really two halves of a horse but it's a whole horse, only partially visible." But first, I think current ANNs do not have enough capacity for reasoning, and second, I would not call the relationship between elements in an image "causality".
There are, of course, ANNs for predicting the development of time series, e.g. of stock prices; there is certainly causality involved here, though it is partially irrational. But those ANNs just do pattern recognition without "knowing" or considering any causes.
Speaking pictorially, I would compare today's ANNs to students who are willing and industrious but do not really understand anything. They have a chance of finishing (officially) successfully just by collecting clues on what to do. ("In tasks which look similar to those we had in autumn, I have to add all resistances.")
  • asked a question related to Image Recognition
Question
8 answers
I'm new to CNNs. After discovering pre-trained deep CNNs, which can be used for feature extraction, I wonder whether there are still application areas in image recognition where classical classification methods (such as SVMs) are preferred over CNNs.
Relevant answer
Answer
CNNs perform well when the dataset is big, because a CNN requires a large dataset to train the system. An SVM is able to train a system on a small dataset, so as a starting point you can choose the classifier according to the dataset size.
Second, an SVM requires separate feature extraction, and a suitable feature extraction method helps reach maximum accuracy. So have a look at the various feature extraction methods.
  • asked a question related to Image Recognition
Question
5 answers
Deep neural networks (DNNs) have been widely used for closed-set recognition. In other words, they only recognize objects that have been seen in training. Can DNN be used in open-set recognition to identify database objects and reject novel unseen objects as unknown? if yes, how?
Relevant answer
Answer
Dear Wasseem Al-Obaydy,
Have a look at the link; it may be useful.
Regards, Shafagat
  • asked a question related to Image Recognition
Question
14 answers
How to implement multi class SVM in Matlab? Especially when it comes to creating a training matrix set of image dataset and then testing matrix set of images and group sets etc.
Relevant answer
Answer
MATLAB now offers a function named fitcecoc(), which is designed specifically for multiclass SVM via error-correcting output codes. You can apply it. For details please go to the following link: https://www.mathworks.com/help/stats/fitcecoc.html
Thanks
  • asked a question related to Image Recognition
Question
14 answers
I've read several times that on high-dimensional problems (image recognition, text mining, ...), deep learning gives significantly higher accuracy than "classical" methods (such as SVM, logistic regression, etc.). But what happens on problems of ordinary, medium dimension? Let's say the data set is on the order of 1,000 to 10,000 objects and each object is characterized by 10 to 20 parameters. Are there articles that compare accuracy indicators (recall, precision, ...) of deep learning and other methods on some benchmarks?
Thanks beforehand for your answer. Regards, Sergey.
Relevant answer
Answer
Madam Murthy is right
  • asked a question related to Image Recognition
Question
8 answers
Image processing is a very effective and high-performance quantitative method in science and engineering, in particular image recognition in the area of computer vision.
Relevant answer
Answer
Without doubt: Deep Convolutional Neural Networks.
  • asked a question related to Image Recognition
Question
4 answers
Door detection is one of the important issues in indoor navigation.
The Canny edge detector is used in door detection.
Relevant answer
Answer
For MATLAB code, visit the link.
  • asked a question related to Image Recognition
Question
1 answer
Dear all,
I need some help regarding image recognition and/or augmented reality. I don't want to scan images with the camera to augment them with virtual content. Instead, I would like to load images locally on my mobile and then augment them with virtual content.
Please help me find articles and sample code for this scenario. Thanks a lot.
Relevant answer
Answer
Hello,
I have been working on the same idea for a long time. You can send me an email directly to discuss this problem. I have a solution, but it is expensive, and it is hard to find the right collaborators in my country.
Thanks
  • asked a question related to Image Recognition
Question
6 answers
The research comprises one proposed algorithm for image/object matching and two proposed algorithms for multiple object detection.
The image/object matching and multiple object detection algorithms are not related.
My question is: how do I organize them into a PhD thesis? How can I unify them into one overarching problem? What title would be appropriate?
Relevant answer
Answer
You should probably try to find a problem you can solve by combining/pipelining both types of algorithms, i.e. pretend you had an ultimate goal when you worked on both. I don't know exactly which algorithms you are talking about, but let's say you are detecting people approaching a government building and you want to identify a felon among them, or something like that. It depends on your specific work, really.
  • asked a question related to Image Recognition
Question
4 answers
I am currently working on a psychology research project which uses a dual-video task comprised of anxiety-provoking and positive videos to be shown side-by-side. I really want to try and match up the videos as much as possible by perceptual characteristics. For example, sizes of objects on screen, colours, textures, etc. Does anyone know of an algorithm, program, or app which could be used for this purpose?
Relevant answer
Answer
Dear Joanna, my advice is to look at the papers of Moncef Gabbouj's group at Tampere University of Technology. They work on content-based image retrieval and similar-video search. Best regards, Vladimir.
  • asked a question related to Image Recognition
Question
3 answers
For the image super resolution.
Relevant answer
Answer
Maybe the "parametric sparse representation" can help you.
  • asked a question related to Image Recognition
Question
8 answers
Using the HoG transform I obtained a feature vector for each image. How do I classify these images with a scikit-learn classification algorithm (KNN) using the obtained feature vectors?
Relevant answer
Answer
The same way!
Take some images that contain circle shapes, and do the same for every other shape class. Extract the features of these images, for example HoG features, and give the feature vectors to the classifier in the training phase. A KNN implementation is readily available.
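The training step described above looks roughly like this in scikit-learn; random vectors stand in for the HoG features, which in practice you would compute with something like skimage.feature.hog:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(3)

# Stand-ins for HoG feature vectors of two shape classes.
circles = rng.normal(0.0, 0.5, size=(30, 144))
squares = rng.normal(2.0, 0.5, size=(30, 144))
X = np.vstack([circles, squares])
y = ["circle"] * 30 + ["square"] * 30

# Fit KNN on the labelled feature vectors.
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)

# Classify a new feature vector (drawn here from the "square" cluster).
query = rng.normal(2.0, 0.5, size=(1, 144))
print(knn.predict(query))  # ['square']
```

The same fit/predict pattern works for any number of shape classes.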
  • asked a question related to Image Recognition
Question
4 answers
Hi
How do I calculate a confidence level in computer vision? I work on object detection, and for that purpose I detect relevant features. I work on airplane door detection, so I have some relevant features such as the door window, door handle, text boxes, and door frame lines. First I detect the individual features; at the second level I do some logical organisation of those features and eliminate wrongly detected ones. At the end I have some final checks, after which only features that belong to the object should remain. My question is: with what confidence level can I declare that this is the object I want to detect? Any help?
Relevant answer
Answer
My previous post focused on computer stereo vision. But in a single-image feature detector context, I suggest that you check the following paper by Meer et al., "Edge Detection with Embedded Confidence", 2001 ( http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.21.2946&rep=rep1&type=pdf ).
Meer et al. introduced a confidence measure of interest that they integrated into gradient-based edge detectors.
  • asked a question related to Image Recognition
Question
5 answers
What does it mean to get empty matrix when applying HOG transform on an image?
I am working on a segmentation task and aim to use the HOG descriptor for the pixels of an image. Applying the transform, I get an empty matrix for some windows. What does it mean?
Relevant answer
Answer
You said, "I get empty matrix for some windows".
That can happen for blank patches (i.e., filled with a single color): there the image gradients are zero.
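The effect is easy to verify: on a constant patch the image gradients, and hence every HOG orientation bin, are zero.

```python
import numpy as np

patch = np.full((16, 16), 0.5)          # blank window, single gray level
gy, gx = np.gradient(patch)             # image gradients
magnitude = np.hypot(gx, gy)

# Every gradient magnitude is zero, so all orientation-histogram
# bins stay empty: the "empty" HOG output the question describes.
print(magnitude.max())  # 0.0
```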
  • asked a question related to Image Recognition
Question
2 answers
I'm looking for a dataset of images (>100 images) of cells under bright field microscopy. They can be from any species, but human or mice cells are preferable. I have found a couple of sources such as the cell image library but it does not seem to contain bright field images in the quantities I need.
Also note, I am looking for images of cell cultures in particular where only one type of cell is in the image. As such images of tissue are not suitable for my application. 
The reason I am looking for these images is to test some image recognition and classification software.
Thanks in advance. 
Relevant answer
Answer
In the following paper by Zaritsky et al., "Benchmark for multi-cellular segmentation of bright field microscopy images", BMC Bioinformatics 2013, 14:319, DOI: 10.1186/1471-2105-14-319, the first figure describes a dataset of 171 manually segmented images of 5 different cell lines at diverse confluence levels, acquired in several laboratories under different imaging conditions. HTH.
  • asked a question related to Image Recognition
Question
4 answers
I want to recognise and track objects in real-time video processing. what is the best classifier that I can use for object recognition?
are there any public datasets for training and testing?
Relevant answer
Answer
First you need to extract your object with SfM or optical flow.
Then, if you want a scale-invariant descriptor, you can use SIFT/SURF or HOG; you can also have a look at bag-of-features or even deep learning (via transfer learning). Then I would recommend an SVM or AdaBoost for the classification. But the right answer will depend on your training dataset.
  • asked a question related to Image Recognition
Question
1 answer
I am working on paddy grain(naturally stored) age assessment and the work is based on the husk color. Please let me know the age intervals (in months) where we can recognize the change in the husk color of stored paddy grains.
Relevant answer
Answer
Six months
  • asked a question related to Image Recognition
Question
17 answers
I need to identify the type of fish caught from fish images. How can I locate anchor points/landmark points to extract features from the image?
Namely, I want to locate the eye position and the dorsal and pelvic fins, and to measure the mouth length and the dorsal and caudal fin lengths.
Right now I am trying the SIFT method to get key points.
Can someone suggest how I can get these specific key points?
Relevant answer
Answer
As mentioned earlier, the TPS software suite is very useful and user-friendly for placing landmarks and performing basic analyses. http://life.bio.sunysb.edu/morph/
My former adviser has also worked on computer-based species identification. This may be of interest to you.
Best of luck with your work
  • asked a question related to Image Recognition
Question
7 answers
I would like to know a good starting point to carry out my research in the above mentioned topic.
Relevant answer
Answer
Hello Aditya,
What objects do you want to recognise? Are they instances of the same class, and are there several of them?
Example, 1st case: you want to detect multiple objects of the same class,
e.g. multiple oranges in a picture.
2nd case: multi-class object detection,
e.g. you want to detect oranges, apples and bananas.
The first case should not be that hard to implement: apply the object detector multiple times to the image, instead of returning when you find the first object.
In the second case, try building a feature detector for every class and apply every detector to the image to find the different objects.
In general there are many approaches to recognising objects, from simple features like colour and shape to more complex features such as Haar wavelets.
Start with one object and move on to multiple object recognition.
  • asked a question related to Image Recognition
Question
3 answers
I want to do face matching using a weighted chi-square distance. I have computed the chi-square distance, divided the face image into 8x8 sub-images, and assigned weights to particular regions.
Kindly suggest how I can proceed further with face matching.
Relevant answer
Answer
The chi-square distance is useful when you want to compare two histograms, e.g. for LBP (Local Binary Patterns), where we compare two histograms calculated from image texture (more precisely: measure the distance between the two histograms). So, if you have histograms (whether from LBP or other image features) from different regions of a face, you can compare them with the histograms calculated from another face. That can be the basis of a simple classifier.
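The weighted step the question asks about can be sketched as: compute the chi-square distance per region histogram, multiply by that region's weight, and sum. The uniform weights and histogram sizes below are hypothetical placeholders:

```python
import numpy as np

def chi_square(h1, h2, eps=1e-10):
    """Chi-square distance between two histograms."""
    return np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def weighted_face_distance(regions_a, regions_b, weights):
    """Sum of per-region chi-square distances, scaled by region weights."""
    return sum(w * chi_square(a, b)
               for a, b, w in zip(regions_a, regions_b, weights))

rng = np.random.default_rng(4)
# Stand-ins: 64 regions (8x8 grid), each with a 59-bin LBP-style histogram.
face_a = rng.random((64, 59))
face_b = rng.random((64, 59))           # a different face
weights = np.ones(64)                   # hypothetical: raise eye/mouth regions

d_same = weighted_face_distance(face_a, face_a, weights)
d_diff = weighted_face_distance(face_a, face_b, weights)
print(d_same < d_diff)  # True: a face is closest to itself
```

Matching then amounts to picking the gallery face with the smallest weighted distance.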
  • asked a question related to Image Recognition
Question
9 answers
Hello,
I'm working primarily in Python 3.5, so would prefer answers that can work with that language if possible please...
I would like to write a program that will automatically detect clouds in photographs, and also for it to detect what sort of cloud(s) is/are present in the photograph.
This means, unfortunately, that there are no easily-definable shapes, sizes, or colours.
Can anyone recommend some way of going about this?
I would assume it will require supervised learning (e.g. feeding the software images of X type of cloud, and then images of Y type of cloud, and Z type, etc. etc.).
Ultimately, I'd like to be able to feed the program photographs and have it output a list showing the cloud type, pixel coordinates within the photo (if possible), and the filename.
Thank you in advance for your help.
David
Relevant answer
Answer
You may want to try the 'scikit-learn' package for that purpose. 
  • asked a question related to Image Recognition
Question
3 answers
Dear colleagues,
I am looking for a music OCR notation software, to edit handwritten music. The source of this music is scanned scores, written with pen. The platform I use is windows 7.
Any ideas would be helpful!
Thanks!
Yannis Kyriakoulis
Relevant answer
Answer
You can also try Audiveris, an open-source package.
Homepage link: https://audiveris.kenai.com/
Regards,
Marco
  • asked a question related to Image Recognition
Question
16 answers
I'm facing a problem of letter extraction. The image is grayscale and the letters are in a row. The background of the image is not homogeneous; there can be some texture with non-white intensity. The letters are black. And of course, the letters are sometimes so similar to the background that they cannot be separated easily. The problem is similar to car licence plate letter extraction, but in this case we must expect "damaged plates with damaged letters". Unfortunately, the particular images are the subject of a secret project, so I cannot include an example here.
I have used several types of thresholds, including adaptive ones, histogram-based methods and a segmentation-based method, but none of them works well in general.
The goal of my task is to extract letters, i.e., detect rectangular area of each of them, or to exclude them from backgroud.
There are two criteria: success rate (I need an extraction success rate of almost 100%) and processing speed (as fast as possible, ideally a few milliseconds).
Thank you for your advice, or for links to some useful papers.
Relevant answer
Answer
Hey Petr,
as a very basic approach, given your images look like that most of the time, you could use 2D-correlation (like in older OCR systems). Build a database with example letters, correlate them and pick the ones with the highest correlation coefficient. You might have to solve some ambiguous cases (like 6 & 8 for example) afterwards.
A problem with this method is that it is rather scale-dependent...
Greetings, David
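The correlation approach above, as a minimal sketch: flatten each candidate patch and template, take the correlation coefficient, and keep the best-scoring letter (scale must already match, as noted). The templates here are random stand-ins for binarized glyphs:

```python
import numpy as np

def best_match(patch, templates):
    """Return the template label with the highest correlation coefficient."""
    scores = {label: np.corrcoef(patch.ravel(), tmpl.ravel())[0, 1]
              for label, tmpl in templates.items()}
    return max(scores, key=scores.get)

rng = np.random.default_rng(5)
# Hypothetical letter templates (random stand-ins for real glyph images).
templates = {c: rng.random((12, 12)) for c in "ABC"}

# A degraded copy of "B": the right template should still win.
noisy_b = templates["B"] + rng.normal(0, 0.1, size=(12, 12))
print(best_match(noisy_b, templates))  # B
```

For localisation rather than classification, the same score can be computed at every window position (classic template matching) before picking the peak.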
  • asked a question related to Image Recognition
Question
4 answers
I am working on a project using multiple sensors in which I detect and recognise animals, but I am struggling with the thermal IR camera. At this point of the project I have to recognise animals using thermal imaging. Is there any way we can recognise individual species?
During my research I found out that the eyes, ears and nose are high heat-emitting areas. But my problem is: what if the animal is facing away from the camera, so those areas are hidden?
Thanks
Kind regards 
Relevant answer
Answer
Regarding the rear view, this is also a problem in VIS imaging: you will not be able to discern a horse from a donkey. A lot of animals look the same when you can only get a glimpse of their rear.
To some extent proportion relations might help: length of legs vs. total height vs. width of the a... and so on. The results will not be too satisfactory, but given the amount of data (that is: not much) this is the best you can do.
Wish you success
  • asked a question related to Image Recognition
Question
5 answers
In my project I want to find contour subsets of an image, each delimited by two concavities. The size of every foreground region delimited by a detected contour subset and by the straight line segment connecting the two extremes of the contour subset is computed. If the size is smaller than a given threshold, the peripheral part is regarded as noisy and is removed. As for the threshold, we use the same value adopted during removal of small 8-connected foreground components and filling of thin hollows and small 4-connected holes, i.e., we remove peripheral regions with fewer than 32 pixels.
Please anyone can help me to find this?
Relevant answer
Answer
Could you show an image as an example?
  • asked a question related to Image Recognition
Question
8 answers
I] Part-I (Orientation Assignment to Keypoint)
In this process
1. First I selected a window of size 16 x 16 around the keypoint and calculated the magnitude and orientation for each point in the 16 x 16 window.
2. Then I created a 36-bin histogram of orientations.
3. Then I assigned the mean value of the highest bin. (I.e. if the 1st bin (0-10) is the highest, then 5 is assigned as the keypoint orientation. Is this correct?)
4. Then I calculated a Gaussian window of size 16 x 16 with a sigma value equal to 1.5 times the scale.
5. Then I multiplied the 16 x 16 magnitude matrix with the Gaussian window.
(What is the use of this multiplication?)
Is it required to multiply this result (magnitude x Gaussian) with the orientation before assigning an orientation to the keypoint? (I found some histogram bins with the highest count but a low magnitude value.)
As per my logic, we should assign as the keypoint orientation the mean of the bin whose magnitude-weighted value is highest.
6. Then I transformed (rotated) the coordinates of the keypoint, i.e. its x, y position, with respect to the assigned orientation using a 2D transformation. (Is this correct?)
7. Then I transformed the orientations of all sample points in the 16 x 16 window according to the keypoint orientation. (E.g. if the keypoint orientation = 5 and a sample point orientation = 270, it becomes 275. Is this correct?)
Relevant answer
Answer
Hi,
I am trying to shed some light here:
1. what you get from SIFT is an 128-dim. descriptor for each detected keypoint position, orientation and scale.
2. In the matching step, only the 128-dim descriptor is involved. Please note that the descriptor is already normalized to orientation and scale. The detected position gives you the center point of the image patch that you describe with the 128 dim. descriptor. The scale tells you how large this patch is and the orientation tells how much the patch has to be rotated before descriptor computation.
3. This depends on the application... if you just want to match points between images, you compare the 128-dim descriptors and then look at the keypoint positions once corresponding descriptors have been found.
4. see 2. and 3.
5. Yes, if you rotate an image by 90 degree you should get the same number of keypoints.
6. This is optional... if you implement the standard matching described in Lowe's paper then you have a one-to-many matching.
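On the Gaussian-window question in the original post: the multiplication exists so that gradients far from the keypoint contribute less to the histogram. Each pixel votes into the bin chosen by its orientation, with a vote equal to its magnitude times the Gaussian weight; the orientation itself is never multiplied. A sketch (the patch and the scale value are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(6)
patch = rng.random((16, 16))                   # window around the keypoint

# Gradients, magnitudes and orientations of every pixel in the window.
gy, gx = np.gradient(patch)
magnitude = np.hypot(gx, gy)
orientation = np.degrees(np.arctan2(gy, gx)) % 360

# Gaussian window: down-weights pixels far from the keypoint (steps 4-5).
yy, xx = np.mgrid[-7.5:8.5, -7.5:8.5]
sigma = 1.5 * 2.0                              # 1.5 x a hypothetical scale of 2
gauss = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))

# 36-bin histogram: orientation picks the bin, weighted magnitude is the vote.
hist, edges = np.histogram(orientation, bins=36, range=(0, 360),
                           weights=magnitude * gauss)

# Dominant orientation = center of the highest bin (step 3).
peak = np.argmax(hist)
dominant = (edges[peak] + edges[peak + 1]) / 2
print(0 <= dominant < 360)  # True
```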
  • asked a question related to Image Recognition
Question
2 answers
From my study I realize that feature extraction methods for word-level recognition do not fit well for character-level recognition. This raises the question: what algorithms may work well for character-level recognition, given that the images of individual characters are very small (e.g. 30px by 30px)?
  • asked a question related to Image Recognition
Question
4 answers
We run a small scoring shop for university exams. We use an optical recognition scanner and scan sheets through it to score instructor-designed exams.
We have been asked to begin scoring multiple-answer exams. Our current optical recognition software is good at scoring items with only one correct answer. However, we are now being asked to score tests where students should indicate all items which are true, with up to three correct options per item.
Do any of you have a good system for tabulating the correct answer in this type of assessment? Thanks for your help.
Relevant answer
Answer
Hi Laura
You may treat this problem as an ordinal variable. If, in your example, the student marks a and b, he gets the maximum score (e.g. 3). If he marks c, he obtains the minimum score (e.g. 0); if he marks only a or b, he obtains a score in the middle (e.g. 2). If he marks b and c, or a and c, he gets an intermediate score (e.g. 1). Then you may use an IRT method for an ordinal response, or for a mix of binary and ordinal responses. It may also be that marking only a is better than marking only b; then you may assign different ordinal values to these selections. For example, you may assign a value of +2 for a, +1 for b, and -2 for c, and your ordinal variable may be the sum of these values. You should also take the lack of a response into account.
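The sum-of-values scoring idea can be sketched in a few lines; the option values below are the illustrative ones from the answer (+2 for a, +1 for b, -2 for c), not a recommendation.

```python
def ordinal_score(marked, option_values):
    """Map a set of marked options to an ordinal response by summing
    per-option values; options not in the table contribute zero."""
    return sum(option_values.get(opt, 0) for opt in marked)

# Illustrative values from the answer above.
values = {"a": 2, "b": 1, "c": -2}
```

A blank response (empty set) scores 0 here, but as noted above you may want to flag non-response separately rather than fold it into the scale.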
  • asked a question related to Image Recognition
Question
9 answers
I need to know why the chi-squared kernel outperforms other SVM kernels in image classification. Many features are extracted as bag-of-features histograms, and researchers have applied the chi-squared kernel in many papers.
Relevant answer
Answer
The chi-squared kernel does some normalization in the kernel itself, so it is often better than the Gaussian kernel.
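For concreteness, here is a minimal implementation of the exponential chi-squared kernel commonly used with bag-of-features histograms. The zero-denominator convention (skipping bins where both histograms are zero) is a common choice, not mandated by any one paper.

```python
import math

def chi2_kernel(x, y, gamma=1.0):
    """Exponential chi-squared kernel for non-negative histograms:
    k(x, y) = exp(-gamma * sum_i (x_i - y_i)^2 / (x_i + y_i)),
    where terms with x_i + y_i == 0 are treated as zero."""
    chi2 = sum((a - b) ** 2 / (a + b) for a, b in zip(x, y) if a + b > 0)
    return math.exp(-gamma * chi2)
```

The per-bin division by (x_i + y_i) is the built-in normalization mentioned above: a difference of 5 counts matters more in a sparsely populated bin than in a heavily populated one, which suits histogram features well.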
  • asked a question related to Image Recognition
Question
3 answers
It is known that classification performance degrades as a function of the number of classes in the corpus. What I would like to know is whether there is a systematic study of this aspect in the literature.
Relevant answer
Answer
It's not directly related to "visual recognition", but in 2002, face recognition researchers observed the following:
"One open question in face recognition is: How does database and watch list size effect performance? Because of the large number of people and images in the FRVT 2002 data set, FRVT 2002 reported the first large-scale results on this question. For the best system, the top-rank identification rate was 85% on a database of 800 people, 83% on a database of 1600, and 73% on a database of 37437. For every doubling of database size, performance decreases by two to three overall percentage points. More generally, identification performance decreases linearly in the logarithm of the database size."
Handbook of Face Recognition (2nd ed., Editors: Li & Jain). Chapter 21, page 569, Evaluation Methods in Face Recognition by P. Jonathon Phillips, Patrick Grother, and Ross Micheals
However, I'm not aware of any other supportive and systematic findings from other researchers confirming their observation (a linear performance decrease in the log of database, i.e. class, size).
  • asked a question related to Image Recognition
Question
10 answers
Hello,
Usually, when speaking about the invariance of descriptors, one assumes that a descriptor is invariant if the distance between the features of two images under different transformations equals zero. But in reality this is not the case. So the question is: what distance can we consider as a reference to evaluate the invariance of a descriptor?
Relevant answer
Answer
It depends on the structure of the descriptors that you use: each distance measure is suitable for a particular type of descriptor, with respect to the space in which you apply your comparison.
  • asked a question related to Image Recognition
Question
4 answers
I want to use color information for traffic sign recognition. But color is affected by illumination; for example, a red color can look like black. I think color normalization might help me. What can I do?
Relevant answer
Answer
If I'm reading your question right, you're interested in what color different signs appear when illuminated using illuminants which have vastly different colors; the example is that a red stop sign will appear dark when illuminated using a source which is predominantly blue.
The traditional way to transform between different illuminations is to use the von Kries transform. It models the way human eyes respond to different illuminations. It's not perfect (human vision is complicated, and color spaces usually use three values to approximate the spectrum!), but it will get you close. [Hint: when you're transforming the colors, remember to convert your pixel values from sRGB to linear RGB first.]
Bruce Lindbloom (linked below) has great information about the von Kries transform and color adaptation when viewing scenes under different illuminations.
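A rough sketch of the diagonal-scaling idea behind von Kries adaptation, applied directly in linear RGB for simplicity. The proper transform does this in an LMS cone space via a matrix such as Bradford; scaling linear RGB channels is only a crude approximation, but it shows the core idea of rescaling each channel by the white-point ratio.

```python
def von_kries_adapt(rgb, src_white, dst_white):
    """Simplified von Kries adaptation in linear RGB: scale each channel
    by the ratio of the destination white to the source white.
    All values are assumed to be linear (not sRGB-gamma-encoded)."""
    return tuple(c * d / s for c, s, d in zip(rgb, src_white, dst_white))
```

Note the hint from the answer above: convert sRGB pixel values to linear RGB before applying any such scaling, and back afterwards.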
  • asked a question related to Image Recognition
Question
5 answers
Hello all,
I would like to use a time-lapse camera to take half-hourly photos of a gas meter. Is anybody aware of number recognition software that could be used in conjunction with the camera so that the readings could be logged in digital format (as opposed to images)? Time-lapse cameras are available commercially (http://www.brinno.com/html/TLC100.html), but in the absence of number recognition software, stored images need to be converted into digital values manually, making the process too labour-intensive.
I would appreciate any advice.
Relevant answer
Answer
Easy Macro Recorder allows you to create automatic patterns of mouse actions (e.g. moving the cursor, clicking, etc.), so you can use it without any special knowledge to create a "recording pattern" ;)
  • asked a question related to Image Recognition
Question
11 answers
I am working on counting the number of human objects detected by my camera. So far I have succeeded in separating the objects from the background, but I cannot count the number of detected objects in a simple way. Does anyone have an answer to help me solve this problem? Thanks
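Once foreground and background are separated, a common way to count the objects is connected-component labelling (in OpenCV, cv2.connectedComponents does this directly). A dependency-free sketch of the same idea on a binary mask:

```python
def count_blobs(mask):
    """Count 4-connected foreground regions in a binary mask
    (a list of rows of 0/1 values) using iterative flood fill."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                count += 1           # new region found
                stack = [(y, x)]
                seen[y][x] = True
                while stack:         # flood-fill the whole region
                    cy, cx = stack.pop()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return count
```

In practice you would also filter regions by area, since noise in the foreground mask produces many tiny components.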
  • asked a question related to Image Recognition
Question
2 answers
Interested in ways to interpret green spaces, manually or computer-assisted, from video footage of walks through urban environments.
Relevant answer
Answer
try searching for:
Quantifying plant colour and colour difference as perceived by humans using digital images (Kendal et al. 2013)
  • asked a question related to Image Recognition
Question
4 answers
Problem: we need to recognise a number plate or any other ID number on a vehicle. However, it's hard to get it shadow-free. What methods can be applied to remove shadows effectively? Am I right that there is no way to analyse a picture containing a high-contrast shadow?
Relevant answer
Answer
This problem has been addressed, at least with face recognition problem, and there are many approaches to choose from. Depending on the further processing some approach may be better for you than other, but I would recommend to see, for example, ECCV 2014 paper from Amara et al. "On the Effects of Illumination Normalization with LBP-Based Watchlist Screening". There you have eleven different approaches tested on faces.
Personally I have used only Retinex style methods and some basic filtering (but not on faces)...
  • asked a question related to Image Recognition
Question
4 answers
for image super resolution
Relevant answer
Answer
Generally, the S-Transform provides better resolution in both the time and frequency domains. It is possible to show that the Wavelet Transform is a special case of the S-Transform. Moreover, using the S-Transform you can directly evaluate the local spectrum of the signal without using any empirical relationship between the scale factor and frequency.
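For reference, the S-transform (Stockwell transform) of a signal x(t) is usually written as

```latex
S(\tau, f) = \int_{-\infty}^{\infty} x(t)\, \frac{|f|}{\sqrt{2\pi}}\,
             e^{-\frac{(\tau - t)^2 f^2}{2}}\, e^{-i 2 \pi f t}\, dt
```

The Gaussian window width scales as 1/|f|, which is exactly why no empirical scale-to-frequency mapping is needed: the frequency variable f appears directly in the transform.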
  • asked a question related to Image Recognition
Question
4 answers
I am looking for state of the art methods which are being used for human pose, human upper body, and head detection in still images mainly.
Relevant answer
Answer
Deformable Part Models are top performers at the Pascal VOC challenge, but they are quite slow (a few seconds per image on a strong machine). Usually they use HOG + SVM over a multi-scale search space.
Another implementation with pre-trained models:
Aggregate Channel Features, on the other hand, are quite fast (30+ fps on a single core, depending on the implementation), since they just estimate the features on nearby scales:
  • asked a question related to Image Recognition
Question
3 answers
I am working on SIFT and want to know the best matching technique; in this context I got to know about the vlfeat toolbox. It provides a function for feature matching, i.e. vl_ubcmatch. I just want to know how this function works and to see its MATLAB code. Thank you
Relevant answer
Answer
@G. Chliveros : source code is not available on the website
  • asked a question related to Image Recognition
Question
4 answers
I am doing a project on night-time vehicle detection, in which only the vehicle lights are detected, so please give me the source code for this project.
I can detect lights separately, but my output also shows street lights, so please help me.
The second image is my output image; the first image is my expected output image.
Relevant answer
Answer
If you are going to detect the lights of cars, trucks, etc., where two lights are on, then apply light thresholding (a light is near 255 gray intensity). Then detect blobs. If two blobs are near each other, they are likely true vehicle lights.
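The "two nearby blobs" heuristic can be sketched as follows. The pixel thresholds are purely illustrative and would need tuning for a real camera setup; pairing on similar height (same horizontal line) is what helps reject isolated street lights.

```python
def pair_headlights(centroids, max_dx=200, max_dy=10):
    """Greedily pair blob centroids (x, y) that lie on roughly the same
    horizontal line and within a plausible headlight separation.
    Unpaired blobs (e.g. street lights) are simply left out."""
    pairs, used = [], set()
    for i, (x1, y1) in enumerate(centroids):
        if i in used:
            continue
        for j in range(i + 1, len(centroids)):
            if j in used:
                continue
            x2, y2 = centroids[j]
            if abs(y1 - y2) <= max_dy and 0 < abs(x1 - x2) <= max_dx:
                pairs.append((i, j))
                used.update((i, j))
                break
    return pairs
```

With thresholded blob centroids from the previous step as input, each returned pair is a headlight-pair candidate; lone high blobs (street lights) never get paired.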
  • asked a question related to Image Recognition
Question
2 answers
Hello, I want to apply the Kadir operator to my saliency maps, but I don't understand this method. Are there any MATLAB codes or libraries for the Kadir operator?
  • asked a question related to Image Recognition
Question
7 answers
What is the approach to measure the real size (length, height and width) of an object from an image when I don't know the focal length or the object distance (i.e. I don't know the origin of the image or any technical details of the lens or the camera)?
Relevant answer
Answer
The only other factor you need is the height of the object in real life (otherwise you could be photographing a model which is much closer to the camera).
The maths isn't actually that complex, the ratio of the size of the object on the sensor and the size of the object in real life is the same as the ratio between the focal length and distance to the object.
To work out the size of the object on the sensor, work out its height in pixels, divide by the image height in pixels and multiply by the physical height of the sensor.
So the whole sum is:
distance to object (mm) = [focal length (mm) × real height of the object (mm) × image height (pixels)] / [object height (pixels) × sensor height (mm)]
Let's sanity check this equation.
If we keep everything else constant and increase the focal length then the distance increases (as focal length is on the numerator). This is what you would expect, if you have to zoom your lens to make one object the size another equally sized object used to be, the first object must be further away.
If we keep everything else constant and increase the real height of the object then again the distance increases as if two objects of different real heights appear the same height in the image the taller one must be further away.
If we keep everything else constant and increase the image height, then the distance increases, as if two objects (of the same size, remember we're keeping everything else constant) appear the same pixel size in a cropped and uncropped image then the object in the uncropped image must be further away.
If we keep everything else constant and increase the object height in pixels then the distance decreases (we're on the denominator now): two equally sized objects, one takes up more pixels, it must be closer.
Finally if we keep everything else constant and increase sensor size, then distance decreases: two equally sized objects have the same height in pixels when shot with a compact (small sensor, where 20mm is a long lens) and shot with a DSLR (large sensor where 20mm is a wide lens), then the object in the DSLR image must be further away (because it appeared the same size but with a wide lens).
I saw this on a website and hope it helps you.
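The formula above fits in a one-line function. As a plausibility check on the numbers below: a 1.8 m object spanning 600 of 3000 image pixels, shot at 50 mm focal length on a 24 mm-tall sensor, comes out at 18 750 mm, i.e. 18.75 m.

```python
def object_distance_mm(focal_mm, real_height_mm, image_height_px,
                       object_height_px, sensor_height_mm):
    """Pinhole-camera distance estimate: the equation above,
    rearranged as a single expression."""
    return (focal_mm * real_height_mm * image_height_px /
            (object_height_px * sensor_height_mm))
```

Solving the same equation for real_height_mm instead gives the object's size when the distance is known, which is the original question.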
  • asked a question related to Image Recognition
Question
3 answers
I have a requirement to scan large documents and extract the text from them. How should we scan the book, and what are the most efficient ways of doing this? How can I do this most efficiently and get the best accuracy from an OCR program?
Relevant answer
Answer
You can use the following  site:
Then create the following datasets
Data Set Information:
This database has been artificially generated by using a first order theory which describes the structure of ten capital letters of the English alphabet and a random choice theorem prover which accounts for heterogeneity in the instances. The capital letters represented are the following: A, C, D, E, F, G, H, L, P, R. Each instance is structured and is described by a set of segments (lines) which resemble the way an automatic program would segment an image. Each instance is stored in a separate file whose format is the following:
CLASS OBJNUM TYPE XX1 YY1 XX2 YY2 SIZE DIAG
where CLASS is an integer number indicating the class as described below, OBJNUM is an integer identifier of a segment (starting from 0) in the instance and the remaining columns represent attribute values. For further details, contact the author.
Attribute Information:
TYPE: the first attribute describes the type of segment and is always set to the string "line". Its C language type is char.
XX1,YY1,XX2,YY2: these attributes contain the initial and final coordinates of a segment in a cartesian plane. Their C language type is int.
SIZE: this is the length of a segment computed by using the geometric distance between two points A(X1,Y1) and B(X2,Y2). Its C language type is float.
DIAG: this is the length of the diagonal of the smallest rectangle which includes the picture of the character. The value of this attribute is the same in each object. Its C language type is float.
Good Luck
  • asked a question related to Image Recognition
Question
4 answers
I have read the paper "Vehicle logo recognition in traffic images using HOG features and SVM", published in 2013. I have some questions that have confused me for several days.
First, in part B of section III, I don't know what an overlapping block is. What is the difference between an "overlapping block" and a "block"? According to Fig. 8, is the overlapping block the area shared by two neighbouring blocks? Is an overlapping block formed only by a left and a right block, and not by an upper and a lower block?
Second, a feature vector is obtained by sampling the histograms from the contributing spatial cells. Are these cells inside the overlapping block?
Finally, I don't really understand what multiple binary classification problems are. How is a single multi-class problem reduced to multiple binary classification problems?
The attachment is the paper. Thank you very much for your reply.
Relevant answer
Answer
I am one of the authors of the paper that you are citing. I am going to try to help you.
First of all, the concept of an overlapping block is better understood from the original paper by Dalal and Triggs, where HOG is described. To calculate the HOG descriptor, local gradients are binned according to orientation, weighted by their magnitude, within a spatial grid of cells with overlapping blockwise contrast normalization. Within each overlapping block of cells, a feature vector is obtained by sampling the histograms from the contributing spatial cells. The feature vectors for all overlapping blocks are concatenated to produce the final feature vector, which is fed to the classifier.
Finally, the concept of binary classification is related to problems where several classes are classified by means of different binary classifiers. You can understand this concept by studying the LIBSVM library.
I hope that my response helps you. Good luck with your research!
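The overlap can be made concrete with a toy sketch: slide a 2×2-cell block over the grid of per-cell histograms with a stride of one cell (so adjacent blocks share cells, which is exactly what "overlapping" means here), L2-normalise each block, and concatenate. Real HOG implementations differ in details (histogram binning, normalisation scheme), but the block structure is this.

```python
import math

def hog_feature(cell_hists, block=2):
    """Concatenate L2-normalised overlapping blocks of per-cell
    histograms. cell_hists is a 2-D grid (rows x cols) whose entries
    are per-cell orientation histograms (lists of floats)."""
    rows, cols = len(cell_hists), len(cell_hists[0])
    feature = []
    for r in range(rows - block + 1):          # stride 1 => overlap
        for c in range(cols - block + 1):
            vec = [v for dr in range(block) for dc in range(block)
                   for v in cell_hists[r + dr][c + dc]]
            norm = math.sqrt(sum(v * v for v in vec)) or 1.0
            feature.extend(v / norm for v in vec)
    return feature
```

Note that a 2×3 cell grid yields two horizontally overlapping 2×2 blocks, and the middle column of cells contributes to both: that shared contribution is the blockwise contrast normalisation described above.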
  • asked a question related to Image Recognition
Question
4 answers
If possible, help me with the MATLAB code... Thanks in advance.
Relevant answer
Answer
Hi Naveen,
as far as I know the extension of GLCM (Grey-level co-occurrence matrices) to colour images is usually referred to as integrative co-occurrence matrices. The method was originally proposed by Palm [1]. A very good description of the approach is also available in [2].
[1] Palm, C. Color texture classification by integrative Co-occurrence matrices
(2004) Pattern Recognition, 37 (5), pp. 965-976.
[2] Arvis, V. Debain, C. Berducat, M., Benassi A. Generalization of the cooccurrence matrix for colour images: application to colour texture classification (2004) Image Analysis & Stereology, 23(1), pp. 63-72
Vincent Arvis, Christophe Debain, Michel Berducat, Albert Benassi
Finally, let me mention a paper you might find useful:
F. Bianconi, R. Harvey, P. Southam and A. Fernández; Theoretical and experimental comparison of different approaches for color texture classification (2011) Journal of Electronic Imaging, 20(4), 043006
(MATLAB code available at http://dismac.dii.unipg.it/ctc/)
  • asked a question related to Image Recognition
Question
3 answers
I am working on this problem
Relevant answer
Answer
Action recognition with improved trajectories
H Wang, C Schmid
Computer Vision (ICCV), 2013 IEEE International Conference on, 3551-3558
Action recognition by dense trajectories
H Wang, A Klaser, C Schmid, CL Liu
Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on
  • asked a question related to Image Recognition
Question
10 answers
Does anyone have any experience on the development of a domestic robot's ability to locate itself in an indoor environment?
For your answer, take into account the possibility of having a camera on the robot (image recognition may be a way to go?).
I believe it may be necessary to take multiple inputs. For example, an image recognition algorithm together with a dead-reckoning method, such as estimating displacement as a function of the revolutions of the robot's wheels, could be used to estimate the position of the robot.
All feedback would be greatly appreciated, as I am just starting with this investigation.
Thank you very much!
Relevant answer
Answer
You could have a look at RatSLAM, as it would fit your constraints very well (works indoor + outdoor, uses a camera as input). There is an open source version of it available, too: OpenRatSLAM.
  • asked a question related to Image Recognition
Question
6 answers
I am working on action recognition from multi-view skeleton images captured using Kinects. However, when I acquired skeleton images using three Kinects with three notebooks, the image frames across devices were not quite synchronized. Are there any ways to sync all frames from the different devices? Or is there a publicly available database of synced skeleton coordinates that I can download to perform action recognition?
Relevant answer
Answer
I think the easiest way to solve this is to co-register the three Kinect images by calibrating the setup to the same (e.g. real-world) coordinate system. This means you first have to transform the skeletons from each Kinect into the real-world coordinate system. If this is done with sufficient accuracy, the registration of the joints coming from the different sensors should be possible with a simple nearest-neighbour search.
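The nearest-neighbour association step might look like the sketch below. The joint coordinates are assumed to be already transformed into a shared real-world frame (in metres), and the 0.1 m gating threshold is illustrative.

```python
def match_joints(joints_a, joints_b, max_dist=0.1):
    """Associate each named joint of one skeleton with its nearest
    neighbour in another skeleton, rejecting matches farther than
    max_dist (both skeletons in the same world coordinate frame)."""
    matches = {}
    for name, (x, y, z) in joints_a.items():
        best, best_d = None, max_dist
        for name_b, (xb, yb, zb) in joints_b.items():
            d = ((x - xb) ** 2 + (y - yb) ** 2 + (z - zb) ** 2) ** 0.5
            if d < best_d:
                best, best_d = name_b, d
        if best is not None:
            matches[name] = best
    return matches
```

Temporal synchronization is a separate issue: this spatial matching only works on frames captured at (approximately) the same instant, so timestamps still need to be aligned first.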
  • asked a question related to Image Recognition
Question
21 answers
I would like to extract various image features for phone-screenshot image recognition. I hope the feature extraction method runs fast, so perhaps it should be implemented in Python and/or C++. Any source code links would be very helpful!
Thanks a lot! 
Relevant answer
Answer
The OpenCV library fits this case. Please get more information through the link:
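Even a trivial global feature illustrates the basic idea of mapping an image to a fixed-length vector; a real pipeline would use OpenCV descriptors (ORB, HOG, etc.) instead, but the sketch below needs no dependencies.

```python
def gray_histogram(pixels, bins=16):
    """A very cheap global image feature: a normalised grey-level
    histogram over a flat list of 0-255 pixel values."""
    hist = [0] * bins
    for p in pixels:
        hist[min(p * bins // 256, bins - 1)] += 1
    total = len(pixels) or 1
    return [h / total for h in hist]
```

Screenshots in particular tend to have strongly peaked histograms (flat UI colors), so even this crude feature separates many screenshot categories surprisingly well before moving to proper descriptors.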
  • asked a question related to Image Recognition
Question
1 answer
I plan to do some work on image recognition for maize pattern recognition for our spray machine. But I haven't found any related research, so I don't know the difficulties that need to be overcome. So I need some advice from those who have done related work before. Thanks!
The sprayer is a type of fogging machine whose effective spray distance is 10 m.
Relevant answer
Answer
What type of sprayer? Did you link that image recognition to the sprayer?
  • asked a question related to Image Recognition
Question
6 answers
I know this is a fairly basic question but I'm having some difficulty finding anything out there - does the emotion of the perceiver influence how neutral faces are encoded / recognized?
Relevant answer
Answer
I think that hypothesis makes sense, but I also have not seen this paper extended, which I've always found surprising.
Ackerman has work showing angry faces induce better CR memory, due to the potential threat they pose.
Research also shows that sad mood increase attention to details and thus one might suspect sad moods to increase face memory, but this may only be for SR targets since CR targets may be perceived as offering fewer affiliative affordances.
  • asked a question related to Image Recognition
Question
7 answers
Can anyone suggest a novel technique for an Arabic character recognition system?
Relevant answer
Answer
thanks Mr. Evon
  • asked a question related to Image Recognition
Question
3 answers
I am working on pattern detection of dermatological images and I would like to know how to extract and match them.
Relevant answer
Answer
I think the previous two references provided by @Christos P Loizou and @Lucia Ballerini are new and excellent to start with.
  • asked a question related to Image Recognition
Question
3 answers
Is there a way of labeling multiple objects within an image, or should each object be separated to get results on a per-feature basis?
Relevant answer
Answer
Thank you all. In my dataset, each image has a person with infected skin, healthy skin, and background.
  • asked a question related to Image Recognition
Question
2 answers
How would I specify that group 1 = apple, group 2 = orange, and group 3 = banana?
Relevant answer
Answer
Thank you
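If the question is simply how to attach names to the classifier's numeric group indices, a plain lookup table does it. The mapping below just restates the groups from the question; it is a hypothetical sketch, not tied to any particular toolbox.

```python
# Hypothetical mapping from classifier group indices to class names,
# using the groups given in the question.
LABELS = {1: "apple", 2: "orange", 3: "banana"}

def group_name(index):
    """Translate a numeric group index into its human-readable label."""
    return LABELS[index]
```

Applied to a classifier's predicted indices, e.g. [2, 1, 3], this yields the readable labels ["orange", "apple", "banana"].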
  • asked a question related to Image Recognition
Question
2 answers
SDK -- Software Development Kit
OMR -- Optical Mark Recognition
Relevant answer
Answer
Thanks for this. I'll check out the links.
  • asked a question related to Image Recognition
Question
2 answers
1. Students' profiles in a structured format; 2. MPEG-7 CE-Shape-1.
Also needed: the English Fnt datasets and KTH-TIPS.
Relevant answer
Answer
Thanks Ahmed
  • asked a question related to Image Recognition
Question
1 answer
There are attribute-annotated images from ImageNet. I intend to use them in my research, but for comparison I am also looking for other work done on this subset. Do you know any work you could suggest I look at?
Relevant answer
Answer
Perhaps this would be a good place to start:-
  • asked a question related to Image Recognition
Question
13 answers
I need to extract areas that contain any text from the given image which is more like a scene than a simple document. As the next step, I also intend to recognise the text.
Relevant answer
Answer
In order to extract text regions, I recommend you read this paper about text detection in natural images: Epshtein, B.; Ofek, E.; Wexler, Y., "Detecting text in natural scenes with stroke width transform".
ccv is a computer vision library that has some great algorithms, and it has an implementation of this method: http://libccv.org/doc/doc-swt/
I hope this will be helpful to you.
  • asked a question related to Image Recognition
Question
5 answers
In research on moving object detection with a moving camera, the first step is to estimate ego-motion and then stabilize the video sequence, but some research takes other approaches. Do you know which of them are practical and efficient?
Relevant answer
Answer
Gang Zhang , Tobias Senst Thank you very much, good luck.