Science topic

Machine Vision - Science topic

Explore the latest questions and answers in Machine Vision, and find Machine Vision experts.
Questions related to Machine Vision
  • asked a question related to Machine Vision
Question
3 answers
Hello, can you recommend any machine vision architecture that can detect at least one object with a low-resolution (or high-resolution) camera at more than 15 FPS (frames per second) on a Raspberry Pi 5 (8 GB RAM)?
Relevant answer
Answer
YOLOv8 supports real-time object detection and can also detect small objects.
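A minimal sketch of how YOLOv8 inference might look on a Pi-class device with the Ultralytics package (the model file, image size and thresholds below are illustrative assumptions, not tested Raspberry Pi 5 settings):

# Hypothetical sketch: YOLOv8 nano inference on camera frames.
# Assumes the 'ultralytics' and 'opencv-python' packages are installed.
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")      # nano model: the smallest and fastest variant
cap = cv2.VideoCapture(0)       # default camera

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # A small input size (e.g. 320) trades accuracy for speed on low-power hardware.
    results = model.predict(frame, imgsz=320, conf=0.4, verbose=False)
    annotated = results[0].plot()   # draw the detected boxes on a copy of the frame
    cv2.imshow("detections", annotated)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()

Exporting the model to an optimized runtime (e.g. ONNX or NCNN) usually helps toward the 15+ FPS target on a Raspberry Pi; whether it is reached depends on the chosen resolution and model size.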
  • asked a question related to Machine Vision
Question
3 answers
Seeking insights on leveraging deep learning techniques to improve the accuracy and efficiency of object recognition in machine vision systems.
Relevant answer
Answer
Identification. Image classification using deep learning categorizes images or image regions to distinguish between similar-looking objects, including those with subtle imperfections. For example, image classification can determine whether the lips of glass bottles are safe or not.
Regards,
Shafagat
  • asked a question related to Machine Vision
Question
4 answers
The 2024 5th International Conference on Computer Vision, Image and Deep Learning (CVIDL 2024) will be held on April 19-21, 2024.
Important Dates:
Full Paper Submission Date: February 1, 2024
Registration Deadline: March 1, 2024
Final Paper Submission Date: March 15, 2024
Conference Dates: April 19-21, 2024
---Call For Papers---
The topics of interest for submission include, but are not limited to:
- Vision and Image technologies
- DL Technologies
- DL Applications
All accepted papers will be published by IEEE and submitted for inclusion into IEEE Xplore subject to meeting IEEE Xplore's scope and quality requirements, and also submitted to EI Compendex and Scopus for indexing.
For More Details please visit:
Relevant answer
Answer
Great opportunity!
  • asked a question related to Machine Vision
Question
2 answers
#specialissue #CallforPapers 📢 The CMC-Computers, Materials & Continua special issue “Machine Vision Detection and Intelligent Recognition” is now open for submission. Machine vision detection and intelligent recognition are important research areas in computer vision, with wide-ranging applications in manufacturing, healthcare, security, transportation, robotics, industrial production, aerospace, and many other industries. This is a great opportunity for researchers and practitioners to share their latest findings and contribute to the advancement of the field.
📆 The deadline for manuscript submission is 31 December 2023.
👉 To submit your manuscript, please visit the following link: https://www.techscience.com/cmc/special_detail/machine-vision-detection
We look forward to your contributions to this exciting special issue!
  • asked a question related to Machine Vision
Question
2 answers
Hello friends, good time
I want to choose a backlight system to inspect hot glass bottle defects. Given that hot bottles emit infrared light, do you think the backlight is better with white light or infrared light?
BR.
Relevant answer
Answer
Hello dear Sumit Bhowmick, good time,
Thank you for your time and information.
Piotr Garbacz previously reviewed the use of UV backlight in the "INSPECTION OF TABLEWARE GLASS PRODUCTS AT THE HOT END OF PRODUCTION LINE" paper and concluded that "Industrial trials have shown that this method cannot be used due to the significant reduction of the fluorescence effect with increasing temperature".
I want to use a visible or infrared backlight, but I don't know which one is better.
BR.
  • asked a question related to Machine Vision
Question
9 answers
As the question states, in many natural gemstones the color distribution is not uniform, so is there a suitable method to separate and quantify the color clusters?
  • asked a question related to Machine Vision
Question
14 answers
What we should ask, instead, is how to develop more informed and self-aware relationships with technologies that are programmed to take advantage of our susceptibility to being deceived. It might sound paradoxical, but to better comprehend AI we need first to better comprehend ourselves. Contemporary AI technologies constantly mobilize mechanisms such as empathy, stereotyping, and social habits. To understand these technologies more deeply, and to fully appreciate the relationship we are building with them, we need to interrogate how such mechanisms work and which part deception plays in our interaction with “intelligent” machines.
Relevant answer
Answer
I think you have to break the phenomenon into two layers. There is the layer of people outside the field, for whom the narrative they are fed can span from scientific reporting to science fiction. Trying to reason about the narrative at this layer is too complicated due to the many factors that move people. Also, being non-technical, they see an AI accomplish a task such as that of LaMDA and hastily jump to conclusions (even the engineer at Google was misled by the output given by the AI).
The more wearisome layer is the scientific community, which is driven by scientific results. The reason for focusing on the metric is as follows:
a) metric A gives an 'intelligence measure' B
b) test results using metric A give results C
c) results C confirm 'intelligence measure' B
Notice that the problem is in accepting a) as valid to form an argument that will sway the scientific community. Since the argument structure is reasonable, most scientists will accept the phenomenon as valid. While I think what you want to get at is the construction of a), it is the chain of reasoning that ends with c) that is mostly at play in the phenomenon you mention.
Regards
  • asked a question related to Machine Vision
Question
11 answers
I am currently searching for a topic for my research, which is about using machine vision and object recognition to control a robot (serial or parallel, it does not matter). Unfortunately, I cannot find a problem to be solved. Can anyone recommend some new points of research?
Relevant answer
Answer
Robots are used in many materials-handling applications because they are more efficient at certain tasks and take people out of potentially unsafe situations. To work effectively, robots rely on sensors to interact with and perceive their environment. This white paper from ifm illustrates the importance of 3D image sensors in robotics and the systems used to operate and manage them.
The interaction of the human eye with the visual centre in the brain creates a three-dimensional image of the environment. In robotics, such a three-dimensional image is important to enable robots to act independently and without causing any danger outside of safety barriers...
  • asked a question related to Machine Vision
Question
6 answers
Hello everyone, good time.
I want to select a camera (lens + sensor) for a machine vision system. According to machine vision books, the modulation transfer function (MTF) of the camera is an important factor that affects algorithm performance.
How can I select a suitable MTF value for the camera of my machine vision system?
Are there any criteria for determining suitable image contrast for the machine vision algorithms?
Can you please tell me other factors that I must consider for selecting a camera for a machine vision system?
thank you very much in advance.
Relevant answer
Answer
Mahdi Mansouri Maybe you misunderstood me, or maybe we're talking about different things: the "minimum feature" to be discerned should not be smaller than 10 pixels in height/width - the target itself might require more pixels. It depends.
To clarify it for me: what would be your scene and what would be the target?
Pixel size itself is not important - provided the lens and sensor are matched: for a given pixel count, a larger sensor (i.e., larger pixel area) requires a lens with a longer focal length than a smaller sensor with the same number of pixels.
  • asked a question related to Machine Vision
Question
13 answers
I'm searching about autoencoders and their application in machine learning issues. But I have a fundamental question.
As we all know, there are various types of autoencoders, such as the Stacked Autoencoder, Sparse Autoencoder, Denoising Autoencoder, Adversarial Autoencoder, Convolutional Autoencoder, Semi-Autoencoder, Dual Autoencoder, Contractive Autoencoder, and others that improve on earlier versions. Autoencoders are also known to be used in Graph Networks (GN), Recommender Systems (RS), Natural Language Processing (NLP), and Machine Vision (CV). This is my main concern:
Because the input and structure of each of these machine learning problems are different, which version of the autoencoder is appropriate for which machine learning problem?
Relevant answer
Answer
Look at the link; it may be useful.
Regards,
Shafagat
  • asked a question related to Machine Vision
Question
4 answers
I want to find out the object's length with decent precision. The laminate can have different lengths, and we assume that we don't know the actual length. Therefore I basically want an algorithm that can calculate the length.
Relevant answer
Answer
This is a good question.
  • asked a question related to Machine Vision
Question
4 answers
I am developing a machine-learning model to make accurate and fast counts of metal pipes with different cross-sectional shapes. Well-defined rectangular, triangular and circular shapes are quite OK to do, but the C-shaped metal is really complicated, especially when the pieces overlap one another as shown in the attached photo. Does anyone have a suggestion for a model that can count overlapping objects? Thanks in advance.
  • asked a question related to Machine Vision
Question
12 answers
There are many, many datasets for computer vision tasks such as object detection and the like, but benchmarks for automated visual inspection tasks (e.g. detection of surface defects, bulk material classification) are hard to come by. I've searched in the usual places (http://www.cvpapers.com/datasets.html, http://www.computervisiononline.com/datasets, google) but came up with nothing. Do you know of such datasets (synthetic or natural)?
  • asked a question related to Machine Vision
Question
15 answers
Assuming that organizations could afford high-performance supercomputing to work directly on high-resolution images, without any need for image downscaling in machine vision applications, what is the compelling need for these super-resolution algorithms that downsample and upsample an existing high-resolution image?
Relevant answer
Answer
I agree, and I did mention in my first post the "speed of processing" of algorithms but that does not seem to be of interest to the creator of this post based on his reply. When working with CODEC and network communication, speed of processing (i.e., down sampling / super resolution) becomes apparent for multimedia communication because of the bandwidth (as you mentioned) and as you cannot guarantee the client is as lucky as you are in having access to HPC.
  • asked a question related to Machine Vision
Question
9 answers
Video of a driver while driving: fatigue, drowsiness, distraction.
Relevant answer
Answer
There is a new public dataset that is based on realistic drowsiness for 3 levels of awareness. It consists of 60 participants.
It is called UTA-RLDD
I provide the links here:
Dataset:
Paper:
Code:
Demo:
I hope it helps!
  • asked a question related to Machine Vision
Question
5 answers
I'm a newbie in the field of Deep Reinforcement Learning with background in linear algebra, calculus, probability, data structure and algorithms. I've 2+ years of software development experience. In undergrad, I worked in tracking live objects from camera using C++,OpenCV. Currently, I'm intrigued by the work been done in Berkeley DeepDrive (https://deepdrive.berkeley.edu/project/deep-reinforcement-learning). How do I gain the knowledge to build a theoretical model of a self-driving car ? What courses should I take? What projects should I do ?
Relevant answer
Answer
Hi Aniruddha,
If you are able to spend some money on acquiring the knowledge, then Udacity's Self Driving Course is one of the best places to get started. More info at https://in.udacity.com/course/self-driving-car-engineer-nanodegree--nd013
The best part is they have open sourced some part of the codes which can be a great starting point. The codes are available at https://github.com/udacity/self-driving-car
To write software for self driving cars, I would recommend using ROS (http://www.ros.org/). ROS has many built-in functionalities like object detection, path planning, node controls, etc., which can get you started easily. The ROS Wiki (https://wiki.ros.org/) can offer you a glimpse of what ROS is capable of.
ROS turtlebot autonomous navigation (https://wiki.ros.org/turtlebot_navigation/Tutorials/Autonomously%20navigate%20in%20a%20known%20map) will be a great tutorial to start with.
Though I have never used it, https://www.duckietown.org/independent/guide-for-learners is also an interesting platform to start with.
Regards,
Vishnu Raj
PS: If you find this answer useful, don't forget to upvote.
  • asked a question related to Machine Vision
Question
3 answers
Explain in detail
Relevant answer
Answer
I am supposing that you are dealing with machines, since you mentioned the bearings. I have been working on eccentricity fault detection and diagnosis using a model-based ANN, and accessibility was an important factor. If I am understanding the whole picture, you need pictures of the bearings. Is this done to diagnose a part from a specific machine or as an independent part? Regardless of the part, we are dealing with extremely small deviations or distances.
  • asked a question related to Machine Vision
Question
4 answers
I need the MATLAB implementation of the 'jseg' image segmentation algorithm for content-based image retrieval.
Relevant answer
Answer
Thank you Muhammad
  • asked a question related to Machine Vision
Question
5 answers
Hi
For our research project we would like to perform collision detection and obstacle avoidance with machine vision. I would like to use deep learning, or maybe even better, extreme online learning techniques.
So please, can someone suggest some platforms and tools to start with for machine vision and deep learning / extreme online learning, to hit the ground running? I would appreciate your help.
Thanks
Relevant answer
Answer
How about other deep learning or online/extreme learning libraries? I'm not looking for the simplest one, but the most efficient and appropriate for collision detection/obstacle avoidance. How about Faster R-CNN?
  • asked a question related to Machine Vision
Question
4 answers
Dear colleagues,
I need the cropped version of FERET face database. Please provide a link to download it. Thanks.
  • asked a question related to Machine Vision
Question
5 answers
Hi
I have a machine vision project for a crane in construction. It must meet the requirements for swing-angle detection and obstacle avoidance for the crane (at a distance of around 40 m). So I need a high-quality monocular or stereo camera that is able to detect the swing angle and support obstacle avoidance at a distance of around 40 meters, and that meets other requirements such as dynamic range. Can you please suggest any monocular or stereo camera that meets such requirements?
Thanks
Relevant answer
Answer
Whenever I select a camera for any of my computer vision related projects, I first work out the following factors for the problem at hand.
A. FOV
B. Distance (in your case 40m)
C. Sensor resolution: This should be calculated using the FOV, object distance and the required accuracy in depth calculation.
D. Lighting conditions: (Auto exposure works well only in outdoor and daytime operation. Otherwise select a camera which supports manual exposure settings)
E. Communication interface: Based on ambient noise, bandwidth requirement, processing location and processing equipment compatibility.
F. Lens: based on FOV
G. Spectral options (for selecting filters)
I. FPS: 15 would be enough for static obstacle modeling. But at least 30 is required for real-time tracking (if you are going to do adjustments to the swing, dynamically)
J. Sensor type: to avoid certain effects such as rolling shutter and considering spectral sensitivity.
K. Ingress protection requirements
These are some generic factors to be considered when selecting an industrial camera and lens. Prioritizing the above factors makes it easier to select the best option for the budget. Some industrial camera manufacturers even provide online tools for selecting the best-fitting camera for a problem once the above factors are known.
For prototyping I would suggest using a cheap industrial/CCTV camera with a common lens mount option, an approximately 5-megapixel CMOS sensor, and support for at least FHD video transmission.
  • asked a question related to Machine Vision
Question
9 answers
I would like to estimate the built-up area by using image edge detection. However, after detecting the edges inside the image, I need to close the area between the edges. Can anyone guide me?
Relevant answer
Answer
Dear Mohammad,
Please follow some of the papers given below:
1. Davim, J. P., Rubio, J. C., & Abrao, A. M. (2007). A novel approach based on digital image analysis to evaluate the delamination factor after drilling composite laminates. Composites Science and Technology, 67(9), 1939-1945.
2. Tan, Y. L., Kim, H., Lee, S., Tihan, T., Ver Hoef, L., Mueller, S. G., ... & Knowlton, R. (2018). Quantitative surface analysis of combined MRI and PET enhances detection of focal cortical dysplasias. NeuroImage, 166, 10-18.
3. Li, X., Gao, B., Woo, W. L., Tian, G. Y., Qiu, X., & Gu, L. (2017). Quantitative surface crack evaluation based on eddy current pulsed thermography. IEEE Sensors Journal, 17(2), 412-421.
Thanks,
Sobhan
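One common way to close the gaps between detected edges and obtain filled regions is morphological closing followed by contour filling. A minimal OpenCV sketch is below; the file name, kernel size and thresholds are illustrative assumptions:

# Hypothetical sketch: close gaps in an edge map and fill the enclosed regions.
import cv2
import numpy as np

img = cv2.imread("built_up_area.png", cv2.IMREAD_GRAYSCALE)   # assumed input image
edges = cv2.Canny(img, 50, 150)

# Morphological closing bridges small gaps between edge fragments.
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
closed = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, kernel)

# Fill each closed contour to turn edge outlines into solid regions.
contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
mask = np.zeros_like(img)
cv2.drawContours(mask, contours, -1, 255, thickness=cv2.FILLED)

area_pixels = int(np.count_nonzero(mask))   # built-up area in pixels
print("Filled area (pixels):", area_pixels)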
  • asked a question related to Machine Vision
Question
2 answers
The digital image sensor is a technology used to record electronic images. These image sensors are silicon microchips that are engineered to operate in different ways. The engineering involved in these circuits is very complex and the techniques used to manufacture these devices are not standardized in the industry. It is impractical to discuss the entire spectrum of manufacturing processes here, but they are noted because they give rise to the wide variety of applications for the digital image sensor. These engineering practices involve creating features that are added to the bare sensor for use in specific applications ranging from machine vision in robotics, to cameras integrated into cellular phones and PDAs, to ultraviolet sensors on the Hubble Space Telescope.
Relevant answer
Dear Abhijith,
welcome,
This is a nice question.
Can one use semiconductors other than silicon for electronic cameras, especially those materials that are used as active layers for solar cells?
As photosensors, these materials can be used either in photoconductive mode or photovoltaic mode. One can use organic semiconductors, which are very efficient absorbers over a wide range of wavelengths. One can also use perovskites, which have intermediate properties between metallic and organic materials.
I think there is no problem with the sensitivity and responsivity of these materials, but there are some other concerns:
The dynamic performance of such devices, as the mobility of the charge carriers in these devices is much lower than in Si.
The other important concern is that the sensor current needs to be amplified. In Si this is accomplished by directly integrating the amplifiers on the same material. This may not be applicable for such proposed sensors.
Another point is that the acquisition and storage circuits are integrated and built on the same chip.
So, I think the main concern is the acquisition and processing capabilities integrated on the same chip, which makes silicon prevail over the other optical sensors.
One last point is the stability and reliability of these new sensors.
In the case of LEDs there are metallic LEDs and organic LEDs, so why not have organic sensors for digital cameras? It may need a long development track to arrive there.
More elaborate studies are needed to assess such sensors for digital cameras.
Best wishes
  • asked a question related to Machine Vision
Question
6 answers
Given the recent interest in creating swarms of robotic flies that carry explosives and contain face recognition circuitry (as described in Bot Flies, The Economist, Dec 16, 2017), will a time come when the enablers of such killing machines are held to account? We all know that machines will never - and I emphasize never - be able to make decisions without the assistance of humans, so expect arguments such as "the machine did it" to fail in a court of law. Can you imagine a world where people like Robert Oppenheimer (the developer of the atomic bomb) are hauled off to prison and later put to death for the creation of lethal autonomous weaponry?
Relevant answer
Answer
Dear Edward,
I propose you to see links and attached files in subject.
-Who is Responsible for Autonomous Weapons? - Future of Life Institute
-When Thinking Machines Break the Law - Schneier on Security
-UN urged to ban 'killer robots' before they can be developed | Science ...
https://www.theguardian.com › Science › Weapons technology
-We can't ban killer robots – it's already too late | Philip Ball | Opinion ...
https://www.theguardian.com › Opinion › Robots
-Making the Case: The Dangers of Killer Robots and the Need for a ...
-Killer robots: No one liable if future machines decide to kill, says ...
-Humans Can't Escape Killer Robots, but Humans Can Be Held ...
-Military Robots and the Laws of War - The New Atlantis
-In defence of killer robots: Interview with expert Dr. William Boothby ...
Best regards
  • asked a question related to Machine Vision
Question
4 answers
Machines are detecting and recognizing people, detecting actions, tracking objects and even detecting dangers. When will blind people and people with visual disabilities be able to depend reliably on vision tools to aid them?
Relevant answer
Answer
Five years later we can say that some works have made it happen. Not in production though.
All of the code from the paper is open source: https://github.com/BAILOOL/Assistant-for-People-with-Low-Vision
  • asked a question related to Machine Vision
Question
3 answers
I was wondering if anyone knows of, or has published, a technique that successfully combines shallow (HOG, SIFT, LBP) with deep (GoogLeNet) representations? I am interested in both the image and video cases.
Relevant answer
Answer
Hi Konstantinos,
Fischer et al. showed that CNNs outperform local descriptors based on orientation histograms such as SIFT, HOG, SURF. Follow:
This was confirmed by state-of-art GoogLeNet. Follow:
Nevertheless, Benenson et al. highlighted "...although some of these features might be driven by learning, they are mainly hand-crafted via trial and error,..." and concluded about deep architectures: "...Despite the common narrative there is still no clear evidence that deep networks are good at learning features for pedestrian detection (when using pedestrian detection training data). Most successful methods use such architectures to model higher level aspects of parts, occlusions, and context. The obtained results are on par with DPM and decision forest approaches, making the advantage of using such involved architectures yet unclear...".
Follow:
As a compromise, combining deep learning and local descriptors can enhance computational performance. Follow:
Finally, Milan et al. very recently proposed a recurrent neural network to address online multi-target tracking. They cast "...the classical Bayesian state estimation, data association as well as track initiation and termination tasks as a recurrent neural net, allowing for full end-to-end learning of the model...".
Follow:
Regards
  • asked a question related to Machine Vision
Question
13 answers
I am a beginner in the area of computer vision. I have a basic doubt about 2D perspective projection. While analyzing the 3D-to-2D transform we take the focal length as the distance to the image plane (in almost all references). The focal length of a lens is actually the distance between the center of the lens and the point on the optical axis where parallel rays converge. So if we place the image plane at this point, how can we get the image? If any clarification of my question is needed, please mention it and I will elaborate. I hope for valuable replies; they will help me improve my basic knowledge in this area.
Relevant answer
Answer
See the book "Optical Metrology, 3rd edition", John Wiley & Sons, Chichester 2002.
  • asked a question related to Machine Vision
Question
3 answers
Hi,
I have a text file containing:
1. The names of images (Image Directory Locations to be read by algorithm)
2. Vectors which have image pixel positions as their values.
Can I use this data to train a Support vector regression model and get the pixel position vector for an unknown image?
I would very much appreciate if someone could guide me about my approach. In case someone can refer me to the relevant papers, I would be really grateful.
Thanks!
Relevant answer
Answer
Hi,
Thank you for your replies. I have checked the material that Alexander shared and it was helpful. However, let me rephrase my question:
My objective is to train the Support Vector Regression model for estimating the required contours in the unknown image. In order to achieve this task, I am thinking about preparing a data set. My training data consists of 64 (600*600) gray scale images, each image containing a known contour which can be fully described by the known pixel positions (points) on the image. The vector containing these points is (1*8) in length. So I have a text file which contains the directory/name of each image and the corresponding vector, so that I can read it into the SVR model and fit it. Now, my target is to feed the trained model with 14 testing images, for which I could get the 14 (1*8) pixel positions vector, each of them corresponding to their respective images.
There is a serious flaw in this approach, as regression assumes the dependent and independent variables have matching sizes, which in my case is not happening: the known variable X is the image (600*600) and the unknown variable Y is a vector (1*8) containing the points that define the required contour.
Any suggestion from here on is welcome. I am sorry my initial question was too abstract. One way I know is to use a (600*600) matrix of image features and then train the regression model. But asking the regression model to estimate 360,000 features based on an unknown image sounds like too much and is thus prone to strong uncertainty.
  • asked a question related to Machine Vision
Question
4 answers
I want to colorize grayscale images and infrared images. I am looking for the best algorithm. Complete accuracy and quality of coloring is needed, as well as speed.
Relevant answer
Answer
Which color scale is better, gray or rainbow?
What are the advantages of using a gray scale? Check the link below.
Why has the author used gray scale and not iron or rainbow?
  • asked a question related to Machine Vision
Question
11 answers
Hi everyone,
I have been dealing with a motion (or action) classification problem, in which each class instance in the dataset consists of a number of frames.
Since I extracted corresponding keypoints among those frames, I have a trajectory for each keypoint and, consequently, a bunch of trajectories for each instance.
And now I'm looking for how to describe these trajectories to create my feature vectors (instead of just putting the positions of the keypoints in my feature vector) and, later on, use them for classification.
Any help?
Relevant answer
Answer
I recommend reading the following paper, "Action Recognition by Dense Trajectories".
There is also code available to describe trajectories.
  • asked a question related to Machine Vision
Question
8 answers
I want to develop, perhaps, a new way/algorithm to detect cars in computer vision. What kind of algorithm can I dig into more deeply?
Relevant answer
Answer
You can work with motion-based algorithms and develop new background subtraction techniques, extending Gaussian Mixture Models and other Bayesian approaches further. There are several good surveys. One of the latest is by
T. Bouwmans, “Traditional and Recent Approaches in Background Modeling for Foreground Detection: An Overview”, Computer Science Review, May 2014,
available online.
There are other pattern recognition approaches that can be developed beyond the state of the art.
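A minimal OpenCV sketch of the motion-based approach described above, using the built-in Gaussian-mixture background subtractor (the video path and parameter values are illustrative assumptions):

# Hypothetical sketch: detect moving vehicles via background subtraction (MOG2).
import cv2

cap = cv2.VideoCapture("traffic.mp4")   # assumed input video
bg = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16, detectShadows=True)

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    fg = bg.apply(frame)                # foreground (motion) mask
    fg = cv2.morphologyEx(fg, cv2.MORPH_OPEN,
                          cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5)))
    contours, _ = cv2.findContours(fg, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        if cv2.contourArea(c) > 500:    # ignore small blobs (noise, shadows)
            x, y, w, h = cv2.boundingRect(c)
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("vehicles", frame)
    if cv2.waitKey(30) & 0xFF == 27:
        break

cap.release()
cv2.destroyAllWindows()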
  • asked a question related to Machine Vision
Question
11 answers
I'm a student with electrical/mechanical background, in my project I'm searching for a solution for a company who wants to start with 3D cameras for robotics.
At the moment I'm working with Matlab and it works great, the possibility to create your own GUI is a big plus.
But I read Matlab is more for developing purpose and is slower (overhead).
A second software package that I try to use is Halcon, at the moment I've no overview of the possibilities.
But it looks to me that you can program in Halcon's own language hdevelop or using their libraries in your own code (like C++).
Programming in HDevelop with its GUI seems to be easier/faster than low-level programming (e.g. C++), but I don't know the limitations.
A disadvantage is that there is no community for support, you need to use their documentation.
A third option I read a lot about is OpenCV, but with no low-level programming background this seems too ambitious for me.
I'm not searching the best solution for me, but for a company (although I know the company hasn't a lot of computer engineers).
I was hoping to find software with a good GUI to reduce low-level programming, Halcon seems to be the closest match.
Thanks for your help.
Relevant answer
Answer
Hi Mat,
I use Halcon; it's a very powerful tool, mainly for industrial purposes. For research it may be best used for processing steps that aren't your focus, because some functions are like a black box (that is their knowledge and marketing advantage).
There is a group on LinkedIn about Halcon with experienced users that gives faster answers than Halcon support.
And, YES, you can develop all your code and export it to other languages, or use HDevEngine, in which case any modification only requires replacing a Halcon file rather than compiling the whole app again.
  • asked a question related to Machine Vision
Question
8 answers
I want to segment a table in a depth image based on depth information obtained from a Kinect v2. The problem with the table is that it is in front of the camera and covers a large depth range. Depth thresholding also eliminates other objects in the scene that are at the same depth level as the table. Any idea would be highly appreciated!
Relevant answer
Answer
Please read my two recent papers about object detection with an RGB-D camera.
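Since simple depth thresholding removes other objects at the table's depth, one commonly used alternative is to fit the dominant plane with RANSAC and keep only its inliers. A minimal sketch with Open3D (a reasonably recent version is assumed; the file name and parameter values are illustrative):

# Hypothetical sketch: segment the table plane from a Kinect-style point cloud with RANSAC.
import open3d as o3d

pcd = o3d.io.read_point_cloud("kinect_scene.ply")   # assumed point cloud exported from the depth frame

# Fit the dominant plane; in a tabletop scene this is usually the table (or the floor).
plane_model, inliers = pcd.segment_plane(distance_threshold=0.01,   # 1 cm tolerance
                                         ransac_n=3,
                                         num_iterations=1000)
a, b, c, d = plane_model
print(f"Plane: {a:.3f}x + {b:.3f}y + {c:.3f}z + {d:.3f} = 0")

table = pcd.select_by_index(inliers)                 # points on the plane
objects = pcd.select_by_index(inliers, invert=True)  # everything else
table.paint_uniform_color([1, 0, 0])
o3d.visualization.draw_geometries([table, objects])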
  • asked a question related to Machine Vision
Question
5 answers
I want to extract features such as the eyes, nose, eyebrows, lips, etc. from a face image. I would like to use the SURF feature algorithm from the MATLAB Computer Vision System Toolbox. I need help with this: what would its output be? I want to run this across multiple images and then create a dataset from these features.
Can someone please help me and explain in detail how to create this dataset?
Relevant answer
Answer
Hi Prasad,
To create this dataset you first need to extract some features. You can see the links below to understand this process,
and then save your data in MATLAB with code like this:
% assumes A, B and C hold the extracted eye, nose and lip features
eye = [A];
nose = [B];
lips = [C];
saving = 'data.mat';
save(saving, 'eye', 'nose', 'lips')
best regards,
  • asked a question related to Machine Vision
Question
2 answers
Is there any research on how we can retrieve the phase of the thermal data from the thermal images captured by a thermal camera?
  • asked a question related to Machine Vision
Question
7 answers
Hello every body.
What is the simplest way to obtain the position and orientation of two postures of one plate in space wrt each other, knowing the Cartesian information (x, y, z) of three points on the plate wrt a camera in each posture?
To explain further: we have a plate of a device, e.g. the end effector of a parallel robot, and we can obtain the Cartesian information (x, y, z) of any desired point on it wrt the camera (by stereo vision). The goal is to measure the position and orientation of two postures of the plate in space wrt each other. I want to know ways to achieve this goal (especially the simplest ones). Note that we can put some markers on the plate, e.g. a black paper with three white circles on it, or a triangle on the paper, or...
wrt = with respect to
Relevant answer
Answer
Finally I found the solution:
Least-Squares Fitting of Two 3-D Point Sets
K. S. ARUN, T. S. HUANG, AND S. D. BLOSTEIN
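For reference, the method in that paper estimates the rigid transform between two corresponding 3-D point sets via SVD. A minimal NumPy sketch of that idea (the variable names and toy values are illustrative):

# Hypothetical sketch: least-squares rigid transform (R, t) between two 3-D point sets,
# following the SVD approach of Arun, Huang and Blostein.
import numpy as np

def rigid_transform_3d(P, Q):
    """P, Q: (N, 3) arrays of corresponding points. Returns R, t with Q ~ (R @ P.T).T + t."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)   # centroids
    H = (P - cP).T @ (Q - cQ)                 # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                  # fix a reflection if one occurs
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = cQ - R @ cP
    return R, t

# Toy usage: three marker points measured in two postures.
P = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.1, 0.0]])
true_R = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
Q = P @ true_R.T + np.array([0.5, 0.2, 0.0])
R, t = rigid_transform_3d(P, Q)
print(np.round(R, 3), np.round(t, 3))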
  • asked a question related to Machine Vision
Question
3 answers
There are some papers that use CCA for face recognition.
I've extracted two types of features from the ORL dataset, which has 400 images of 40 subjects. At this step, using each type of features the classification accuracy is about 98%, but when I transform the features to the new space using CCA, the accuracy falls to just 3%.
- Where is the problem?
- Are class labels not important for finding directions using CCA?
Thanks.
Relevant answer
Answer
Hi, Reza,
Actually, you can use a lot of methods that are much better than CCA.
Please find these methods in chapter 4 of this survey:
a comprehensive survey on pose-invariant face recognition, 2015
  • asked a question related to Machine Vision
Question
7 answers
I want to extract tables from scanned document images with the help of ML. Please suggest a robust method for extracting the tables.
I need to extract the table details with ML functions. I have OCR tools, but those extract text only.
Relevant answer
Answer
Dear Sabari Nathan,
the most significant properties of a line being a part of a table are
a) that it consists of non-continuous text flow, i.e. having sequences of characters alternating with intervals of empty space, and
b) that the intensity distribution (distribution of black and white pixels) of the line has a strong correlation with the distribution of the line above and the line below.
Both properties can be measured easily and can be used as a feature vector for classification in order to decide if the line is part of a table or not, e.g. using thresholds or any kind of machine learning technique.
If you are also looking for tables that do NOT cover the whole width of the sheet you have to apply the above method to smaller sections of the lines.
Good luck!
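A rough NumPy sketch of the two line-level features described above, computed on a binarized page image split into horizontal text-line bands (the band height and the use of a fixed band split are illustrative assumptions):

# Hypothetical sketch: per-band features for table detection on a binary page image
# (1 = ink, 0 = background), following properties (a) and (b) above.
import numpy as np

def line_features(page, band_height=30):
    """Split the page into horizontal bands and compute, for each band:
       (a) the fraction of empty gaps in its column profile (non-continuous text flow),
       (b) the correlation of its column profile with the neighbouring bands."""
    n_bands = page.shape[0] // band_height
    profiles = [page[i * band_height:(i + 1) * band_height].sum(axis=0) for i in range(n_bands)]
    feats = []
    for i, prof in enumerate(profiles):
        gap_fraction = float(np.mean(prof == 0))            # property (a)
        corrs = []
        for j in (i - 1, i + 1):
            if 0 <= j < n_bands and prof.std() > 0 and profiles[j].std() > 0:
                corrs.append(np.corrcoef(prof, profiles[j])[0, 1])   # property (b)
        neighbour_corr = float(np.mean(corrs)) if corrs else 0.0
        feats.append((gap_fraction, neighbour_corr))
    return np.array(feats)

# Each row of the returned array can be fed to a threshold rule or a classifier
# to decide whether the corresponding band belongs to a table.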
  • asked a question related to Machine Vision
Question
18 answers
For classification, suppose some m features are selected. Some of these may be cooperative (good for classification) and the rest may not be. What is a statistical measure for discarding the rest of the features?
Relevant answer
Answer
Hi,
You can use a correlation-based feature selection method. Alternatively, you can use a simple method based on Fisher's Discriminant Ratio, or use class separability measures such as scatter matrices, Bhattacharyya divergence, etc.
Maybe chapter 4 of the book below will help you:
Book: An Introduction to Pattern Recognition: A MATLAB Approach
By: Theodoridis, Sergios
Good luck!
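As an illustration of the Fisher's Discriminant Ratio idea mentioned above, here is a minimal NumPy sketch that scores each feature for a two-class problem and keeps the top-ranked ones (the number of features to keep and the toy data are arbitrary choices):

# Hypothetical sketch: rank features by Fisher's Discriminant Ratio (two-class case).
import numpy as np

def fisher_ratio(X, y):
    """X: (n_samples, n_features), y: binary labels. Returns one score per feature."""
    X0, X1 = X[y == 0], X[y == 1]
    num = (X0.mean(axis=0) - X1.mean(axis=0)) ** 2   # between-class separation
    den = X0.var(axis=0) + X1.var(axis=0) + 1e-12    # within-class spread
    return num / den

# Toy usage: keep the 10 most discriminative features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 30))
y = rng.integers(0, 2, size=100)
scores = fisher_ratio(X, y)
keep = np.argsort(scores)[::-1][:10]
X_selected = X[:, keep]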
  • asked a question related to Machine Vision
Question
7 answers
Hi friends,
I want to classify scanned document images; there are many methods, but they depend heavily on the text in the document. Please suggest a good algorithm that can classify the documents without using the text.
I have added a few sample images. For example, I have 500 documents with different layouts. If I feed an image into the engine, it should tell me which type the document is (e.g. Form 16a, W-2 tax).
Relevant answer
Answer
One possible approach could be to detect the straight lines first using Hough line detector. Hough line detector can identify lines with their position and orientation (slope).  As different layouts will have different numbers of horizontal and vertical lines placed at different positions, this (position & orientation of lines) could be a pretty good feature.  
Hope it helps.
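A minimal OpenCV sketch of the Hough-line idea above, turning the detected horizontal and vertical lines into a simple layout descriptor (the thresholds and bin counts are illustrative assumptions):

# Hypothetical sketch: describe a document layout by its detected straight lines.
import cv2
import numpy as np

img = cv2.imread("form_page.png", cv2.IMREAD_GRAYSCALE)   # assumed scanned page
edges = cv2.Canny(img, 50, 150)
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=100,
                        minLineLength=100, maxLineGap=10)

horizontal, vertical = [], []
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        angle = abs(np.degrees(np.arctan2(y2 - y1, x2 - x1)))
        if angle < 10:
            horizontal.append((y1 + y2) / 2)   # vertical position of a horizontal line
        elif angle > 80:
            vertical.append((x1 + x2) / 2)     # horizontal position of a vertical line

# A crude layout feature vector: line counts plus coarse position histograms.
h_hist, _ = np.histogram(horizontal, bins=10, range=(0, img.shape[0]))
v_hist, _ = np.histogram(vertical, bins=10, range=(0, img.shape[1]))
feature = np.concatenate(([len(horizontal), len(vertical)], h_hist, v_hist))
print(feature)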
  • asked a question related to Machine Vision
Question
7 answers
Hello forum,
I have been reading about cascaded classifiers using haar-features for face detection and I have a few simple questions I have to ask/clarify. This is more towards implementation as I am a little confused as to how they work.
1) I understand that during the training phase, the haar features will be evaluated and rescaled for all possible combinations. At the end, the feature with the smallest error will form the first stage (attached picture). My question is, during the detection phase when a sub-window is selected for evaluation, will the features be placed at a specific region (like in the attached picture again) ?
For example, for the top left feature, it must always be positioned in the center leaving an empty space of 10% (of the width) to the left and right and be 30% (of the height) below.
Or will evaluation start at the top left hand corner (assuming origin), similar to training ? i.e. the feature will be evaluated over all the regions in the subwindow.
2) Regarding adaboost, I have understood the steps but my question is, when the weights are updated after the nth iteration, is it possible that a feature that has been already selected, get selected again ? i.e. it has the smallest error again. Or will features/classifiers that have already been selected be "removed" from the subsequent selection process ?
I am really loving computer vision. I will be undergoing this module in 10 weeks when semester starts but, I can't wait for so long to officially start learning what I love haha. Thanks all.
Relevant answer
Answer
For question 1, I think what you should keep in mind is that classical Haar filters are sensitive to position changes. So, for the evaluated windows, the Haar filters selected in the training stages should be placed at the same positions as in the training images. However, because the sliding windows tend to have different scales than the training images, you need to 'align' the sliding window with the training scale via some technique before you place those Haar filters.
  • asked a question related to Machine Vision
Question
2 answers
Hi, we have worked on depth estimation from rectified images using OpenCV. The approach uses simple template matching with SAD. Does anybody know of a recent DENSE depth estimation algorithm for which there already exists an online accessible implementation? Thanks
Relevant answer
Answer
Anas Al-Nuaimi
I recommend you to follow this post, from Jay Rambhia, that uses the StereoSGBM algorithm.
I tried it a couple of days ago with great success with 2 ps3 eye cameras, in Linux:
Please note that before applying the StereoSGBM, you first need to calibrate and rectify both image cameras.
For the stereo calibration, follow this link:
If you need any help please ask me. Right now I am working deeply in this area.
Best regards
João Martinho Moura 
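A minimal OpenCV Python sketch of the StereoSGBM step mentioned above, assuming the left/right images are already calibrated and rectified (the parameter values are typical starting points, not tuned settings):

# Hypothetical sketch: dense disparity from a rectified stereo pair with StereoSGBM.
import cv2

left = cv2.imread("left_rect.png", cv2.IMREAD_GRAYSCALE)    # assumed rectified images
right = cv2.imread("right_rect.png", cv2.IMREAD_GRAYSCALE)

block = 5
sgbm = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=128,        # must be a multiple of 16
    blockSize=block,
    P1=8 * block * block,      # smoothness penalties commonly derived from the block size
    P2=32 * block * block,
    uniquenessRatio=10,
    speckleWindowSize=100,
    speckleRange=2,
)
disparity = sgbm.compute(left, right).astype("float32") / 16.0   # SGBM returns fixed-point values
disp_vis = cv2.normalize(disparity, None, 0, 255, cv2.NORM_MINMAX).astype("uint8")
cv2.imwrite("disparity.png", disp_vis)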
  • asked a question related to Machine Vision
Question
3 answers
I have the coordinates of two bounding boxes: one of them is the ground truth and the other is the result of my work. I want to evaluate the accuracy of mine against the ground-truth one, so I am asking if you have any suggestions. The bounding box details of the ground truth are saved in this format: [x y width height]. Thanks
Relevant answer
Answer
If "accuracy" is interpreted as how close the estimates are to the truth, then there is no formula for this (assuming Kalman filtering). You have to do Monte Carlo simulations and get statistics. If you mean "precision" then you can use the state estimation error covariance from the Kalman filter. Again, you have to accept that this changes with time. There is a steady state covariance in many cases, which can be calculated by solving a matrix Riccati equation, but you must solve numerically in general. Bearing in mind that even in this simple case, you are assuming that there are NO false measurements and the detection probability of the correct measurement is 1. Since you are at a French speaking institution (if I am correct), these two interpretations are "exactitude" and "précision," which are not the same in object tracking.
  • asked a question related to Machine Vision
Question
6 answers
I am detecting rings in an image and the distance between concentric rings of unknown radius.
Relevant answer
Answer
Hi Subah,
as it might be of interest for others or maybe anyone knows a more elegant way, please find my answer here below.
So if you have the centre of the circle, let's call it: XM,YM
%create a vector going through the centre of the circles in your image (img)
vector = img(XM,:); %sorry, it might be YM. I'm always mixed up in Matlab with X and Y
%now you have a vector containing all the pixel horizontal through the centre of the circles
%detect all white pixels within the vector
whites = find(vector==1);
%this gives you the indices of all white pixels
%I honestly don't understand what your first 3 rings really are. Hence I'm cutting them off
whites = whites(3:end);
%now take the difference between the rings --> I assume that each of your rings is a thin ring and you want to know the distance between these thin rings. Is that correct? This means that you have 2 white pixels for each ring. I assume you want to know the distance between the end of one ring and the beginning of the next ring. Is that correct? If yes:
%create a vector with the start indices of a ring and one with the end indices
vStart = whites(1:2:end);
vEnd = whites(2:2:end);
%exclude first start
vStart = vStart(2:end);
%calculate vector with distance of the start point of each new circle and the endpoint of the last circle
distance = vStart-vEnd; %you get a vector containing the distance of each circle
%if you want the average distance
avDistance = mean(distance);
Hope this helps.
Happy to explain more.
best,
Katrin
  • asked a question related to Machine Vision
Question
4 answers
I performed and recorded these tests in the field so my subjects blend in very well with the forest floor -- making automatic detection nearly impossible. I had ok success with a PanLab trial but I'm wondering if there are any good (and hopefully open source) alternatives.
Relevant answer
Answer
Dear Sophia,
have a look at Fiji (a distribution of ImageJ), which is an open source platform and offers a lot of plugins, also for tracking. There are preinstalled plugins for automatic (MTrack2, ToAST, TrackMate) and manual tracking (e.g. TrackMate, MTrackJ, etc.), and extending it is very easy. As you said, success strongly depends on the image (feature contrast) - let's play, there are plenty of tutorials available.
  • asked a question related to Machine Vision
Question
4 answers
Hi,
I am doing research in human action classification. I have used HOG features for classification but got low accuracy. Please suggest other features.
Relevant answer
Answer
You can give SIFT a try for feature extraction and Random Forests for classification.
  • asked a question related to Machine Vision
Question
4 answers
Histogram spread for a gray image is given in the paper, but there is no clear idea of how to compute it for an RGB image.
Relevant answer
Answer
The referenced paper defines histogram spread (HS) as the ratio between the interquartile range (IQR) and the range (R). A possible multidimensional extension of this idea would be to define an IQR-vector and an R-vector and calculate the ratio between their norms.
Assuming you are working in RGB space:
HS = ||IQR|| / ||R||
HS = ||(IQR_red, IQR_green, IQR_blue)|| / ||(R_red, R_green, R_blue)||
Assuming L2 norm...
HS = ((IQR_red ^ 2 + IQR_green ^ 2 + IQR_blue ^ 2)^(1/2)) / ((R_red ^ 2 + R_green ^ 2 + R_blue ^ 2)^(1/2))
HS = ((IQR_red ^ 2 + IQR_green ^ 2 + IQR_blue ^ 2)/(R_red ^ 2 + R_green ^ 2 + R_blue ^ 2))^(1/2)
Assuming L1 norm...
HS = (IQR_red + IQR_green + IQR_blue)/(R_red + R_green + R_blue)
(where IQR_channel and R_channel are the interquartile ranges and ranges of each monochromatic channel, respectively)
This assumes, of course, that the different channels are being measured in the same scale (which is a reasonable assumption, I guess), so that it makes sense to talk about metrics and norms.
Good luck.
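A small NumPy sketch of the per-channel idea above (the L1-norm variant), treating each RGB channel's interquartile range and range separately; an 8-bit image array is assumed:

# Hypothetical sketch: histogram spread of an RGB image as ||IQR|| / ||R|| (L1 norm).
import numpy as np

def histogram_spread_rgb(img):
    """img: (H, W, 3) uint8 array. Returns a scalar spread value in [0, 1]."""
    iqr, rng = [], []
    for ch in range(3):
        values = img[..., ch].ravel().astype(np.float64)
        q1, q3 = np.percentile(values, [25, 75])   # quartiles of the channel histogram
        iqr.append(q3 - q1)
        rng.append(values.max() - values.min())
    return sum(iqr) / max(sum(rng), 1e-12)

# Usage (toy data):
img = (np.random.rand(100, 100, 3) * 255).astype(np.uint8)
print(histogram_spread_rgb(img))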
  • asked a question related to Machine Vision
Question
5 answers
Currently, we have acquired video data of human actions performing martial arts movements. We want to segment the video frames into different actions (sequentially). Can anyone suggest what the best method so far is for this problem? Some good links are also welcomed. Thank you.
  • asked a question related to Machine Vision
Question
4 answers
I seek to write a code that would compute the dimensions of a room from a photo.
Relevant answer
Answer
Thanks a lot for all
  • asked a question related to Machine Vision
Question
4 answers
Is there a comprehensive taxonomy that can explain the state of the art of current abnormal events detection techniques from video?
Relevant answer
Answer
Hello. I imagine that it depends on what you mean by an abnormal event. From my point of view, this type of problem takes two forms. The first is an image processing problem where you look in a single frame for an object with a certain feature. This can be reduced to a parameter estimation problem. The second way is to view the problem as one of change detection, i.e., the abnormal event corresponds to a change in the dynamics of the underlying system (think of your car engine when something inside it breaks). There is a large literature on change detection, which can be approached in many ways. My experience of this type of problem is from the area of manoeuvring target tracking. Hopefully you can find some useful references for your work as quite a lot has been done in this area (see link), although the sensor is usually a radar system rather than a video sequence.
  • asked a question related to Machine Vision
Question
9 answers
I would like a method to calculate the curvature of a 2D object. The object is a matrix with n rows (corresponding to n consecutive points) and 2 columns (corresponding to the x and y coordinates).
Relevant answer
Answer
In 2D images, there are (at least) two types of curvature. One describing the intensity landscape (e.g., cup, cap, saddle, etc.) and the other describing the shape of the isophotes (curves of equal intensity).
The first is described by the principal curvatures, which are the eigenvalues of the Hessian matrix.
k1 = (Lxx + Lyy - sqrt(4*Lxy^2 + (Lxx-Lyy)^2))/2
k2 = (Lxx + Lyy + sqrt(4*Lxy^2 + (Lxx-Lyy)^2))/2
The second is the isophote curvature.
k = - (-2*Lx*Lxy*Ly + Lxx*Ly^2 + Lyy*Lx^2 ) / ((Lx^2 + Ly^2)^(3/2))
More details are provided in the book of B.M. ter Haar Romeny (Front end vision...).
In this notation, Lx is the first order derivative to x, Lxx is the second order derivative to x, Lxy is the partial derivative to x and to y, etc..
For curvature estimation, I recommend a derivative that is rotation invariant and (at least) twice differentiable. Otherwise, you measure the shape of the pixel grid, which is undesirable. So, a higher-order B-spline or a Gaussian derivative could be appropriate. The small derivative kernels [-1, 0, +1] are not recommended. More information about the Gaussian or B-spline derivatives can be found in [Bouma e.a., Fast and Accurate Gaussian Derivatives based on B-Splines, LNCS, 2007]. A full text PDF version is available in the following link:
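A small SciPy sketch of the isophote-curvature formula above, using Gaussian derivatives as recommended (the sigma value and toy image are arbitrary choices):

# Hypothetical sketch: isophote curvature of a 2-D image from Gaussian derivatives.
import numpy as np
from scipy.ndimage import gaussian_filter

def isophote_curvature(img, sigma=2.0):
    L = img.astype(np.float64)
    # order=(dy, dx): derivative order along rows (y) and columns (x)
    Lx  = gaussian_filter(L, sigma, order=(0, 1))
    Ly  = gaussian_filter(L, sigma, order=(1, 0))
    Lxx = gaussian_filter(L, sigma, order=(0, 2))
    Lyy = gaussian_filter(L, sigma, order=(2, 0))
    Lxy = gaussian_filter(L, sigma, order=(1, 1))
    num = -(Lxx * Ly**2 - 2 * Lx * Ly * Lxy + Lyy * Lx**2)
    den = (Lx**2 + Ly**2) ** 1.5 + 1e-12   # avoid division by zero in flat regions
    return num / den

# Usage (toy image):
img = np.random.rand(64, 64)
k = isophote_curvature(img)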
  • asked a question related to Machine Vision
Question
18 answers
I am using a 2-layer (one hidden layer) neural network classifier to classify my data (images), trained with the backpropagation algorithm.
The network works well and can classify with more than 90% accuracy, but the hidden-layer weights look different every time I train it. I am using the Matlab imagesc function to visualize the weights.
Relevant answer
Answer
An ANN is trained with randomised initial weights; that's why different starting conditions converge on different potential solutions. Take a look at this for the influence of the final layer on previous layers:
McLean, D., Z. Bandar, and J. D. O’Shea. "The evolution of a feedforward neural network trained under backpropagation." Artificial Neural Nets and Genetic Algorithms. Springer Vienna, 1998.
The journal website lets you see the first two pages for nothing; unfortunately there isn't a free online download at the moment.
  • asked a question related to Machine Vision
Question
9 answers
Hi everyone, I am new to Computer Vision and especially the Emgu CV library, but I need to do my project, which involves distance measurement using 2 identical web cameras.
So far I've done these steps:
1. Perform stereo calibration ( acquire intrinsic & extrinsic parameter )
2. Perform stereo rectification  ( use the parameters to rectify both images )
3. build disparity map using StereoSGBM algorithm
4. And try to acquire the distance value in certain pixel of the image plane using points = PointCollection.ReprojectImageTo3D(disparityMap, Q); 
but I have a problem understanding the meaning of each point's (x, y, z) value. I presume that the x and y values denote the coordinates in the image plane while z denotes the depth information.
Can anybody explain to me how I can convert z (the depth information) into a real-world distance?
Any information will be appreciated. Thank you very much.
Relevant answer
Answer
It is based on trigonometry. Once you have acquired intrinsic and extrinsic params from stereo calibration, you can use the following formula 
z = bf/d
where, f = focal length (in pixels), b = baseline (in meters), d = disparity (in pixels).
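A tiny sketch of that conversion, assuming the disparity map comes from OpenCV's StereoSGBM (whose raw output is scaled by 16) and that the baseline and focal length come from the stereo calibration; the numeric values are example placeholders:

# Hypothetical sketch: convert an SGBM disparity map to metric depth via z = f*b/d.
import numpy as np

focal_px = 700.0     # focal length in pixels, from stereo calibration (example value)
baseline_m = 0.12    # distance between the two cameras in meters (example value)

def depth_from_disparity(disp_raw):
    d = disp_raw.astype(np.float32) / 16.0           # SGBM outputs fixed-point disparities
    depth = np.full(d.shape, np.inf, dtype=np.float32)
    valid = d > 0
    depth[valid] = focal_px * baseline_m / d[valid]  # depth in meters
    return depth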
  • asked a question related to Machine Vision
Question
6 answers
Interested in doing some research in Computer Vision and Mobile Visual Search. Could you please suggest some novel ideas/issues that are emerging in that research topic?
  • asked a question related to Machine Vision
Question
7 answers
What is the approach to measure the real size (length, height and width) of an object from an image when I don't know the focal length or the object distance (I don't know the origin of the image, i.e. any technical details of the lens or the camera)?
Relevant answer
Answer
The only other factor you need is the height of the object in real life (otherwise you could be photographing a model which is much closer to the camera).
The maths isn't actually that complex, the ratio of the size of the object on the sensor and the size of the object in real life is the same as the ratio between the focal length and distance to the object.
To work out the size of the object on the sensor, work out its height in pixels, divide by the image height in pixels and multiply by the physical height of the sensor.
So the whole sum is:
distance to object (mm) = [focal length (mm) * real height of the object (mm) * image height (pixels)] / [object height (pixels) * sensor height (mm)]
Let's sanity check this equation.
If we keep everything else constant and increase the focal length then the distance increases (as focal length is on the numerator). This is what you would expect, if you have to zoom your lens to make one object the size another equally sized object used to be, the first object must be further away.
If we keep everything else constant and increase the real height of the object then again the distance increases as if two objects of different real heights appear the same height in the image the taller one must be further away.
If we keep everything else constant and increase the image height, then the distance increases, as if two objects (of the same size, remember we're keeping everything else constant) appear the same pixel size in a cropped and uncropped image then the object in the uncropped image must be further away.
If we keep everything else constant and increase the object height in pixels then the distance decreases (we're on the denominator now): two equally sized objects, one takes up more pixels, it must be closer.
Finally if we keep everything else constant and increase sensor size, then distance decreases: two equally sized objects have the same height in pixels when shot with a compact (small sensor, where 20mm is a long lens) and shot with a DSLR (large sensor where 20mm is a wide lens), then the object in the DSLR image must be further away (because it appeared the same size but with a wide lens).
I saw that from the website and hope it could help you.
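A direct transcription of that formula as a small helper function (the example numbers are made up purely to show the units):

# Hypothetical sketch: object distance from known real size, focal length and sensor size.
def distance_to_object_mm(focal_mm, real_height_mm, image_height_px,
                          object_height_px, sensor_height_mm):
    # distance = f * real_height * image_height / (object_height_px * sensor_height)
    return (focal_mm * real_height_mm * image_height_px) / (object_height_px * sensor_height_mm)

# Example: 50 mm lens, 1800 mm tall person, 4000 px tall image,
# person spans 800 px, 24 mm tall sensor -> about 18,750 mm (18.75 m).
print(distance_to_object_mm(50, 1800, 4000, 800, 24))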
  • asked a question related to Machine Vision
Question
2 answers
I have read about ASM and the discrete symmetry operator, and I got the main idea.
But I got confused by the bundle of functions I did not understand. Is there any simplified illustration of both of them?
Relevant answer
Answer
This is a good question.
Discrete symmetry is nicely explained in
S.V. Smirnov, Adler map for Darboux q-chain, Moscow State University:
See mapping (4), page 2 (see also the Proposition on the same page).
More to the point, consider
A.W.M. El Kaffas, Constraining the two Higgs double method with CP-violation, Ph.D. thesis, University of Bergen, Norway, 2008:
Symmetry is related to the harmony, beauty and unity of a system so that under certain transformations of a physical system, parts of the system remain unchanged (p. 6).   A discrete symmetry describes non-continuous changes in a system.  Such a symmetry flips a system from one state to another state.   See Section 2.1.1, starting on page 6. 
  • asked a question related to Machine Vision
Question
8 answers
As we know, FPGAs are suited to parallel and pipelined processing. In this regard, can we use FPGAs to accelerate big data problems from a computer vision perspective?
Relevant answer
Answer
Agreed. FPGA power consumption and ipso facto heat is lower. Design time is higher for FPGAs but tools are improving. Another issue is the requirement of specialized hardware. Your FPGA-utilizing code will not run on an "off-the-shelf" machine. However, the FPGA could be delivered to market on a standard bus board, e.g. for PCI, as a software / hardware package. Oracle and others are doing well with proprietary systems. Very large corporations, e.g. Google, Amazon, etc. can afford to design and build their own complete systems. Still other, not so large firms such as hedge funds, brokerages and others keep their proprietary analytics under lock and key as this is their "bread and butter". Not too long ago a major brokerage had an employee theft of code which resulted in prison time for the culprit. Not only is hardware harder to steal, algorithms patented as circuits are much easier to defend in court with a hundred years of case law behind circuit design. As with massively parallel subsystems, you are developing new algorithms (not simply code) for critical sections which may be 10% of your overall system but 90% of your current processing time, and you are thus working with a relatively small section of your overall code. But it is a complex overall decision not to be taken lightly.
  • asked a question related to Machine Vision
Question
2 answers
Can anyone help me to understand the hand-labeled colour space?
Relevant answer
Answer
I am a little confused about the context of the question, but I am assuming that you are asking from the perspective of automatic color segmentation in images for some machine vision application. I think that in this case, when they use learning algorithms, they train them on some test images which are segmented and color-labeled by hand, so that the learning algorithms are tuned on them before proceeding to verification and the real application. These hand-labeled images are known to be a part of the hand-labeled color space; a more mathematically correct phrasing would be that "these images span the hand-labeled color space".
I am giving the link to a paper in which the perspective I have mentioned above is used, hope you find it useful.
  • asked a question related to Machine Vision
Question
10 answers
Does anyone have any experience on the development of a domestic robot's ability to locate itself in an indoor environment?
For your answer, take into account the possibility of having a camera on the robot (image recognition may be a way to go?).
I believe it may be necessary to take multiple inputs. For example, an image recognition algorithm, together with a "dead-reckoning" method, such as estimating displacement as a function of the revolution of the robot's wheels, could be used to estimate the position of the robot.
All feedback would be greatly appreciated, as I am just starting with this investigation.
Thank you very much!
Relevant answer
Answer
You could have a look at RatSLAM, as it would fit your constraints very well (works indoor + outdoor, uses a camera as input). There is an open source version of it available, too: OpenRatSLAM.
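As a starting point for the dead-reckoning part mentioned in the question, here is a minimal differential-drive odometry sketch. The wheel radius, track width and encoder interface are placeholder assumptions; in practice this estimate drifts and should be fused with the camera-based estimate (e.g. from RatSLAM).
```python
import math

# Minimal dead-reckoning sketch for a differential-drive robot.
# WHEEL_RADIUS and TRACK_WIDTH are placeholder values (metres).
WHEEL_RADIUS = 0.03
TRACK_WIDTH = 0.15   # distance between the two drive wheels

def update_pose(x, y, theta, d_rev_left, d_rev_right):
    """Advance the pose (x, y, theta) given the wheel revolutions
    measured since the last update."""
    d_left = 2.0 * math.pi * WHEEL_RADIUS * d_rev_left
    d_right = 2.0 * math.pi * WHEEL_RADIUS * d_rev_right
    d_center = 0.5 * (d_left + d_right)          # forward displacement
    d_theta = (d_right - d_left) / TRACK_WIDTH   # change in heading
    x += d_center * math.cos(theta + 0.5 * d_theta)
    y += d_center * math.sin(theta + 0.5 * d_theta)
    theta = (theta + d_theta) % (2.0 * math.pi)
    return x, y, theta

# Example: both wheels turned a quarter revolution -> straight-line motion.
pose = (0.0, 0.0, 0.0)
pose = update_pose(*pose, 0.25, 0.25)
print(pose)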
  • asked a question related to Machine Vision
Question
6 answers
In the attached paper, I have a problem with the mean-shift vector in Equation 10.
Does M(x), the mean-shift vector, contain both Vx and Vy, or is it a single value?
Relevant answer
Answer
I think (as Partha said) that the mean-shift vector can be expressed in n-dimensional space as M = a·x1 + b·x2 + c·x3 + ... + z·xn; in the 2-dimensional case the scalar coordinates are x = a and y = b.
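To make the 2-D case concrete, here is a minimal numerical sketch of the mean-shift vector with a flat kernel of bandwidth h: M(x) is the difference between the local mean of the samples inside the kernel and the current point x, so it has two components (Vx, Vy). The sample points and the bandwidth below are made up for illustration and are not tied to Equation 10 of the attached paper.
```python
import numpy as np

def mean_shift_vector(x, points, h):
    """Mean-shift vector at x for a flat (uniform) kernel of radius h.
    x: (2,) current position; points: (N, 2) data samples.
    Returns the 2-D vector M(x) = mean(neighbours) - x, i.e. (Vx, Vy)."""
    d = np.linalg.norm(points - x, axis=1)
    neighbours = points[d <= h]
    if len(neighbours) == 0:
        return np.zeros(2)
    return neighbours.mean(axis=0) - x

# Toy example: samples clustered around (1, 2); M(x) points towards them.
rng = np.random.default_rng(0)
pts = rng.normal(loc=[1.0, 2.0], scale=0.2, size=(200, 2))
x = np.array([0.0, 0.0])
print(mean_shift_vector(x, pts, h=3.0))   # approximately [1, 2]
```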
  • asked a question related to Machine Vision
Question
3 answers
My only problem is how to calculate the kernel density of the target model and the target candidates from histograms (for grayscale images).
Suppose we have histograms H1 and H2 of the target model and the target candidate, respectively. I want to compute the Gaussian kernel density.
Relevant answer
Answer
If H1 is your data hypothesis (some rule for the approximated density function E1) and H2 is your target hypothesis (some rule for the approximated density function E2), then, provided the E1 and E2 data do not greatly violate a Gaussianity test, you could in principle use the kernel estimators from EM / k-means clustering algorithms.
It may prove useful to look into implementations: [1] adaptive Gaussian filtering (pdf_agf / pdf_knn), or [2] a kernel density estimator toolbox.
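A simple alternative, sketched below under my own assumptions (bin layout, bandwidth sigma, toy histograms), is to place a Gaussian at each histogram bin centre weighted by the normalised bin count; the Bhattacharyya coefficient between the two resulting densities is then a common similarity measure between target model and candidate in mean-shift tracking.
```python
import numpy as np

def kde_from_histogram(hist, bin_centers, sigma, grid):
    """Gaussian kernel density estimate built from a histogram.
    hist: bin counts; bin_centers: grey level of each bin;
    sigma: kernel bandwidth; grid: grey levels at which to evaluate."""
    w = hist / hist.sum()                          # normalised bin weights
    diff = grid[:, None] - bin_centers[None, :]    # (G, B) pairwise differences
    kernels = np.exp(-0.5 * (diff / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    density = kernels @ w
    return density / density.sum()                 # renormalise on the grid

grid = np.arange(256, dtype=float)
centers = np.arange(8, 256, 16, dtype=float)       # 16 bins of width 16
H1 = np.random.default_rng(1).integers(1, 100, size=16).astype(float)  # toy target model
H2 = np.random.default_rng(2).integers(1, 100, size=16).astype(float)  # toy target candidate
p = kde_from_histogram(H1, centers, sigma=8.0, grid=grid)
q = kde_from_histogram(H2, centers, sigma=8.0, grid=grid)
bhattacharyya = np.sum(np.sqrt(p * q))             # similarity in [0, 1]
print(bhattacharyya)
```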
  • asked a question related to Machine Vision
Question
5 answers
In the classic snake method there is a formula (a sum or integral) that defines the overall internal energy of the curve using its first and second derivatives. I can calculate this energy, but I cannot use it to evolve the curve. I want to know the procedure for curve evolution.
Relevant answer
Answer
In the classical snake, the internal energy is divided into two parts:
- The first derivative, which tries to maintain the points equally spread along the curve.
 - The second derivative, which tries to keep all the points in a straight line.
This means that if you have a closed curve and no other energy, the points will tend to shrink until they collapse into a single point. I am not sure whether this is your problem, but if it is, you need to add other energies, since the internal energy only smooths the curve. When employing snakes for segmentation it is common to use the gradient to adapt the contour to the image contours. Other options are the Mumford-Shah functional or mixing different features. For instance, if you have a vertebra in CT, finding the points that maximize a mixture of the gradient and the intensity may help to recover the real vertebral shape.
Hope it helps.
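On the evolution procedure itself: the usual scheme from the original Kass-Witkin-Terzopoulos formulation is a semi-implicit gradient descent: build a pentadiagonal matrix A from the first- and second-derivative (elasticity/rigidity) terms and iterate x_{t+1} = (gamma*I - A)^{-1} (gamma*x_t + F_ext). Below is a minimal numpy sketch for a closed curve; the external force is left as a zero placeholder, and the alpha, beta, gamma values are illustrative only.
```python
import numpy as np

def internal_matrix(n, alpha, beta):
    """Periodic matrix A such that A @ x approximates
    alpha * x'' - beta * x'''' on a closed curve with n points."""
    D2 = np.roll(np.eye(n), -1, axis=1) - 2 * np.eye(n) + np.roll(np.eye(n), 1, axis=1)
    D4 = D2 @ D2
    return alpha * D2 - beta * D4

def evolve_snake(x, y, alpha=0.1, beta=0.01, gamma=1.0, iters=200):
    """Semi-implicit evolution: (gamma*I - A) x_{t+1} = gamma*x_t + F_ext."""
    n = len(x)
    A = internal_matrix(n, alpha, beta)
    M = np.linalg.inv(gamma * np.eye(n) - A)
    fx = np.zeros(n)   # placeholder external force (e.g. image gradient)
    fy = np.zeros(n)
    for _ in range(iters):
        x = M @ (gamma * x + fx)
        y = M @ (gamma * y + fy)
    return x, y

# With only the internal energy, a closed circle shrinks towards a point,
# which illustrates why an external (image) force is needed in practice.
t = np.linspace(0, 2 * np.pi, 100, endpoint=False)
x, y = evolve_snake(50 + 20 * np.cos(t), 50 + 20 * np.sin(t))
```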
  • asked a question related to Machine Vision
Question
21 answers
I know the scale-invariant SIFT feature, but this technique is computationally demanding. Another feature is the Histogram of Oriented Gradients (HOG), which is efficient and can be made rotation invariant. But I have not found any feature that is invariant to RTI and scale changes yet has a low computational cost.
Can anyone suggest such features, if they exist?
Can I use a combination of these features? If yes, how can I combine them, given that they have different sizes?
Relevant answer
Answer
There are many, such as the following (a quick low-cost sketch with BRISK is given after the list):
Scale-invariant feature transform (SIFT)
Speeded Up Robust Features (SURF)
Binary Robust Invariant Scalable Keypoints (BRISK)
Fast Retina Keypoint (FREAK) ...
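For a low-cost option from this list, a binary descriptor such as BRISK is usually the first thing to try; here is a minimal sketch with OpenCV's Python bindings (the file names are placeholders).
```python
import cv2

# Minimal sketch: detect and describe keypoints with BRISK (binary,
# rotation- and scale-invariant, cheap to match), then match them.
# 'img1.png' and 'img2.png' are placeholder file names.
img1 = cv2.imread('img1.png', cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread('img2.png', cv2.IMREAD_GRAYSCALE)

brisk = cv2.BRISK_create()
kp1, des1 = brisk.detectAndCompute(img1, None)
kp2, des2 = brisk.detectAndCompute(img2, None)

# Hamming distance is the natural metric for binary descriptors.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
print(len(matches), 'matches; best distance:', matches[0].distance)
```
On the combination question: descriptors of different sizes are usually not concatenated at the keypoint level; a common approach is to normalise each descriptor type separately and concatenate the resulting image-level encodings (e.g. bag-of-words histograms), or to train one classifier per descriptor and fuse their scores.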
  • asked a question related to Machine Vision
Question
38 answers
I would like methods that do not need specific hardware.
Relevant answer
Answer
Hi Alireza,
Unlike previous suggestions, personally I would not go for color image segmentation or Hough transforms. Both methods may work in laboratory conditions, but in my humble opinion they do not work well in relatively unconstrained environments. On the one hand, it is very difficult to calibrate color information in changing scenarios (e.g. outdoors or at different times of day). On the other hand, the Hough transform is very prone to local minima, very dependent on the success of edge detection techniques (which may fail badly under blur, noise or clutter), and quite slow.
I definitely would apply a boosting algorithm over descriptors that are powerful enough, such as SURF or HOG, in order to obtain a cascade of classifiers. A typical AdaBoost will do the trick. It is the usual approach for face detection, and the variability of negative examples is much lower in the case of eyes, as negative examples are simply "other parts of the face", instead of the more challenging "non-face images". I also recommend implementing a bootstrapping process after the selection of each filter, so that new false positives are harvested and the cascade gets progressively more precise - that is the most awesome principle of boosting. You could also apply simpler descriptors like Haar or LBP, but in my experience this would expand the training time and the cascade size, and the results will not be better.
You can check recent improvements in this field for object detection, most notably SURF cascade (https://sites.google.com/site/leeplus/publications/learningsurfcascadeforfastandaccurateobjectdetection) and Soft Cascade. With less than 10 filters you will obtain very accurate, very fast and computationally inexpensive results for real-time implementations, and you will just require a simple webcam.
Finally, if you don't want to lose time implementing your own training algorithm (I don't recommend using the OpenCV cascade training routines), I also recommend some very nice open-source libraries that provide you fiducial landmarks, eyes being among them:
- Flandmark (fast and quite reliable)
- Stasm (more precise and many landmarks, but much slower)
- And of course, OpenCV itself has some trained LBP (or Haar?) eye cascades. (In my experience this is the least robust one, although very fast).
Hope this helps,
Carles
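Regarding the last option above (OpenCV's pre-trained eye cascades), a minimal sketch with the Python bindings might look like the following; the image path is a placeholder, the detectMultiScale parameters usually need tuning, and it is assumed that the opencv-python package is installed, which ships the cascade XML files under cv2.data.haarcascades.
```python
import cv2

# Minimal sketch: eye detection with OpenCV's pre-trained Haar eye cascade.
# 'face.jpg' is a placeholder file name.
img = cv2.imread('face.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_eye.xml')
eyes = eye_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in eyes:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite('eyes_detected.jpg', img)
```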
  • asked a question related to Machine Vision
Question
12 answers
I am looking for face detection methods that work well in poor lighting conditions.
Relevant answer
Answer
Most deployed face detection methods use boosted decision trees of Haar-like features. OpenCV has a free implementation. Mark up a few thousand photos manually under your "unsuitable lighting conditions" (using images of exactly the same kind that you will use later) and train on them. It will save you months of time otherwise needed to learn how to make neural networks work...
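One practical addition (my suggestion, not part of the answer above): since the main difficulty here is the lighting, it often helps to normalise local contrast, e.g. with CLAHE, before running the cascade. A minimal sketch follows; the file name is a placeholder and the parameters are illustrative.
```python
import cv2

# Minimal sketch: equalise local contrast (CLAHE) before running a
# pre-trained frontal-face Haar cascade. 'scene.jpg' is a placeholder.
img = cv2.imread('scene.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
norm = clahe.apply(gray)

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
faces = face_cascade.detectMultiScale(norm, scaleFactor=1.1, minNeighbors=4)
print('faces found:', len(faces))
```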
  • asked a question related to Machine Vision
Question
7 answers
Which can later be used for SVM classification, with segmentation being involved as well.
Relevant answer
Answer
(Unfortunately I could not attach the doc file containing the following text. Therefore I just pasted the text here.)
• In applications like face recognition, where a vector (or a matrix) is extracted from the whole sample image:
1. Extract a histogram (first-order or second-order) of the image.
   First-order histogram (MATLAB):
      im = imread('tire.tif');
      nf = 20;
      featureVec = imhist(im, nf); % with any arbitrary value of nf
      % OR
      featureVec = imhist(im); % with the default nf = 256
   Second-order histogram, here GLCM (MATLAB):
      im = imread('tire.tif');
      glcm = graycomatrix(im);
      temp = graycoprops(glcm);
      featureVec(1) = temp.Contrast;
      featureVec(2) = temp.Correlation;
      featureVec(3) = temp.Energy;
      featureVec(4) = temp.Homogeneity;
2. This feature vector can be used in any classification/segmentation framework.
• In applications like image (pixel) classification, where a vector (or a matrix) is extracted for each pixel of the sample image:
1. Extract a histogram (first-order or second-order) of a neighborhood of each pixel (e.g. a square window around the pixel).
   First-order histogram (MATLAB):
      im = imread('tire.tif');
      [Nx, Ny] = size(im); % suppose the image is a single-band image
      w = 3; % neighborhood window --> (2w+1)-by-(2w+1)
      nf = 50;
      extIm = padarray(im, [w, w], 'symmetric');
      featureVec = zeros(Nx, Ny, nf); % memory allocation
      for x = 1+w:Nx+w
          for y = 1+w:Ny+w
              WIN = extIm(x-w:x+w, y-w:y+w);
              featureVec(x-w, y-w, :) = imhist(WIN, nf);
          end
      end
   Second-order histogram, here GLCM (MATLAB):
      im = imread('tire.tif');
      [Nx, Ny] = size(im); % suppose the image is a single-band image
      w = 3; % neighborhood window --> (2w+1)-by-(2w+1)
      extIm = padarray(im, [w, w], 'symmetric');
      nf = 4; % graycoprops gives 4 features
      featureVec = zeros(Nx, Ny, nf); % memory allocation
      for x = 1+w:Nx+w
          for y = 1+w:Ny+w
              WIN = extIm(x-w:x+w, y-w:y+w);
              glcm = graycomatrix(WIN);
              temp = graycoprops(glcm);
              featureVec(x-w, y-w, 1) = temp.Contrast;
              featureVec(x-w, y-w, 2) = temp.Correlation;
              featureVec(x-w, y-w, 3) = temp.Energy;
              featureVec(x-w, y-w, 4) = temp.Homogeneity;
          end
      end
2. These feature vectors can be used in any classification/segmentation framework.
  • asked a question related to Machine Vision
Question
4 answers
I am using an A4Tech webcam, and I connect it to OpenCV via Visual Studio. The problem is that I cannot analyse its frames: the program builds completely, but if the main code uses the frame properties, it will not start running. The memory of the frames (frame = cvQueryFrame(capture)) is not accessible. This problem occurs only with the external webcam; the laptop's embedded webcam does not have this problem. I have attached the .cpp code that I have written.
Relevant answer
Answer
In case it helps, I have a number of example OpenCV applications that use one, two, or more cameras ... Found here - http://mercury.pr.erau.edu/~siewerts/extra/code/, specifically here - http://mercury.pr.erau.edu/~siewerts/extra/code/computer-vision/, and a paper on using 2 webcams here - http://www.ibm.com/developerworks/library/bd-mdasecurity/
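For reference, the question uses the old C API (cvQueryFrame); with the newer cv2.VideoCapture API the external camera is usually opened by index and each frame checked before use, which makes the "inaccessible frame" case explicit. A minimal Python sketch follows; camera index 1 is an assumption, as the external webcam may enumerate differently on your machine.
```python
import cv2

# Minimal sketch: open an external webcam (often index 1 when a laptop
# has a built-in camera at index 0) and verify every frame before use.
cap = cv2.VideoCapture(1)
if not cap.isOpened():
    raise RuntimeError('Could not open the external webcam')

while True:
    ret, frame = cap.read()
    if not ret or frame is None:      # guard against inaccessible frames
        print('Frame grab failed')
        break
    cv2.imshow('external webcam', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```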