Science topic
Machine Vision - Science topic
Explore the latest questions and answers in Machine Vision, and find Machine Vision experts.
Questions related to Machine Vision
Hello, can you recommend a machine vision architecture that can detect at least one object with a low-resolution camera (or a high-resolution one) at more than 15 FPS (frames per second) on a Raspberry Pi 5 (8 GB RAM)?
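Not a definitive recommendation, but a single-stage detector such as MobileNet-SSD or a YOLO "nano" variant at a small input size is the usual starting point on a Pi-class board. A minimal OpenCV sketch, assuming the MobileNet-SSD Caffe files have been downloaded locally (the file names are placeholders):

```python
import cv2
import numpy as np

# Placeholder file names: any lightweight detector exported to a format that
# cv2.dnn can read (Caffe, ONNX, TensorFlow) is used the same way.
net = cv2.dnn.readNetFromCaffe("MobileNetSSD_deploy.prototxt",
                               "MobileNetSSD_deploy.caffemodel")

cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 320)    # a low capture resolution helps FPS
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 240)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    h, w = frame.shape[:2]
    # 300x300 is the input size MobileNet-SSD was trained with
    blob = cv2.dnn.blobFromImage(frame, 0.007843, (300, 300), 127.5)
    net.setInput(blob)
    detections = net.forward()            # shape (1, 1, N, 7)
    for i in range(detections.shape[2]):
        if detections[0, 0, i, 2] > 0.5:  # confidence threshold
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            x1, y1, x2, y2 = box.astype(int)
            cv2.rectangle(frame, (int(x1), int(y1)), (int(x2), int(y2)),
                          (0, 255, 0), 2)
    cv2.imshow("detections", frame)
    if cv2.waitKey(1) == 27:              # press Esc to quit
        break
cap.release()
cv2.destroyAllWindows()
```

Quantizing the model or using an accelerator (e.g. a USB NPU stick) is usually what pushes such a pipeline comfortably past 15 FPS on a Pi.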
Seeking insights on leveraging deep learning techniques to improve the accuracy and efficiency of object recognition in machine vision systems.
The 2024 5th International Conference on Computer Vision, Image and Deep Learning (CVIDL 2024) will be held on April 19-21, 2024.
Important Dates:
Full Paper Submission Date: February 1, 2024
Registration Deadline: March 1, 2024
Final Paper Submission Date: March 15, 2024
Conference Dates: April 19-21, 2024
---Call For Papers---
The topics of interest for submission include, but are not limited to:
- Vision and Image technologies
- DL Technologies
- DL Applications
All accepted papers will be published by IEEE and submitted for inclusion into IEEE Xplore subject to meeting IEEE Xplore's scope and quality requirements, and also submitted to EI Compendex and Scopus for indexing.
For More Details please visit:
#specialissue #CallforPapers
📢 The CMC-Computers, Materials & Continua new special issue "Machine Vision Detection and Intelligent Recognition" is open for submission now.
Machine vision detection and intelligent recognition are important research areas in computer vision with wide-ranging applications in manufacturing, healthcare, security, transportation, robotics, industrial production, aerospace, and many other industries.
This is a great opportunity for researchers and practitioners to share their latest findings and contribute to the advancement of the field.
📆 The deadline for manuscript submission is 31 December 2023.
👉 To submit your manuscript, please visit the following link: https://www.techscience.com/cmc/special_detail/machine-vision-detection
We look forward to your contributions to this exciting special issue!
Hello friends,
I want to choose a backlight system to inspect defects in hot glass bottles. Given that hot bottles emit infrared light, do you think a white-light or an infrared backlight is better?
BR.
In many natural gemstones the color distribution is not uniform, so is there a suitable method to separate and quantify the color clusters?
What we should ask, instead, is how to develop more informed and self-aware relationships with technologies that are programmed to take advantage of our liability to be deceived. It might sound paradoxical, but to better comprehend AI we need first to better comprehend ourselves. Contemporary AI technologies constantly mobilize mechanisms such as empathy, stereotyping, and social habits. To understand these technologies more deeply, and to fully appreciate the relationship we are building with them, we need to interrogate how such mechanisms work and which part deception plays in our interaction with "intelligent" machines.
Image Source: https://www.maize.io/news/into-the-unknown/
I am currently searching for a topic for my research, which is about using machine vision and object recognition to control a robot (serial or parallel, it does not matter). Unfortunately, I cannot find a problem to be solved. Can anyone recommend some new points of research?
Hello everyone,
I want to select a camera (lens + sensor) for a machine vision system. According to machine vision books, the modulation transfer function (MTF) of the camera is an important factor that affects algorithm performance.
How can I select a suitable MTF value for the camera of my machine vision system?
Are there any criteria for determining suitable image contrast for the machine vision algorithms?
Can you please tell me other factors that I must consider for selecting a camera for a machine vision system?
Thank you very much in advance.
I'm researching autoencoders and their applications in machine learning, but I have a fundamental question.
As we all know, there are various types of autoencoders, such as the Stacked Autoencoder, Sparse Autoencoder, Denoising Autoencoder, Adversarial Autoencoder, Convolutional Autoencoder, Semi-Autoencoder, Dual Autoencoder, Contractive Autoencoder, and others that are improved versions of what we had before. Autoencoders are also known to be used in Graph Networks (GN), Recommender Systems (RS), Natural Language Processing (NLP), and Computer Vision (CV). This is my main concern:
Because the input and structure of each of these machine learning problems are different, which version of the autoencoder is appropriate for which machine learning problem?
I want to find the object's length with decent precision. The laminate can have different lengths, and we assume that we don't know the actual length. Therefore I basically want an algorithm that can calculate the length from an image.
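A minimal OpenCV sketch of one way to do this, assuming a roughly fronto-parallel view and a known pixel-to-millimetre scale from calibration (the file name and scale value below are placeholders):

```python
import cv2

# Assumption: the camera looks straight down at the laminate and has been
# calibrated so that one pixel corresponds to a known physical size.
MM_PER_PIXEL = 0.25   # placeholder scale from calibration

img = cv2.imread("laminate.png", cv2.IMREAD_GRAYSCALE)
_, mask = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
laminate = max(contours, key=cv2.contourArea)   # assume it is the largest blob

# minAreaRect handles a rotated part; the longer side is taken as the length
(_, _), (w, h), _ = cv2.minAreaRect(laminate)
length_mm = max(w, h) * MM_PER_PIXEL
print(f"estimated length: {length_mm:.1f} mm")
```

If the viewing geometry is not fronto-parallel, a homography from four known reference points (or a full camera calibration) is needed before the pixel measurement is meaningful.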
I am developing a machine-learning model to make accurate and fast counts of metal pipes with different cross-sectional shapes. Well-defined rectangular, triangular, and circular shapes are quite OK to handle, but the C-shaped metal is really complicated, especially when the pipes overlap one another as shown in the attached photo. Does anyone have a suggestion for a model that can count overlapping objects? Thanks in advance.
There are many, many datasets for computer vision tasks such as object detection and the like, but benchmarks for automated visual inspection tasks (e.g. detection of surface defects, bulk material classification) are hard to come by. I've searched in the usual places (http://www.cvpapers.com/datasets.html, http://www.computervisiononline.com/datasets, google) but came up with nothing. Do you know of such datasets (synthetic or natural)?
Assuming an organization could afford high-performance supercomputing to work on high-resolution images directly, without the need for image downscaling in machine vision applications, what is the compelling need for super-resolution algorithms that downsample and upsample an existing high-resolution image?
Video of a driver while driving: fatigue, drowsiness, distraction.
I'm a newbie in the field of deep reinforcement learning with a background in linear algebra, calculus, probability, data structures, and algorithms. I have 2+ years of software development experience. In undergrad, I worked on tracking live objects from a camera using C++ and OpenCV. Currently, I'm intrigued by the work being done in Berkeley DeepDrive (https://deepdrive.berkeley.edu/project/deep-reinforcement-learning). How do I gain the knowledge to build a theoretical model of a self-driving car? What courses should I take? What projects should I do?
I need a MATLAB implementation of the JSEG image segmentation algorithm for content-based image retrieval.
Hi
For our research project we would like to perform collision detection and obstacle avoidance with machine vision. I would like to use deep learning or, maybe even better, extreme online learning techniques.
Can someone please suggest platforms and tools for getting started with machine vision and deep learning/extreme online learning? I would appreciate your help.
Thanks
Dear colleagues,
I need the cropped version of FERET face database. Please provide a link to download it. Thanks.
Hi
I have a machine vision project for a crane in construction. It must meet the requirements for swing-angle detection and obstacle avoidance from the crane at a distance of around 40 m. So I need a high-quality monocular or stereo camera that can support swing-angle detection and obstacle avoidance at a distance of around 40 meters and meet the other requirements, such as dynamic range. Can you please suggest any monocular or stereo camera that meets such requirements?
Thanks
I would like to estimate the built-up area by using image edge detection. However, after detecting the edges in the image, I need to close the area between the edges. Can anyone guide me?
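A rough OpenCV sketch of one common recipe (the file name and kernel size are placeholders): close the edge map morphologically so broken edge fragments join, then fill the outer contours and measure the filled fraction:

```python
import cv2
import numpy as np

img = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img, 50, 150)

# Morphological closing joins broken edge fragments into closed boundaries
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
closed = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, kernel)

# Fill every closed outer boundary to obtain solid regions
contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
filled = np.zeros_like(img)
cv2.drawContours(filled, contours, -1, 255, thickness=cv2.FILLED)

built_up_fraction = (filled > 0).mean()
print(f"built-up fraction of the image: {built_up_fraction:.2%}")
```

The kernel size controls how large an edge gap can be bridged; too large a kernel merges separate buildings into one region.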
The digital image sensor is a technology used to record electronic images. These image sensors are silicon microchips that are engineered to operate in different ways. The engineering involved in these circuits is very complex and the techniques used to manufacture these devices are not standardized in the industry. It is impractical to discuss the entire spectrum of manufacturing processes here, but they are noted because they give rise to the wide variety of applications for the digital image sensor. These engineering practices involve creating features that are added to the bare sensor for use in specific applications ranging from machine vision in robotics, to cameras integrated into cellular phones and PDAs, to ultraviolet sensors on the Hubble Space Telescope.
Given the recent interest in creating swarms of robotic flies that carry explosives and contain face recognition circuitry (as described in Bot Flies, The Economist, Dec 16, 2017), will a time come when the enablers of such killing machines will be made to account? We all know that machines will never—and I emphasize never—be able to make decisions without the assistance of humans, so expect arguments such as "the machine did it" to fail in a court of law. Can you imagine a world where people like Robert Oppenheimer (the developer of the atomic bomb) are hauled off to prison and later put to death for the creation of lethal autonomous weaponry?
Machines are detecting and recognizing people, detecting actions, tracking objects, and even detecting dangers. When will blind people and people with visual disabilities be able to depend reliably on vision tools to aid them?
I was wondering if anyone knows of, or has published, a technique that successfully combines shallow (HOG, SIFT, LBP) with deep (GoogLeNet) representations? I am interested in both the image and video cases.
I am a beginner in the area of computer vision. I have a basic doubt about 2D perspective projection. While analyzing the 3D-to-2D transform, we take the focal length as the distance to the image plane (in almost all references). The focal length of a lens is actually the distance from the center of the lens to the point on the optical axis where parallel rays converge. So if we place the image plane at this point, how can we get the image? If my question needs any clarification, please say so and I will elaborate. I hope for a helpful reply; it will improve my basic knowledge in this area.
Hi,
I have a text file containing:
1. The names of images (Image Directory Locations to be read by algorithm)
2. Vectors which have image pixel positions as their values.
Can I use this data to train a support vector regression model and get the pixel-position vector for an unknown image?
I would very much appreciate it if someone could guide me on my approach. If someone can refer me to relevant papers, I would be really grateful.
Thanks!
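In principle yes, as long as the target vector has a fixed length; SVR is single-output, so one regressor per output coordinate is needed. A minimal scikit-learn sketch, with the file names and the feature representation as placeholders:

```python
import numpy as np
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

# X: one row per image (flattened pixels or any feature vector)
# Y: one row per image, the pixel-position vector to predict
X = np.load("features.npy")      # placeholder: shape (n_images, n_features)
Y = np.load("positions.npy")     # placeholder: shape (n_images, n_targets)

# SVR predicts a single value, so wrap it to predict each coordinate separately
model = MultiOutputRegressor(SVR(kernel="rbf", C=10.0, epsilon=0.5))
model.fit(X, Y)

x_new = X[:1]                    # stand-in for an unseen image's features
print(model.predict(x_new))      # predicted pixel-position vector
```

How well this works depends almost entirely on the image representation; raw pixels rarely generalize, so a feature extractor (HOG, a CNN embedding, etc.) is usually needed first.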
I want to colorize grayscale images and infrared images. I am looking for the best algorithm. Complete accuracy and quality of colorization are needed, as well as speed.
Hi everyone,
I have been dealing with a motion (or action) classification problem, where each class instance in the dataset consists of a number of frames.
Since I extracted corresponding keypoints among those frames, I have a trajectory for each keypoint and, consequently, a bunch of trajectories for each instance.
Now I'm looking for how to describe these trajectories to create my feature vectors (instead of just putting the keypoint positions in the feature vector) and later use them for classification.
Any help?
I want to develop, perhaps, a new way/algorithm to detect cars in computer vision. What kind of algorithm can I dig into more deeply?
I'm a student with an electrical/mechanical background; in my project I'm searching for a solution for a company that wants to start using 3D cameras for robotics.
At the moment I'm working with Matlab and it works great, the possibility to create your own GUI is a big plus.
But I have read that MATLAB is more for development purposes and is slower (overhead).
A second software package that I am trying is Halcon; at the moment I have no overview of its possibilities.
But it looks to me like you can either program in Halcon's own language, HDevelop, or use its libraries in your own code (like C++).
Programming in HDevelop with its GUI seems to be easier/faster than low-level programming (e.g. C++), but I don't know the limitations.
A disadvantage is that there is no community for support; you need to rely on their documentation.
A third option I read a lot about is OpenCV, but with no low-level programming background this seems too ambitious for me.
I'm not searching for the best solution for me, but for the company (although I know the company doesn't have a lot of computer engineers).
I was hoping to find software with a good GUI to reduce low-level programming, Halcon seems to be the closest match.
Thanks for your help.
I want to segment a table in a depth image based on depth information obtained from a Kinect v2. The problem with the table is that it is in front of the camera and covers a large depth range. Depth thresholding also eliminates other objects in the scene at the same depth level as the table. Any idea would be highly appreciated!
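One idea that often works better than a plain depth threshold is to fit the dominant plane (the table top) with RANSAC and label points by their distance to that plane. A minimal NumPy sketch, assuming the depth image has already been converted to an N x 3 point cloud in metres:

```python
import numpy as np

# Minimal RANSAC plane fit: the dominant plane in front of the camera is
# usually the table; points within `tol` of that plane are labelled as table.
def ransac_plane(points, iters=200, tol=0.01, rng=np.random.default_rng(0)):
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(iters):
        p1, p2, p3 = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(p2 - p1, p3 - p1)
        norm = np.linalg.norm(normal)
        if norm < 1e-9:                        # degenerate sample, skip it
            continue
        normal /= norm
        dist = np.abs((points - p1) @ normal)  # point-to-plane distances
        inliers = dist < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers

# usage: table_mask = ransac_plane(point_cloud); keep or remove those points
```

This separates the table from objects standing on it even though they share the same depth range, because the objects do not lie on the fitted plane.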
I want to extract features such as the eyes, nose, eyebrows, lips, etc. from a face image. I would like to use the SURF feature algorithm in the MATLAB Computer Vision System Toolbox. I need help with this: what would its output be? I want to run this across multiple images and then create a dataset from these features.
Can someone please help me and explain in detail how to create this dataset?
Is there any research on how we can retrieve the phase of the thermal data from the thermal images captured by a thermal camera?
Hello everybody.
What is the simplest way to obtain the relative position and orientation of two postures of one plate in space, given the Cartesian coordinates (x, y, z) of three points on the plate with respect to a camera in each posture?
For more explanation: we have a plate on a device, e.g. the end effector of a parallel robot, and we can obtain the Cartesian coordinates (x, y, z) of any desired point on it with respect to the camera (by stereo vision). The goal is to measure the position and orientation of two postures of the plate in space with respect to each other. I want to know ways to achieve this goal (especially the simplest ones). Note that we can put some markers on the plate, e.g. a black paper with three white circles on it, or a triangle on the paper, etc.
wrt=with respect to
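Perhaps the simplest route is the standard SVD-based (Kabsch/Umeyama) rigid-transform fit between the two sets of corresponding marker points; with three non-collinear points the rotation and translation are uniquely determined. A minimal NumPy sketch:

```python
import numpy as np

# Estimate the rigid transform (R, t) that maps the marker points measured in
# posture A onto the same points measured in posture B (both given as Nx3
# arrays of x, y, z in the camera frame, here N = 3).
def rigid_transform(A, B):
    ca, cb = A.mean(axis=0), B.mean(axis=0)   # centroids
    H = (A - ca).T @ (B - cb)                 # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                  # fix a possible reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cb - R @ ca
    return R, t                               # B ≈ (R @ A.T).T + t
```

R then gives the relative orientation between the two postures and t the relative position; Euler angles or an axis-angle representation can be read off R if needed.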
There are some papers that apply CCA to face recognition.
I've extracted two types of features from the ORL dataset, which has 400 images of 40 subjects. At this step, using each type of feature separately, classification accuracy is about 98%, but when I transform the features to the new space using CCA, accuracy falls to just 3%.
- Where is the problem?
- Are class labels not important for finding directions with CCA?
Thanks.
I want to extract tables from scanned document images with the help of ML. Please suggest a robust method for extracting the tables.
I need to extract the table details with the help of ML functions. I have OCR tools, but they extract text only.
For classification, suppose some m features are selected. Some of these may be cooperative (good for classification) and the rest may not. What statistical measure can be used to discard the remaining features?
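As one concrete option, scikit-learn's univariate scores (an ANOVA F-test, or mutual information) rank features by how well they separate the classes; low-scoring features are the natural candidates to discard. A minimal sketch with placeholder data:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

# X: (n_samples, m) feature matrix, y: class labels (placeholder data below).
# f_classif scores each feature with a one-way ANOVA F-test between classes;
# mutual_info_classif is a non-parametric alternative that also captures
# non-linear dependence.
X = np.random.rand(200, 30)
y = np.random.randint(0, 3, 200)

selector = SelectKBest(score_func=f_classif, k=10).fit(X, y)
print("per-feature F-scores:", selector.scores_)
X_reduced = selector.transform(X)     # keeps the 10 highest-scoring features
```

Univariate scores ignore feature interactions; wrapper methods (recursive feature elimination) or embedded ones (L1 penalties, tree importances) handle that at higher cost.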
Hi friends,
I want to classify scanned document images; there are many methods, but they depend heavily on the text in the document. Please suggest a good algorithm that can classify the documents without using the text.
I have added a few sample images. For example, I have 500 documents with different layouts. If I feed an image into the engine, it should tell me which type the document is (i.e. Form 16a, W-2 tax).
Hello forum,
I have been reading about cascaded classifiers using haar-features for face detection and I have a few simple questions I have to ask/clarify. This is more towards implementation as I am a little confused as to how they work.
1) I understand that during the training phase, the haar features will be evaluated and rescaled for all possible combinations. At the end, the feature with the smallest error will form the first stage (attached picture). My question is, during the detection phase when a sub-window is selected for evaluation, will the features be placed at a specific region (like in the attached picture again) ?
For example, for the top left feature, it must always be positioned in the center leaving an empty space of 10% (of the width) to the left and right and be 30% (of the height) below.
Or will evaluation start at the top left hand corner (assuming origin), similar to training ? i.e. the feature will be evaluated over all the regions in the subwindow.
2) Regarding AdaBoost, I have understood the steps, but my question is: when the weights are updated after the nth iteration, is it possible that a feature that has already been selected gets selected again (i.e. it has the smallest error again)? Or will features/classifiers that have already been selected be "removed" from the subsequent selection process?
I am really loving computer vision. I will be taking this module in 10 weeks when the semester starts, but I can't wait that long to officially start learning what I love, haha. Thanks all.
Hi, we worked on depth estimation from rectified images using OpenCV. The approach uses simple template matching with SAD. Does anybody know of a recent DENSE depth estimation algorithm for which an accessible online implementation already exists? Thanks
I have the coordinates of two bounding boxes: one is the ground truth and the other is the result of my work. I want to evaluate the accuracy of mine against the ground truth, so I am asking if you have any suggestions. The ground-truth bounding box is saved in this format: [x y width height]. Thanks
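The usual measure is intersection-over-union (IoU, also called the Jaccard index or PASCAL VOC overlap). A minimal sketch for boxes in [x y width height] format:

```python
# Intersection-over-Union between two boxes given as [x, y, width, height].
def iou(box_a, box_b):
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))   # overlap width
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))   # overlap height
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

# e.g. iou([50, 50, 100, 80], [60, 55, 100, 80]) -> about 0.73; a common
# rule of thumb is to call a detection correct when IoU >= 0.5.
```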
I am detecting rings in an image and the distance between concentric rings of unknown radii.
I performed and recorded these tests in the field so my subjects blend in very well with the forest floor -- making automatic detection nearly impossible. I had ok success with a PanLab trial but I'm wondering if there are any good (and hopefully open source) alternatives.
Hi,
I am doing research on human action classification. I have used HOG features for classification but got low accuracy. Please suggest other features.
Histogram spread for a grayscale image is given in the paper, but there is no clear idea of how to compute it for an RGB image.
Currently, we have acquired video data of human actions performing martial arts movements. We want to segment the video frames into different actions (sequentially). Can anyone suggest what the best method so far is for this problem? Some good links are also welcomed. Thank you.
I seek to write a code that would compute the dimensions of a room from a photo.
Is there a comprehensive taxonomy that can explain the state of the art of current abnormal events detection techniques from video?
I would like a method to calculate the curvature of a 2D object. The object is a matrix with n rows (corresponding to n consecutive points) and 2 columns (corresponding to the x and y coordinates).
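A minimal NumPy sketch of the standard parametric-curvature formula applied to such a point matrix:

```python
import numpy as np

# Curvature of a 2D curve sampled as an (n, 2) array of consecutive (x, y)
# points: k = (x'y'' - y'x'') / (x'^2 + y'^2)^(3/2), with the derivatives
# estimated by finite differences along the row index.
def curvature(points):
    x, y = points[:, 0], points[:, 1]
    dx, dy = np.gradient(x), np.gradient(y)
    ddx, ddy = np.gradient(dx), np.gradient(dy)
    denom = np.maximum((dx ** 2 + dy ** 2) ** 1.5, 1e-12)  # avoid divide-by-zero
    return (dx * ddy - dy * ddx) / denom
```

Since finite differences amplify pixel-level noise, smoothing the points (or fitting a spline) before applying this usually gives much more stable curvature values.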
I am using a two-layer (one hidden layer) neural network classifier to classify my data (images), trained with the backpropagation algorithm.
The network works well and can classify with more than 90% accuracy, but the hidden-layer weights look different on every training run. I am using MATLAB's imagesc function to visualize the weights.
Hi everyone, I am new to computer vision and especially to the Emgu CV library, but I need to do my project, which involves distance measurement using 2 identical web cameras.
So far I've done these steps:
1. Perform stereo calibration (acquire intrinsic & extrinsic parameters).
2. Perform stereo rectification (use the parameters to rectify both images).
3. Build a disparity map using the StereoSGBM algorithm.
4. Try to acquire the distance value at a certain pixel of the image plane using points = PointCollection.ReprojectImageTo3D(disparityMap, Q);
But I have a problem understanding the meaning of each point's (x, y, z) values. I presume that the x and y values give the coordinates in the image plane, while z gives the depth information.
Anybody can explain to me on how can I convert the z (depth information) into real world distance?
Any information will be appreciated. Thank you very much.
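A rough sketch of the usual interpretation in Python/OpenCV (the calls map directly to Emgu CV); `raw_disparity` and `Q` stand for the disparity map and reprojection matrix already computed in the pipeline above:

```python
import cv2
import numpy as np

# Note: StereoSGBM returns a 16-bit fixed-point disparity scaled by 16,
# so convert it to float pixel disparities first.
disp = raw_disparity.astype(np.float32) / 16.0

# With the Q matrix from stereoRectify, reprojectImageTo3D returns X, Y, Z in
# the SAME physical units used during calibration (e.g. millimetres if the
# chessboard square size was given in millimetres). X and Y are measured from
# the optical centre and Z along the optical axis; they are not image-plane
# pixel coordinates.
points_3d = cv2.reprojectImageTo3D(disp, Q)
depth = points_3d[:, :, 2]                 # depth map in calibration units

# Euclidean distance from the camera to the object at pixel (row v, column u):
v, u = 240, 320                            # example pixel
distance = np.linalg.norm(points_3d[v, u])
```

Equivalently, for a rectified pair the depth of a pixel is Z = f * B / d, with f the focal length in pixels, B the baseline in physical units, and d the disparity in pixels, which is exactly what Q encodes.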
Interested in doing some research in Computer Vision and Mobile Visual Search. Could you please suggest some novel ideas/issues that are emerging in that research topic?
What is the approach to measuring the real size (length, height, and width) of an object from an image when I don't know the focal length or the object distance (I don't know the origin of the image, i.e. any technical details of the lens or the camera)?
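Strictly, absolute size is not recoverable from a single image without either the camera intrinsics or the object distance; the common workaround is a reference object of known size lying in the same plane as the target. A trivial sketch of that idea (all numbers are made up):

```python
# Scale the target by the pixels-per-unit ratio of a reference object that
# lies in the same plane, at roughly the same distance from the camera.
def estimate_size(ref_pixels, ref_real, target_pixels):
    scale = ref_real / ref_pixels          # e.g. millimetres per pixel
    return target_pixels * scale

# e.g. a coin known to be 24 mm wide spans 120 px, the object spans 450 px:
print(estimate_size(120, 24.0, 450))       # -> 90.0 mm
```

If no reference object exists and the plane is tilted, a homography estimated from at least four points with known real-world positions is needed instead.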
I have read about ASM and the discrete symmetry operator, and I got the main idea.
But I got confused by the bundle of functions I did not understand. Is there any simplified illustration of both of them?
As we know, FPGAs are suited to parallel and pipelined processing. In this regard, can FPGAs be used to accelerate big-data problems from a computer vision perspective?
Can anyone help me to understand hand label colour space?
Does anyone have any experience on the development of a domestic robot's ability to locate itself in an indoor environment?
For your answer, take into account the possibility of having a camera on the robot (image recognition may be a way to go?).
I believe it may be necessary to take multiple inputs. For example, an image recognition algorithm together with a "dead-reckoning" method, such as estimating displacement as a function of the revolutions of the robot's wheels, could be used to estimate the position of the robot.
All feedback would be greatly appreciated, as I am just starting with this investigation.
Thank you very much!
In the attached paper, I have a problem with the mean shift vector in Equation 10.
Does M(x), the mean shift vector, contain both Vx and Vy, or is it a single value?
My only problem is how to calculate the kernel density of the target model and the target candidates from histograms (for grayscale images).
Suppose we have histograms H1 and H2 of the target model and a target candidate, respectively. I want to compute the Gaussian kernel density.
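Not a full answer, but a minimal sketch of the kind of kernel-weighted histogram used as the target model in mean-shift tracking, assuming a grayscale patch centred on the target; the spatial kernel here is Gaussian (Comaniciu et al. use the Epanechnikov kernel):

```python
import numpy as np

# Kernel-weighted grayscale histogram: pixels near the patch centre contribute
# more than pixels near the border, which is what turns a plain histogram into
# the kernel-density target model / candidate used in mean-shift tracking.
def kernel_histogram(patch, n_bins=16, sigma=0.5):
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # normalised distance of each pixel from the patch centre
    r2 = ((ys - h / 2) / (h / 2)) ** 2 + ((xs - w / 2) / (w / 2)) ** 2
    weights = np.exp(-r2 / (2 * sigma ** 2))          # Gaussian spatial kernel
    bins = (patch.astype(np.float32) * n_bins / 256).astype(int).clip(0, n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=weights.ravel(), minlength=n_bins)
    return hist / hist.sum()                          # normalised, sums to 1

# Similarity between target model q and candidate p is then the Bhattacharyya
# coefficient: rho = np.sum(np.sqrt(q * p))
```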
In the classic snake method, there is a formula (a sum or integral) that defines the overall internal energy of the curve using the first and second derivatives of the curve. I can calculate this energy, but I can't use it to evolve the curve. I want to know about the procedure for curve evolution.
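For what it's worth, a compact sketch of the standard semi-implicit evolution scheme for a closed snake (Kass et al.), where the internal forces alpha*v'' - beta*v'''' are applied through a circulant matrix and the external force is left as a placeholder callable:

```python
import numpy as np

# At each step solve (I - dt*A) v_new = v_old + dt * F_ext, where A applies
# the internal forces alpha*v'' - beta*v''''. F_ext would normally be the
# gradient of an edge map sampled at the snake points.
def build_internal_matrix(n, alpha, beta):
    A = np.zeros((n, n))
    coeffs = {-2: -beta, -1: alpha + 4 * beta, 0: -2 * alpha - 6 * beta,
              1: alpha + 4 * beta, 2: -beta}          # finite-difference stencil
    for offset, c in coeffs.items():
        A += c * np.roll(np.eye(n), offset, axis=1)   # circulant: closed curve
    return A

def evolve(points, alpha=0.1, beta=0.05, dt=1.0, iters=200, ext_force=None):
    n = len(points)
    inv = np.linalg.inv(np.eye(n) - dt * build_internal_matrix(n, alpha, beta))
    v = points.astype(float)                          # (n, 2) snake points
    for _ in range(iters):
        f = ext_force(v) if ext_force else 0.0
        v = inv @ (v + dt * f)                        # semi-implicit update
    return v
```

With the external force set to zero, the internal energy alone shrinks and smooths the curve, which is a good way to check the evolution step before plugging in image forces.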
I know the scale-invariant SIFT feature, but this technique is tedious. Another feature is the Histogram of Oriented Gradients (HOG), which is efficient and can be made rotation invariant. But I haven't found any feature that is invariant to RTI and scale changes and yet has a low computation cost.
Can anyone please suggest such features, if they exist?
Can I use a combination of these features? If yes, how can I combine them, given that they have different sizes?
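One common, simple way to combine descriptors of different lengths is early fusion: normalise each one and concatenate. A small sketch with made-up feature matrices:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# "Early fusion": features of different lengths (e.g. a HOG vector and a
# bag-of-words SIFT histogram) are concatenated after each one is normalised,
# so no single descriptor dominates just because of its numeric scale.
hog_feats = np.random.rand(100, 3780)     # placeholder HOG vectors
bow_feats = np.random.rand(100, 500)      # placeholder SIFT bag-of-words

hog_n = StandardScaler().fit_transform(hog_feats)
bow_n = StandardScaler().fit_transform(bow_feats)
fused = np.hstack([hog_n, bow_n])         # one combined vector per sample

# `fused` can now be fed to any classifier (SVM, random forest, ...).
```

The alternative is late fusion: train one classifier per feature type and combine their scores, which avoids the dimensionality blow-up of concatenation.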
I would like methods that do not need specific hardware.
I am looking for face detection methods that work well in poor lighting conditions.
These can later be used for SVM classification, with segmentation being involved as well.
I am using an A4Tech webcam and connect it to OpenCV via Visual Studio. The problem is that I cannot analyse its frames: the program builds completely, but if the main code uses the frame properties, it will not start running. The memory of the frames (frame = cvQueryFrame(capture)) is not accessible. This problem occurs only with the external webcam; the laptop's embedded webcam does not have this problem. I have attached the .cpp code that I have written.