Object Recognition - Science topic
Explore the latest questions and answers in Object Recognition, and find Object Recognition experts.
Questions related to Object Recognition
Deep Learning for Computer Vision
Explore how deep learning is revolutionizing the way machines see and understand the world around us!
🔍 In this lecture, we delve into:
✅ The power of Convolutional Neural Networks (CNNs)
✅ Image and Object Recognition
✅ Semantic Segmentation and Localization
✅ Advanced Object Detection techniques like R-CNN, Fast R-CNN, and Faster R-CNN
🎥 Watch the full lecture here: https://www.youtube.com/watch?v=Ql0sApkfXpk
Join us as we uncover cutting-edge techniques and their applications in this exciting domain of AI. Let's shape the future together!
#DeepLearning #ComputerVision #AI #MachineLearning #Education #Innovation
Seeking insights on leveraging deep learning techniques to improve the accuracy and efficiency of object recognition in machine vision systems.
Hello Everybody,
As the title says, I am searching for a public 3D Object Recognition and Pose Estimation Dataset.
I've utilized Google for three days, so I thought I might as well ask here.
The dataset should contain model and scene point cloud data, ideally stored in PCD or PLY format (I am working with the Point Cloud Library).
There is no need for thousands of training files; 10 files, for example, would be fine, as I just want to evaluate an object recognition pipeline quickly.
So basically, the algorithm tries to fit 3D model clouds into a point cloud of a scene captured by an ASUS Xtion camera.
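In case it helps while evaluating the pipeline, here is a minimal sketch of loading a model cloud and a scene cloud and refining the model pose with ICP. It uses the Open3D Python bindings rather than PCL (the file names, the identity initialization and the 0.02 m correspondence distance are placeholder assumptions):

```python
import numpy as np
import open3d as o3d  # assumes Open3D >= 0.10 (pipelines namespace)

# Load a model cloud and a scene cloud (placeholder file names).
model = o3d.io.read_point_cloud("model.pcd")
scene = o3d.io.read_point_cloud("scene.pcd")

# Rough initial guess for the model pose in the scene (identity here).
init = np.eye(4)

# Refine the pose with point-to-point ICP; 0.02 m is an assumed maximum
# correspondence distance for Xtion-scale indoor scenes.
result = o3d.pipelines.registration.registration_icp(
    model, scene, 0.02, init,
    o3d.pipelines.registration.TransformationEstimationPointToPoint())

print("fitness:", result.fitness)            # fraction of matched model points
print("pose estimate:\n", result.transformation)
```

In a full recognition pipeline the identity initialization would normally be replaced by a global alignment step (e.g. feature correspondences) before ICP refinement.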
I am currently searching for a topic for my research, which is about using machine vision and object recognition to control a robot (serial or parallel, it does not matter). Unfortunately, I cannot find a problem to be solved. Can anyone recommend some new points of research?
I would like suggestions for software that can analyze the Novel Object Recognition Test (NORT). Free software, please!
To answer this question, you should define what is meant by object recognition. You can then include research evidence showing how object recognition is performed.
Hi Everyone,
I'm currently training an object detection model which should detect cars, people, trucks, etc. in both day and night time. I have started gathering data for both daytime and night-time. I'm not sure whether to train a separate model for daylight and another for night-time, or to combine the data and train a single model.
Can anyone suggest a data distribution for each class across day and night conditions? I presume it should be a uniform distribution. Please correct me if I'm wrong.
E.g., for person: 700 images in daylight and another 700 images at night.
Any suggestion would be helpful.
Thanks in Advance.
Hi,
Are there datasets for object recognition in industrial applications, e.g., tools?
I want to identify the darkest object in an uploaded image. I have tried ImageJ, but there I have to set a different threshold for each image before analyzing, and with the same threshold value some objects get excluded in other images. I want to know whether very accurate automatic counting is possible with some image processing technique. Is it possible to identify it with OpenCV?
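Regarding the OpenCV part of the question, a minimal sketch along these lines may help; the file name is a placeholder, and Otsu's method is used so that the threshold is picked per image automatically rather than fixed by hand:

```python
import cv2
import numpy as np

img = cv2.imread("sample.png", cv2.IMREAD_GRAYSCALE)   # placeholder file name
blur = cv2.GaussianBlur(img, (5, 5), 0)

# Otsu picks the threshold per image; invert so dark regions become foreground.
_, mask = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# OpenCV 4.x findContours return signature.
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Among the segmented regions, keep the one with the lowest mean intensity.
darkest, darkest_mean = None, 255.0
for c in contours:
    region = np.zeros_like(mask)
    cv2.drawContours(region, [c], -1, 255, -1)
    mean_val = cv2.mean(img, mask=region)[0]
    if mean_val < darkest_mean:
        darkest, darkest_mean = c, mean_val

print("number of segmented objects:", len(contours))
print("darkest object mean intensity:", darkest_mean)
```

Counting is then just `len(contours)` after appropriate noise filtering; whether this is accurate enough depends on how consistent the illumination is across images.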
Dear colleagues:
I had an interesting discussion with my labmates about what is considered exploration in object location or object recognition tasks. In my experience/opinion, object exploration by rodents is defined only when the animal sniffs the object directly, with a clear "intention" to explore (Figure 1). According to my labmates and published papers, exploration is defined when the animal brings its head closer to the object. Automated software quantifies any entry into a defined circle around the object.
My concerns are:
1. Sometimes the animal hides behind an object; in this scenario it is not exploring, just being there, sometimes even immobile (Figure 2).
2. Sometimes rats use the object to rear and sniff, pointing their noses to the upper part of the arena, not towards the object (Figure 3).
Can you share your opinions about this concern?
Dear community,
I'm looking into how to do an a priori power analysis for an fMRI experiment where the main analysis will be a representational similarity analysis (RSA).
The experiment will present the same stimuli in two successive fMRI-sessions (with a behavioral training in between). For each fMRI session, I plan to do a model-based RSA on brain responses elicited by the stimuli. Voxel restriction will be done with a searchlight procedure. The most interesting outcome will be the difference in these results between the two training sessions, as estimated with a GLM contrast.
I think this is not uncommon, as I found other experiments adopting a similar analysis procedure. I found no clue, however, on how to estimate the necessary sample size to achieve a certain statistical power (say 80%).
Since this is a bit of a Frankenstein assembled from other common statistical approaches, I'm not sure if the general logic of fMRI power analysis applies here.
Has anybody experience in this area or can point me to literature that contemplates this issue?
Thanks,
Oliver
I want to start a project in which I want to use machine learning tools (neural networks) to recognise objects. For this purpose I am looking for the right hardware components with respect to camera and lighting.
The following requirements to the hardware are given:
The resolution shall be at least 4000x3000 pixels or 4K.
The device shall be designed for continuous operation. It must not shut down after a certain time.
The depth of field must be sufficient: the camera is about 1.5 meters away from the objects, and some objects may have large packing heights.
In general, exposure time, aperture and focus must be manually fixable.
Also the products shall not be too expensive in comparison with similar products.
I am using UAV data for mapping geomorphological processes in different environments, from coastal and estuarine subtropical areas to subpolar and polar glacial landscapes, and I want to profit from the huge amount of information in such high-resolution datasets. So, I was wondering if there are good free options for object-oriented image classification, alternative to eCognition for example?
Hello. I am an undergraduate student and I need to collaborate with my professor (who specialises in Computer Vision, Image Processing and 3-D medical imaging). I am looking for research ideas mainly in the topics of object detection, visual object tracking, object recognition, semantic segmentation, localization using u-net or medical imaging. Can anybody help me jot down growing research fields in these areas? Thank you!
I am looking for visual stimuli that produce a similar effect to the well-known "Dalmatian dog illusion" (see the attached figure).
If you look briefly at the Dalmatian dog illusion for the first time, it looks like a pattern of meaningless black and white stains (left panel). However, once a priming cue is briefly presented (the red contour in the right panel) the representation of a Dalmatian dog in a field becomes apparent. Once “seen” this representation will remain apparent even after the priming cue is removed and can't be unseen (look at the left panel again without the red contour).
Do you know other types of visual stimuli containing a hidden object shape that never pops-out before and always pops-out after the transient presentation of a priming stimulus?
Thank you!
Which algorithms make use of least squares in object recognition? Is the least squares approximation used in calculating Principal Component Analysis (PCA) and Singular Value Decomposition (SVD) in machine learning?
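On the second part: PCA can be read as a least-squares problem, because the principal subspace is the one that minimizes the squared reconstruction error of the centered data, and it is usually computed through the SVD. A small NumPy illustration with random data and k = 2 components:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))          # 100 samples, 5 features
Xc = X - X.mean(axis=0)                # PCA works on centered data

# SVD of the centered data matrix.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 2
components = Vt[:k]                    # top-k principal directions

# Least-squares reconstruction from the k-dimensional projection.
X_proj = Xc @ components.T             # scores
X_rec = X_proj @ components            # back-projection

sse = np.sum((Xc - X_rec) ** 2)
# The residual equals the sum of the discarded squared singular values,
# i.e. the projection is the least-squares optimal rank-k approximation.
print(sse, np.sum(S[k:] ** 2))
```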
How many hidden layers are there in the Faster Region-based Convolutional Neural Network (Faster R-CNN) used for object recognition?
In image processing, an image is "processed", that is, transformations are applied to an input image and an output image is returned. The transformations can e.g. be "smoothing", "sharpening", "contrasting" and "stretching". The transformation used depends on the context and issue to be solved.
In computer vision, an image or a video is taken as input, and the goal is to understand (including being able to infer something about it) the image and its contents. Computer vision uses image processing algorithms to solve some of its tasks.
The main difference between these two approaches is the goal (not the methods used). For example, if the goal is to enhance an image for later use, then this may be called image processing. If the goal is to emulate human vision, as in object recognition, defect detection or automatic driving, then it may be called computer vision.
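A small illustration of the distinction in code, using OpenCV: the first call only transforms pixels (image processing), while the second tries to say something about the image content (computer vision). The file name is a placeholder:

```python
import cv2

img = cv2.imread("street.jpg")   # placeholder file name

# Image processing: input image -> output image (here, smoothing).
smoothed = cv2.GaussianBlur(img, (7, 7), 0)

# Computer vision: input image -> interpretation of its content
# (here, pedestrian detection with the built-in HOG + linear SVM model).
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
boxes, weights = hog.detectMultiScale(img, winStride=(8, 8))

print("detected people:", len(boxes))
```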
Hello everyone,
I have a group of fish (n=28 per group). I have measured the amount of time that each fish spent exploring two different objects (A and B) in a squared shape tank. Then I have calculated an exploration index as: (time spent with A) / (time spent with B + time spent with A).
Now I have proportional data and I am going to test whether a fish spent more time with A than chance level (0.5), to see if they have a preference for object A. To see whether the exploration ratios differ from chance level, I am going to run a one-sample t-test. However, having read some papers, I think I should apply an arcsine square-root transformation to my data. When I transform the data and run the one-sample t-test I get odd results: my raw exploration ratios are below chance level (mean 0.322, SD 0.12), while after the transformation the ratios appear to be above chance level. I am really confused and do not know how I should treat my data. I would really appreciate any advice.
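One likely source of the confusion is that after the arcsine square-root transformation the chance level is no longer 0.5; the comparison value has to be transformed in the same way, giving arcsin(sqrt(0.5)) ≈ 0.785 rad. A minimal SciPy sketch with made-up ratios, just to show the mechanics (the simulated values are illustrative, not your data):

```python
import numpy as np
from scipy import stats

# Hypothetical exploration ratios for n = 28 fish (illustrative values only).
ratios = np.clip(np.random.default_rng(1).normal(0.32, 0.12, 28), 0.01, 0.99)

# Arcsine square-root transformation of both the data and the chance level.
transformed = np.arcsin(np.sqrt(ratios))
chance_transformed = np.arcsin(np.sqrt(0.5))   # ~0.785 rad, not 0.5

t_raw = stats.ttest_1samp(ratios, popmean=0.5)
t_trans = stats.ttest_1samp(transformed, popmean=chance_transformed)

print("raw scale:        ", t_raw.statistic, t_raw.pvalue)
print("transformed scale:", t_trans.statistic, t_trans.pvalue)
```

Comparing the transformed data against an untransformed 0.5 would indeed flip the apparent direction of the effect, which matches the odd result described above.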
In the very first frame of the video, I define an ROI by drawing a closed line on the image. The goal is to recognize that ROI in a later frame, but the ROI is not a salient object. It is just a part of an object, and it can deform, rotate, translate and even be only partially in the frame.
Essentially, this algorithm should be used to reinitialize trackers once they are lost.
I have used a histogram based algorithm which works somewhat well, but it doesn't "catch" the ROI entirely.
The object is soft and deformable, soft tissue in a way, meaning you can expect deformations and also visual changes due to lighting.
I have data for image recognition using neural networks. The images are in PGM format. How do I pre-process that data to get it into a suitable matrix in C++?
I have a TIFF image of around ~10 GB. I need to perform object classification or pixel classification on this image. The image data has zyx dimension order. My voxel size is x = 0.6, y = 0.6 and z = 1.2.
Z is the depth of the object.
If I classify the pixels in each Z plane separately and then merge the results to get the final shape and volume of the object, would I lose any information, and would the final shape or volume of the object be wrong?
I am looking for free software for tests like the MWM, EPM and object recognition test; help with how to download and use it would also be appreciated.
thanks
Dear colleagues,
I am looking for the URLs file of the VALIDATION set of ImageNet Large Scale Visual Recognition Competition (ILSVRC) 2012.
I can easily find that of the training set. However, I am having trouble finding the validation set's file.
BTW, I have the original image set. I just need the source URLs.
Thanks for help.
I want to write code for object recognition using deep learning, where I do not have any labeled database for a supervised approach.
I want to do this using an unsupervised deep learning approach.
Can you please point me to possible methods for object recognition with unsupervised deep learning?
I have a project in MATLAB. I need to recognize the color of a car.
In other words, how can I recognize any object's color using MATLAB and write the result as text? For example, if the car is red, then the result will be the text "red".
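One common way to sketch this is to look at the dominant hue inside the car region and map it to a color name. I cannot vouch for exact MATLAB syntax, so the sketch below is in Python/OpenCV; the same logic carries over to MATLAB's rgb2hsv and histogram functions. The file name and the hue bins are rough, illustrative assumptions:

```python
import cv2
import numpy as np

img = cv2.imread("car.jpg")                     # placeholder file name
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
h, s, v = cv2.split(hsv)

# Ignore low-saturation / dark pixels (grey, black, shadows).
valid = (s > 60) & (v > 60)
dominant_hue = np.median(h[valid]) if valid.any() else None

# Very rough hue bins (OpenCV hue range is 0-179); purely illustrative.
def hue_to_name(hue):
    if hue is None:
        return "grey/black/white"
    if hue < 10 or hue >= 170:
        return "red"
    if hue < 25:
        return "orange"
    if hue < 35:
        return "yellow"
    if hue < 85:
        return "green"
    if hue < 130:
        return "blue"
    return "purple"

print(hue_to_name(dominant_hue))                # e.g. "red"
```

In practice the car region should first be segmented (or provided as a mask) so the background does not dominate the hue statistics.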
I am looking for the state of the art methods which are being used for object recognition on moving platforms.
Formally, a computer program should be able to scan such an image, perform image processing or any specific treatment on it, and produce the following:
1) The info about multiple geometrical shapes stacked together in front/side and top views in the image.
2) The correlation between the two views as some shapes (or a part of a shape) are hidden in one view but their projections are seen in the other view.
3) The relationship among shapes, such as the orientation of one shape with respect to another.
4) The info about dimensions, normally written as text beside the arrows.
5) The info about arrows, single-headed, double-headed, straight, slanted, etc.
I have a small set of chest CT scans. I am interested in using a deep neural network for denoising these images. However, due to the small size of the data, I cannot train the network, hence I am looking for pretrained networks. I am aware of pretrained CNNs for object recognition or feature extraction (VGG, ResNet, etc.), but not of any for denoising. I appreciate any suggestions.
Thanks,
Nastaran Emaminejad
Can anyone help me with the best method to classify a facial expression database? I have tried using FFT and SVM, but that is still based on the features of the whole image; it doesn't specifically focus on mouth or eye expressions. Thank you
Is there a benchmark with a list of model (CNN, FFNN, RNN, etc.) performances? A kind of MNIST for visual object recognition (VOR)?
I want to compare performance against well-known models in computer vision.
The research content includes a proposed algorithm for image/object matching and two proposed algorithms for multiple object detection.
The algorithms for image/object matching and multiple object detection are not related.
My question is how to organize them into a PhD thesis. How can I unify them into one overarching problem, and what title would be appropriate?
Hi,
I will be running an experiment that requires participants to distinguish between several novel objects. Ideally, each novel object will be a configuration of 3D geometric shapes (e.g., pyramids, pentagonal-prisms, spirals, discs, cuboids) but objects *cannot* be distinguished from one another based on one particular local feature: the only defining aspect of an object should be its overall configuration.
For example, if we have object A (a configuration of a cuboid, cylinder, and a pyramid) for each of its features there will be at least one other novel object that contains the identical feature (e.g., object B might have the identical cuboid, object C might have the identical pyramid, and so on…) - and thus the objects cannot be differentiated based on local features, and must be differentiated by overall configuration instead. So I’m looking for a stimuli set where features have been manipulated systematically such that objects can be distinguished only by their configuration of features (something corresponding to the linked table would be ideal):
Has such a stimulus set been used in the past, and if so, has it been made available? Any suggestions welcome.
Ryan
I want to know whether there is any object in a part of a picture or not. I do not need to know what that object is.
Could any one of you please suggest some technical papers/articles in the field of thermal image processing to start with?
I need to apply a machine learning technique to categorize a data set into a large number of classes (around 60). What would be the best machine learning technique to use? I just want to get an idea.
Hi,
I am new to active shape models (ASM) and I want to use it in my research to do image segmentation.
For using ASM, there should be a training set to generate the statistical shape model: x = x̄ + Pb, where x̄ is the mean shape, P is the matrix of eigenvectors, and x is the shape obtained by varying the shape parameters b.
In my particular case, there is no training set. However, the mean shape is known, as are the shape constraints. Is there a way to use the statistical model? Use simulated shapes for training? But how to mimic the gray-level profile?
Thanks in advance.
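On the "use simulated shapes for training" idea: if the mean shape and plausible constraints are known, one can at least generate synthetic training shapes by sampling the shape parameters b within the usual ±3·sqrt(λ) limits. In the sketch below, the eigenvector matrix P and the eigenvalues are placeholders that would normally come from training data or be hand-crafted from the known constraints; this does not solve the gray-level profile problem, which really needs example images:

```python
import numpy as np

rng = np.random.default_rng(0)

n_points = 30                                   # landmarks per shape
mean_shape = rng.normal(size=2 * n_points)      # placeholder for your known mean shape

# Placeholder modes of variation: in a real ASM these are the eigenvectors P and
# eigenvalues of the training-shape covariance; here they stand in for assumed constraints.
t = 4                                           # number of modes kept
P, _ = np.linalg.qr(rng.normal(size=(2 * n_points, t)))   # orthonormal columns
eigvals = np.array([4.0, 2.0, 1.0, 0.5])                   # assumed variances

def sample_shape():
    # Each b_i limited to +/- 3 standard deviations, as in the standard ASM model.
    b = rng.uniform(-3, 3, size=t) * np.sqrt(eigvals)
    return mean_shape + P @ b                   # x = x_bar + P b

simulated_training_set = np.stack([sample_shape() for _ in range(50)])
print(simulated_training_set.shape)             # (50, 60)
```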
I ran a novel object recognition test on mice that underwent traumatic brain injury (or sham surgery) and were treated with a drug (or vehicle). My sham controls (both treated and untreated) performed well and showed a clear preference for the novel object (>60%), while the untreated TBI group exhibited no preference for the novel object. Curiously, the drug-treated TBI group exhibited a pretty strong avoidance of the novel object, actively investigating it only about 25% of the time. Does anyone with experience in this task have some insight into how to interpret this result? Is it neophobia? Anxiety?
I would like to know the process for recognizing the shape and pattern of an object using a digital camera, based on image processing.
I would like to know a good starting point to carry out my research in the above mentioned topic.
Hi, I'm working on the biologically inspired hierarchical model for object recognition, Hierarchical Model and X (HMAX), and I want to know how many images I should use in the training stage to extract patches.
The only thing I see in related works is that they mention the number of patches, but say nothing about the number of images.
Say I have a set of objects to teach a CNN, but when they appear at different angles the network doesn't recognise them. I could train one CNN per angle, but that looks like a weak solution. Is there any existing experience in solving the recognition problem for 360 degrees of view of the same object?
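The usual workaround is rotation augmentation: instead of one CNN per angle, a single network is trained on rotated copies of each training image. A minimal sketch with SciPy (the loading and training steps are left out, and the dummy image is a placeholder):

```python
import numpy as np
from scipy.ndimage import rotate

def augment_with_rotations(image, angles=range(0, 360, 30)):
    """Return rotated copies of one training image (in-plane rotations only)."""
    return [rotate(image, angle, reshape=False, mode="nearest")
            for angle in angles]

# Example: a dummy 64x64 single-channel image.
img = np.random.default_rng(0).random((64, 64))
augmented = augment_with_rotations(img)
print(len(augmented), augmented[0].shape)   # 12 copies, each 64x64
```

Note that in-plane rotation only covers rotation around the camera axis; for full 360-degree viewpoint changes around the object, the analogue is to include images from many viewpoints in one training set rather than training one model per angle.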
Especially for face recognition and facial expression recognition.
I have an arm robot. An object's coordinates will be captured by the camera and need to be mapped to the robot to implement the IK (inverse kinematics) algorithm; the robot then has to move to a location defined in the camera image displayed by the supervising computer.
I'm wondering what kind of vision system I should use to capture object coordinates, and to measure surface defects in order to characterize surface roughness in a polishing task?
Hi all,
I am using the sliding window technique for object detection, but this technique is very slow, so I am searching for an alternative method.
Is there any alternative method for object detection?
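One common alternative to exhaustive sliding windows is a region-proposal step such as selective search, which returns a few hundred to a few thousand candidate boxes instead of every window position; only these proposals are then classified. A sketch using the OpenCV contrib module (requires opencv-contrib-python; the file name is a placeholder):

```python
import cv2

img = cv2.imread("scene.jpg")                    # placeholder file name

ss = cv2.ximgproc.segmentation.createSelectiveSearchSegmentation()
ss.setBaseImage(img)
ss.switchToSelectiveSearchFast()                 # "quality" mode is slower but better

rects = ss.process()                             # candidate boxes as (x, y, w, h)
print("number of proposals:", len(rects))

for (x, y, w, h) in rects[:100]:
    roi = img[y:y + h, x:x + w]
    # ... run your classifier on `roi` instead of on every sliding-window position
```

Learned region-proposal or single-shot detectors (Faster R-CNN, SSD, YOLO) are faster still, if training data and a GPU are available.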
I want to detect, count and measure plant stomata in microscopic images with computer software. Can this be done with image processing at all? How can I do it? Can it be programmed with a VB.NET library, or do I need other programming languages or even MATLAB toolboxes? I attached an example image. Thanks.
hi all
I have some images that I need to convert to binary. For all images, the histogram has a distribution like the attachment.
If we have a bimodal histogram, then choosing a threshold is easy (there are methods for this; for example, I know that Otsu's method works well for bimodal histograms).
Now, with 3 peaks in all the histograms, how can I convert my images to binary? (Consider the peak corresponding to zero.)
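For a trimodal histogram, one option is the multi-level extension of Otsu's method, which returns two thresholds for three classes; the class containing the zero peak can then be separated from the rest. A sketch with scikit-image (threshold_multiotsu is available from scikit-image 0.16 on; the file name is a placeholder):

```python
import numpy as np
from skimage import io
from skimage.filters import threshold_multiotsu

img = io.imread("sample.png", as_gray=True)      # placeholder file name

# Two thresholds split the histogram into three classes (the three peaks).
thresholds = threshold_multiotsu(img, classes=3)

# Label each pixel with its class index: 0, 1 or 2.
regions = np.digitize(img, bins=thresholds)

# Binarize by separating the class containing the zero peak (class 0, darkest)
# from everything else.
binary = regions > 0
print(thresholds, binary.mean())
```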
My main idea is to detect an object in a cluttered scene. First, I capture an image of the object alone. Next, I capture an image of a cluttered scene in which the object is present. The object must then be detected in the cluttered scene. I am taking the pictures using flash only.
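A standard baseline for this setting (one reference image of the object, one cluttered scene) is local feature matching plus a RANSAC homography, for example with ORB in OpenCV. The file names are placeholders, and the homography step assumes the object is roughly planar or seen from a similar viewpoint:

```python
import cv2
import numpy as np

obj = cv2.imread("object.jpg", cv2.IMREAD_GRAYSCALE)    # object alone
scene = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)   # cluttered scene

orb = cv2.ORB_create(2000)
kp1, des1 = orb.detectAndCompute(obj, None)
kp2, des2 = orb.detectAndCompute(scene, None)

# Brute-force matching with cross-check for binary descriptors.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:200]

src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

# RANSAC homography: the inlier matches localize the object inside the scene.
H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
print("inlier matches:", int(inliers.sum()) if inliers is not None else 0)
```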
I want to measure the major and minor axes of each hole in the image I attached. I've applied pre-processing methods to the image; now I have this image on which I want to measure the axes. How can I do that?
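Assuming the pre-processed image can be binarized so that each hole is a connected component, scikit-image's regionprops gives the axes of the ellipse fitted to each region directly (the file name, threshold and area filter below are placeholders):

```python
from skimage import io, measure

img = io.imread("holes.png", as_gray=True)       # placeholder file name
binary = img > 0.5                               # placeholder threshold

labels = measure.label(binary)
for region in measure.regionprops(labels):
    if region.area < 20:                         # skip tiny noise blobs
        continue
    print("hole", region.label,
          "major axis:", region.major_axis_length,
          "minor axis:", region.minor_axis_length)
```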
I am doing a project on 3D model search using shape matching. For this I have generated four texture-less views of the 3D model (front, top, side and isometric) with hidden lines. I have used the SIFT algorithm to match these diagrams with the one provided by a user, but SIFT is mainly for textured object detection, so it does not produce satisfactory results. There is also AKAZE, but it likewise relies on texture.
Can anyone suggest a shape matching algorithm that is scale and rotation invariant?
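One simple scale- and rotation-invariant baseline for texture-less line drawings is contour matching with Hu moments, available in OpenCV as matchShapes. A sketch, assuming the views binarize cleanly and the file names are placeholders; shape contexts or Fourier descriptors are natural next steps if this is too coarse:

```python
import cv2

def largest_contour(path):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    _, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    return max(contours, key=cv2.contourArea)

query = largest_contour("user_sketch.png")                # drawing provided by the user
views = ["front.png", "top.png", "side.png", "iso.png"]   # generated views

# Hu-moment distance: invariant to translation, scale and rotation.
scores = {v: cv2.matchShapes(query, largest_contour(v),
                             cv2.CONTOURS_MATCH_I1, 0.0) for v in views}
print(min(scores, key=scores.get), scores)                # best-matching view (lowest score)
```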
Hi
I'm trying to find an algorithm for detecting fire in a video.
Which method for dynamic texture detection is the best for this purpose?
thanks in advance...
I'm working on a hand-recognition project using MATLAB. I'm trying to find convexity defects to define the finger roots. I have the convex hull points (convex contour), shown as the blue line in the figure below, but I don't know how to find the convexity defects, which are shown as yellow points in the next figure.
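For reference, this is how the convexity defects can be obtained in OpenCV (the same quantities are often recomputed by hand in MATLAB): the hull is requested as point indices, and each defect row gives the start, end, farthest point and its depth. A hedged sketch, with the hand mask and depth threshold as placeholders:

```python
import cv2

mask = cv2.imread("hand_mask.png", cv2.IMREAD_GRAYSCALE)   # binary hand mask (placeholder)
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
hand = max(contours, key=cv2.contourArea)

# Hull as point *indices* (required by convexityDefects).
hull = cv2.convexHull(hand, returnPoints=False)
defects = cv2.convexityDefects(hand, hull)

# Each row: start index, end index, farthest-point index, depth * 256.
if defects is not None:
    for start, end, far, depth in defects[:, 0, :]:
        if depth / 256.0 > 10:                   # keep only deep valleys (finger roots)
            x, y = hand[far][0]
            print("finger-root candidate at", (int(x), int(y)))
```

The equivalent in MATLAB can be built from the convex hull vertices by measuring, for each contour segment between consecutive hull points, the point with maximum distance to the hull edge.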
Hi all. I am a postgraduate student in electrical engineering and I would like to hear your opinion on automatic recognition of fasteners. We have heard about plant identification and face recognition, but fastener recognition is rarely discussed. From my research, there are thousands of unique fasteners, and distinguishing one fastener from another requires knowledge of the pitch diameter, head type, etc. There are also gauges invented to identify fasteners, but how about a system which requires only a camera and a computer?
Would you like to have an automatic fastener identification system using a camera?
Do you think this system is important in your daily life or in the manufacturing/maintenance industry?
And lastly, do you have problems identifying fasteners?
I am looking for a video dataset for studies in the field of video processing.
I] Part I (Orientation Assignment to Keypoint)
In this process:
1. First I selected a 16 x 16 window around the keypoint and calculated the magnitude and orientation at each point in the 16 x 16 window.
2. Then I created a 36-bin histogram of orientations.
3. Then I assigned the mean value of the highest bin as the keypoint orientation (i.e. if the 1st bin (0-10) is the highest, then 5 is assigned as the orientation of the keypoint). (Is this correct?)
4. Then I calculated a 16 x 16 Gaussian window with a sigma value equal to 1.5 times the scale.
5. Then I multiplied the 16 x 16 magnitude matrix by the Gaussian window.
(What is the use of this multiplication?)
Is it required to multiply this result (magnitude x Gaussian) with the orientation before assigning the orientation to the keypoint? (I found some histogram bins with the highest count but a low magnitude value.)
As per my logic, we should assign to the keypoint the mean orientation of the bin whose accumulated magnitude is highest.
6. Then I transformed (rotated) the coordinates of the keypoint, i.e. the x, y position of the keypoint, with respect to the assigned orientation using a 2D transformation. (Is this correct?)
7. Then I transformed the orientations of all sample points in the 16 x 16 window according to the keypoint orientation (e.g. if the keypoint orientation is 5 and a sample point orientation is 270, it becomes 275). (Is this correct?)
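For comparison with the standard formulation, here is a small NumPy sketch of the orientation assignment (Gaussian-weighted magnitudes accumulated into a 36-bin histogram, peak bin giving the keypoint orientation). It follows the steps above, and the answer to the multiplication question is visible in the code: each pixel votes into the histogram with its Gaussian-weighted magnitude rather than with a plain count, so nearby, strong gradients dominate the vote:

```python
import numpy as np

def keypoint_orientation(patch, scale):
    """patch: 16x16 grayscale neighborhood centered on the keypoint."""
    # Gradient magnitude and orientation at each pixel.
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 360.0

    # Gaussian weighting window, sigma = 1.5 * scale, centered on the patch.
    ys, xs = np.mgrid[0:16, 0:16] - 7.5
    sigma = 1.5 * scale
    weight = np.exp(-(xs ** 2 + ys ** 2) / (2 * sigma ** 2))

    # 36-bin histogram: each pixel votes with its Gaussian-weighted magnitude.
    hist, edges = np.histogram(ang, bins=36, range=(0, 360),
                               weights=mag * weight)
    peak = np.argmax(hist)
    return (edges[peak] + edges[peak + 1]) / 2.0   # bin center, e.g. 5 for bin 0-10

patch = np.random.default_rng(0).random((16, 16))
print(keypoint_orientation(patch, scale=2.0))
```

Lowe's full method additionally interpolates the peak position with a parabolic fit and creates extra keypoints for any secondary peaks above 80% of the maximum.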
The boundary of the image has been generated. From the boundary image I want to generate the datum points between the index and middle fingers and between the ring and little fingers.
Do flying saucers contain giant wheels as a centrifuge (e.g. to create artificial gravity by spinning at high speed)?
Is the shape of a saucer (i.e. the shape of a magnifying or convex lens) the ideal shape for deflecting space debris (e.g. to minimize damage)?
If mankind wishes to travel to nearby planets such as Mars, don't we need to study the reasons for, or possible advantages of, the saucer shape?
I am not saying aliens travelled to Earth. But we all know that the most popular shape for the UFO is the flying saucer.
I would like to know the pros and cons, and the thoughts of those who have investigated this more. I am just a curious bystander. I saw a short piece on the return of a US astronaut after spending nearly one year in space. The news also mentioned that it would take about one year just to reach Mars.
This is the weekend, so I wish to explore something fun and interesting. If a UFO contains a giant wheel/centrifuge, how many hours a day would we need to run the wheel/centrifuge to maintain healthy bone mass density?
Of course, it is possible to run the wheel/centrifuge at different speeds in order to exert different weights (e.g. ranging from 0.5 G to 1.5 G). Such power consumption could be met by a mini nuclear power plant; I am sure such advanced civilizations could have developed one.
Best Regards,
Raju Chiluvuri
I am working on finger recognition. For each image containing a hand, I have the human-labeled ground truth and the result computed by the algorithm for each fingertip coordinate. Right now I would like to calculate the error between the algorithm's result and the ground truth.
Before calculating the error, I believe I need to match the fingertips into pairs. My intuitive method is first to generate fingertip blocks, and then, for each fingertip block recognized by the algorithm, use SSIM to find the nearest block in the labeled data.
Could you give me more suggestions for the corresponding fingertips matching procedure?
Thank you so much for your great help!
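One suggestion: since both the ground truth and the algorithm output are fingertip coordinates, the pairing can be done directly on Euclidean distances with the Hungarian algorithm (optimal one-to-one assignment), without going through image blocks and SSIM. A SciPy sketch with made-up points:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

# Hypothetical fingertip coordinates (x, y) for one image.
ground_truth = np.array([[10, 50], [40, 20], [70, 18], [100, 22], [130, 48]])
detected = np.array([[12, 52], [72, 15], [38, 25], [128, 50], [99, 30]])

# Cost matrix of pairwise Euclidean distances, then optimal assignment.
cost = cdist(ground_truth, detected)
gt_idx, det_idx = linear_sum_assignment(cost)

errors = cost[gt_idx, det_idx]
print(list(zip(gt_idx, det_idx)))      # matched pairs
print("mean localization error:", errors.mean())
```

Pairs whose distance exceeds some gating threshold can afterwards be counted as misses / false detections rather than matched fingertips.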
I have mixed some detection and tracking algorithms to do multiple pedestrian tracking. At this stage I have my results and I can visually see the tracked pedestrians. However, I don't know how to evaluate my results to show how good my method works. Do you have any suggestions?
Thank you.
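The usual way to report this is the CLEAR MOT metrics (MOTA, MOTP, ID switches), for example with the py-motmetrics package; the sketch below uses a toy two-frame example with made-up IDs and positions, and assumes ground-truth annotations exist for your sequences:

```python
import numpy as np
import motmetrics as mm   # pip install motmetrics

acc = mm.MOTAccumulator(auto_id=True)

# Toy example with two frames; in practice the ground-truth IDs/positions come
# from your annotations and the hypothesis IDs/positions from your tracker.
frames = [
    (["g1", "g2"], np.array([[10, 10], [50, 50]]),
     ["h1", "h2"], np.array([[11, 11], [80, 80]])),
    (["g1", "g2"], np.array([[12, 12], [52, 52]]),
     ["h1", "h2"], np.array([[13, 13], [53, 53]])),
]

for gt_ids, gt_pos, hyp_ids, hyp_pos in frames:
    # Squared Euclidean distances, gated so far-away pairs cannot be matched.
    dists = mm.distances.norm2squared_matrix(gt_pos, hyp_pos, max_d2=100.0)
    acc.update(gt_ids, hyp_ids, dists)

mh = mm.metrics.create()
summary = mh.compute(acc, metrics=["mota", "motp", "num_switches"], name="run")
print(summary)
```

Public benchmarks such as MOTChallenge use the same metric family, which makes comparison with published trackers straightforward.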
I am trying to detect a 3D model in a live video stream. The model should be detected from any face (viewpoint). How can I do that?
I'm a student with electrical/mechanical background, in my project I'm searching for a solution for a company who wants to start with 3D cameras for robotics.
At the moment I'm working with Matlab and it works great, the possibility to create your own GUI is a big plus.
But I read that Matlab is more for development purposes and is slower (overhead).
A second software package that I try to use is Halcon, at the moment I've no overview of the possibilities.
But it looks to me that you can program in Halcon's own language hdevelop or using their libraries in your own code (like C++).
Programming in hdevelop with its GUI seems to be easier/faster than low-level programming (e.g. C++), but I don't know the limitations.
A disadvantage is that there is no community for support, you need to use their documentation.
A third option I read a lot about is OpenCV, but with no low-level programming background this seems too ambitious for me.
I'm not searching the best solution for me, but for a company (although I know the company hasn't a lot of computer engineers).
I was hoping to find software with a good GUI to reduce low-level programming, Halcon seems to be the closest match.
Thanks for your help.
We run a small scoring shop for university exams. We use an optical recognition scanner and run sheets through it to score instructor-designed exams.
We have been asked to begin scoring multiple answer exams. Our current optical recognition software is good at scoring items with only one correct answer. However we are now being asked to score tests where students should indicate all items which are true, up to three correct options for one item.
Do any of you have a good system for tabulating the correct answer in this type of assessment? Thanks for your help.
Is there any difference in performing the novel object recognition test in open arena or in Y-Maze or in multiple chambers? Do they have different purpose?
What are we supposed to do during the recognition process in order to use sparse representation? In which part of the process should the sparse representation be done: skin detection, feature extraction, or classification of gestures? Are there any projects on a similar subject that could give me a view of how to go about it?
I have thought of implementing the HOG feature with temporal context for video data. Dalal and Triggs' HOG is for 2D images; I want to implement it for video sequences as a feature for human action recognition, where you compute the gradient in three directions (x, y and t) and follow the procedure of traditional HOG with some minor changes. Has anyone already used this technique? Is this concept worthwhile as an efficient feature?
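Variants of this idea have been published (e.g. spatio-temporal HOG/HOG3D-style descriptors for action recognition), so it is worth checking those before reimplementing. The gradient part itself is straightforward in NumPy; a sketch on a toy video volume of shape (t, y, x):

```python
import numpy as np

video = np.random.default_rng(0).random((16, 64, 64))   # toy clip: (t, y, x)

# Gradients along t, y and x in one call.
gt, gy, gx = np.gradient(video.astype(float))

# 3D gradient magnitude, plus the two angles describing the spatio-temporal
# gradient direction (orientation in the image plane and "tilt" towards time).
magnitude = np.sqrt(gx**2 + gy**2 + gt**2)
spatial_angle = np.arctan2(gy, gx)
temporal_angle = np.arctan2(gt, np.sqrt(gx**2 + gy**2))

# These would then be binned into orientation histograms per cell/block,
# following the usual HOG procedure with block normalization.
print(magnitude.shape, spatial_angle.shape, temporal_angle.shape)
```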
In object recognition, different work in this field is done with different test objects, so how can I compare the performance of my work with any existing method?
Ask for advice: Comparing the delay or retrieval activity in object color-based working memory with corresponding activity in object location-based WM.
I would be grateful if someone could give me some advice and relevant papers.
I have a skeletal model given by Kinect and now I want to label body parts using it. I know that for finding joint coordinates Kinect algorithm does this in its intermediate steps http://research.microsoft.com/pubs/145347/BodyPartRecognition.pdf
but is there a way to access that information, or can you suggest some other method/code to label body parts?
I have performed the open field, elevated plus maze and novel object recognition behavioral tests on my test mice in the CD1 background. Interestingly, these mice didn't show any change in the elevated plus maze test. However, in the open field they traversed the central area less, and this was significant. The same difference was present in the C57 background as well. In the novel object test, the test mice stayed more in the proximity of the old object than the novel object. I interpret this as mild anxiety, which may be driven by novelty-associated fear. However, there remains the question of why the mice show no changes compared to control mice in the elevated plus maze. Can someone help me with an alternative interpretation? It would be immensely helpful. I haven't performed the EPM in the C57 background; only the open field has been carried out.
Hello forum,
I have been reading about cascaded classifiers using haar-features for face detection and I have a few simple questions I have to ask/clarify. This is more towards implementation as I am a little confused as to how they work.
1) I understand that during the training phase, the haar features will be evaluated and rescaled for all possible combinations. At the end, the feature with the smallest error will form the first stage (attached picture). My question is, during the detection phase when a sub-window is selected for evaluation, will the features be placed at a specific region (like in the attached picture again) ?
For example, for the top left feature, it must always be positioned in the center leaving an empty space of 10% (of the width) to the left and right and be 30% (of the height) below.
Or will evaluation start at the top left hand corner (assuming origin), similar to training ? i.e. the feature will be evaluated over all the regions in the subwindow.
2) Regarding AdaBoost, I have understood the steps, but my question is: when the weights are updated after the nth iteration, is it possible that a feature that has already been selected gets selected again (i.e. it has the smallest error again)? Or will features/classifiers that have already been selected be "removed" from the subsequent selection process?
I am really loving computer vision. I will be undergoing this module in 10 weeks when semester starts but, I can't wait for so long to officially start learning what I love haha. Thanks all.
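On question 2: in plain discrete AdaBoost nothing removes a weak classifier from the pool after it has been chosen, so the same feature can in principle be selected again in a later round (possibly with a different threshold or weight); Viola-Jones implementations differ in whether they explicitly exclude already-used features. A toy NumPy illustration of the weight-update loop with fixed decision stumps, not the actual Viola-Jones code:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                      # 200 samples, 5 toy "features"
y = np.sign(X[:, 2] + 0.3 * rng.normal(size=200))  # labels in {-1, +1}
y[y == 0] = 1

w = np.full(len(y), 1.0 / len(y))                  # uniform initial weights
chosen = []

for round_ in range(10):
    best = None
    # Weak learners: a threshold-at-zero stump on each feature, both polarities.
    for f in range(X.shape[1]):
        for polarity in (1, -1):
            pred = polarity * np.sign(X[:, f])
            pred[pred == 0] = polarity
            err = np.sum(w[pred != y])             # weighted error (weights sum to 1)
            if best is None or err < best[0]:
                best = (err, f, polarity, pred)
    err, f, polarity, pred = best
    alpha = 0.5 * np.log((1 - err) / max(err, 1e-10))
    chosen.append(f)
    # Reweight: misclassified samples get heavier, then normalize.
    w *= np.exp(-alpha * y * pred)
    w /= w.sum()

print(chosen)   # the same feature index can appear more than once
```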
I've implemented SIFT:
1. Apply Gaussian blurring in every octave and compute the DoGs.
2. Find the local extrema.
But at this step I am confused: what should I do to find the keypoint from those 6 extrema?
Can someone explain the formula for keypoint localization, D(x)?
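The localization formula in Lowe's paper fits a quadratic to the DoG values around the discrete extremum: with gradient g and Hessian H of D at the sample point, the offset is x̂ = -H⁻¹ g and the interpolated value is D(x̂) = D + ½ gᵀ x̂; the extremum is rejected if |D(x̂)| is too small or the offset exceeds 0.5 in any dimension. A NumPy sketch on a 3x3x3 DoG neighborhood using finite differences (the 0.03 contrast threshold follows the paper and assumes image values in [0, 1]):

```python
import numpy as np

def localize(dog):
    """dog: 3x3x3 block of DoG values centered on a detected extremum,
    indexed as dog[s, y, x] with the extremum at (1, 1, 1)."""
    d = dog.astype(float)

    # First derivatives by central differences: (dD/ds, dD/dy, dD/dx).
    g = 0.5 * np.array([d[2, 1, 1] - d[0, 1, 1],
                        d[1, 2, 1] - d[1, 0, 1],
                        d[1, 1, 2] - d[1, 1, 0]])

    # Hessian by finite differences.
    H = np.empty((3, 3))
    H[0, 0] = d[2, 1, 1] - 2 * d[1, 1, 1] + d[0, 1, 1]
    H[1, 1] = d[1, 2, 1] - 2 * d[1, 1, 1] + d[1, 0, 1]
    H[2, 2] = d[1, 1, 2] - 2 * d[1, 1, 1] + d[1, 1, 0]
    H[0, 1] = H[1, 0] = 0.25 * (d[2, 2, 1] - d[2, 0, 1] - d[0, 2, 1] + d[0, 0, 1])
    H[0, 2] = H[2, 0] = 0.25 * (d[2, 1, 2] - d[2, 1, 0] - d[0, 1, 2] + d[0, 1, 0])
    H[1, 2] = H[2, 1] = 0.25 * (d[1, 2, 2] - d[1, 2, 0] - d[1, 0, 2] + d[1, 0, 0])

    offset = -np.linalg.solve(H, g)                  # x_hat = -H^{-1} g
    value = d[1, 1, 1] + 0.5 * g @ offset            # D(x_hat)

    keep = np.all(np.abs(offset) < 0.5) and abs(value) > 0.03
    return offset, value, keep

dog = np.random.default_rng(0).random((3, 3, 3))
print(localize(dog))
```

If the offset is larger than 0.5 in some dimension, the extremum is re-localized at the neighboring sample and the fit is repeated.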
I need to know about the steps of object recognition. Can anyone help me, with an example?