Object Detection - Science topic
Explore the latest questions and answers in Object Detection, and find Object Detection experts.
Questions related to Object Detection
Need suggestions for publishing my review paper in free SCI/ESCI journals.
1. Suggestions for adding novelty to my review paper on object detection
2. Suggestions on future developments and directions for object detection using deep learning
3. Ideas for graphical comparisons of the literature on object detection using deep learning
4. How can a state-of-the-art (SOTA) review paper be made insightful for readers?
Thank you; suggestions on any of these points are welcome.
Dear All,
I want to deploy deep learning networks, either classification networks or object detection networks, on an FPGA kit. Could anyone suggest which FPGA kit (kit name and approximate cost) is most suitable for deploying deep learning networks?
Which object detection architecture (CNN-based or Vision Transformer-based) can be used to detect small objects?
How can YOLOv8 be used to obtain the contours of objects and calculate the area of each one?
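One possible route, sketched below: the Ultralytics YOLOv8 segmentation variant returns per-object masks, from which OpenCV can extract contours and pixel areas. The model weights and image path here are placeholders, and masks may need rescaling to the original image resolution before areas are interpreted.
```
# A minimal sketch using the Ultralytics YOLOv8 segmentation API
# (assumed installed via `pip install ultralytics`); paths are placeholders
import cv2
import numpy as np
from ultralytics import YOLO

model = YOLO("yolov8n-seg.pt")          # segmentation variant of YOLOv8
results = model("image.jpg")

for result in results:
    if result.masks is None:
        continue
    for mask in result.masks.data:      # one binary mask per detected object
        m = (mask.cpu().numpy() * 255).astype(np.uint8)
        # note: masks are at the network's input resolution, not the original image's
        contours, _ = cv2.findContours(m, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        for c in contours:
            print("object area (pixels):", cv2.contourArea(c))
```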
The artist Salvador Dali was a master of embedding images into his paintings to challenge perception (Fig. 1, from fig. 13-21 of Schiller and Tehovnik 2015). The head of Voltaire in the painting is composed of three Nuns. How you view the image determines whether you see Voltaire, by fixating his nose, or the Nuns, by fixating their heads. Many bistable images oscillate depending on what part of an object is foveated by the eyes. Indeed, IT cortex must have access to eye-position information with respect to the details of an object (ultimately in three dimensions) (Ingle 1973). This information in its entirety (including the eye movements) is required when learning about novel objects (Hebb 1949; Yarbus 1967), so that the first time you experience the face of a new person it is immediately added to your library of stored faces. We believe that an individual neuron in IT cortex, which can be connected to a network of over 1,000 follower neurons (a unit of declarative consciousness), is sufficient to store a new representation immediately (Tehovnik, Hasanbegović, Chen 2024). To test this supposition, we now have the understanding to disrupt individual neurons in the neocortex using the method of Ojemann/Penfield (Ojemann 1983, 1991; see Fig. 7 of Tehovnik et al. 2009; also see Houweling and Brecht 2008), which should be able to erase the memory of a new face upon direct electrical stimulation of the neuron occupying the center of one unit of declarative consciousness.

I have a dataset with piles of fish, and I'd like to implement an object detection algorithm using rotated bounding boxes. Any suggestions with available (GitHub) code?
Computer vision tasks, such as object detection, have traditionally relied on labeled image datasets for training. However, this approach is limited to detecting only the set of classes present in the training data. Zero-shot object detection (ZSD) is a breakthrough in computer vision that allows models to detect objects in images based on free-text queries, without the need for fine-tuning on labeled datasets.
This capability has significant implications for businesses, as it enables more flexible and adaptable computer vision systems. In this blog post, we will explore how zero-shot object detection is changing computer vision tasks in business and discuss some of the key benefits and challenges associated with this technology.
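As a concrete illustration of the idea, here is a minimal sketch using the Hugging Face transformers zero-shot object detection pipeline with an OWL-ViT checkpoint; the image path and candidate labels are placeholders.
```
# A minimal zero-shot detection sketch with Hugging Face transformers (OWL-ViT)
from transformers import pipeline

detector = pipeline("zero-shot-object-detection", model="google/owlvit-base-patch32")
# free-text queries: no fine-tuning on labeled data for these classes
predictions = detector("street.jpg", candidate_labels=["car", "traffic light", "dog"])
for p in predictions:
    print(p["label"], round(p["score"], 3), p["box"])
```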
Computers, Materials & Continua's new special issue, "Deep Learning based Object Detection and Tracking in Videos", is now open for submission.
Link: https://www.techscience.com/cmc/special_detail/object-detection
Guest Editors
Dr. Sultan Daud Khan, National University of Technology, Pakistan.
Prof. Saleh Basalamah, Umm Al-Qura University, Saudi Arabia.
Dr. Farhan Riaz, University of Lincoln, UK.
Summary
Object detection and tracking in videos has become an increasingly important area of research due to its potential applications in a variety of domains such as video surveillance, autonomous driving, robotics, and healthcare. With the growing popularity of deep learning techniques, computer vision researchers have made significant strides in developing novel approaches for object detection and tracking.
This special issue will provide a platform for researchers to present their latest findings, exchange ideas, and discuss challenges related to object detection and tracking in videos. We invite original research articles, reviews, and surveys related to this topic. Additionally, this issue will also welcome topics on action recognition, anomaly detection, and behavior understanding in videos.

How do LiDAR sensor resolution and point cloud sparsity affect object detection and classification in autonomous driving, and what filtering techniques can be applied to improve perception in challenging environments such as rain or fog?
In this project, we are trying to build a model that detects maize and weeds in photos taken from 1 meter above, pointing vertically at the ground. In fact, it is enough for the model to predict MAIZE or NOT-MAIZE, but the weeds can be quite different from each other morphologically.
My opinion is that we must annotate each species as a different class, but I would like confirmation of whether this is obligatory.
(Most probably we will use YOLOv5 or YOLOv8.)
Thank you.
I have carried out work on object detection, specifically pedestrian detection, using YOLO v1, v2, v3, v4 and tiny YOLO v1, v2, v3 and v4, and have proposed a new variant of the YOLO algorithm. The employed and proposed variants of YOLO have been evaluated using precision, recall, F1 score, per-class AP, and mAP.
I submitted this work to a journal, and in response a reviewer has asked me to carry out statistical analysis using the Friedman Aligned Rank Test, Wilcoxon Test, Quade Test, etc., and report the p-values.
Is it possible to apply such tests to an object detection problem? Here, training and testing have been done on images. How should such statistical tests be applied? Please guide.
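Such tests are commonly applied by treating each test image (or each class) as one paired sample and comparing a per-sample score, such as per-image AP, across the detector variants. A minimal SciPy sketch with hypothetical AP values:
```
# Hypothetical per-image AP scores for three detector variants on the same test images
from scipy.stats import friedmanchisquare, wilcoxon

ap_v3   = [0.62, 0.58, 0.71, 0.66, 0.60, 0.69, 0.64, 0.57]
ap_v4   = [0.65, 0.61, 0.70, 0.69, 0.63, 0.72, 0.66, 0.60]
ap_ours = [0.68, 0.63, 0.74, 0.71, 0.66, 0.75, 0.69, 0.62]

stat, p = wilcoxon(ap_v3, ap_ours)          # paired comparison of two variants
print("Wilcoxon p-value:", p)

stat, p = friedmanchisquare(ap_v3, ap_v4, ap_ours)   # all variants at once
print("Friedman p-value:", p)
```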
I have read more papers and found wording such as "real-time detection of people from the RGB-D image"
or "robust detection and identification of a person in real time (around 0.3 s)".
What are the most recent and best models for zero-shot object detection?
It is known that deep learning can be used to improve the accuracy of object detection in UAVs for shark spotting and surveillance by leveraging powerful machine learning algorithms such as convolutional neural networks (CNNs). CNNs can be trained to detect objects in an image or video with high accuracy, allowing UAVs to identify sharks in real time. Additionally, techniques such as transfer learning can be used to reduce the amount of data needed to train the model and increase its accuracy. What specific techniques can be utilized to enhance that detection accuracy underwater?
Hi,
I am using the Mapillary mly.interface.get_detections_with_image_id interface from the SDK, but it seems it is not returning correct object detections for many image IDs.
Does anyone have experience with, or a recommendation for, the best algorithm for object detection / instance segmentation on images retrieved from Mapillary?
mly.interface.get_detections_with_image_id(image_id=246770667235908)
What is the best computer vision or CNN model for detecting very small objects in high-resolution images?
Hello everyone! I am searching for a suitable dataset for a YOLOv5 application: an object detection dataset with several kinds of balls and suitable annotations.
Has anyone encountered errors when using the recognition part of "Scene Text Detection and Recognition by CRAFT and a Four-Stage Network"?
My team and I would like to improve a system that uses three HC-SR04 sensors to detect obstacles for an autonomous agricultural truck. We use an Arduino for the implementation.
The question is: with only HC-SR04 sensors, is it possible to estimate the velocity and size of an object?
EfficientDet outputs classes and bounding boxes. My question concerns both, but I am specifically interested in the class prediction net. The paper's diagram shows two conv layers, and I don't understand the code and how it works. Also, what is the difference between the two conv layers of the classification net and those of the box prediction net?

Hello all,
I am trying to do object detection using point clouds. The constraint is that I cannot use a machine learning approach, because I do not have enough data to train a network and the object is not a usual class found in pre-trained networks.
I want to proceed with a traditional approach. I have come across segmentation and clustering approaches for obtaining the points that represent the desired object in the 3D point cloud. I have also tried RANSAC to find the plane on which the object lies, in order to eliminate outlier points.
I am still looking for any other ideas which can serve the purpose. I would be really grateful if I can find some new ideas.
Thank you very much!
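For the traditional pipeline described above, one hedged option is to combine RANSAC plane removal with Euclidean (DBSCAN) clustering, as sketched below with Open3D; the file name and thresholds are placeholders that depend on your data's scale.
```
# A minimal Open3D sketch; file name and thresholds are placeholders
import numpy as np
import open3d as o3d

pcd = o3d.io.read_point_cloud("scene.pcd")

# RANSAC plane fit to find and remove the supporting surface
plane_model, inliers = pcd.segment_plane(distance_threshold=0.01,
                                         ransac_n=3, num_iterations=1000)
objects = pcd.select_by_index(inliers, invert=True)

# Euclidean (DBSCAN) clustering of the remaining points into object candidates
labels = np.array(objects.cluster_dbscan(eps=0.02, min_points=10))
print("clusters found:", labels.max() + 1)
```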
I have a dataset, and I intend to use a multi-label learning approach to recognize the various objects present in the images. What are the appropriate segmentation approaches for this multiple-object detection task?
For many smart road applications, object detection and recognition is one of the most important components. Indeed, precise detection of road objects is a critical task for autonomous urban driving and robotics technologies.
We are currently designing a smart system that detects and blurs undesirable road objects in order to anonymize and protect road users.
In this context, we want to know the up-to-date methods (neural approaches, etc.) used for the detection and recognition of road objects (humans, vehicles, license plates, etc.). Any ideas?
Greatly appreciate your review
When training a CNN for object detection, it is supplied with images containing examples of the target object (or list of objects) together with their coordinates within each image.
But what happens if you don't supply the coordinates of all the objects in the training images? Then some objects in your training images match the pattern the CNN is looking for but are effectively marked as negatives.
I suspect this has a mostly negative impact, but are there any resources/papers that examine this issue in detail?
Thanks
An ONNX-based YOLOv5s model runs perfectly when applied to still images for object detection. But when the same model runs on a live stream, objects are detected but the bounding boxes keep flickering, and I want them to be stable. Can anyone help me with this? What causes such flickering, and what is the solution?
Thanks in advance!
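Flickering usually comes from frame-by-frame detection with no temporal state: confidences hover around the threshold and box coordinates jitter. A common remedy is to smooth boxes over time, or to use a tracker such as SORT/DeepSORT for multiple objects. A minimal sketch of exponential moving average smoothing for a single tracked box, with hypothetical coordinates:
```
# A minimal sketch of exponential moving average smoothing for one tracked box
import numpy as np

class BoxSmoother:
    def __init__(self, alpha=0.4):
        self.alpha = alpha      # closer to 1 = more responsive, less smooth
        self.state = None       # last smoothed box (x1, y1, x2, y2)

    def update(self, box):
        box = np.asarray(box, dtype=float)
        self.state = box if self.state is None else \
            self.alpha * box + (1 - self.alpha) * self.state
        return self.state

smoother = BoxSmoother()
for frame_box in [(100, 50, 220, 180), (103, 48, 224, 183), (98, 52, 218, 179)]:
    print(smoother.update(frame_box))   # hypothetical per-frame detections
```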
Thinking about transfer learning:
Can we take some features from one model and some from another, and make a single model perform both tasks?
> For example, suppose we have person detection from a YOLOv3 model_person.weight and face detection from a YOLOv3 model_face.weight. Can we combine them into a single model that detects both persons and faces?
I need to calculate accuracy, precision, recall, specificity, and F1 score for my Mask R-CNN model, so I hoped to first compute the confusion matrix over the whole dataset to get the TP, FP, TN, and FN values. But almost all available solutions for computing the confusion matrix only output the TP, FP, and FN values. How can I calculate metrics such as accuracy and specificity, which require TN as a parameter? Should I treat TN as 0 in the calculations?
I also need to know whether I should calculate the above metrics for every available trained weight file over all the test data, or only for the best weight file I have chosen as my final model. I ask because, in the reference below, the author of the repository calculates both metrics using every weight file on the complete test set.
Please refer to line 152.
I need to know whether I should focus on the number of labels or the number of images to increase model performance. In my case, a single image can contain multiple annotated labels. I know that increasing the number of training images will improve model accuracy, but should I pay more attention to maximizing the number of labels in the dataset or the number of images?
Example:
Case 1: each image contains 4 annotated labels (2 for ClassA, 2 for ClassB); 60 images in total.
Case 2: each image contains 1 annotated label (e.g., image_b_1 has 1 for ClassA, image_b_2 has 1 for ClassB); 200 images in total.
Which case will give the maximum accuracy during training?
Hello dear researchers.
I ran the SiamFC++ algorithm. It showed better results than the other algorithms, but its accuracy is still not acceptable. One of my ideas is to run an object detection algorithm first and then draw a bounding box around the object I want to track. I think that by doing this I have indirectly labeled the selected object, so the results and accuracy should improve compared to the previous case. Do you think this is correct? Is there a better idea to improve the algorithm?
Thanks for your tips
Hi everybody,
I am training a Tensorflow model that starts out normally, but whose loss then rapidly increases after about 16,900 training steps. I have written a question about this in an AI Stack Overflow post:
Would somebody take a look at this question and give me some feedback on what could be going on? Do I just need to terminate training at around 16,900 steps, or is something else happening?
I'm using Python 3.9, Tensorflow 2.6.0, Ubuntu 20.04, and following this tutorial. The tf.record files have just been generated, and I'm trying to get my training to start. The .config files were downloaded from the Tensorflow 2 Detection Model Zoo (https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md).
While trying to kick off training for my Tensorflow model with the following code...:
python model_main_tf2.py --model_dir=path_to_model_dir/my_ssd_resnet50_v1_fpn --pipeline_config_path=path_to_model_dir/my_ssd_resnet50_v1_fpn/pipeline.config
...I received this error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [[0.0792563558][0.286692768][0.342465758]] [[0.000978473574][0.365949124][0.695694685]]
This Stack Overflow post (https://stackoverflow.com/questions/62075321/tensorflow-python-framework-errors-impl-invalidargumenterror-invalid-argument) outlines the error nicely, but I'm still needing assistance in solving it. I have checked, and there are no invalid entries (min/max values outside of the photo's dimensions, and no negative values). The most likely culprit is that several of the bounding boxes were drawn in a reverse order so that some of the min bounding box dimensions are larger than the max dimensions, as discussed by the post. I need to correct the bounding box dimensions before generating the tf.record files (for both my training and testing datasets).
I have labeled my images with LabelImg in Pascal VOC format, so each image has an XML file with it (I have roughly 10,900 of them). Is there a way to convert my XML files to a CSV document and THEN generate the tf.record files?
Once the CSV files are generated, I can correct the ordering of my min and max columns with the following code. I'm hoping to use these corrected csv files to generate the tf.record files.
# Corrects reversed min/max bounding-box columns in the CSV files generated
# when producing tf.record files; can be run line by line from an interpreter
import pandas as pd

# Open the CSV file of annotations
df = pd.read_csv("C:\\desired_directory\\train.csv")

# Swap values where needed so that min <= max in both x and y
df = df.assign(
    xmin=df[['xmin', 'xmax']].min(axis=1),
    xmax=df[['xmin', 'xmax']].max(axis=1),
    ymin=df[['ymin', 'ymax']].min(axis=1),
    ymax=df[['ymin', 'ymax']].max(axis=1),
)

df.to_csv("C:\\desired_directory_to_save_csv_file\\train.csv", index=False)
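For the XML-to-CSV step asked about above, a minimal sketch of parsing Pascal VOC annotations into one CSV is below (directory and file names are placeholders); it mirrors the xml_to_csv scripts used in common TF Object Detection tutorials.
```
# A minimal sketch of converting Pascal VOC XMLs to one CSV; paths are placeholders
import glob
import xml.etree.ElementTree as ET
import pandas as pd

rows = []
for xml_file in glob.glob("annotations/*.xml"):
    root = ET.parse(xml_file).getroot()
    filename = root.find("filename").text
    for obj in root.findall("object"):
        bbox = obj.find("bndbox")
        rows.append({
            "filename": filename,
            "class": obj.find("name").text,
            "xmin": int(float(bbox.find("xmin").text)),
            "ymin": int(float(bbox.find("ymin").text)),
            "xmax": int(float(bbox.find("xmax").text)),
            "ymax": int(float(bbox.find("ymax").text)),
        })

pd.DataFrame(rows).to_csv("train.csv", index=False)
```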
I am training an object detection model, and my annotations are highly unbalanced across classes. I have almost 11,000 images, all with dimensions of 1024 x 1024.
Within those images I have the following number of annotations:
*Class 1 - 40,000
*Class 2 - 25,000
*Class 3 - 900
*Class 4 - 500
This goes on for a few more classes.
Because this is an object detection dataset annotated with the tool LabelImg, there are often multiple annotations per photo. Do any of you have recommendations on how to fine-tune an object detection algorithm on an unbalanced dataset? Currently, collecting more imagery is not an option. I would augment the images and re-label, but since there are multiple annotations per image, I would also be increasing the number of annotations for the larger classes.
Note: I'm using the Tensorflow Object Detection API and have downloaded the models and .config files from the Tensorflow 2 Detection Model Zoo.
Thanks everybody!
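One common option, sketched below under the assumption that your annotations can be summarized as a per-image list of class ids: oversample images containing rare classes when sampling each epoch, rather than duplicating and re-labeling files. This still re-samples co-occurring common classes, so it is often paired with a class-weighted or focal loss.
```
# A minimal sketch, assuming annotations reduce to per-image class-id lists
import random
from collections import Counter

annotations = {                # hypothetical image -> contained class ids
    "img_001.jpg": [1, 1, 3],
    "img_002.jpg": [2],
    "img_003.jpg": [4, 1],
}

class_counts = Counter(c for ids in annotations.values() for c in ids)

def image_weight(ids):
    # weight each image by its rarest class, so rare-class images are drawn more often
    return max(1.0 / class_counts[c] for c in ids)

images = list(annotations)
weights = [image_weight(annotations[img]) for img in images]
print(random.choices(images, weights=weights, k=10))   # one epoch's sampling sketch
```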
Dear colleagues, I am working on an object detection project with bounding box regression (in Keras/TensorFlow), but I can't find decent sources and code samples. My task involves 200 classes, and each picture can contain 10 to 20 classes that need to be detected. It is a medical task: more precisely, the recognition of types of medical blood tests.
I have found some works that are close to mine, but I want to implement my task for the Russian language and on the Keras & TensorFlow framework. I am sincerely grateful for any advice and support!
I have captured a depth image from an Intel RealSense depth camera and saved it in RGB-D format. I want to detect the change in position of a ball in the picture. Because the image is represented in depth format, it is different from a 2D image. How can I do this? Thanks!
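One hedged classical approach, with hypothetical thresholds and file name: treat the depth map as a single-channel image, isolate the depth band the ball occupies, and track the centroid of the largest blob across frames.
```
# A minimal sketch, assuming the depth image loads as a 16-bit single-channel array
import cv2
import numpy as np

depth = cv2.imread("depth.png", cv2.IMREAD_UNCHANGED).astype(np.float32)

# keep only the depth band the ball occupies (thresholds in sensor units, hypothetical)
mask = cv2.inRange(depth, 400, 800)

contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
if contours:
    ball = max(contours, key=cv2.contourArea)      # assume the largest blob is the ball
    (x, y), r = cv2.minEnclosingCircle(ball)
    print("ball centre:", (x, y), "radius:", r)    # compare centres across frames
```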
Hi Everyone,
I'm currently building an object detection model that should detect cars, persons, trucks, etc. in both daytime and nighttime. I have started gathering data for both conditions, but I'm not sure whether to train one model for daylight and a separate model for night, or to combine the data and train a single model.
Can anyone suggest a data distribution for each class across day and night? I presume it should be a uniform distribution; please correct me if I'm wrong.
E.g., for person: 700 images in daylight and another 700 images at night.
Any suggestion would be helpful.
Thanks in Advance.
Do any semi-supervised deep learning techniques or architectures exist for object detection?
I have been working on object detection with YOLO for my master's thesis. I have done some experiments and am trying to understand the architecture, but it is very difficult to understand it from a single source. Can somebody please point me to good resources on this? I am very grateful in advance.
Hi,
I am using the Tensorflow Object Detection API with a Faster R-CNN architecture and ResNet152 for training and object detection. Since I have a small number of train/validation images, the obtained mAP@0.5 is low (~0.6). Do you know any good documentation or videos describing how to use transfer learning in the Tensorflow Object Detection API? I would like to try this to see if it increases accuracy.
Thanks
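In the TF2 Object Detection API, transfer learning is configured in pipeline.config rather than in code: point fine_tune_checkpoint at the downloaded pre-trained checkpoint and set the checkpoint type. A minimal fragment with a placeholder path, assuming a model from the TF2 Detection Model Zoo:
```
train_config {
  fine_tune_checkpoint: "pre-trained-model/checkpoint/ckpt-0"
  fine_tune_checkpoint_type: "detection"
}
```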
Hello,
I want to know how to use the XML files in object detection datasets:
how to read them, how to store their information, and finally how to use them.
Any information or link will be helpful.
Thanks.
Hello everyone,
I am working on a project that includes object recognition, currency recognition, text-to-speech, and locating the user. To perform recognition of Pakistan's currency, I would need a lot of data and computing resources to train a model, but unfortunately I don't have access to either of these.
So, I just wanted to know whether there is an open-source pre-trained model that I could use for my project?
Any help would be appreciated.
Thank you.
Hi everyone, I have two datasets of 24 images each (sampled from the DIV2K dataset).
I want to use one dataset to test a face detection application (YOLO face detection) and the other to test an object detection application (YOLO object detection).
I am wondering whether there is a solution for labeling these images before using them.
Thank you all.
Dear all,
Most research articles compare and tabulate the different methods and metric values in their results & discussion sections. Do the authors really run each method themselves before reporting it in the article?
I am looking to work on a multi-scale CNN model for traffic sign detection and recognition. I would appreciate your valuable suggestions and input for my next steps.
I am looking at training the Scaled YOLOv4 on TensorFlow 2.x, as can be found at this link ( https://github.com/wangermeng2021/Scaled-YOLOv4-tensorflow2).
I plan to collect the imagery, annotate the objects within the image in VOC format, and then use these images/annotations to train the large-scale model. If you look at the multi-scale training commands, they are as follows:
```
python train.py --use-pretrain True --model-type p5 --dataset-type voc --dataset dataset/pothole_voc --num-classes 1 --class-names pothole.names --voc-train-set dataset_1,train --voc-val-set dataset_1,val --epochs 200 --batch-size 4 --multi-scale 320,352,384,416,448,480,512 --augment ssd_random_crop
```
Since Scaled YOLOv4 (like any YOLO algorithm) prefers image dimensions divisible by 32, I plan to use larger images of 1024x1024. Is it possible to modify the ```--multi-scale``` argument to include larger dimensions such as 1024, and have the algorithm run successfully?
Here is what it would look like when modified:
```
--multi-scale 320,352,384,416,448,480,512,544,576,608,640,672,704,736,768,800,832,864,896,928,960,992,1024
```
What is the difference between easy positives/negatives and hard positives/negatives in object detection, specifically in the context of focal loss?
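Briefly: an example is "easy" when the model already assigns a high probability to its true class (an obvious background patch, a clearly visible object) and "hard" when that probability is low (an ambiguous background patch scored as an object). Focal loss down-weights the easy examples by the factor (1 - p_t)^gamma. A minimal NumPy sketch of the binary form from Lin et al. (2017):
```
# A minimal sketch of binary focal loss: FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t)
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    p = np.clip(p, eps, 1.0 - eps)
    p_t = np.where(y == 1, p, 1 - p)              # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    return -alpha_t * (1 - p_t) ** gamma * np.log(p_t)

# an easy negative (scored 0.01) contributes far less than a hard negative (scored 0.9)
print(focal_loss(np.array([0.01, 0.9]), np.array([0, 0])))
```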
I'll need to research this, but I am a beginner and need to learn from the start, so I hope someone can give me a starting point. I've searched a lot, and there are many ways of doing this, but in what ways can it still be improved? I hope someone can help me with this.
I am doing research in the robotics field, which I have recently started. Do you know any free robotic-arm simulation software that could help me simulate and design an environment for a pick-and-place application? I also want to use a Pixy camera, which helps with object detection. The software should support the Pixy cam 2, the Universal Robots UR5 arm, and their integration. Thanks.
I am trying to find a dataset which meets any of the following requirements.
1. Dataset of RGB or RGBD images with object bbox annotations, 6d pose annotations and grasp annotations.
2. Dataset of RGB or RGBD images with object bbox annotations and grasp annotations.
I have worked with the Cornell grasp dataset and the LineMOD 6D object pose estimation dataset separately. I am trying to build a unified model that does all these tasks together, and I am trying to find an available dataset for this.
Hello everyone,
I need to annotate MRI images (in .nii.gz format). Each image includes several slices. What I need is an annotation tool that can annotate the area of interest and propagate it across all slices (considering that the location of the object changes between slices).
Thank you all in advance.
Hello, I am trying to understand various approaches to 3DOD (3D object detection) and to figure out the most tested one.
I aim to develop a detector for my rover to help it detect the pose of a custom object outdoors (a table, for instance). The object is not already in the KITTI dataset, so it requires training from scratch. As I do not have access to a 3D LiDAR, I can use a stereo or monocular camera.
I have come across various implementations like [RTM3D](https://github.com/maudzung/RTM3D) and other [methods that use geometry and deep learning](https://github.com/skhadem/3D-BoundingBox).
None of these methods explain how to train a new detector for a custom object. One can observe that they need a data structure similar to KITTI; the rest is trial and error. I am looking for methods that have been validated, to save time. Any help is appreciated. Thanks :)
We are doing a project in which we detect runway debris (using YOLOv4) with a fixed camera mounted on a pole at the side of the runway. We want to find the position of a detected object with respect to the runway surface. Does anyone know of an algorithm or technique that will help us find the position of the object with respect to the runway dimensions?
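If the camera is fixed and the debris lies on the (roughly flat) runway plane, a standard technique is a one-time planar homography calibration: mark four points on the runway with known ground coordinates, then map any detected pixel into runway coordinates. A minimal OpenCV sketch with hypothetical calibration values:
```
# A minimal sketch: one-time homography calibration; all values are hypothetical
import cv2
import numpy as np

# four runway points: pixel positions and their known ground coordinates (metres)
img_pts = np.float32([[210, 700], [1100, 690], [840, 310], [420, 315]])
runway_pts = np.float32([[0, 0], [46, 0], [46, 300], [0, 300]])

H = cv2.getPerspectiveTransform(img_pts, runway_pts)

# map a detected object's pixel centre to runway coordinates; this is valid
# only for objects lying on the runway plane
detection = np.float32([[[655, 480]]])
position = cv2.perspectiveTransform(detection, H)
print("position on runway (m):", position[0][0])
```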
I am developing a machine-learning model for accurate and fast counting of metal pipes with different cross-sectional shapes. Well-defined rectangular, triangular, and circular shapes are quite manageable, but the C-shaped metal is really complicated, especially when pieces overlap one another as shown in the attached photo. Does anyone have a suggestion for a model that can count overlapping objects? Thanks in advance.

I was thinking about the small object detection problem, and while searching I found a reasonable number of papers that use Faster R-CNN to address this issue. Why do they choose Faster R-CNN instead of other state-of-the-art methods? I want to know the technical reasons. Thanks in advance.
I implemented an image classifier and an object detection model. I add a new class to my model every day, and the dataset keeps growing too. I wanted to ask whether anyone has had the same experience. For now it is working fine; any suggestions about problems I might face later?
What can the different parameters of comparison be, and can it be used for face recognition or speech recognition as well? I agree there are papers published, but I have not managed to find genuine parameters to justify the optical flow method for object or moving-object detection.
I have tried changing the hue parameter to higher values and thereby obtained a higher mAP, but in later cases I think overfitting became a problem, and the highest-mAP model's detections were less accurate. Please share ideas if you are working on a similar design.
For my Ph.D., please suggest some research areas/problem statements in object detection.
Thanks in advance.
Hi, I am doing object detection. To improve my detection results, I want to do hard negative mining to minimize false detections, but I don't know how. Can someone explain it, ideally with MATLAB/C/C++ code?
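The idea is language-agnostic: run the current detector over images (or regions) known to contain no target objects, collect its confident false positives, and add them to the training set as explicit negatives before retraining. A minimal Python sketch of the mining loop, where detector is a hypothetical wrapper returning (box, score) pairs; the same structure ports directly to MATLAB or C++:
```
# A minimal sketch of hard negative mining; `detector` is a hypothetical
# wrapper that returns a list of (box, score) pairs for an image
def mine_hard_negatives(detector, negative_images, score_threshold=0.5):
    hard_negatives = []
    for image in negative_images:
        for box, score in detector(image):
            # a confident detection on a background-only image is a hard negative
            if score >= score_threshold:
                hard_negatives.append((image, box))
    return hard_negatives

# retrain with these regions labeled as background, then repeat until
# the false-positive rate stops improving
```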
In the research paper "Histograms of Oriented Gradients for Human Detection", the images used are 64x128 pixels. Do we need to crop our image dataset to apply the histogram of oriented gradients?
Is it possible to determine any physical law from a moving object using a deep learning model, or can I train my model to detect it?
I have tried the tabula-py library and the Java tool so far, but they result in many false positives (i.e., reporting that a table is present when that is not the case).
Some of the cases were:
content 1 content 3
content 2 content 4
If text is written in the above manner, then it is also marked as tabular data. Is there any solution that does the task better and handles this problem (including deep learning or other techniques)?
Hi,
as deep learning is a data-driven approach, having quality data is crucial. A lot of datasets are available for free, but they differ in the quality of their labels.
I'm now working on an index which can tell a researcher the quality of a dataset's labels, so that the researcher may decide whether the dataset is useful or not. I have established a pipeline to produce such an index in a fully autonomous way. Note that I'm focusing on object detection tasks only, i.e., labels given as bounding boxes.
The question is: does such an index already exist? I googled a lot and found nothing. It would be nice to compare our approach with existing ones.
I am trying to implement object detection using C++. I have hyperspectral data in the form of a .bil file along with a header file. Can anyone help me with accessing .bil files using C++?
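A .bil file is raw band-interleaved-by-line data, so once the header gives you the dimensions and data type, reading it is just a reshape; GDAL also reads .bil/ENVI files directly in both C++ and Python. The sketch below uses NumPy with hypothetical header values, and the same indexing ports directly to C++ fread loops:
```
# A minimal NumPy sketch with hypothetical header values; BIL stores, for each
# image line, all bands of that line in sequence. The same logic ports to C++.
import numpy as np

lines, samples, bands = 512, 614, 224        # read these from the .hdr file
dtype = np.dtype("<u2")                      # e.g. 16-bit unsigned, little endian

raw = np.fromfile("scene.bil", dtype=dtype)
cube = raw.reshape(lines, bands, samples)    # (line, band, sample) layout for BIL
band_10 = cube[:, 10, :]                     # one spectral band as a 2D image
```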
In the case of object detection, or the detection of abnormal cells or tissue (e.g., a tumor), segmentation is an important part. Sometimes FCM and K-means clustering are used for segmentation. I want to know which method is best for segmentation in MRI or other radiological imaging for the detection of abnormalities.
Hall scan data can consist of point clouds, (3D) images, and videos.
I'm curious whether you know about any projects or ongoing research in this area.
The aim of the object detection could be the creation of a map or a CAD model.
Thanks in advance! :)
We are working on Automated Foreign Object Detection on Runways (an FYP project). It is a system to detect FOD (foreign object debris) on the surface of runways. It also detects other anomalies such as wildlife, snow, ice pavement, cracks, etc. in all weather conditions (fog/smog, rain, darkness, etc.). Cameras will be mounted on poles at the sides of runways to detect debris and report to airport staff in real time.
I am not sure which technique will be best for its implementation, as I am new to this field. I am currently researching Keras, YOLO, DNNs, R-CNN, and others. I would like your opinion on how we should implement it.
Thank you.
I am trying to add a preprocessing step, namely edge detection, to my object detection pipeline, but I wonder how to do it. Do I need a dataset that has been converted to edges first, or not?
In my understanding, the steps should be:
1. Input image.
2. Convert to grayscale.
3. Apply filtering.
4. Apply edge detection.
5. The output of step 4 becomes the input to the object detection method; features are extracted from the object edges.
6. Training data from the INRIA dataset (without converting it to edges) is trained with an SVM classifier.
7. Perform matching using the object detector.
Is that correct? (A sketch of steps 1-4 follows below.)
Please kindly help. Thank you.
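Here is the promised sketch of steps 1-4 with OpenCV; the file name and Canny thresholds are placeholders. One caveat on step 6: whatever representation the SVM is trained on (raw images, HOG features, or edge maps) must match what the detector sees at test time, so training on unconverted INRIA images while testing on edge maps would be inconsistent.
```
# A minimal OpenCV sketch of steps 1-4; file name and thresholds are placeholders
import cv2

img = cv2.imread("input.jpg")                    # 1. input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)     # 2. grayscale
blurred = cv2.GaussianBlur(gray, (5, 5), 0)      # 3. noise filtering
edges = cv2.Canny(blurred, 50, 150)              # 4. edge map
cv2.imwrite("edges.png", edges)
```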
I was trying to use DLA-34 with CenterNet using this repo: https://github.com/xingyizhou/CenterNet/blob/master/src/lib/models/networks/pose_dla_dcn.py
I can load EfficientNet features with CenterNet like this:
from efficientnet_pytorch import EfficientNet
base_model = EfficientNet.from_pretrained('efficientnet-b1')
# crop the central region of the input along the width before extracting features
x_center = x[:, :, :, IMG_WIDTH // 8: -IMG_WIDTH // 8]
feats = base_model.extract_features(x_center)
But in Deep Layer Aggregation (DLA-34) the extract_features() function is not available. I am new to object detection; how can I extract features from DLA-34 and other networks like DenseNet with CenterNet?
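When a backbone has no extract_features() helper, a generic alternative is a PyTorch forward hook, which captures the output of any named layer during a normal forward pass. A minimal sketch on a torchvision DenseNet (the layer name and input size are illustrative; the same pattern applies to the DLA-34 module in that repo):
```
# A minimal sketch: forward hooks capture intermediate features of any backbone
import torch
import torchvision

model = torchvision.models.densenet121().eval()   # pretrained weights optional

features = {}
def save_output(name):
    def hook(module, inputs, output):
        features[name] = output.detach()
    return hook

# attach the hook to the layer whose output you want (layer name is illustrative)
model.features.denseblock4.register_forward_hook(save_output("block4"))

with torch.no_grad():
    model(torch.randn(1, 3, 224, 224))
print(features["block4"].shape)
```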
For my YOLO object detection model, I have an annotation XML file with the bounding box coordinates (ymin, xmin, ymax, xmax) but no height and width information. How can I calculate them, or is there a Python script to extract them?
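Width and height follow directly from the corners: width = xmax - xmin and height = ymax - ymin. A minimal sketch for a VOC-style XML file (the file name is hypothetical):
```
# A minimal sketch for a VOC-style annotation file (file name hypothetical)
import xml.etree.ElementTree as ET

root = ET.parse("annotation.xml").getroot()
for obj in root.findall("object"):
    b = obj.find("bndbox")
    xmin, ymin = int(float(b.find("xmin").text)), int(float(b.find("ymin").text))
    xmax, ymax = int(float(b.find("xmax").text)), int(float(b.find("ymax").text))
    width, height = xmax - xmin, ymax - ymin
    print(width, height)
```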
Hi everyone,
I'm working on a task that consists of detecting whether there are foreign objects (like bolts, pliers, etc.) inside the interior of an aircraft's wing. The problem is that my neural network should find out-of-place objects inside the specified environment. I was thinking of fine-tuning a PyTorch pre-trained CNN like ResNet-101 or Inception v3. Do you know any further suitable CNN for this particular use case, or a different technique? I am open to any kind of suggestion. Thanks in advance.
There are many object detection methods out there. The newest methods use neural networks, but there are also traditional methods, one of which is the HOG method.
I want to know whether HOG is still relevant for object detection. What is the advantage of using the HOG method instead of a neural network method?
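HOG is still used where training data or compute is scarce, since it is a fixed, interpretable descriptor that needs no GPU and no training to extract. A minimal scikit-image sketch of computing a HOG descriptor (e.g. to feed an SVM):
```
# A minimal scikit-image sketch of HOG feature extraction
from skimage.feature import hog
from skimage import color, data

img = color.rgb2gray(data.astronaut())           # sample image shipped with skimage
features = hog(img, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm="L2-Hys")
print(features.shape)                            # fixed-length descriptor for an SVM
```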
I have been trying to tackle a problem where I need to track multiple people through multiple camera viewpoints in real time.
I found a solution, DeepCC (https://github.com/daiwc/DeepCC), on the DukeMTMC dataset, but unfortunately this solution has been taken down because of data confidentiality issues. It used Fast R-CNN for object detection, triplet loss for re-identification, and DeepSort for real-time multiple object tracking.
Can someone share some other resources regarding the same problem?
Small object detection is always challenging due to the limited information available. Is it a good idea to use GANs to improve the feature representation of small objects? What can we do?
What are the characteristics of small objects, and how can an algorithm be designed around those characteristics? To my knowledge, feature fusion and context learning are usually used to improve object detection; however, it is hard to explain why they improve detection results. Are there any algorithms designed specifically for small object detection?
I want to develop an object detection RESTful web service that will take images as payload and return the coordinates of the bounding boxes, or the image itself marked with the detected objects.
Note: the Tensorflow Object Detection API makes it easy to detect objects using pre-trained object detection models.
I am thinking of utilizing the YOLO algorithm or Faster R-CNN for the localization and detection task; is this easy?
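A minimal Flask sketch of the service skeleton is below; run_detector is a hypothetical placeholder for whatever model you wire in (YOLO, Faster R-CNN, or a TF OD API model), and the JSON schema is illustrative.
```
# A minimal Flask sketch; run_detector is a hypothetical placeholder for your model
from flask import Flask, request, jsonify
import numpy as np
import cv2

app = Flask(__name__)

def run_detector(image):
    # placeholder: return a list of (x1, y1, x2, y2, label, score) from your model
    return []

@app.route("/detect", methods=["POST"])
def detect():
    data = np.frombuffer(request.files["image"].read(), np.uint8)
    image = cv2.imdecode(data, cv2.IMREAD_COLOR)   # decode the uploaded payload
    detections = run_detector(image)
    return jsonify([{"box": list(d[:4]), "label": d[4], "score": d[5]}
                    for d in detections])

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```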
We are working on a project to detect foreign objects on the surface of a runway, and we want to know which camera will be best for this purpose. The runway is 46 m wide, and the system should be able to detect very small objects on it. The camera should be mounted high for maximum area coverage, and it should work in all weather conditions (rain, fog, low light, etc.). Can you recommend some cameras that would be good for this project? We will use artificial intelligence techniques to detect the objects.
I am working on a vehicle detection task on the UA-DETRAC dataset. I am getting the precision-recall (PR) curve shown in the figure.
Is it acceptable? Usually a PR curve stays high at the start and then decreases sharply. I checked my code and believe it is correct, but in the literature the PR curve for this dataset looks like the attached figure. Do you think the curve I got is also correct?


In general, a first principle is a basic assumption that cannot be deduced any further. Different fields of human activity have different definitions of first principles; for engineering, for example, they are the laws of physics. Often, great innovation in science/engineering happens when a new idea is not built on top of the current state of the art or commonly accepted technology. Instead, the problem is reduced to those first principles, in other words "what we know for sure", and rebuilt from there.
So, what are the first principles known so far in computer vision, particularly in object detection? Are there fundamental "can do"s and "can't do"s that take their roots and proofs from computer science, physics, or mathematics?
I am using a Mask R-CNN model with a ResNet50 backbone for nodule detection in ultrasound images. My dataset consists of 500 US images. The maximum object detection accuracy on the training set is approximately 54% (using data augmentation and hyper-parameter tuning), and the accuracy on my test set is even lower. Are there any suggestions for improving object detection accuracy?
Many thanks in advance,
I am particularly looking for a dataset of cameras capturing videos in a distributed setup, where the videos captured by the multiple cameras are correlated. Is there any benchmark dataset like that? I found some existing datasets for distributed networks, but the correlation factor is absent. It would be great to get some help.
Hi there!
I would like to count and categorize embryos in dozens of pictures. It would be quite time-consuming to do that manually, so I was thinking about a Python solution to automate the process.
A sample picture is attached. It contains all the challenges and nearly all the developmental stages (2-cell; 4-8-cell; morula) that I am facing.
Do any of you have good recommendations (libraries, Git repos, articles, etc.)?
Thanks in advance!
Bálint
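As a starting point before any deep learning, a classical OpenCV sketch for counting roughly circular objects is below; all thresholds and radii are hypothetical and would need tuning to your images, and distinguishing stages (2-cell vs. morula) would likely need a trained classifier on the detected regions.
```
# A minimal OpenCV sketch for counting round objects; every parameter is hypothetical
import cv2

img = cv2.imread("embryos.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blurred = cv2.medianBlur(gray, 5)

circles = cv2.HoughCircles(blurred, cv2.HOUGH_GRADIENT, dp=1.2, minDist=40,
                           param1=100, param2=30, minRadius=15, maxRadius=60)
count = 0 if circles is None else circles.shape[1]
print("embryos found:", count)
```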
Hi Guys
I am finding that when I use an R-CNN detector for object detection, increasing the training data leads to worse classification and overfitting.
Has anyone experienced this issue, and what could be the reason for it?
Best
Abdussalam
I'm looking for a point cloud processing library that includes classification / object detection and tracking. After some web surfing, I got the impression that the Point Cloud Library (PCL), written in C/C++, is the only option with ALL the abovementioned capabilities. There is no problem starting development in C/C++; however, it would be better to use Python for faster prototyping and for building a first working version for further experimentation. Performance / real-time processing is not a priority at first. Has anyone met a similar problem, and what was your library of choice? Thanks.
I don't need much precision, but the edges should be continuous for object detection. I want to create an enclosed contour outlining the objects in the gray image.
Could a machine learning object detection or image classification technique be the solution?
Training on ultrasound images with original CNN architectures directly is not satisfactory for object detection, segmentation, etc. Is it possible to improve CNN performance for these images?
I am working on a problem where I need to detect a particular type of object. I have tried the Tensorflow SSD MobileNet model, but its processing time is very high. Hence, I am planning to develop my own object detection model for a single class, with very few layers, in Keras (Tensorflow). Is there any GitHub repo or tutorial that explains a clear procedure for training a single-class custom object detection model?
I am working on contour-detection-based object detection approaches, but all contour detection techniques are highly influenced by image noise and environmental conditions.
I want to know whether there are any robust approaches other than contour detection.
Deep learning based object detection is another approach, but I am looking for other methods.
I am researching automatic image annotation/tagging. So far, I have come across many tools, most of which are manual and a few (1 or 2) semi-automated. The semi-automated ones are mostly trained on large datasets like COCO, so when you import an image containing one of the related classes, they suggest a bounding box with coordinates. However, this requires manual intervention in importing the image and selecting the suggested class; in the end, it is a DL model trained on a dataset, with a good UI in the form of an API. I am looking for something that could capture a live image through a camera and annotate it online. Kindly let me know if there are any existing tools for this, and feel free to give your suggestions for building such a tool if one is not already in practice.
Regards,
Harish.
My research focuses on moving-object detection and tracking by multiple moving swarm robots for intelligent video surveillance. I need some excellent papers related to multiple-moving-object detection and tracking for multiple swarm robots.
Keywords: multi-camera, multiple moving object detection and tracking.
I am a new PhD researcher working on object detection and image analysis for security applications in my area. I am trying to obtain the relevant sources; I have registered for some, and the rest that I googled are dead links! Please help me!
I want to know about object detection in remote sensing data, for research work in the final year of my master's degree.
I need some papers and references for my base paper (problem definition).
Is it possible to use a camera for taking images, apply object classification and detection methods like YOLO and Faster R-CNN for object detection, and then apply a reinforcement learning algorithm for autonomous navigation? Would this approach help me navigate quickly or not?
As smartphones are equipped with low-frame-rate cameras and lower processing speeds, I'm searching for a high-speed, real-time object detection technique that uses such a camera. Any related answers will be appreciated. Thanks in advance.
I am searching for a state-of-the-art deep learning model with about 80-85% accuracy and a computation speed of about 30 fps. I found that YOLOv3 has a speed of about 30 fps, but its accuracy is a bit low.
Thanks all in advance.
How can image segmentation be evaluated quantitatively, other than with precision, recall, and ROC curves? Also, how are these performance metrics obtained in the case of object detection, and what can the performance metrics be in the case of object RECOGNITION?
- How is the mean Average Precision (mAP) score calculated for object detection?
- How can I modify it to take class imbalance into account?
Should I make it a weighted mAP, where I weight the AP values of each class, i.e., SUM(alpha_i * AP for class i), where alpha_i = data points in class i / corpus size? (See the sketch below.)
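For reference: AP for one class is the area under the precision-recall curve built by sweeping the confidence threshold over detections matched to ground truth at a chosen IoU, and mAP is the unweighted mean of the per-class APs, so a frequency-weighted variant like your SUM(alpha_i * AP_i) is a reasonable modification. A minimal sketch of AP for one class (without the interpolation/envelope step that VOC/COCO add), using hypothetical matched detections:
```
# A minimal sketch of AP for one class; detections are hypothetical (score, is_tp)
# pairs already matched to ground truth at some IoU threshold
import numpy as np

def average_precision(scores, is_tp, num_gt):
    order = np.argsort(scores)[::-1]              # rank detections by confidence
    tp = np.cumsum(np.asarray(is_tp, dtype=float)[order])
    fp = np.cumsum(1.0 - np.asarray(is_tp, dtype=float)[order])
    recall = tp / num_gt
    precision = tp / (tp + fp)
    # area under the PR curve as a sum of precision * recall increments
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precision, recall):
        ap += p * (r - prev_recall)
        prev_recall = r
    return ap

# hypothetical: 4 detections, 3 correct, 4 ground-truth boxes in total
print(average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 1], num_gt=4))
# mAP = mean of per-class APs; a weighted variant replaces the mean with
# sum(alpha_i * AP_i) using your class frequencies alpha_i
```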
Hi
I am confused about one topic in my project. I want to detect/label burned areas of an object in particular pictures, and also search for their related history within a database. Which ML technique can I use?
1. Label every part of the damaged areas, then search for the related labels in the database.
2. Search the whole image against others (like image search).
Thanks in advance