Article

A Brain–Computer Interface (BCI) for the Detection of Mine-Like Objects in Sidescan Sonar Imagery

... In recent years, an increasing number of research efforts have been dedicated to the development of BCI systems [5,6], with applications extending from wheelchair operation [7], prosthetic control [8], and neurological rehabilitation [9] for physically challenged patients to a wider range of practical scenarios, such as virtual reality games [10], military detection [11] and operator fatigue detection [12,13]. Depending on the specific activity patterns of the brain, EEG signals applied to BCI development mainly include: slow cortical potential (SCP) [14], P300 evoked potential [15,16], steady-state visual evoked potential (SSVEP) [17,18], and event-related desynchronization (ERD) and synchronization (ERS) [19,20]. ...
... Firstly, the joint correlation matrix between X and Y should be calculated as: C = [C11, C12; C21, C22] ...
Article
Full-text available
In recent years, the multivariate synchronization index (MSI) algorithm, a novel frequency detection method, has attracted increasing attention in the study of brain-computer interfaces (BCIs) based on steady-state visual evoked potentials (SSVEPs). However, the MSI algorithm struggles to fully exploit SSVEP-related harmonic components in the electroencephalogram (EEG), which limits its application in BCI systems. In this paper, we propose a novel filter bank-driven MSI algorithm (FBMSI) to overcome this limitation and further improve the accuracy of SSVEP recognition. We evaluate the efficacy of the FBMSI method by developing a 6-command SSVEP-NAO robot system with extensive experimental analyses. An offline experimental study is first performed with EEG collected from nine subjects to investigate the effects of varying parameters on model performance. Offline results show that the proposed method achieves a stable improvement. We further conduct an online experiment with six subjects to assess the efficacy of the developed FBMSI algorithm in a real-time BCI application. The online experimental results show that the FBMSI algorithm yields a promising average accuracy of 83.56% using a data length of only one second, 12.26% higher than the standard MSI algorithm. These extensive experimental results confirm the effectiveness of the FBMSI algorithm in SSVEP recognition and demonstrate its potential in the development of improved BCI systems.
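The MSI recipe cited above (a joint correlation matrix between the EEG X and a sinusoidal reference Y, reduced to an entropy-based index) can be sketched in a few lines of NumPy. This is a minimal single-band version for illustration only, not the authors' FBMSI code; the filter-bank extension would apply the same index to several band-pass-filtered copies of X and combine the per-band indices with weights. All names and parameter values here are illustrative.

```python
import numpy as np

def msi_index(X, Y):
    """Multivariate synchronization index between EEG X (channels x samples)
    and a sinusoidal reference Y (2*harmonics x samples).
    Returns a value near 0 for no synchronization, approaching 1 for full sync."""
    M = X.shape[1]
    Z = np.vstack([X, Y])
    Z = (Z - Z.mean(axis=1, keepdims=True)) / Z.std(axis=1, keepdims=True)
    C = Z @ Z.T / M                       # joint correlation matrix [C11 C12; C21 C22]
    n = X.shape[0]

    def inv_sqrt(A):
        # inverse matrix square root via symmetric eigendecomposition
        w, V = np.linalg.eigh(A)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    U = np.zeros_like(C)
    U[:n, :n] = inv_sqrt(C[:n, :n])       # normalize by the autocorrelation blocks
    U[n:, n:] = inv_sqrt(C[n:, n:])
    R = U @ C @ U.T
    lam = np.linalg.eigvalsh(R)
    lam = np.clip(lam / lam.sum(), 1e-12, None)
    P = len(lam)
    # entropy-based synchronization index
    return 1.0 + np.sum(lam * np.log(lam)) / np.log(P)

def reference(freq, fs, M, harmonics=2):
    """Sin/cos reference signals at the stimulus frequency and its harmonics."""
    t = np.arange(M) / fs
    rows = []
    for h in range(1, harmonics + 1):
        rows += [np.sin(2 * np.pi * h * freq * t), np.cos(2 * np.pi * h * freq * t)]
    return np.array(rows)
```

On synthetic data containing a 10 Hz component, the index computed against a 10 Hz reference should clearly exceed the index against an off-frequency reference.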
... EEG has become the most widely used neuroimaging technique for brain-computer interfaces (BCI). Some of these extended uses of EEG include military operations such as controlling weapons or drones [4][5][6][7][8], educational classroom applications such as monitoring students' attention and other mental states or helping them engage with material [9][10][11][12][13], cognitive enhancement such as increasing cognitive load or focus [12,14,15], and consumer-based games such as computer games or physical toys controlled via brain waves [2,[15][16][17][18][19]. ...
Article
Full-text available
In the last decade there has been significant growth in the interest in, and application of, using EEG (electroencephalography) outside of the laboratory as well as in medical and clinical settings, for more ecological and mobile applications. So far, however, such applications have mainly included military, educational, cognitive enhancement, and consumer-based games. Given the monetary and ecological advantages, consumer-grade EEG devices such as the Emotiv EPOC have emerged; however, consumer-grade devices make certain compromises in data quality in order to become affordable and easy to use. The goal of this study was to investigate the reliability and accuracy of the EPOC as compared to a research-grade device, Brainvision. To this end, we collected data from participants using both devices during three distinct cognitive tasks designed to elicit changes in arousal, valence, and cognitive load: namely, the Affective Norms for English Words, the International Affective Picture System, and the n-back task. Our design and analytical strategies followed an ideographic person-level approach (electrode-wise analysis of vincentized repeated measures). We aimed to assess how well the Emotiv could differentiate between mental states using an event-related band power approach and EEG features such as amplitude and power, as compared to Brainvision. The Emotiv device was able to differentiate mental states during these tasks to some degree, but was generally poorer than Brainvision, with smaller effect sizes. The Emotiv may be used with reasonable reliability and accuracy in ecological settings and in some clinical contexts (for example, for training professionals); however, Brainvision or other, equivalent research-grade devices are still recommended for laboratory or medical applications.
... The Asimo (Advanced Step in Innovative Mobility) robot is a humanoid (similar to humans) that looks like a boy and can be controlled by brain waves; this device also uses EEG technology [19]. It may play a role in several military applications; in the future, even brain-wave-based control of a recon plane may be possible [20][21][22]. At the University of Minnesota, such a brain-wave-guided helicopter has been developed, which is capable of passing through an obstacle course. ...
Conference Paper
Full-text available
This article presents the development, implementation and testing of a brain-computer interface (BCI) system which enables speed control of the mobile robot Robotino, manufactured by Festo Didactic. The BCI system was implemented, and its results evaluated, during a students' project based on the project-based learning methodology. Speed control has been achieved using the electroencephalogram (EEG) method with a NeuroSky MindWave EEG headset, by processing brain bioelectric signals measured on the frontal lobe. Tests of the system developed using the brain-computer interface have been performed and evaluated, both regarding the implementation of speed control and user experience, and have been completed with positive results.
... SVM is one of the supervised machine learning methods based on statistical learning theory. Having better generalization performance and robustness compared to classical learning procedures, SVM has successfully been applied to various fields in recent years [9][10][11]. The main goal of an SVM classifier is to find the optimum hyperplane that separates the two classes. ...
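The maximum-margin idea described in the excerpt above can be illustrated with a toy linear SVM trained by sub-gradient descent on the regularized hinge loss. This is a didactic sketch, not the implementation used in any of the cited studies; a practical system would use a mature library solver, and the learning-rate and regularization values below are arbitrary.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Toy linear SVM: sub-gradient descent on the regularized hinge loss
    (lam/2)*||w||^2 + mean(max(0, 1 - y*(Xw + b))).
    Labels y must be +1/-1. Returns the weight vector w and bias b."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        mask = margins < 1                              # margin violators
        gw = lam * w - (y[mask][:, None] * X[mask]).sum(axis=0) / n
        gb = -y[mask].sum() / n
        w -= lr * gw
        b -= lr * gb
    return w, b
```

On linearly separable data the learned hyperplane `sign(X @ w + b)` recovers the class labels.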
Article
Full-text available
In this study, the design and implementation of a multi-sensor-based brain-computer interface for disabled and/or elderly people is proposed. The developed system consists of a wheelchair, a high-power motor controller card, a Kinect camera, electromyogram (EMG) and electroencephalogram (EEG) sensors and a computer. The Kinect sensor is installed to provide safe navigation for the system. Depth frames, captured by the Kinect's infra-red (IR) camera, are processed with a custom image processing algorithm in order to detect obstacles around the wheelchair. A consumer-grade EMG device (Thalmic Labs) was used to obtain eight channels of EMG data. Four different hand movements: fist, release, and waving the hand left and right, are used for EMG-based control of the robotic wheelchair. EMG data is first classified using artificial neural network (ANN), support vector machine and random forest schemes. The class is then decided by a rule-based scheme constructed on the individual outputs of the three classifiers. EEG-based control is adopted as an alternative controller for the developed robotic wheelchair. A wireless 14-channel EEG sensor (Emotiv EPOC) is used to acquire real-time EEG data. Three different cognitive tasks: relaxing, math problem solving and text reading, are defined for the EEG-based control of the system. Subjects were asked to accomplish the relevant cognitive task in order to control the wheelchair. During experiments, all subjects were able to control the robotic wheelchair by hand movements and track a pre-determined route with reasonable accuracy. The results for the EEG-based control of the robotic wheelchair are promising, though they vary depending on user experience.
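The rule-based fusion of the three classifier outputs (ANN, SVM, random forest) described above might look like the following sketch. The abstract does not specify the actual rule, so the majority vote with an SVM fallback used here is purely a hypothetical example.

```python
def fuse_predictions(ann, svm, rf):
    """Rule-based fusion of three classifier outputs.
    Hypothetical rule: majority vote; on a three-way disagreement,
    fall back to the SVM's prediction."""
    votes = [ann, svm, rf]
    for label in set(votes):
        if votes.count(label) >= 2:   # at least two classifiers agree
            return label
    return svm                        # no agreement: trust the SVM
```

For example, `fuse_predictions("fist", "fist", "release")` yields `"fist"`, while a three-way split defers to the SVM.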
... superimposed with small target airplane images, which could vary in location and angle within an elliptical focal area. Correspondingly, in (Barngrover et al., 2016), the prime goal was to correctly identify sonar images of mine-like objects on the sea bed. Accordingly, a three-stage BCI system was developed whereby the initial stages entail computer vision procedures e.g. ...
Article
Full-text available
Rapid serial visual presentation (RSVP) combined with the detection of event-related brain responses facilitates the selection of relevant information contained in a stream of images presented rapidly to a human. Event-related potentials (ERPs), measured non-invasively with electroencephalography (EEG), can be associated with infrequent target stimuli (images) in groups of images, potentially providing an interface for human-machine symbiosis, where humans can interact and interface with a computer without moving, and which may offer faster image sorting than scenarios where humans are expected to physically react when a target image is detected. Certain features of the human visual system impact the success of the RSVP paradigm. Pre-attentive processing supports the identification of target information ~100 ms following information presentation. This paper presents a comprehensive review and evaluation of research in the broad field of RSVP-based brain-computer interfaces (BCIs). Applications that use RSVP-based BCIs are classified based on their operation mode, whilst protocol design considerations are critiqued. Guidelines for using the RSVP-based BCI paradigms are defined and discussed, with a view to further standardization of methods and experimental evidence gathering to support the use of RSVP-based BCIs in practice.
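The RSVP/ERP pipeline the review describes, epoching the EEG at stimulus onsets and then scoring each epoch by its late positive deflection, can be sketched as follows. The window boundaries and the single-channel simplification are illustrative assumptions, not values from the paper.

```python
import numpy as np

def epoch(eeg, onsets, fs, tmin=0.0, tmax=0.6):
    """Cut fixed-length epochs from a 1-D EEG channel,
    one per stimulus onset (onsets given in samples)."""
    n0, n1 = int(tmin * fs), int(tmax * fs)
    return np.array([eeg[o + n0:o + n1] for o in onsets])

def p300_score(epochs, fs, win=(0.25, 0.45)):
    """Score each epoch by its mean amplitude in a late positive window
    (an assumed P300-like latency range)."""
    a, b = int(win[0] * fs), int(win[1] * fs)
    return epochs[:, a:b].mean(axis=1)
```

On simulated data where target epochs carry a positive deflection near 300 ms, target scores exceed non-target scores, which is the basis for flagging target images in the stream.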
... Recent works have tended to merge the detection and classification of objects in images into a unified stage. Barngrover et al. [12] used a brain-computer interface that combines computer vision with human vision, in which a Haar-like feature [13] classifier is trained on a large data set to detect objects. Sadjadi et al. [14] proposed a subspace-based detector. ...
Article
We offer a new unsupervised, statistically based algorithm for the detection of underwater objects in synthetic aperture sonar (SAS) imagery, chosen for its high resolution and because that resolution is independent of range. In contrast to other methods that do not utilize the statistical model of the shadow region, our algorithm combines highlight detection and shadow detection using a weighted likelihood ratio test, while exploiting the expected spatial distribution of potential objects. We detect highlights by a higher-order-statistics representation of the image, followed by a segmentation process to form a region of interest (ROI). Then, while taking into account the sonar elevation and scan angle, for each ROI we use a support vector machine (SVM) over the statistical features of the pixels within the ROI to detect shadow-related pixels and background pixels. Our algorithm has the benefit of being robust as a result of setting its main parameters in situ. Moreover, we do not require knowledge about the target's shape or size, thereby making our algorithm suitable for all sonar detection applications and sonar types. To test detection performance, using our own autonomous underwater vehicle, we collected 270 sonar images, which we also share with the community. Compared to the results of benchmark schemes, our detection algorithm shows a trade-off between the probability of detection and the false alarm rate (FAR) that is close to the Kullback-Leibler (KL) divergence lower bound.
... Side scan sonar (SSS), among the most common sensors used in ocean surveys, can provide images of the seafloor and underwater targets. Target detection based on SSS images has a great variety of applications in marine archaeological surveying [1], oceanic mapping [2], and underwater detection [3][4][5], in which the main task is SSS image segmentation. ...
Article
Full-text available
This paper presents a novel and practical convolutional neural network architecture to implement semantic segmentation for side scan sonar (SSS) images. As a widely used sensor for marine surveys, SSS provides high-resolution images of the seafloor and underwater targets. However, because of the large number of background pixels in SSS images, imbalanced classification remains an issue. What is more, SSS images contain undesirable speckle noise and intensity inhomogeneity. We define and detail a network and training strategy that tackle these three important issues for SSS image segmentation. Our proposed method performs image-to-image prediction by leveraging fully convolutional neural networks and deeply-supervised nets. The architecture consists of an encoder network to capture context, a corresponding decoder network to restore full input-size resolution feature maps from low-resolution ones for pixel-wise classification, and a single-stream deep neural network with multiple side-outputs to optimize edge segmentation. We measured the prediction time of our network on our dataset, implemented on an NVIDIA Jetson AGX Xavier, and compared it to other similar semantic segmentation networks. The experimental results show that the presented method for SSS image segmentation brings obvious advantages and is applicable to real-time processing tasks.
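One common remedy for the background-pixel imbalance the abstract mentions is median-frequency balancing of the per-class loss weights. The paper does not state that it uses this exact scheme, so treat the sketch below as one plausible approach rather than the authors' method.

```python
import numpy as np

def median_frequency_weights(label_maps, num_classes):
    """Per-class loss weights via median-frequency balancing:
    weight_c = median_freq / freq_c, so rare classes (e.g. targets on a
    mostly-background seafloor) get weights > 1 and the dominant
    background class gets a weight < 1. Assumes every class occurs
    at least once in the supplied label maps."""
    counts = np.zeros(num_classes)
    for lab in label_maps:
        counts += np.bincount(lab.ravel(), minlength=num_classes)
    freq = counts / counts.sum()
    return np.median(freq) / freq
```

These weights would then scale the pixel-wise cross-entropy terms during training so that the rare target class contributes comparably to the loss.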
... The side scan sonar [1] may, while in motion, blur the sonar image of underwater moving objects. If the movement speed is too fast, the collected sonar image will be too blurred to extract the required information. ...
Article
Full-text available
In order to recover blurred sonar images collected by the side scan sonar during motion, we propose a solution based on conditional adversarial networks to deblur sonar images with unknown motion blur kernels. First, we use improved conditional adversarial networks to recover the sonar image and improve the loss function, so that the quality of image generation is improved while training stability is enhanced. Then we propose a method for generating blurred sonar images; the blurred sonar images generated by this method are closer to real blurred sonar images. Finally, we made our own sonar image set and trained on it with the two-timescale update rule. The final results show that images restored by this method have higher definition.
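A crude way to generate synthetic motion-blurred images, as the training pipeline above requires, is to convolve each row with a horizontal box kernel. The authors' generation method is more sophisticated; this sketch only illustrates the basic idea, with an assumed kernel length.

```python
import numpy as np

def motion_blur(img, length=9):
    """Simulate along-track motion blur with a horizontal box kernel of
    the given length (a crude stand-in for unknown real blur kernels).
    Edge pixels are handled by replicating the border columns."""
    pad = length // 2
    padded = np.pad(img, ((0, 0), (pad, pad)), mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for k in range(length):          # accumulate shifted copies, then average
        out += padded[:, k:k + img.shape[1]]
    return out / length
```

Blurring preserves the image shape while smoothing out fine detail, so the pixel variance of the output drops relative to the input.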
... Image retrieval is a typical application of RSVP-based BCIs. In addition, various BCI applications have been developed, such as speller [11][12][13], image classification [14,15], anomaly detection [16], and anti-deception [17]. ...
Article
Objective: Rapid serial visual presentation (RSVP)-based brain-computer interfaces (BCIs) are an efficient information detection technology that detects event-related potentials (ERPs) evoked by target visual stimuli. A BCI system requires a time-consuming calibration process to build a reliable decoding model for a new user. Therefore, zero-calibration has become an important topic in BCI research. Approach: In this paper, we construct an RSVP dataset that includes 31 subjects, and propose a zero-calibration method based on metric-based meta-learning: the ERP Prototypical Matching Net (EPMN). EPMN learns a metric space where the distance between EEG features and ERP prototypes belonging to the same category is smaller than that of different categories. Here, we employ prototype learning to learn a common representation from ERP templates of different subjects as ERP prototypes. Also, a metric-learning loss function is proposed for maximizing the distance between different classes of EEG and ERP prototypes and minimizing the distance between the same classes of EEG and ERP prototypes in the metric space. Main results: The experimental results showed that EPMN achieved a balanced accuracy of 86.34% and outperformed the comparable methods. Significance: Our EPMN can realize zero-calibration for an RSVP-based BCI system.
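The nearest-prototype decision rule at the heart of EPMN can be sketched as below. Note that EPMN learns the metric space with a neural network and a metric-learning loss; this sketch uses raw features and plain Euclidean distance purely for illustration.

```python
import numpy as np

def prototypes(features, labels):
    """Class prototypes = mean feature vector per class."""
    classes = np.unique(labels)
    protos = np.array([features[labels == c].mean(axis=0) for c in classes])
    return classes, protos

def classify(x, classes, protos):
    """Assign x to the class of its nearest prototype (Euclidean distance)."""
    d = np.linalg.norm(protos - x, axis=1)
    return classes[np.argmin(d)]
```

A new sample is labeled with the class whose prototype it falls closest to, with no per-user calibration step.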
... In [19], Sawas and Petillot applied the Haar-like features and a cascade of boosted classifiers, which were first introduced by Viola and Jones [31]. In [21], Barngrover et al. also utilized the Haar-like feature classifier to generate image patches (around regions of interest), which are then processed by subjects using the rapid serial visual presentation paradigm. Other feature-based methods used the geometric visual descriptors, such as scale-invariant feature transform (SIFT) [32], [33], [18] and local binary pattern (LBP) [34], [20]. ...
Article
Full-text available
With the advances in sonar imaging technology, sonar imagery has increasingly been used for oceanographic studies in civilian and military applications. High-resolution imaging sonars can be mounted on various survey platforms, typically autonomous underwater vehicles, which provide enhanced speed and improved data quality with long-range support. This paper addresses the automatic detection of mine-like objects using sonar images. The proposed Gabor-based detector is designed as a feature pyramid network with a small number of trainable weights. Our approach combines both semantically weak and strong features to handle mine-like objects at multiple scales effectively. For feature extraction, we introduce a parameterized Gabor layer which improves the generalization capability and computational efficiency. The steerable Gabor filtering modules are embedded within the cascaded layers to enhance the scale and orientation decomposition of images. The entire deep Gabor neural network is trained in an end-to-end manner from input sonar images with annotated mine-like objects. An extensive experimental evaluation on a real sonar dataset shows that the proposed method achieves competitive performance compared to the existing approaches.
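The parameterized Gabor layer described above is built from kernels like the following. This is the textbook real Gabor filter (a sinusoidal carrier under a Gaussian envelope), not the authors' trainable layer, and the parameter names are illustrative.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """2-D real Gabor filter: a cosine carrier of the given wavelength under
    an isotropic Gaussian envelope, steerable via the orientation theta."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)    # rotate coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    env = np.exp(-(xr**2 + yr**2) / (2 * sigma**2))
    return env * np.cos(2 * np.pi * xr / wavelength)
```

Because the carrier is oriented, the kernel responds much more strongly to a grating aligned with `theta` than to an orthogonal one, which is what makes a bank of such kernels useful for scale and orientation decomposition.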
... Cho et al. [18] tried to improve the recognition accuracy by using multi-angle view mine simulation and template matching. Away from model-based approaches, local feature descriptors without prior knowledge, such as the Haar-like feature [19], the Haar-like and local binary pattern (LBP) features [3], the combination of Haar features and learned features from a human operator's brain electroencephalogram (EEG) [20] have also been proposed for mine recognition. The extracted features are usually combined with some state-of-the-art machine learning approaches, such as boosting [19] and support vector machines (SVMs) [21]. ...
Article
Full-text available
Sidescan sonars are increasingly used in underwater search and rescue for drowning victims, wrecks and airplanes. Automatic object classification or detection methods can be of great help in long searches, where sonar operators may become exhausted and therefore miss a possible object. However, most existing underwater object detection methods for sidescan sonar images are aimed at detecting mine-like objects, ignoring the classification of civilian objects, mainly due to the lack of datasets. So, in this study, we focus on the multi-class classification of drowning victims, wrecks, airplanes, mines and seafloor in sonar images. Firstly, through long-term accumulation, we built a real sidescan sonar image dataset named SeabedObjects-KLSG, which currently contains 385 wreck, 36 drowning victim, 62 airplane, 129 mine and 578 seafloor images. Secondly, considering that the real dataset is imbalanced, we proposed a semisynthetic data generation method for producing sonar images of airplanes and drowning victims, which uses optical images as input and combines image segmentation with intensity distribution simulation of different regions. Finally, we demonstrate that by transferring a pre-trained deep convolutional neural network (CNN), e.g. VGG19, and fine-tuning the deep CNN using 70% of the real dataset and the semisynthetic data for training, the overall accuracy on the remaining 30% of the real dataset can be improved to 97.76%, the highest among all the methods. Our work indicates that the combination of semisynthetic data generation and deep transfer learning is an effective way to improve the accuracy of underwater object classification.
... According to the research of bionics [6], the biological vision system divides an object into several subsystems and realizes the identification through the synthesis of local information. In acoustic image sequences, local features are different from the image patterns of the nearest neighbour [7,8]. ...
Article
Full-text available
This paper proposes underwater target identification with local features and a feature tracking algorithm for acoustic image sequences. Feature detectors and descriptors are key to feature tracking. Their performance in underwater scene is evaluated by the change of multitarget parameters. A comprehensive quantitative investigation into the performance of feature tracking is thereby presented. Experimental results confirm that the proposed algorithm can accurately track potential targets and determine whether the potential targets are static targets, dynamic targets, or false alarms according to the tracking trajectories and statistical data.
... Apart from model-based approaches, local feature descriptors without prior knowledge have also been deployed for mine classification. Among them, the most popular are: the Haar-like feature [55], the combination of Haar features and learned features from a human operator's brain electroencephalogram (EEG) [56] and Haar-like and local binary pattern (LBP) features [4]. The extracted features are usually analysed using machine learning techniques, such as boosting [55] and support vector machines (SVMs) [57]. ...
Article
Full-text available
Underwater mines pose extreme danger for ships and submarines. Therefore, navies around the world use mine countermeasure (MCM) units to protect against them. One of the measures used by MCM units is mine hunting, which requires searching for all the mines in a suspicious area. It is generally divided into four stages: detection, classification, identification and disposal. The detection and classification steps are usually performed using a sonar mounted on a ship’s hull or on an underwater vehicle. After retrieving the sonar data, military personnel scan the seabed images to detect targets and classify them as mine-like objects (MLOs) or benign objects. To reduce the technical operator’s workload and decrease post-mission analysis time, computer-aided detection (CAD), computer-aided classification (CAC) and automated target recognition (ATR) algorithms have been introduced. This paper reviews mine detection and classification techniques used in the aforementioned systems. The author considered current and previous generation methods starting with classical image processing, and then machine learning followed by deep learning. This review can facilitate future research to introduce improved mine detection and classification algorithms.
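Several of the classical methods surveyed above rely on Haar-like features, which can be evaluated in constant time from an integral image (summed-area table). A minimal sketch of that machinery, with illustrative function names:

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero top row and left column, so that
    rectangle sums reduce to four lookups."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, r, c, h, w):
    """Sum of img[r:r+h, c:c+w] in O(1) from the integral image."""
    return ii[r + h, c + w] - ii[r, c + w] - ii[r + h, c] + ii[r, c]

def haar_two_rect(ii, r, c, h, w):
    """Two-rectangle Haar-like feature: left half minus right half,
    responding to vertical intensity edges (e.g. highlight/shadow pairs)."""
    return rect_sum(ii, r, c, h, w // 2) - rect_sum(ii, r, c + w // 2, h, w // 2)
```

A boosted cascade, as in Viola-Jones-style detectors, would evaluate thousands of such features per window; the integral image makes that tractable.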
... Second, the SVM classifier was trained using samples from all the subjects, whereas in previous studies individual classifiers were constructed for each subject (Wang and Jung, 2011; Barngrover et al., 2016), so that each subject had his or her own classification performance. Such subject-specific classifiers were hard to apply to other subjects because of individual differences. ...
Article
Full-text available
Face processing is a spatiotemporal dynamic process involving widely distributed and closely connected brain regions. Although previous studies have examined the topological differences in brain networks between face and non-face processing, the time-varying patterns at different processing stages have not been fully characterized. In this study, dynamic brain networks were used to explore the mechanism of face processing in the human brain. We constructed a set of brain networks based on consecutive short EEG segments recorded during face and non-face (ketch) processing respectively, and analyzed the topological characteristics of these brain networks using graph theory. We found that the topological differences in the backbone of the original brain networks (the minimum spanning tree, MST) between face and ketch processing changed dynamically. Specifically, during face processing, the MST was more line-like over the alpha band in the 0–100 ms time window after stimulus onset, and more star-like over the theta and alpha bands in the 100–200 and 200–300 ms time windows. The results indicated that the brain network was more efficient for information transfer and exchange during face processing compared with non-face processing. In the MST, the nodes with significant differences in betweenness centrality and degree were mainly located in the left frontal area and ventral visual pathway, which are involved in the face-related regions. In addition, the special MST patterns can discriminate between face and ketch processing with an accuracy of 93.39%. Our results suggest that special MST structures of dynamic brain networks reflect the potential mechanism of face processing in the human brain.
... BCI system based on Electroencephalogram (EEG) has been extensively explored due to the characteristics of easy operation, cost-effectiveness, and zero risks [2]. As one of the most significant branches of EEG-based BCI system, Event-Related Potential (ERP) analysis based on Rapid Serial Visual Presentation (RSVP) paradigm has received increasing attention in recent years, and its applications range from face recognition [3] and medical image diagnosis [4] to target surveillance [5]. However, due to the low signal-to-noise ratio, large inter-subject variabilities, and imbalanced ERP dataset, the generalization of the EEG-based BCI system is still limited. ...
Article
Due to the low signal-to-noise ratio, limited training samples, and inter-subject variabilities in electroencephalogram (EEG) signals, developing a subject-independent brain-computer interface (BCI) system used for new users without any calibration is still challenging. In this letter, we propose a novel Multi-Attention Convolutional Recurrent mOdel (MACRO) for EEG-based event-related potential (ERP) detection in the subject-independent scenario. Specifically, the convolutional recurrent network is designed to capture the spatial-temporal features, while the multi-attention mechanism is integrated to focus on the most discriminative channels and temporal periods of EEG signals. Comprehensive experiments conducted on a benchmark dataset for RSVP-based BCIs show that our method achieves the best performance compared with the five state-of-the-art baseline methods. This result indicates that our method is able to extract the underlying subject-invariant EEG features and generalize to unseen subjects. Finally, the ablation studies verify the effectiveness of the designed multi-attention mechanism in MACRO for EEG-based ERP detection.
... Sidescan sonar (SSS), which can provide high-resolution images of the seabed, is one of the most common sensors for various underwater applications, such as topography measurement [1], search for sunken vessels and submerged settlements [2], underwater mine detection [3], fish stocks detection, cable or pipeline detection [4][5][6], and offshore oil prospecting [7]. Accurate and efficient segmentation of SSS images is essential for underwater objects detection. ...
Article
Full-text available
For high-resolution side scan sonar images, accurate and fast segmentation is crucial for underwater target detection and recognition. However, due to the characteristics of low signal-to-noise ratio (SNR) and complex environmental noise of sonar, the existing methods with high accuracy and good robustness are mostly iterative methods with high complexity and poor real-time performance. For this purpose, a region growing based segmentation using the likelihood ratio testing method (RGLT) is proposed. This method obtains the seed points in the highlight and the shadow regions by likelihood ratio testing based on the statistical probability distribution and then grows them according to the similarity criterion. The growth avoids the processing of the seabed reverberation regions, which account for the largest proportion of sonar images, thus greatly reducing segmentation time and improving segmentation accuracy. In addition, a pre-processing filtering method called standard deviation filtering (STDF) is proposed to improve the SNR and remove the speckle noise. Experiments were conducted on three sonar databases, which showed that RGLT significantly improves quantitative metrics such as accuracy, speed, and segmentation visual effects. The average accuracy and running times of the proposed segmentation method for 100 × 400 images are 95.90% and 0.44 s, respectively.
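The growth step of a region-growing segmentation such as RGLT can be sketched as a breadth-first search from seed pixels. The sketch below uses a simple absolute-difference similarity criterion and a hand-picked seed; RGLT instead selects its seeds by likelihood ratio testing on the highlight and shadow statistics.

```python
from collections import deque

def region_grow(img, seed, tol):
    """Grow a region from a seed pixel over a 2-D list of intensities,
    adding 4-neighbours whose intensity stays within tol of the seed
    (a simple stand-in for RGLT's similarity criterion)."""
    h, w = len(img), len(img[0])
    ref = img[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < h and 0 <= nc < w and (nr, nc) not in region
                    and abs(img[nr][nc] - ref) <= tol):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region
```

Because growth stops at dissimilar pixels, the dominant seabed reverberation background is never visited, which is the source of the speed-up the abstract describes.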
Chapter
For underwater acoustic target recognition, providing good classification accuracy using radiated acoustic signals is a challenging task. Generally, due to the complex and changeable underwater environment, when the difference between two types of targets is small in some sensitive characteristics, a classifier trained on a single feature cannot output correct classifications. In addition, the complex background noise of the target will also degrade feature data quality. Here, we present a feature fusion strategy to identify underwater acoustic targets with a one-dimensional convolutional neural network. This method consists mainly of three steps. Firstly, considering that phase spectrum information is usually ignored, a Long Short-Term Memory (LSTM) network is adopted to extract phase features and frequency features of the acoustic signal in a real marine environment. Secondly, to leverage the frequency-based features and phase-based features in a single model, we introduce a feature fusion method to fuse the different features. Finally, the newly formed fusion features are used as input data to train and validate the model. The results show the superiority of our algorithm as compared with single feature data alone, which meets the intelligent requirements of underwater acoustic target recognition to a certain extent.
Conference Paper
Detection of mines on the seafloor is most accurately performed by a human operator. However, it is a difficult task for machine vision methods. In addition, mine detection calls for high-accuracy detection because of the high-risk nature of the problem. The advancements in the capabilities of sonar imaging and autonomous underwater vehicles have led to research using machine learning techniques and well-known computer vision features (Barngrover et al., IEEE J. Ocean Eng. (2015), [1]). Non-linear classifiers such as Haar-like feature classifiers have shown good potential in extracting complex spatial and temporal patterns from noisy multidirectional series of sonar imagery; however, this approach is dependent on specific sonar illumination methods and does not account for variation in the amount of lighting or soil type in training and test images. In this paper, we report on the preliminary methods and results of applying a non-linear classification method, convolutional neural networks (CNNs), to mine detection in noisy sonar imagery. The advantage of this method is that it can learn more abstract and complex features in the input space, leading to a lower false-positive rate and a higher true-positive rate. CNNs routinely outperform other methods in similar machine vision tasks (Deng and Yu, Found. Trends Signal Process. 7, 197–387 (2013), [2]). We used a simple CNN architecture trained to distinguish mine-like objects from background clutter with up to 99% accuracy.
Article
A new algorithm called the Mondrian detector has been developed for object detection in high-frequency synthetic aperture sonar (SAS) imagery. If a second (low) frequency-band image is available, the algorithm can seamlessly exploit the additional information via an auxiliary prescreener test. This flexible single-band and multiband functionality fills an important capability gap. The algorithm's overall prescreener component limits the number of potential alarms. The main module of the method then searches for areas that pass a subset of pixel-intensity tests. A new set of reliable classification features has also been developed in the process. The overall framework has been kept uncomplicated intentionally in order to facilitate performance estimation, to avoid requiring dedicated training data, and to permit delayed real-time detection at sea on an autonomous underwater vehicle. The promise of the new algorithm is demonstrated on six substantial data sets of real SAS imagery collected at various geographical sites that collectively exhibit a wide range of diverse seafloor characteristics. The results show that--as with Mondrian's art--simplicity can be powerful.
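The prescreener-then-intensity-tests pipeline described above can be illustrated with a deliberately simplified sketch. This is not the published Mondrian detector: `prescreen`, the window size, and the thresholds are invented for illustration. A window passes only if it pairs a bright highlight with a dark acoustic shadow beside it.

```python
import numpy as np

def prescreen(img, win=4, hi=0.8, lo=0.2):
    """Toy pixel-intensity prescreener (illustrative only, not the
    published Mondrian tests): flag windows containing a bright
    highlight with an adjacent dark shadow."""
    hits = []
    h, w = img.shape
    for r in range(0, h - win + 1, win):
        for c in range(0, w - 2 * win + 1, win):
            highlight = img[r:r + win, c:c + win].mean()
            shadow = img[r:r + win, c + win:c + 2 * win].mean()
            if highlight > hi and shadow < lo:
                hits.append((r, c))
    return hits

img = np.full((8, 16), 0.5)   # flat synthetic seafloor
img[0:4, 0:4] = 1.0           # synthetic highlight
img[0:4, 4:8] = 0.0           # synthetic shadow behind it
hits = prescreen(img)
```

Simple threshold tests of this kind need no training data, which is one reason the abstract can keep the overall framework uncomplicated.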
Article
Creating human-informative signal processing systems for the underwater acoustic environment that do not generate operator cognitive saturation and overload is a major challenge. To alleviate cognitive operator overload, we present a visual analytics methodology in which multiple beam-formed sonar returns are mapped to an optimized 2-D visual representation, which preserves the relevant data structure. This representation alerts the operator as to which beams are likely to contain anomalous information by modeling a latent distribution of information for each beam. Sonar operators therefore focus their attention only on the surprising events. In addition to the principled visualization of high-dimensional uncertain data, the system quantifies anomalous information using a Fisher Information measure. Central to this process is the novel use of both signal and noise observation modeling to characterize the sensor information. A demonstration of detecting exceptionally low signal-to-noise ratio targets embedded in real-world 33-beam passive sonar data is presented.
Conference Paper
Face recognition plays an important role in our daily lives. However, computer face recognition performance degrades dramatically in the presence of variations in illumination, head pose and occlusion. In contrast, the human brain can recognize target faces over a much wider range of conditions. In this paper, we investigate target face detection through electroencephalography (EEG). We address the problem of single-trial target-face detection in a rapid serial visual presentation (RSVP) paradigm. Whereas most previous approaches used support vector machines (SVMs), we use a convolutional neural network (CNN) to classify EEG signals recorded while subjects view target and non-target face stimuli. The CNN outperforms the SVM algorithm, which is commonly used for event-related potential (ERP) detection. We also compare the difference in performance when using animal stimuli. The proposed system could potentially be used in a rapid face recognition system.
Article
Full-text available
To overcome the shortcomings of the traditional manual detection of underwater targets in side-scan sonar (SSS) images, a real-time automatic target recognition (ATR) method is proposed in this paper. This method consists of image preprocessing, sampling, ATR by integration of the transformer module and YOLOv5s (that is, TR–YOLOv5s), and target localization. Considering the target-sparse and feature-barren characteristics of SSS images, a novel TR–YOLOv5s network and a down-sampling principle are put forward, and an attention mechanism is introduced to meet the accuracy and efficiency requirements of underwater target recognition. Experiments verified that the proposed method achieved 85.6% mean average precision (mAP) and an 87.8% macro-F2 score, bringing 12.5% and 10.6% gains over the YOLOv5s network trained from scratch, with a real-time recognition speed of about 0.068 s per image.
Article
Full-text available
The classification of low signal-to-noise ratio (SNR) underwater acoustic signals in complex acoustic environments, where target radiation noise is increasingly small, is a hot research topic. This paper proposes a new signal processing method, the low-SNR underwater acoustic signal classification method (LSUASC), based on dimensionality reduction that maintains intrinsic modal features. Using the LSUASC method, the underwater acoustic signal is first transformed with the Hilbert-Huang Transform (HHT) and the intrinsic mode is extracted. The intrinsic mode is then transformed into corresponding Mel-frequency cepstrum coefficients (MFCCs) to form a multidimensional feature vector of the low-SNR acoustic signal. Next, a semi-supervised fuzzy rough Laplacian Eigenmap (SSFRLE) method is proposed to perform manifold dimension reduction (the local sparse and discrete features of underwater acoustic signals are maintained in the reduction process), and principal component analysis (PCA) is adopted to define the reduced dimension adaptively. Finally, Fuzzy C-Means (FCM) clustering, which is able to classify data with weak features, is adopted to cluster the signal features after dimensionality reduction. The experimental results presented here show that the LSUASC method is able to classify low-SNR underwater acoustic signals with high accuracy.
Conference Paper
This paper introduces a new unsupervised, statistically based algorithm for the detection of underwater objects in sonar imagery. Highlights are detected by a higher-order-statistics representation of the image, followed by a segmentation process to form a region of interest (ROI). Our algorithm sets its main parameters in situ and avoids the need for parameter calibration. Moreover, we do not require knowledge of the target's shape or size, making our algorithm robust for any sonar detection application. Results obtained from a real sonar system show a good trade-off between probability of detection and false alarm rate (FAR).
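One common higher-order-statistics representation is a local kurtosis map; the sketch below illustrates that general idea, not the paper's specific algorithm (the window size and the synthetic image are arbitrary choices made here). Impulsive bright returns stand out against Gaussian-like background reverberation because they inflate the local fourth moment.

```python
import numpy as np

def kurtosis_map(img, win=5):
    """Local excess kurtosis over sliding windows: a generic
    higher-order-statistics highlight detector (illustrative only)."""
    h, w = img.shape
    out = np.zeros((h - win + 1, w - win + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            patch = img[r:r + win, c:c + win].ravel()
            m, s = patch.mean(), patch.std()
            out[r, c] = ((patch - m) ** 4).mean() / (s ** 4 + 1e-12) - 3.0
    return out

rng = np.random.default_rng(0)
img = rng.normal(0.0, 1.0, (12, 12))  # Gaussian-like reverberation
img[6, 6] = 25.0                      # single impulsive highlight
kmap = np.abs(kurtosis_map(img))
peak = np.unravel_index(np.argmax(kmap), kmap.shape)
```

Windows containing the impulsive pixel score an order of magnitude higher than pure-background windows, so a simple threshold on the map yields candidate ROIs.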
Article
This work demonstrates that automated mine countermeasure (MCM) tasks are greatly facilitated by characterizing the seafloor environment in which the sensors operate as a first step within a comprehensive strategy for how to exploit information from available sensors, multiple detector types, measured features, and target classifiers, depending on the specific seabed characteristics present within the high-frequency synthetic aperture sonar (SAS) imagery used to perform MCM tasks. This approach is able to adapt as environmental characteristics change and includes the ability to recognize novel seabed types. Classifiers are then adaptively retrained through active learning in these unfamiliar seabed types, resulting in improved mitigation of challenging environmental clutter as it is encountered. Further, a segmentation constrained network algorithm is introduced to enable enhanced generalization abilities for recognizing mine-like objects from underrepresented environments within the training data. Additionally, a fusion approach is presented that allows the combination of multiple detectors, feature types spanning both measured expert features and deep learning, and an ensemble of classifiers for the particular seabed mixture proportions measured around each detected target. The environmentally adaptive approach is demonstrated to provide the best overall performance for automated mine-like object recognition.
Article
This article presents an automatic real-time object detection method using sidescan sonar (SSS) and an onboard graphics processing unit (GPU). The detection method is based on a modified convolutional neural network (CNN), which is referred to as self-cascaded CNN (SC-CNN). The SC-CNN model segments SSS images into object-highlight, object-shadow, and seafloor areas, and it is robust to speckle noise and intensity inhomogeneity. Compared with typical CNN, SC-CNN utilizes crop layers which enable the network to use local and global features simultaneously without adding convolution parameters. Moreover, to take the local dependencies of class labels into consideration, the results of SC-CNN are postprocessed using Markov random field. Furthermore, the sea trial for real-time object detection via the presented method was implemented on our autonomous underwater vehicle (AUV) named SAILFISH via its GPU module at Jiaozhou Bay Bridge, Qingdao, China. The results show that the presented method for SSS image segmentation has obvious advantages when compared with the typical CNN and unsupervised segmentation methods, and is applicable in real-time object detection task.
Article
Sonar is one of the most important tools for underwater object detection and submarine topography reconstruction. Classifying sonar images automatically and accurately is essential for the navigation and path planning of autonomous underwater vehicles (AUVs). However, because of the intensity inhomogeneity and speckle noise in sonar images, it is difficult to obtain highly accurate segmentation results. To address these issues, in this paper we advocate a segmentation method that incorporates simple linear iterative clustering (SLIC) and an adaptive intensity constraint into a Markov random field (MRF) to segment sonar images with intensity inhomogeneity into object-highlight, object-shadow and background areas. The main procedures of the proposed work are as follows: first, SLIC is used to separate sonar images into homogeneous superpixels; second, the homogeneous patches, with a novel intensity constraint strategy, are utilized to optimize the segmentation result of the MRF at each iteration. Experimental results reveal that the proposed method performs well and fast on real sonar images that exhibit the intensity inhomogeneity problem.
Article
Objective: Brain-computer interfaces (BCIs) aim to provide a new way of communication between the human brain and external devices. One of the major tasks associated with a BCI system is improving the classification performance of motor imagery (MI) signals. Electroencephalogram (EEG) signals are widely used for MI BCI systems. Raw EEG signals are usually non-stationary time series with weak class properties, degrading classification performance. Approach: Nonnegative matrix factorization (NMF) has been successfully applied to pattern extraction, providing meaningful data representations. However, NMF is unsupervised and cannot make use of label information. Based on the label information of MI EEG data, we propose a novel method, called double constrained nonnegative matrix factorization (DCNMF), to improve the classification performance of NMF on MI BCI. The proposed method constructs a pair of label matrices as constraints on the NMF procedure, so that EEGs with the same class label have similar representations in the low-dimensional space, while EEGs with different class labels have representations that are as dissimilar as possible. Accordingly, the extracted features obtain an obvious class property, which is optimal for the classification of MI EEG. Main results: This study is conducted on the BCI competition III datasets (I and IVa). The proposed method achieves high average accuracy across the two datasets (79.00% for dataset I, 77.78% for dataset IVa), performing better than existing studies in the literature by about 10%. Significance: Our study provides a novel solution for MI BCI analysis from the perspective of label constraints; it facilitates semi-supervised feature learning and significantly improves classification performance.
Conference Paper
Rapid Serial Visual Presentation (RSVP)-based Brain-Computer Interface (BCI) is an efficient information detection technology that detects event-related brain responses evoked by target visual stimuli. However, a time-consuming calibration procedure is needed before a new user can use the system, so it is important to reduce calibration effort for BCI applications. In this paper, we collect an RSVP-based electroencephalogram (EEG) dataset that includes 11 subjects; the experimental task is image retrieval. We propose a multi-source transfer learning framework that utilizes data from other subjects to reduce the amount of data required from a new subject for training the model. A source-selection strategy is first adopted to avoid negative transfer. We then propose a transfer learning network based on domain-adversarial training: a convolutional neural network (CNN)-based network is designed to extract features of EEG data common across subjects, while the discriminator tries to distinguish features from different subjects. A classifier is added for learning semantic information, and conditional information and a gradient penalty are added to enable stable training of the adversarial network and improve performance. The experimental results demonstrate that our proposed method outperforms a series of state-of-the-art and baseline approaches.
Article
Rapid Serial Visual Presentation (RSVP)-based Brain-Computer Interface (BCI) is an efficient information detection technology that detects event-related brain responses evoked by target visual stimuli. However, a time-consuming calibration procedure is needed before a new user can use the system, so it is important to reduce calibration effort for BCI applications. In this paper, we propose a multi-source conditional adversarial domain adaptation with correlation metric learning (mCADA-C) framework that utilizes data from other subjects to reduce the data required from a new subject for training the model. The model utilizes adversarial training to enable a CNN-based feature extraction network to extract common features from different domains. A correlation metric learning (CML) loss is proposed to constrain the correlation of features based on class and domain, maximizing intra-class similarity and minimizing inter-class similarity. A multi-source framework with a source-selection strategy is adopted to integrate the results of multiple domain adaptations. We constructed an RSVP-based dataset in which 11 subjects each performed three RSVP experiments on three different days. The experimental results demonstrate that our proposed method can achieve 87.72% cross-subject balanced accuracy under one-block calibration, indicating that our method can achieve higher performance with less calibration effort.
Conference Paper
Over the past several decades, brain-computer interface (BCI) technology has attracted considerable attention from researchers in the fields of aerospace, military simulation, neural engineering, oceanic engineering and pattern recognition. In this paper, we review the development process, the framework of a BCI system, feature extraction and classification models from machine learning, and applications, on the basis of the latest BCI research. We also analyze the challenges of BCI technology in aerospace systems and space exploration. Finally, we point out application prospects in the modeling and simulation field to support combat training in complex situations.
Article
Full-text available
The detection of mine-like objects (MLOs) in sidescan sonar (SSS) imagery continues to be a challenging task. In practice, subject matter experts tediously analyze images searching for MLOs. In the literature, there are many attempts at automated target recognition (ATR) to detect the MLOs. This paper focuses on the classifiers that use computer vision and machine learning approaches. These techniques require large amounts of data, which is often prohibitive. For this reason, the use of synthetic and semisynthetic data sets for training and testing is commonplace. This paper shows how a simple semisynthetic data creation scheme can be used to pretest these data-hungry training algorithms to determine what features are of value. The paper provides real-world testing and training data sets in addition to the semisynthetic training and testing data sets. The paper considers the Haar-like and local binary pattern (LBP) features with boosting, showing improvements in performance with real classifiers over semisynthetic classifiers and improvements in performance as semisynthetic data set size increases.
Conference Paper
Full-text available
Given a high dimensional dataset, one would like to be able to represent this data using fewer parameters while preserving relevant signal information. If we assume the original data actually exists on a lower dimensional manifold embedded in a high dimensional feature space, then recently popularized approaches based in graph-theory and differential geometry allow us to learn the underlying manifold that generates the data. One such technique, called Diffusion Maps, is said to preserve the local proximity between data points by first constructing a representation for the underlying manifold. This work examines target specific classification problems using Diffusion Maps to embed inverse imaged synthetic aperture sonar signal data for automatic target recognition. The data set contains six target types. Results demonstrate that the diffusion features capture suitable discriminating information from the raw signals and acoustic color to improve target specific recognition with a lower false alarm rate. However, fusion performance is degraded.
Article
Full-text available
Boosting is one of the most important recent developments in classification methodology. Boosting works by sequentially applying a classification algorithm to reweighted versions of the training data and then taking a weighted majority vote of the sequence of classifiers thus produced. For many classification algorithms, this simple strategy results in dramatic improvements in performance. We show that this seemingly mysterious phenomenon can be understood in terms of well-known statistical principles, namely additive modeling and maximum likelihood. For the two-class problem, boosting can be viewed as an approximation to additive modeling on the logistic scale using maximum Bernoulli likelihood as a criterion. We develop more direct approximations and show that they exhibit nearly identical results to boosting. Direct multiclass generalizations based on multinomial likelihood are derived that exhibit performance comparable to other recently proposed multiclass generalizations of boosting in most situations, and far superior in some. We suggest a minor modification to boosting that can reduce computation, often by factors of 10 to 50. Finally, we apply these insights to produce an alternative formulation of boosting decision trees. This approach, based on best-first truncated tree induction, often leads to better performance, and can provide interpretable descriptions of the aggregate decision rule. It is also much faster computationally, making it more suitable to large-scale data mining applications.
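The reweight-and-vote loop described above is compact enough to sketch. The following is a minimal discrete AdaBoost on threshold stumps, illustrating the general scheme the paper analyzes (not its proposed logistic-scale variants); the function names and the tiny dataset are invented here.

```python
import numpy as np

def adaboost_stumps(X, y, rounds=10):
    """Minimal discrete AdaBoost on threshold stumps: fit a stump to
    the current weights, then upweight the examples it misclassifies.
    y must be in {-1, +1}."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(rounds):
        best = None
        for j in range(d):                      # exhaustive stump search
            for t in np.unique(X[:, j]):
                for s in (1, -1):
                    pred = np.where(X[:, j] > t, s, -s)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, t, s)
        err, j, t, s = best
        err = min(max(err, 1e-10), 1 - 1e-10)   # guard the log
        alpha = 0.5 * np.log((1 - err) / err)
        pred = np.where(X[:, j] > t, s, -s)
        w *= np.exp(-alpha * y * pred)          # reweight: boost mistakes
        w /= w.sum()
        ensemble.append((alpha, j, t, s))
    return ensemble

def predict(ensemble, X):
    """Weighted majority vote of the stump sequence."""
    score = sum(a * np.where(X[:, j] > t, s, -s) for a, j, t, s in ensemble)
    return np.sign(score)

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([-1, -1, 1, 1])
model = adaboost_stumps(X, y, rounds=3)
```

The paper's contribution is to read the weighted sum inside `predict` as a stagewise additive model fit on the logistic scale, which explains why the reweighting works.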
Chapter
Full-text available
We have developed EEG-based BCI systems which couple human vision and computer vision for speeding the search of large images and image/video databases. We term these types of BCI systems “cortically-coupled computer vision” (C3Vision). C3Vision exploits (1) the ability of the human visual system to get the “gist” of a scene with brief (10’s–100’s of ms) and rapid serial (10 Hz) image presentations and (2) our ability to decode from the EEG whether, based on the gist, the scene is relevant, informative and/or grabs the user’s attention. In this chapter we describe two system architectures for C3Vision that we have developed. The systems are designed to leverage the relative advantages, in both speed and recognition capabilities, of human and computer, with brain signals serving as the medium of communication of the user’s intentions and cognitive state.
Conference Paper
Full-text available
A scanning window type pedestrian detector is presented that uses both appearance and motion information to find walking people in surveillance video. We extend the work of Viola, Jones and Snow (2005) to use many more frames as input to the detector thus allowing a much more detailed analysis of motion. The resulting detector is about an order of magnitude more accurate than the detector of Viola, Jones and Snow. It is also computationally efficient, processing frames at the rate of 5 Hz on a 3 GHz Pentium processor.
Conference Paper
Full-text available
This paper describes a visual object detection framework that is capable of processing images extremely rapidly while achieving high detection rates. There are three key contributions. The first is the introduction of a new image representation called the "Integral Image" which allows the features used by our detector to be computed very quickly. The second is a learning algorithm, based on AdaBoost, which selects a small number of critical visual features and yields extremely efficient classifiers (6). The third contribution is a method for combining classifiers in a "cascade" which allows background regions of the image to be quickly discarded while spending more computation on promising object-like regions. A set of experiments in the domain of face detection are presented. The system yields face detection performance comparable to the best previous systems (18, 13, 16, 12, 1). Implemented on a conventional desktop, face detection proceeds at 15 frames per second.
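The first contribution, the integral image, can be shown in a few lines: once the summed-area table is built, the sum over any rectangle reduces to four array lookups, which is what makes the Haar-like features so cheap. A minimal numpy sketch (the function names are mine, not from the paper):

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero border: ii[r, c] = img[:r, :c].sum()."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] in four lookups, regardless of box size."""
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]

img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
# A two-rectangle Haar-like feature is then just a difference of box sums:
feature = box_sum(ii, 0, 0, 4, 2) - box_sum(ii, 0, 2, 4, 4)
```

Because every feature evaluation is constant-time, the AdaBoost stage can afford to search a very large pool of candidate rectangles.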
Article
Full-text available
We report the design and performance of a brain-computer interface (BCI) system for real-time single-trial binary classification of viewed images based on participant-specific dynamic brain response signatures in high-density (128-channel) electroencephalographic (EEG) data acquired during a rapid serial visual presentation (RSVP) task. Image clips were selected from a broad area image and presented in rapid succession (12/s) in 4.1-s bursts. Participants indicated by subsequent button press whether or not each burst of images included a target airplane feature. Image clip creation and search path selection were designed to maximize user comfort and maintain user awareness of spatial context. Independent component analysis (ICA) was used to extract a set of independent source time-courses and their minimally-redundant low-dimensional informative features in the time and time-frequency amplitude domains from 128-channel EEG data recorded during clip burst presentations in a training session. The naive Bayes fusion of two Fisher discriminant classifiers, computed from the 100 most discriminative time and time-frequency features, respectively, was used to estimate the likelihood that each clip contained a target feature. This estimator was applied online in a subsequent test session. Across eight training/test session pairs from seven participants, median area under the receiver operator characteristic curve, by tenfold cross validation, was 0.97 for within-session and 0.87 for between-session estimates, and was nearly as high (0.83) for targets presented in bursts that participants mistakenly reported to include no target features.
Article
Full-text available
In this paper, we describe a simple set of "recipes" for the analysis of high spatial density EEG. We focus on a linear integration of multiple channels for extracting individual components without making any spatial or anatomical modeling assumptions, instead requiring particular statistical properties such as maximum difference, maximum power, or statistical independence. We demonstrate how corresponding algorithms, for example, linear discriminant analysis, principal component analysis and independent component analysis, can be used to remove eye-motion artifacts, extract strong evoked responses, and decompose temporally overlapping components. The general approach is shown to be consistent with the underlying physics of EEG, which specifies a linear mixing model of the underlying neural and non-neural current sources.
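The "linear integration of multiple channels" recipe can be illustrated with Fisher discriminant analysis on synthetic data. Everything below is fabricated purely for illustration (the channel count, the evoked pattern, and the mean-based threshold are arbitrary choices, not from the paper); the point is that one linear combination of channels, chosen for maximum class difference, extracts the component without any anatomical modeling.

```python
import numpy as np

rng = np.random.default_rng(1)
n_trials, n_channels = 200, 8

# Synthetic "EEG" trials: class-1 trials add a fixed evoked pattern
# across channels on top of channel noise.
pattern = rng.normal(0, 1, n_channels)
y = rng.integers(0, 2, n_trials)
X = rng.normal(0, 1, (n_trials, n_channels)) + np.outer(y, pattern)

# Fisher/LDA spatial weights: a single linear combination of channels
# requiring only a statistical property (maximum class difference).
mu0, mu1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
cov = np.cov(X[y == 0].T) + np.cov(X[y == 1].T)
w = np.linalg.solve(cov, mu1 - mu0)

proj = X @ w                                # one virtual channel
acc = ((proj > proj.mean()) == y).mean()    # crude threshold classifier
```

The same linear-mixing view underlies the PCA and ICA recipes in the review; only the statistical criterion used to pick `w` changes.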
Article
Full-text available
Detecting people in images is key for several important application domains in computer vision. This paper presents an in-depth experimental study on pedestrian classification; multiple feature-classifier combinations are examined with respect to their ROC performance and efficiency. We investigate global versus local and adaptive versus nonadaptive features, as exemplified by PCA coefficients, Haar wavelets, and local receptive fields (LRFs). In terms of classifiers, we consider the popular Support Vector Machines (SVMs), feed-forward neural networks, and k-nearest neighbor classifier. Experiments are performed on a large data set consisting of 4,000 pedestrian and more than 25,000 nonpedestrian (labeled) images captured in outdoor urban environments. Statistically meaningful results are obtained by analyzing performance variances caused by varying training and test sets. Furthermore, we investigate how classification performance and training sample size are correlated. Sample size is adjusted by increasing the number of manually labeled training data or by employing automatic bootstrapping or cascade techniques. Our experiments show that the novel combination of SVMs with LRF features performs best. A boosted cascade of Haar wavelets can, however, reach quite competitive results, at a fraction of computational cost. The data set used in this paper is made public, establishing a benchmark for this important problem.
Article
Full-text available
We describe a real-time electroencephalography (EEG)-based brain-computer interface system for triaging imagery presented using rapid serial visual presentation. A target image in a sequence of nontarget distractor images elicits in the EEG a stereotypical spatiotemporal response, which can be detected. A pattern classifier uses this response to reprioritize the image sequence, placing detected targets in the front of an image stack. We use single-trial analysis based on linear discrimination to recover spatial components that reflect differences in EEG activity evoked by target versus nontarget images. We find an optimal set of spatial weights for 59 EEG sensors within a sliding 50-ms time window. Using this simple classifier allows us to process EEG in real time. The detection accuracy across five subjects is on average 92%, i.e., in a sequence of 2500 images, resorting images based on detector output results in 92% of target images being moved from a random position in the sequence to one of the first 250 images (first 10% of the sequence). The approach leverages the highly robust and invariant object recognition capabilities of the human visual system, using single-trial EEG analysis to efficiently detect neural signatures correlated with the recognition event.
Article
Full-text available
This paper is concerned with hierarchical Markov random field (MRF) models and their application to sonar image segmentation. We present an original hierarchical segmentation procedure devoted to images given by a high-resolution sonar. The sonar image is segmented into two kinds of regions: shadow (corresponding to a lack of acoustic reverberation behind each object lying on the sea-bed) and sea-bottom reverberation. The proposed unsupervised scheme takes into account the variety of the laws in the distribution mixture of a sonar image, and it estimates both the parameters of the noise distributions and the parameters of the Markovian prior. For the estimation step, we use an iterative technique which combines a maximum likelihood approach (for the noise model parameters) with a least-squares method (for the MRF-based prior). In order to model more precisely the local and global characteristics of image content at different scales, we introduce a hierarchical model involving a pyramidal label field. It combines coarse-to-fine causal interactions with a spatial neighborhood structure. This new method of segmentation, called the scale causal multigrid (SCM) algorithm, has been successfully applied to real sonar images and seems to be well suited to the segmentation of very noisy images. The experiments reported in this paper demonstrate that the discussed method performs better than other hierarchical schemes for sonar image segmentation.
Article
Full-text available
This review summarizes linear spatiotemporal signal analysis methods that derive their power from careful consideration of spatial and temporal features of skull surface potentials. BCIs offer tremendous potential for improving the quality of life for those with severe neurological disabilities. At the same time, it is now possible to use noninvasive systems to improve performance for time-demanding tasks. Signal processing and machine learning are playing a fundamental role in enabling applications of BCI and in many respects, advances in signal processing and computation have helped to lead the way to real utility of noninvasive BCI.
Conference Paper
Automatic detection of underwater objects is a critical task for a variety of underwater applications. Rapid detection approaches are needed to tackle the large amount of data produced by state-of-the-art sensors such as Synthetic Aperture Sonar. Accurate detection approaches are also required to reduce the number of false alarms and enable on-the-fly adaptation of missions in Autonomous Underwater Vehicles. In this paper we propose a new method for object detection in Synthetic Aperture Sonar imagery, capable of processing images extremely rapidly, based on the Viola and Jones cascade of boosted classifiers. Our approach provides confidence-rated predictions rather than the {-1,1} of the traditional cascade. This not only attaches a confidence level to each prediction but also reduces the false alarm rate significantly. We also introduce a novel cascade structure capable of obtaining low false alarm rates while achieving high detection accuracy. Results obtained on a real Synthetic Aperture Sonar dataset over a variety of challenging terrains are presented to show the discriminative power of this approach.
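The confidence-rated cascade idea can be sketched abstractly; the stages and thresholds below are invented toy examples, not the paper's learned classifiers. Each stage adds a real-valued score, a window is rejected as soon as the running sum falls below that stage's threshold, and surviving windows carry their summed confidence instead of a bare {-1,1} label.

```python
def cascade_score(stages, x):
    """Toy confidence-rated cascade: stages are (score_fn, threshold)
    pairs; background is rejected early and cheaply, survivors return
    a real-valued confidence."""
    total = 0.0
    for score_fn, threshold in stages:
        total += score_fn(x)
        if total < threshold:
            return None          # early rejection
    return total                 # confidence-rated detection

# Hypothetical stages scoring a 2-element "window" [highlight, shadow].
stages = [
    (lambda x: x[0] - 0.5, 0.0),   # is the highlight bright enough?
    (lambda x: 0.5 - x[1], 0.3),   # is there a dark shadow behind it?
]

hit = cascade_score(stages, [0.9, 0.1])    # target-like window survives
miss = cascade_score(stages, [0.2, 0.5])   # rejected at the first stage
```

Thresholding the returned confidence, rather than counting every survivor as an alarm, is what lets this style of cascade trade detection rate against false alarms after training.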
Article
The support-vector network is a new learning machine for two-group classification problems. The machine conceptually implements the following idea: input vectors are non-linearly mapped to a very high-dimensional feature space. In this feature space a linear decision surface is constructed. Special properties of the decision surface ensure the high generalization ability of the learning machine. The idea behind the support-vector network was previously implemented for the restricted case where the training data can be separated without errors. We here extend this result to non-separable training data. The high generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated. We also compare the performance of the support-vector network to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
Article
The problem of classifying targets in sonar images from multiple views is modeled as a partially observable Markov decision process (POMDP). This model allows one to adaptively determine which additional views of an object would be most beneficial in reducing the classification uncertainty. Acquiring these additional views is made possible by employing an autonomous underwater vehicle (AUV) equipped with a side-looking imaging sonar. The components of the multiview target classification POMDP are specified. The observation model for a target is specified by the degree of similarity between the image under consideration and a number of precomputed templates. The POMDP is validated using real synthetic aperture sonar (SAS) data gathered during experiments at sea carried out by the NATO Undersea Research Centre, and results show that the accuracy of the proposed method outperforms an approach using a number of predetermined view aspects. The approach provides an elegant way to fully exploit multiview information and AUV maneuverability in a methodical manner.
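At the heart of the multiview POMDP is a Bayesian belief update over target classes after each new view. A minimal sketch follows; the likelihood numbers are invented here, standing in for the paper's template-similarity observation model.

```python
import numpy as np

def update_belief(belief, likelihoods):
    """One Bayesian belief update after acquiring another sonar view:
    `likelihoods` is p(observation | class) for that view."""
    posterior = belief * likelihoods
    return posterior / posterior.sum()

# Classes: (mine, not-mine). Start uninformed, then fold in two views
# with hypothetical template-similarity likelihoods.
belief = np.array([0.5, 0.5])
belief = update_belief(belief, np.array([0.7, 0.4]))  # ambiguous view
belief = update_belief(belief, np.array([0.8, 0.2]))  # discriminative view
```

The POMDP layer then chooses which view aspect to acquire next so that updates like these shrink the classification uncertainty fastest, which is how AUV maneuverability is exploited methodically.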
Article
An advanced capability for automated detection and classification of sea mines in sonar imagery has been developed. The advanced mine detection and classification (AMDAC) algorithm consists of an improved detection density algorithm, a classification feature extractor that uses a stepwise feature selection strategy, a k-nearest neighbor attractor-based neural network (KNN) classifier, and an optimal discriminatory filter classifier. The detection stage uses a nonlinear matched filter to identify mine-size regions in the sonar image that closely match a mine's signature. For each detected mine-like region, the feature extractor calculates a large set of candidate classification features. A stepwise feature selection process then determines the subset of features that optimizes probability of detection and probability of classification for each of the classifiers while minimizing false alarms.
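The matched-filter detection stage can be illustrated in one dimension: slide a mine signature over a signal and flag positions where the normalised correlation exceeds a threshold. The AMDAC filter is nonlinear and operates on 2-D imagery; this linear 1-D toy only shows the principle, and the signal, template, and threshold are hypothetical.

```python
import math

def matched_filter(signal, template, thresh):
    """Return indices where the normalised correlation between the
    sliding window and the template reaches `thresh`."""
    n = len(template)
    t_norm = math.sqrt(sum(t * t for t in template))
    hits = []
    for i in range(len(signal) - n + 1):
        win = signal[i:i + n]
        w_norm = math.sqrt(sum(w * w for w in win)) or 1.0  # avoid /0
        corr = sum(w * t for w, t in zip(win, template)) / (w_norm * t_norm)
        if corr >= thresh:
            hits.append(i)
    return hits

# A mine-like bump at index 2 matches the template exactly (corr = 1.0).
signal = [0, 0, 1, 2, 1, 0, 0, 0]
template = [1, 2, 1]
assert matched_filter(signal, template, 0.99) == [2]
```

In the 2-D case the same idea runs over mine-size image regions, and the detections feed the feature extractor described above.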
Conference Paper
Recently, Viola et al. [2001] introduced a rapid object detection scheme based on a boosted cascade of simple feature classifiers. In this paper we introduce a novel set of rotated Haar-like features. These novel features significantly enrich the simple features of Viola et al. and can also be calculated efficiently. With these new rotated features, our sample face detector achieves on average a 10% lower false alarm rate at a given hit rate. We also present a novel post-optimization procedure for a given boosted cascade, further improving the false alarm rate by 12.5% on average.
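What makes Haar-like features cheap is the integral image: any upright rectangle sum costs four array lookups regardless of its size, so each feature is O(1). A sketch for the upright case follows (the paper's rotated features use an analogous rotated integral image, not shown here; the 3x3 test image is arbitrary).

```python
def integral_image(img):
    """Integral image with a zero border: ii[y][x] = sum of img above
    and left of (x, y)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            ii[y + 1][x + 1] = img[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x]
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle with top-left corner (x, y): 4 lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
assert rect_sum(ii, 0, 0, 3, 3) == 45
# A two-rectangle (edge) Haar-like feature: left column minus right column.
feat = rect_sum(ii, 0, 0, 1, 3) - rect_sum(ii, 2, 0, 1, 3)
assert feat == (1 + 4 + 7) - (3 + 6 + 9)
```

A boosted cascade evaluates thousands of such features per window, which is only feasible because each one is constant-time.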
Conference Paper
Operational requirements for naval applications have shifted over the last decade towards the fast, reliable detection and avoidance or elimination of underwater threats (e.g., mines and improvised explosive devices (IEDs)). For these purposes the ability to reliably separate mines or IEDs from rocks or bottom features is essential. This separation can be much more difficult for IEDs than for traditional cylindrical or spherical mines. Furthermore, automatic target recognition (ATR) approaches are gaining more and more importance for autonomous UUVs. Since no operator is in the loop, these systems are compromised by even a limited number of missed detections or a significant number of false targets. In this context the ability to automatically detect and classify objects depends directly on the true resolution of the acoustic imaging system. All this points towards the need for a high-resolution sensor for reliable object detection, classification and identification. Starting with some examples, this paper presents theoretical considerations about the resolution required for the detection, classification and identification of objects in side scan sonar images. Clues for the required resolution can be derived directly from the Johnson criterion for electro-optical systems. Secondly, an image processing software package for automatic object detection and classification, currently under development at FWG with the assistance of FU-Berlin and FGAN-FOM, is presented. This part focuses on an overview of the system and recently developed and tested algorithms. Before the different detection algorithms are applied, the side scan sonar images are preprocessed, including normalization, height estimation with slant-range correction, and geo-referencing. Different normalization algorithms can be used. Currently, six different screening algorithms for detecting regions of interest (ROIs) containing objects of interest are implemented.
These screening algorithms are based on statistical features within a sliding window, a highlight/shadow analysis after threshold segmentation, a normalized 1D cross-correlation with a template, a modified maximally stable extremal regions (MSER) approach, a k-means segmentation, and a higher-order-statistics-based segmentation. Afterwards, false detections of ROIs without objects of interest are eliminated by applying a single snake algorithm to the entire highlight and shadow area, coupled snake algorithms to the highlight area and the shadow area separately, a 2D cross-correlation with reference images of MLOs, and an iterative segmentation, all combined with robust and fast classifiers. The final processing step is a classifier (a Probabilistic Neural Network (PNN)). A simple data fusion strategy was also tested, based on the output of the different screening and false-positive-reduction algorithms. Finally, consequences for image processing with improved sensor resolution are discussed. All algorithms were tested using a data set representing roughly 25 km² of the sea floor. This data set was collected in part by the SeaOtter MK I AUV from Atlas Elektronik and gathered in the Baltic Sea and the Mediterranean Sea. Different side scan sonar systems were used.
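One of the screening ideas listed above can be sketched in one dimension: flag sliding windows whose statistics (here, simply the mean) deviate strongly from the global background. This is a crude, hypothetical stand-in for the paper's statistical-feature screeners, shown on a 1-D signal rather than a sonar image.

```python
import statistics

def screen(signal, win, k=2.0):
    """Return start indices of windows whose mean deviates from the
    global mean by more than k standard errors (hypothetical rule)."""
    mu = statistics.mean(signal)
    sigma = statistics.stdev(signal)
    hits = []
    for i in range(len(signal) - win + 1):
        m = statistics.mean(signal[i:i + win])
        if abs(m - mu) > k * sigma / win ** 0.5:
            hits.append(i)
    return hits

# A bright 3-sample "object" on a flat background is flagged by every
# window that overlaps it substantially.
signal = [0] * 10 + [5, 5, 5] + [0] * 10
assert screen(signal, 3) == [9, 10, 11]
```

In the real pipeline such flagged windows become ROIs that the snake-based and correlation-based stages then keep or discard.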
Article
In many remote-sensing classification problems, the number of targets (e.g., mines) present is very small compared with the number of clutter objects. Traditional classification approaches usually ignore this class imbalance, causing performance to suffer accordingly. In contrast, the recently developed infinitely imbalanced logistic regression (IILR) algorithm explicitly addresses class imbalance in its formulation. We describe this algorithm and give the details necessary to employ it for remote-sensing data sets that are characterized by class imbalance. The method is applied to the problem of mine classification on three real measured data sets. Specifically, classification performance using the IILR algorithm is shown to exceed that of a standard logistic regression approach on two land-mine data sets collected with a ground-penetrating radar and on one underwater-mine data set collected with a sidescan sonar.
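The IILR formulation itself is not reproduced here, but the generic device it refines, weighting the rare class so the fit is not swamped by clutter, can be sketched with a minimal weighted logistic regression trained by gradient descent. The data, weights, learning rate, and epoch count are all hypothetical.

```python
import math

def train(data, labels, weights, lr=0.5, epochs=200):
    """1-D logistic regression via SGD on a per-example weighted log-loss."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y, c in zip(data, labels, weights):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            g = c * (p - y)            # weighted gradient of the log-loss
            w -= lr * g * x
            b -= lr * g
    return w, b

# 1 target among 9 clutter points; the target is up-weighted by the
# class ratio so it is not ignored.
data    = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 3.0]
labels  = [0] * 9 + [1]
weights = [1.0] * 9 + [9.0]
w, b = train(data, labels, weights)
p_target = 1.0 / (1.0 + math.exp(-(w * 3.0 + b)))
assert w > 0 and p_target > 0.5      # the lone target is now detected
```

IILR goes further by taking the minority-class weight to an infinite-imbalance limit analytically, rather than tuning it as done here.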
Human-aided computing proposes using information measured directly from the human brain in order to perform useful tasks. In this paper, we extend this idea by fusing computer vision-based processing and processing done by the human brain in order to build more effective object categorization systems. Specifically, we use an electroencephalograph (EEG) device to measure the subconscious cognitive processing that occurs in the brain as users see images, even when they are not trying to explicitly classify them. We present a novel framework that combines a discriminative visual category recognition system based on the Pyramid Match Kernel (PMK) with information derived from EEG measurements as users view images. We propose a fast convex kernel alignment algorithm to effectively combine the two sources of information. Our approach is validated with experiments using real-world data, where we show significant gains in classification accuracy. We analyze the properties of this information fusion method by examining the relative contributions of the two modalities, the errors arising from each source, and the stability of the combination in repeated experiments.
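The fusion step can be sketched as a convex combination of two kernel (similarity) matrices scored by empirical kernel alignment against an ideal, label-derived kernel. All matrix values, labels, and the mixing weight below are hypothetical toys; the paper learns the combination, whereas here the alignment score is only used to rank candidates.

```python
def combine(K1, K2, alpha):
    """Convex combination of two kernel matrices."""
    n = len(K1)
    return [[alpha * K1[i][j] + (1 - alpha) * K2[i][j] for j in range(n)]
            for i in range(n)]

def alignment(K1, K2):
    """Empirical kernel alignment: normalised Frobenius inner product."""
    n = len(K1)
    dot = sum(K1[i][j] * K2[i][j] for i in range(n) for j in range(n))
    n1 = sum(v * v for row in K1 for v in row) ** 0.5
    n2 = sum(v * v for row in K2 for v in row) ** 0.5
    return dot / (n1 * n2)

# Ideal kernel for labels y = [+1, -1]: K*_ij = y_i * y_j.
K_ideal = [[1.0, -1.0], [-1.0, 1.0]]
K_vis = [[1.0, 0.2], [0.2, 1.0]]   # toy visual (PMK-like) kernel
K_eeg = [[1.0, 0.8], [0.8, 1.0]]   # toy EEG-derived kernel
a_vis = alignment(K_vis, K_ideal)
a_mix = alignment(combine(K_vis, K_eeg, 0.5), K_ideal)
a_eeg = alignment(K_eeg, K_ideal)
assert a_eeg < a_mix < a_vis       # alignment ranks the candidate kernels
```

In the paper the weight is chosen by a fast convex optimisation of exactly this kind of alignment objective, and the fused kernel then feeds a standard kernel classifier.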
Article
Mine detection and classification using high-resolution sidescan sonar is a critical technology for mine countermeasures (MCM). As opposed to the majority of techniques, which require large training data sets, this paper presents unsupervised models for both the detection and the shadow-extraction phases of an automated classification system. The detection phase is carried out using an unsupervised Markov random field (MRF) model where the required model parameters are estimated from the original image. Using a priori spatial information on the physical size and geometric signature of mines in sidescan sonar, a detection-oriented MRF model is developed which directly segments the image into regions of shadow, sea-bottom reverberation, and object highlight. After detection, features are extracted so that the object can be classified. A novel co-operating statistical snake (CSS) model is presented which extracts the highlight and shadow of the object. The CSS model again utilizes available a priori information on the spatial relationship between the highlight and shadow, allowing accurate segmentation of the object's shadow to be achieved.
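MRF-style segmentation can be sketched in one dimension with iterated conditional modes (ICM): each pixel picks the label minimising a data term (distance to a class mean) plus a smoothness penalty for disagreeing with its neighbours. The class means, beta, and pixel values are hypothetical; the paper estimates its MRF parameters from the image itself and works in 2-D.

```python
# Hypothetical class means for the three region types in the abstract.
MEANS = {"shadow": 0.1, "seabed": 0.5, "highlight": 0.9}

def icm(pixels, beta=0.05, iters=5):
    """Greedy ICM sweep: data term + pairwise label-smoothness term."""
    labels = [min(MEANS, key=lambda c: (p - MEANS[c]) ** 2) for p in pixels]
    for _ in range(iters):
        for i, p in enumerate(pixels):
            def energy(c):
                e = (p - MEANS[c]) ** 2
                for j in (i - 1, i + 1):
                    if 0 <= j < len(pixels) and labels[j] != c:
                        e += beta           # penalise neighbour disagreement
                return e
            labels[i] = min(MEANS, key=energy)
    return labels

# A shadow region, seabed, and one noisy bright pixel (0.75): the
# smoothness term relabels the isolated outlier as seabed.
pixels = [0.1, 0.1, 0.5, 0.5, 0.75, 0.5, 0.5]
assert icm(pixels) == ["shadow"] * 2 + ["seabed"] * 5
```

The same energy trade-off, with mine-size and geometric-signature priors in place of these toy means, is what lets the detection-oriented MRF segment directly into shadow, reverberation, and highlight.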
P. Sajda, E. Pohlmeyer, J. Wang, L. C. Parra, C. Christoforou, J. Dmochowski, B. Hanna, C. Bahlmann, M. K. Singh, and S.-F. Chang, "In a blink of an eye and a switch of a transistor: Cortically coupled computer vision," Proc. IEEE, vol. 98, no. 3, pp. 462–478, Mar. 2010.
N. Bigdely-Shamlo, A. Vankov, R. R. Ramirez, and S. Makeig, "Brain activity-based image classification from rapid serial visual presentation," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 16, no. 5, pp. 432–441, Oct. 2008.