Article

Fish species identification using a convolutional neural network trained on synthetic data


Abstract

Acoustic-trawl surveys are an important tool for marine stock management and environmental monitoring of marine life. Correctly assigning the acoustic signal to species or species groups is a challenge, and recently trawl camera systems have been developed to support interpretation of acoustic data. Examining images from known positions in the trawl track provides high resolution ground truth for the presence of species. Here, we develop and deploy a deep learning neural network to automate the classification of species present in images from the Deep Vision trawl camera system. To remedy the scarcity of training data, we developed a novel training regime based on realistic simulation of Deep Vision images. We achieved a classification accuracy of 94% for blue whiting, Atlantic herring, and Atlantic mackerel, showing that automatic species classification is a viable and efficient approach, and further that using synthetic data can effectively mitigate the all too common lack of training data. © International Council for the Exploration of the Sea 2018. All rights reserved.


... This absorption and scattering cause significant loss of detail and contrast in the captured images, necessitating the development of advanced imaging and processing techniques to mitigate these effects. Particles and microorganisms in water contribute to turbidity, obscuring objects and details [6,7]. Turbidity varies with water conditions, which can change rapidly due to environmental factors such as weather, tides, and human activities. ...
... Color distortion due to differential light absorption affects visual quality. Color correction algorithms and machine learning models help restore natural colors in underwater images [6,14]. Traditional color correction methods often involve applying inverse transformations based on the known absorption characteristics of water. ...
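The inverse-transformation idea described in the excerpt can be sketched as follows. Assuming per-channel attenuation coefficients and an estimated light path length are known (the coefficients and distance below are illustrative assumptions, not measured values), an inverse Beer-Lambert gain partially restores the color balance:

```python
import numpy as np

def correct_colors(img, attenuation, distance):
    """Compensate per-channel light absorption with a simple
    inverse Beer-Lambert transform: I0 = I * exp(c * d).

    img: float array (H, W, 3) in [0, 1]
    attenuation: per-channel coefficients (1/m); red attenuates fastest in water
    distance: estimated light path length in metres
    """
    gain = np.exp(np.asarray(attenuation) * distance)  # inverse of the exp(-c*d) decay
    return np.clip(img * gain, 0.0, 1.0)

# Hypothetical coefficients: red is absorbed most strongly underwater
atten = [0.6, 0.2, 0.1]  # R, G, B in 1/m (illustrative values only)
img = np.full((2, 2, 3), 0.3)
restored = correct_colors(img, atten, distance=2.0)
```

In practice the coefficients vary with water type and must be estimated or calibrated, which is why learned color-correction models are attractive.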
... CNNs are widely used for classifying underwater images, such as identifying marine species or coral types. For instance, a CNN model trained on a dataset of fish images achieved high accuracy in species classification [6,15]. The architecture of CNNs, consisting of convolutional layers, pooling layers, and fully connected layers, allows them to learn hierarchical features from raw pixel data, making them well-suited for image classification tasks. ...
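The convolution → ReLU → pooling pipeline that such architectures stack can be illustrated minimally in plain NumPy, independent of any specific CNN framework:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2-D convolution (really cross-correlation, as in most CNN libraries)."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(x, size=2):
    """Non-overlapping max pooling; crops edges that do not fit."""
    h, w = (x.shape[0] // size) * size, (x.shape[1] // size) * size
    x = x[:h, :w].reshape(h // size, size, w // size, size)
    return x.max(axis=(1, 3))

img = np.arange(36, dtype=float).reshape(6, 6)
edge = np.array([[-1.0, 1.0]])            # horizontal-gradient kernel
feat = np.maximum(conv2d(img, edge), 0)   # ReLU non-linearity
pooled = max_pool(feat)                   # downsample, keeping strongest responses
```

Stacking many such layers, with learned rather than hand-set kernels, is what lets a CNN build hierarchical features from raw pixels.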
Article
Full-text available
Underwater imaging plays a critical role in various fields such as marine biology, environmental monitoring, underwater archaeology, and defense. However, it faces unique challenges including light absorption and scattering, limited visibility, color distortion, and dynamic underwater conditions. Recent advancements in machine learning have provided powerful tools to address these challenges, significantly improving the quality and analysis of underwater images. This review comprehensively explores the intersection of underwater imaging and machine learning, covering supervised learning, unsupervised learning, deep learning, and reinforcement learning techniques. I discuss key applications such as marine species identification, coral reef monitoring, autonomous underwater navigation, archaeological site exploration, and environmental monitoring. Additionally, I examine publicly available datasets, benchmarking methods, and evaluation metrics essential for developing and accessing machine learning models in this domain. Through detailed case studies and practical implementations, I highlight the strengths and weaknesses of various approaches. Emerging trends such as the integration of AI with robotics, advancements in imaging hardware, and the development of specialized algorithms are also discussed. Future directions include enhanced image processing techniques, interdisciplinary collaborations, and real-time processing capabilities. This review aims to provide a comprehensive overview of the current state of underwater imaging and machine learning, highlighting the potential for continued research and innovation in this rapidly evolving field.
... For instance, fish species identification models using original feature extraction or Convolutional Neural Networks (CNNs) have been developed [4][5][6]. Identification models by transfer learning using a pre-trained model of ImageNet [7] have also been developed [8][9][10][11][12][13]. These studies were conducted assuming that the training data, as shown in the column "target task", are provided in each fishing ground; therefore, users need to prepare a new dataset and additionally train the model at the new fishing ground when they want to apply the model to a new place. ...
... Ref.  Dataset                    Model            Pre-train
    [4]   Original                   NN               None
    [5]   Fish4Knowledge             CNN (2 layers)   None
    [8]   Original                   Inception v3     ImageNet
    [6]   Fish-Pak                   CNN (32 layers)  None
    [9]   Fish4Knowledge             ResNet50         ImageNet
    [10]  Fish4Knowledge             Inception v3     ImageNet
    [11]  LifeCLEF 2015 Fish         ResNet50         ImageNet
    [12]  LifeCLEF 2015 Fish & URPC  ResNet50         ImageNet
    [25]  Original                   VGG16            Unknown
    [13]  Large Fish                 MobileNetV2      ImageNet
    [14]  SEAMAPD21                  MobileNetV3      Unknown
    [15]  FishNet                    ConvNeXt         ImageNet
Based on the above, we develop a new transferable fish identification (TFI) model that can be easily applied to various fishing grounds as shown in Figure 3. The proposed method consists of two phases: pre-training and implementation. ...
... Fine-tuning was performed from the model corresponding to the "KPM-train strategy" columns. The table shows that higher performance was achieved by fine-tuning the TFI model using both ImageNet and KPM. The models with no pre-training (none) and those fine-tuned from ImageNet alone achieved an accuracy of approximately 73%, whereas the model using only KPM without ImageNet in pre-training resulted in lower estimation accuracy. ...
Article
Full-text available
The digitization of catch information for the promotion of sustainable fisheries is gaining momentum globally. However, the manual measurement of fundamental catch information, such as species identification, length measurement, and fish count, is highly inconvenient, thus intensifying the call for its automation. Recently, image recognition systems based on convolutional neural networks (CNNs) have been extensively studied across diverse fields. Nevertheless, the deployment of CNNs for identifying fish species is difficult owing to the intricate nature of managing a plethora of fish species, which fluctuate based on season and locale, in addition to the scarcity of public datasets encompassing large catches. To overcome this issue, we designed a transferable pre-trained CNN model specifically for identifying fish species, which can be easily reused in various fishing grounds. Utilizing an extensive fish species photographic database from a Japanese museum, we developed a transferable fish identification (TFI) model employing strategies such as multiple pre-training, learning rate scheduling, multi-task learning, and metric learning. We further introduced two application methods, namely transfer learning and output layer masking, for the TFI model, validating its efficacy through rigorous experiments.
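The "output layer masking" application mentioned above can be sketched as restricting a pre-trained classifier's softmax to the species known to occur at a given fishing ground. The logits and the species mask below are hypothetical stand-ins, not values from the cited model:

```python
import numpy as np

def masked_softmax(logits, allowed):
    """Output-layer masking: restrict a pre-trained classifier's predictions
    to the species actually present at a given fishing ground by sending
    disallowed logits to -inf before the softmax."""
    z = np.where(allowed, logits, -np.inf)
    z = z - z[allowed].max()               # stabilise the exponentials
    e = np.exp(z)                          # exp(-inf) = 0, so masked classes vanish
    return e / e.sum()

logits = np.array([2.0, 1.0, 3.0, 0.5])          # scores for 4 hypothetical species
allowed = np.array([True, False, True, False])   # only species 0 and 2 occur locally
probs = masked_softmax(logits, allowed)
```

The appeal of this scheme is that no retraining is needed when moving to a new fishing ground: only the mask changes.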
... The global ecosystem has entered a new age, termed anthropogenic defaunation, in which human activities drive species range contraction, a new wave of species extinction, and declines in animal abundance in natural ecosystems [1]. Modern technological advances now allow people to understand and explore the ocean in depth. ...
... Later, a Rayleigh distribution is imposed, and neighboring regions are combined by bilinear interpolation to suppress artificial boundaries. The new image is obtained by applying the CLAHE technique to the original image, as given in Eq. (1). ...
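The global histogram equalization that CLAHE builds on can be sketched as follows; CLAHE additionally tiles the image, clips each tile's histogram, and blends the tile mappings with bilinear interpolation, as the excerpt notes. This is a generic textbook formulation, not the cited paper's Eq. (1):

```python
import numpy as np

def hist_equalize(gray):
    """Global histogram equalization on an 8-bit grayscale image:
    remap intensities through the normalized cumulative histogram."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()
    # Classic CDF remapping spreads the occupied intensities over [0, 255]
    lut = np.clip(np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255),
                  0, 255).astype(np.uint8)
    return lut[gray]

gray = np.array([[50, 50, 60], [60, 70, 70]], dtype=np.uint8)
eq = hist_equalize(gray)
```

On this toy image, the three occupied gray levels (50, 60, 70) are stretched to 0, 128, and 255, which is exactly the contrast expansion the enhancement step is after.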
... The first set of optimal features, extracted from DenseNet, is denoted O_k^dnt; the second set, extracted from MobileNet, is denoted O_k^mbn; the third set, extracted from VGG16, is denoted O_k^vgg; and the fourth set, taken from ResNet, is denoted O_k^res. These four sets of optimal features are obtained in the range [1, 5] ...
Article
Full-text available
The classification of fish species has become an essential task for marine ecologists and biologists, both for estimating the large numbers of fish variants in their own environment and for monitoring population changes. Conventional classification methods are expensive, time-consuming, and laborious. Scattering and absorption of light in the deep-sea environment yield very low-resolution images, making the recognition and classification of fish variants highly challenging. The performance of existing computer vision methods degrades underwater because of highly indistinct features and the background clutter of marine species. These classification issues can be addressed using deep structured models, which are highly recommended for enhancing performance in fish species classification. However, only a limited number of fish datasets are available, which makes the system more complex, and deep models require enormous amounts of data for training. It is therefore essential to develop an automated, optimized system to detect, categorize, and track fish species while minimizing manual interference. Thus, this paper proposes a new fish species classification model based on an optimized recurrent neural network (RNN) and feature fusion. Initially, standard underwater images are acquired from a standard database. The gathered images are then pre-processed to clean and enhance their quality using contrast limited adaptive histogram equalization (CLAHE) and histogram equalization. Deep features are extracted using DenseNet, MobileNet, ResNet, and VGG16, and the gathered features are passed to a new optimal feature selection phase, performed with a new heuristic algorithm called the "modified mating probability-based water strider algorithm (MMP-WSA)", which attains the optimal features.
Further, the optimally selected features are fed to a feature fusion process, carried out using an adaptive fusion concept in which the weights are tuned by the designed MMP-WSA. The fused features are then sent to the classification phase, where classification is performed using the developed FishRNFuseNET, in which the parameters of the RNN are tuned by the developed MMP-WSA to obtain accurate classification outcomes. The proposed method is an effective substitute for time-consuming and strenuous identification by human professionals, and it can benefit the monitoring of fish biodiversity in situ.
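The adaptive fusion step can be sketched as a weighted concatenation of the backbone feature sets. In the paper the weights are tuned by MMP-WSA; here they are fixed, and both the weights and the feature values are hypothetical stand-ins:

```python
import numpy as np

def fuse_features(feature_sets, weights):
    """Weighted (adaptive) fusion of several backbone feature vectors:
    normalize the weights to sum to one, scale each set, then concatenate."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return np.concatenate([wi * f for wi, f in zip(w, feature_sets)])

# Hypothetical optimal feature sets from the four backbones
f_dense = np.ones(4) * 2.0
f_mobile = np.ones(4) * 4.0
f_vgg = np.ones(4) * 1.0
f_res = np.ones(4) * 3.0
fused = fuse_features([f_dense, f_mobile, f_vgg, f_res], weights=[1, 2, 1, 1])
```

An optimizer such as MMP-WSA would search over the `weights` vector to maximize downstream classification accuracy.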
... Insects in agricultural areas are currently distinguished mainly through human categorization, but this procedure is time-consuming and costly. The study presented in [15] worked with the CNN model to address the multi-classification issue of agricultural insects. The Model extracts the multi-faceted insect characteristics using the neural network's benefits. ...
... During the regional proposal phase, the regional network used fewer proposed windows rather than a conventional selective search technology, which is essential to improve predictability and accelerate calculations. Experimental findings indicate that the presented technique achieved more accuracy and is superior to the conventional insect classification algorithms [15]. ...
Article
Full-text available
Species identification is a critical task for biological studies, ecological monitoring, and conservation efforts. Although species are distinct categories of living organisms, naming, identifying, and differentiating between them is more complex than it may seem, and it requires a thorough understanding of the evolutionary mechanisms that produce biological variety. Traditional methods, relying on dichotomous keys and manual observation, are time-consuming and error-prone, yet precise species identification is crucial for all taxonomic investigations and biological procedures; identifying even a single species can occupy numerous experts. To address these challenges, we present a robust artificial intelligence framework for species identification using deep learning techniques, specifically leveraging the ResNet-50 convolutional neural network (CNN). Our approach utilizes a ResNet-50-based CNN to accurately classify 15 species, including humans, plants, and animals, from images taken at unique locations and angles. The dataset was pre-processed and augmented to enhance training, ensuring robustness against variations in lighting, occlusion, and background clutter. Featuring 4 million trainable parameters, our modified ResNet-50 model demonstrated superior computational efficiency and accuracy. The proposed model achieved an overall accuracy of 96.5%, with class-specific accuracies of 98.25% for humans, 97.81% for animals, and 96.90% for plants. These results surpass those of existing models such as GoogleNet, VGG, SegNet, and DeepLab v3+, highlighting the efficacy of our approach. Performance was evaluated using metrics such as sensitivity, specificity, and error rate, further validating its reliability. Our findings suggest that the ResNet-50-based CNN model is highly effective for automatic species identification, offering significant improvements in accuracy and computational efficiency.
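The evaluation metrics named above (sensitivity, specificity, error rate) can all be derived from a confusion matrix. The 3-class matrix below is a toy example, not the paper's reported results:

```python
import numpy as np

def per_class_metrics(cm):
    """Sensitivity, specificity and overall error rate from a confusion
    matrix (rows = true class, columns = predicted class)."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)                      # correct predictions per class
    fn = cm.sum(axis=1) - tp              # missed members of the class
    fp = cm.sum(axis=0) - tp              # other classes predicted as this one
    tn = cm.sum() - tp - fn - fp          # everything else
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    error_rate = 1.0 - tp.sum() / cm.sum()
    return sensitivity, specificity, error_rate

# Toy 3-class confusion matrix (e.g. humans / animals / plants)
cm = [[9, 1, 0],
      [0, 8, 2],
      [1, 0, 9]]
sens, spec, err = per_class_metrics(cm)
```

Reporting sensitivity and specificity per class, rather than a single accuracy, exposes which species the model systematically confuses.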
... The utilization of convolutional neural networks (CNNs) combined with data augmentation techniques is a well-established approach to similar tasks, widely acknowledged in the scientific community. Previous studies [3][4][5][6][7][8] that focus on the classification of animal species have demonstrated the effectiveness of convolutional models. They propose species classification systems based on pre-trained CNNs such as ResNet-50 [9], VGG16 [10], AlexNet [11], and GoogleNet [12], utilizing datasets augmented with techniques like rotation, zoom, and shift. ...
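The augmentation techniques mentioned (rotation, zoom, shift) all follow the same pattern of applying a label-preserving geometric transform. A minimal sketch with shift and horizontal flip, using a circular shift as a cheap stand-in for a padded translation:

```python
import numpy as np

def augment(img, shift=(0, 0), flip=False):
    """Simple label-preserving augmentations of the kind cited; rotation
    and zoom follow the same pattern with an interpolating transform."""
    out = np.roll(img, shift, axis=(0, 1))  # circular shift in rows/columns
    if flip:
        out = out[:, ::-1]                  # horizontal mirror
    return out

img = np.arange(9).reshape(3, 3)
shifted = augment(img, shift=(1, 0))
flipped = augment(img, flip=True)
```

Each transformed copy is added to the training set with the original label, multiplying the effective dataset size.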
... In most existing publications, specimens are only classified within the image or distinguished from other objects in the image, such as trash [6,[13][14][15][16][17]. Our research addresses the critical need for objectivity in fish labeling processes, which are traditionally prone to subjective errors. ...
Article
Full-text available
The accurate labeling of species and size of specimens plays a pivotal role in fish auctions conducted at fishing ports. These labels, among other relevant information, serve as determinants of the objectivity of the auction preparation process, underscoring the indispensable nature of a reliable labeling system. Historically, this task has relied on manual processes, rendering it vulnerable to subjective interpretations by the involved personnel, therefore compromising the value of the merchandise. Consequently, the digitization and implementation of an automated labeling system are proposed as a viable solution to this ongoing challenge. This study presents an automatic system for labeling species and size, leveraging pre-trained convolutional neural networks. Specifically, the performance of VGG16, EfficientNetV2L, Xception, and ResNet152V2 networks is thoroughly examined, incorporating data augmentation techniques and fine-tuning strategies. The experimental findings demonstrate that for species classification, the EfficientNetV2L network excels as the most proficient model, achieving an average F-Score of 0.932 in its automatic mode and an average F-Score of 0.976 in its semi-automatic mode. Concerning size classification, a semi-automatic model is introduced, where the Xception network emerges as the superior model, achieving an average F-Score of 0.949.
... A deep learning neural network (DNN) for automatic classification of fish species is proposed in [13]. In that work, a novel training regime was developed to counter the scarcity of training data, achieving a classification rate of 94%. ...
... Study                        Year  Features                                 Model                         Accuracy
    Liang et al. [3]             2020  Shape features                           Convolutional neural network  98.1%
    Knausgård et al. [4]         2022  Generic features                         Convolutional neural network  87.74%
    Böer et al. [5]              2021  Morphological features                   DeepLabV3 and PSPNet models   96.8%
    Iqbal et al. [6]             2019  Generic features                         AlexNet model                 90.48%
    Cui et al. [7]               2020  Generic features                         Convolutional neural network  97.5%
    Zhang et al. [8]             2021  Morphological features                   Convolutional neural network  96%
    Montalbo and Hernandez [9]   2019  Generic features                         VGG16 DCNN model              99%
    Mathur and Goel [10]         2021  Generic features                         ResNet-50 model               98.44%
    Ahmed et al. [11]            2022  Statistical & color features             SVM classifier                94.12%
    Lan et al. [12]              2020  Shape and texture features               Deep CNN                      89%
    Allken et al. [13]           2018  Shape features                           Deep learning neural network  94%
    Andayani et al. [14]         2019  Co-occurrence matrix, geometric moments  Probabilistic neural network  89.65% ...
Article
Full-text available
Recently, the process of fish species classification has become one of the most challenging problems addressed by researchers. In this work, a robust scheme to classify fish images based on robust feature extraction from shape signatures is proposed. First, the image contour is fitted using one of the common approaches named radial basis function neural network (RBFNN) fitting to obtain image centroid. Afterward, prominent features from the shape signature are extracted. These features are representative of fish shapes because they can distinguish the characteristics of each class as well as being relatively robust to scale and rotation changes. Finally, for the classification process purpose, RBFNN is used again for image classification against one of the most commonly used classification techniques called support vector machine (SVM). The proposed paradigm has been applied to a standard fish dataset acquired from a live video dataset grouped into twenty-three clusters representing specific fish species. The resulting accuracy based on SVM and RBFNN was 90.41% and 98.04%, respectively.
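A shape signature of the kind described can be sketched as the centroid-distance function, normalized for scale invariance; rotation of the contour only circularly shifts the signature. The square contour below is a toy input, not the fish dataset:

```python
import numpy as np

def shape_signature(contour):
    """Centroid-distance shape signature: distance from the contour
    centroid to each boundary point, divided by its maximum so the
    signature is invariant to scale."""
    contour = np.asarray(contour, dtype=float)
    centroid = contour.mean(axis=0)
    dist = np.linalg.norm(contour - centroid, axis=1)
    return dist / dist.max()

# A square contour: all four corners are equidistant from the centroid
square = [(0, 0), (0, 2), (2, 2), (2, 0)]
sig = shape_signature(square)
```

Because the signature depends only on relative distances, a rescaled version of the same shape yields an identical signature, which is the robustness property the excerpt highlights.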
... Also, the model was 40 M in size, which is suitable for embedding in a device for real-time classification of marine animals from underwater images [141]. To overcome the limitation of data sets [141], a unique synthetic training data set (a realistic simulation of Deep Vision images) was created from images captured by the camera fitted to the trawl system. This data set was used to train the deep neural network, which successfully identifies blue whiting, Atlantic herring, and Atlantic mackerel with an accuracy of 94%. ...
Article
Full-text available
Machine learning (ML) refers to computer algorithms that predict a meaningful output or categorize complex systems based on a large amount of data. ML is applied in various areas including natural science, engineering, space exploration, and even gaming development. This review focuses on the use of machine learning in the field of chemical and biological oceanography. In the prediction of global fixed nitrogen levels, partial carbon dioxide pressure, and other chemical properties, the application of ML is a promising tool. Machine learning is also utilized in the field of biological oceanography to detect planktonic forms from various images (i.e., microscopy, FlowCAM, and video recorders), spectrometers, and other signal processing techniques. Moreover, ML successfully classified the mammals using their acoustics, detecting endangered mammalian and fish species in a specific environment. Most importantly, using environmental data, the ML proved to be an effective method for predicting hypoxic conditions and harmful algal bloom events, an essential measurement in terms of environmental monitoring. Furthermore, machine learning was used to construct a number of databases for various species that will be useful to other researchers, and the creation of new algorithms will help the marine research community better comprehend the chemistry and biology of the ocean.
... In [5], LeCun et al. described the fundamental principles and the key benefits of deep learning. Recently, deep learning has been applied to a variety of tasks, such as monitoring marine biodiversity [6], [7], target identification in sonar images [8], [9] and sea ice concentration forecasting [10]. For example, Bermant et al. [6] employed convolutional neural networks (CNNs) to classify spectrograms generated from sperm whale acoustic data. ...
... For example, Bermant et al. [6] employed convolutional neural networks (CNNs) to classify spectrograms generated from sperm whale acoustic data. Allken et al. [7] developed a CNN model for fish species classification, leveraging synthetic data for training data augmentation. Lima et al. [8] proposed a deep transfer learning method for automatic ocean front recognition, extracting knowledge from deep CNN models trained on historical data. ...
Preprint
Full-text available
Traditionally, numerical models have been deployed in oceanography studies to simulate ocean dynamics by representing physical equations. However, many factors pertaining to ocean dynamics seem to be ill-defined. We argue that transferring physical knowledge from observed data could further improve the accuracy of numerical models when predicting Sea Surface Temperature (SST). Recently, the advances in earth observation technologies have yielded a monumental growth of data. Consequently, it is imperative to explore ways in which to improve and supplement numerical models utilizing the ever-increasing amounts of historical observational data. To this end, we introduce a method for SST prediction that transfers physical knowledge from historical observations to numerical models. Specifically, we use a combination of an encoder and a generative adversarial network (GAN) to capture physical knowledge from the observed data. The numerical model data is then fed into the pre-trained model to generate physics-enhanced data, which can then be used for SST prediction. Experimental results demonstrate that the proposed method considerably enhances SST prediction performance when compared to several state-of-the-art baselines.
... As fish research expands, monitoring and studying the living habits, health, and growth of fish in deep-sea environments have become increasingly important. To track the condition of fish over the long term, many researchers have begun using fixed underwater observation devices to capture fish activity, generating large amounts of data [2]. However, the raw images captured are often noisy and difficult to process. ...
Article
Full-text available
Underwater fish image segmentation is a crucial technique in marine fish monitoring. However, typical underwater fish images often suffer from issues such as color distortion, low contrast, and blurriness, primarily due to the complex and dynamic nature of the marine environment. To enhance the accuracy of underwater fish image segmentation, this paper introduces an innovative neural network model that combines the attention mechanism with a feature pyramid module. After the backbone network processes the input image through convolution, the data pass through the enhanced feature pyramid module, where it is iteratively processed by multiple weighted branches. Unlike conventional methods, the multi-scale feature extraction module that we designed not only improves the extraction of high-level semantic features but also optimizes the distribution of low-level shape feature weights through the synergistic interactions of the branches, all while preserving the inherent properties of the image. This novel architecture significantly boosts segmentation accuracy, offering a new solution for fish image segmentation tasks. To further enhance the model’s robustness, the Mix-up and CutMix data augmentation techniques were employed. The model was validated using the Fish4Knowledge dataset, and the experimental results demonstrate that the model achieves a Mean Intersection over Union (MIoU) of 95.1%, with improvements of 1.3%, 1.5%, and 1.7% in the MIoU, Mean Pixel Accuracy (PA), and F1 score, respectively, compared to traditional segmentation methods. Additionally, a real fish image dataset captured in deep-sea environments was constructed to verify the practical applicability of the proposed algorithm.
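Mix-up, one of the augmentation techniques employed above, forms convex combinations of image pairs and their labels; CutMix instead pastes a rectangular patch of one image into the other and mixes the labels by patch area. A minimal sketch of Mix-up:

```python
import numpy as np

def mixup(x1, y1, x2, y2, lam):
    """Mix-up augmentation: a convex combination of two images and
    their one-hot labels, weighted by lam in [0, 1]."""
    x = lam * x1 + (1 - lam) * x2
    y = lam * y1 + (1 - lam) * y2
    return x, y

x1, y1 = np.zeros((4, 4)), np.array([1.0, 0.0])   # toy image of class 0
x2, y2 = np.ones((4, 4)), np.array([0.0, 1.0])    # toy image of class 1
x, y = mixup(x1, y1, x2, y2, lam=0.7)
```

In training, `lam` is usually sampled from a Beta distribution per batch, so the model sees a continuum of blended examples rather than a fixed ratio.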
... Skillful image filtering and image blending resulted in low TAR values of 80.4% and 75.6%, respectively. Allken et al. (2019) used deep learning neural networks to automate species classification in Deep Vision trawl camera images. Acoustic-trawl surveys are important for marine resource management and environmental monitoring, but identifying species from acoustic signals is difficult. ...
Article
Full-text available
Fish species classification is crucial for understanding and preserving marine biodiversity. Advanced technologies such as computer vision and machine learning facilitate the identification and classification of different fish species based on their unique physical characteristics. Automatic fish classification systems are essential for biodiversity assessment, fisheries management, and environmental monitoring. This process involves collecting the image data of fish, extracting relevant features, and training machine learning models. Preprocessing the image data using Gaussian and median filters removes noise and enhances image quality. Mathematical morphological operations are employed for segmentation. For feature extraction, Gray Level Co-occurrence Matrix (GLCM) and geometrical features are used. The GLCM extracts texture features, while geometrical features describe the shape and structure of the fish. Classifiers such as Support Vector Machines (SVM) and Convolutional Neural Networks (CNN) are then used to train the data, comparing it with the extracted features to achieve high accuracy in classification. This accurate classification is critical, especially considering the impact of environmental factors and fish species reduction on the balance of marine ecosystems. Changes in fish population can disrupt the ecological balance, highlighting the importance of effective monitoring and management systems to protect oceanic and sea environments.
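The GLCM texture-feature step described above can be sketched minimally: build a co-occurrence table for one pixel offset, normalize it to a joint probability table, and derive a statistic such as contrast. The 2x2 image below is a toy example:

```python
import numpy as np

def glcm(gray, levels, dx=1, dy=0):
    """Gray Level Co-occurrence Matrix for one offset (dx, dy),
    normalized so entries are joint probabilities of gray-level pairs."""
    h, w = gray.shape
    m = np.zeros((levels, levels))
    for i in range(h - dy):
        for j in range(w - dx):
            m[gray[i, j], gray[i + dy, j + dx]] += 1
    return m / m.sum()

def glcm_contrast(p):
    """Texture contrast: sum over (i - j)^2 * p(i, j)."""
    i, j = np.indices(p.shape)
    return np.sum((i - j) ** 2 * p)

gray = np.array([[0, 1], [0, 1]])   # toy 2-level image
p = glcm(gray, levels=2)
contrast = glcm_contrast(p)
```

Other standard GLCM statistics (energy, homogeneity, correlation) are computed from the same table `p`, giving a compact texture descriptor to feed an SVM or CNN.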
... In most circumstances, the image will be captured while the fisher is handling/cleaning the fish, where it will likely be partly obscured, and left-right/dorsal-ventral orientation is uncontrolled. In theory, the use of hundreds or thousands of two-dimensional images from various angles making up a synthetic rendering of a fish would considerably improve real-world identification accuracy (Allken et al. 2019). Further research could assess the use of three-dimensional imaging in the training data. ...
... At present, fish classification in some countries still relies on a large amount of manual work, which has many drawbacks, such as low efficiency, high labor intensity, a harsh working environment, and high costs. Therefore, establishing a marine fish image recognition model and combining machine vision and deep learning algorithms to study fish recognition technology can provide technical support for the development of fishery resources and intelligent fisheries (Allken et al. 2019; Antanasijević et al. 2019; Piechaud et al. 2019). ...
Article
Full-text available
Objective In order to solve the problems of low accuracy and limited generalization ability in traditional marine fish species identification methods, the optimized ResNet 50 model is proposed in this paper. Methods First, a data set of marine fish images was constructed, targeting 30 common marine fish species (e.g., Japanese Eel Anguilla japonica, Japanese Horsehead Branchiostegus japonicus, Black Sea Sprat Clupeonella cultriventris, and Atlantic Cutlassfish Trichiurus lepturus). The marine fish images were pre‐processed to increase the sample size of the data set. Second, the ResNet50 model was optimized by introducing a Dual Multi‐Scale Attention Network (DMSANet) module to improve the model's attention to subtle features. A dropout regularization mechanism and dense layer were added to improve the model's generalization ability and prevent overfitting. The triplet loss function was adopted as the optimization objective of the model to reduce errors. Third, species identification was conducted on 30 species of marine fish to test the comprehensive performance of the optimized ResNet50 model. Result The test results showed that the optimized model had a recognition accuracy of 98.75% in complex situations, which was 3.05% higher than that of the standard ResNet50 model. A confusion matrix of the visual analysis results showed that the optimized ResNet50 model had a high accuracy rate for marine fish species recognition in many cases. To further validate and evaluate the generalization ability of the optimized ResNet50 model, partial fish data from the ImageNet database and the Queensland University of Technology (QUT) Fish Dataset were used as data sets for performance experiments. The results showed that the optimized ResNet50 model achieved accuracies of 97.65% and 98.75% on the two benchmark data sets (ImageNet and the QUT Fish Dataset, respectively). 
Conclusion The optimized ResNet50 model integrates the DMSANet module, effectively capturing subtle features in images and improving the accuracy of fish classification tasks. This model has good recognition and generalization abilities in complex scenes, and can be applied to marine fish recognition tasks in different situations.
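The triplet loss adopted above as the optimization objective can be sketched directly; the 2-D embeddings below are hypothetical stand-ins for the network's outputs:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss: pull same-species embeddings together and push
    different-species embeddings at least `margin` further apart,
    using squared Euclidean distances."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(d_pos - d_neg + margin, 0.0)

a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])   # same species, close in embedding space
n = np.array([1.0, 0.0])   # different species, far away
loss = triplet_loss(a, p, n)
```

When the negative is already further than the positive by more than the margin, the loss is zero and the triplet contributes no gradient, which concentrates training on hard examples.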
... Although research on fish identification continues, one prior study proposed that 91% of oceanic species, including fish, have not yet been discovered [6]. This identification is essential for maintaining biodiversity (evolutionary biology, interaction, and presence of endangered species), food and drug safety (ingredients and sources), and sustainable fishery management (estimating fish density and stock status) [7][8][9]. Conventional fish species identification has been performed in Saudi Arabia since 1761 [10]. At present, several assessment techniques are used to identify fish species, of which next-generation sequencing (NGS) and DNA barcoding represent important advances in this field. ...
Article
Full-text available
Fish identification in the Red Sea, particularly in Saudi Arabia, has a long history. Because of the vast fish diversity in Saudi Arabia, proper species identification is required. Indeed, identifying fish species is critical for biodiversity conservation, food and drug safety, and sustainable fishery management. Numerous approaches have been used to identify fish species, including conventional morphological identification, next-generation sequencing (NGS), nanopore sequencing, DNA barcoding, and environmental DNA analysis. In this review, we collected as much scientific information as possible on species identification in Saudi Arabia. Our findings suggest that the identification process has advanced and spread rapidly and broadly, as evidenced by the discovery of new fish species in Saudi Arabia. The advantages and disadvantages of each method were discussed as part of a comprehensive comparison. This study aimed to provide further scientific knowledge to promote the growth of fish diversity worldwide.
... Image classification techniques based on deep learning often require large numbers of image samples, so obtaining sufficient samples is crucial for training a robust convolutional neural network classifier. Additional images for the fish training dataset can be obtained by generating synthetic fish images and training the convolutional neural network on them [16]. ...
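The synthetic-image idea above can be illustrated with a minimal compositing routine: segmented fish crops are pasted onto background frames to manufacture labeled training images. This is a hedged numpy sketch with toy arrays standing in for real trawl-camera data; the actual Deep Vision simulation pipeline is more elaborate.

```python
import numpy as np

def composite(background, fish, mask, top, left):
    """Paste a fish crop onto a background image using a binary mask.

    background: (H, W, 3) uint8; fish: (h, w, 3) uint8; mask: (h, w) in {0, 1}.
    Returns a new image; the background is left untouched.
    """
    out = background.copy()
    h, w = mask.shape
    patch = out[top:top + h, left:left + w]   # view into the copy
    patch[mask > 0] = fish[mask > 0]          # overwrite only masked pixels
    return out

# Hypothetical stand-ins for an empty trawl-camera frame and a segmented fish crop.
background = np.zeros((128, 128, 3), dtype=np.uint8)
fish = np.full((32, 48, 3), 200, dtype=np.uint8)
mask = np.ones((32, 48), dtype=np.uint8)

synthetic = composite(background, fish, mask, top=40, left=20)
```

Repeating this with random positions, scales, and fish crops per class yields an arbitrarily large labeled training set from a handful of segmented examples.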
Preprint
The morphological and color variations of Betta splendens present significant challenges for feature extraction in computer vision applications. Accurate identification is crucial, as the valuation and temperament of these fish specimens are heavily dependent on their morphological characteristics. Although previous studies have explored various computer vision techniques for Betta splendens classification, there has been little focus on evaluating the performance of different deep learning architectures, especially those optimized for deployment on resource-constrained devices. To address this gap, this study compares the performance of general-purpose deep learning architectures with mobile-specific architectures for Betta splendens classification. Several widely used deep learning models were selected for this study based on their availability and relevance. Input feature analysis of the pre-trained architectures was conducted to ensure each model effectively extracts features crucial for classification. The results demonstrate that the InceptionV3 model, fine-tuned with the iNaturalist dataset, achieves the highest accuracy of 0.953 and a recall of 0.9532, outperforming other CNN models. Among mobile-specific architectures, MobileNetV3Small achieved the best results, with 0.90 accuracy and 0.90 recall, on par with the VGG-16 baseline model while having significantly fewer parameters. Additionally, as a contribution to the research community, a comprehensive dataset of Betta splendens fish variants was compiled and made publicly available to support future studies in this domain.
... In recent years, computer information technology has developed rapidly, and computer vision has made significant progress [13]. Target detection technology has become an important tool in underwater biological monitoring [14,15], and target detection methods are therefore increasingly used in fishery resource surveys. ...
Article
Full-text available
To improve detection efficiency and reduce cost consumption in fishery surveys, target detection methods based on computer vision have become a new method for fishery resource surveys. However, the specialty and complexity of underwater photography result in low detection accuracy, limiting its use in fishery resource surveys. To solve these problems, this study proposed an accurate method named BSSFISH-YOLOv8 for fish detection in natural underwater environments. First, replacing the original convolutional module with the SPD-Conv module allows the model to lose less fine-grained information. Next, the backbone network is supplemented with a dynamic sparse attention technique, BiFormer, which enhances the model’s attention to crucial information in the input features while also optimizing detection efficiency. Finally, adding a 160 × 160 small target detection layer (STDL) improves sensitivity for smaller targets. The model scored 88.3% and 58.3% in the two indicators of mAP@50 and mAP@50:95, respectively, which is 2.0% and 3.3% higher than the YOLOv8n model. The results of this research can be applied to fishery resource surveys, reducing measurement costs, improving detection efficiency, and bringing environmental and economic benefits.
... Underwater fish detection faces the challenge of limited labeled data [11,12]. Allken et al. [13] proposed a method based on realistic simulation of Deep Vision trawl-camera images to expand the training set, achieving a 94% classification accuracy for blue whiting, Atlantic herring, and Atlantic mackerel. Banan et al. [14] applied a VGG16 deep learning model pre-trained on the ImageNet dataset for fish species recognition, achieving an average classification accuracy of 100% for four species of Asian carp. ...
Article
Full-text available
Due to the complexity of underwater environments and the lack of training samples, the application of target detection algorithms to the underwater environment has yet to provide satisfactory results. It is crucial to design specialized underwater target recognition algorithms for different underwater tasks. In order to achieve this goal, we created a dataset of freshwater fish captured from multiple angles and lighting conditions, aiming to improve underwater target detection of freshwater fish in natural environments. We propose a method suitable for underwater target detection, called DyFish-DETR (Dynamic Fish Detection with Transformers). In DyFish-DETR, we propose a DyFishNet (Dynamic Fish Net) to better extract fish body texture features. A Slim Hybrid Encoder is designed to fuse fish body feature information. The results of ablation experiments show that DyFishNet can effectively improve the mean Average Precision (mAP) of model detection. The Slim Hybrid Encoder can effectively improve Frame Per Second (FPS). Both DyFishNet and the Slim Hybrid Encoder can reduce model parameters and Floating Point Operations (FLOPs). In our proposed freshwater fish dataset, DyFish-DETR achieved a mAP of 96.6%. The benchmarking experimental results show that the Average Precision (AP) and Average Recall (AR) of DyFish-DETR are higher than several state-of-the-art methods. Additionally, DyFish-DETR, respectively, achieved 99%, 98.8%, and 83.2% mAP in other underwater datasets.
... Automated detection via deep learning has successful applications in many domains involving image data, such as face detection [88], pedestrian detection [89], license plate [90], and traffic sign [91] detection in autonomous driving, as well as event and object detection in daily living [92,93]. Transformer models have shown increasing success in recent image analysis literature, including image recognition [94], segmentation [95], super-resolution [96], generation [97], and visual question answering [98]. ...
Article
Osteoporosis is the most common chronic metabolic bone disease worldwide. Vertebral compression fracture (VCF) is the most common type of osteoporotic fracture. Approximately 700,000 osteoporotic VCFs are diagnosed annually in the USA alone, resulting in an annual economic burden of ~$13.8B. With an aging population, the rate of osteoporotic VCFs and their associated burdens are expected to rise. Those burdens include pain, functional impairment, and increased medical expenditure. Therefore, it is of utmost importance to develop an analytical tool to aid in the identification of VCFs. Computed Tomography (CT) imaging is commonly used to detect occult injuries. Unlike the existing VCF detection approaches based on CT, the standard clinical criteria for determining VCF relies on the shape of vertebrae, such as loss of vertebral body height. We developed a novel automated vertebrae localization, segmentation, and osteoporotic VCF detection pipeline for CT scans using state-of-the-art deep learning models to bridge this gap. To do so, we employed a publicly available dataset of spine CT scans with 325 scans annotated for segmentation, 126 of which also graded for VCF (81 with VCFs and 45 without VCFs). Our approach attained 96% sensitivity and 81% specificity in detecting VCF at the vertebral-level, and 100% accuracy at the subject-level, outperforming deep learning counterparts tested for VCF detection without segmentation. Crucially, we showed that adding predicted vertebrae segments as inputs significantly improved VCF detection at both vertebral and subject levels by up to 14% Sensitivity and 20% Specificity (p-value = 0.028).
... Several studies, such as those by Norouzzadeh et al. (2018), Allken et al. (2019), Barbedo (2019), Kaya et al. (2019), Montalbo and Hernandez (2019), and Palmer et al. (2022), highlight the effectiveness of CNNs in animal species classification. The adoption of pre-trained models such as ResNet-50, VGG16, and Xception underlines a fundamental trend in the deep learning community: transfer learning. ...
Article
Full-text available
Convolutional neural networks (CNNs) have revolutionized image recognition. Their ability to identify complex patterns, combined with learning transfer techniques, has proven effective in multiple fields, such as image classification. In this article we propose to apply a two-step methodology for image classification tasks. First, apply transfer learning with the desired dataset, and subsequently, in a second stage, replace the classification layers by other alternative classification models. The whole methodology has been tested on a dataset collected at Conil de la Frontera fish market, in Southwest Spain, including 19 different fish species to be classified for fish auction market. The study was conducted in five steps: (i) collecting and preprocessing images included in the dataset, (ii) using transfer learning from 4 well-known CNNs (ResNet152V2, VGG16, EfficientNetV2L and Xception) for image classification to get initial models, (iii) apply fine-tuning to obtain final CNN models, (iv) substitute classification layer with 21 different classifiers obtaining multiple F1-scores for different training-test splits of the dataset for each model, and (v) apply post-hoc statistical analysis to compare their performances in terms of accuracy. Results indicate that combining the feature extraction capabilities of CNNs with other supervised classification algorithms, such as Support Vector Machines or Linear Discriminant Analysis is a simple and effective way to increase model performance.
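The two-step methodology above (frozen feature extractor, then a substitute classifier) can be sketched minimally. In this hedged toy version, a fixed random projection stands in for a pre-trained CNN backbone and a nearest-centroid rule stands in for the SVM/LDA-style classifiers of the study; all data are synthetic samples.

```python
import numpy as np

rng = np.random.default_rng(42)

def extract_features(images, W):
    """Stand-in for a frozen CNN backbone: fixed random projection + ReLU."""
    flat = images.reshape(len(images), -1)
    return np.maximum(flat @ W, 0.0)

def fit_centroids(feats, labels):
    """Step 2: fit a lightweight classifier on the frozen features."""
    return {c: feats[labels == c].mean(axis=0) for c in np.unique(labels)}

def predict(feats, centroids):
    classes = sorted(centroids)
    d = np.stack([np.linalg.norm(feats - centroids[c], axis=1) for c in classes])
    return np.array(classes)[d.argmin(axis=0)]

# Hypothetical 8x8 grayscale "fish images" for two species.
X0 = rng.normal(0.0, 1.0, (20, 8, 8))
X1 = rng.normal(3.0, 1.0, (20, 8, 8))
X = np.concatenate([X0, X1])
y = np.array([0] * 20 + [1] * 20)

W = rng.normal(size=(64, 16))                   # frozen "backbone" weights
feats = extract_features(X, W)
centroids = fit_centroids(feats, y)
preds = predict(feats, centroids)
```

Swapping the classifier while keeping the extracted features fixed is cheap, which is what makes comparing 21 classification heads per backbone tractable.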
... The utilization of AI in the preservation of aquatic and marine biodiversity as well as water resources has garnered considerable research interest in the past decade. Artificial Intelligence (AI) and Machine Learning (ML) models have been employed to forecast stream flow [75], assess water quality [76][77][78][79][80][81], detect water pollution and toxicology [82,83], anticipate changes in aquatic and marine biodiversity [84][85][86], predict species distribution and map habitats [87,88], as well as recognize and classify marine and aquatic species [89][90][91][92][93][94][95][96][97]. The aforementioned AI research in aquatic and marine biodiversity and water resource conservation emphasizes the crucial role of AI in developing innovative technology to discover previously unknown aspects of conservation and potential risks to the structures and functions of aquatic and marine ecosystems. ...
Article
Full-text available
The recent progress in data science, along with the transformation in digital and satellite technology, has enhanced the capacity for artificial intelligence (AI) applications in the forestry and wildlife domains. Nevertheless, the swift proliferation of developmental projects, agricultural, and urban areas pose a significant threat to biodiversity on a global scale. Hence, the integration of emerging technologies such as AI in the fields of forests and biodiversity might facilitate the efficient surveillance, administration, and preservation of biodiversity and forest resources. The objective of this paper is to present a comprehensive review of how AI and machine learning (ML) algorithms are utilized in the forestry sector and biodiversity conservation worldwide. Furthermore, this research examines the difficulties encountered while implementing AI technology in the fields of forestry and biodiversity. Enhancing the availability of extensive data pertaining to forests and biodiversity, along with the utilization of cloud computing and digital and satellite technology, can facilitate the wider acceptance and implementation of AI technology. The findings of this study would inspire forest officials, scientists, researchers, and conservationists to investigate the potential of AI technology for the purposes of forest management and biodiversity conservation.
... At present, fish identification methods are based mainly on deep learning [12][13][14] convolutional neural network algorithms, with the convolutional neural network (CNN) [15] and the generative adversarial network [16] at their core. As the most representative of these, the convolutional neural network retains clear advantages in image processing [17][18][19], image classification [20][21], and image recognition [22][23][24][25][26]. ...
... In a complex marine ecosystem marked by rich biodiversity, numerous unidentified objects, and deep waters, acoustic surveys have proven to be the best means of collecting marine data and estimating fish species biomass and abundance. The development of new marine technologies such as Autonomous Underwater Vehicles (AUVs), Remotely Operated Vehicles (ROVs), and trawls equipped with cameras is limiting human involvement while collecting image and video data of marine objects, which are more direct than echo data (Allken et al., 2018; Cui et al., 2020; Eickholt et al., 2020). However, acoustic sensors still provide promising results, especially in deeper waters. ...
Article
In fishery acoustics, surveys using sensor systems such as sonars and echosounders have been widely considered accurate tools for acquiring fish species data and estimating fish species biomass and abundance. During acoustic surveys, research vessels are equipped with echosounders that produce sound waves and then record all echoes coming from objects and targets in the water column. The preprocessing and scrutinizing of acoustic fish species data have traditionally been conducted manually and are considered time-consuming. Meanwhile, deep learning and machine learning-based approaches have been adopted to automate or partially automate the acoustic echo scrutinizing process and build an objective process in which the uncertainty of species echo classification is expected to be lower than that of human experts. A review of the state of the art of deep learning and machine learning applications in acoustic fish species echo classification is therefore timely. The present paper identifies and scans the studies conducted on acoustic fish echo identification using deep learning and machine learning approaches; the scope could be extended to other marine organisms beyond fish species. To search for related papers, we used a systematic approach covering the best-known electronic databases over the last five years. We identified 13 related works, which were processed to give a summary of the multiple deep and machine learning approaches used in acoustic fish species identification, and then compared their architectures, performances, and the challenges encountered in their applications.
... The first group explores cases where the utilization of synthetic data immediately enhances the overall performance of a model, representing an ideal scenario. This group includes studies conducted by Shafaei et al. (2016), Kim et al. (2022) and Allken et al. (2019). These researchers observed that incorporating or generating synthetic datasets led to improved performance when applied to real-world datasets. ...
Article
Full-text available
The assessment of visual blockages in cross-drainage hydraulic structures, such as culverts and bridges, is crucial for ensuring their efficient functioning and preventing flash flooding incidents. The extraction of blockage-related information through computer vision algorithms can provide valuable insights into the visual blockage. However, the absence of comprehensive datasets has posed a significant challenge in effectively training computer vision models. In this study, we explore the use of synthetic data, the synthetic images of culvert (SIC) and the visual hydraulics lab dataset (VHD), in combination with a limited real-world dataset, the images of culvert openings and blockage (ICOB), to evaluate the performance of a culvert opening detector. The Faster Region-based Convolutional Neural Network (Faster R-CNN) model with a ResNet50 backbone was used as the culvert opening detector. The impact of synthetic data was evaluated through two experiments. The first involved training the model with different combinations of synthetic and real-world data, while the second involved training the model with reduced real-world images. The results of the first experiment revealed that structured training, where the synthetic images of culvert (SIC) were used for initial training and the ICOB was used for fine-tuning, resulted in slightly improved detection performance. The second experiment showed that the use of synthetic data, in conjunction with a reduced number of real-world images, resulted in significantly improved degradation rates. HIGHLIGHTS: Culverts are prone to blockage and often cause flooding in urban regions; regular maintenance is therefore important. Computer vision and artificial intelligence models are proposed to assess culverts for visual blockage, automating the unsafe and expensive process of manual inspection. The study proposes the use of artificially generated images to train the computer vision models and reports the insights.
... The network was evaluated using a test data set of 2400 genuine photos, 800 for each species, and obtained a classification accuracy of 94% [44]. Chen and Chang introduced storage of Koi fish using mobile cloud computing. ...
... Image classification methods based on deep learning have superior performance, and researchers have already proposed using them for fish species classification. Allken [16] proposed training convolutional neural networks on synthetic fish image data for the classification of marine fish species, mainly addressing the problem of insufficient training data and achieving a classification accuracy of 94%. Urbanova et al. [10] proposed using a fine-tuned and optimized VGG16 network to classify the fish species of the Verde islands. ...
Article
Full-text available
The classification of fish species has important practical significance for both the aquaculture industry and ordinary people. However, existing methods for classifying marine and freshwater fishes have poor feature extraction ability and do not meet actual needs. To address this issue, we propose a novel method for multi-water fish classification (Fish-TViT) based on transfer learning and visual transformers. Fish-TViT uses a label smoothing loss function to solve the problem of overfitting and overconfidence of the classifier. We also employ Gradient-weighted Class Activation Mapping (Grad-CAM) technology to visualize and understand the features of the model and the areas on which decisions depend, which guides the optimization of the model architecture. We first crop and clean fish images, and then use data augmentation to expand the training dataset. A pre-trained visual transformer model is used to extract enhanced features of fish images, which are subsequently cropped into a series of flat patches. Finally, a multi-layer perceptron is used to predict fish species. Experimental results show that Fish-TViT achieves high classification accuracy on both low-resolution marine fish data (94.33%) and high-resolution freshwater fish data (98.34%). Compared with traditional convolutional neural networks, Fish-TViT has better performance.
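The label smoothing loss mentioned above can be written compactly. This is a generic numpy sketch, spreading eps/(K-1) over the non-target classes (some implementations use eps/K instead); it is not necessarily the exact variant used in Fish-TViT.

```python
import numpy as np

def label_smoothing_ce(logits, targets, eps=0.1):
    """Cross-entropy against smoothed targets: (1 - eps) on the true class,
    eps / (K - 1) spread over the other classes. eps = 0 recovers standard CE."""
    K = logits.shape[1]
    # Numerically stable log-softmax.
    logp = logits - logits.max(axis=1, keepdims=True)
    logp = logp - np.log(np.exp(logp).sum(axis=1, keepdims=True))
    smooth = np.full_like(logp, eps / (K - 1))
    smooth[np.arange(len(targets)), targets] = 1.0 - eps
    return -(smooth * logp).sum(axis=1).mean()

# Two confident, correct predictions over 3 classes.
logits = np.array([[4.0, 0.0, 0.0], [0.0, 3.0, 0.0]])
targets = np.array([0, 1])
hard = label_smoothing_ce(logits, targets, eps=0.0)   # standard cross-entropy
soft = label_smoothing_ce(logits, targets, eps=0.1)   # smoothed version
```

Because the smoothed target never puts full mass on one class, the loss stays bounded away from zero even for confident correct predictions, which discourages the overconfidence the abstract refers to.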
... data augmentation) are a common way to improve performance and generalization in CNNs. Allken et al. (2019) trained an Inception V3 architecture on 5000 data-augmented images per species to reach 94% accuracy on a test set, while the baseline model, trained on the 70 original images, reached an accuracy between 50 and 71%. Bogucki et al. (2019) used a combination of three CNNs to detect and identify North Atlantic right whales in aerial and satellite images. ...
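Augmentations of the kind described above (flips, small shifts or crops) take only a few lines. A hedged numpy sketch on a toy image, with np.roll used as a crude stand-in for random cropping:

```python
import numpy as np

rng = np.random.default_rng(7)

def augment(image):
    """One random horizontal flip plus a small random cyclic shift -- a crude
    stand-in for the flip/crop augmentations used to inflate small datasets."""
    out = image
    if rng.random() < 0.5:
        out = out[:, ::-1]                       # horizontal flip
    dy, dx = rng.integers(0, 5, size=2)          # shift by up to 4 pixels
    return np.roll(out, (int(dy), int(dx)), axis=(0, 1))

# One "original" 8x8 image expanded into 16 augmented variants.
image = np.arange(64, dtype=np.uint8).reshape(8, 8)
augmented = [augment(image) for _ in range(16)]
```

Applied thousands of times per original image, this is how 70 photographs per species can be inflated into the 5000-image training sets mentioned above.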
Article
Full-text available
Machine learning covers a large set of algorithms that can be trained to identify patterns in data. Thanks to increases in the amounts of data and computing power available, it has become pervasive across scientific disciplines. We first highlight why machine learning is needed in marine ecology. Then we provide a quick primer on machine learning techniques and vocabulary. We built a database of ~1000 publications that implement such techniques to analyse marine ecology data. For various raw data types (images, optical spectra, acoustics, omics, geolocations, biogeochemical profiles and satellite imagery), we present a historical perspective on applications and cite references that proved influential, can serve as templates for new work or represent the diversity of approaches. Then, we illustrate how machine learning can be used to better understand ecological systems, by combining various sources of marine data. Through this coverage of the literature, we demonstrate an increase in the proportion of marine ecology studies that use machine learning, the pervasiveness of images as a data source, the dominance of machine learning for classification-type problems, and a shift towards deep learning for all data types. This overview is meant to guide researchers who wish to apply machine learning methods to their marine data sets.
... This will not only make the LLMs more transparent but can also open the door to field-specific LLMs such as a taxonomy/ biodiversity LLM where it is actually designed to produce accurate descriptions and detailed diagnoses. Fortunately, research into the use of machine learning for taxonomic research is already underway (Allken et al. 2018, Tan et al. 2020, Tan et al. 2021, Thenmozhi et al. 2021, which can provide a model for an open-AI system. Furthermore, investment into image-based learning models (IBLMs) that can be coupled to such LLMs would be the ideal setup to advance this technology to the service of taxonomy. ...
... Especially within the health sector, the approach gained popularity with applications ranging from generating synthetic patient data (Choi et al., 2018) over synthetic electronic health records (Yahi et al., 2017) to generating synthetic cell images (Siwicki, 2021). More and more start-ups are offering synthetic data generation as a service and in the industry, synthetic data are used in such diverse contexts as autonomous driving (Osinski et al., 2020), classifying computed tomography images (Frid-Adar et al., 2018) or environmental monitoring (Allken et al., 2018). ...
Preprint
Full-text available
The idea to generate synthetic data as a tool for broadening access to sensitive microdata has been proposed for the first time three decades ago. While first applications of the idea emerged around the turn of the century, the approach really gained momentum over the last ten years, stimulated at least in parts by some recent developments in computer science. We consider the upcoming 30th jubilee of Rubin's seminal paper on synthetic data (Rubin, 1993) as an opportunity to look back at the historical developments, but also to offer a review of the diverse approaches and methodological underpinnings proposed over the years. We will also discuss the various strategies that have been suggested to measure the utility and remaining risk of disclosure of the generated data.
... Xu et al. (2017) and Jin et al. (2017) used relatively limited datasets for learning and testing CNNs pre-trained on ImageNet to classify images through transfer learning. To increase the amount of training data, Allken et al. (2019) developed a data-augmentation scheme for the Deep Vision trawl camera system, achieving a 94% classification accuracy using a pre-trained Inception V3. Banan et al. (2020) used a pre-trained VGG network to achieve a 100% classification accuracy for four types of carp. Although their work showed that the model could effectively extract the visual features of fish, the dataset used in experiments was small, and the background was simple. ...
Article
Full-text available
The visibility of fishes and invertebrates is highly impacted by the complexity of the environment. Images acquired in underwater environments suffer from blurriness and low contrast. This results in a low classification accuracy. To address this problem, this study uses a pre-trained Resnet50 neural network as the feature extractor, which avoids over-fitting and accuracy saturation while realizing improved feature extraction capabilities. It also proposes an enhancement of the error-minimized random vector functional link (EEMRVFL) neural network, which is used as the classifier in the convolutional neural network (CNN) model instead of the original softmax classifier. EEMRVFL reduces the maximum residual error in each incremental process. The selected hidden nodes are added to the network, which improves the compactness of its structure. The proposed residual CNNs model exhibits improved classification accuracy for underwater image classification compared to existing methods. This is demonstrated experimentally on available datasets such as URPC, LifeCLEF 2015, and Fish4Knowledge with accuracy rates reaching 99.68%, 97.34%, and 99.77%, respectively.
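The random vector functional link (RVFL) family that the EEMRVFL classifier above extends combines a fixed random hidden layer, direct input-to-output links, and a closed-form ridge-regression readout. A basic RVFL sketch on toy data (the error-minimized incremental node selection that distinguishes EEMRVFL is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(3)

def rvfl_fit(X, Y, n_hidden=64, ridge=1e-3):
    """Basic RVFL: random hidden weights are drawn once and never trained;
    only the linear readout `beta` is solved for, in closed form."""
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.concatenate([X, np.tanh(X @ W + b)], axis=1)   # direct links + enhancement nodes
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(H.shape[1]), H.T @ Y)
    return (W, b, beta)

def rvfl_predict(X, model):
    W, b, beta = model
    H = np.concatenate([X, np.tanh(X @ W + b)], axis=1)
    return (H @ beta).argmax(axis=1)

# Hypothetical 10-dimensional feature vectors for two classes
# (standing in for features extracted by a frozen ResNet50 backbone).
X0 = rng.normal(0, 1, (30, 10))
X1 = rng.normal(2, 1, (30, 10))
X = np.vstack([X0, X1])
y = np.array([0] * 30 + [1] * 30)
Y = np.eye(2)[y]                        # one-hot targets for the readout
model = rvfl_fit(X, Y)
```

Because only the readout is solved, fitting is a single linear solve rather than an iterative optimization, which is what makes RVFL-style heads attractive as drop-in replacements for softmax classifiers.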
Article
Recently, methods that combine the merits of numerical models and deep learning to improve the prediction accuracy of sea surface temperature have received considerable attention. Existing methods usually apply the output of the numerical model as physical knowledge to guide the training of deep learning models. However, the physical knowledge in the observed data has not been fully exploited. With the development of observational instruments and techniques, an increasing amount of observational data has been collected. These data can be utilized for the exploration of physical knowledge. Towards this end, we propose a novel scheme for sea surface temperature (SST) prediction, which applies generative adversarial networks (GANs) to analyze the physical knowledge in historical data. In particular, two GAN models are trained with numerical model data and observed data separately. Afterwards, physical knowledge that is present in the observed data but not in the data generated by the numerical model is extracted by comparing the physical features learned by the two pretrained GAN models. Finally, to validate the relevance of the physical knowledge we have discovered, the extracted features are added to the numerical model data, yielding what we call the newly-corrected data. We then train two spatial-temporal models for SST prediction, one over the newly-corrected dataset and one over the original numerical model data. The experimental results show that the newly-corrected dataset performs better than the original numerical model data for SST prediction.
Article
Full-text available
The application of computer vision in fish identification facilitates researchers and managers to better comprehend and safeguard the aquatic ecological environment. Numerous researchers have harnessed deep learning methodologies for studying fish species identification. Nonetheless, this endeavor still encounters challenges such as high computational costs, a substantial number of parameters, and limited practicality. To address these issues, we propose a lightweight network architecture incorporating deformable convolutions, termed DeformableFishNet. Within DeformableFishNet, an efficient global coordinate attention module (EGCA) is introduced alongside a deformable convolution network (EDCN/EC2f), which is grounded in EGCA, to tackle the deformation of fish bodies induced by swimming motions. Additionally, an EC2f-based feature pyramid network (EDBFPN) and an efficient multi-scale decoupling head (EMSD Head) are proposed to extract multi-scale fish features within a lightweight framework. DeformableFishNet was deployed on our freshwater fish dataset, with experimental outcomes illustrating its efficacy, achieving a mean average precision (mAP) of 96.3%. The model comprises 1.7 million parameters and entails 4.7 billion floating-point operations (FLOPs). Furthermore, we validated DeformableFishNet on three public underwater datasets, yielding respective mAPs of 98%, 99.4%, and 83.6%. The experiments show that DeformableFishNet is suitable for underwater identification of various scenes.
Article
Spine disorders can cause severe functional limitations, including back pain, decreased pulmonary function, and increased mortality risk. Plain radiography is the first-line imaging modality to diagnose suspected spine disorders. Nevertheless, radiographical appearance is not always sufficient due to highly variable patient and imaging parameters, which can lead to misdiagnosis or delayed diagnosis. Employing an accurate automated detection model can alleviate the workload of clinical experts, thereby reducing human errors, facilitating earlier detection, and improving diagnostic accuracy. To this end, deep learning-based computer-aided diagnosis (CAD) tools have significantly outperformed the accuracy of traditional CAD software. Motivated by these observations, we proposed a deep learning-based approach for end-to-end detection and localization of spine disorders from plain radiographs. In doing so, we took the first steps in employing state-of-the-art transformer networks to differentiate images of multiple spine disorders from healthy counterparts and localize the identified disorders, focusing on vertebral compression fractures (VCF) and spondylolisthesis due to their high prevalence and potential severity. The VCF dataset comprised 337 images, with VCFs collected from 138 subjects and 624 normal images collected from 337 subjects. The spondylolisthesis dataset comprised 413 images, with spondylolisthesis collected from 336 subjects and 782 normal images collected from 413 subjects. Transformer-based models exhibited 0.97 Area Under the Receiver Operating Characteristic Curve (AUC) in VCF detection and 0.95 AUC in spondylolisthesis detection. Further, transformers demonstrated significant performance improvements against existing end-to-end approaches by 4–14% AUC (p-values < 10⁻¹³) for VCF detection and by 14–20% AUC (p-values < 10⁻⁹) for spondylolisthesis detection.
Article
Full-text available
The majority of the monitoring methods for fish and shellfish used globally today are lethal or invasive. In 2020, SLU Aqua handled over 9 million individual fish during fishery-independent sampling. In Sweden, unlike the rest of Europe, fish caught in surveys are also considered laboratory animals, and the government has requested that authorities that use laboratory animals establish strategies for their work with issues related to 3R, i.e. Replace, Reduce and Refine. In order to investigate the possibilities of implementing 3R in SLU Aqua's monitoring of fish stocks, this report presents possible methods to:
Replace – through fishery-dependent sampling, with the aim of producing management documents for assessing the status of Lake Vänern's pikeperch stock and mapping the effects of fishing on pikeperch in Lake Vänern (Chapter 5).
Reduce – through studies of hydroacoustic frequency response in fish and/or the use of trawl-mounted stereo video, possibly reducing the need for trawling during hydroacoustic trawl surveys (Chapters 2 and 3); and by supplementing gill-net test fishing with hydroacoustics, electrofishing and eDNA, reducing the number of fish killed in this type of survey (Chapter 4).
Refine – by combining different methods, increasing the amount of knowledge each sampled individual provides, and knowledge of the whole ecosystem (Chapters 3 and 4).
Article
Full-text available
Cephalopods are harvested in increasingly large quantities but understanding how to control and manage their stocks, and tracking the routes of the consumption that exploits them, lag behind what has been developed for exploiting finfish. This review attempts to redress the imbalance by considering the status of the major cephalopod stock species and the traceability of cephalopod seafood along the trade value chain. It begins with a general overview of the most important exploited cephalopods, their stock status and their market. Four major cephalopod resources are identified: the three squid species Todarodes pacificus, Dosidicus gigas and Illex argentinus; and one species of octopus, Octopus vulgaris. The techniques and problems of stock assessment (to assess sustainability) are reviewed briefly and the problems and possible solutions for assessing benthic stock such as those of octopuses are considered. An example of a stock well managed in the long term is presented to illustrate the value of careful monitoring and management: the squid Doryteuthis gahi available in Falkland Islands waters. Issues surrounding identification, mislabelling and illegal, unreported and unregulated (IUU) fishing are then reviewed, followed by a discussion of approaches and techniques of traceability as applied to cephalopods. Finally, some of the mobile apps currently available and in development for tracking seafood are compared. This review concludes with observations on the necessity for the strengthening and international coordination of legislation, and more rigorous standards for seafood labelling and for taxonomic curation of DNA sequences available in public databases for use in seafood identification.
Article
Fish ageing is a vital component of fisheries management as it enables the evaluation of fish population status and supports the development of sustainable management strategies. However, traditional methods of age determination through otolith analysis by experts are resource-intensive and time-consuming. Therefore, there is a growing demand for more cost-effective and automated techniques to accurately determine fish age. To this end, we proposed a multistage framework for fish age prediction using otolith images. First, otoliths in the images were detected using the Faster Region-based Convolutional Neural Network (RCNN) model, which includes 8 convolution layers, and the detected otoliths were clipped. Subsequently, using a pre-trained neural network based on the transfer learning approach, deep features were extracted separately from the right and left otolith images, and all the obtained features were combined. Finally, these combined features are given as input to a Gaussian process regression model. To evaluate the performance of the proposed architecture, 4109 images of right and left otoliths belonging to Greenland halibut (flatfish, Reinhardtius hippoglossoides) were used. The proposed architecture produced 1.83 MSE, 0.98 R-squared, 1.35 RMSE, and 10.29 MAPE scores in the experimental studies. As a result, the proposed model achieved superior performance compared with previous studies. Our findings show that our FishAgePredictioNet system could help experts predict fish age based on otolith images.
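The final stage of the pipeline above — regressing age on combined left/right deep features — can be sketched with a minimal Gaussian process regressor in plain NumPy. The feature vectors and age labels below are random stand-ins for the CNN features and otolith annotations the paper actually uses, and the RBF kernel and noise level are illustrative assumptions, not the authors' settings:

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between rows of A and rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def gp_fit_predict(X_train, y_train, X_test, noise=1e-2):
    """Standard GP regression posterior mean with a noise-regularised solve."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    alpha = np.linalg.solve(K, y_train)
    return rbf_kernel(X_test, X_train) @ alpha

rng = np.random.default_rng(0)
# Stand-ins for deep features extracted from left and right otolith images.
left = rng.normal(size=(20, 8))
right = rng.normal(size=(20, 8))
X = np.hstack([left, right])           # combined feature vector per fish
age = rng.uniform(1, 15, size=20)      # hypothetical age labels (years)

pred = gp_fit_predict(X, age, X)       # posterior mean at the training inputs
```

With a small noise term the posterior mean reproduces the training ages closely, which is the behaviour one would expect before evaluating on held-out otoliths.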
Chapter
The Persian Gulf is one of the most important habitats in the Middle East. Continuous monitoring and data collection on aquatic animals, their habitats, and their behaviours can be extremely beneficial to aquatic species' survival and environmental preservation. A novel method that enables accurate, automatic monitoring at low cost and with little delay would therefore be valuable in this high-potential area. To predict fish habitat in the Persian Gulf, a Convolutional Neural Network and the Naïve Bayes algorithm are used: deep convolutional neural networks are widely applied to classification and recognition because of their exceptional accuracy, while the Naïve Bayes algorithm is employed to solve search and optimization issues. Results indicate that, for predicting fish habitat in the Persian Gulf, the accuracy of the Convolutional Neural Network and the Naïve Bayes algorithm is 97.32% and 95.47%, respectively. With p = 0.025 (p < 0.05), there is a significant difference between the two algorithms. Therefore, the Convolutional Neural Network appears to be more accurate than Naïve Bayes at predicting fish habitat in the Persian Gulf.
Chapter
Flying into space, developing virtual currencies, building efficient electric vehicles, and many other new technologies are improving humankind's life on Earth. But how, in today's world, should disruptive technologies be defined? Deep tech, or hard tech, refers to the type of organization, typically a startup, that develops these disruptive technologies.
Article
Systems based on enhanced deep learning can replace the field's currently cumbersome and slow traditional approaches. Although it seems straightforward, classifying fish images is a complex procedure. In addition, the scientific study of population distribution and geographic patterns is important for advancing the field's present developments. The goal of the proposed work is to identify the best-performing strategy using cutting-edge computer vision, the Chaotic Oppositional Based Whale Optimization Algorithm (CO-WOA), and data mining techniques. Performance comparisons with leading models, such as Convolutional Neural Networks (CNN) and VGG-19, confirm the applicability of the suggested method. The proposed feature extraction approach combined with the proposed deep learning model yielded an accuracy of 100%. The performance was also compared with cutting-edge image processing models: Convolutional Neural Networks (98.48%), ResNet150 V2 (98.58%), DenseNet (99.04%), Visual Geometry Group-19 (98.44%), Inception V3 (99.18%), and Xception (99.63%). Using an empirical method leveraging artificial neural networks, the proposed deep learning model was shown to be the best model.
Preprint
In this paper, we investigate novel data collection and training techniques towards improving classification accuracy of non-moving (static) hand gestures using a convolutional neural network (CNN) and frequency-modulated-continuous-wave (FMCW) millimeter-wave (mmWave) radars. Recently, non-contact hand pose and static gesture recognition have received considerable attention in many applications ranging from human-computer interaction (HCI), augmented/virtual reality (AR/VR), and even therapeutic range of motion for medical applications. While most current solutions rely on optical or depth cameras, these methods require ideal lighting and temperature conditions. mmWave radar devices have recently emerged as a promising alternative offering low-cost system-on-chip sensors whose output signals contain precise spatial information even in non-ideal imaging conditions. Additionally, deep convolutional neural networks have been employed extensively in image recognition by learning both feature extraction and classification simultaneously. However, little work has been done towards static gesture recognition using mmWave radars and CNNs due to the difficulty involved in extracting meaningful features from the radar return signal, and the results are inferior compared with dynamic gesture classification. This article presents an efficient data collection approach and a novel technique for deep CNN training by introducing "sterile" images which aid in distinguishing distinct features among the static gestures and subsequently improve the classification accuracy. Applying the proposed data collection and training methods yields an increase in classification rate of static hand gestures from 85% to 93% and from 90% to 95% for range and range-angle profiles, respectively.
Article
Full-text available
We present a novel and practical deep fully convolutional neural network architecture for semantic pixel-wise segmentation termed SegNet. This core trainable segmentation engine consists of an encoder network, a corresponding decoder network followed by a pixel-wise classification layer. The architecture of the encoder network is topologically identical to the 13 convolutional layers in the VGG16 network [1]. The role of the decoder network is to map the low resolution encoder feature maps to full input resolution feature maps for pixel-wise classification. The novelty of SegNet lies in the manner in which the decoder upsamples its lower resolution input feature map(s). Specifically, the decoder uses pooling indices computed in the max-pooling step of the corresponding encoder to perform non-linear upsampling. This eliminates the need for learning to upsample. The upsampled maps are sparse and are then convolved with trainable filters to produce dense feature maps. We compare our proposed architecture with the widely adopted FCN [2] and also with the well known DeepLab-LargeFOV [3] and DeconvNet [4] architectures. This comparison reveals the memory versus accuracy trade-off involved in achieving good segmentation performance. SegNet was primarily motivated by scene understanding applications. Hence, it is designed to be efficient both in terms of memory and computational time during inference. It is also significantly smaller in the number of trainable parameters than other competing architectures and can be trained end-to-end using stochastic gradient descent. We also performed a controlled benchmark of SegNet and other architectures on both road scenes and SUN RGB-D indoor scene segmentation tasks. These quantitative assessments show that SegNet provides good performance with competitive inference time and the most efficient inference memory-wise as compared to other architectures.
We also provide a Caffe implementation of SegNet and a web demo at http://mi.eng.cam.ac.uk/projects/segnet/.
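The decoder trick the abstract describes — upsampling with the argmax indices saved during max-pooling — can be illustrated with a small NumPy sketch of one 2×2 pooling/unpooling pair. This is a toy single-channel version for intuition, not SegNet's actual implementation:

```python
import numpy as np

def max_pool_with_indices(x):
    """2x2 max-pooling that also records the argmax position of each window."""
    H, W = x.shape
    pooled = np.zeros((H // 2, W // 2))
    idx = np.zeros((H // 2, W // 2), dtype=int)   # flat index into x
    for i in range(0, H, 2):
        for j in range(0, W, 2):
            win = x[i:i+2, j:j+2]
            k = int(win.argmax())                 # 0..3 within the window
            pooled[i // 2, j // 2] = win.flat[k]
            idx[i // 2, j // 2] = (i + k // 2) * W + (j + k % 2)
    return pooled, idx

def unpool_with_indices(pooled, idx, shape):
    """SegNet-style non-linear upsampling: scatter each value back to the
    position its encoder max came from; everywhere else stays zero."""
    out = np.zeros(shape)
    out.flat[idx.ravel()] = pooled.ravel()
    return out

x = np.array([[1., 2., 0., 3.],
              [4., 0., 1., 0.],
              [0., 5., 2., 1.],
              [6., 0., 0., 7.]])
p, idx = max_pool_with_indices(x)
up = unpool_with_indices(p, idx, x.shape)
```

Note that the unpooled map is mostly zeros — exactly the sparsity the abstract mentions, which is why SegNet follows unpooling with trainable convolutions to produce dense feature maps.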
Article
Full-text available
There is a need for automatic systems that can reliably detect, track and classify fish and other marine species in underwater videos without human intervention. Conventional computer vision techniques do not perform well in underwater conditions where the background is complex and the shape and textural features of fish are subtle. Data-driven classification models like neural networks require a huge amount of labelled data, otherwise they tend to over-fit to the training data and fail on unseen test data which is not involved in training. We present a state-of-the-art computer vision method for fine-grained fish species classification based on deep learning techniques. A cross-layer pooling algorithm using a pre-trained Convolutional Neural Network as a generalized feature detector is proposed, thus avoiding the need for a large amount of training data. Classification on test data is performed by a SVM on the features computed through the proposed method, resulting in classification accuracy of 94.3% for fish species from typical underwater video imagery captured off the coast of Western Australia. This research advocates that the development of automated classification systems which can identify fish from underwater video imagery is feasible and a cost-effective alternative to manual identification by humans. © International Council for the Exploration of the Sea 2017. All rights reserved.
Article
Full-text available
We present a novel and practical deep fully convolutional neural network architecture for semantic pixel-wise segmentation termed SegNet. This core trainable segmentation engine consists of an encoder network, a corresponding decoder network followed by a pixel-wise classification layer. The architecture of the encoder network is topologically identical to the 13 convolutional layers in the VGG16 network. The role of the decoder network is to map the low resolution encoder feature maps to full input resolution feature maps for pixel-wise classification. The novelty of SegNet lies in the manner in which the decoder upsamples its lower resolution input feature map(s). Specifically, the decoder uses pooling indices computed in the max-pooling step of the corresponding encoder to perform non-linear upsampling. This eliminates the need for learning to upsample. The upsampled maps are sparse and are then convolved with trainable filters to produce dense feature maps. We compare our proposed architecture with the fully convolutional network (FCN) architecture and its variants. This comparison reveals the memory versus accuracy trade-off involved in achieving good segmentation performance. The design of SegNet was primarily motivated by road scene understanding applications. Hence, it is efficient both in terms of memory and computational time during inference. It is also significantly smaller in the number of trainable parameters than competing architectures and can be trained end-to-end using stochastic gradient descent. We also benchmark the performance of SegNet on Pascal VOC12 salient object segmentation and the recent SUN RGB-D indoor scene understanding challenge. We show that SegNet provides competitive performance although it is significantly smaller than other architectures. We also provide a Caffe implementation of SegNet and a web demo at http://mi.eng.cam.ac.uk/projects/segnet/
Article
Full-text available
Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
Conference Paper
Full-text available
Recent results indicate that the generic descriptors extracted from the convolutional neural networks are very powerful. This paper adds to the mounting evidence that this is indeed the case. We report on a series of experiments conducted for different recognition tasks using the publicly available code and model of the OverFeat network which was trained to perform object classification on ILSVRC13. We use features extracted from the OverFeat network as a generic image representation to tackle the diverse range of recognition tasks of object image classification, scene recognition, fine grained recognition, attribute detection and image retrieval applied to a diverse set of datasets. We selected these tasks and datasets as they gradually move further away from the original task and data the OverFeat network was trained to solve. Remarkably we report better or competitive results compared to the state-of-the-art in all the tasks on various datasets. The results are achieved using a linear SVM classifier applied to a feature representation of size 4096 extracted from a layer in the net. The results strongly suggest that features obtained from deep learning with convolutional nets should be the primary candidate in most visual classification tasks.
Chapter
Camera-based fish abundance estimation with the aid of visual analysis techniques has drawn increasing attention. Live fish segmentation and recognition in open aquatic habitats, however, suffers from fast light attenuation, ubiquitous noise and non-lateral views of fish. In this chapter, an automatic live fish segmentation and recognition framework for trawl-based cameras is proposed. To mitigate the illumination issues, double local thresholding method is integrated with histogram backprojection to produce an accurate shape of fish segmentation. For recognition, a hierarchical partial classification is learned so that the coarse-to-fine categorization stops at any level where ambiguity exists. Attributes from important fish anatomical parts are focused to generate discriminative feature descriptors. Experiments on mid-water image sets show that the proposed framework achieves up to 93% of accuracy on live fish recognition based on automatic and robust segmentation results.
Conference Paper
Can a large convolutional neural network trained for whole-image classification on ImageNet be coaxed into detecting objects in PASCAL? We show that the answer is yes, and that the resulting system is simple, scalable, and boosts mean average precision, relative to the venerable deformable part model, by more than 40% (achieving a final mAP of 48% on VOC 2007). Our framework combines powerful computer vision techniques for generating bottom-up region proposals with recent advances in learning high-capacity convolutional neural networks. We call the resulting system R-CNN: Regions with CNN features. The same framework is also competitive with state-of-the-art semantic segmentation methods, demonstrating its flexibility. Beyond these results, we execute a battery of experiments that provide insight into what the network learns to represent, revealing a rich hierarchy of discriminative and often semantically meaningful features.
Conference Paper
Several machine learning models, including neural networks, consistently misclassify adversarial examples—inputs formed by applying small but intentionally worst-case perturbations to examples from the dataset, such that the perturbed input results in the model outputting an incorrect answer with high confidence. Early attempts at explaining this phenomenon focused on nonlinearity and overfitting. We argue instead that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature. This explanation is supported by new quantitative results while giving the first explanation of the most intriguing fact about them: their generalization across architectures and training sets. Moreover, this view yields a simple and fast method of generating adversarial examples. Using this approach to provide examples for adversarial training, we reduce the test set error of a maxout network on the MNIST dataset.
Technical Report
TensorFlow [1] is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hundreds of machines and thousands of computational devices such as GPU cards. The system is flexible and can be used to express a wide variety of algorithms, including training and inference algorithms for deep neural network models, and it has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields, including speech recognition, computer vision, robotics, information retrieval, natural language processing, geographic information extraction, and computational drug discovery. This paper describes the TensorFlow interface and an implementation of that interface that we have built at Google. The TensorFlow API and a reference implementation were released as an open-source package under the Apache 2.0 license in November, 2015 and are available at www.tensorflow.org.
Conference Paper
We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0% which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called dropout that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
Conference Paper
Convolutional networks are at the core of most state-of-the-art computer vision solutions for a wide variety of tasks. Since 2014 very deep convolutional networks started to become mainstream, yielding substantial gains in various benchmarks. Although increased model size and computational cost tend to translate to immediate quality gains for most tasks (as long as enough labeled data is provided for training), computational efficiency and low parameter count are still enabling factors for various use cases such as mobile vision and big-data scenarios. Here we are exploring ways to scale up networks that aim to utilize the added computation as efficiently as possible through suitably factorized convolutions and aggressive regularization. We benchmark our methods on the ILSVRC 2012 classification challenge validation set, demonstrating substantial gains over the state of the art: 21.2% top-1 and 5.6% top-5 error for single frame evaluation using a network with a computational cost of 5 billion multiply-adds per inference and using less than 25 million parameters. With an ensemble of 4 models and multi-crop evaluation, we report 3.5% top-5 error and 17.3% top-1 error.
Article
Underwater video and digital still cameras are rapidly being adopted by marine scientists and managers as a tool for non-destructively quantifying and measuring the relative abundance, cover and size of marine fauna and flora. Imagery recorded of fish can be time consuming and costly to process and analyze manually. For this reason, there is great interest in automatic classification, counting, and measurement of fish. Unconstrained underwater scenes are highly variable due to changes in light intensity, changes in fish orientation due to movement, a variety of background habitats which sometimes also move, and most importantly similarity in shape and patterns among fish of different species. This poses a great challenge for image/video processing techniques to accurately differentiate between classes or species of fish to perform automatic classification. We present a machine learning approach, which is suitable for solving this challenge. We demonstrate the use of a convolutional neural network model in a hierarchical feature combination setup to learn species-dependent visual features of fish that are unique, yet abstract and robust against environmental and intra- and inter-species variability. This approach avoids the need for explicitly extracting features from raw images of the fish using several fragmented image processing techniques. As a result, we achieve a single and generic trained architecture with favorable performance even for sample images of fish species that have not been used in training. Using the LifeCLEF14 and LifeCLEF15 benchmark fish datasets, we have demonstrated results with a correct classification rate of more than 90%.
Article
Living resources of the sea and fresh water have long been an important source of food and economic activity. With fish stocks continuing to be over-exploited, there is a clear focus on fisheries management, to which acoustic methods can and do make an important contribution. The second edition of this widely used book covers the many technological developments which have occurred since the first edition; highly sophisticated sonar and computer processing equipment offer great new opportunities and Fisheries Acoustic, 2e provides the reader with a better understanding of how to interpret acoustic observations and put them to practical use.
Article
Deep neural nets with a large number of parameters are very powerful machine learning systems. However, overfitting is a serious problem in such networks. Large networks are also slow to use, making it difficult to deal with overfitting by combining the predictions of many different large neural nets at test time. Dropout is a technique for addressing this problem. The key idea is to randomly drop units (along with their connections) from the neural network during training. This prevents units from co-adapting too much. During training, dropout samples from an exponential number of different "thinned" networks. At test time, it is easy to approximate the effect of averaging the predictions of all these thinned networks by simply using a single unthinned network that has smaller weights. This significantly reduces overfitting and gives major improvements over other regularization methods. We show that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets. © 2014 Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever and Ruslan Salakhutdinov.
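Dropout as described above — randomly dropping units during training, then approximating the averaged ensemble at test time with a single network — is commonly implemented in "inverted" form, where the rescaling happens during training rather than by shrinking weights at test time. A minimal NumPy sketch, equivalent in expectation to the paper's formulation:

```python
import numpy as np

def dropout(x, drop_prob, train, rng):
    """Inverted dropout: zero units at random during training and scale the
    survivors up by 1/(1 - drop_prob), so test time needs no rescaling."""
    if not train or drop_prob == 0.0:
        return x                                   # test time: full, unthinned network
    keep = rng.random(x.shape) >= drop_prob        # independent Bernoulli mask
    return x * keep / (1.0 - drop_prob)            # scale kept units up

rng = np.random.default_rng(42)
h = np.ones(1000)                                  # a layer of unit activations
h_train = dropout(h, drop_prob=0.5, train=True, rng=rng)
h_test = dropout(h, drop_prob=0.5, train=False, rng=rng)
```

Each training pass samples one "thinned" network, while the expected activation matches the test-time network — which is why a single unthinned forward pass approximates averaging the exponential family of thinned networks.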
Article
In this work we present an end-to-end system for text spotting -- localising and recognising text in natural scene images -- and text based image retrieval. This system is based on a region proposal mechanism for detection and deep convolutional neural networks for recognition. Our pipeline uses a novel combination of complementary proposal generation techniques to ensure high recall, and a fast subsequent filtering stage for improving precision. For the recognition and ranking of proposals, we train very large convolutional neural networks to perform word recognition on the whole proposal region at the same time, departing from the character classifier based systems of the past. These networks are trained solely on data produced by a synthetic text generation engine, requiring no human labelled data. Analysing the stages of our pipeline, we show state-of-the-art performance throughout. We perform rigorous experiments across a number of standard end-to-end text spotting benchmarks and text-based image retrieval datasets, showing a large improvement over all previous methods. Finally, we demonstrate a real-world application of our text spotting system to allow thousands of hours of news footage to be instantly searchable via a text query.
Article
In recent years, deep neural networks (including recurrent ones) have won numerous contests in pattern recognition and machine learning. This historical survey compactly summarises relevant work, much of it from the previous millennium. Shallow and deep learners are distinguished by the depth of their credit assignment paths, which are chains of possibly learnable, causal links between actions and effects. I review deep supervised learning (also recapitulating the history of backpropagation), unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.
Article
An in-trawl stereo camera system (DeepVision) collected continuous, overlapping, images of organisms ranging from krill and jellyfish to large teleost fishes, including saithe (Pollachius virens) and Atlantic cod (Gadus morhua) infected with parasitic copepods. The four-dimensional position (latitude, longitude, depth, time) of individuals was recorded as they passed the camera, providing a level of within-haul spatial resolution not available with standard trawl sampling. Most species were patchily distributed, both vertically and horizontally, and occasionally individuals were observed at significant vertical and horizontal separation from conspecifics. Acoustically visible layers extending off the continental rise at 250 m depth and greater were verified as primarily blue whiting (Micromesistius poutassou), but also included a small proportion of evenly distributed golden redfish (Sebastes marinus) and greater Argentines (Argentina silus). Small, but statistically significant, differences in length by depth were observed for blue whiting within a single haul. These results demonstrate the technology can greatly increase the amount and detail of information collected with little additional sampling effort.
Article
In situ measurements of fish target strength are selected for use in echo integrator surveys at 38 kHz. The results are expressed through equations in which the mean target strength TS is regressed on the mean fish length l in centimeters. For physoclists, TS = 20 log l − 67.4, and for clupeoids, TS = 20 log l − 71.9. These equations are supported by independent measurements on tethered, caged, and freely aggregating fish and by theoretical computations based on the swimbladder form. Causes of data variability are attributed to differences in species, behavior, and, possibly, swimbladder state.
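The two regressions quoted above transcribe directly into code; lengths are in centimeters and the result is target strength in dB:

```python
import math

def ts_physoclist(l_cm):
    """Mean target strength (dB) of a physoclist of mean length l_cm, at 38 kHz."""
    return 20 * math.log10(l_cm) - 67.4

def ts_clupeoid(l_cm):
    """Mean target strength (dB) of a clupeoid of mean length l_cm, at 38 kHz."""
    return 20 * math.log10(l_cm) - 71.9

# e.g. a 30 cm physoclist:
ts = ts_physoclist(30.0)
```

The 4.5 dB offset between the two equations means a clupeoid of a given length scatters less than a physoclist of the same length, consistent with the swimbladder-based computations the abstract mentions.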
Article
An experiment to verify the basic linearity of fisheries acoustics is described. Herring (Clupea harengus L.) was the subject fish. Acoustic measurements consisted of the echo energy from aggregations of caged but otherwise free-swimming fish, and the target strength functions of similar, anesthetized specimens. Periodic photographic observation of the caged fish allowed characterization of their behavior through the associated spatial and orientation distributions. The fish biology and hydrography were also measured. Computations of the echo energy from encaged aggregations agreed well with observation. This success was obtained for each of four independent echo sounders operating at frequencies from 38 to 120 kHz and at power levels from 35 W to nearly 1 kW.
Article
Trials of a computer vision machine (the CatchMeter) for identifying and measuring different species of fish are described. The fish are transported along a conveyor underneath a digital camera. Image processing algorithms determine the orientation of the fish using a moment-invariant method, identify whether the fish is a flatfish or roundfish with 100% accuracy, measure length with a standard deviation of 1.2 mm, and sort by species with up to 99.8% reliability for seven species of fish. The potential application of the system onboard both research and commercial ships is described. The machine can theoretically process up to 30,000 fish/h using a single conveyor-based system.
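The moment-based orientation step can be sketched with central image moments. This is a generic principal-axis computation on a binary fish mask — an assumption about, not a transcription of, the CatchMeter's exact moment-invariant method:

```python
import numpy as np

def orientation(mask):
    """Principal-axis orientation (radians) of a binary mask from its
    second-order central moments: theta = 0.5 * atan2(2*mu11, mu20 - mu02)."""
    ys, xs = np.nonzero(mask)
    x0, y0 = xs.mean(), ys.mean()                 # centroid
    mu20 = ((xs - x0) ** 2).mean()
    mu02 = ((ys - y0) ** 2).mean()
    mu11 = ((xs - x0) * (ys - y0)).mean()
    return 0.5 * np.arctan2(2 * mu11, mu20 - mu02)

# A horizontal bar stands in for a fish lying along the conveyor:
mask = np.zeros((20, 60), dtype=bool)
mask[8:12, 5:55] = True
theta = orientation(mask)                          # ~0 radians (axis along x)
```

Once the axis is known, the fish image can be rotated to a canonical pose before length measurement, which is the usual reason a conveyor system needs this step.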
MacLennan, D., and Simmonds, E. 2005. Fisheries Acoustics. Fish and Aquatic Resources Series 10. Chapman & Hall, London.