Article

# Fruit quality and defect image classification with conditional GAN data augmentation


## Abstract

Contemporary Artificial Intelligence technologies allow for the employment of Computer Vision to discern good crops from bad, providing a step in the pipeline of selecting healthy fruit from undesirable fruit, such as those which are mouldy or damaged. State-of-the-art works in the field report high accuracy results on small datasets (<1000 images), which are not representative of the population regarding real-world usage. The goals of this study are to further enable real-world usage by improving generalisation with data augmentation as well as to reduce overfitting and energy usage through model pruning. In this work, we suggest a machine learning pipeline that combines the ideas of fine-tuning, transfer learning, and generative model-based training data augmentation towards improving fruit quality image classification. A linear network topology search is performed to tune a VGG16 lemon quality classification model using a publicly-available dataset of 2690 images. We find that appending a 4096 neuron fully connected layer to the convolutional layers leads to an image classification accuracy of 83.77%. We then train a Conditional Generative Adversarial Network on the training data for 2000 epochs, and it learns to generate relatively realistic images. Grad-CAM analysis of the model trained on real photographs shows that the synthetic images can exhibit classifiable characteristics such as shape, mould, and gangrene. A higher image classification accuracy of 88.75% is then attained by augmenting the training with synthetic images, arguing that Conditional Generative Adversarial Networks have the ability to produce new data to alleviate issues of data scarcity. Finally, model pruning is performed via polynomial decay, where we find that the Conditional GAN-augmented classification network can retain 81.16% classification accuracy when compressed to 50% of its original size.
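The pruning step can be illustrated with a polynomial decay sparsity schedule, in which the target fraction of zeroed weights rises quickly at first and then levels off at the final value. The sketch below is an assumption about the general form of such a schedule, not the authors' implementation: the function name, the exponent of 3, and the step counts are hypothetical, and only the 50% final sparsity comes from the abstract.

```python
def polynomial_decay_sparsity(step, begin_step, end_step,
                              initial_sparsity=0.0, final_sparsity=0.5, power=3):
    """Target fraction of weights set to zero at a given training step.

    Hypothetical helper: only final_sparsity=0.5 is taken from the abstract;
    the exponent and the step window are illustrative assumptions.
    """
    if end_step <= begin_step:
        raise ValueError("end_step must be greater than begin_step")
    # Clamp progress to [0, 1] so the sparsity holds its end values
    # before the pruning window opens and after it closes.
    progress = (step - begin_step) / (end_step - begin_step)
    progress = min(max(progress, 0.0), 1.0)
    return final_sparsity + (initial_sparsity - final_sparsity) * (1.0 - progress) ** power


if __name__ == "__main__":
    # Print the target sparsity at a few points of a 1000-step pruning window.
    for s in (0, 250, 500, 750, 1000):
        print(s, round(polynomial_decay_sparsity(s, 0, 1000), 4))
```

With a cubic exponent most of the pruning happens early, while the network still has many redundant weights, and the schedule flattens out as it approaches the final sparsity.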


... Furthermore, higher F1-scores were obtained when the datasets augmented by PI-GAN-based methods were used. In Tables 28-30, the existing augmentation method [46] based on conditional GAN is used to perform the classification for additional experiments. As shown in Tables 25-30, classification accuracies by using the proposed PI-GAN are higher compared to those using the existing method [46]. ...
... In Tables 28-30, the existing augmentation method [46] based on conditional GAN is used to perform the classification for additional experiments. As shown in Tables 25-30, classification accuracies by using the proposed PI-GAN are higher compared to those using the existing method [46]. As shown in Tables 16-18, we achieved higher accuracy when using a bigger dataset. ...
... In addition, as shown in Tables 25-27, classification accuracies obtained by using PI-CNN with PI-GAN are higher compared to those obtained by using PI-CNN with conventional augmentation methods. Moreover, classification accuracies obtained by using PI-CNN with PI-GAN are higher compared to those obtained by using PI-CNN with the existing conditional GAN-based method [46], as shown in . This confirms that the proposed PI-GAN works better with the proposed PI-CNN. ...
Article
Extensive research has been conducted on image augmentation, segmentation, detection, and classification based on plant images. Specifically, previous studies on plant image classification have used various plant datasets (fruits, vegetables, flowers, trees, etc., and their leaves). However, existing plant-based image datasets are generally small. Furthermore, there are limitations in the construction of large-scale datasets. Consequently, previous research on plant classification using small training datasets encountered difficulties in achieving high accuracy. However, research on plant image classification based on small training datasets is insufficient. Accordingly, this study performed classification by reducing the number of training images of plant-image datasets by 70%, 50%, 30%, and 10%, respectively. Then, the number of images was increased back through augmentation methods for training. This ultimately improved the plant-image classification performance. Based on the respective preliminary experimental results, this study proposed a plant-image classification convolutional neural network (PI-CNN) based on plant image augmentation using a plant-image generative adversarial network (PI-GAN). Our proposed method showed higher classification accuracies than the state-of-the-art methods when the experiments were conducted using four open datasets: PlantVillage, PlantDoc, Fruits-360, and Plants.
... Second, they required complex optimization methods to find optimized parameters. According to mentioned challenges, deep learning methods are developed [33]. Deep learning methods used deep networks, i.e., CNNs to automatically learn features rather than manual setting parameters to obtain effective effects in image processing tasks, i.e., image classification [33], image inpainting [34] and image super-resolution [1]. ...
... According to mentioned challenges, deep learning methods are developed [33]. Deep learning methods used deep networks, i.e., CNNs to automatically learn features rather than manual setting parameters to obtain effective effects in image processing tasks, i.e., image classification [33], image inpainting [34] and image super-resolution [1]. Although these methods are effective on big samples, they are limited for image tasks with small samples [29]. ...
... Thus, image inpainting had important values in the real world [70]. Due to missing pixels, image inpainting suffered from enormous challenges [71]. To overcome shortcoming above, GANs are used to generate useful information to repair damaged images based on the surrounding pixels in the damaged images [72]. ...

| Model | Type | Application | Description |
|---|---|---|---|
| … [33] | CGAN | Image classification | Conditional GAN for image classification |
| PD-GAN [34] | GAN | Image inpainting | GAN for image inpainting and image restoration |
| CGAN [35] | GAN | Image generation | GAN in a supervised way for image generation |
| DCGAN [36] | GAN | Image generation | GAN in an unsupervised way for image generation |
| BiGAN [37] | GAN | Image generation | GAN with encoder in an unsupervised way for image generation |
| EBGAN [39] | GAN | Image generation and training nets | GAN based energy for image generation |
| CycleGAN [40] | GAN | Image generation | GAN with cycle-consistent for image generation |
| WGAN-GP [42] | GAN | Image generation | GAN with gradient penalty for image generation |
| BIGGAN [43] | GAN | Image super-resolution | GAN with big channels of image super-resolution |
| StyleGAN [44] | GAN | Image generation | GAN with stochastic variation for image generation |
| LAPGAN [45] | CGAN | Image super-resolution | GAN with Laplacian pyramid for image super-resolution |
| CoupleGAN [46] | GAN | Image generation | GAN for both up-sampling and image generation |
| SAGAN [47] | GAN | Image generation | Unsupervised GAN with self-attention for image generation |
| FUNIT [49] | GAN | Image translation | GAN in an unsupervised way for image-to-image translation |
| SPADE [50] | GAN | Image generation | GAN with spatially-adaptive normalization for image generation |
| U-GAT-IT [51] | GAN | Image translation | GAN with attention in an unsupervised way for image-to-image translation |
| … | … | … | CycleGAN with U-net for image-to-image translation |
| ECycleGAN [67] | CycleGAN | … | CycleGAN with convolutional block attention module (CBAM) for image-to-image translation |
Preprint
Single image super-resolution (SISR) has played an important role in the field of image processing. Recent generative adversarial networks (GANs) can achieve excellent results on low-resolution images with small samples. However, there is little literature summarizing different GANs in SISR. In this paper, we conduct a comparative study of GANs from different perspectives. We first take a look at developments of GANs. Second, we present popular architectures for GANs in big and small samples for image applications. Then, we analyze motivations, implementations and differences of GAN-based optimization methods and discriminative learning for image super-resolution in terms of supervised, semi-supervised and unsupervised manners. Next, we compare performance of these popular GANs on public datasets via quantitative and qualitative analysis in SISR. Finally, we highlight challenges of GANs and potential research points for SISR.
... Some of the related works based on shelf life prediction using various models are listed below: [Bird et al. (2022)] determined fruit quality and classified defective lemon fruit images using GAN-based data augmentation techniques. In this method, a convolutional neural network (CNN) was introduced to predict the quality of the fruit. ...
... [Flowchart: generate the initial population; identify the objective function using Equation (16); update the migration characteristics using Equations (18) and (19); update the attacking characteristics using Equation (20); check whether the termination criteria are met.] The standard optimization SOA is the new optimization that can be utilized in various fields to solve optimization issues. The position of sandpipers in the solution space is distributed randomly. ...
... It is noted that the GAN images for each defect class were generated separately rather than through a multi-class data generation process, which can be potentially more efficient. Recently, Bird et al. (2021) applied CGAN to generate synthetic lemon images for classifying healthy and unhealthy fruit, using a public lemon dataset (Adamiak, 2020). By using synthetic fruit images, the classification based on VGG16 achieved an accuracy of 88.75% against 83.77% without data augmentation. ...
... Compared to real samples, low-quality generated images may lack texture details and contain unrealistic, undesirable artifacts. In Bird et al. (2021) for generating lemon images, many synthetic images were found more reminiscent of potatoes than lemons and some suffered from unrealistic checkerboard patterns. Provision of sufficient samples for training GANs would facilitate generating high-quality, realistic images and eventually benefit DL models. ...
Preprint
In agricultural image analysis, optimal model performance is keenly pursued for better fulfilling visual recognition tasks (e.g., image classification, segmentation, object detection and localization), in the presence of challenges with biological variability and unstructured environments. Large-scale, balanced and ground-truthed image datasets, however, are often difficult to obtain to fuel the development of advanced, high-performance models. As artificial intelligence through deep learning is impacting analysis and modeling of agricultural images, data augmentation plays a crucial role in boosting model performance while reducing manual efforts for data preparation, by algorithmically expanding training datasets. Beyond traditional data augmentation techniques, generative adversarial network (GAN) invented in 2014 in the computer vision community, provides a suite of novel approaches that can learn good data representations and generate highly realistic samples. Since 2017, there has been a growth of research into GANs for image augmentation or synthesis in agriculture for improved model performance. This paper presents an overview of the evolution of GAN architectures followed by a systematic review of their application to agriculture (https://github.com/Derekabc/GANs-Agriculture), involving various vision tasks for plant health, weeds, fruits, aquaculture, animal farming, plant phenotyping as well as postharvest detection of fruit defects. Challenges and opportunities of GANs are discussed for future research.
... A large amount of research has been conducted on the quality of fruit, road damage, waste classification, and disease. Bird, J. J. et al. performed an experiment to distinguish defects in lemons using VGG16, and classified healthy and unhealthy lemons with an accuracy of 88.75% [5]. Velasco, J. et al. performed a study to classify seven skin diseases using the MobileNet algorithm, with an accuracy of 94.4% [6]. ...
Article
Despite various economic crisis situations around the world, the courier and delivery service market continues to be revitalized. The parcel shipping volume in Korea is currently 3.37 billion parcels, achieving a growth rate of about 140% compared to 2012, and 70% of parcels are from metropolitan areas. Given the above statistics, this paper focused on the development of an underground logistics system (ULS), in order to conduct a study to handle the freight volume in a more eco-friendly manner in the center of metropolitan areas. In this paper we first analyzed the points at which parcel boxes were damaged, based on a ULS. After collecting image data of the parcel boxes, the damaged parcel boxes were detected and classified using computerized methods, in particular, a convolutional neural network (CNN), MobileNet. For image classification, a Google Colaboratory notebook was used and 4882 images were collected for the experiment. Based on the collected dataset, when conducting the experiment, the accuracy, recall, and specificity of classification for the testing set were 84.6%, 82% and 88.54%, respectively. To validate the usefulness of the MobileNet algorithm, additional experiments were performed under the same conditions using other algorithms, VGG16 and ResNet50. The results show that MobileNet is superior to other image classification models when comparing test time. Thus, in the future, MobileNet has the potential to be used for identifying damaged boxes, and could be used to ensure the reliability and safety of parcel boxes based on a ULS.
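Accuracy, recall, and specificity of the kind reported above all derive from the four counts of a binary confusion matrix. The following is a minimal sketch, assuming "damaged" is treated as the positive class; the function name and the example counts are hypothetical and are not the paper's data.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, recall and specificity from binary confusion-matrix counts.

    tp/fn: positive samples classified correctly/incorrectly.
    tn/fp: negative samples classified correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    return {
        # Share of all samples that were classified correctly.
        "accuracy": (tp + tn) / total,
        # Share of actual positives that were found (true positive rate).
        "recall": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Share of actual negatives that were rejected (true negative rate).
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


if __name__ == "__main__":
    # Made-up counts for illustration only.
    print(binary_metrics(tp=82, fp=12, tn=88, fn=18))
```

Reporting recall and specificity alongside accuracy matters when the classes are imbalanced, because accuracy alone can look high even if one class is rarely detected.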
... More recently, Conditional GANs (CGANs) [4] have been used for image augmentation. Synthetic images generated by pretrained CGAN were utilized to augment real data, resulting in improved classification accuracy for fruit quality classification [5]. In addition, research [6] demonstrated that deep conditional generative models, including deep CGANs, have the ability to balance the imbalanced data, which lead to better classification performance. ...
Article
Classification using supervised learning requires annotating a large amount of classes-balanced data for model training and testing. This has practically limited the scope of applications with supervised learning, in particular deep learning. To address the issues associated with limited and imbalanced data, this paper introduces a sample-efficient co-supervised learning paradigm (SEC-CGAN), in which a conditional generative adversarial network (CGAN) is trained alongside the classifier and supplements semantics-conditioned, confidence-aware synthesized examples to the annotated data during the training process. In this setting, the CGAN not only serves as a co-supervisor but also provides complementary quality examples to aid the classifier training in an end-to-end fashion. Experiments demonstrate that the proposed SEC-CGAN outperforms the external classifier GAN (EC-GAN) and a baseline ResNet-18 classifier. For the comparison, all classifiers in above methods adopt the ResNet-18 architecture as the backbone. Particularly, for the Street View House Numbers dataset, using the 5% of training data, a test accuracy of 90.26% is achieved by SEC-CGAN as opposed to 88.59% by EC-GAN and 87.17% by the baseline classifier; for the highway image dataset, using the 10% of training data, a test accuracy of 98.27% is achieved by SEC-CGAN, compared to 97.84% by EC-GAN and 95.52% by the baseline classifier.
... However, the collection of manually annotated data in industrial manufacturing processes is difficult and expensive. Although data augmentation technologies [16][17][18] based on generative adversarial networks (GANs) bring us plausible solutions, the data's generative ability is limited by the number of samples, especially for the industrial products, whose amounts are only several hundreds or even dozens. Therefore, most existing DNN-based methods still suffer from the small sample problem and a lack of labeled data. ...
Article
Surface defect inspection is a key technique in industrial product assessments. Compared with other visual applications, industrial defect inspection suffers from a small sample problem and a lack of labeled data. Therefore, conventional deep-learning methods depending on huge supervised samples cannot be directly generalized to this task. To deal with the lack of labeled data, unsupervised subspace learning provides more clues for the task of defect inspection. However, conventional subspace learning methods focus on studying the linear subspace structure. In order to explore the nonlinear manifold structure, a novel neural subspace learning algorithm is proposed by substituting linear operators with nonlinear neural networks. The low-rank property of the latent space is approximated by limiting the dimensions of the encoded feature, and the sparse coding property is simulated by quantized autoencoding. To overcome the small sample problem, a novel data augmentation strategy called thin-plate-spline deformation is proposed. Compared with the rigid transformation methods used in previous literature, our strategy could generate more reliable training samples. Experiments on real-world datasets demonstrate that our method achieves state-of-the-art performance compared with unsupervised methods. More importantly, the proposed method is competitive and has a better generalization capability compared with supervised methods based on deep learning techniques.
... Experimental results show that the enhanced data sets get higher detection accuracy. In order to alleviate the problem of data scarcity, Bird et al. [121] adopted conditional GAN to synthesize images to enhance the dataset (Lemons Quality Control Dataset [122]), and finally achieved 88.75% defect classification accuracy. Even if the model is compressed to half the original size, the conditional GAN enhanced classification network can maintain the classification accuracy of 81.16%. ...
Article
With the development of science and technology and the progress of the times, automation and intelligence have been popularized in manufacturing in all walks of life. With the progress of productivity, product defect detection has become an indispensable part. However, in practical scenarios, the application of supervised deep learning algorithms in the field of defect detection is limited due to the difficulty and unpredictability of obtaining defect samples. In recent years, semi-supervised and unsupervised deep learning algorithms have attracted more and more attention in various defect detection tasks. Generative adversarial networks (GANs), as unsupervised learning algorithms, have been widely used in defect detection tasks in various fields due to their powerful generation ability. In order to provide some inspiration for researchers who intend to use GANs for defect detection research, this paper reviews the theoretical basis, technical development and practical application of GAN-based defect detection. This paper also discusses the current outstanding problems of GAN and GAN-based defect detection, and makes a detailed prediction and analysis of the possible future research directions. This paper summarizes the relevant literature on the research progress and application status of GAN-based defect detection, which provides certain technical information for researchers who are interested in researching GAN and hope to apply it to defect detection tasks.
... The use of conditional GAN has been investigated by [4] to generate synthetic images of healthy or unhealthy lemons with mould and gangrene. The model can produce synthetic image according to the label provided, and it is found that the use of conditional GAN is able to improve the model accuracy for detecting defective lemons. ...
Conference Paper
The lack of samples and class imbalance will reduce the reliability of the CNN model in image classification and recognition tasks. In this research, we examined the use of Image-to-Image translation with conditional GAN for producing synthetic mango images with bruises. We introduce a conditional GAN for producing mango images with controlled surface defects, which is suited for dataset augmentation tasks within the fruit classification problem domain. The findings show that our network is able to generate mango images with bruises that are very close to the ground truth, with an FID value of 37.0.
... Traditional techniques have been in use for a long time, but they are extremely tedious, expensive and out of control over time. In this context, high-tech switches are needed to use machine vision to classify the quality of agricultural food products and to assess timely and accurately [32][33][34][35][36][37][38][39][40][41][42]. ...
Article
Remote sensing sensors-based image processing techniques have been widely applied in non-destructive quality inspection systems of agricultural crops. Image processing and analysis were performed with computer vision and external grading systems by general and standard steps, such as image acquisition, pre-processing and segmentation, extraction and classification of image characteristics. This paper describes the design and implementation of a real-time fresh fruit bunch (FFB) maturity classification system for palm oil based on unrestricted remote sensing (CCD camera sensor) and image processing techniques using five multivariate techniques (statistics, histograms, Gabor wavelets, GLCM and BGLAM) to extract fruit image characteristics and incorporate information on palm oil species classification FFB and maturity testing. To optimize the proposed solution in terms of performance reporting and processing time, supervised classifiers, such as support vector machine (SVM), K-nearest neighbor (KNN) and artificial neural network (ANN), were performed and evaluated via ROC and AUC measurements. The experimental results showed that the FFB classification system of non-destructive palm oil maturation in real time provided a significant result. Although the SVM classifier is generally a robust classifier, ANN has better performance due to the natural noise of the data. The highest precision was obtained on the basis of the ANN and BGLAM algorithms applied to the texture of the fruit. In particular, the robust image processing algorithm based on BGLAM feature extraction technology and the ANN classifier largely provided a high AUC test accuracy of over 93% and an image-processing time of 0.44 s for the detection of FFB palm oil species.
... Thus, the imaging was done in a laboratory environment and not in a real-world setting. In this sense, numerous researches have been devoted in the literature where imaging carried out under controlled illumination condition for fruit maturity detection (Behera et al., 2021;Cho and Koseki, 2021;Wan et al., 2018;Zhao and Chen, 2021), and fruit classification (Bird et al., 2022;Momeny et al., 2020;Nasiri et al., 2019). On the other side, the importance of this point cannot be ignored that challenges arise when imaging in the real world. ...
Article
Marketability of agricultural products depends heavily on appearance attributes such as color, size, and ripeness. Sorting plays an important role in increasing marketability by separating crop classes according to appearance attributes, thus reducing waste. As an expert technique, image processing and artificial intelligence (AI) techniques have been applied to classify hawthorns based on maturity levels (unripe, ripe, and overripe). A total of 600 hawthorns were categorized by an expert and the images were taken by an imaging box. The geometric properties, color, and texture features were extracted from segmented hawthorns using the Gray Level Co-occurrence Matrix (GLCM) and evaluation of various color spaces. The efficient feature vector was created by QDA feature reduction method and then classified using two classical machine learning algorithms: Artificial Neural Network (ANN) and Support Vector Machine (SVM). The obtained results indicated that the efficient feature-based ANN model with the configuration of 14-10-3 resulted in the accuracy of 99.57, 99.16, and 98.16% and the lowest mean square error (MSE) of 1 × 10⁻³, 8 × 10⁻³, and 3 × 10⁻³ for training, validation and test phases, respectively. The machine vision system combined with the machine learning algorithms can successfully classify hawthorns according to their maturity levels.
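To make the GLCM texture step concrete, the sketch below counts how often pairs of gray levels occur at a fixed pixel offset and derives one common texture feature, contrast. It is a minimal illustration under stated assumptions: the function names, the single horizontal offset, the non-symmetric counting, and the toy 4-level image are all hypothetical, and the abstract does not say which GLCM features or offsets were actually used.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalised gray-level co-occurrence matrix for one pixel offset.

    image: list of rows of integer gray levels in range(levels).
    (dx, dy): column and row offset from a pixel to its neighbour.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Only count pairs whose neighbour lies inside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("offset leaves no pixel pairs inside the image")
    return [[v / total for v in row] for row in counts]


def glcm_contrast(p):
    """Contrast: co-occurrence probabilities weighted by squared level difference."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


if __name__ == "__main__":
    # Toy image with four gray levels, for illustration only.
    toy = [
        [0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 2, 2, 2],
        [2, 2, 3, 3],
    ]
    print(glcm_contrast(glcm(toy, levels=4)))
```

Smooth regions put most of the probability mass on the diagonal of the matrix and give low contrast, while rough or blemished surfaces spread it away from the diagonal.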
... A Conditional Generative Adversarial Network (CGAN) [54] is an extended version of the previous GAN model that works on a given number of disease classes. This mechanism of image augmentation is also used in fruit classification [55]. The generator now aims to learn to generate images belonging to one of ten classes of tomato leaf disease. ...
Article
Automatic leaf disease detection techniques are effective for reducing the time-consuming effort of monitoring large crop farms and early identification of disease symptoms of plant leaves. Tomato crops are susceptible to a variety of diseases that can reduce the production of the crop. In recent years, advanced deep learning methods show successful applications for plant disease detection based on observed symptoms on leaves. However, these methods have some limitations. This study proposed a high-performance tomato leaf disease detection approach, namely attention-based dilated CNN logistic regression (ADCLR). Firstly, we develop a new feature extraction method using attention-based dilated CNN to extract most relevant features in a faster time. In our preprocessing, we use bilateral filtering to handle larger features to make the image smoother and the Otsu image segmentation process to remove noise in a fast and simple way. In this proposed method, we preprocess the image with bilateral filtering and Otsu segmentation. Then, we use the Conditional Generative Adversarial Network (CGAN) model to generate a synthetic image from the image which is preprocessed in the previous stage. The synthetic image is generated to handle imbalance and noisy or wrongly labeled data to obtain good prediction results. Then, the extracted features are normalized to lower the dimensionality. Finally, extracted features from preprocessed data are combined and then classified using fast and simple logistic regression (LR) classifier. The experimental outcomes show the state-of-the-art performance on the Plant Village database of tomato leaf disease by achieving 100%, 100%, 96.6% training, testing, and validation accuracy, respectively, for multiclass. From the experimental analysis, it is clearly demonstrated that the proposed multimodal approach can be utilized to detect tomato leaf disease precisely, simply and quickly.
We have a potential plan to improve the model to make it cloud-based automated leaf disease classification for different plants.
... Previous ML works applied to tomato, papaya, nectarine, and strawberry have explored applications of a range of data collection strategies including: image analysis (El-Bendary et al., 2015;Pereira et al., 2018), Vis/NIR spectroscopy (Amoriello et al., 2018), bioimpedance (Ibba et al., 2021), hyperspectral imaging (Gao et al., 2020), and bio-speckle method (Romero et al., 2009). These, and other ML-based strategies, focus on characterisation of fruit current-state to detect fruit on plants (Sa et al., 2016), discern shape (Oo and Aung, 2018), grade quality (Mhaske et al., 2020), distinguish cultivars (Osako et al., 2020), categorise defect (disease or damage) states (Bird et al., 2022;Nasiri et al., 2019;Pertot et al., 2012;Tariq et al., 2022), and assign fruit as either 'ripe' or 'unripe' (Chen et al., 2019;Yu et al., 2020). Specifically, studies on strawberry assessment through image analysis have focused on ripeness classification (Anraeni et al., 2021;Fan et al., 2022;Indrabayu et al., 2019;Khort et al., 2020;Thakur et al., 2020), with potential application to robotic strawberry fruit sorting during-and post-harvest (Xiong et al., 2020;Yu et al., 2020). ...
Article
Labour and production costs are considered major production challenges for strawberry (Fragaria sp.) farmers, due to the reliance on manual harvesting methods. Automation has been proposed as a desirable solution, in particular robotic-driven harvesting with in-built decision making for determination of fruit ripeness and early-prediction of harvest timing and conformity to industry quality parameters (fruit weight and length). To support the development of these automated processes, the work presented herein explored the capacity to utilise automated image analysis for the prediction of strawberry quality measures. This involved the hydroponic growth of strawberry plants under controlled conditions and the daily collection of photographs of individual flowers and fruit. Machine learning (ML)-driven image colour extraction from the collected 1685 strawberry images utilised object detection to identify flowers and fruit within images, followed by cropping and counting of remaining image pixels, which were assigned based on pixel RGB to one of 10 pre-defined groups: achromatic, blue, cyan, green, orange, pink, purple, red, white, and yellow. These colour measures were utilised as inputs for general regression with 10-fold cross-validation to generate 3 models: for the prediction of current-state fruit developmental stage (R² = 0.9071), current-state fruit length (R² = 0.8565), and days remaining until harvest (R² = 0.8694). Additionally, current-state fruit development stage and current-state length were utilised as inputs for general regression with 10-fold cross-validation to develop predictive models for the endpoint (harvest) key quality measures: fruit harvest-length (R² = 0.8817) and fruit harvest-weight (R² = 0.7252). 
Noting that days to harvest could be accurately predicted up to 15 days prior to harvest, and the harvest quality measures could be accurately predicted up to 22 days prior to harvest, the models presented herein may be utilised to increase automation and thereby improve efficiency in the scheduling of harvesting and quality control of strawberry farming.
... To verify the validity of virtual samples, the real and the mixed datasets are adopted for soft sensor training. In addition, five other generation methods are applied for comparison with DA-GAN: the MH algorithm [42] as the MCMC method, GAN [38], WGAN-GP [43], conditional GAN (CGAN) [44] and conditional variational autoencoder (CVAE) [45]. The experiments are implemented on a PC with an Intel (R) Core (TM) i7 3.60 GHz processor with 8 GB RAM using MATLAB 2019a. ...
Article
Many key quality variables are difficult to measure in complex industrial processes for various reasons, such as working conditions or economic costs, leading to inefficient production monitoring. In recent years, soft sensors with outstanding performance in variable estimation have been widely used. However, quality samples collected from industrial sites are often limited, which results in incomplete datasets that cannot meet the training requirements of soft sensors and poor performance in model learning and prediction. In this paper, a new virtual sample generation method DA-GAN based on generative adversarial network (GAN) is proposed to provide extra training samples for soft sensors. Adversarial net-and adversarial sample-based dual adversarial learning is implemented to reduce the adversarial noise in the discriminator gradient, which can improve the convergence speed and learning stability of the generator and obtain virtual samples with higher similarity to the real data. Furthermore, a sample screening method based on asymmetric acceptable domain range expansion is introduced to choose high-quality virtual samples. Experimental results of two industrial case studies show that the virtual samples provided by DA-GAN are closer to real samples than several other widely used generation methods. The performance of the prediction model trained with the dataset added by the virtual samples yielded from DA-GAN can be better improved.
... The key takeaway from the existing work was the model architectures and hyper parameters. In [6] [5] have tried to judge the quality of lemons using the images from Fruit 360 dataset using a publicly-available dataset of 2690 images. Due to lack of data they have used GANs to enrich their dataset. ...
Preprint
Full-text available
India is the second largest producer of fruits and vegetables in the world, and one of the largest consumers of fruits like Banana, Papaya and Mangoes through retail and ecommerce giants like BigBasket, Grofers and Amazon Fresh. However, adoption of technology in supply chain and retail stores is still low and there is a great potential to adopt computer-vision based technology for identification and classification of fruits. We have chosen banana fruit to build a computer vision based model to carry out the following three use-cases (a) Identify Banana from a given image (b) Determine sub-family or variety of Banana (c) Determine the quality of Banana. Successful execution of these use-cases using computer-vision model would greatly help with overall inventory management automation, quality control, quick and efficient weighing and billing which all are manual labor intensive currently. In this work, we suggest a machine learning pipeline that combines the ideas of CNNs, transfer learning, and data augmentation towards improving Banana fruit sub family and quality image classification. We have built a basic CNN and then went on to tune a MobileNet Banana classification model using a combination of self-curated and publicly-available dataset of 3064 images. The results show an overall 93.4% and 100% accuracy for sub-family/variety and for quality test classifications respectively.
Article
Image recognition research based on the Generative Adversarial Networks (GAN) has been widely used in various specialized technical fields because of its universality, adaptability, and scalability of the adversarial framework. The framework achieves the tasks of image generation, synthesis, and transformation to improve the image resolution and recognition rate, which is consistent with the purpose of the police in the image recognition work. However, the task of applying the GAN method to police image recognition is not widely explored. In this paper, we systematically review related research to understand the current applications of GAN in the field of image recognition and summarize their contributions. We analyze 35 academic papers after 2018 to provide a state-of-the-art research stream. According to the GAN applications, we divided the dataset into three domains, which are image-to-image translation, image augmentation, and mixed model. The results show that there is little difference in the number of articles in the three fields. In addition, most of the papers use image conversion methods from more than two domains, which indicates that GAN is flexible in designing the framework according to the research tasks. Based on the methods and challenges in the literature, we further propose future research directions.
Chapter
The purpose of this study is to explore Islamic Human Value which is believed by the young millennial generation to determine career adaptability and career success. Career success is very important for today’s Muslim millennial generation because they have high expectations regarding work-life balance. To achieve career success, it is necessary to have career adaptability so that the Muslim millennial generation is able to prepare themselves to face unexpected transitions or changes. In addition, the Muslim millennial generation also needs to have Islamic values, namely Islamic human values. These values are important so that the millennial generation has good planning in building a career. This research is qualitative research by conducting structured interviews on 8 respondents. The results show that there are 6 Islamic human values that can be applied as the basis for achieving career adaptability and career success. Keywords: Islamic human values; Career adaptability; Career success; Millennial generation
Article
Full-text available
Fruit and vegetable picking robots are affected by the complex orchard environment, resulting in poor recognition and segmentation of target fruits by the vision system. The orchard environment is complex and changeable. For example, the change of light intensity will lead to the unclear surface characteristics of the target fruit; the target fruits are easy to overlap with each other and blocked by branches and leaves, which makes the shape of the fruits incomplete and difficult to accurately identify and segment one by one. Aiming at various difficulties in complex orchard environment, a two-stage instance segmentation method based on the optimized mask region convolutional neural network (mask RCNN) was proposed. The new model proposed to apply the lightweight backbone network MobileNetv3, which not only speeds up the model but also greatly improves the accuracy of the model and meets the storage resource requirements of the mobile robot. To further improve the segmentation quality of the model, the boundary patch refinement (BPR) post-processing module is added to the new model to optimize the rough mask boundaries of the model output to reduce the error pixels. The new model has a high-precision recognition rate and an efficient segmentation strategy, which improves the robustness and stability of the model. This study validates the effect of the new model using the persimmon dataset. The optimized mask RCNN achieved mean average precision (mAP) and mean average recall (mAR) of 76.3 and 81.1%, respectively, which are 3.1 and 3.7% improvement over the baseline mask RCNN, respectively. The new model is experimentally proven to bring higher accuracy and segmentation quality and can be widely deployed in smart agriculture.
Article
In agricultural image analysis, optimal model performance is keenly pursued for better fulfilling visual recognition tasks (e.g., image classification, segmentation, object detection and localization), in the presence of challenges with biological variability and unstructured environments. Large-scale, balanced and ground-truthed image datasets are tremendously beneficial but most often difficult to obtain to fuel the development of highly performant models. As artificial intelligence through deep learning is impacting analysis and modeling of agricultural images, image augmentation plays a crucial role in boosting model performance while reducing manual efforts for image collection and labelling, by algorithmically creating and expanding datasets. Beyond traditional data augmentation techniques, the generative adversarial network (GAN), invented in 2014 in the computer vision community, provides a suite of novel approaches that can learn good data representations and generate highly realistic samples. Since 2017, there has been a growth of research into GANs for image augmentation or synthesis in agriculture for improved model performance. This paper presents an overview of the evolution of GAN architectures followed by a first systematic review of the various applications in agriculture and food systems (https://github.com/Derekabc/GANs-Agriculture), involving a diversity of visual recognition tasks for plant health conditions, weeds, fruits (preharvest), aquaculture, animal farming, plant phenotyping as well as postharvest detection of fruit defects. Challenges and opportunities of GANs are discussed for future research.
Article
Single Image Super Resolution (SISR) is a process that recovers high pixel density and refined detail from a low-resolution (LR) image to produce an upscaled, sharper high-resolution (HR) image. In the last decade, SISR based on Convolutional Neural Networks (CNN) has achieved impressive results for generating super-resolved images up to ×3 upscaling. This technique focuses on minimizing the L1/L2 loss between the real HR image and the generated HR image without considering the perceptual quality of images. To improve on this, SISR based on Generative Adversarial Networks (GAN) has gained researchers' attention for generating visually pleasing images while reasonably minimizing the L1/L2 loss for ×4 upscaling. The basic idea of a GAN is to train two networks simultaneously, a Generator and a Discriminator, such that the generator can produce the super-resolved image for a given input LR image by learning the real HR image distribution. This paper presents an overview of GAN-based SISR techniques for further research, as there are few surveys in this area. Different GAN models have been classified in terms of architecture, algorithms, and loss functions, including their benefits and limitations. Lastly, the research gaps as well as some possible solutions for the existing methods have been discussed.
Chapter
In real life, when you see an apple, eggplant, or beet, how do you know their names in another language? In this article, we propose a new approach to identify vegetables and fruits by using deep learning and transfer learning based on the DenseNet201 model. The result of this research is a model that helps people to identify 104 types of vegetables and tubers. This model is trained on the original dataset and the processed dataset; the highest accuracy at fine-tuning is 94%. Keywords: DenseNet-201; Fruit recognition; Deep learning; Transfer learning; Image classification
Article
Full-text available
The coronavirus disease 2019 (COVID-19) is the fastest transmittable virus caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The detection of COVID-19 using artificial intelligence techniques and especially deep learning will help to detect this virus in early stages, which will increase the opportunities for fast recovery of patients worldwide. This will relieve pressure on healthcare systems around the world. In this research, classical data augmentation techniques along with CGAN based on a deep transfer learning model for COVID-19 detection in chest CT scan images will be presented. The limited benchmark datasets for COVID-19, especially in chest CT images, are the main motivation of this research. The main idea is to collect all the possible images for COVID-19 that exist at the time of writing and use the classical data augmentations along with CGAN to generate more images to help in the detection of COVID-19. In this study, five different deep convolutional neural network-based models (AlexNet, VGGNet16, VGGNet19, GoogleNet, and ResNet50) have been selected for the investigation to detect coronavirus-infected patients using chest CT radiograph digital images. The classical data augmentations along with CGAN improve the performance of classification in all selected deep transfer models. The outcomes show that ResNet50 is the most appropriate classifier to detect COVID-19 from the chest CT dataset using the classical data augmentation and CGAN, with a testing accuracy of 82.91%.
Article
Full-text available
Synthetic data augmentation is of paramount importance for machine learning classification, particularly for biological data, which tend to be high dimensional and with a scarcity of training samples. The applications of robotic control and augmentation in disabled and able-bodied subjects still rely mainly on subject-specific analyses. Those can rarely be generalised to the whole population and appear to over complicate simple action recognition such as grasp and release (standard actions in robotic prosthetics and manipulators). We show for the first time that multiple GPT-2 models can machine-generate synthetic biological signals (EMG and EEG) and improve real data classification. Models trained solely on GPT-2 generated EEG data can classify a real EEG dataset at 74.71% accuracy and models trained on GPT-2 EMG data can classify real EMG data at 78.24% accuracy. Synthetic and calibration data are then introduced within each cross validation fold when benchmarking EEG and EMG models. Results show algorithms are improved when either or both additional data are used. A Random Forest achieves a mean 95.81% (1.46) classification accuracy of EEG data, which increases to 96.69% (1.12) when synthetic GPT-2 EEG signals are introduced during training. Similarly, the Random Forest classifying EMG data increases from 93.62% (0.8) to 93.9% (0.59) when training data is augmented by synthetic EMG signals. Additionally, as predicted, augmentation with synthetic biological signals also increases the classification accuracy of data from new subjects that were not observed during training. A Robotiq 2F-85 Gripper was finally used for real-time gesture-based control, with synthetic EMG data augmentation remarkably improving gesture recognition accuracy, from 68.29% to 89.5%.
Article
Full-text available
The article presents the results of a study of the efficiency of various neural networks when source data are limited and a number of simple augmentations are applied. In this case, the dependences were obtained for a serial neural network with back propagation of error. For data augmentation, the simplest transformations were used, including tilting the letters (italics), changing the color of letters (from black to red), as well as distortion of the reference images with white Gaussian noise at a signal-to-noise ratio q from 1 to 10. It is shown that the best results of recognition of letters of the Russian alphabet are provided by a network for which all the augmentation methods discussed in this work were used. A study of the dependence of recognition accuracy on the signal-to-noise ratio in all trained neural networks was also conducted.
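As a rough illustration of the noise-based augmentation described in this abstract, the sketch below distorts a reference image with white Gaussian noise at a chosen signal-to-noise ratio. It is not the authors' code. The function name `add_white_gaussian_noise`, the flattened `reference` image, and the reading of q as a linear ratio of mean signal power to noise variance are all assumptions, since the abstract does not define q.

```python
import random


def add_white_gaussian_noise(pixels, snr, seed=0):
    """Return a noisy copy of `pixels` at the requested signal-to-noise ratio.

    Assumption: `snr` is the linear ratio of mean signal power to noise
    variance; the abstract does not say how q is defined.
    """
    if snr <= 0:
        raise ValueError("snr must be positive")
    rng = random.Random(seed)
    signal_power = sum(p * p for p in pixels) / len(pixels)
    noise_std = (signal_power / snr) ** 0.5
    return [p + rng.gauss(0.0, noise_std) for p in pixels]


# Hypothetical flattened reference image: alternating bright and dark bands.
reference = [1.0 if (i // 8) % 2 == 0 else 0.0 for i in range(4096)]

# One distorted copy per signal-to-noise ratio in the range quoted above.
noisy_versions = {q: add_white_gaussian_noise(reference, q, seed=q) for q in range(1, 11)}
```

Lower values of q give stronger distortion, so sweeping q from 1 to 10 yields a graded set of training variants for each reference image.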
Article
Full-text available
The Coronavirus disease 2019 (COVID-19) is the fastest transmittable virus caused by severe acute respiratory syndrome Coronavirus 2 (SARS-CoV-2). The detection of COVID-19 using artificial intelligence techniques and especially deep learning will help to detect this virus in early stages, which will increase the opportunities for fast recovery of patients worldwide. This will relieve pressure on healthcare systems around the world. In this research, classical data augmentation techniques along with Conditional Generative Adversarial Nets (CGAN) based on a deep transfer learning model for COVID-19 detection in chest CT scan images will be presented. The limited benchmark datasets for COVID-19, especially in chest CT images, are the main motivation of this research. The main idea is to collect all the possible images for COVID-19 that exist at the time of writing and use the classical data augmentations along with CGAN to generate more images to help in the detection of COVID-19. In this study, five different deep convolutional neural network-based models (AlexNet, VGGNet16, VGGNet19, GoogleNet, and ResNet50) have been selected for the investigation to detect Coronavirus-infected patients using chest CT radiograph digital images. The classical data augmentations along with CGAN improve the performance of classification in all selected deep transfer models. The outcomes show that ResNet50 is the most appropriate deep learning model to detect COVID-19 from a limited chest CT dataset using the classical data augmentation, with a testing accuracy of 82.91%, sensitivity of 77.66%, and specificity of 87.62%.
Conference Paper
Full-text available
Image-to-image translation is a computer vision problem where a task learns a mapping from a source domain A to a target domain B using a training set. However, this translation is not always accurate, and during the translation process, relevant semantic information can deteriorate. To handle this problem, we propose a new cycle-consistent, adversarially trained image-to-image translation with a loss function that is constrained by semantic segmentation. This formulation encourages the model to preserve semantic information during the translation process. For this purpose, our loss function evaluates the accuracy of the synthetically generated image against a semantic segmentation model, previously trained. Reported results show that our proposed method can significantly increase the level of details in the synthetic images. We further demonstrate our method's effectiveness by applying it as a dataset augmentation technique, for a minimal dataset, showing that it can improve the semantic segmentation accuracy.
Article
Full-text available
Deep learning has been successfully showing promising results in plant disease detection, fruit counting, yield estimation, and gaining an increasing interest in agriculture. Deep learning models are generally based on several millions of parameters that generate exceptionally large weight matrices. The latter requires large memory and computational power for training, testing, and deploying. Unfortunately, these requirements make it difficult to deploy on low-cost devices with limited resources that are present at the fieldwork. In addition, the lack or the bad quality of connectivity in farms does not allow remote computation. An approach that has been used to save memory and speed up the processing is to compress the models. In this work, we tackle the challenges related to the resource limitation by compressing some state-of-the-art models very often used in image classification. For this we apply model pruning and quantization to LeNet5, VGG16, and AlexNet. Original and compressed models were applied to the benchmark of plant seedling classification (V2 Plant Seedlings Dataset) and Flavia database. Results reveal that it is possible to compress the size of these models by a factor of 38 and to reduce the FLOPs of VGG16 by a factor of 99 without considerable loss of accuracy.
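A minimal sketch of the two compression steps named in this abstract, pruning and quantization, applied to a flat list of weights. The abstract does not state which pruning criterion or quantization scheme was used. Magnitude-based pruning and affine 8-bit quantization are assumptions chosen for illustration, and the helpers `magnitude_prune`, `quantize_uint8`, and `dequantize` are hypothetical names.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must lie in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by ascending magnitude; the first n_prune are zeroed.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_uint8(weights):
    """Affine 8-bit quantization: returns (codes, scale, minimum)."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 when all weights are equal
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Map 8-bit codes back to approximate floating-point weights."""
    return [lo + c * scale for c in codes]


# Hypothetical layer weights, pruned to 50% sparsity and then quantized.
weights = [0.8, -0.05, 0.3, -0.6, 0.02, 0.1, -0.9, 0.4]
pruned = magnitude_prune(weights, 0.5)
codes, scale, lo = quantize_uint8(pruned)
restored = dequantize(codes, scale, lo)
```

Storing `codes` as single bytes instead of 32-bit floats gives a fourfold size reduction on its own. The zeros introduced by pruning can then be exploited by a sparse storage format or a general-purpose compressor.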
Data
Full-text available
Lemon dataset has been prepared to investigate the possibilities to tackle the issue of fruit quality control. It contains 2690 annotated images (1056 x 1056 pixels). Raw lemon images have been captured using the procedure described in the following blogpost (https://blog.softwaremill.com/when-life-gives-you-lemons-create-a-dataset-70522d6b1aa0) and manually annotated using CVAT. Full dataset is available on Github (https://github.com/softwaremill/lemon-dataset)
Article
Full-text available
Papaya (Carica papaya) is a tropical fruit having commercial importance because of its high nutritive and medicinal value. The packaging of papaya fruit as per its maturity status is an essential task in the fruit industry. The manual grading of papaya fruit based on human visual perception is time-consuming and destructive. The objective of this paper is to suggest a novel non-destructive maturity status classification of papaya fruits. The paper suggests two approaches based on machine learning and transfer learning for classification of papaya maturity status. Also, a comparative analysis is carried out with different methods of machine learning and transfer learning. The experimentation is carried out with 300 papaya fruit sample images, which include 100 images for each of the three maturity stages. The machine learning approach includes three sets of features and three classifiers with their different kernel functions. The features used in the machine learning approaches are local binary pattern (LBP), histogram of oriented gradients (HOG), and Gray Level Co-occurrence Matrix (GLCM); the classifiers are k-nearest neighbour (KNN), support vector machine (SVM), and Naïve Bayes. The transfer learning approach includes seven pre-trained models: ResNet101, ResNet50, ResNet18, VGG19, VGG16, GoogleNet and AlexNet. The weighted KNN with HOG features outperforms the other machine learning-based classification models with 100% accuracy and 0.099548 second training time. Among the transfer learning-based classification models, VGG19 performs best with 100% accuracy and 1 minute 52 seconds training time with consideration of early stop training. The proposed classification method for maturity classification of papaya fruits, i.e. VGG19 based on the transfer learning approach, achieved 100% accuracy, which is 6% more than the existing method.
Article
Full-text available
Sorting is the most important process before packaging and preserving agricultural products. Sorting carrots by human labor involves many problems, such as high cost and product waste. Image processing is a modern method with various applications in agriculture, including classification and sorting. The aim of this study was to classify carrots based on shape using image processing techniques. For this, 135 samples with different regular and irregular shapes were selected. After image acquisition and preprocessing, some features such as length, width, breadth, perimeter, elongation, compactness, roundness, area, eccentricity, centroid, centroid nonhomogeneity, and width nonhomogeneity were extracted. After feature selection, linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA) methods were used to classify the features. The classification accuracies of the methods were 92.59 and 96.30, respectively. It can be stated that image processing is an effective way to improve traditional carrot sorting techniques.
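Two of the shape features listed in this abstract can be illustrated with a short sketch. The abstract does not give its formulas. The definitions below, roundness as 4πA/P² and elongation as length divided by width, are common conventions assumed here and may differ from those used in the study. The function names and example measurements are hypothetical.

```python
import math


def roundness(area, perimeter):
    """Circularity measure 4*pi*A / P**2 (assumed definition); 1.0 for a circle."""
    return 4.0 * math.pi * area / perimeter ** 2


def elongation(length, width):
    """Ratio of the long axis to the short axis (assumed definition)."""
    return length / width


# A circle of radius 2 is perfectly round under this definition.
circle_roundness = roundness(math.pi * 2 ** 2, 2 * math.pi * 2)

# A hypothetical carrot outline approximated by a 10 x 2 rectangle.
carrot_roundness = roundness(10 * 2, 2 * (10 + 2))
carrot_elongation = elongation(10, 2)
```

A regular, slender carrot should score low on roundness and high on elongation, which is the kind of separation a discriminant classifier such as LDA or QDA can exploit.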
Article
Full-text available
Agriculture has always been an important economic and social sector for humans. Fruit production is especially essential, with a great demand from all households. Therefore, the use of innovative technologies is of vital importance for the agri-food sector. Currently artificial intelligence is one very important technological tool widely used in modern society. Particularly, Deep Learning (DL) has several applications due to its ability to learn robust representations from images. Convolutional Neural Networks (CNN) is the main DL architecture for image classification. Based on the great attention that CNNs have had in the last years, we present a review of the use of CNN applied to different automatic processing tasks of fruit images: classification, quality control, and detection. We observe that in the last two years (2019–2020), the use of CNN for fruit recognition has greatly increased obtaining excellent results, either by using new models or with pre-trained networks for transfer learning. It is worth noting that different types of images are used in datasets according to the task performed. Besides, this article presents the fundamentals, tools, and two examples of the use of CNNs for fruit sorting and quality control.
Article
Full-text available
This paper proposes two new data augmentation approaches based on Deep Convolutional Generative Adversarial Networks (DCGANs) and Style Transfer for augmenting Parkinson’s Disease (PD) electromyography (EMG) signals. The experimental results indicate that the proposed models can adapt to different frequencies and amplitudes of tremor, simulating each patient’s tremor patterns and extending them to different sets of movement protocols. Therefore, one could use these models for extending the existing patient dataset and generating tremor simulations for validating treatment approaches on different movement scenarios.
Conference Paper
Full-text available
Autonomous speaker identification suffers issues of data scarcity due to it being unrealistic to gather hours of speaker audio to form a dataset, which inevitably leads to class imbalance in comparison to the large data availability from non-speakers since large-scale speech datasets are available online. In this study, we explore the possibility of improving speaker recognition by augmenting the dataset with synthetic data produced by training a Character-level Recurrent Neural Network on a short clip of five spoken sentences. A deep neural network is trained on a selection of the Flickr8k dataset as well as the real and synthetic speaker data (all in the form of MFCCs) as a binary classification problem in order to discern the speaker from the Flickr speakers. Ranging from 2,500 to 10,000 synthetic data objects, the network weights are then transferred to the original dataset of only Flickr8k and the real speaker data, in order to discern whether useful rules can be learnt from the synthetic data. Results for all three subjects show that fine-tune learning from datasets augmented with synthetic speech improves the classification accuracy, F1 score, precision, and recall when applied to the scarce real data vs non-speaker data. We conclude that even with just five short spoken sentences, data augmentation via synthetic speech data generated by a Char-RNN can improve the speaker classification process. Accuracy and related metrics are shown to improve from around 93% to 99% for three subjects classified from thousands of others when fine-tuning from exposure to 2,500 to 10,000 synthetic data points. High F1 scores, precision and recall also show that issues due to class imbalance are solved.
Article
Full-text available
Data augmentation is a popular technique which helps improve generalization capabilities of deep neural networks, and can be perceived as implicit regularization. It plays a pivotal role in scenarios in which the amount of high-quality ground-truth data is limited, and acquiring new examples is costly and time-consuming. This is a very common problem in medical image analysis, especially tumor delineation. In this paper, we review the current advances in data-augmentation techniques applied to magnetic resonance images of brain tumors. To better understand the practical aspects of such algorithms, we investigate the papers submitted to the Multimodal Brain Tumor Segmentation Challenge (BraTS 2018 edition), as the BraTS dataset became a standard benchmark for validating existent and emerging brain-tumor detection and segmentation techniques. We verify which data augmentation approaches were exploited and what was their impact on the abilities of underlying supervised learners. Finally, we highlight the most promising research directions to follow in order to synthesize high-quality artificial brain-tumor examples which can boost the generalization abilities of deep models.
Article
Full-text available
Convolutional Neural Networks (CNNs) achieve excellent computer-assisted diagnosis with sufficient annotated training data. However, most medical imaging datasets are small and fragmented. In this context, Generative Adversarial Networks (GANs) can synthesize realistic/diverse additional training images to fill the data lack in the real image distribution; researchers have improved classification by augmenting data with noise-to-image (e.g., random noise samples to diverse pathological images) or image-to-image GANs (e.g., a benign image to a malignant one). Yet, no research has reported results combining noise-to-image and image-to-image GANs for further performance boost. Therefore, to maximize the DA effect with the GAN combinations, we propose a two-step GAN-based DA that generates and refines brain Magnetic Resonance (MR) images with/without tumors separately: (i) Progressive Growing of GANs (PGGANs), multi-stage noise-to-image GAN for high-resolution MR image generation, first generates realistic/diverse 256 × 256 images; (ii) Multimodal UNsupervised Image-to-image Translation (MUNIT) that combines GANs/Variational AutoEncoders or SimGAN that uses a DA-focused GAN loss, further refines the texture/shape of the PGGAN-generated images similarly to the real ones. We thoroughly investigate CNN-based tumor classification results, also considering the influence of pre-training on ImageNet and discarding weird-looking GAN-generated images. The results show that, when combined with classic DA, our two-step GAN-based DA can significantly outperform the classic DA alone, in tumor detection (i.e., boosting sensitivity 93.67% to 97.48%) and also in other medical imaging tasks.
Article
Full-text available
In the production process from green beans to coffee bean packages, defective bean removal (or in short, defect removal) is one of the most labor-consuming stages, and many companies investigate the automation of this stage for minimizing human efforts. In this paper, we propose a deep-learning-based defective bean inspection scheme (DL-DBIS), together with a GAN (generative-adversarial network)-structured automated labeled data augmentation method (GALDAM) for enhancing the proposed scheme, so that the automation degree of bean removal with robotic arms can be further improved for coffee industries. The proposed scheme is aimed at providing an effective model to a deep-learning-based object detection module for accurately identifying defects among dense beans. The proposed GALDAM can be used to greatly reduce labor costs, since data labeling is the most labor-intensive work in this sort of solution. Our proposed scheme brings two main impacts to intelligent agriculture. First, our proposed scheme can be easily adopted by industries, as human effort in labeling coffee beans is minimized. The users can easily customize their own defective bean model without spending a great amount of time on labeling small and dense objects. Second, our scheme can inspect all classes of defective beans categorized by the SCAA (Specialty Coffee Association of America) at the same time and can be easily extended if more classes of defective beans are added. These two advantages increase the degree of automation in the coffee industry. The prototype of the proposed scheme was developed for studying integrated tests. Testing results of a case study reveal that the proposed scheme can efficiently and effectively generate models for identifying defective beans with accuracy and precision values up to 80%.
Article
Full-text available
Presently, many studies on biometrics employ convolutional neural networks (CNN), which require a large amount of labeled training data. However, biometric data are considered important personal information, and it is difficult to obtain large amounts of data due to individual privacy issues. Training with a small amount of data is a major cause of overfitting and low testing accuracy. To resolve this problem, previous studies have performed data augmentation based on geometric transforms and the adjustment of image brightness. Nevertheless, the data created by these methods have high correlation with the original data, and they cannot adequately reflect individual diversities. To resolve this problem, this study proposes iris image augmentation based on a conditional generative adversarial network (cGAN), as well as a method for improving recognition performance that uses this augmentation method. In our method, normalized iris images that are generated through arbitrary changes in the iris and pupil coordinates are used as input in the cGAN-based model to generate iris images. Due to the limitations of the cGAN model, data augmentation that uses the periocular region was found to fail to improve performance. Based on this information, only the iris region was used as input for the cGAN model. The augmentation method proposed in this paper was tested using the NICE.II training dataset (selected from UBIRIS.v2), the MICHE database, and the CASIA-Iris-Distance database. The results showed that the recognition performance was improved compared to existing studies.
Article
Full-text available
In this work, we present a complete hardware development and current consumption study of a portable electronic nose designed for the Internet-of-Things (IoT). Thanks to the technique of measuring in the initial action period, it can be reliably powered with a moderate-sized battery. The system is built around the well-known SoC (System on Chip) ESP8266EX, using low-cost electronics and standard sensors from Figaro’s TGS26xx series. This SoC, in addition to a powerful microcontroller, provides Wi-Fi connectivity, making it very suitable for IoT applications. The system also includes a precision analog-to-digital converter for the measurements and a charging module for the lithium battery. During its operation, the designed software takes measurements periodically, and keeps the microcontroller in deep-sleep state most of the time, storing several measurements before uploading them to the cloud. In the experiments and tests carried out, we have focused our work on the measurement and optimization of current consumption, with the aim of extending the battery life. The results show that taking measurements every 4 min and uploading data every five measurements, the battery of 750 mAh needs to be charged approximately once a month. Despite the fact that we have used a specific model of gas sensor, this methodology is quite generic and could be extended to other sensors with lower consumption, increasing very significantly the duration of the battery.
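The duty-cycle figures quoted in this abstract imply a few quantities that are easy to check by hand. The sketch below works them out. It assumes that "approximately once a month" means 30 days and that the full 750 mAh is usable, neither of which the abstract states, so the resulting average current is only an order-of-magnitude estimate and not a measured value.

```python
MINUTES_PER_DAY = 24 * 60

# Figures taken from the abstract.
measurement_period_min = 4
measurements_per_upload = 5
battery_capacity_mah = 750

# Assumption: "approximately once a month" is treated as 30 days.
days_between_charges = 30

# One upload after every five measurements means one upload every 20 minutes.
upload_period_min = measurement_period_min * measurements_per_upload
measurements_per_day = MINUTES_PER_DAY // measurement_period_min
uploads_per_day = MINUTES_PER_DAY // upload_period_min

# Average current that would drain the full capacity over the assumed interval.
average_current_ma = battery_capacity_mah / (days_between_charges * 24)

print(f"uploads per day: {uploads_per_day}")
print(f"measurements per day: {measurements_per_day}")
print(f"implied average current: {average_current_ma:.2f} mA")
```

Under these assumptions the device averages roughly 1 mA, which is consistent with the abstract's description of the microcontroller spending most of its time in deep sleep.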
Article
Deep convolutional neural networks have performed remarkably well on many Computer Vision tasks. However, these networks are heavily reliant on big data to avoid overfitting. Overfitting refers to the phenomenon in which a network learns a function with very high variance so as to perfectly model the training data. Unfortunately, many application domains, such as medical image analysis, do not have access to big data. This survey focuses on Data Augmentation, a data-space solution to the problem of limited data. Data Augmentation encompasses a suite of techniques that enhance the size and quality of training datasets such that better Deep Learning models can be built using them. The image augmentation algorithms discussed in this survey include geometric transformations, color space augmentations, kernel filters, mixing images, random erasing, feature space augmentation, adversarial training, generative adversarial networks, neural style transfer, and meta-learning. The application of augmentation methods based on GANs is heavily covered in this survey. In addition to augmentation techniques, this paper will briefly discuss other characteristics of Data Augmentation such as test-time augmentation, resolution impact, final dataset size, and curriculum learning. This survey will present existing methods for Data Augmentation, promising developments, and meta-level decisions for implementing Data Augmentation. Readers will understand how Data Augmentation can improve the performance of their models and expand limited datasets to take advantage of the capabilities of big data.
Article
Recent advancements in computer vision have enabled wide-ranging applications in every field of life. One such application area is fresh produce classification, but the classification of fruits and vegetables has proven to be a complex problem that needs further development. Fruit and vegetable classification presents significant challenges due to interclass similarities and irregular intraclass characteristics. Selection of appropriate data acquisition sensors and a feature representation approach is also crucial due to the huge diversity of the field. Fruit and vegetable classification methods have been developed for quality assessment and robotic harvesting, but the current state of the art has been developed for limited classes and small datasets. The problem is multi-dimensional in nature and involves hyperdimensional features, which is one of the major challenges for current machine learning approaches. Substantial research has been conducted on the design and analysis of classifiers for hyperdimensional features, which require significant computational power to optimise. In recent years, numerous machine learning techniques, for example Support Vector Machines (SVM), K-Nearest Neighbour (KNN), Decision Trees, Artificial Neural Networks (ANN) and Convolutional Neural Networks (CNN), have been combined with many different feature description methods for fruit and vegetable classification in many real-life applications. This paper presents a critical comparison of different state-of-the-art computer vision methods proposed by researchers for classifying fruits and vegetables.
Article
In agricultural science, automation increases the quality, economic growth and productivity of a country. The export market and quality evaluation are affected by the sorting of fruits and vegetables. The crucial sensory characteristic of fruits and vegetables is appearance, which impacts their market value and the consumer's preference and choice. Although sorting and grading can be done by humans, the process is inconsistent, time-consuming, variable, subjective, onerous, expensive and easily influenced by the surroundings. Hence, an astute fruit grading system is needed. In recent years, researchers have developed various algorithms for sorting and grading using computer vision. This paper presents a detailed overview of various methods, i.e. preprocessing, segmentation, feature extraction and classification, which address fruit and vegetable quality based on color, texture, size, shape and defects. In this paper, a critical comparison of the different algorithms proposed by researchers for quality inspection of fruits and vegetables has been carried out.
Article
Deep learning methods, and in particular convolutional neural networks (CNNs), have led to an enormous breakthrough in a wide range of computer vision tasks, primarily by using large-scale annotated datasets. However, obtaining such datasets in the medical domain remains a challenge. In this paper, we present methods for generating synthetic medical images using recently presented deep learning Generative Adversarial Networks (GANs). Furthermore, we show that generated medical images can be used for synthetic data augmentation, and improve the performance of CNN for medical image classification. Our novel method is demonstrated on a limited dataset of computed tomography (CT) images of 182 liver lesions (53 cysts, 64 metastases and 65 hemangiomas). We first exploit GAN architectures for synthesizing high quality liver lesion ROIs. Then we present a novel scheme for liver lesion classification using CNN. Finally, we train the CNN using classic data augmentation and our synthetic data augmentation and compare performance. In addition, we explore the quality of our synthesized examples using visualization and expert assessment. The classification performance using only classic data augmentation yielded 78.6% sensitivity and 88.4% specificity. By adding the synthetic data augmentation the results increased to 85.7% sensitivity and 92.4% specificity. We believe that this approach to synthetic data augmentation can generalize to other medical classification applications and thus support radiologists' efforts to improve diagnosis.
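The sensitivity and specificity figures reported in this abstract follow the standard definitions based on confusion-matrix counts. The sketch below illustrates those definitions only; the counts are hypothetical values chosen for the example and are not taken from the paper.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of actual positives that the classifier labels positive (recall)."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of actual negatives that the classifier labels negative."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts for illustration only (not from the paper).
tp, fn = 40, 10
tn, fp = 45, 5

print(f"sensitivity = {sensitivity(tp, fn):.3f}")  # 0.800
print(f"specificity = {specificity(tn, fp):.3f}")  # 0.900
```

For a multi-class problem such as the three lesion types mentioned above, one common convention is to compute these values per class in a one-vs-rest fashion; whether the paper does so is not stated in the abstract.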
Article
In this paper we introduce a new, high-quality dataset of images containing fruits. We also present the results of some numerical experiments for training a neural network to detect fruits. We discuss the reason why we chose to use fruits in this project by proposing a few applications that could use this kind of neural network.
Article
The generation of artificial data based on existing observations, known as data augmentation, is a technique used in machine learning to improve model accuracy, generalisation, and to control overfitting. Augmentor is a software package, available in both Python and Julia versions, that provides a high level API for the expansion of image data using a stochastic, pipeline-based approach which effectively allows for images to be sampled from a distribution of augmented images at runtime. Augmentor provides methods for most standard augmentation practices as well as several advanced features such as label-preserving, randomised elastic distortions, and provides many helper functions for typical augmentation tasks used in machine learning.
Article
Plant diseases and pernicious insects are a considerable threat in the agriculture sector. Therefore, early detection and diagnosis of these diseases are essential. The ongoing development of deep learning methods has greatly helped in the detection of plant diseases, providing a robust tool with exceptionally precise outcomes, but the accuracy of deep learning models depends on the volume and the quality of labeled data for training. In this paper, we propose a deep learning-based method for tomato disease detection that utilizes the Conditional Generative Adversarial Network (C-GAN) to generate synthetic images of tomato plant leaves. Thereafter, a DenseNet121 model is trained on synthetic and real images using transfer learning to classify the tomato leaf images into ten categories of diseases. The proposed model has been trained and tested extensively on the publicly available PlantVillage dataset. The proposed method achieved an accuracy of 99.51%, 98.65%, and 97.11% for tomato leaf image classification into 5 classes, 7 classes, and 10 classes, respectively. The proposed approach shows its superiority over the existing methodologies.
Article
Recent successes in Generative Adversarial Networks (GAN) have affirmed the importance of using more data in GAN training. Yet it is expensive to collect data in many domains such as medical applications. Data Augmentation (DA) has been applied in these applications. In this work, we first argue that the classical DA approach could mislead the generator to learn the distribution of the augmented data, which could be different from that of the original data. We then propose a principled framework, termed Data Augmentation Optimized for GAN (DAG), to enable the use of augmented data in GAN training to improve the learning of the original distribution. We provide theoretical analysis to show that using our proposed DAG aligns with the original GAN in minimizing the Jensen–Shannon (JS) divergence between the original distribution and model distribution. Importantly, the proposed DAG effectively leverages the augmented data to improve the learning of discriminator and generator. We conduct experiments to apply DAG to different GAN models: unconditional GAN, conditional GAN, self-supervised GAN and CycleGAN using datasets of natural images and medical images. The results show that DAG achieves consistent and considerable improvements across these models. Furthermore, when DAG is used in some GAN models, the system establishes state-of-the-art Fréchet Inception Distance (FID) scores. Our code is available ( https://github.com/tntrung/dag-gans ).
Chapter
Image-to-image translation is a computer vision problem where a task learns a mapping from a source domain A to a target domain B using a training set. However, this translation is not always accurate, and during the translation process, relevant semantic information can deteriorate. To handle this problem, we propose a new cycle-consistent, adversarially trained image-to-image translation with a loss function that is constrained by semantic segmentation. This formulation encourages the model to preserve semantic information during the translation process. For this purpose, our loss function evaluates the accuracy of the synthetically generated image against a semantic segmentation model, previously trained. Reported results show that our proposed method can significantly increase the level of details in the synthetic images. We further demonstrate our method’s effectiveness by applying it as a dataset augmentation technique, for a minimal dataset, showing that it can improve the semantic segmentation accuracy.
Article
Deep learning approaches to medical image analysis tasks have recently become popular; however, they suffer from a lack of human interpretability critical for both increasing understanding of the methods' operation and enabling clinical translation. This review summarizes currently available methods for performing image model interpretation and critically evaluates published uses of these methods for medical imaging applications. We divide model interpretation in two categories: (1) understanding model structure and function and (2) understanding model output. Understanding model structure and function summarizes ways to inspect the learned features of the model and how those features act on an image. We discuss techniques for reducing the dimensionality of high-dimensional data and cover autoencoders, both of which can also be leveraged for model interpretation. Understanding model output covers attribution-based methods, such as saliency maps and class activation maps, which produce heatmaps describing the importance of different parts of an image to the model prediction. We describe the mathematics behind these methods, give examples of their use in medical imaging, and compare them against one another. We summarize several published toolkits for model interpretation specific to medical imaging applications, cover limitations of current model interpretation methods, provide recommendations for deep learning practitioners looking to incorporate model interpretation into their task, and offer general discussion on the importance of model interpretation in medical imaging contexts.
Article
A deep-learning architecture based on Convolutional Neural Networks (CNN) and a cost-effective computer vision module were used to detect defective apples on a four-line fruit sorting machine at a speed of 5 fruits/s. A CNN-based classification architecture was trained and tested, with accuracy, recall, and specificity of 96.5%, 100.0%, and 92.9%, respectively, on the testing set. An inferior performance was obtained by a traditional image processing method based on counting candidate defective regions and a support vector machine (SVM) classifier, with accuracy, recall, and specificity of 87.1%, 90.9%, and 83.3%, respectively. The CNN-based model was loaded into the custom software to validate its performance using 200 independent apples, obtaining an accuracy of 92% with a processing time below 72 ms for six images of an apple. The overall results indicated that the proposed CNN-based classification model had great potential to be implemented in commercial packing lines.
Article
Litchi (Litchi chinensis Sonn.) originated from China and many of its cultivars have been produced in China so far during the long history of cultivation. One problem in litchi production and research is the worldwide confusion regarding litchi cultivar nomenclature. Because litchi cultivars can be described in terms of cultivar-dependent fruit appearance, it should be possible to discriminate cultivars of postharvest fruits. In this study, we explored this possibility using recently developed deep learning technology for four common Taiwanese cultivars 'Gui Wei', 'Hei Ye', 'No Mai Tsz', and 'Yu Her Pau'. First, we quantitatively evaluated litchi fruit shapes using elliptic Fourier descriptors and characterized the relationship between cultivars and fruit shapes. Results suggest that 'Yu Her Pau' can be clearly discriminated from others mainly based on its higher length-to-diameter ratio. We then fine-tuned a pre-trained VGG16 to construct a cultivar discrimination model. Relatively few images were sufficient to train the model to classify fruit images with 98.33% accuracy. We evaluated our model using images of fruits collected in different seasons and locations and found the model could identify 'Yu Her Pau' fruits with 100% accuracy and 'Hei Ye' fruits with 84% accuracy. A Grad-CAM visualization reveals that this model uses different cultivar-dependent regions for cultivar recognition. Overall, this study suggests that deep learning can be used to discriminate litchi cultivars from images of the fruit.
Article
Quality assessment of agricultural products is one of the most important factors in promoting their marketability and waste control management. Image processing systems are new and non-destructive methods that have various applications in the agriculture sector, including product grading. The purpose of this study is to use an improved CNN algorithm to detect the apparent defects of sour lemon fruit, grade them and provide an efficient system to do so. In order to identify and categorize defects, sour lemon images were prepared and placed in two groups: healthy and damaged. After pre-processing, the images were categorized using the improved CNN algorithm. Data augmentation and a stochastic pooling mechanism were used to improve the CNN results. In addition, to compare the proposed model with other methods, feature extraction algorithms (histogram of oriented gradients (HOG) and local binary patterns (LBP)) and k-nearest neighbour (KNN), artificial neural network (ANN), Fuzzy, support vector machine (SVM) and decision tree (DT) classification algorithms were used. The results showed that the accuracy of the convolutional neural network (CNN) was 100%. Therefore, it can be said that the CNN method and image processing are effective in managing waste and promoting the traditional method of sour lemon grading.
Conference Paper
Fruit image classification is the key technology for robotic picking, which can greatly reduce costs and effectively improve fruit producers' competitiveness in the international fruit market. In the image classification field, deep learning technologies, especially DCNNs, are state-of-the-art and have achieved remarkable success. However, their high computation and storage requirements prohibit the use of DCNNs in resource-limited environments such as automatic harvesting robots. Therefore, we need to choose a lightweight neural network to balance resource limitations and recognition accuracy. In this paper, a fruit image classification method based on the lightweight neural network MobileNetV2 with transfer learning was used to recognize fruit images. We used a MobileNetV2 network pre-trained on the ImageNet dataset as a base network and then replaced the top layer of the base network with a conventional convolution layer and a Softmax classifier. We also applied dropout to the newly added conv2d layer to reduce overfitting. The pre-trained MobileNetV2 was used to extract features and the Softmax classifier was used to classify them. We trained this new model in two stages using the Adam optimizer with different learning rates. This method achieved a classification accuracy of 85.12% on our fruit image dataset, which includes 3670 images of 5 fruits. Compared with other networks such as MobileNetV1, InceptionV3 and DenseNet121, this hybrid network, implemented with Google's open source deep learning framework TensorFlow, offers a good compromise between accuracy and speed. Since MobileNetV2 is a lightweight neural network, the method in this paper can be deployed on low-power devices with limited computing resources, such as mobile phones.
Article
A large set of training samples is a prerequisite to effectively learn deep neural networks for image classification. When the labeled samples are scarce, it often fails to produce a promising model. Data augmentation is a widely used technique to overcome the issue, which enlarges the training samples with label invariant transformations, e.g., rotation, flip and random crop etc. However, the diversity of images generated by standard data augmentation is quite limited and thus the final improvement on classification accuracy is not much, especially for fine-grained classification problem. In this paper, we propose a two-stage generative adversarial network, namely Fine-grained Conditional Adversarial Network (F-CGAN), which can produce class-dependent synthetic images with fine-grained details. Moreover, to leverage the synthetic images for fine-grained classification, we develop a multi-task learning classifier, which categorizes training images and synthetic images simultaneously. Experimental results on CUB Birds and Stanford Dogs data sets show that the proposed method indeed improves the classification accuracy.
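The label-invariant transformations named in this abstract (rotation, flip and random crop) can be illustrated with a minimal sketch. The code below is not the F-CGAN method; it only shows the standard augmentation baseline on images represented as nested lists, and all function names and the toy image are assumptions made for illustration.

```python
import random
from typing import List, Tuple

Image = List[List[int]]  # toy grayscale image: rows of pixel values


def horizontal_flip(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def rotate_90_clockwise(img: Image) -> Image:
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def random_crop(img: Image, height: int, width: int, rng: random.Random) -> Image:
    """Cut a height x width window at a random position inside the image."""
    top = rng.randint(0, len(img) - height)
    left = rng.randint(0, len(img[0]) - width)
    return [row[left:left + width] for row in img[top:top + height]]


def augment(img: Image, label: str, rng: random.Random) -> List[Tuple[Image, str]]:
    """Return the original plus three transformed copies, all sharing one label."""
    crop_h, crop_w = len(img) - 1, len(img[0]) - 1
    return [
        (img, label),
        (horizontal_flip(img), label),
        (rotate_90_clockwise(img), label),
        (random_crop(img, crop_h, crop_w, rng), label),
    ]


toy_image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
samples = augment(toy_image, "hypothetical_class", random.Random(0))
print(len(samples))  # 4
```

Every output is a rearrangement or sub-window of the same pixels, which is why the abstract describes the diversity gained from such transforms as limited.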
Article
For noise-robust speech recognition, data mismatch between training and testing is a significant challenge. Data augmentation is an effective way to enlarge the size and diversity of training data and address this problem. In contrast to traditional approaches that directly add noise to the original waveform, in this work we utilize generative adversarial networks (GANs) for data generation to improve speech recognition under noisy conditions. In this paper we investigate different configurations of GANs. First, the basic GAN is applied: the generated speech samples are produced at the spectrum feature level, frame by frame and without dependence among frames, and there are no true labels. Thus, an unsupervised learning framework is proposed to utilize these untranscribed data for acoustic modeling. Then, in order to better guide the data generation, condition information is introduced into the GAN structure, and the conditional GAN is utilized: two different conditions are explored, namely the acoustic state of each speech frame and the original paired clean speech of each speech frame. With the incorporation of specific condition information into data generation, these conditional GANs can provide true labels directly, which can be used for later acoustic modeling. During acoustic model training, these true labels are combined with soft labels, which improves the model. The proposed GAN-based data augmentation approaches are evaluated on two different noisy tasks: Aurora4 (simulated data with additive noise and channel distortion) and the AMI meeting transcription task (real data with significant reverberation). The experiments show that the new data augmentation approaches improve performance under all noisy conditions, including additive noise, channel distortion and reverberation. With data augmented by the basic GAN or the conditional GAN, a relative 6% to 14% WER reduction can be obtained over an advanced acoustic model.
Conference Paper
Automated fruit image classification is a challenging problem. The study presented in this paper analyzes the effectiveness of transfer learning and fine tuning in improving classification accuracy for this problem. For this purpose, Inception v3 and VGG16 models are exploited. The dataset used in this study is the Fruits 360 dataset containing 72 classes and 48,249 images. The paper presents experiments that show that transfer learning and fine tuning can significantly improve fruit image classification accuracy. Transfer learning using the VGG16 model has been demonstrated to give the best classification accuracy of 99.27%. Experiments have also shown that fine tuning using VGG16 and transfer learning using Inception v3 produce quite impressive fruit image classification accuracies. Not only is the effectiveness of transfer learning and fine tuning demonstrated through experiments, but a self-designed 14-layer convolutional neural net has also proven to be exceptionally good at the task, with a classification accuracy of 96.79%.
Article
Physical and imaging properties of apricot fruits are the main factors considered in the design and development of sorting mechanisms. Classification of apricots based on visual appearance was performed using an image processing technique. The apricots were classified into three maturity stages (i.e. unripe, ripe, and overripe) and the volume was estimated. The captured images of fruits were processed using a previously developed automatic algorithm. The images were cropped, filtered, and segmented, after which imaging features of the apricots, including relative R, G, B channels, gray-scale, L*, a*, and b*, were extracted. The volumes of apricots were estimated using the stripping method and multiplying the value by an oval factor. The statistical analysis indicated that there was a significant difference among the maturity stages with respect to the G, gray-scale, L* and b* features. The LDA and QDA classifiers could categorize the apricots with accuracies of 0.904 and 0.923, respectively, based on color features. Results showed that the algorithm can properly classify the fruits using the image properties of apricots.
Article
We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to 1/2 everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.
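The minimax game described in this abstract is usually written as a value function over the two models. The form below uses the abstract's G and D; the symbols $p_{\text{data}}$ (data distribution), $p_z$ (noise prior fed to G) and $p_g$ (distribution of generated samples) are notation introduced here for illustration and do not appear in the abstract itself.

```latex
% Two-player minimax objective: D maximises, G minimises.
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]

% For a fixed G, the optimal discriminator is
D^{*}_{G}(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)}

% so when G recovers the data distribution (p_g = p_{\text{data}}),
D^{*}_{G}(x) = \tfrac{1}{2} \quad \text{for all } x .
```

The last line is the statement in the abstract that D equals 1/2 everywhere at the unique solution: once generated and real samples are identically distributed, the discriminator can do no better than chance.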
Technical Report
In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively.
Conference Paper
In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications. Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning. We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning. Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator. Additionally, we use the learned features for novel tasks - demonstrating their applicability as general image representations.
Conference Paper
The aim of this paper is to develop an effective classification approach based on the Random Forest (RF) algorithm. Three fruits, i.e., apples, strawberries, and oranges, were analysed and several features were extracted based on the fruits' shape and colour characteristics as well as the Scale Invariant Feature Transform (SIFT). A preprocessing stage that uses image processing to prepare the fruit image dataset and reduce its color index is presented. The fruit image features are then extracted. Finally, the fruit classification is performed using random forests (RF), a recently developed machine learning algorithm. A regular digital camera was used to acquire the images, and all manipulations were performed in a MATLAB environment. The approach was evaluated in a series of experiments with 178 fruit images. The results show that the Random Forest (RF) based algorithm provides better accuracy compared to other well-known machine learning techniques such as the K-Nearest Neighborhood (K-NN) and Support Vector Machine (SVM) algorithms. Moreover, the system is capable of automatically recognizing the fruit name with a high degree of accuracy.
Article
The appearances of agricultural products are important indices for evaluating the quality of commodities and the characteristics of different varieties. In general, the appearances are evaluated by experts based on visual observations. However, the concern regarding this method is that it lacks objectivity and is not quantifiable, because it depends greatly on empirical knowledge. In addition, agricultural products have multiple appearance features; therefore, several of them need to be analyzed simultaneously for correct evaluation of the appearance. In this study, we developed a new image analysis system that can simultaneously evaluate multiple appearance characteristics of agricultural products, such as color, shape and size, in detail. To evaluate the effectiveness of this system, we conducted quality evaluations and cultivar identification on the basis of cluster analysis, multidimensional scaling and discriminant analysis of the appearance characteristics. The results of the cluster analysis revealed that strawberries could be classified on the basis of their appearance characteristics. Furthermore, we were able to visualize the small differences in the appearance of the fruit based on multiple characteristics on a two-dimensional surface by performing multidimensional scaling. The results demonstrate that our system is effective for qualitative evaluations of the appearance of strawberries. The results of the discriminant analysis revealed that the accuracy of strawberry cultivar classification using 14 cultivars was <42% when only a single feature was used. However, the rate increased to 68% after combining the three features. These results indicate that our system exploits the advantage of analyzing multiple appearance characteristics.
Article
Mechanical injuries to fruits are often associated with hidden internal damage that results in bruising of the fruit. This is a serious concern for the fruit industry, as spoiled or bruised fruits directly impact the producer's profit. Hyperspectral imaging can identify these internal bruises and classify fruits as normal or injured (bruised), reducing time and increasing efficiency on the sorting line in the marketing chain. In this paper, we have used three types of fruit, i.e., apple, chikoo and guava, for experiments. The mechanical injury is introduced by manual impact on the surface of the fruit samples, and hyperspectral images were captured over nine narrow band-pass filters to produce hyperspectral cubes for each fruit. Three methods were used for the data processing. The first two are non-invasive in nature: the first uses pixel signatures over hyperspectral cubes, and the second is a prediction model for classifying fruit quality into normal and bruised using a feed-forward back-propagation neural network. Finally, an invasive method is used to confirm the prediction model using parameters such as firmness, Total Soluble Solids (TSS) and weight, with Principal Component Analysis. Results obtained by the hyperspectral imaging method indicate scope for non-invasive quality control over the spectral wavelength range of 400-1000 nm.