Figure 2 - available via license: Creative Commons Attribution 3.0 Unported
Content may be subject to copyright.
Illustration of Max Pooling and Average Pooling. Figure 2 above shows an example of max pooling and average pooling with a 2x2 pixel filter applied to a 4x4 pixel input. In max pooling, the maximum value within each filter window is taken, and these values are arranged into a new 2x2 pixel output. In average pooling, the value taken is instead the average of the values within each filter window. The classification layer consists of flattening, hidden layers, and activation functions. Hidden layers in artificial neural networks are the layers between the input layer and the output layer, where artificial neurons take a set of weighted inputs and produce an output through activation functions such as sigmoid [8], ReLU [9], or softmax [10].
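The pooling step and the activation functions described in the caption can be sketched in plain Python. This is only an illustrative sketch: the 4x4 input values are hypothetical (they are not the values shown in Figure 2), and the names `pool2d`, `sigmoid`, `relu`, and `softmax` are chosen here for illustration.

```python
import math


def pool2d(image, size=2, mode="max"):
    """Apply non-overlapping size x size pooling to a 2D list of numbers."""
    rows, cols = len(image), len(image[0])
    output = []
    for r in range(0, rows - size + 1, size):
        out_row = []
        for c in range(0, cols - size + 1, size):
            # Collect the values covered by the current filter window.
            window = [image[r + i][c + j] for i in range(size) for j in range(size)]
            if mode == "max":
                out_row.append(max(window))
            elif mode == "average":
                out_row.append(sum(window) / len(window))
            else:
                raise ValueError("mode must be 'max' or 'average'")
        output.append(out_row)
    return output


def sigmoid(x):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    """Pass positive values through and clip negative values to zero."""
    return max(0.0, x)


def softmax(values):
    """Turn a list of scores into probabilities that sum to 1."""
    largest = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 4x4 input (assumed values, not taken from Figure 2).
image = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [4, 2, 0, 1],
    [3, 1, 9, 2],
]

max_pooled = pool2d(image, mode="max")      # [[7, 8], [4, 9]]
avg_pooled = pool2d(image, mode="average")  # [[4.0, 5.0], [2.5, 3.0]]

# Flattening turns the 2x2 pooled output into a 1D list.
flat = [v for row in max_pooled for v in row]  # [7, 8, 4, 9]

# Applying softmax directly to the flattened values is for illustration only;
# a real classification layer would first pass them through weighted hidden layers.
probs = softmax(flat)
```

Both pooling modes reduce the 4x4 input to a 2x2 output; they differ only in how each 2x2 window is summarized.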

Source publication
Article
Full-text available
Nails are one part of the fingers and toes; by observing the shape and condition of the nails, health experts can find out information about a person’s health. However, this is sometimes not realized and is ignored by society, even though many diseases that can be seen through the condition of the nails and the shape of the nails are one of the system...

Similar publications

Article
Full-text available
In several applications, such as scene interpretation and reconstruction, precise depth measurement from images is a significant challenge. Current depth estimation techniques frequently provide fuzzy, low-resolution estimates. With the use of transfer learning, this research implements a convolutional neural network for generating a high-resolution de...
Article
Full-text available
This paper proposes a method based on deep learning to improve the efficiency of children’s English learning. Firstly, the migration method is used to improve the accuracy of Resnet18 model. Then the accuracy of the model was further improved by adding channel attention mechanism. Finally, the model fine-tuning method is used to reduce the computat...
Article
Full-text available
Aiming at the problem of classification and recognition of noisy handwritten digits, a connection method is proposed to add a spatial transformation network to a convolutional neural network. The spatial transformation network can not only obtain the output results, but also understand the parts of the input data that have the greatest influence on...

Citations

... Average pooling retains a lot of information about the "less essential" parts of a block or pool, whereas max pooling simply discards it by selecting the highest value [32]. ...
Article
Full-text available
In the realm of global food security, plants serve as the primary source of sustenance. However, plant diseases pose a significant threat to this security. The process for diagnosing these diseases forms the bedrock of disease control efforts. The precision and expediency of these diagnoses wield substantial influence over disease management and the consequent reduction of economic losses. This research endeavors to diagnose the prevalent crops in Jordan, as identified by the Jordanian Department of Statistics for the year 2019. These crops encompass four key agricultural varieties: cucumbers, tomatoes, lettuce, and cabbage. To facilitate this, a novel dataset known as “Jordan22” was meticulously curated. Jordan22 was compiled by collecting images of diseased and healthy plants captured on Jordanian farms. These images underwent meticulous classification by a panel of three agricultural specialists well-versed in plant disease identification and prevention. The Jordan22 dataset comprises a substantial size, amounting to 3210 images. The results yielded by the CNN were remarkable, with a test accuracy rate reaching an impressive 0.9712. Optimal performance was observed when images were resized to 256 × 256 dimensions, and max pooling was used instead of average pooling. Furthermore, the initial convolutional layer was set at a size of 32, with subsequent convolutional layers standardized at 128 in size. In conclusion, this research represents a pivotal step towards enhancing plant disease diagnosis and, by extension, global food security. Through the creation of the Jordan22 dataset and the meticulous training of a CNN model, we have achieved substantial accuracy in disease detection, paving the way for more effective disease management strategies in agriculture.
... Figure 2. Max pooling and average pooling [6] ...
Article
Full-text available
CV. Malikus is a cigarette factory located in Blitar Regency, East Java. For its cigarette production, CV Malikus buys ready-made shredded tobacco from tobacco farmers in various provinces. CV Malikus also plays a role in driving the Indonesian economy by producing kretek cigarettes for the lower-middle class, selling cigarettes at affordable prices and recruiting workers with low skill requirements. The Implementation of Shredded Tobacco Type Classification Using the Convolutional Neural Network (CNN) Method at CV. Malikus is based on the needs of the partner, CV. Malikus, whose goals in cigarette production are to increase production efficiency, to assist employees with limited eyesight, and to develop human resources (HR) able to compete in the advancement of deep learning technology. The tool is used to support production efficiency using a microcontroller. The components used include a Logitech C270 HD and a Raspberry Pi 5. These components serve as the main system, and Python, Google Colab, and Roboflow are used for dataset creation and coding. These components are used to build a shredded tobacco classification tool using the CNN (Convolutional Neural Network) method.
... FIGURE 9 depicts a basic max pooling layer and an average pooling layer utilizing a 2×2 sliding window. FIGURE 9 illustrates the difference between max pooling and average pooling, with max pooling taking the largest value in each window, while average pooling takes the average [29]. ...
Preprint
Full-text available
In the realm of global food security, plants serve as the primary source of sustenance. However, plant diseases pose a significant threat to this security. The process of diagnosing these diseases forms the bedrock of disease control efforts. The precision and expediency of these diagnoses wield substantial influence over disease management and the consequent reduction of economic losses. Conversely, incorrect diagnoses can render interventions ineffective, leading to agricultural crop deterioration and compounding economic hardships for both farmers and their respective nations. This research endeavors to diagnose the prevalent crops in Jordan, as identified by the Jordanian Department of Statistics for the year 2019. These crops encompass four key agricultural varieties: cucumbers, tomatoes, lettuce, and cabbage. To facilitate this, a novel dataset known as "Jordan 22" was meticulously curated. Jordan 22 was painstakingly compiled through the collection of images featuring both diseased and healthy plants, captured within the confines of Jordanian farms. These images underwent meticulous classification by a panel of three agricultural specialists, well-versed in plant disease identification and prevention. The Jordan 22 dataset comprises a substantial size, amounting to 3210 images. Following the compilation of this dataset, a series of preprocessing steps were executed. These encompassed the standardization of image backgrounds and the uniformization of image dimensions. Furthermore, image augmentation techniques were applied to the dataset to expand its diversity. Subsequently, a deep learning model, the Convolutional Neural Network (CNN), was meticulously trained on the augmented dataset. The results yielded by the CNN were nothing short of remarkable, with a test accuracy rate reaching an impressive 0.9712. 
Optimal performance was observed when images were resized to 256x256 dimensions, and max pooling was employed in lieu of average pooling within the pooling layer. Furthermore, the initial convolutional layer was set at a size of 32, with subsequent convolutional layers standardized at 128 in size. In conclusion, this research represents a pivotal step towards enhancing plant disease diagnosis and, by extension, global food security. Through the creation of the Jordan 22 dataset and the meticulous training of a CNN model, we have achieved substantial accuracy in disease detection, paving the way for more effective disease management strategies in agriculture.
... The accuracy metric, defined by the formula (4), calculates the proportion of true positive and true negative predictions against all predictions. Error ratio, computed as 1 minus accuracy, (Yani et al. 2019) alternatives to detect classes. Sizes and parameters of the network are changed with MOO algorithms. ...
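The accuracy and error ratio described in this excerpt can be sketched as follows. Formula (4) is not reproduced in the excerpt, so the functions below assume the usual confusion-matrix definition of accuracy for the binary case; the function names and the counts are hypothetical and serve only as an illustration.

```python
def accuracy(tp, tn, fp, fn):
    """Proportion of true positive and true negative predictions among all predictions."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one prediction is required")
    return (tp + tn) / total


def error_ratio(tp, tn, fp, fn):
    """Error ratio, computed as 1 minus accuracy."""
    return 1.0 - accuracy(tp, tn, fp, fn)


# Hypothetical counts: 40 true positives, 50 true negatives,
# 5 false positives, 5 false negatives (100 predictions in total).
acc = accuracy(tp=40, tn=50, fp=5, fn=5)      # 0.9
err = error_ratio(tp=40, tn=50, fp=5, fn=5)   # approximately 0.1
```

For a multi-class problem, the same idea applies with the correctly classified samples in the numerator and all samples in the denominator.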
Article
Full-text available
The coronavirus first occurred in Wuhan (China) and was declared a global pandemic. X-ray images can be used to detect coronavirus. Convolutional neural networks (CNNs) are commonly used to detect illness from images. There can be many alternative deep CNN models or architectures. To find the best architecture, hyper-parameter optimization can be used. In this study, the problem is modeled as a multi-objective optimization (MOO) problem. Objective functions are multi-class cross entropy, error ratio, and complexity of the CNN network. For the best solutions to the objective functions, multi-objective hyper-parameter optimization is made by NSGA-III, NSGA-II, R-NSGA-II, SMS-EMOA, MOEA/D, and proposed Swarm Genetic Algorithms (SGA). SGA is a swarm-based algorithm with a cross-over process. All six algorithms are run and give Pareto optimal solution sets. When the figures obtained from the algorithms are analyzed and algorithm hypervolume values are compared, SGA outperforms the NSGA-III, NSGA-II, R-NSGA-II, SMS-EMOA, and MOEA/D algorithms. It can be concluded that SGA is better than the other multi-objective hyper-parameter optimization algorithms for COVID-19 detection from X-ray images. Also, a sensitivity analysis has been made to understand the effect of the number of parameters of the CNN on model success.
... Convolution and Transpose Convolution [26] Figure 2 Maxpooling [26] The U-Net model used in this paper consists of the following layers, shown in figures 1 and 2: 1) Encoder: The encoder consists of a series of convolutional and Maxpooling layers. The encoder extracts features from the input image. ...
Article
Full-text available
As much research work is carried out using Convolutional Neural Networks in the image processing field, this paper presents a ship detection system from satellite images using a U-Net model. Furthermore, a comparative analysis of the U-Net model with the ResNet52 model has been carried out. The system is trained on the Airbus Ship Detection Dataset, which contains satellite images of different sizes and of resolution 768 x 768. The images are preprocessed using run-length decoding to create masks out of the csv file, removing images smaller than 50 KB, and image augmentation. The U-Net and ResNet52 models are trained using a combination of the Dice coefficient and Combo loss functions. Early stopping is used to prevent overfitting. The U-Net model achieves an accuracy of 98% and an F1 score of 88% on the test set, compared to ResNet52 with an accuracy of 88% and an F1 score of 80%. Hence, the U-Net model demonstrates its effectiveness in detecting ships in satellite images.
... Pooling Layer example. Left Max Pooling, Right Average Pooling[23] ...
Thesis
Full-text available
The prevalence of Alzheimer’s disease and the critical role of early diagnosis have accelerated the demand for advanced diagnostic tools. This paper explores the application of deep learning techniques, specifically convolutional neural networks (CNNs), to classify Alzheimer's disease stages from MRI images into Non-Demented, Very Mild Demented, Mild Demented, and Moderate Demented categories. This study aims to improve the accuracy of diagnosing Alzheimer's disease by leveraging the capabilities of deep learning algorithms. Utilizing a dataset of brain MRI images, this study develops and trains deep learning models to automate the analysis process, aiming to enhance diagnostic accuracy and efficiency. The research assesses the accuracy, sensitivity, and specificity of several deep learning models, demonstrating their potential to support radiologists by providing reliable, quick, and precise assessments of MRI scans. This study presents exploratory research on the application of deep learning models to classify Alzheimer's disease stages using convolutional neural networks (CNNs). These findings contribute to existing knowledge in the field of medical image analysis. Although the model developed is still in its early stages, it could serve as a starting point for further research and could help radiologists make decisions. In conclusion, this study underscores the significant potential of utilizing deep learning models to refine and accelerate the diagnostic process in neurology. By employing CNN models tailored for analyzing brain MRI images, the research demonstrates a promising path towards enhancing the accuracy and speed of patient diagnoses. This advancement in diagnostic technology not only optimizes clinical workflows but also ensures that patients receive timely and precise evaluations, thereby improving treatment outcomes and patient care standards.
Keywords: Deep Learning, Convolutional Neural Networks (CNNs), Alzheimer’s disease classification, MRI images
... Max pooling and average pooling [55] Fully connected is at the end of the CNN architecture which functions as the final classification of each network of previously interconnected neurons [56], so the purpose of the Fully connected layer is to transform the raw image input into a more meaningful representation that is easily understood by the network. The Fully connected layers can be seen in figure 6. ...
Article
Herbal plants are a source of natural materials used in alternative medicine and traditional therapies to maintain health. The purpose of this research is to develop an intelligent system application that is able to assist people in independently detecting herbal plants around them, provide education, and most importantly, find the optimal value based on certain parameters. This research uses several values for the parameters studied, namely the epoch value which varies between 10, 50, 100, 250, 750, and 1000; the batch size value which varies between 16, 32, 64, 128, 256, and 512; and the learning rate value which varies between 0.00001, 0.0001, 0.001, 0.01, 0.1, and 1. A total of 10,000 training data samples (1,000 samples in 10 classes) were used in Teachable Machine. The method used is to utilize the TensorFlow framework in the Teachable Machine service to train image data. This framework provides Convolutional Neural Networks (CNN) algorithms that can perform image classification with a high degree of accuracy. The test results for more than three months showed that the highest optimal value was achieved at the 50th epoch value, with a learning rate of 0.00001, and a batch size of 32, which resulted in an accuracy rate between 98% and 100%. Based on these results, a mobile web-based intelligent system application service was developed using the TensorFlow framework in Teachable Machine. This application is expected to be widely implemented for the benefit of the community. However, the challenges and limitations in training this test data are the large number of data classes that will be very good so that machine learning can learn to recognize objects but will take hours to train, then the training image object data has a clean background from other objects so that when tested it is not detected and influenced as another object or can result in a decrease in the percentage value.
... (1) Convolutional layer Typically, CNNs consist of an input layer, an output layer, and hidden layers [11]. The following formula can be used to represent the mathematical model of a convolutional layer [11,12]: ...
... The ( * ) denotes the dot products of the convolutional operation; M j denotes the number of input maps; l denotes the network's lth layer; k denotes the kernel matrix, which has the dimensions S × S (for example, a kernel size of 3 × 3), and f is the non-linear activation function. There is a multiplicative bias β and an additive bias b assigned to each output map, but its exact form depends on the specific pooling method used [8,12]. ...
... It comprises three convolution layers with filter counts of 8, 16, and 32, each with a 3 × 3 size. These layers are followed by MaxPooling [12] and dropout layers with dropout rates of 50%, 20% and 20%, respectively. The selection of a 3 × 3 kernel is based on the need to classify vibration signals with three axes, allowing the capture of spatial information along all three axes for precise classification. ...
Article
Full-text available
Industrial fans are critical components in industrial production, where unexpected damage of important fans can cause serious disruptions and economic costs. One trending market segment in this area is where companies are trying to add value to their products to detect faults and prevent breakdowns, hence saving repair costs before the main product is damaged. This research developed a methodology for early fault detection in a fan system utilizing machine learning techniques to monitor the operational states of the fan. The proposed system monitors the vibration of the fan using an accelerometer and utilizes a machine learning model to assess anomalies. Several of the most widely used algorithms for fault detection were evaluated and their results benchmarked for the vibration monitoring data. It was found that a simple Convolutional Neural Network (CNN) model demonstrated notable accuracy without the need for feature extraction, unlike conventional machine learning (ML)-based models. Additionally, the CNN model achieved optimal accuracy within 30 epochs, demonstrating its efficiency. Evaluating the CNN model performance on a validation dataset, the hyperparameters were updated until the optimal result was achieved. The trained model was then deployed on an embedded system to make real-time predictions. The deployed model demonstrated accuracy rates of 99.8%, 99.9% and 100.0% for Fan-Fault state, Fan-Off state, and Fan-On state, respectively, on the validation data set. Real-time testing further confirmed high accuracy scores ranging from 90% to 100% across all operational states. Challenges addressed in this research include algorithm selection, real-time deployment onto an embedded system, hyperparameter tuning, sensor integration, energy efficiency implementation and practical application considerations. 
The presented methodology showcases a promising approach for efficient and accurate fan fault detection with implications for broader applications in industrial and smart sensing applications.
... Max and average pooling[21] ...
Article
Full-text available
Breast cancer is one of the most common causes of death worldwide. It has a substantially higher death rate than other types of cancer. Early detection can enhance the chances of receiving proper treatment and survival. In order to address this problem, this work provides a convolutional neural network (CNN) deep learning (DL) based classification model that may be used to differentiate breast cancer histopathology images as benign or malignant. Besides that, five different types of pre-trained CNN architectures have been used to investigate the performance of the model on this problem, which are the residual neural network-50 (ResNet-50), visual geometry group-19 (VGG-19), Inception-V3, and AlexNet, while ResNet-50 also functions as a feature extractor that retrieves information from images and passes it to machine learning algorithms; in this case, a random forest (RF) and k-nearest neighbors (KNN) are employed for classification. In this paper, experiments are done using the BreakHis public dataset. As a result, the ResNet-50 network has the highest test accuracy of 97% in classifying breast cancer images.
... Many common transfer learning tendencies exist, including VGG, ResNet, Inception, and others. The pre-trained CNN version Inception V3 [13] [14]. ...
Article
Full-text available
One of the most difficult challenges is recognizing human actions, especially in still images where there isn't much movement. Therefore, using the transfer learning strategy, we suggested a technique for identifying human action, which consists of training some of the layers of deep learning techniques while freezing others. We also presented a method for splitting the data, which is to choose some frames because we are working on a large dataset such as UCF-101; this method is summarized as discovering the features for each frame, then clustering the elements, and then choosing a percentage of each cluster for training and test data. We used three techniques: VGG16, Inception V3, and Xception. The proposed models have been implemented on the UCF-101 dataset. Depending on three data split methods with the dataset, the random split method, and the proposed split method, the Inception V3 achieved the highest accuracy. In contrast, the VGG16 achieved the lowest accuracy, and the accuracy of the Xception was close to that of the Inception V3. By comparing the size of the dataset, the proposed methods achieved good results: the VGG16 in the proposed split attained an accuracy of 92.5%, the Inception V3 in the proposed split attained an accuracy of 98.12%, and the Xception in the proposed split attained an accuracy of 95.16%. The VGG16 network is simple, so the VGG16 is less accurate. While the network in Inception V3 and Xception is more extensive and complex, the learning space is more significant, although the network size is more prominent in Inception V3 and Xception. We only trained some blocks in the top layer.