Article · PDF Available

Abstract

Breast cancer is the most prevalent cancer among women. Ultrasound imaging is a widely employed method for identifying and diagnosing breast abnormalities. Computer-aided diagnosis technologies based on ultrasound images have recently been developed to help radiologists improve diagnostic accuracy. This paper reviews ultrasound image segmentation techniques, focusing mainly on eight clustering methods from the last 10 years, and discusses the advantages and disadvantages of these approaches. Breast ultrasound image segmentation remains an open and challenging problem due to the numerous artifacts introduced in the imaging process, including strong speckle noise, poor contrast, blurry edges, a weak signal-to-noise ratio, and intensity inhomogeneity.
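As a concrete illustration of the clustering family the survey covers, the sketch below clusters pixel intensities of a grayscale breast ultrasound image with K-means. The file path, the choice of k=3, and the darkest-cluster heuristic are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch: pixel-intensity K-means on a grayscale BUS image.
import numpy as np
from skimage import io
from sklearn.cluster import KMeans

image = io.imread("bus_image.png", as_gray=True)       # placeholder path; values in [0, 1]
pixels = image.reshape(-1, 1)                          # one intensity feature per pixel

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(pixels)
labels = kmeans.labels_.reshape(image.shape)           # cluster map over the image

# Lesions in BUS images are typically hypoechoic (dark), so the darkest
# cluster is a rough candidate mask; real pipelines add speckle denoising
# and post-processing before and after this step.
mask = labels == np.argmin(kmeans.cluster_centers_)
```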
... Image segmentation attempts to divide an image into non-overlapping areas. Techniques that have been applied to this task include thresholding, watershed-based methods, graph-based methods, clustering, and region-based approaches [22]. Thresholding methods are often exceedingly basic, seeking to segment grayscale images based on their intensity values [23]; smooth boundaries and unimodal (pixel-value) histograms pose problems for this method [22]. Watershed-based methods treat the image as a contour map and seek to find its lowest/highest points [24], but they are sensitive to noise and over-segmentation and are computationally expensive because gradient calculation is required [22]. Graph-based methods partition images using nodes and edges, where the edges never overlap, and have been used for region of interest (ROI) identification in breast cancer [25]. ...
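A compact sketch of the first two families mentioned above, assuming scikit-image and SciPy are available; the percentile-based seed rule for the watershed markers is a simplification invented for illustration.

```python
# Sketch of two classical segmentation baselines.
import numpy as np
from scipy import ndimage as ndi
from skimage import io, filters, segmentation

image = io.imread("bus_image.png", as_gray=True)          # placeholder path

# Thresholding: a single global intensity cutoff
# (struggles with smooth boundaries and unimodal histograms).
mask = image > filters.threshold_otsu(image)

# Watershed: treat the gradient magnitude as a topographic surface and
# flood it from seed markers; note the (costly) gradient computation.
gradient = filters.sobel(image)
markers = ndi.label(image < np.percentile(image, 10))[0]  # crude seed regions
ws_labels = segmentation.watershed(gradient, markers)
```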
Article
Full-text available
(1) Background: The odds of a female breast cancer diagnosis have increased from 1 in 11 in 1975 to 1 in 8 today. Mammography false positive rates (FPR) are associated with overdiagnosis and overtreatment, while false negative rates (FNR) increase morbidity and mortality. (2) Methods: Deep vision supervised learning classifies 299 × 299 pixel de-noised mammography images as negative or non-negative using models built on 55,890 pre-processed training images and applied to 15,364 unseen test images. A small image representation from the fitted training model is returned to evaluate the portion of the loss-function gradient, with respect to the image, that maximizes the classification probability. This gradient is then re-mapped back to the original images, highlighting the areas of the original image that are most influential for classification (perhaps masses or boundary areas). (3) Results: Initial classification results were 97% accurate, 99% specific, and 83% sensitive. Gradient techniques for unsupervised region-of-interest mapping clearly identified the areas most associated with the classification results on positive mammograms and might be used to support clinician analysis. (4) Conclusions: Deep vision techniques hold promise for addressing overdiagnosis and overtreatment, underdiagnosis, and automated region-of-interest identification in mammography.
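The gradient-mapping idea described here is essentially saliency mapping. A minimal PyTorch sketch of that general technique (not the authors' exact model or preprocessing) might look like this:

```python
# Hedged sketch of gradient-based saliency for a trained classifier.
import torch

def saliency_map(model: torch.nn.Module, image: torch.Tensor) -> torch.Tensor:
    """Return |d score / d pixel| for a single image tensor, e.g. 1x3x299x299."""
    model.eval()
    image = image.clone().requires_grad_(True)
    score = model(image).max()          # logit of the most probable class
    score.backward()                    # gradient of that logit w.r.t. the input
    return image.grad.abs().squeeze(0)  # high values = influential pixels
```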
... Researchers present a number of computer vision-based automated methods for breast cancer classification using ultrasound images [44,45]. A few of them concentrated on a segmentation step followed by feature extraction [46], while a few extracted features from raw images. In a few studies, researchers used a preprocessing step to improve the contrast of the input images and highlight the infected region for better feature extraction [47]. ...
Article
Full-text available
After lung cancer, breast cancer is the second leading cause of death in women. If breast cancer is detected early, mortality rates in women can be reduced. Because manual breast cancer diagnosis takes a long time, an automated system is required for early cancer detection. This paper proposes a new framework for breast cancer classification from ultrasound images that employs deep learning and the fusion of the best selected features. The proposed framework is divided into five major steps: (i) data augmentation is performed to increase the size of the original dataset for better learning of Convolutional Neural Network (CNN) models; (ii) a pre-trained DarkNet-53 model is considered and the output layer is modified based on the augmented dataset classes; (iii) the modified model is trained using transfer learning and features are extracted from the global average pooling layer; (iv) the best features are selected using two improved optimization algorithms known as reformed differential evolution (RDE) and reformed gray wolf (RGW); and (v) the best selected features are fused using a new probability-based serial approach and classified using machine learning algorithms. The experiment was conducted on an augmented Breast Ultrasound Images (BUSI) dataset, and the best accuracy was 99.1%. When compared with recent techniques, the proposed framework outperforms them.
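Step (iii), transfer-learning feature extraction, can be pictured with a short PyTorch sketch. torchvision does not ship DarkNet-53, so ResNet-50 stands in here; its global-average-pooled output plays the same role as the layer the authors use.

```python
# Sketch: extract pooled CNN features from a pre-trained backbone
# (ResNet-50 as a stand-in for DarkNet-53; requires a recent torchvision).
import torch
from torchvision import models

backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()       # drop the classifier; keep pooled features
backbone.eval()

with torch.no_grad():
    batch = torch.randn(8, 3, 224, 224) # stand-in for preprocessed BUSI images
    features = backbone(batch)          # shape (8, 2048): one vector per image
```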
... In recent years, many new variant algorithms [17][18][19][20] have been developed to improve it further. Machine learning is a branch of computer science that enables computers to learn tasks without being explicitly programmed to do so [21][22][23]. Introducing a cost function to machine learning and data mining enables machines to discover appropriate weights for outcomes [24]. Optimization finds the function parameters in a way that makes the solution of the problem simpler. ...
Article
Full-text available
Cancer is a manifestation of disorders caused by changes in the body's cells that go far beyond healthy development and stabilization. Breast cancer is a common disease: according to statistics from the World Health Organization (WHO), 7.8 million women are diagnosed with breast cancer. Breast cancer is a malignant tumor that normally develops from cells in the breast. Machine learning (ML) approaches, on the other hand, provide a variety of probabilistic and statistical ways for intelligent systems to learn from prior experience to recognize patterns in a dataset that can then be used for decision making. This work aims to build a deep learning-based model for the prediction of breast cancer with better accuracy. A novel deep extreme gradient descent optimization (DEGDO) method has been developed for breast cancer detection. The proposed model consists of two stages, training and validation. The training phase, in turn, consists of three major layers: a data acquisition layer, a preprocessing layer, and an application layer. The data acquisition layer takes the data and passes it to the preprocessing layer. In the preprocessing layer, noise and missing values are converted to normalized values, which are then fed to the application layer. In the application layer, the model is trained with the deep extreme gradient descent optimization technique. The trained model is stored on the server. In the validation phase, it is imported to process the actual data to be diagnosed. This study used the Wisconsin Breast Cancer Diagnostic dataset to train and test the model. The results obtained by the proposed model outperform many other approaches, attaining 98.73% accuracy, 99.60% specificity, 99.43% sensitivity, and 99.48% precision.
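The train/validate pipeline on the same Wisconsin dataset can be pictured with a generic scikit-learn baseline; this is an ordinary MLP with standard normalization, not the authors' DEGDO optimizer.

```python
# Generic train/validate sketch on the Wisconsin Diagnostic dataset.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Normalization mirrors the preprocessing layer; the MLP mirrors the application layer.
clf = make_pipeline(StandardScaler(), MLPClassifier(max_iter=1000, random_state=0))
clf.fit(X_train, y_train)
print(f"validation accuracy: {clf.score(X_test, y_test):.3f}")
```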
... Likewise, the watermark-embedding algorithm embeds the watermark into the cover image, and the message is extracted using the watermark-extraction algorithm. Watermarking techniques can be categorized in several ways; depending on the domain used to embed the information, watermarks are classified into spatial-domain and transform-domain methods [3][4]. Most transform-domain watermarking algorithms consist of three steps: data transform, watermark embedding, and watermark recovery; among the transforms, the discrete wavelet transform (DWT) offers a favorable trade-off in computational time overhead. ...
Conference Paper
Full-text available
Digital watermarking is getting more research and industry attention. Digital multimedia data allows for robust and simple data editing and modification; however, the spread of digital media presents concerns for digital content owners, because digital data can be copied without loss of quality or content. This has a considerable impact on copyright holders' ability to safeguard their intellectual property rights. Digital watermarking is the method of transmitting information by imperceptibly embedding it into digital media. Various methods exist in the literature, such as DWT- and DCT-based schemes, but new strategies and optimization procedures are required. The present study proposes a novel design and computation technique based on the discrete wavelet and discrete cosine transforms. Watermarking techniques have been progressing to shield media content such as text, audio, and video from copyright infringement. The proposed hybrid DWT-DCT Bacterial Foraging Optimization (BFO) technique improves the efficiency of watermarking digital images by 97%. Bacterial foraging optimization is an innovative technique for intelligent optimization used in a wide variety of applications; however, compared to other optimizers, BFO performs poorly in terms of convergence. The proposed technique embeds in a high-frequency image region. The variants are compared using Normalized Cross-Correlation (NCC), Peak Signal-to-Noise Ratio (PSNR), and Image Fidelity (IF). The highest performance is achieved by DWT-DCT-BFO watermarking.
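A minimal sketch of the hybrid DWT-DCT embedding step, assuming PyWavelets and SciPy are available. The Haar wavelet, the high-frequency HH sub-band, the embedding strength alpha, and the random arrays are illustrative choices; the BFO search over embedding parameters is omitted.

```python
# Sketch: additive watermark embedding in the DCT of the DWT high-frequency band.
import numpy as np
import pywt
from scipy.fft import dctn, idctn

def embed(cover: np.ndarray, watermark: np.ndarray, alpha: float = 0.05) -> np.ndarray:
    LL, (LH, HL, HH) = pywt.dwt2(cover, "haar")     # one-level 2-D DWT
    coeffs = dctn(HH, norm="ortho")                 # DCT of the high-frequency band
    h, w = watermark.shape
    coeffs[:h, :w] += alpha * watermark             # additive embedding
    HH_marked = idctn(coeffs, norm="ortho")
    return pywt.idwt2((LL, (LH, HL, HH_marked)), "haar")

cover = np.random.rand(256, 256)                    # stand-in cover image
mark = np.random.rand(32, 32)                       # stand-in watermark
watermarked = embed(cover, mark)
```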
... In recent times, different techniques have been employed for extracting features, for example, texture features, Gabor features, the co-occurrence matrix, and more [10]. Indeed, the Gray Level Co-occurrence Matrix (GLCM) and the Local Binary Pattern (LBP) are the most common and robust techniques for texture description and are widely utilized in image-analysis applications [11,12,24,25,26]. The main motivation of this paper is that LBP, FD, and GLCM are the most common, as well as the most efficient, techniques used in the literature to produce high-level features. ...
Article
Full-text available
Coronavirus (COVID-19) is a new contagious disease caused by a novel virus, widely spread over the world, that had never before been identified in humans. The virus can cause respiratory disease, similar to flu, with symptoms such as fever, headache, cough, and pneumonia. The presence of COVID-19 in humans can be tested through blood or sputum samples, but the result can take days to obtain. Furthermore, biomedical image analysis assists in showing signs of pneumonia in a patient. Therefore, this paper aims to provide a fully automatic COVID-19 identification system by proposing a new fusion scheme of texture features for CT scan images. It presents a fusion scheme based on a machine learning system using three significant texture features, namely the Local Binary Pattern (LBP), Fractal Dimension (FD), and Grey Level Co-occurrence Matrix (GLCM). In the experiments, to demonstrate the efficiency of the proposed scheme, 300 CT scan images were collected from a publicly available database. The experimental results show that LBP, FD, and GLCM obtained accuracies of 89.87%, 87.84%, and 90.98%, respectively, while the proposed scheme yields better results, achieving 96.91% accuracy.
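Two of the three texture descriptors are available directly in scikit-image; the sketch below shows their extraction (the fractal-dimension feature and the fusion/classification stages are omitted, and the file path is hypothetical).

```python
# Sketch: LBP and GLCM texture features from a grayscale CT slice.
import numpy as np
from skimage import io
from skimage.feature import local_binary_pattern, graycomatrix, graycoprops

img = (io.imread("ct_slice.png", as_gray=True) * 255).astype(np.uint8)  # placeholder path

# LBP histogram: local texture micro-patterns ("uniform" gives P + 2 = 10 codes).
lbp = local_binary_pattern(img, P=8, R=1, method="uniform")
lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)

# GLCM statistics: co-occurrence of gray levels at a given offset and angle.
glcm = graycomatrix(img, distances=[1], angles=[0], levels=256,
                    symmetric=True, normed=True)
glcm_feats = [graycoprops(glcm, p)[0, 0] for p in ("contrast", "homogeneity", "energy")]
```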
... Experts use ultrasound in image-inspection systems because it produces high-frequency sound waves that permeate the human body; as the waves bounce off boundary tissue, distinctive echoes are created, which the computer uses to produce the image. Some palpable lesions (lumps) can thereby be evaluated accurately to identify cysts in the human body, especially in the breast, which can help the radiologist to detect lumps [39][40]. ...
Article
Full-text available
Medical image segmentation plays an essential role in computer-aided diagnostic systems in various applications. Researchers are therefore attracted to applying new algorithms to medical image processing, given the massive investment in developing medical imaging methods such as dermatoscopy, X-ray, microscopy, ultrasound, computed tomography (CT), positron emission tomography, and magnetic resonance imaging (MRI). Segmentation is considered one of the most important medical imaging processes because it extracts the region of interest (ROI) through an automatic or semi-automatic process. The medical image is divided into regions based on specific descriptions, such as tissue/organ division, for border detection and for comprehensive and accurate tumor detection/segmentation. Several segmentation methods have been proposed in the literature, but their efficacy is difficult to compare; to better address this issue, a variety of evaluation standards have been suggested to judge the quality of a segmentation outcome. Unsupervised ranking criteria use statistics computed from the original image rather than a ground-truth reference. The key aim of this paper is to review some of the literature on unsupervised algorithms (K-means, K-medoids) and to compare the working efficiency of unsupervised algorithms on different types of medical images.
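A toy comparison of the two unsupervised algorithms the paper reviews, assuming scikit-learn plus the scikit-learn-extra package for KMedoids; the random array stands in for flattened pixel intensities from a medical image.

```python
# Sketch: K-means versus K-medoids on pixel-intensity features.
import numpy as np
from sklearn.cluster import KMeans
from sklearn_extra.cluster import KMedoids  # pip install scikit-learn-extra

pixels = np.random.rand(5000, 1)            # stand-in for flattened image intensities

km_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(pixels)
kmed_labels = KMedoids(n_clusters=3, random_state=0).fit_predict(pixels)
# Medoids are actual data points, so K-medoids is less sensitive to
# outlier intensities than K-means' mean-based centroids.
```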
... Machine learning is now used in all fields of computing, where algorithms are developed and performance is enhanced (Abdulqader et al., 2020; Chaudhary & Vasuja, 2019; Zeebaree et al., 2019a; Jahwar & Abdulazeez, 2020; Maulud & Abdulazeez, 2020). Learning from imbalanced data sets has been a key challenge in machine learning in recent years and arises in many applications, such as the information security, engineering, remote sensing, biomedicine, and transformation industries (Abdulqader et al., 2020; Muhammad et al., 2020; Anuradha & Reddy, 2008). Classification and regression are supervised learning approaches, where the target variable is categorical in classification and continuous in regression (Zeebaree et al., 2019b; Sethi & Mittal, 2019). ...
Article
Full-text available
In the last few decades, there has been a considerable amount of research on the use of Machine Learning (ML) for speech recognition based on Convolutional Neural Networks (CNN). These studies generally focus on using CNNs for applications related to speech recognition. Additionally, various works based on deep learning are discussed, covering the period since its emergence in speech recognition applications. Compared to other approaches, deep learning approaches are showing rather interesting outcomes in several applications, including speech recognition, and they have therefore attracted a great deal of research. In this paper, a review is presented of the developments that have occurred in this field, while also discussing current research on the topic.
Article
Full-text available
Due to sharp increases in data dimensions, every data mining or machine learning (ML) task requires more efficient techniques to get the desired results. Therefore, in recent years, researchers have proposed and developed many methods and techniques to reduce the high dimensionality of data while attaining the required accuracy. To improve the accuracy of learned features, as well as to decrease training time, dimensionality reduction is used as a pre-processing step that can eliminate irrelevant data, noise, and redundant features. Dimensionality reduction (DR) is performed with two main families of methods: feature selection (FS) and feature extraction (FE). FS is an important method because data is generated continuously at an ever-increasing rate; it can mitigate serious dimensionality problems by effectively decreasing redundancy, eliminating irrelevant data, and improving result comprehensibility. FE, meanwhile, deals with the problem of finding the most distinctive, informative, and reduced set of features to improve the efficiency of both the processing and storage of data. This paper offers a comprehensive review of FS and FE in the scope of DR. The details of each paper, such as the algorithms/approaches used, datasets, classifiers, and achieved results, are comprehensively analyzed and summarized. Furthermore, a systematic discussion of all of the reviewed methods is given to highlight the authors' trends, determine the method(s) that significantly reduced computational time, and select the most accurate classifiers. Finally, the different types of both methods are discussed and their findings analyzed.
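The FS/FE contrast can be made concrete with scikit-learn: SelectKBest keeps a subset of the original features, while PCA constructs new combined ones. The dataset and k=10 are illustrative choices.

```python
# Sketch: filter-style feature selection (FS) versus projection-based
# feature extraction (FE) on a 30-feature dataset.
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_breast_cancer(return_X_y=True)               # 30 original features

X_fs = SelectKBest(f_classif, k=10).fit_transform(X, y)  # keep 10 original features
X_fe = PCA(n_components=10).fit_transform(X)             # build 10 new combined features
print(X_fs.shape, X_fe.shape)                            # (569, 10) (569, 10)
```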
Article
Full-text available
High-dimensional data is described by a considerable number of features, which presents new problems for clustering. The term "high dimensional" originally described the common increase in time complexity of many computational problems in such settings, under which the performance of general clustering algorithms breaks down. Accordingly, many works have focused on introducing new techniques and clustering algorithms to process data of higher dimensions. Common to all clustering algorithms is the fact that they need some essential evaluation of the similarity between data objects; however, current clustering algorithms still have open research problems. This review summarizes the properties of high-dimensional data spaces and their effects on different clustering algorithms. It also provides a detailed overview of several clustering algorithms of various types (subspace methods, model-based clustering, density-based clustering methods, partition-based clustering methods, etc.), including a more detailed description of recent work and its advantages and disadvantages in solving the problem of higher-dimensional data. The scope of future work to extend existing clustering methods and algorithms is also discussed at the end of the paper.
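The distance-concentration effect driving these difficulties can be demonstrated in a few lines: as dimensionality grows, the relative contrast between the nearest and farthest pairwise distances collapses, so similarity-based clustering loses its signal.

```python
# Sketch: relative distance contrast shrinks as dimensionality grows.
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
for d in (2, 100, 10000):
    dists = pdist(rng.random((200, d)))                   # all pairwise distances
    print(d, (dists.max() - dists.min()) / dists.mean())  # contrast drops with d
```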
Conference Paper
Full-text available
It is well known that early diagnosis is very important for cancer patients. One of the imaging techniques used in the diagnosis of breast cancer is ultrasonography. In this study, a system that helps the doctor to detect lesions in the breast is proposed. We used the K-means clustering algorithm to detect the lesion in the images. The effects of three different filters (Median, Laplace, Sobel) were examined in the study, and different partitioning settings were considered. According to the accuracy rates we obtained, accuracy increased as the number of partitions increased. In addition, the Median filter was the best filter compared to the other filters.
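A sketch of the study's best-performing combination, median filtering followed by K-means with an increasing number of partitions; the file path and filter size are assumptions, and the evaluation against ground truth is omitted.

```python
# Sketch: median filter + K-means with a sweep over the partition count k.
from scipy.ndimage import median_filter
from skimage import io
from sklearn.cluster import KMeans

image = io.imread("bus_lesion.png", as_gray=True)       # placeholder path
filtered = median_filter(image, size=5)                 # the filter the study found best

for k in (2, 3, 4, 5):                                  # accuracy rose with more partitions
    km = KMeans(n_clusters=k, n_init=10, random_state=0)
    labels = km.fit_predict(filtered.reshape(-1, 1)).reshape(image.shape)
```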
Article
Image segmentation is an active research topic in image processing, and fuzzy C-means (FCM) clustering analysis has been widely used for it. Because medical images contain a large amount of delicate tissue, such as blood vessels and nerves, noise generated during the imaging process can easily prevent successful segmentation of these tissues, and the traditional FCM algorithm is not ideal for segmenting images containing strong noise. In this study, we proposed an improved FCM algorithm with anti-noise capability. We first discussed a dictionary-learning algorithm for noise reduction. We then developed a new image segmentation algorithm combining dictionary learning for noise reduction with improved fuzzy C-means clustering. Lastly, we used the improved FCM algorithm to segment images, during which we removed non-target areas using the grayscale features of the images and accurately extracted the areas of interest. The algorithm was tested using synthetic Shepp-Logan images and real medical magnetic resonance imaging (MRI) and computed tomography (CT) images. Compared to the synthetic data and real medical images segmented by the fuzzy C-means (FCM) clustering algorithm, the kernel fuzzy C-means (KFCM) clustering algorithm, the spectral clustering algorithm, the sparse-learning-based fuzzy C-means (SL_FCM) clustering algorithm, and the modified spatial KFCM (MSFCM) algorithm, the images segmented by the dictionary-learning fuzzy C-means (DLFCM) algorithm have a higher partition coefficient, lower partition entropy, better visual perception, better clustering accuracy, and better clustering purity.
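For reference, the baseline FCM updates that DLFCM builds on can be written in a few lines of NumPy; this is the standard algorithm, with the paper's dictionary-learning denoising step omitted and the inputs standing in for image intensities.

```python
# Baseline fuzzy C-means: alternate membership and centroid updates.
import numpy as np

def fcm(X, c=3, m=2.0, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)                   # fuzzy memberships, rows sum to 1
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]  # membership-weighted centroids
        d = np.linalg.norm(X[:, None, :] - centers[None], axis=2) + 1e-10
        U = 1.0 / d ** (2 / (m - 1))                    # inverse-distance memberships
        U /= U.sum(axis=1, keepdims=True)               # renormalize rows
    return centers, U

pixels = np.random.rand(1000, 1)                        # stand-in for image intensities
centers, U = fcm(pixels)
```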
Conference Paper
Abstract—In this paper, a performance comparison of machine learning techniques for diabetes disease detection is carried out. Diabetes disease attracts great attention in the machine learning community because diabetes is a chronic disease and needs to be detected at an early stage in order to administer the correct medication. A series of machine learning techniques are used in the work, such as Decision Trees (DT), Logistic Regression (LR), Discriminant Analysis (DA), Support Vector Machines (SVM), k-Nearest Neighbors (k-NN), and ensemble learners. MATLAB software is used; in particular, the MATLAB Classification Learner Tool (MCLT), which covers the mentioned machine learning techniques and their various variants. Thus, a total of 24 classifiers are used in the presented work. The results are evaluated according to a 10-fold cross-validation criterion, and average classification accuracy is used as the performance measure. The obtained average accuracy scores are in the range of 65.5% to 77.9%. The best accuracy score, 77.9%, is produced by the LR method, and the worst, 65.5%, by the Coarse Gaussian SVM technique.
Keywords—Machine learning techniques, Diabetes disease, Pima Indian diabetes dataset, MATLAB classification learner tool.
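The same comparison protocol can be sketched outside MATLAB with scikit-learn, assuming the Pima Indians dataset is fetched from OpenML under the name "diabetes"; only four of the paper's 24 classifiers are shown.

```python
# Sketch: several classifiers scored by the paper's 10-fold cross-validation protocol.
from sklearn.datasets import fetch_openml   # Pima diabetes is not bundled with sklearn
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

X, y = fetch_openml("diabetes", version=1, return_X_y=True, as_frame=False)

for clf in (DecisionTreeClassifier(), LogisticRegression(max_iter=1000),
            SVC(), KNeighborsClassifier()):
    scores = cross_val_score(clf, X, y, cv=10)          # 10-fold cross-validation
    print(type(clf).__name__, scores.mean().round(3))   # average accuracy per model
```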