About
43 Publications · 8,475 Reads
679 Citations (since 2017)

Publications (43)
This chapter expands on intrinsic model interpretability discussed in the last chapter to include many modern techniques that are both interpretable and accurate on many real-world problems. The chapter starts with differentiating between interpretable and explainable models and why, in specific domains where high-stakes decisions need to be made,...
This chapter discusses various ways of using pre-modeling explainability: a set of techniques aimed at gaining insights into a dataset to help build more effective models. Since any machine learning model is built from the data, understanding the content on which the model is based is imperative for explainability and interpretability. Many of thes...
In recent years, we have seen gains in adoption of machine learning and artificial intelligence applications. However, continued adoption is being constrained by several limitations. The field of Explainable AI addresses one of the largest shortcomings of machine learning and deep learning algorithms today: the interpretability and explainability o...
Post-hoc techniques represent a vast collection of methods created to specifically address the black-box problem, where we do not have access to the internal feature representations or model structure. There are considerable advantages to using post-hoc methods. They can work for a wide variety of model algorithms. They allow for different represen...
One of the biggest challenges the XAI field faces is formalizing, quantifying, measuring, and comparing different explanation techniques in a unified way. The evaluation of explanations is an interdisciplinary research area covering broad fields of human-computer interaction, machine learning, psychology, cognitive science, and visualization, to name a f...
Recent advances in deep learning have made tremendous progress in the adoption of neural network models for tasks from resource utilization to autonomous driving. Most deep learning models are opaque black-box models that are not easily explainable. Unlike linear models, the weights of a neural network are not inherently interpretable to humans. Th...
Various domains such as computer vision, natural language processing, and time series analysis have extensively applied machine learning algorithms in recent years. This chapter will discuss the research and applications of the interpretable and explainable algorithms in this domain. We will start with a time series algorithm survey, starting from...
One of the easiest ways to build explainable models is by having the machine learning algorithm be intrinsically interpretable. Gaining an understanding of how well a model performs from looking at the results of model evaluation is another important way to enhance model explainability. We discuss several techniques to visualize model evaluation in...
This book is written both for readers entering the field, and for practitioners with a background in AI and an interest in developing real-world applications. The book is a great resource for practitioners and researchers in both industry and academia, and the discussed case studies and associated material can serve as inspiration for a variety of...
The work presented in this chapter is motivated by two important challenges that arise when applying ML techniques to big data applications: the scalability of an ML technique as the training data increases significantly in size, and the transparency (understandability) of the induced models. To address these issues we describe and analyze a meta-l...
In this chapter, we investigate deep reinforcement learning for text and speech applications. Reinforcement learning is a branch of machine learning that deals with how agents learn a set of actions that can maximize expected cumulative reward. In past research, reinforcement learning has focused on game play. Recent advances in deep learning have...
Domain adaptation is a form of transfer learning, in which the task remains the same, but there is a domain shift or a distribution change between the source and the target. As an example, consider a model that has learned to classify reviews on electronic products for positive and negative sentiments, and is used for classifying the reviews for ho...
Most supervised machine learning techniques, such as classification, rely on some underlying assumptions, such as: (a) the data distributions during training and prediction time are similar; (b) the label space during training and prediction time is similar; and (c) the feature space between training and prediction time remains the same. In ma...
In the previous chapter, CNNs provided a way for neural networks to learn a hierarchy of weights, resembling that of n-gram classification on the text. This approach proved to be very effective for sentiment analysis or, more broadly, text classification. One of the disadvantages of CNNs, however, is their inability to model contextual information o...
In the last few years, convolutional neural networks (CNNs), along with recurrent neural networks (RNNs), have become a basic building block in constructing complex deep learning solutions for various NLP, speech, and time series tasks. LeCun first introduced the basic parts of the CNN framework as a general NN framework to solve various high-...
Automatic speech recognition (ASR) has grown tremendously in recent years, with deep learning playing a key role. Simply put, ASR is the task of converting spoken language into computer-readable text (Fig. 8.1). It has quickly become ubiquitous today as a useful way to interact with technology, significantly bridging the gap in human–computer in...
In Chap. 8, we aimed to create an ASR system by dividing the fundamental equation W* = argmax_{W ∈ V*} P(W|X) into an acoustic model, lexicon model, and language model by using Bayes' theorem. This approach relies heavily on the use of the conditional independence assumption and separate optimization procedures for the different m...
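The Bayes decomposition referenced above can be written out explicitly; dropping the denominator P(X), which does not depend on W, yields the classic factorization (the lexicon model further decomposes the acoustic term through phoneme sequences):

```latex
W^{*} = \operatorname*{argmax}_{W \in \mathcal{V}^{*}} P(W \mid X)
      = \operatorname*{argmax}_{W \in \mathcal{V}^{*}} \frac{P(X \mid W)\,P(W)}{P(X)}
      = \operatorname*{argmax}_{W \in \mathcal{V}^{*}}
        \underbrace{P(X \mid W)}_{\text{acoustic model}}\;
        \underbrace{P(W)}_{\text{language model}}
```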
In this chapter, we introduce the notion of word embeddings that serve as core representations of text in deep learning approaches. We start with the distributional hypothesis and explain how it can be leveraged to form semantic representations of words. We discuss the common distributional semantic models including word2vec and GloVe and their var...
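The distributional hypothesis mentioned above — that words occurring in similar contexts have similar meanings — can be illustrated with a toy sketch. The co-occurrence counts and vocabulary below are hypothetical, invented purely for illustration; word2vec and GloVe learn dense vectors rather than using raw counts like these:

```python
from math import sqrt

# Hypothetical co-occurrence counts: each word's vector counts how often
# it appears near the context words [drink, sweet, engine, road].
vectors = {
    "juice": [8, 6, 0, 0],
    "soda":  [7, 5, 0, 1],
    "truck": [0, 0, 6, 7],
}

def cosine(u, v):
    """Cosine similarity: the standard comparison for word vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm

# Words used in similar contexts end up with similar vectors.
print(cosine(vectors["juice"], vectors["soda"]))   # high similarity
print(cosine(vectors["juice"], vectors["truck"]))  # near zero
```

The same comparison applies unchanged to learned embeddings; only the way the vectors are produced differs.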
In deep learning networks, as we have seen in the previous chapters, there are good architectures for handling spatial and temporal data using various forms of convolutional and recurrent networks, respectively. When the data has certain dependencies, such as out-of-order access, long-term dependencies, or unordered access, most standard architectures...
This chapter introduces the major topics in text and speech analytics and machine learning approaches. Neural network approaches are deferred to later chapters.
One of the most talked-about concepts in machine learning both in the academic community and in the media is the evolving field of deep learning. The idea of neural networks, and subsequently deep learning, draws its inspiration from the biological representation of the human brain (or any brained creature for that matter).
The goal of this chapter is to review basic concepts in machine learning that are applicable or relate to deep learning. As it is not possible to cover every aspect of machine learning in this chapter, we refer readers who wish to get a more in-depth overview to textbooks, such as Learning from Data [AMMIL12] and Elements of Statistical Learning Th...
Scalability of clustering algorithms is a critical issue in real-world clustering applications. Data sampling and parallelization are two common ways to address the scalability issue. Despite their wide utilization in a number of clustering algorithms, they suffer from several major drawbacks. For example, most data sampling can often lead...
With the widespread adoption of deep learning, natural language processing (NLP), and speech applications in many areas (including Finance, Healthcare, and Government) there is a growing need for one comprehensive resource that maps deep learning techniques to NLP and speech and provides insights into using the tools and libraries for real-world app...
Motivation: Bacterial resistance to antibiotics is a growing concern. Antimicrobial peptides (AMPs), natural components of innate immunity, are popular targets for developing new drugs. Machine learning methods are now commonly adopted by wet-laboratory researchers to screen for promising candidates.
Results: In this work we utilize deep learning...
Growing bacterial resistance to antibiotics is spurring research on utilizing naturally-occurring antimicrobial peptides (AMPs) as templates for novel drug design. While experimentalists mainly focus on systematic point mutations to measure the effect on antibacterial activity, the computational community seeks to understand what determines such act...
Many real-world problems involve massive amounts of data. Under these circumstances learning algorithms often become prohibitively expensive, making scalability a pressing issue to be addressed. A common approach is to perform sampling to reduce the size of the dataset and enable efficient learning. Alternatively, one customizes learning algorithms...
Growing bacterial resistance to antibiotics is urging the development of new lines of treatment. The discovery of naturally-occurring antimicrobial peptides (AMPs) is motivating many experimental and computational researchers to pursue AMPs as possible templates. In the experimental community, the focus is generally on systematic point mutation stu...
Mean shift is a nonparametric clustering technique that does not require the number of clusters in input and can find clusters of arbitrary shapes. While appealing, the performance of the mean shift algorithm is sensitive to the selection of the bandwidth, and can fail to capture the correct clustering structure when multiple modes exist in one clu...
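The bandwidth sensitivity described above is easy to see in a minimal sketch. The following is an illustrative 1-D flat-kernel mean shift, not the paper's algorithm or its bandwidth-selection scheme; the data values are invented for the example:

```python
# Minimal 1-D mean shift with a flat kernel (illustrative sketch only).
# Each point is repeatedly shifted to the mean of its bandwidth
# neighborhood; points that converge to the same mode share a cluster.

def mean_shift(points, bandwidth, tol=1e-6, max_iter=100):
    modes = []
    for x in points:
        for _ in range(max_iter):
            neighbors = [p for p in points if abs(p - x) <= bandwidth]
            new_x = sum(neighbors) / len(neighbors)
            if abs(new_x - x) < tol:
                break
            x = new_x
        modes.append(round(x, 3))
    return modes

points = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9]
modes = mean_shift(points, bandwidth=1.0)
print(sorted(set(modes)))  # two modes -> two clusters
```

With bandwidth=1.0 the two groups converge to separate modes; a much larger bandwidth would merge them into one, and a much smaller one would fragment them — the sensitivity the abstract refers to.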
Background: Many open problems in bioinformatics involve elucidating underlying functional signals in biological sequences. DNA sequences, in particular, are characterized by rich architectures in which functional signals are increasingly found to combine local and distal interactions at the nucleotide level. Problems of interest include detection...
A variety of real-world problems fit into the broad definition of time series classification. Traditional machine learning approaches, such as treating the time series sequences as high-dimensional vectors, have faced the well-known "curse of dimensionality" problem. Recently, the field of time series classification has seen success by usin...
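One reason treating series as plain high-dimensional vectors struggles is that element-wise distances penalize small time shifts. A standard elastic alternative used in time-series 1-NN baselines is dynamic time warping; the truncated abstract does not say which method the paper itself uses, so this is only a generic illustration with made-up data:

```python
# Classic dynamic-time-warping distance via dynamic programming.

def dtw(a, b):
    n, m = len(a), len(b)
    INF = float("inf")
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]

# Same shape shifted by one step: DTW absorbs the lag, Euclidean does not.
a = [0, 0, 1, 2, 1, 0]
b = [0, 1, 2, 1, 0, 0]
euclid = sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
print(dtw(a, b), euclid)
```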
The scalability of machine learning (ML) algorithms has become a key issue as the size of training datasets continues to increase. To address this issue in a reasonably general way, a parallel boosting algorithm has been developed that combines concepts from spatially structured evolutionary algorithms (SSEAs) and ML boosting techniques. To get mor...
The scalability of machine learning (ML) algorithms has become increasingly important due to the ever increasing size of datasets and increasing complexity of the models induced. Standard approaches for dealing with this issue generally involve developing parallel and distributed versions of the ML algorithms and/or reducing the dataset sizes via s...
Recently Quantitative Genetics has been successfully employed to understand and improve operators in some Evolutionary Algorithms (EAs) implementations. This theory offers a phenotypic view of an algorithm's behavior at a population level, and suggests new ways of quantifying and measuring concepts such as exploration and exploitation. In this pape...
Associating functional information with biological sequences remains a challenge for machine learning methods. The performance of these methods often depends on deriving predictive features from the sequences sought to be classified. Feature generation is a difficult problem, as the connection between the sequence features and the sought property i...
The annotation of DNA regions that regulate gene transcription is the first step towards understanding phenotypical differences among cells and many diseases. Hypersensitive (HS) sites are reliable markers of regulatory regions. Mapping HS sites is the focus of many statistical learning techniques that employ Support Vector Machines (SVM) to classi...
Prediction of promoter regions continues to be a challenging subproblem in mapping out eukaryotic DNA. While this task is key to understanding the regulation of differential transcription, the gene-specific architecture of promoter sequences does not readily lend itself to general strategies. To date, the best approaches are based on Support Vect...
Hypersensitive (HS) sites in genomic sequences are reliable markers of DNA regulatory regions that control gene expression. Annotation of regulatory regions is important in understanding phenotypical differences among cells and diseases linked to pathologies in protein expression. Several computational techniques are devoted to mapping out regulato...
Support vector machines (SVMs) are now one of the most popular machine learning techniques for solving difficult classification problems. Their effectiveness depends on two critical design decisions: 1) mapping a decision problem into an n-dimensional feature space, and 2) choosing a kernel function that maps the n-dimensional feature space into a...
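The kernel choice mentioned above can be made concrete with the widely used RBF kernel, which implicitly maps points into an infinite-dimensional feature space. This is a generic sketch with invented inputs, not the kernel-design method of the paper:

```python
from math import exp

def rbf_kernel(x, z, gamma=0.5):
    """RBF (Gaussian) kernel: k(x, z) = exp(-gamma * ||x - z||^2).
    Computes inner products in the implicit feature space without
    ever constructing that space explicitly (the kernel trick)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return exp(-gamma * sq_dist)

x, z = [1.0, 0.0], [0.0, 1.0]
print(rbf_kernel(x, x))  # 1.0: every point is maximally similar to itself
print(rbf_kernel(x, z))  # exp(-1.0), decaying with squared distance
```

The gamma parameter plays the role of an inverse bandwidth: large gamma makes the kernel sharply local, small gamma makes it nearly constant — one face of the design decisions the abstract describes.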
This paper proposes a method to improve the recognition of regulatory genomic sequences. Annotating sequences that regulate gene transcription is an emerging challenge in genomics research. Identifying regulatory sequences promises to reveal underlying reasons for phenotypic differences among cells and for diseases associated with pathologies in pr...