About
325 Publications
69,971 Reads
3,828 Citations
Additional affiliations
July 2012 - present
September 2005 - June 2011
Publications (325)
COVID-19 has taken a huge toll on our lives over the last 3 years. Global initiatives put forward by all stakeholders are still in place to combat this pandemic and help us learn lessons for future ones. While the vaccine rollout was not able to curb the spread of the disease for all strains, the research community is still trying to develop effect...
In this work, we have presented a way to increase the contrast of an image. Our target is to find a transformation that will be image specific. We have used a fuzzy system as our transformation function. To tune the system according to an image, we have used Genetic Algorithm and Hill Climbing in multiple ways to evolve the fuzzy system and conduct...
Motivation: Phylogenetic trees are often inferred from a multiple sequence alignment (MSA), where the tree accuracy is heavily impacted by the nature of the estimated alignment. Carefully equipping an MSA tool with multiple application-aware objectives positively impacts its capability to yield better trees. Results: We introduce Multiobjective Applicat...
Background and Motivations: Continuous Blood Pressure (BP) monitoring is crucial for real-time health tracking, especially for people with hypertension and cardiovascular diseases (CVDs). The current cuff-based BP monitoring methods are non-invasive but discontinuous while continuous BP monitoring methods are mostly invasive and can only be applied...
Lysine succinylation is a kind of post-translational modification (PTM) that plays a crucial role in regulating the cellular processes. Aberrant succinylation may cause inflammation, cancers, metabolism diseases and nervous system diseases. The experimental methods to detect succinylation sites are time-consuming and costly. This thus calls for com...
Customer churn is one of the most critical issues faced by the telecommunication industry (TCI). Researchers and analysts leverage customer relationship management (CRM) data through the use of various machine learning models and data transformation methods to identify the customers who are likely to churn. While several studies have been conducted...
Cardiovascular diseases are one of the most severe causes of mortality, annually taking a heavy toll on lives worldwide. Continuous monitoring of blood pressure seems to be the most viable option, but this demands an invasive process, introducing several layers of complexities and reliability concerns due to non-invasive techniques not being accura...
As the COVID-19 pandemic continues to affect all countries across the globe, this study seeks to investigate the relationship between nations' governance, COVID-19 national data, and nation-level COVID-19 vaccination coverage. National-level governance indicators (corruption index, voice and accountability, political stability, and absence of viole...
Uncontrolled proliferation of B-lymphoblast cells is a common characteristic of Acute Lymphoblastic Leukemia (ALL). B-lymphoblasts are found in large numbers in peripheral blood in malignant cases. Early detection of the cell in bone marrow is essential as the disease progresses rapidly if left untreated. However, automated classification of the...
Species tree estimation is frequently based on phylogenomic approaches that use multiple genes from throughout the genome. However, for a combination of reasons (ranging from sampling biases to more biological causes, as in gene birth and loss), gene trees are often incomplete, meaning that not all species of interest have a common set of genes. In...
In this paper, we revisit the problem of computing the longest common almost increasing subsequence (LCAIS) where, given two input sequences, the goal is to compute a common subsequence that is ‘almost’ increasing. Here the concept of an almost increasing subsequence offers an interesting relaxation over the increasing condition. This problem has been...
Many countries wish to achieve digital transformation, especially during the COVID-19 pandemic. The digital skills demand is changing fast. The time-series online job portal data for the ICT industry in Bangladesh provides an opportunity to analyze high demand job titles and skills over time. These time-series data address the question of the speed...
This study leveraged the phylogenetic analysis of more than 10K strains of novel coronavirus (SARS-CoV-2) from 67 countries. Due to the requirement of high-end computational power for phylogenetic analysis, we leverage a fast yet highly accurate alignment-free method to develop the phylogenetic tree out of all the strains of novel coronavirus. K-Me...
The COVID-19 pandemic is taking a toll on the social, economic, and psychological well-being of people. During this pandemic period, people have utilized social media platforms (e.g., Twitter) to communicate with each other and share their concerns and updates. In this study, we analyzed nearly 25M COVID-19 related tweets generated from 20 different co...
Motivation: Lysine succinylation is a kind of post-translational modification (PTM) which plays a crucial role in regulating the cellular processes. Aberrant succinylation may cause inflammation, cancers, metabolism diseases and nervous system diseases. The experimental methods to detect succinylation sites are time-consuming and costly. This thus c...
In this work, we introduce BanglaBERT, a BERT-based Natural Language Understanding (NLU) model pretrained in Bangla, a widely spoken yet low-resource language in the NLP literature. To pretrain BanglaBERT, we collect 27.5 GB of Bangla pretraining data (dubbed 'Bangla2B+') by crawling 110 popular Bangla sites. We introduce two downstream task datase...
All proteomes contain both proteins and polypeptide segments that don’t form a defined three-dimensional structure yet are biologically active—called intrinsically disordered proteins and regions (IDPs and IDRs). Most of these IDPs/IDRs lack useful functional annotation limiting our understanding of their importance for organism fitness. Here we ch...
Lung cancer is a leading cause of death throughout the world. Because the prompt diagnosis of tumors allows oncologists to discern their nature, type, and mode of treatment, tumor detection and segmentation from CT scan images is a crucial field of study. This paper investigates lung tumor segmentation via a two-dimensional Discrete Wavelet Transfo...
Multiple sequence alignment (MSA) is a prerequisite for several analyses in bioinformatics, such as phylogeny estimation and protein structure prediction. PASTA (Practical Alignments using SATé and TrAnsitivity) is a state-of-the-art method for computing MSAs, well-known for its accuracy and scalability. It iteratively co-estimates both MSA and...
Introduction: To self-monitor asthma symptoms, existing methods (e.g. peak flow meter, smart spirometer) require special equipment and are not always used by the patients. Voice recording has the potential to generate surrogate measures of lung function and this study aims to apply machine learning approaches to predict lung function and severity...
Cardiovascular diseases are the most common causes of death around the world. To detect and treat heart-related diseases, continuous blood pressure (BP) monitoring, along with many other parameters, is required. Several invasive and non-invasive methods have been developed for this purpose. Most existing methods used in hospitals for continuous moni...
Raman spectroscopy provides a vibrational profile of molecules and thus can be used to uniquely identify different kinds of materials. This sort of molecular fingerprinting has led to widespread application of Raman spectra in various fields such as medical diagnostics, forensics, mineralogy, bacteriology and virology. Despite the recent r...
Tremendous changes have been witnessed in the post-COVID-19 world. Global efforts were initiated to reach a successful treatment for this emerging disease. These efforts have focused on developing vaccinations and/or finding therapeutic agents that can be used to combat the virus or reduce its accompanying symptoms. Gulf Cooperation Council (GCC) c...
Dual-energy X-ray absorptiometry (DXA) has been traditionally used to assess body composition covering bone, fat and muscle content. Cardiovascular disease (CVD) has deleterious effects on bone health and fat composition. Therefore, early detection of bone health, fat and muscle composition would help to anticipate a proper diagnosis and treatment...
Data transformation (DT) is a process that converts the original data into a form that supports a particular classification algorithm and helps to analyze the data for a specific purpose. To improve prediction performance, we investigated various data transformation methods. This study is conducted in a customer churn prediction (CCP) context in the...
Lung cancer is one of the deadliest cancers, and in part its effective diagnosis and treatment depend on the accurate delineation of the tumor. Human-centered segmentation, which is currently the most common approach, is subject to inter-observer variability, and is also time-consuming, considering the fact that only experts are capable of providin...
In recent years, physiological signal-based authentication has shown great promise because of its inherent robustness against forgery. The electrocardiogram (ECG) signal, being the most widely studied biosignal, has also received the highest level of attention in this regard. Numerous studies have shown that by analyzing ECG signals from differ...
In Bangladesh, groundwater is the main source of both drinking water and irrigation. Suction lift pumps and force mode of operation are the predominant technologies for groundwater abstraction in Bangladesh. For a sustainable usage policy, it is thus important to identify which technology would be more appropriate for which area in Bangladesh. With...
Lung cancer is a leading cause of death in most countries of the world. Since prompt diagnosis of tumors can allow oncologists to discern their nature, type and the mode of treatment, tumor detection and segmentation from CT Scan images is a crucial field of study worldwide. This paper approaches lung tumor segmentation by applying two-dimensional...
Cardiovascular diseases are the most common causes of death around the world. To detect and treat heart-related diseases, continuous Blood Pressure (BP) monitoring, along with many other parameters, is required. Several invasive and non-invasive methods have been developed for this purpose. Most existing methods used in the hospitals for continuous...
Species tree estimation is frequently based on phylogenomic approaches that use multiple genes from throughout the genome. However, for a combination of reasons (ranging from sampling biases to more biological causes, as in gene birth and loss), gene trees are often incomplete, meaning that not all species of interest have a common set of genes. In...
In this note, we consider the problem of counting and verifying abelian border arrays of binary words. We show that the number of valid abelian border arrays of length \(n\) is \(2^{n-1}\). We also show that verifying whether a given array is the abelian border array of some binary word reduces to computing the abelian border array of a specific bi...
The immense spread of coronavirus disease 2019 (COVID-19) has left healthcare systems unable to diagnose and test patients at the required rate. Given the effects of COVID-19 on pulmonary tissues, chest radiographic imaging has become a necessity for screening and monitoring the disease. Numerous studies have proposed Deep Learning approaches fo...
With the recent developments in deep learning, automatic cell segmentation from images of microscopic examination slides seems to be a solved problem as recent methods have achieved comparable results on existing benchmark datasets. However, most of the existing cell segmentation benchmark datasets either contain a single cell type, few instances o...
Background: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus responsible for coronavirus disease 2019 (COVID-19), has had an unprecedented effect, especially among under-resourced minority communities. Surveillance of those at high risk is critical for preventing and controlling the pandemic. We must better understand the rel...
The COVID-19 pandemic has spread globally. Only three cases in Bangladesh were reported on March 8, 2020. Here, we aim to predict the epidemic progression for 1 year under different scenarios in Bangladesh. We extracted the number of daily confirmed cases from March 8 to July 20, 2020. We considered the susceptible-infected-removed (SIR) model and pe...
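As a rough illustration of the kind of compartmental model named in the abstract above, the following sketch integrates the standard SIR equations with a simple forward-Euler scheme. It is not the authors' implementation: the function name `simulate_sir`, the parameters `beta` and `gamma`, and all numeric values are hypothetical and chosen only for demonstration.

```python
def simulate_sir(population, infected0, beta, gamma, days, steps_per_day=10):
    """Forward-Euler integration of the classic SIR model.

    beta  : transmission rate per day (hypothetical value supplied by caller)
    gamma : removal (recovery) rate per day (hypothetical value supplied by caller)
    Returns a list of (S, I, R) tuples, one per day, starting at day 0.
    """
    s = float(population - infected0)
    i = float(infected0)
    r = 0.0
    dt = 1.0 / steps_per_day
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_removals = gamma * i * dt
            s -= new_infections
            i += new_infections - new_removals
            r += new_removals
        history.append((s, i, r))
    return history


if __name__ == "__main__":
    # Illustrative parameters only; they are not fitted to any real data.
    trajectory = simulate_sir(population=1000, infected0=3,
                              beta=0.3, gamma=0.1, days=50)
    peak_day = max(range(len(trajectory)), key=lambda d: trajectory[d][1])
    print("day of peak infections:", peak_day)
```

Because every step moves individuals between compartments without creating or removing any, S + I + R stays equal to the population throughout, which is a quick sanity check on any such simulation.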
COVID-19 has harshly impacted communities globally. This study provides relevant information for creating equitable policy interventions to combat the spread of COVID-19. This study aims to predict the knowledge, attitude, and practice (KAP) of the COVID-19 pandemic at a global level to determine control measures and psychosocial problems. A cross-...
Contemporary works on abstractive text summarization have focused primarily on high-resource languages like English, mostly due to the limited availability of datasets for low/mid-resource ones. In this work, we present XL-Sum, a comprehensive and diverse dataset comprising 1 million professionally annotated article-summary pairs from BBC, extracte...
Unpaired domain translation models with distribution matching loss such as CycleGAN are now widely used to shift domains in medical images. However, synthesizing medical images using CycleGAN can lead to misdiagnosis of a medical condition as it might hallucinate unwanted features, especially if there's a data bias. This can potentially change...
Background: Genomic Islands (GIs) are clusters of genes that are mobilized through horizontal gene transfer. GIs play a pivotal role in bacterial evolution as a mechanism of diversification and adaptation to different niches. Therefore, identification and characterization of GIs in bacterial genomes is important for understanding bacterial evolution...
In providing proper and timely treatment of asthma, self-monitoring can play a vital role in disease control. Existing methods (such as peak flow meter, smart spirometer) require special equipment and are not always used by the patient. Voice recordings can serve as surrogate measures of lung function to assess asthma, an approach that has good potential to...
Coronavirus disease 2019 (COVID-19) has dominated the global agenda since it emerged in December 2019, as it has significantly affected the world economy and healthcare system. Given the effects of COVID-19 on pulmonary tissues, chest radiographic imaging has become a necessity for screening and monitoring the disease. Numerous...
In this paper, we revisit the r-gathering problem. Given sets C and F of points on the plane and distance d(c,f) for each c∈C and f∈F, an r-gathering of C to F is an assignment A of C to open facilities F′⊆F such that r or more members of C are assigned to each open facility. The cost of an r-gathering is max_{c∈C} d(c, A(c)). The r-gathering problem c...
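The definition in the abstract above can be made concrete with a small checker that validates a given assignment and evaluates its cost. This is only a sketch of the definition, not an algorithm from the paper: the function name `gathering_cost`, the index-based encoding of the assignment, and the use of Euclidean distance for d(c,f) are assumptions made for illustration.

```python
import math
from collections import Counter


def gathering_cost(customers, facilities, assignment, r):
    """Return the cost of an r-gathering, or None if the assignment is invalid.

    customers, facilities : lists of (x, y) points
    assignment            : assignment[k] is the index of the facility that
                            customer k is assigned to (hypothetical encoding)
    An assignment is valid when every open facility (one that receives at
    least one customer) receives r or more customers. The cost is the largest
    customer-to-facility distance, here assumed to be Euclidean.
    """
    load = Counter(assignment)
    if any(count < r for count in load.values()):
        return None
    return max(
        (math.dist(customers[k], facilities[f]) for k, f in enumerate(assignment)),
        default=0.0,
    )


if __name__ == "__main__":
    customers = [(0, 0), (1, 0), (4, 0), (5, 0)]
    facilities = [(0, 0), (5, 0)]
    print(gathering_cost(customers, facilities, [0, 0, 1, 1], r=2))
```

With r=2 the assignment in the usage example is valid because each open facility serves two customers; raising r to 3 makes the same assignment invalid, so the function returns None.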
In Bangladesh, two predominant pumping modes, namely Suction (S) and Force (F), are used for groundwater abstraction. Identifying which pumping mode is appropriate for which area of Bangladesh will help in formulating a sustainable usage policy. Therefore, this paper proposes a methodology leveraging the power of machine learning (ML) models...
The coronavirus disease 2019 (COVID-19) has resulted in an ongoing pandemic worldwide. Countries have adopted non-pharmaceutical interventions (NPI) to slow down the spread. This study proposes an agent-based model that simulates the spread of COVID-19 among the inhabitants of a city. The agent-based model can be accommodated for any location by in...
In recent years, physiological signal-based authentication has shown great promise because of its inherent robustness against forgery. The electrocardiogram (ECG) signal, being the most widely studied biosignal, has also received the highest level of attention in this regard. Numerous studies have shown that by analyzing ECG signals from differe...
Background: Segmentation of nuclei in cervical cytology pap smear images is a crucial stage in automated cervical cancer screening. The task itself is challenging due to the presence of cervical cells with spurious edges, overlapping cells, neutrophils, and artifacts. Methods: After the initial preprocessing steps of adaptive thresholding, in our ap...
Pre-training language models on large volumes of data with self-supervised objectives has become a standard practice in natural language processing. However, most such state-of-the-art models are available in only English and other resource-rich languages. Even in multilingual models, which are trained on hundreds of languages, low-resource ones sti...
Background: The inception of next-generation sequencing technologies has exponentially increased the volume of biological sequence data. Protein sequences, often described as the 'language of life', have been analyzed for a multitude of applications and inferences. Motivation: Owing to the rapid development of deep learning, in recent years there hav...
Updating and querying on a range is a classical algorithmic problem with a multitude of applications. The Segment Tree data structure is particularly notable in handling the range query and update operations. A Segment Tree divides the range into disjoint segments and merges them together to perform range queries and range updates elegantly. Althou...
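For readers unfamiliar with the data structure described in the abstract above, here is a minimal sketch of a conventional segment tree that supports range-add updates and range-sum queries through lazy propagation. It illustrates the textbook structure only and does not reproduce whatever variant the paper goes on to propose; the class and method names are hypothetical.

```python
class SegmentTree:
    """Textbook segment tree: range add, range sum, lazy propagation."""

    def __init__(self, values):
        self.n = len(values)
        self.total = [0] * (4 * self.n)    # sum over each node's segment
        self.pending = [0] * (4 * self.n)  # add not yet pushed to children
        if self.n:
            self._build(1, 0, self.n - 1, values)

    def _build(self, node, lo, hi, values):
        if lo == hi:
            self.total[node] = values[lo]
            return
        mid = (lo + hi) // 2
        self._build(2 * node, lo, mid, values)
        self._build(2 * node + 1, mid + 1, hi, values)
        self.total[node] = self.total[2 * node] + self.total[2 * node + 1]

    def _push(self, node, lo, hi):
        # Hand a deferred addition down to the two child segments.
        if self.pending[node]:
            mid = (lo + hi) // 2
            for child, c_lo, c_hi in ((2 * node, lo, mid),
                                      (2 * node + 1, mid + 1, hi)):
                self.total[child] += self.pending[node] * (c_hi - c_lo + 1)
                self.pending[child] += self.pending[node]
            self.pending[node] = 0

    def update(self, left, right, delta, node=1, lo=0, hi=None):
        """Add delta to every element with index in [left, right]."""
        if hi is None:
            hi = self.n - 1
        if right < lo or hi < left:
            return
        if left <= lo and hi <= right:
            self.total[node] += delta * (hi - lo + 1)
            self.pending[node] += delta
            return
        self._push(node, lo, hi)
        mid = (lo + hi) // 2
        self.update(left, right, delta, 2 * node, lo, mid)
        self.update(left, right, delta, 2 * node + 1, mid + 1, hi)
        self.total[node] = self.total[2 * node] + self.total[2 * node + 1]

    def query(self, left, right, node=1, lo=0, hi=None):
        """Return the sum of the elements with index in [left, right]."""
        if hi is None:
            hi = self.n - 1
        if right < lo or hi < left:
            return 0
        if left <= lo and hi <= right:
            return self.total[node]
        self._push(node, lo, hi)
        mid = (lo + hi) // 2
        return (self.query(left, right, 2 * node, lo, mid)
                + self.query(left, right, 2 * node + 1, mid + 1, hi))


if __name__ == "__main__":
    tree = SegmentTree([1, 2, 3, 4, 5])
    tree.update(0, 2, 10)    # array becomes [11, 12, 13, 4, 5]
    print(tree.query(1, 3))  # sum of 12, 13 and 4
```

Both operations visit O(log n) nodes, because a query or update range decomposes into at most a logarithmic number of fully covered segments and the deferred additions are only pushed down along the paths actually traversed.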
Clinical decision support systems (CDSSs) have received increasing research attention in recent years because they can improve the quality, safety, efficiency, and effectiveness of healthcare. A CDSS combined with advanced data analytics is more accurate and efficient than traditional systems. In this domain, survival or deterioration prediction of...
Malaria, one of the leading causes of death in underdeveloped countries, is primarily diagnosed using microscopy. Computer-aided diagnosis of malaria is a challenging task owing to the fine-grained variability in the appearance of some uninfected and infected classes. In this paper, we transform a malaria parasite object detection dataset into a clas...
Epigenetic aging has been found to be associated with a number of phenotypes and diseases. A few studies have investigated its effect on lung function in relatively older people. However, this effect has not been explored in the younger population. This study examines whether lung function in adolescence can be predicted with epigenetic age acceler...
Significance: Recent large-scale sequencing efforts have enabled the detection of millions of missense variants. Elucidating their functional effect is of crucial importance but challenging. We approach this problem by performing a wide-scale characterization of missense variants from 1,330 disease-associated genes using >14,000 protein structures....
Obesity is an emerging public health problem in the Western world as well as in the Gulf region. Qatar, a tiny wealthy country, is among the top-ranked obese countries with a high obesity rate among its population. Despite the severity of this health crisis in Qatar, only a limited number of studies have focused on the systematic identification of potenti...
Homoglyphs are pairs of visual representations of Unicode characters that look similar to the human eye. Identifying homoglyphs is extremely useful for building a strong defence mechanism against many phishing and spoofing attacks, ID imitation, profanity abuse, etc. Although there is a list of discovered homoglyphs published by the Unicode consortiu...
The COVID-19 epidemic spread rapidly through China and subsequently proliferated globally, leading to a pandemic. Human-to-human transmissions, as well as asymptomatic transmissions of the infection, have been confirmed. As of April 03, 2020, the public health crisis in China due to COVID-19 is potentially under control. W...
Multiple sequence alignment (MSA) is a preliminary task for estimating phylogenies. It is used for homology inference among the sequences of a set of species. Generally, the MSA task is handled as a single-objective optimization process. The alignments computed under one criterion may be different from the alignments generated by other criteria, in...
Motivation: Researchers and practitioners use a number of popular sequence comparison tools that use many alignment-based techniques. Due to high time and space complexity and length-related restrictions, researchers often seek alignment-free tools. Recently, some interesting ideas, namely, Minimal Absent Words (MAW) and Relative Absent Words (RAW),...