# Hasan Kurban

Indiana University Bloomington | IUB · Department of Computer Science

Ph.D. in Computer Science

## About

33 publications · 6,958 reads


172 citations


Introduction

Research Interests: Machine Learning, Data Mining, Big Data, Artificial Intelligence

Additional affiliations

August 2017 - August 2018

Education

August 2012 - September 2017

## Publications


Machine learning (ML) has recently been used to make sense of large volumes of data, with data-driven methods identifying correlations and then examining material properties in detail. Herein, we analyze the correlations between structural and electronic properties of ZnO nanoparticles (NPs) obtained from the density-functional tight-binding method using Dat...

This interdisciplinary study seeks answers to two important questions that researchers often face in the Machine Learning (ML) and Material Science (MS) fields. In this work, we measure the performance of the most popular ML algorithms (more than a dozen) on the rare-class learning problem and determine the best learning algorithm for atom...

Existing data mining techniques, more particularly iterative learning algorithms, become overwhelmed with big data. While parallelism is an obvious and, usually, necessary strategy, we observe that both (1) continually revisiting data and (2) visiting all data are two of the most prominent problems especially for iterative, unsupervised algorithms...

Using a large volume of bus data in the form of GPS coordinates (over 100 million data points) and automated passenger count data (over 1 million data points), we have developed (1) a system for analyzing and predicting future public transportation demand and (2) a new model that uses concepts specific to college campuses that maximizes passenger satis...

Because some disease symptoms relate to several areas of medical treatment, patients have difficulty when booking an appointment for treatment. For example, the complaint of a patient with abdominal pain may be related to any one of the internal medicine, surgery, or infectious diseases departments. In this study, under the Republic of Turkey Ministry of Health...

Impedance spectroscopy is a powerful technique that is broadly used for battery characterization. In this study, we introduce a novel machine learning framework we call the duplex (for paired outputs) that constructs a linear ensemble of the best k models. Several impedance spectra of commercial lithium-ion battery coin cells at various states of charge and amb...

Impedance spectroscopy is a powerful technique that is broadly used for battery characterization. In this study, we introduce a novel methodology that devises a system to utilize the experimental impedance data by processing each of the parameters with the best-suited machine learning technique. Several impedance spectra of commercial...

This paper describes and provides data on the regenerated impedance spectra computed from experimental results of electrochemical impedance spectroscopy measurements taken from a commercial Li-ion battery. The empirical impedance data of secondary coin-type Li-ion batteries were collected in different states of charge ranging from empty...

The experimental impedance data were collected at cell potentials of 3.2 V, 3.4 V, 3.6 V, 3.8 V, 4.0 V, and 4.2 V (corresponding state-of-charge values: 0%, 3%, 8%, 40%, 78%, 100%).
The impedance data is generated by a machine learning model.

The use of electrochemical impedance spectroscopy for characterizing rechargeable lithium-ion batteries is a widely recognized non-destructive procedure for both in-situ and ex-situ analyses. In an impedance measurement for a rechargeable battery, the oscillating current with an accompanying phase angle is the response to a potential perturba...

Predicting material properties by solving the Kohn-Sham (KS) equation, which is the basis of modern computational approaches to electronic structures, has provided significant improvements in materials science. Despite its contributions, both DFT and DFTB calculations are limited by the number of electrons and atoms that translate into increasingl...

In recent years, the introduction of single-cell RNA sequencing (scRNAseq) has enabled the analysis of a cell’s transcriptome at an unprecedented granularity and processing speed. The experimental outcome of applying this technology is an M × N matrix containing aggregated mRNA expression counts of M genes and N cell samples. From this matrix, scien...

Background:
In recent years, the introduction of single-cell RNA sequencing (scRNA-seq) has enabled the analysis of a cell's transcriptome at an unprecedented granularity and processing speed. The experimental outcome of applying this technology is an $M \times N$ matrix containing aggregated mRNA expression counts of $M$ genes and $N$ cell samples....

Clustering is intractable, so techniques exist to give a best approximation. Expectation Maximization (EM), initially used to impute missing data, is among the most popular. The parameters of a fixed number of probability distributions (PDFs), together with the probability of a datum belonging to each PDF, are iteratively computed. EM does not scale with...

With the unimaginable, continual growth of data and the focus on its use rather than its governance, the value of data has begun to deteriorate, as seen in the lack of reproducibility, validity, provenance, etc. In this work, we aim simply to understand what the value of data is and how this basic understanding might affect existing AI algorithms, i...

In this work, we perform a theoretical analysis of the structural, electronic, and optical properties of pure and Mg-doped amorphous ZnO nanoparticles (a-ZnO NPs) using the DFTB method. Our results show that Mg atoms bond preferentially with Zn atoms rather than with O atoms, because the number of Mg-Zn bonds is greater than that of Mg-O bonds. The rise in the content o...

In this study, we built a variety of Machine Learning (ML) systems over 23 different sizes of CH3NH3PbI3 perovskite nanoparticles (NPs) to predict the atoms in the NPs from their geometric locations. Our findings show that a specific type of ML algorithm, tree-based models, namely Random Forest (RF), Extreme Gradient Boosting (XGBoost), Decision...

Machine learning (ML) has recently made a major contribution to the field of Material Science (MS). In this study, ML algorithms are used to learn atom types from structural geometric data of anatase TiO2 nanoparticles produced at different temperature levels with the density-functional tight-binding (DFTB) method. Especially for this work, Ran...

Structural, energetic, electronic, reactivity and stability properties of armchair (3,3), (4,4), (5,5), (6,6), (7,7), (8,8), (9,9) and (10,10) aluminum nitride nanotubes (AlNNTs) with different diameters have been probed using density functional theory (DFT) in terms of ... Moreover, the chemical reactivity characteristics of AlNNTs have been evaluated via s...

In this work, we analyze the correlations between structural and electronic properties of anatase-, brookite-, and rutile-phase TiO2 nanoparticles (NPs) using data science techniques. For this purpose, we use the geometries of the three phases of TiO2 NPs under heat treatment, obtained from molecular dynamics (MD) simulations within the DFTB+ code. We i...

In this work, we use the density-functional tight-binding (DFTB) method to investigate the structural and electronic properties of ZnO nanoparticles (NPs). First, a ZnO NP of ~0.9 nm comprising 258 atoms was characterized from a 30×30×30 supercell based on the hexagonal crystal structure of ZnO. Second, HOMO, LUMO electronic properties, band...

We carried out a thorough examination of the structural and electronic features of undoped and Nitrogen (N)-doped ZnO nanoparticles (NPs) by the density-functional tight-binding (DFTB) method. By increasing the percent of N atoms in undoped ZnO NPs, the number of bonds (n), order parameter (R) and radial distribution function (RDF) of two-body inte...

We perform a theoretical investigation using the density-functional tight-binding (DFTB) approach for the structural analysis and electronic structure of copper hydride (CuH) metallic nanoparticles (NPs) of different sizes (from 0.7 to 1.6 nm). By increasing the size of CuH NPs, the number of bonds, segregation phenomena and radial distribution func...

Without question, astronomy is about Big Data, and clustering is a very common task in the astronomy domain. The expectation-maximization algorithm is among the top 10 data mining algorithms used in scientific and industrial applications; however, we observe that the astronomical community does not make use of it as a clustering algorithm. In this work, w...

Iterative machine learning algorithms, e.g., k-means (KM) and expectation maximization (EM), become overwhelmed with big data since all data points are continually and indiscriminately visited while a cost is being minimized. In this work, we demonstrate (1) an optimization approach to reduce the training run-time complexity of iterative machine lear...

Only a few years ago, stellar data sets measured about 0.1M objects. Now, sets are routinely 1M. With the launch of ESA’s Gaia in 2013, we expect 1000M stellar objects measured more precisely and with more measurements. Without question, astronomy is about Big Data, and clustering is a very common task in the astronomy domain. The expectation-maximizati...

Existing data mining techniques, more particularly iterative learning algorithms, become overwhelmed with big data. While parallelism is an obvious and, usually, necessary strategy, we observe that both (1) continually revisiting data and (2) visiting all data are two of the most prominent problems especially for iterative, unsupervised algorithms...

Existing data mining techniques, more particularly iterative learning algorithms, become overwhelmed with big data. While parallelism is an obvious and, usually, necessary strategy, we observe that both (1) continually revisiting data and (2) visiting all data are two of the most prominent problems especially for iterative, unsupervised algorithms...

The challenges presented by data to scientific inquiry and hypothesis testing in an oceanographic setting are not new problems. Indeed, the challenges are at least a century old. The problems are not with the data itself, but rather with the attention to the management of the "data ecology" in the information systems. Data needs to be accessible as...

Random Forests have been used as effective ensemble models for classification. We present in this paper a new type of Random Forests (RFs) called Red(uced)-RF that adopts a new dynamic data reduction principle and a new voting mechanism called Priority Vote Weighting (PV) which improve accuracy, execution time and AUC values compared to Breiman’s R...

Random forests have been used as effective models to tackle a number of classification and regression problems. In this paper, we present a new type of Random Forests (RFs) called Red(uced)-RF that adopts a new voting mechanism called Priority Vote Weighting (PV) and a new dynamic data reduction principle which improve accuracy and execution time c...

Dramatic increases in the amount and complexity of stellar data must be matched by new or refined algorithms that can help scientists make sense of this data and so better understand the universe. ParaHeap-k is a parallel clustering algorithm for analyzing big data that can potentially prove useful to astronomical research.