All content in this area was uploaded by Rezzy Eko Caraka on Feb 19, 2020

Content may be subject to copyright.

[SYLWAN., 164(1)]. ISI Indexed 161

Abstract—To account for correlated count data with excess zeros, we use a multivariate generalized linear latent variable model estimated by variational approximation. We performed two simulation studies, one at the species level and one at the genus level, using Poisson and negative binomial responses with subject-specific interpretations. Methods: We use variational approximation to estimate the parameters of the multivariate generalized linear latent variable model. The count outcomes are overdispersed and exhibit many zeros, more than expected under sampling from a Poisson distribution. Results: In the simulation studies, species counts follow a negative binomial distribution and genus counts follow a Poisson distribution. The performance of the models is evaluated by the Akaike information criterion (AIC), the corrected Akaike information criterion (AICc), and the Bayesian information criterion (BIC). Conclusion: These two sets of latent class parameters may be meaningful for certain species counts and genus counts.

Index Terms—Variational Approximation,

Latent Variables, Niche Modelling, GLLVM,

Termites.

I. INTRODUCTION

Ecology is defined as the study of the relationship of organisms, or groups of organisms, to their environment. It can also be described as the science of the mutual relations between living organisms and their environment. Every ecological process that occurs in nature can be represented by mathematical equations that together build a model.

The solution of such equations can be determined with the help of computational techniques, numerical methods, and statistical modelling. Ecology was initially a general field of knowledge that studied the environmental relations of individual organisms on the basis of physiology. At that time, scholars, especially in the natural sciences, paid little attention to general sciences and instead directed the development of the sciences toward specialization. Although ecology receives less attention than other sciences, especially economics and politics, it continues to grow and to extend into other fields such as botany and zoology.

Manuscript received October 11, 2019; revised January 15, 2020. This work was supported by MOST, Taiwan (ROC).
1 College of Informatics, Chaoyang University of Technology, Taiwan (ROC), 41349. Corresponding email: crching@cyut.edu.tw†, rezzyekocaraka@gmail.com
2 Department of Statistics, College of Natural Sciences, Seoul National University, Shin Lim-Dong, Kwan Ak-Ku, South Korea, 151-747.
3 Department of Statistics, College of Natural Sciences, Pukyong National University, 45, Busan, South Korea.

Ecological research ranges from the adaptation of organisms to ecosystem dynamics, because there are many levels and types of interactions between individuals in dealing with the challenges posed by their abiotic environment. In general, research on species counts produces many zeros, because no individuals of a species are found at some locations. Such data are difficult to analyze with methods that assume the data are nonzero, so methods based on the Poisson distribution are useful in the analysis.

4 Department of Statistics, Padjadjaran University, Indonesia,

43563.

5 Bioinformatics and Data Science Research Center, Bina

Nusantara University, Jakarta 11480, Indonesia

6 Computer Science Department, BINUS Graduate Program

Master of Computer Science Bina Nusantara University,

Jakarta, Indonesia, 11480.

7 School of Environmental and Natural Resource Science,

Universiti Kebangsaan Malaysia, 43600.

Variational Approximation Multivariate Generalized Linear Latent Variable Model in Diversity Termites Riau and Peninsular Malaysia

Rezzy Eko Caraka1,2,4,5, Rung Ching Chen1†, Youngjo Lee2, Maengseok Noh3, Toni Toharudin4, Bens Pardamean5,6, Andi Saputra7

Count data describe the number of events in a certain period and can only be non-negative, because a number of events cannot be negative. Modelling count data violates the assumptions of Ordinary Least Squares (OLS) regression, such as normally distributed errors (normality) with constant variance, so OLS regression is not appropriate for count data.

The modelling of count data led to the development of Generalized Linear Models (GLMs). GLMs are generalizations of classical regression models or OLS regression (M Noh et al. 2019); (Kwon et al. 2016), and they provide analytical methods for data that do not meet the assumption of a normal distribution (De Jong and Heller, 2008). One member of the GLM family, based on the Poisson distribution, is Poisson regression. The assumption that must be fulfilled in Poisson regression is that the variance of the response variable Y equals its mean (Myers et al. 2012). In Poisson regression analysis with discrete data, this assumption is usually violated (Abraham et al. 2007): a variance smaller than the mean is called underdispersion, and a variance larger than the mean is called overdispersion. Consul & Famoye (1992) stated that overdispersion is sometimes found in count data. Count data often contain very large integer values together with many zeros, so the variance is quite large. If the assumption is violated, the resulting conclusions are invalid because the standard errors are underestimated. One way to overcome overdispersion is to form models that combine the Poisson distribution with other distributions, either discrete or continuous (mixed Poisson distributions) (Kéry 2010). Of these Poisson mixtures, only a few are commonly used in research because of the complex calculations involved.
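The effect of mixing the Poisson distribution with a continuous distribution can be illustrated with a short simulation. The sketch below is not taken from the study: the mean, the gamma shape, and the sample size are hypothetical values chosen only to show that a Poisson-gamma mixture (which yields negative binomial counts) has a variance well above its mean and more zeros than a plain Poisson sample with the same mean.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
mean = 4.0    # hypothetical mean count
shape = 2.0   # hypothetical gamma shape; smaller values give more overdispersion

# Plain Poisson counts: the variance is close to the mean.
y_pois = rng.poisson(mean, size=n)

# Poisson-gamma mixture: every observation has its own gamma-distributed rate
# with the same overall mean, which yields negative binomial counts.
rates = rng.gamma(shape, scale=mean / shape, size=n)
y_mix = rng.poisson(rates)

print("Poisson  mean, variance:", y_pois.mean(), y_pois.var())
print("Mixture  mean, variance:", y_mix.mean(), y_mix.var())  # theory: mean + mean**2 / shape
print("Share of zeros:", (y_pois == 0).mean(), (y_mix == 0).mean())
```

Under these assumed values the mixture keeps the mean near 4 while its theoretical variance is 4 + 16/2 = 12, which is the kind of behaviour the negative binomial model is meant to capture.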

In ecological and species modelling (Warton, Foster, et al. 2015), the analysis becomes more complicated when it involves latent variables, i.e. unobserved variables. The GLM therefore needs to be extended to a multivariate GLM (Warton, Blanchet, et al. 2015), a type of statistical analysis for data with many predictor variables (Warton 2015); (Joyner et al. 2019) and many response variables, which is very suitable for species modelling (Caraka et al. 2018); (Rahman et al. 2019); (Herliansyah & Fitia 2018). The main challenge in generalized linear latent variable model (GLLVM) estimation is that the likelihood involves integrals over the random variables that have no explicit form unless the response variable is normally distributed. An approximation method is therefore needed to evaluate the integral. Methods currently being developed include Laplace approximations (Bower & Savitsky 2008) and variational approximations (Hui et al. 2017). Another challenge is high-dimensional data (X. D. Wang et al. 2019): a large number of response variables and random variables leads to quite complicated computational problems.

Latent variable models are powerful probabilistic tools for extracting useful latent structure from otherwise unstructured data and have proved useful in numerous applications, especially in ecological modelling (Y. Wang et al. 2012). Latent linear models are a particular case of latent variable models in which observations originate from a linear transformation of latent variables. Despite their modelling simplicity, latent linear models are useful and widely used instruments for data analysis in practice and include, among others, such notable examples as probabilistic principal component analysis, canonical correlation analysis, and independent component analysis. However, it is well known that estimation and inference are often intractable for many latent linear models, and one has to make use of approximate methods, often with no recovery guarantees.

The remainder of the paper is organized as follows. Section II explains niche modelling and presents the multivariate latent GLM and the variational approximation. Section III presents our results and discussion. Finally, conclusions and future research directions are given in Section IV.

II. METHODS

Species distribution models are generally known as envelope modelling, habitat modelling, or niche modelling. The main objective is to estimate the similarity of conditions across regions, using occurrence data and predictor data as model inputs. In general, species distribution modelling uses climate data (Kurniawan, Soesilohadi, et al. 2018) as predictors that describe the outline of the modelling process (Kurniawan, Rahmadi, et al. 2018). When modelling species, the data tend to follow the Poisson distribution (Warton 2005). The Poisson distribution is a distribution for events with a small probability of occurrence, where the occurrence depends on a specific time interval or a particular area and the observations are discrete variables. The characteristics of experiments that follow the Poisson distribution are as follows.
1. Events occur in large populations with small probabilities.
2. Events depend on specific time intervals.
3. Events are part of a counting process.
4. Events arise from repeated trials that follow a binomial distribution.

The probability function of the Poisson distribution can be stated as follows (Ha & Lee 2003).

$P(Y = y) = \frac{e^{-\mu}\mu^{y}}{y!}, \quad y = 0, 1, 2, \dots$ (1)

where $\mu$ is the mean of the random variable Y with the Poisson distribution, and both the mean and the variance are greater than zero. The link function used in the Poisson regression model is the natural logarithm, so $\ln \mu_i = \eta_i$. Thus Poisson regression can be stated as follows.

$\ln \mu_i = \eta_i = x_i^{T}\beta$ (2)

Poisson regression analysis is a regression analysis that is part of the Generalized Linear Model (GLM) family. Poisson regression is used for data with response variables that follow the Poisson distribution (Y ~ Poisson). The important assumption in this analysis is that the variance must be equal to the mean, which is called equidispersion. In some studies this condition is not met: count data with a variance higher than the mean are called overdispersed, and count data with a variance smaller than the mean are called underdispersed.

According to Hinde & Demétrio (1998), there are several reasons why equidispersion may not hold in a model, including heterogeneity among observations (differences between individuals that are not explained by the model) and correlations between individual responses. When equidispersion is not met, Poisson regression is not suitable for modelling the data because the fitted model produces biased parameter estimates. In addition, overdispersion leads to standard errors that are smaller than they should be (underestimated), resulting in inappropriate conclusions. Overdispersion can be checked using the deviance. The variance of the Poisson distribution is equal to the mean (σ2 = µ). Overdispersion is detected when the deviance divided by the degrees of freedom is greater than 1, and underdispersion is detected when this ratio is less than 1. The deviance can be expressed as:

$D = 2\sum_{i=1}^{n}\left[ y_i \ln\left(\frac{y_i}{\hat{\mu}_i}\right) - \left(y_i - \hat{\mu}_i\right) \right]$ (3)

where
n = number of observations
$y_i$ = response variable for the i-th observation, with i = 1, 2, ..., n
$\hat{\mu}_i$ = mean of the response variable y given the predictor values of the i-th observation
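As a minimal sketch of the dispersion check described above, the following code computes the Poisson deviance and divides it by the degrees of freedom. The count vector is made up for illustration, and the fitted means come from an intercept-only model (an assumption made here so that no fitting library is needed; for that model the fitted mean is simply the sample mean).

```python
import numpy as np

def poisson_deviance(y, mu):
    """Poisson deviance 2 * sum(y * log(y / mu) - (y - mu)), taking 0 * log(0) as 0."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    ratio = np.where(y > 0, y / mu, 1.0)  # log(1) = 0 handles the y == 0 terms
    return 2.0 * np.sum(y * np.log(ratio) - (y - mu))

# Hypothetical counts with many zeros and a few large values.
y = np.array([0, 0, 0, 1, 0, 12, 0, 3, 0, 25, 0])
mu_hat = np.full(y.shape, y.mean())  # intercept-only fit: fitted mean = sample mean
df = y.size - 1                      # n minus one estimated parameter

dispersion = poisson_deviance(y, mu_hat) / df
print("deviance / df =", dispersion)  # values well above 1 suggest overdispersion
```

For a real analysis the fitted means would come from the regression model with its predictors, and the degrees of freedom would be n minus the number of estimated coefficients.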

In a GLM, the response y follows a distribution

from the exponential family of distributions

(including normal, binomial, Poisson, and

gamma), and its expectation is modelled as

E(y) = µ (4)

There is a link function g(·) connecting µ with Xβ

such that

g(µ) = Xβ (5)

The variance of y is a function of µ. For a Poisson distribution, for instance, the variance of y is equal to the mean µ (Maengseok Noh & Lee 2007); (Lee & Noh 2012). This relationship between the mean and the variance depends directly on the assumed distribution of y (Lee & Nelder 2001). For all GLMs, the variance of y is the product of a variance function V(µ) and a dispersion parameter φ (del Castillo & Lee 2008). With m being the binomial denominator, Table 1 gives the variances and variance functions V(µ).

Table 1 Variance
Distribution   Variance of y   V(µ)
Normal         φ               1
Poisson        µ               µ
Gamma          φµ²             µ²
Binomial       µ(m- µ)/m       µ(m- µ)/m

For the Poisson and binomial distributions φ = 1, whereas for the normal and gamma distributions φ is a parameter to be estimated. For the normal distribution, φ is simply the residual variance.

Generalized linear latent variable models (GLLVMs) are an extension of GLMs with latent variables (Rahman et al. 2019); (Niku, Hui, et al. 2019); (Niku, Brooks, et al. 2019); (Niku et al. 2017). Suppose $y_{ij}$ are the multivariate responses across species, with $i = 1, \dots, n$ indexing the observational units and $j = 1, \dots, p$ indexing the species. The expectation of $y_{ij}$, denoted $\mu_{ij}$, is modeled through the following relationship

$g(\mu_{ij}) = \eta_{ij}$ (6)

with $\eta_{ij}$ being the linear predictor and $g(\cdot)$ a link function. The linear predictor is similar to that of a GLM, with the inclusion of random effects as follows:

$\eta_{ij} = \alpha_i + \beta_{0j} + x_i^{T}\beta_j + u_i^{T}\theta_j$ (7)

where $\alpha_i$ represents the row effect, $\beta_{0j}$ is a species-specific intercept, $\beta_j$ contains the regression coefficients corresponding to the independent variables $x_i$, and $\theta_j$ are the factor loadings, i.e. quantities describing the interactions across species and connecting the unobserved variables $u_i$ to the responses. In many papers, the distributional choice for the latent variables $u_i$ is a normal distribution with mean zero and constant variance.
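To make the structure of a GLLVM concrete, the sketch below simulates an abundance matrix from a Poisson model with a log link, species intercepts, and one standard normal latent variable per site. It is a simplified illustration under stated assumptions: row effects and covariates are omitted, and all parameter values (`beta0`, `theta`) are randomly generated placeholders, not estimates from the termite data.

```python
import numpy as np

rng = np.random.default_rng(1)
n_sites, n_species, n_latent = 11, 5, 1  # sizes chosen to mirror the species-level data

beta0 = rng.normal(1.0, 0.5, size=n_species)              # hypothetical species intercepts
theta = rng.normal(0.0, 1.0, size=(n_species, n_latent))  # hypothetical factor loadings
u = rng.normal(0.0, 1.0, size=(n_sites, n_latent))        # latent variables, N(0, 1)

eta = beta0 + u @ theta.T  # linear predictor, shape (n_sites, n_species)
mu = np.exp(eta)           # inverse of the log link
y = rng.poisson(mu)        # simulated abundance matrix

print(y.shape)
# With a single latent variable, the species are linked only through u,
# so the correlations between columns of eta are all +1 or -1.
print(np.corrcoef(eta, rowvar=False).round(2))
```

The shared latent variable is what induces correlation between species in the simulated counts; adding more latent variables or covariates would only add further terms to `eta`.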

At the heart of parameter estimation, the data likelihood is replaced by a lower bound that can be computed by variational approximation (VA). VA is generally computationally efficient, at the cost of some bias. If the random vector follows an exponential family, the joint distribution can be written as follows:

(8)

Where is latent variable X and observed

variables Y,<x,y> and have inner product between

x and y, s a vector of parameters

and can be defined

as the realised value of the variables.

However is the log

partition function and ensures distribution is

normalised. The parameter will be estimated

using the method of moments is a generalization

of the Gaussian-Poisson model

(9)

(10)

It is clear that the under-determination of the

estimating equations (2) is a direct result of

reducing the dimensionality of X via the clustering

function k. By projecting the data

for each cluster onto a single

dimension of an auxiliary data object

the latent process could be fitted to

the auxiliary data. Without encountering this

problem:

(11)

Where are some functions and

(12)

(13)

Are analogous to

In terms of both interpretability and computational

convenience, restricting to be a linear function

of its arguments can be easily justiﬁed, and this

approach is taken here. Several choices of function

are available in this regard, including cluster

averages:

(14)

Then we can rewrite to random effects:

(15)

or

(17)

And a representative dimension projection

(18)

One dimension in each random effect is chosen to

represent the dimension whose

average data over replicates

was

closest in norm to the average data over replicates

(19)

(20)

Also, if more than one dimension in

Minimizes the norm then one of them can be

chosen arbitrarily. Once a representative dimension

for each cluster has been chosen,

sample moments from the data can be used to

approximate the expected values and estimate the

parameters of the latent process.

III. SPECIES COUNTS

Indonesia is a tropical country rich in plant diversity, which strongly supports termites. About 80% of Indonesia's land area is suitable habitat for termite development. Termites belong to the order Blattodea; the family Termitidae consists of about 2000 species spread throughout the world. Termite species diversity on the island of Sumatra and in Peninsular Malaysia has not been fully inventoried. In the 1990s, inventories were carried out by several researchers. Nandika et al. (2003) examined the types of termites and their distribution in the DKI Jakarta and Bandung regions. The study found nine species of termites, namely Microtermes insperatus, M. incertoides, Macrotermes gilvus, Odontotermes javanicus, O. malaccensis, Schedorhinotermes javanicus, Coptotermes curvignathus, C. havilandi, C. kalshoveni, C. heimi, and C. travians.

Termites also attack cocoa in the nursery phase, when an attack is especially damaging. Keng (2006) studied termites in cocoa plantations compared to primary forests in Bukit Tawau Park, Sabah. The results showed that termites also attack cocoa plantations, even though the level of diversity is lower than in primary forests. Termites are more abundant in cocoa plantations, so the attacks become severe; plentiful food sources cause this. Termites are easier to identify by working from the family level down to the species level. Termites in tropical regions such as Indonesia have been widely studied, and there are several families, namely Kalotermitidae, Rhinotermitidae, and Termitidae. The Kalotermitidae family is a group of termites that attack and nest in living trees or in dry wood that is not in contact with the soil.

This group of termites is commonly called drywood termites. Drywood termite species that usually attack settlements include Cryptotermes cynocephalus, which attacks buildings made of wood (Bong et al. 2012). The second family is Rhinotermitidae. The distinctive feature of this family is the presence of a flat sclerite on the thorax. This family often nests in wood or other cellulose-containing materials found on the surface of the soil. Termites from this family that often attack settlements and plantations belong to the genera Schedorhinotermes and Coptotermes. A characteristic of the genus Coptotermes is that, when the colony is disturbed, the soldiers release a milky liquid whose function is to paralyze the enemy (Saputra et al. 2017). This fluid comes out of the fontanelle at the front of the head. The third family is Termitidae, which is the group of termites with the most species. Characteristics of the family Termitidae are a saddle-shaped sclerite on the pronotum and a nest centred in the ground, with sponge-like, mushroom-shaped galleries and usually an earth mound. Examples of genera that are known to attack plantations are Macrotermes, Odontotermes, and Microtermes. Termites and ants are both social insects and have almost the same characteristics.

People often call termites white ants, even though termites and ants are not closely related. Taxonomically, termites belong to the order Blattodea (Inward et al. 2007), while ants belong to the order Hymenoptera. In addition, in ant morphology there is a constriction between the thorax and the abdomen, which is a characteristic of the wasp group (Hymenoptera: Apocrita). This constriction is called the petiole, whereas termites have no such constriction and often no clear line between the thorax and the abdomen. Termites generally feed in closed (cryptic) areas, characterized by foraging tubes that rise from the ground and form channels connecting one tube to another, whereas ants tend to forage in open areas. Termite food takes the form of lignocellulose, which consists of cellulose, lignin, and hemicellulose.

Cellulose is a glucose polymer that is rich in fibre, whereas ants eat organic material containing sugars composed of polysaccharides. Social insects divide the tasks in a colony among groups commonly referred to as castes. Termites have a caste system consisting of the reproductive caste, soldiers, and workers. The reproductive caste, consisting of the king and queen as primary reproductives, is responsible for mating and laying eggs.

The reproductive caste is divided into two types, primary and neotenic (secondary). Primary reproductives originate from winged termites (alates, known locally as laron), whereas neotenic reproductives arise when the queen or king dies or disappears from the colony (Nandika et al. 2003). The soldier caste does not reproduce, has reduced eyes, and serves only to defend the colony in the event of an enemy attack. This caste is characterized by well-developed, forward-projecting mandibles, which are used to attack enemies of the colony. The termite group Nasutitermitinae is unique compared to other termite groups: its soldier caste has poorly developed mandibles but a more developed fontanelle (forehead) on the head. Workers can digest wood that contains a lot of cellulose, helped by microbes in the termite's body. These microbes release cellulolytic enzymes that facilitate the degradation of cellulose from wood.

According to Klepzig et al. (2009), the relationships between insects and symbiotic microorganisms are wide-ranging and include mutualism, commensalism, and parasitism. Termites are one example of mutualism between an insect and its symbiotic organisms. Termites have specific relationships with symbiotic microorganisms such as bacteria, fungi, and protozoa. These relationships bring both direct and indirect benefits: they help in food digestion, nutrient absorption, and protection against natural enemies.

According to Sarkar (1998), symbiosis is the shared life of different organisms. Termites consist of a diverse collection of species, broadly divided into lower termites and higher termites. Lower termites are symbiotic with large populations of prokaryotes and protists (single-celled eukaryotes). Higher termites consist only of the family Termitidae, which nevertheless accounts for more than three-quarters of all species, and they are symbiotic mostly with bacteria. The association of cellulolytic protists with digestion in lower termites is a known example of mutualistic symbiosis. Protists produce acetate from cellulose or wood particles taken up by endocytosis, and the acetate is absorbed by the termites as an energy and carbon source.

This study was conducted at 11 sampling locations in oil palm plantations in Riau (Indonesia) and in Johor and Pahang (Peninsular Malaysia). Six locations were sampled in Riau and five in Johor and Pahang, with each sampling location representing a different type of land and farm management.

Fig. 1 Sampling locations range from the Riau palm oil ecosystem (Indonesia) to Johor-Pahang,

(Peninsular Malaysia) (Saputra et al. 2016).

Table 2 Location of Termite Sampling in Indonesia to Peninsular Malaysia
Code    Sampling Location                             Regency / State         Management & Land Type  GPS
IdBCB   Indonesia Batang Cenaku Belilas               Indragiri Hulu, Riau    Private & Clay          00°36’571” S, 102°33’592” E
IdSPTK  Indonesia Sako Pangean Taluk Kuantan          Kuantan Singingi, Riau  Company & Sandland      00°20’314” N, 101°35’089” E
IdRS    Indonesia Redang Seko                         Indragiri Hulu, Riau    Private & Clay          00°12’805” S, 102°17’032” E
IdSgPK  Indonesia Sungai Pagar Kampar                 Kampar, Riau            Private & Peat          00°15’559” N, 101°24’623” E
IdCPSK  Indonesia Central Plantation Services Kampar  Kampar, Riau            Private & Peat          00°15’379” N, 101°35’979” E
IdFRGB  Indonesia First Resources Group Bengkalis     Bengkalis, Riau         Company & Peat          01°20’968” N, 102°01’441” E
MyBPM   Malaysia Bukit Pasir Muar                     Muar, Johor             Private & Clay          02°06’0.97” N, 102°37’02.4” E
MyFKT   Malaysia Felda Kahang Timur                   Kahang, Johor           Company & Clay          02°87’13.9” N, 103°28’55.9” E
MyFNT   Malaysia Felda Nitar Timur                    Nitar, Johor            Company & Clay          02°22’14.5” N, 103°45’47.2” E
MyLER   Malaysia Ladang Endau Rompin                  Endau, Pahang           Company & Clay          02°36’12.4” N, 103°32’47.9” E
MyLSK   Malaysia Ladang Sungai Kemelai                Rompin, Johor           Company & Sandland      02°36’19.4” N, 103°30’39.9” E

3.1 Counts Based on Species

This simulation uses a non-public dataset collected in previous research on termites in Riau and Peninsular Malaysia (Saputra et al. 2016); (Nur-Atiqah et al. 2017); (Halim et al. 2018); (Saputra et al. 2018); (Zaki et al. 2019). In the first stage, the modelling considers only the groups of species, as seen in Figure 2. Coptotermitinae was not found in IdSgPK, MyBPM, MyFKT, or MyFNT but was found in IdFRGB and IdCPSK. Rhinotermitinae was not found in IdBCB or MyBPM but was found in IdCPSK. Macrotermitinae was not found in IdFRGB, IdCPSK, or MyLER, and Nasutitermitinae was not found in IdRS, IdSgPK, or MyBPM. In contrast, Termitinae was found at all research sites (Saputra et al. 2016). A total of 522 termites were successfully sampled in oil palm plantations from Belilas (Riau Province, Indonesia) to Endau (Johor-Pahang, Peninsular Malaysia). Among these individuals, this study recorded five subfamilies from two families: the subfamilies Coptotermitinae and Rhinotermitinae of the family Rhinotermitidae, and the subfamilies Termitinae, Macrotermitinae, and Nasutitermitinae of the family Termitidae.

The highest abundance of termites was recorded from the family Termitidae (43 species; 349 populations) in three subfamilies, namely Termitinae (17 species; 165 populations), Macrotermitinae (12 species; 65 populations), and Nasutitermitinae (14 species; 119 populations). For Rhinotermitidae, 15 species (173 populations) were recorded from two subfamilies, Coptotermitinae (5 species; 60 populations) and Rhinotermitinae (10 species; 113 populations).

Based on the number of species, MyLSK recorded the highest number with 26 species, followed by IdCPSK (21 species), MyFKT and MyLER (15 species), MyFNT (12 species), IdSPTK (11 species), MyBPM (8 species), IdSgPK and IdFRGB (7 species), and IdRS (5 species). IdBCB recorded the fewest, with only three species.

Fig. 2 Species Counts in Riau and Peninsular Malaysia

Comparing species richness and diversity with temperature and humidity, abundance tends to increase with increasing humidity. Figure 3 shows that abundance decreases as soon as temperature and humidity decrease. However, temperature does not show a significant increase or decrease across the sampling locations. A different result appears in the joint pattern of temperature and humidity: as temperature increases, humidity decreases, but this does not affect the abundance of termites. Species richness shows a pattern similar to abundance.

The structure of the termite community may be influenced by environmental factors. In the observed pattern, humidity decreases as temperature increases, but this does not affect termite abundance or species richness, even though temperature and humidity are the primary physical factors affecting termites. These differing results may arise because data on environmental factors need to be collected more thoroughly and carefully so that the existing patterns are not misinterpreted.

Fig. 3. Heatmap Diversity, Richness and Climatic

Factors

We then fit GLLVMs to examine the distribution at the species level and compare the models by AIC. A good model is the one with the minimum AIC among all candidate models (Joyner et al. 2019), so the model with LV 1 is chosen. Since the best distribution is negative binomial, in equation (6) we let the response follow the negative binomial distribution with mean μ and a variance that depends on μ and a dispersion parameter, using the log link function.

Table 3. Selected distribution based on species counts
Distribution        LV   Log-likelihood   AIC        AICc       BIC
negative.binomial   1    -171.9731        373.9463   277.9463   379.9147
negative.binomial   2    -175.2916        388.5833   304.1388   396.1433
Poisson             1    -294.5091        609.0182   Inf        612.9971
Poisson             2    -186.0558        400.1117   295.1117   405.6822
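The criteria in Table 3 can be recomputed from the log-likelihoods with the standard definitions AIC = -2ℓ + 2k, BIC = -2ℓ + k ln(n), and AICc = AIC + 2k(k+1)/(n-k-1). The sketch below does this under two assumptions that are not stated in the text: that n is the 11 sampling locations, and that the parameter counts k are 5 intercepts plus the loadings (5 for one latent variable, 9 for two) plus 5 dispersion parameters for the negative binomial models. With these assumed values the printed numbers agree with Table 3 to within rounding.

```python
import math

def information_criteria(loglik, k, n):
    """Return (AIC, AICc, BIC) for a log-likelihood, k parameters and n observations."""
    aic = -2.0 * loglik + 2.0 * k
    bic = -2.0 * loglik + k * math.log(n)
    denom = n - k - 1
    # The small-sample correction is undefined when denom == 0 and negative when denom < 0.
    aicc = math.inf if denom == 0 else aic + 2.0 * k * (k + 1) / denom
    return aic, aicc, bic

n_sites = 11  # assumption: the 11 sampling locations are the observations
# (log-likelihood from Table 3, assumed number of parameters k)
models = {
    "negative.binomial, LV 1": (-171.9731, 15),
    "negative.binomial, LV 2": (-175.2916, 19),
    "Poisson, LV 1": (-294.5091, 10),
    "Poisson, LV 2": (-186.0558, 14),
}
for name, (loglik, k) in models.items():
    aic, aicc, bic = information_criteria(loglik, k, n_sites)
    print(f"{name}: AIC={aic:.4f} AICc={aicc:.4f} BIC={bic:.4f}")
```

If these assumptions hold, every model has at least as many parameters as n - 1, so the AICc correction term is negative or infinite. That would explain why the AICc values in Table 3 fall below the AIC values, and it suggests reading AICc with caution here and relying on AIC and BIC for the comparison.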

We have the same relationship between μ and like

the Poisson model. The conditional distribution on

is given by (Caraka et al. 2018).

(21)

Function of log-likelihood for negative binomial

response can be written:

(22)

Where

(23)

And

(24)

The estimated parameters, including the intercepts, are given in Table 4, and the species ordination is visualised in Figure 4.

Table 4. Parameter GLLVM Based On Species Counts
Subfamily          Intercept   theta.LV1    Dispersion
Coptotermitinae    0.7243423   1.4167729    2.2014351
Rhinotermitinae    1.8027498   1.0410484    1.0636533
Termitinae         2.6231070   -0.4822085   0.8501716
Macrotermitinae    1.4606034   -1.0184050   0.7945360
Nasutitermitinae   2.3342301   0.3717223    1.9698468

Fig. 4 Ordination Species Based on Negative Binomial GLLVM

Figure 5 shows that, among the species counts, Rhinotermitinae has a perfect negative correlation (-1) with Termitinae and Macrotermitinae, while Termitinae and Macrotermitinae have a perfect positive correlation (+1) with each other.
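One way to see where correlations of exactly -1 and +1 can come from is to compute the correlation implied by the factor loadings alone. The sketch below assumes, without confirmation from the text, that the correlations in Figure 5 are derived only from the theta.LV1 loadings in Table 4. With a single latent variable the implied covariance matrix has rank one, so every pairwise correlation is +1 when two loadings share a sign and -1 when they differ.

```python
import numpy as np

species = ["Coptotermitinae", "Rhinotermitinae", "Termitinae",
           "Macrotermitinae", "Nasutitermitinae"]
# theta.LV1 column of Table 4 (one latent variable).
theta = np.array([[1.4167729], [1.0410484], [-0.4822085], [-1.0184050], [0.3717223]])

cov = theta @ theta.T         # covariance implied by the loadings alone
sd = np.sqrt(np.diag(cov))
cor = cov / np.outer(sd, sd)  # rank one, so every entry is +1 or -1

for name, row in zip(species, np.round(cor, 2)):
    print(f"{name:17s}", row)
```

Under this assumption the perfect correlations reflect the one-dimensional latent structure of the selected model rather than a perfect empirical association between the counts.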

Fig. 5 Species Correlation

3.2 Counts Based on Genus

Having obtained information on the species counts, we next fit the model at the genus level, again comparing the negative binomial and Poisson distributions. Table 4 shows that the Poisson model with two latent variables fits better than the other models, with the lowest AIC (1080.345). After fitting the Poisson model with LV 2, we obtain the genus-based parameters in Table 5. The intercept values are visualised in Figure 6 as the genus ordination.

Table 4. Selected Distribution Based on Genus Counts
Distribution        LV   Log-likelihood   AIC        AICc       BIC
negative.binomial   1    -474.9195        1117.839   924.8659   1151.262
negative.binomial   2    -474.1781        1170.356   924.178    1214.523
Poisson             1    -499.4764        1110.953   972.1703   1133.235
Poisson             2    -457.1727        1080.345   889.3317   1113.371

Figure 6, Figure 7, and Figure 8, respectively. We

can form a variational bound of the likelihood

on the . Then, the variational

approximation can be relies on maximizing the lower

bound over a tractable of :

(24)

Where

(25)

Then, term by term inequality in equation (25) can

be written:

(26)

In our variational approximation, we can set and

choose of product distribution of q-dimensional

multivariate cases with diagonal covariance

matrices:

(27)

Where

In the Poisson case, the variational expectation of the non-linear part involving b (the matrix of conditional expectations A) can be expressed as:

(28)
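For orientation, the generic form of the variational lower bound and of the Poisson expectation can be written as follows. This is a sketch under stated assumptions, not a reconstruction of equations (24) to (28): the notation (y for the counts, u_i for the latent variables of site i, q for the variational density, \beta_{0j} and \theta_j for the intercept and loadings of taxon j) is assumed, and q(u_i) is taken to be Gaussian with mean a_i and covariance A_i.

```latex
% Generic variational lower bound (assumed notation, see lead-in)
\log p(y \mid \Psi) \;\ge\;
  \mathbb{E}_{q}\!\left[\log p(y, u \mid \Psi)\right]
  - \mathbb{E}_{q}\!\left[\log q(u)\right]

% Poisson response with log link, assuming q(u_i) = N(a_i, A_i):
% the expectation of the non-linear term follows from the Gaussian
% moment generating function
\mathbb{E}_{q}\!\left[\exp\!\left(\beta_{0j} + u_i^{\top}\theta_j\right)\right]
  = \exp\!\left(\beta_{0j} + a_i^{\top}\theta_j
      + \tfrac{1}{2}\,\theta_j^{\top} A_i\,\theta_j\right)
```

The gap in the first inequality is the Kullback-Leibler divergence between q(u) and the true conditional density of u given y, which is why maximizing the bound over q tightens the approximation.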

The choice of a good starting value is crucial in iterative procedures, as it helps the algorithm start in the basin of attraction of a good local maximum and can substantially speed up convergence. Here we initialize by fitting a Poisson GLLVM to Y and then extracting the regression coefficients and the variance-covariance matrix of the Pearson residuals. The starting values are these regression coefficients and the best rank-q approximation of this variance-covariance matrix, obtained by keeping the first q dimensions of its singular value decomposition. We set the other starting values as .

In general, the IdBCB region differs significantly from the other areas, as do IdFRGB and MyLER. However, MyFNT, IDRs, and IdRS show similar diversity values for this species.
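The starting-value step described above (Pearson residuals, their variance-covariance matrix, and a rank-q truncation of its singular value decomposition) can be sketched as follows. This is a simplified illustration, not the authors' implementation: the helper `rank_q_start` is hypothetical, and an intercept-only Poisson fit (fitted mean equal to the column mean) stands in for the full Poisson model.

```python
import numpy as np

def rank_q_start(Y, q):
    """Hypothetical helper: starting intercepts and loadings from Pearson residuals."""
    Y = np.asarray(Y, dtype=float)
    mu = Y.mean(axis=0)                      # intercept-only Poisson fit: mean per column
    mu = np.where(mu > 0, mu, 1e-8)          # guard against all-zero columns
    resid = (Y - mu) / np.sqrt(mu)           # Pearson residuals for a Poisson mean
    sigma = np.cov(resid, rowvar=False)      # variance-covariance matrix of residuals
    U, s, Vt = np.linalg.svd(sigma)
    sigma_q = (U[:, :q] * s[:q]) @ Vt[:q, :] # best rank-q approximation of sigma
    B0 = U[:, :q] * np.sqrt(s[:q])           # starting loadings with B0 @ B0.T = sigma_q
    return np.log(mu), sigma_q, B0

# Simulated counts (illustration only): 30 sites, 5 taxa, q = 2 latent variables
rng = np.random.default_rng(0)
Y = rng.poisson(2.0, size=(30, 5))
b0, sigma_q, B0 = rank_q_start(Y, q=2)
print(b0.shape, sigma_q.shape, B0.shape)
```

Scaling the leading singular vectors by the square roots of the singular values gives a loading matrix whose cross-product reproduces the rank-q approximation, which is one convenient way to turn the truncated decomposition into initial loadings.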

Fig. 6 Ordination Genus Based on Poisson- GLLVM

Fig. 7 Correlation Based on Genus

Fig. 8 Q Graph Based on Genus

Table 5. Parameters Based on Genus GLLVM

Species                          Intercept    theta.LV1    theta.LV2
Coptotermes curvignathus         0.4579891    -0.5180037   0.0000000
Coptotermes kalshoveni           0.4945442    -0.7563808   -0.3912430
Coptotermes sepangensis          0.7098347    -0.5713176   -0.4052082
Coptotermes havilandi            0.1429008    -0.3260377   -0.2614940
Parrhinotermes aequalis          0.1861423    0.3279931    -0.6498629
Parrhinotermes pygmaeus          0.3342082    -0.6126105   -0.3818074
Parrhinotermes spp. A            0.0000002    0.0000001    -0.0000001
Schedorhinotermes brevialatus    0.6323911    -0.6129293   -0.6993178
Schedorhinotermes javanicus      0.2002901    -0.2152474   -0.1701744
Schedorhinotermes mediobscurus   0.4900498    -0.3467540   -0.2655336
Schedorhinotermes malaccensis    0.3713058    -0.7299368   -0.5104836
Schedorhinotermes sarawakensis   0.5179787    -0.8183993   0.2943385
Prohamitermes mirabilis          0.0700554    0.0830517    -0.1603465
Microcerotermes dubius           0.2974319    -0.0924066   -0.1246402
Termes rostratus                 0.8005997    0.8291831    0.2435449
Procapritermes spp. G            0.2316985    0.0966011    0.1023581
Pericapritermes mohri            0.2982160    0.2042979    -0.3159980
Pericapritermes semarangi        0.0000001    0.0000000    0.0000001
Macrotermes gilvus               0.4152051    -0.2335550   1.1629640
Macrotermes malacensis           0.3328945    0.4642242    -0.0326502
Nasutitermes havilandi           0.3280501    -0.5767489   -0.3486288
Nasutitermes matangensis         0.5653070    0.1825307    -0.5255028
Nasutitermes neopravus           0.1529289    -0.0052302   -0.1658990
Nasutitermes proatripennis       0.6173534    0.4406987    -1.0388870
Nasutitermes roboratus           0.1723334    0.0162690    -0.3598069
Bulbitermes constrictiformis     0.3647910    0.0729814    0.1236907
Bulbitermes constrictoides       0.3067339    0.0637614    0.0542762
Bulbitermes neopasullis          0.0867176    0.0193210    0.0165708
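The parameters in Table 5 enter the model through a linear predictor. As a minimal sketch, assuming the usual GLLVM form with a log link (linear predictor equal to the intercept plus the inner product of the site's latent scores with the loadings, and Poisson mean equal to its exponential), the expected count for a taxon at a site can be computed as below. Three rows are copied from Table 5. The latent scores are hypothetical values chosen for illustration, not estimates from the paper.

```python
import numpy as np

# Rows copied from Table 5: (intercept, theta.LV1, theta.LV2)
params = {
    "Coptotermes curvignathus": (0.4579891, -0.5180037, 0.0000000),
    "Termes rostratus": (0.8005997, 0.8291831, 0.2435449),
    "Macrotermes gilvus": (0.4152051, -0.2335550, 1.1629640),
}

def poisson_mean(intercept, loadings, scores):
    """Poisson mean under a log link: exp(intercept + scores . loadings)."""
    eta = intercept + float(np.dot(scores, loadings))   # linear predictor
    return float(np.exp(eta))

scores = np.array([0.5, -0.2])   # hypothetical latent scores for one site
means = {
    name: poisson_mean(b0, (t1, t2), scores)
    for name, (b0, t1, t2) in params.items()
}
for name, m in means.items():
    print(f"{name}: {m:.3f}")
```

At latent scores of zero the mean reduces to the exponential of the intercept, so the intercept column sets each taxon's baseline abundance and the loadings describe how that abundance shifts along the two ordination axes.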

For Termitinae, the species found are Prohamitermes mirabilis, Microcerotermes dubius, M. havilandi, Termes rostratus, Procapritermes sp. G, Pericapritermes buitenzorgi, and P. mohri. They are distributed as follows. P. mirabilis is found in IdSgPK, M. dubius in IdCPSK, and M. havilandi in IdFRGB. T. rostratus is found in IdRS, IdBCB, IdSgPK, and IdCPSK. Procapritermes sp. G and P. buitenzorgi are found in IdSgPK, and P. mohri is found in IdSPTK and IdSgPK. The diversity based on genus is shown in Figure 9. The counts contain many zeros, which means that some species were not recorded at particular locations.

Fig. 9 Heatmap Based on Genus

Species from the subfamily Termitinae vary widely across locations, as Figure 10 clearly shows. P. mirabilis, M. dubius, and M. serrula are arboreal wood feeders, while T. rostratus and P. sp. G are wood feeders or intermediate feeders. T. rostratus nests as an inquiline, meaning that it lives in nests built by other species, whereas P. sp. G builds a hypogeal nest. P. buitenzorgi and P. mohri are organic-soil feeders and nest hypogeally.

Fig. 10 Heatmap Based on Family in Each Location

IV. CONCLUSION AND FUTURE WORK

In this study, we simulated the diversity of termite species by applying multivariate latent generalized linear models (GLLVM). To estimate the GLLVM parameters, we employed the variational approximation and evaluated the fitted models with AIC, AICc, and BIC. At the species level, the best model was the negative binomial with LV 1, with AIC: 373.9463, AICc: 277.9463, and BIC: 379.9147. At the genus level, in contrast, the best distribution was the Poisson with LV 2, with AIC: 1080.345, AICc: 889.3317, and BIC: 1113.371. In general, GLLVM represents the diversity and richness of termites in Riau and Peninsular Malaysia well. In future work, we will compare the variational approximation with the Laplace approximation in terms of computational time. We will also try other distributions: zero-inflated Poisson (Loeys et al. 2012), zero-inflated negative binomial (Hall 2000), beta-binomial (Kim & Lee 2019), Tweedie (Shono 2008), extended Tweedie (Bonat et al. 2018), hurdle (Zeileis et al. 2008), and extended hurdle negative binomial (Lee et al. 2017; Noh & Lee 2019).

Acknowledgement. This paper is supported by the Ministry of Science and Technology, Taiwan, under Grants MOST-107-2221-E-324-018-MY2 and MOST-106-2218-E-324-002, in collaboration with the Hierarchical Generalized Linear Model (H-GLM) Lab, Department of Statistics, College of Natural Sciences, Seoul National University, and the Department of Statistics, Padjadjaran University. This research is partially supported by the Bioinformatics Data Science Research Center, Bina Nusantara University.

Author Contributions:

Conceptualization: Rezzy Eko Caraka, Youngjo

Lee, Rung Ching Chen, Maengseok Noh.

Data curation: Rezzy Eko Caraka, Andi Saputra.

Formal analysis: Rezzy Eko Caraka.

Investigation: Rezzy Eko Caraka, Youngjo Lee,

Rung Ching Chen, Maengseok Noh.

Methodology: Rezzy Eko Caraka, Youngjo Lee,

Rung Ching Chen, Maengseok Noh.

Software: Rezzy Eko Caraka.

Validation: Rezzy Eko Caraka, Youngjo Lee,

Rung Ching Chen, Maengseok Noh.

Visualization: Rezzy Eko Caraka

Writing – original draft: Rezzy Eko Caraka,

Youngjo Lee, Rung Ching Chen, Maengseok

Noh.

Writing – review & editing: Rezzy Eko Caraka,

Youngjo Lee, Rung Ching Chen, Maengseok

Noh, Toni Toharudin, Bens Pardamean, Andi

Saputra.

REFERENCES

Abraham, V. M., Walpole, R. E. & Myers, R. H.

2007. Probability and Statistics for

Engineers and Scientists. The Mathematical

Gazette. doi:10.2307/3616039

Bonat, W. H., Jørgensen, B., Kokonendji, C. C.,

Hinde, J. & Demétrio, C. G. B. 2018.

Extended Poisson–Tweedie: Properties and

regression models for count data. Statistical

Modelling.

doi:10.1177/1471082X17715718

Bong, M. C. F., King, P. J. H., Ong, K. H. &

Mahadi, N. M. 2012. Termites assemblages

in oil palm plantation in Sarawak, Malaysia.

Journal of Entomology.

doi:10.3923/je.2012.68.78

Bower, B. & Savitsky, T. 2008. Laplace

Approximation. Graphical Models.

Caraka, R. E., Shohaimi, S., Kurniawan, I. D.,

Herliansyah, R., Budiarto, A., Sari, S. P. &

Pardamean, B. 2018. Ecological Show Cave

and Wild Cave: Negative Binomial Gllvm’s

Arthropod Community Modelling.

Procedia Computer Science 135: 377–384.

doi:10.1016/j.procs.2018.08.188

Consul, P. C. & Famoye, F. 1992. Generalized

poisson regression model. Communications

in Statistics - Theory and Methods.

doi:10.1080/03610929208830766

del Castillo, J. & Lee, Y. 2008. GLM-methods for

volatility models. Statistical Modelling

8(3): 263–283.

doi:10.1177/1471082X0800800303

Ha, I. Do & Lee, Y. 2003. Estimating Frailty

Models via Poisson Hierarchical

Generalized Linear Models. Journal of

Computational and Graphical Statistics.

doi:10.1198/1061860032256

Halim, M., Nasir, D. M., Saputra, A., Ayob, Z. A.,

Ahmad, S. Z. S., Din, A. M. M.,

Khairuddin, W. N. W. M., et al. 2018.

Komuniti makroartropoda yang berasosiasi

dengan ekosistem sawit di atas jenis tanah

yang berbeza. Serangga 22(3): 38–55.

Hall, D. B. 2000. Zero-inflated poisson and

binomial regression with random effects: A

case study. Biometrics. doi:10.1111/j.0006-341X.2000.01030.x

Herliansyah, R. & Fitia, I. 2018. Latent variable

models for multi-species counts modeling

in ecology. Biodiversitas Journal of

Biological Diversity 19(5): 1871–1876.

doi:10.13057/biodiv/d190538

Hinde, J. & Demétrio, C. G. B. 1998.

Overdispersion: Models and estimation.

Computational Statistics and Data

Analysis. doi:10.1016/S0167-9473(98)00007-3

Hui, F. K. C., Warton, D. I., Ormerod, J. T.,

Haapaniemi, V. & Taskinen, S. 2017.

Variational Approximations for

Generalized Linear Latent Variable

Models. Journal of Computational and

Graphical Statistics.

doi:10.1080/10618600.2016.1164708

Inward, D., Beccaloni, G. & Eggleton, P. 2007.

Death of an order: A comprehensive

molecular phylogenetic study confirms that

termites are eusocial cockroaches. Biology

Letters. doi:10.1098/rsbl.2007.0102

Joyner, C., McMahan, C., Baurley, J. &

Pardamean, B. 2019. A two-phase Bayesian

methodology for the analysis of binary

phenotypes in genome-wide association

studies. Biometrical Journal 1–11.

doi:10.1002/bimj.201900050

Keng, W. . 2006. Species comparison of termite

(Isoptera) in primary forest of Tawau Hill

Park, Sabah and adjacent cocoa plantation

area. University Malaysia Sabah.

Kéry, M. 2010. Poisson Mixed-Effects Model

(Poisson GLMM). Introduction to

WinBUGS for Ecologists, pp. 203–209.

doi:10.1016/B978-0-12-378605-0.00016-8

Kim, G. & Lee, Y. 2019. Marginal versus

conditional beta-binomial regression

models. Statistical Methods in Medical

Research. doi:10.1177/0962280217735703

Klepzig, K. D., Adams, A. S., Handelsman, J. &

Raffa, K. F. 2009. Symbioses: A Key Driver

of Insect Physiological Processes,

Ecological Interactions, Evolutionary

Diversification, and Impacts on Humans.

Environmental Entomology.

doi:10.1603/022.038.0109

Kurniawan, I. D., Rahmadi, C., Caraka, R. E. &

Ardi, T. A. 2018. Short Communication:

Cave-dwelling Arthropod community of

Semedi Show Cave in Gunungsewu Karst

Area, Pacitan, East Java, Indonesia.

Biodiversitas 19(3): 857–866.

doi:10.13057/biodiv/d190314

Kurniawan, I. D., Soesilohadi, R. C. H., Rahmadi,

C., Caraka, R. E. & Pardamean, B. 2018.

The difference on Arthropod communities’

structure within show caves and wild caves

in Gunungsewu Karst area, Indonesia.

Ecology, Environment and Conservation

24(1).

Kwon, S., Oh, S. & Lee, Y. 2016. The use of

random-effect models for high-dimensional

variable selection problems. Computational

Statistics and Data Analysis 103(1): 401–

412.

Lee, Y. & Nelder, J. 2001. Modelling and

analysing correlated non-normal data.

Statistical Modelling 1(1): 3–16.

doi:10.1177/1471082X0100100102

Lee, Y. & Noh, M. 2012. Modelling random

effect variance with double hierarchical

generalized linear models. Statistical

Modelling 12(6): 487–502.

doi:10.1177/1471082X12460132

Lee, Y., Rönnegård, L. & Noh, M. 2017. Data

analysis using hierarchical generalized

linear models with R. Data Analysis Using

Hierarchical Generalized Linear Models

with R. doi:10.1201/9781315211060

Loeys, T., Moerkerke, B., de Smet, O. & Buysse,

A. 2012. The analysis of zero-inflated count

data: Beyond zero-inflated Poisson

regression. British Journal of Mathematical

and Statistical Psychology.

doi:10.1111/j.2044-8317.2011.02031.x

Myers, R. H., Montgomery, D. C., Vining, G. G.

& Robinson, T. J. 2012. Generalized Linear

Models: With Applications in Engineering

and the Sciences: Second Edition.

Generalized Linear Models: With

Applications in Engineering and the

Sciences: Second Edition.

doi:10.1002/9780470556986

Nandika, D., Rismayadi, Y., Diba, F. & Harun, J.

. 2003. Rayap: Biologi dan

Pengendaliaannya. Surakarta:

Muhammadiyah University Press.

Niku, J., Brooks, W., Herliansyah, R., Hui, F. K.

C., Taskinen, S. & Warton, D. I. 2019.

Efficient estimation of generalized linear

latent variable models. PLoS ONE 14(5): 1–

20. doi:10.1371/journal.pone.0216129

Niku, J., Hui, F. K. C., Taskinen, S. & Warton, D.

I. 2019. gllvm: Fast analysis of multivariate

abundance data with generalized linear

latent variable models in R. Methods in

Ecology and Evolution 1–10.

doi:10.1111/2041-210X.13303

Niku, J., Warton, D. I., Hui, F. K. C. & Taskinen,

S. 2017. Generalized Linear Latent Variable

Models for Multivariate Count and Biomass

Data in Ecology. Journal of Agricultural,

Biological, and Environmental Statistics.

doi:10.1007/s13253-017-0304-7

Noh, M, Lee, Y., Oud, J. H. . & Toharudin. 2019.

Hierarchical likelihood approach to non-

Gaussian factor analysis. Journal of

Statistical Computation and Simulation

89(3): 1555–1573.

Noh, Maengseok & Lee, Y. 2007. Robust

modeling for inference from generalized

linear model classes. Journal of the

American Statistical Association 102(479):

1059–1072.

doi:10.1198/016214507000000518

Noh, Maengseok & Lee, Y. 2019. Extended

negative binomial hurdle models. Statistical

Methods in Medical Research.

doi:10.1177/0962280218766567

Nur-Atiqah, J., Saputra, A., Mohammad Esa, M.

F., Shafuraa, O., Billy, A. N. A., Mohd

Yaziz, N. A. A. & Faszly, R. 2017.

Coptotermes sp. (Rhinotermitidae:

Coptotermitinae) infestation pattern shifts

through time in oil palm agroecosystem.

Serangga 22(2): 15–31.

Rahman, D. A., Herliansyah, R., Rianti, P.,

Rahmat, U. M., Firdaus, A. Y. &

Syamsudin, M. 2019. Ecology and

Conservation of the Endangered Banteng

(Bos javanicus) in Indonesia Tropical

Lowland Forest. HAYATI Journal of

Biosciences 26(2): 68–80.

Saputra, A., Halim, M., Jalaludin, N.-A., Hazmi,

I. R. & Faszly Rahim. 2017. Effects of Day

Time Sampling on The Activities of

Termites in Oil Palm Plantation at

Malaysia-Indonesia. Serangga 22(1): 23–

32.

Saputra, A., Jalaludin, N. A., Hazmi, I. R. &

Rahim, F. 2016. Termite assemblages from

oil palm agroecosystems across Riau

Province, Sumatra, Indonesia. AIP

Conference Proceedings.

doi:10.1063/1.4966841

Saputra, A., Muhammad Nasir, D., Jalaludin, N.

A., Halim, M., Bakri, A., Mohammad Esa,

M. F., Riza Hazmi, I., et al. 2018.

Composition of termites in three different

soil types across oil palm agroecosystem

regions in Riau (Indonesia) and Johor

(Peninsular Malaysia). Journal of Oil Palm

Research. doi:10.21894/jopr.2018.0054

Sarkar, S. 1998. Evolution by association: A

history of symbiosis. Studies in History and

Philosophy of Science Part C: Studies in

History and Philosophy of Biological and

Biomedical Sciences. doi:10.1016/s1369-8486(98)00010-7

Shono, H. 2008. Application of the Tweedie

distribution to zero-catch data in CPUE

analysis. Fisheries Research.

doi:10.1016/j.fishres.2008.03.006

Wang, X. D., Chen, R. C., Yan, F., Zeng, Z. Q. &

Hong, C. Q. 2019. Fast Adaptive K-Means

Subspace Clustering for High-Dimensional

Data. IEEE Access 7: 42639–42651.

doi:10.1109/ACCESS.2019.2907043

Wang, Y., Naumann, U., Wright, S. T. & Warton,

D. I. 2012. mvabund - an R package for

model-based analysis of multivariate

abundance data. Methods in Ecology and

Evolution. doi:10.1111/j.2041-210X.2012.00190.x

Warton, D. I. 2005. Many zeros does not mean

zero inflation: Comparing the goodness-of-

fit of parametric models to multivariate

abundance data. Environmetrics.

doi:10.1002/env.702

Warton, D. I. 2015. New opportunities at the

interface between ecology and statistics.

Methods in Ecology and Evolution.

doi:10.1111/2041-210X.12345

Warton, D. I., Blanchet, F. G., O’Hara, R. B.,

Ovaskainen, O., Taskinen, S., Walker, S. C.

& Hui, F. K. C. 2015. So Many Variables:

Joint Modeling in Community Ecology.

Trends in Ecology and Evolution.

doi:10.1016/j.tree.2015.09.007

Warton, D. I., Foster, S. D., De’ath, G., Stoklosa,

J. & Dunstan, P. K. 2015. Model-based

thinking for community ecology. Plant

Ecology. doi:10.1007/s11258-014-0366-3

Zaki, N. I. A., Nasir, D. M., Aziz, A., Azhari, L.

H., Saputra, A., Halim, M., Muslim, S. A.,

et al. 2019. Diversity of ground beetles

(Coleoptera: Carabidae) in oil palm

plantation in Endau-Rompin, Pahang,

Malaysia. Serangga 24(1): 91–102.

Zeileis, A., Kleiber, C. & Jackman, S. 2008.

Hurdle regression models in R. Journal of

Statistical Software.

REZZY EKO CARAKA received the B.S. degree (S.Si) from the Department of Statistics, Diponegoro University, and the Master of Science by research (MSc-Res) from the School of Mathematical Sciences, the National University of Malaysia. In 2019 he started a PhD in the College of Informatics, Chaoyang University of Technology, Taiwan. He is a researcher at the Bioinformatics & Data Science Research Center, University of Bina Nusantara (BINUS), and the Department of Statistics, Padjadjaran University. He is also a fellow researcher in the GLM-H lab, Department of Statistics, Seoul National University, South Korea, and a co-founder of Statistical Calculator (STATCAL). His research interests include Statistical Climatology, Climate Modeling, Ecological Modelling, Statistical Machine Learning, and Large-scale Optimization.

Email: rezzyekocaraka@gmail.com

RUNG CHING CHEN

received a B.S. from the

Department of Electrical

Engineering in 1987, and

an M. S. from the Institute

of Computer Engineering

in 1990, both from

National Taiwan

University of Science and

Technology, Taipei,

Taiwan. In 1998, he received his Ph.D. from the

Department of Applied Mathematics in computer

science, National Chung Hsing University. He is

now a distinguished professor in the Department

of Information Management, Taichung, Taiwan.

His research interests include network

technology, pattern recognition, and knowledge

engineering, IoT and data analysis, and

applications of Artificial Intelligence.

Email: crching@cyut.edu.tw

YOUNGJO LEE is a full Professor in the Department of Statistics, College of Natural Sciences, Seoul National University (SNU). He has published four books on Generalized Linear Models with Random Effects and GLM Likelihood. His research interests include Generalized Linear Models, Hierarchical Generalized Linear Models, Random Effects, Data Science, and Statistical Software Development.

MAENGSEOK NOH received the B.S., M.S., and Ph.D. degrees from the Department of Statistics, Seoul National University, in 1996, 1998, and 2005, respectively. His thesis was on the analysis of binary data and robust modelling via hierarchical likelihood. Since 2006, he has been a Professor with the Department of Statistics, Pukyong National University, Busan, Korea. His current research interests are applications and software development for hierarchical generalized linear models, development of methodology for zero-inflated Poisson models with spatial correlation, and the hierarchical approach to non-Gaussian factor analysis.

TONI TOHARUDIN currently works at the Department of Statistics, Universitas Padjadjaran, where he conducts research in statistics. He received a Master of Science from the University of Leuven, Belgium (2004-2005) and a Ph.D. in Spatial Sciences from the University of Groningen (2007–2010). He is head of the research group in time series and regression.

BENS PARDAMEAN has over thirty years of global experience in information technology, bioinformatics, and education. After successfully leading the Bioinformatics Research Interest Group, he currently holds a dual appointment as the Director of the Bioinformatics & Data Science Research Center (BDSRC) and as an Associate Professor of Computer Science at the University of Bina Nusantara (BINUS) in Jakarta, Indonesia. He earned a doctoral degree in informative research from the University of Southern California (USC), as well as a master's degree in computer education and a bachelor's degree in computer science from California State University, Los Angeles.

ANDI SAPUTRA received the B.S. degree (S.Si) from the Department of Biology, Riau University, and the Master of Science by Research (M.Sc) in Zoology from the School of Environmental and Natural Resource Science, Universiti Kebangsaan Malaysia. His research interests are Zoology, Entomology, and Ecology.