IOSR Journal of Business and Management (IOSR-JBM)
e-ISSN: 2278-487X, p-ISSN: 2319-7668. Volume 23, Issue 4. Ser. II (April 2021), PP 23-34
www.iosrjournals.org
DOI: 10.9790/487X-2304022335 www.iosrjournals.org 23 | Page
Developing a Research Methodology with the Application of
Explorative Factor Analysis and Regression
Dr Priyabrata Panda1, Sovan Mishra2, Dr Bhagabat Behera3
Abstract
Developing a methodology for an article or thesis is of immense importance. Jharotia & Singh (2015) said that a
methodology eases the work plan of a thesis, and Patel & Patel (2019) added that research methodology is the means
of resolving research issues scientifically. The principal purpose of this research work is to develop a methodology
for studies in which factors are generated through explorative factor analysis and regression is then applied to
those factors. The work also explains the data screening process and the normality assumption. It will help
researchers who conduct perception studies, behavioural studies, etc. with categorical data.
Keywords: Explorative Factor Analysis, Regression, Methodology, Data Screening, Normality
---------------------------------------------------------------------------------------------------------------------------------------
Date of Submission: 22-03-2021 Date of Acceptance: 06-04-2021
---------------------------------------------------------------------------------------------------------------------------------------
I. Introduction
The methodology segment of a thesis or article is of immense importance, as it lays the groundwork for further
analysis. Jharotia & Singh (2015) emphasised that the methodology defines the work plan for completing the research,
and Patel & Patel (2019) added that research methodology is the way to solve a research problem scientifically. This
part of a research work includes a sequence of activities such as formulating the statement of the problem and
preparing a comprehensive research design covering its scope, required data, sampling design, etc. Crafting a
research methodology chapter depends largely on the type and nature of the research; the chapter for time-series data
differs from that for panel data. This work is specifically designed for research that applies linear regression
to factors generated from explorative factor analysis with the help of principal component analysis. The
research methodology chapter of an ongoing Ph.D. thesis (Mishra, 2021) has been taken as the background for
this research work.
II. Objective of the Study
The main purpose of the study is to develop a methodology for a research work that applies regression
to factors extracted through principal component analysis under explorative factor analysis. Data screening and
assumption testing are also highlighted.
III. Research Design
Research design answers the what, why, where and how of a study (Kothari, 2004). It describes the data, type of research,
variables, sample, etc., which are enumerated below.
Type of Research
Empirical research differs from exploratory research; the researcher has to state the type of research here.
Nature of Data
Primary and/or secondary data are used for hypothesis testing. Moreover, the objective of a research specifies
the type of data to be used, and the researcher should theoretically justify the data requirement in
accordance with the objectives.
Source of Data
The sources of data must be reliable. Government sources are considered authentic, and the CMIE database is widely used
by researchers. Primary data must be collected from the targeted population with a comprehensive schedule or
questionnaire.
1 Assistant Professor of Commerce, Gangadhar Meher University, Amruta Vihar, Sambalpur, Odisha,
India, 7978123683, pandapriyabrata@rocketmail.com
2 Research Scholar, Ravenshaw University, Cuttack, India
3 Assistant Professor of Commerce, Ravenshaw University, Cuttack, India
Questionnaire and Scale Development
The development of the questionnaire depends on the objectives, and past literature must be consulted in this regard. The
first part should cover the demographic information of the respondents. A five-point or seven-point scale is
desirable. Scale development facilitates reliability and validity measures (Tay & Jebb, 2017). The works of
Churchill & Peter (1984), Morgado (2017), Carpenter (2018) and Worthington & Whittaker (2006) may be consulted
for scale development.
Variables
The names of the variables and their justification must be mentioned. Dependent variables, independent variables and
control variables should be specifically described, with proper citation of earlier literature.
Sampling Design and Selection of Sample Unit
Probability sampling is preferred for the collection of primary data. The purpose of the study, population size,
sampling error, etc. determine the size of the sample (Israel, 1992). The selection of sample units from a population is of
immense importance: proper determination of the sample units reduces the standard error and supports robust
inferences. Yamane's (1967) formula is quite popular for determining sample size with a finite population. The Krejcie
& Morgan (1970) table has been used by many researchers for sample size determination in known populations.
Other works, such as Jones et al. (2003), Taherdoost (2017) and Oribhabor & Anyanwu (2019), have
contributed a lot to sample size determination theory. Sample size also depends on the statistical
tool being applied in the research work; for example, Kline (1998) suggested that the sample size may be 10 to 20
times the number of variables when structural equation modelling is applied. In addition, the KMO test confirms
sampling adequacy when explorative factor analysis is applied.
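The sample-size formulas mentioned above are easy to operationalise. For instance, Yamane's (1967) formula for a finite population, n = N / (1 + N·e²), can be sketched in Python (the 5% margin of error is an assumed default for illustration, not a value taken from this study):

```python
import math

def yamane_sample_size(population: int, margin_of_error: float = 0.05) -> int:
    """Yamane (1967): n = N / (1 + N * e^2), rounded up to a whole respondent."""
    return math.ceil(population / (1 + population * margin_of_error ** 2))

# A finite population of 500 at a 5% margin of error needs 223 respondents.
print(yamane_sample_size(500))   # -> 223
```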
Scope of the Study
The scope of the study includes the geographical scope, parameter scope, time scope, etc. The limits and range of the study
need to be mentioned specifically.
IV. Data Cleaning
The data cleaning process ensures a comprehensive data set free from errors. It digs out any missing values and outliers,
and careless or unengaged responses are traced and removed before the final data set is decided.
Missing Data
Table 1: Missing Data of Demographic Variables
           Gender   Age   Income Level   Occupation
Valid      217      217   217            217
Missing    0        0     0              0
Source: Authors own Compilation
The table shows that there are no missing values for any of the four variables; all two hundred seventeen
responses are complete.
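A missing-value audit like the one reported in Table 1 can be reproduced with pandas; the small data frame below is illustrative only, not the study's data:

```python
import pandas as pd

# Toy responses; None marks a missing value
df = pd.DataFrame({"Gender": ["M", "F", "F", "M"],
                   "Age": [25, None, 40, 31],
                   "Income Level": [1, 2, 2, 3],
                   "Occupation": ["A", "B", "A", "B"]})

print(df.isna().sum())             # per-variable count of missing values
print(len(df) - df.isna().sum())   # per-variable count of valid responses
```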
Unengaged Responses
There may be some careless responses: a respondent may, for instance, select the same option for all the variables,
which would lead to fallacious derivations. Thus, an attempt is made to omit such responses.
Table 2: Unengaged Responses
Response   Standard Deviation
1          0.418854
2          0.455414
3          0.455562
4          0.461251
5          0.466324
Source: Authors own Compilation
Row-wise standard deviation is calculated to find such responses: if the standard deviation tends to zero, the
respondent has carelessly selected the same option for all the variables. The table above shows the smallest
row-wise standard deviations in ascending order, none of which is zero. Thus, it is inferred that there are no
unengaged responses in the data set.
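The row-wise standard deviation screen described above can be sketched as follows (the function name is an illustrative choice):

```python
import numpy as np

def flag_unengaged(responses: np.ndarray, tol: float = 1e-9) -> np.ndarray:
    """Flag rows whose standard deviation is (near) zero, i.e. respondents
    who picked the same option for every item."""
    return responses.std(axis=1) < tol

answers = np.array([
    [4, 4, 4, 4, 4],   # straight-liner: SD = 0, flagged
    [1, 3, 4, 2, 5],   # engaged respondent
])
print(flag_unengaged(answers))   # flags the first row only
```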
Outliers
Outliers are the extreme values of a data series or data set. If a standardised value exceeds ±3, the data point is
said to be an outlier. In the table below, there are no outliers, as the standardised values of all the variables lie
within ±3.
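The ±3 rule can be implemented in a few lines (a sketch; the sample standard deviation, ddof=1, is assumed so as to mirror SPSS Z scores, and the function name is illustrative):

```python
import numpy as np

def zscore_outliers(x: np.ndarray, threshold: float = 3.0) -> np.ndarray:
    """Flag points whose standardised value lies outside +/- threshold."""
    z = (x - x.mean()) / x.std(ddof=1)   # sample SD, as in SPSS Z scores
    return np.abs(z) > threshold

series = np.array([5.0] * 20 + [30.0])   # one extreme value among twenty 5s
print(zscore_outliers(series).sum())     # -> 1 (only the 30.0 is flagged)
```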
Table 3: Standardised Values and Outliers
Standardised Variable   N    Minimum    Maximum   Desired Value
Z score: X1             22   -1.02203   2.70875   -3 to +3
Z score: X2             22   -.51835    2.22934   -3 to +3
Z score: X3             22   -2.22089   2.12708   -3 to +3
Z score: X4             22   -2.47221   .41740    -3 to +3
Z score: X5             22   -.43054    2.44132   -3 to +3
Source: Authors own Compilation
V. Reliability Analysis
The table below reports the reliability statistics of the pilot study with the help of Cronbach's alpha (Cronbach, 1951).
Churchill & Peter (1984) inferred that reliability is necessary for valid research, and the alpha value should be more
than .7 (Nunnally, 1978; Nunnally, 1988).
Table 4: Reliability Statistics of Pilot Study
Cronbach's Alpha   No. of Items
.775               19
Source: Authors own Compilation
The table above reveals a Cronbach's alpha of .775 for the 78 pilot respondents, which satisfies the recommended
criterion.
Table 5: Reliability Statistics of Total Sample
Cronbach's Alpha   No. of Items
.786               19
Source: Authors own Compilation
The table shows the reliability statistics, through the Cronbach's alpha value, for the full sample of 217
responses. The value of .786 also satisfies the recommended criterion.
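Cronbach's (1951) alpha can be computed directly from the item responses, as in this sketch (respondents in rows, items in columns; the function name is illustrative):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach (1951): alpha = k/(k-1) * (1 - sum(item variances) / variance
    of the total score), with respondents in rows and items in columns."""
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

# Two perfectly parallel items give alpha = 1.0
print(cronbach_alpha(np.array([[1., 1.], [2., 2.], [3., 3.]])))
```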
VI. Factor Analysis
Factor analysis may be categorised into the following categories.
1. Exploratory factor analysis.
2. Confirmatory factor analysis.
3. Structural equation modelling.
Exploratory factor analysis is applied in the present study with the help of SPSS software4.
Kaiser Meyer Olkin (KMO) test of Sampling Adequacy and Bartlett's Test of Sphericity
Table 6: KMO and Bartlett's Test Result
Kaiser-Meyer-Olkin Measure of Sampling Adequacy        .749
Bartlett's Test of Sphericity   Approx. Chi-Square     643.795
                                df
                                Sig.                   .000
Source: Authors own Compilation
The table above portrays the Kaiser-Meyer-Olkin (KMO) and Bartlett statistics. The KMO value
measures the adequacy of the sampling (Ayuni & Sari, 2018; Hadi et al., 2016). The KMO statistic is calculated with
the following formula (Kaiser, 1970):

KMO = Σ r_ij² / (Σ r_ij² + Σ u_ij²), summing over all pairs i ≠ j,

where r_ij is the simple correlation and u_ij the partial correlation between variables i and j.
4 https://stats.idre.ucla.edu/spss/seminars/introduction-to-factor-analysis/a-practical-introduction-to-
factor-analysis/
The KMO value ranges from 0 to 1. A value near one and far from zero indicates that the sample is adequate for
factor analysis. To be particular, the value should exceed .5 (Kaiser, 1970; Field, 2000) or .6 (Pallant,
2013; Shree et al., 2017). Hutcheson & Sofroniou (1999) opined that a "KMO value from .7 to .8 is good, .8 to .9
is great and above .9 is superb". In this research work, the KMO value is .749, which is good and within the
recommended range according to all the above criteria.

On the other hand, Bartlett's Test of Sphericity (Bartlett, 1950, 1951) measures the relatedness of the
variables. The null hypothesis for this test is that the variables are uncorrelated, whereas some correlation among the
variables is required to apply factor analysis. Further, if a population correlation matrix has 1s on the diagonal and
0s off the diagonal, a sample drawn from that population is not fit for factor analysis (Tobias & Carlson, 2010). Thus,
the null hypothesis must be rejected at least at the 5% level of significance (Shree et al., 2017); the test is also
recommended by Knapp and Swoyer (1967) and Gorsuch (1973). Here the null hypothesis is rejected at the 1% level of
significance, as the p value shown in the above table is 0.00. It can be inferred that the variables are correlated and
can be processed for factor analysis.
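Both statistics can be derived from the correlation matrix alone. The sketch below assumes numpy/scipy, takes the partial correlations from the inverse correlation matrix for the KMO, and uses Bartlett's chi-square, χ² = -(n - 1 - (2p + 5)/6)·ln|R| with p(p - 1)/2 degrees of freedom:

```python
import numpy as np
from scipy import stats

def kmo(R: np.ndarray) -> float:
    """Kaiser's (1970) measure: sum of squared simple correlations over that
    sum plus the sum of squared partial correlations (off-diagonal only)."""
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)            # partial correlation matrix
    off = ~np.eye(R.shape[0], dtype=bool)      # off-diagonal mask
    r2, u2 = (R[off] ** 2).sum(), (partial[off] ** 2).sum()
    return r2 / (r2 + u2)

def bartlett_sphericity(R: np.ndarray, n: int):
    """Bartlett (1950): chi2 = -(n - 1 - (2p + 5)/6) * ln|R|, df = p(p-1)/2."""
    p = R.shape[0]
    chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    df = p * (p - 1) // 2
    return chi2, df, stats.chi2.sf(chi2, df)

R = np.array([[1.0, 0.5, 0.5],
              [0.5, 1.0, 0.5],
              [0.5, 0.5, 1.0]])
print(kmo(R))                       # about 0.69 for this toy matrix
print(bartlett_sphericity(R, 217))  # chi-square, df, p value
```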
Factor Extraction
Factor extraction can be carried out by many methods, such as the scree test (Cattell, 1966), parallel analysis
(Horn, 1965) and principal component analysis (PCA) (Pallant, 2013). PCA is widely used and gives good results
(Sehgal et al., 2014). Mishra et al. (2017) mentioned that PCA can extract and group inter-correlated variables
from statistical data. Thus, the PCA method is applied here.
Communalities
Communalities are the sums of squared factor loadings of the variables. A communality gives the proportion of a
variable's variance that is explained by the factors, and its value is the same whether unrotated or rotated factor
loadings are used5. It ranges from 0 to 1, and a value closer to one indicates that the variable is well explained by
the factors. The communality value decides whether a variable is retained or removed, but there is a debate about the
acceptable threshold: Osborne (2014) opined that communalities above 0.4 can be accepted, whereas Child (2006)
inferred that a variable can be removed if its communality is below 0.2. In the present study, variables with
communality values near 0.5 or more are retained for further analysis.
Table 7: Communalities
SL No   Variable   Initial   Extraction
1       X1         1.000     .656
2       X2         1.000     .702
3       X3         1.000     .643
4       X4         1.000     .547
5       X5         1.000     .557
6       X6         1.000     .286
7       X7         1.000     .689
8       X8         1.000     .451
9       X9         1.000     .585
10      X10        1.000     .476
11      X11        1.000     .468
12      X12        1.000     .484
13      X13        1.000     .478
14      X14        1.000     .594
15      X15        1.000     .493
16      X16        1.000     .636
17      X17        1.000     .418
18      X18        1.000     .635
19      X19        1.000     .469
Source: Authors own Compilation
5 https://support.minitab.com/en-us/minitab/18/help-and-how-to/modeling-statistics/multivariate/how-
to/factor-analysis/interpret-the-results/all-statistics-and-graphs/.
In the table above, the extraction value of every variable exceeds .4 except that of X6 (.286), which is removed, and
PCA is undertaken again. The results are shown below.
Table 8: Communalities After Deleting One Variable
SL No   Variable   Initial   Extraction
1       X1         1.000     .670
2       X2         1.000     .701
3       X3         1.000     .640
4       X4         1.000     .587
5       X5         1.000     .555
6       X6         1.000     .657
7       X7         1.000     .443
8       X8         1.000     .576
9       X9         1.000     .487
10      X10        1.000     .476
11      X11        1.000     .488
12      X12        1.000     .489
13      X13        1.000     .593
14      X14        1.000     .530
15      X15        1.000     .700
16      X16        1.000     .413
17      X17        1.000     .628
18      X18        1.000     .470
Source: Authors own Compilation
The variable X16, with a communality of .413, is dropped and EFA is undertaken again. The communality values, the
Kaiser-Meyer-Olkin measure of sampling adequacy and Bartlett's Test of Sphericity after dropping this variable are
shown below.
Table 9: KMO and Bartlett's Test
Kaiser-Meyer-Olkin Measure of Sampling Adequacy        .720
Bartlett's Test of Sphericity   Approx. Chi-Square     551.584
                                df
                                Sig.                   .000
Source: Authors own Compilation
The KMO and Bartlett's Test statistics satisfy the recommended criteria mentioned above.
Table 10: Communalities of Selected Variables
SL No   Variable   Initial   Extraction
1       X1         1.000     .676
2       X2         1.000     .696
3       X3         1.000     .630
4       X4         1.000     .580
5       X5         1.000     .550
6       X6         1.000     .660
7       X7         1.000     .486
8       X8         1.000     .599
9       X9         1.000     .488
10      X10        1.000     .478
11      X11        1.000     .491
12      X12        1.000     .484
13      X13        1.000     .616
14      X14        1.000     .543
15      X15        1.000     .718
16      X16        1.000     .626
17      X17        1.000     .498
Source: Authors own Compilation
The communalities of all the variables in the above table are near or above .5. A total of seventeen variables are
processed for further analysis.

In the table below, the Total column reports the eigenvalue, which gives the amount of variance in the original
variables explained by a factor or component. This variance is expressed as a percentage in the next column, followed
by the cumulative percentage. Components with an eigenvalue greater than 1 are accepted; in this research work, six
components with eigenvalues above 1 are selected for further analysis. The variance explained by these six factors is
57.64%, which is more than the recommended value of 50%. Thus, the six factors retain 57.64% of the total information
contained in the seventeen variables.
Table 11: Total Variance Explained
         Initial Eigenvalues          Extraction Sums of Squared Loadings   Rotation Sums of Squared Loadings
Factor   Total   % of Var.   Cum. %   Total   % of Var.   Cum. %            Total   % of Var.   Cum. %
1        3.37    19.84       19.84    3.37    19.84       19.84             1.87    11.01       11.19
2        1.75    10.33       30.17    1.75    10.33       30.17             1.78    10.50       21.52
3        1.28    7.57        37.75    1.28    7.57        37.75             1.58    9.32        30.84
4        1.20    7.10        44.86    1.20    7.10        44.86             1.57    9.27        40.24
5        1.12    6.60        51.46    1.12    6.60        51.46             1.50    8.83        48.96
6        1.05    6.17        57.64    1.05    6.17        57.64             1.47    8.68        57.64
7        .973    5.72        63.36
8        .882    5.18        68.55
9        .748    4.39        72.94
10       .738    4.34        77.29
11       .675    3.96        81.25
12       .644    3.787       85.045
13       .603    3.544       88.589
14       .570    3.352       91.942
15       .512    3.012       94.954
16       .461    2.713       97.667
17       .397    2.333       100.000
Extraction Method: Principal Component Analysis.
Source: Authors own Compilation
The rotated component matrix, in the table below, reports the factor loadings, i.e. the correlations between the
variables and the factors or components. Four variables are grouped under factor 1, three variables under factor 2
and two variables under factor 3. The fourth factor includes three variables, and the fifth factor includes three
variables, viz. X13, X14 and X15. X16 and X17 are grouped under the sixth component or factor. Researchers often face
issues while naming such factors or components; with a rigorous literature review and theoretical background, the
factors may be given suitable names. After generating the factors from the rotated component matrix, the reliability
of all variables is calculated again.
Table 12: Rotated Component Matrix
Variable   Loading(s)          Factor
X1         .745                Factor 1
X2         .610                Factor 1
X3         .587, .471          Factor 1
X4         .475, .313          Factor 1
X5         .786                Factor 2
X6         .604                Factor 2
X7         .572, .331          Factor 2
X8         .798                Factor 3
X9         .713                Factor 3
X10        .827                Factor 4
X11        .642                Factor 4
X12        .346, .423, .367    Factor 4
X13        .747                Factor 5
X14        .353, .486          Factor 5
X15        -.327, .318, .377   Factor 5
X16        .800                Factor 6
X17        .698                Factor 6
Source: Authors own Compilation
Table 13: Reliability Statistics of Extracted Factors
Cronbach's Alpha   N of Items
.740               17
Source: Authors own Compilation
The Cronbach's alpha across all the retained variables is .740, which is also above the recommended level.
VII. Test of Normality
Normally distributed data yields robust inferences. Thus, examining such assumption is essential before
proceeding to further analysis to avoid inconsistent result (Das & Imon, 2016, Ghasemi & Zahediasl, 2012, Kwak
& Park, 2019). Many statistical tests need satisfaction of normality assumption. (Mishra et. al., 2019). Thus,
normality of all variables is to be tested.
There are many such tests for measuring normality which are as follows.
Table 14: Different Tests of Normality
Graphical Method                         Mathematical/Analytical Method
Histogram                                Kolmogorov-Smirnov test (Kolmogorov, 1933)
Stem-and-leaf plot                       Shapiro-Wilk test (Shapiro and Wilk, 1965)
Box-and-whisker plot                     Anderson-Darling test (Anderson and Darling, 1952)
Probability-Probability (P-P) plot       D'Agostino-Pearson omnibus test (D'Agostino and Pearson, 1973)
Quantile-Quantile (Q-Q) plot             Jarque-Bera test (Bowman and Shenton, 1975)
Detrended probability plot
Source: Authors own Compilation
Regarding the choice of normality test, Das & Imon (2016) deduced that analytical tests of normality are
preferable to graphical ones and recommended the Shapiro-Wilk test. Razali & Wah (2011) compared four normality tests
on ten thousand samples and likewise concluded that the Shapiro-Wilk test is the most powerful, followed by the
Anderson-Darling, Lilliefors and Kolmogorov-Smirnov tests. In the present study, the Shapiro-Wilk and
Kolmogorov-Smirnov tests, along with histograms, P-P plots, Q-Q plots and descriptive statistics, are applied to
assess normality in different situations. The null hypothesis is that the data are normal, so a sig./p value above
.05 is desired to accept the hypothesis. Normality of the continuous variables is desired.
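The two analytical tests can be run side by side with scipy, as in this sketch (note that, strictly speaking, the Lilliefors correction applies when the K-S test uses the sample's own mean and standard deviation; the function name is illustrative):

```python
import numpy as np
from scipy import stats

def normality_report(x: np.ndarray, alpha: float = 0.05) -> dict:
    """Shapiro-Wilk and Kolmogorov-Smirnov tests. H0: the data are normal,
    so p > alpha means normality is not rejected."""
    sw_stat, sw_p = stats.shapiro(x)
    ks_stat, ks_p = stats.kstest(x, "norm", args=(x.mean(), x.std(ddof=1)))
    return {"shapiro_p": sw_p, "ks_p": ks_p,
            "normal": bool(sw_p > alpha and ks_p > alpha)}

rng = np.random.default_rng(1)
print(normality_report(rng.normal(size=300)))       # draws from a normal law
print(normality_report(rng.exponential(size=300)))  # clearly non-normal data
```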
Table 15: Normality of Variables
           Kolmogorov-Smirnov                    Shapiro-Wilk
Variable   Statistic   df   Sig.   Remark        Statistic   df   Sig.   Remark
V1         .270        7    .133   Normal        .822        7    .066   Normal
V2         .428        7    .000   Non-Normal    .564        7    .000   Non-Normal
V3         .180        7    .200   Normal        .956        7    .780   Normal
V4         .240        7    .200   Normal        .889        7    .272   Normal
V5         .252        7    .199   Normal        .765        7    .018   Non-Normal
Source: Authors own Compilation
The table above reports the normality of different financial parameters under the Kolmogorov-Smirnov and
Shapiro-Wilk tests. Under the Kolmogorov-Smirnov test, all the parameters are normal except the second variable, and
the Shapiro-Wilk statistics give similar results; in addition, the Shapiro-Wilk test reveals that variable 5 is not
normally distributed.
VIII. Regression Assumptions
Normality of Residuals
In linear regression analysis, normality of the residuals is an essential assumption; Chan (2004) narrates these
assumptions quite clearly. The normality of the residuals of the regression equations is shown below.
Figure 1: Normality of Residuals with Histogram and PP Plot of Regression Model 1
Source: Authors own Compilation
The distribution of the residuals shows that all the data points lie within the histogram and close to the diagonal
of the P-P plot. Both figures indicate that the residuals are normally distributed. The residuals of the other models
are tested in the same way and also found to be normally distributed.
Table 16: Normality of Std. Residuals with Descriptive Statistics
Models Mean SD
1 .000 .993
2 .000 .988
3 .000 .991
4 .000 .986
Source: Authors own Compilation
The table above shows the normality of the residuals of the regression models through their means and standard
deviations. The means are zero and the standard deviations are very close to one, which indicates that the residuals
of all the regression models are normally distributed.
The R-squared value explains the degree of variability in the dependent variable accounted for by the independent
variables, and thus indicates the goodness of fit of the model to the observed data. "An R-squared of 60% reveals
that 60% of the data fit the regression model"6; that is, 60% of the variance of the dependent variable is explained
by the independent variables. But a difference of opinion has been observed about the reliability of this statistic:
it is opined that a small R-squared value does not always mislead, and a high R-squared value may not always be
essentially good7.
6 https://corporatefinanceinstitute.com/resources/knowledge/other/r-squared/#:~:text=The%20most%20common%20interpretation%20of,better%20fit%20for%20the%20model.
Table 17: R- squared and Adjusted R-Squared
Models R Square Adjusted R Square
1 .138 .126
2 .143 .122
3 .206 .190
4 .213 .191
5 .159 .143
6 .166 .142
Source: Authors own Compilation
All the models have low R-squared values, but researchers have opined that the analysis can proceed even with a low
R-squared. Akossou & Palm (2013) noted that R-squared is a biased estimator, Filho et al. (2011) substantiated that
the coefficient of determination alone fails to support a meaningful conclusion, and Dodge (1999) added that the
value can be inflated simply by adding more variables. Thus, other parameters such as the F statistic and the
Durbin-Watson statistic should be examined in regression analysis along with the R-squared value.
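R-squared and the adjusted R-squared, 1 - (1 - R²)(n - 1)/(n - k - 1), which penalises adding predictors, can be computed from an ordinary least squares fit as in this sketch (function name illustrative):

```python
import numpy as np

def r_squared(y: np.ndarray, X: np.ndarray):
    """Fit OLS with an intercept; return R^2 and adjusted R^2."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])         # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1 - (resid ** 2).sum() / ((y - y.mean()) ** 2).sum()
    adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return r2, adj

x = np.linspace(0, 1, 50).reshape(-1, 1)
y = 2 * x[:, 0] + 3                              # exact linear relation
print(r_squared(y, x))                           # both values are ~1 for an exact fit
```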
Durbin-Watson (D-W) Statistics
The Durbin-Watson (1950) statistic measures autocorrelation: "the Durbin Watson (DW) statistic is a test for
autocorrelation in the residuals from a statistical regression analysis"8. Maxwell & David (1995) and White (1992)
opined that the statistic should lie between 1.5 and 2.5 for there to be no autocorrelation at lag 1.
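The statistic is simple to compute from a residual series; this sketch assumes the residuals are already at hand:

```python
import numpy as np

def durbin_watson(residuals) -> float:
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2). Values near 2 indicate no
    lag-1 autocorrelation; roughly 1.5 to 2.5 is the usual comfort zone."""
    e = np.asarray(residuals, dtype=float)
    return float((np.diff(e) ** 2).sum() / (e ** 2).sum())

print(durbin_watson([1.0, -1.0, 1.0, -1.0]))   # -> 3.0 (negative autocorrelation)
print(durbin_watson(np.random.default_rng(0).normal(size=5000)))  # close to 2
```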
Table 18: Durbin-Watson (D-W) Statistics
Models D-W Statistics
1 1.725
2 1.728
3 2.129
4 2.122
5 1.749
6 1.766
Source: Authors own Compilation
The Durbin-Watson statistics of all the models lie within the recommended range. Thus, it can be inferred that the
models are not affected by autocorrelation.
F-statistics
Table 19: F-statistics and Model Fit
Models F Value Sig./P value
1 11.138 .000
2 6.882 .000
3 13.458 .000
4 9.318 .000
5 10.010 .000
6 6.966 .000
Source: Authors own Compilation
7 https://statisticsbyjim.com/regression/interpret-r-squared-regression/
8 https://www.investopedia.com/terms/d/durbin-watson-statistic.asp
The F test addresses the possibility that the regression model has no predictive efficiency. The null hypothesis is
that all the regression coefficients are zero, i.e. that no variable other than the intercept or constant affects the
target variable; this hypothesis must be rejected for the model to be useful. In the table above, the null hypothesis
is rejected for all the models at the 1% level of significance, so the models are fit and have non-zero regression
coefficients.
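The overall F statistic compares explained to unexplained variance; a sketch under the usual OLS setup (names illustrative):

```python
import numpy as np
from scipy import stats

def regression_f_test(y: np.ndarray, X: np.ndarray):
    """Overall F test. H0: all slope coefficients are zero.
    F = (SSR / k) / (SSE / (n - k - 1)), referred to an F(k, n-k-1) law."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    fitted = A @ beta
    ssr = ((fitted - y.mean()) ** 2).sum()    # explained sum of squares
    sse = ((y - fitted) ** 2).sum()           # residual sum of squares
    F = (ssr / k) / (sse / (n - k - 1))
    return F, stats.f.sf(F, k, n - k - 1)

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 1))
y = 3 * x[:, 0] + rng.normal(size=100)        # strong linear signal
F, p = regression_f_test(y, x)
print(F > 10, p < 0.01)                       # the null is clearly rejected
```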
Test of Multicollinearity
There should be no multicollinearity issue in a regression model. Multicollinearity is a state in which two or more
independent variables are highly inter-correlated with each other, so that the independence of the predictors is
violated9. It leads to overestimation of the results. Daoud (2017) inferred that correlation among the independent
variables is undesirable: it inflates the standard error of the affected coefficients and introduces Type II error
into the model. Lindner et al. (2020) found that multicollinearity does not lead to bias, although it violates a
regression assumption. The issue can be detected by assessing the tolerance level and the Variance Inflation Factor
(VIF); no multicollinearity is indicated when the tolerance exceeds .7 and the VIF is within 3.
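Tolerance and VIF can be read from the inverse correlation matrix of the predictors, since its j-th diagonal element equals 1/(1 - R_j²) = VIF_j; a sketch (function name illustrative):

```python
import numpy as np

def vif_tolerance(X: np.ndarray):
    """VIF_j = [R^-1]_jj = 1 / (1 - R_j^2); tolerance is its reciprocal."""
    R = np.corrcoef(X, rowvar=False)
    vif = np.diag(np.linalg.inv(R))
    return vif, 1.0 / vif

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))                 # nearly independent predictors
X_bad = np.column_stack([X, X[:, 0] + X[:, 1] + 0.01 * rng.normal(size=500)])
print(vif_tolerance(X)[0].round(2))           # VIFs near 1
print(vif_tolerance(X_bad)[0].max() > 10)     # a collinear column blows up the VIF
```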
Table 20: Collinearity Statistics
Factors Tolerance VIF
Factor 1 .942 1.061
Factor 2 .927 1.079
Factor 3 .968 1.033
Factor 4 .882 1.134
Factor 5 .910 1.099
Source: Authors own Compilation
In the table above, the factors derived from the explorative factor analysis for the regression show no
multicollinearity problem, as both the tolerance and the VIF are within the recommended values.
Homoscedasticity or Constant Variance of Residuals
Panda et al. (2020) reported that "there must be constant variance among residuals" when regression analysis is
applied. Heteroscedasticity is a problem because ordinary least squares (OLS) regression assumes that all residuals
are drawn from a population with a constant variance (homoscedasticity)10. If the plotted residuals are scattered
across the plot area, the variance of the residuals is said to be constant.
Fig 2: Constant Variance of Residuals (Model 1 and Model 2)
Source: Authors own Compilation
10https://statisticsbyjim.com/regression/heteroscedasticityregression/#:~:text=Heteroscedasticity%20is
%20a%20problem%20because,should%20have%20a%20constant%20variance.
The figure above shows the scatter plots of the residuals of models 1 and 2. The points are spread across the plot
without forming a clear pattern; thus, it can be inferred that the variance of the residuals is constant. Similar
inferences are drawn for the other models.
IX. Conclusion
The means of achieving the research objectives are carefully decided in this chapter. The research design is framed,
and the data set undergoes rigorous screening so that unengaged responses are removed; outliers and missing values
are also checked. Statistical tools are selected on the basis of the objectives and the nature of the data, and due
care is taken to test the assumptions of each tool before applying it.
References
[1]. Akossou, A. Y. J., & Palm, R. (2013). Impact of Data Structure on the Estimators R-Square and Adjusted R-Square in
Linear Regression. International Journal of Mathematics and Computation, 20(3), 84-93.
[2]. Anderson, T. W., and Darling, D. A. (1952). Asymptotic theory of certain goodness-of-fit criteria based on stochastic
processes. The Annals of Mathematical Statistics, 23(2), 193-212.
[3]. Ayuni, N.W.D., & Sari, I. G. A. M. K. K. (2018). Analysis of factors that influencing the interest of Bali State
Polytechnic’s students in entrepreneurship. Journal of Physics: Conference Series, 1–10.
[4]. Bartlett, M. S. (1950). Tests of significance in factor analysis, The British Journal of Psychology, 3 (Part II), 77-85.
[5]. Bartlett, M. S. (1951). The effect of standardization on a Chi-square approximation in factor analysis. Biometrika,
38(3/4), 337-344.
[6]. Bowman, K. O., and Shenton, B. R. (1975). Omnibus test contours for departures from normality based on √b1 and b2.
Biometrika 64: 243-50.
[7]. Carpenter, S. (2018). Ten Steps in Scale Development and Reporting: A Guide for Researchers. Communication
Methods and Measures, 12(1), 25–44.
[8]. Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1, 245-276.
[9]. Chan, Y. H. (2004). Biostatistics 201: Linear Regression Analysis. Singapore Med J, 45(2), 55–61.
[10]. Child, D. (2006) The essentials of factor analysis. Continuum, London, 1-106.
[11]. Churchill, G. A., & Peter, J. P. (1984). Research Design Effects on the Reliability of Rating Scales: A Meta-Analysis.
Journal of Marketing Research, XXI, 360–375.
[12]. Cronbach, L. J. (1951). Coefficient Alpha and the Internal Structure of Tests, Psychometrika, 16 (3) 297–334,
doi:10.1007/bf02310555.
[13]. DʼAgostino R, and Pearson E. S. (1973). Tests for departure from normality. Empirical results for the distributions of
b2 and √b1. Biometrika. 60(3), 613-622.
[14]. Daoud, J. I. (2017). Multicollinearity and Regression Analysis. Journal of Physics: Conference Series, 1–6. doi
:10.1088/1742-6596/949/1/012009
[15]. Das, K. R., & Imon, A. H. M. R. (2016). A Brief Review of Tests for Normality. American Journal of Theoretical and
Applied Statistics, 5(1), 5–12.
[16]. Dodge, Y. (1999). Analyse De Regression Appliquee, Dunod, Paris.
[17]. Durbin, J. & Watson, G. S. (1950). "Testing for Serial Correlation in Least Squares Regression, I". Biometrika. 37 (3–
4): 409–428.
[18]. Field, A. (2000). Discovering Statistics using SPSS for Windows. London – Thousand Oaks New Delhi: Sage
publications.
[19]. Filho, D., B., F., Silva, J., A., & Rocha, E. (2011). What is R2 all About? Leviathan – Cadernos de Pesquisa Política,
(3), 60-68.
[20]. Ghasemi, A., & Zahediasl, S. (2012). Normality Tests for Statistical Analysis: A Guide for Non-Statisticians.
International Journal of Endocrinology & Metabolism, 10(2), 486–489.
[21]. Gorsuch, R. L. (1973). Using Bartlett’s Significance Test to Determine the Number of Factors to Extract. Educational
and Psychological Measurement, (33), 361–364.
[22]. Hadi, N. U., Abdullah, N., & Ilham, S. (2016). An Easy Approach to Exploratory Factor Analysis: Marketing
Perspective. Journal of Educational and Social Research, 6(1), 215–223.
[23]. Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, (30), 179-185.
[24]. Hutcheson, G. D., & Sofroniou, N. (1999). The Multivariate Social Scientist: An Introduction to Generalized Linear Models. Sage Publications.
[25]. Israel, G. D. (1992). Sampling the Evidence of Extension Program Impact. Program Evaluation and Organizational Development, IFAS, University of Florida, PEOD-5, October.
[26]. Jharotia, A. K., & Singh, S. (2016). Use of Research Methodology in Research: An Overview. International Journal
of Social Science, Journalism & Mass Communication, 2(2), 44–51.
[27]. Kaiser, H. (1970). A Second-Generation Little Jiffy. Psychometrika, 35, 401–415.
[28]. Kaiser, H. (1974). An Index of Factorial Simplicity. Psychometrika, 39, 31–36.
[29]. Kline, R. B. (1998). Principles and Practice of Structural Equation Modeling. New York: Guilford Press.
[30]. Knapp, T. R. and Swoyer, V. H. (1967). Some empirical results concerning the power of Bartlett’s test of the
significance of a correlation matrix. American Educational Research Journal, 4, 13-17.
[31]. Kolmogorov, A. (1933). Sulla determinazione empirica di una legge di distribuzione. G. Ist. Ital. Attuari, 4, 83–91.
[32]. Kothari, C. R. (2004). Research Methodology: Methods and Techniques (2nd ed.). New Age International Publishers.
[33]. Krejcie, R. V., & Morgan, D. W. (1970). Determining Sample Size for Research Activities. Educational and
Psychological Measurement, 30(3), 607–610.
[34]. Kwak, S. G., & Park, S. (2019). Normality Test in Clinical Research. Journal of Rheumatic Diseases, 26(1), 5–11.
[35]. Lindner, T., Puck, J., & Verbeke, A. (2020). Misconceptions about multicollinearity in international business research:
Identification, consequences, and remedies. Journal of International Business Studies, 51, 283–298.
https://doi.org/10.1057/s41267-019-00257-1.
[36]. Maxwell, L. K., & David, C. H. (1995). The Application of the Durbin-Watson Test to the Dynamic Regression Model Under Normal and Non-Normal Errors. Econometric Reviews, 14(4), 487–510.
[37]. Mishra (2021). An Analysis of Merger and Acquisition in Indian Banking Sector (Unpublished doctoral dissertation). Ravenshaw University, Cuttack, India.
[38]. Mishra, P., Pandey, C. M., Singh, U., Gupta, A., Sahu, C., & Keshri, A. (2019). Descriptive Statistics and Normality
Tests for Statistical Data. Annals of Cardiac Anaesthesia, 22(1), 67–72.
[39]. Mishra, S., Sarkar, U., Taraphder, S., & Datta, S. (2017). Principal Component Analysis. International Journal of
Livestock Research, 7(5), 60–78.
[40]. Morgado, F. F. R., Meireles, J. F. F., Neves, C. M., Amaral, A. C. S., & Ferreira, M. E. C. (2017). Scale development:
Ten main limitations and recommendations to improve future research practices. Psicologia: Reflexao e Critica, 30(1),
1–20.
[41]. Nunnally, J. C. (1978). Psychometric Theory. New York: McGraw-Hill Publishing.
[42]. Nunnally, J. C. (1988). Psychometric Theory. Englewood Cliffs, NJ: McGraw-Hill.
[43]. Oribhabor, C. B., & Anyanwu, C. A. (2019). Research Sampling and Sample Size Determination: A Practical Application. Journal of Educational Research (Fudjer), 2(1), 47–57.
[44]. Osborne, J. W. (2014). Best Practices in Exploratory Factor Analysis. Scotts Valley, CA: CreateSpace Independent Publishing. ISBN-13: 978-1500594343.
[45]. Pallant, J. (2013). SPSS Survival Manual. A step-by-step guide to data analysis using SPSS, 4th edition. Allen & Unwin,
www.allenandunwin.com/spss.
[46]. Panda, P., Das, K. K., & Mohanty, M. K. (2020). Direct Tax Reform in India: An Impact Analysis with Special Reference to Government Revenue. The Orissa Journal of Commerce, 41(1), 66–86.
[47]. Patel, M., & Patel, N. (2019). Exploring Research Methodology: Review Article. International Journal of Research
and Review, 6(3), 48–55.
[48]. Razali, N. M., & Wah, Y. B. (2011). Power comparisons of Shapiro-Wilk, Kolmogorov-Smirnov, Lilliefors and
Anderson-Darling tests. Journal of Statistical Modeling and Analytics, 2(1), 21–33.
[49]. Shapiro, S. S., & Wilk, M. B. (1965). An Analysis of Variance Test for Normality (Complete Samples). Biometrika, 52(3/4), 591–611.
[50]. Sehgal, S., Singh, H., Agarwal, M., & Shantanu, V. B. (2014). Data Analysis Using Principal Component Analysis.
International Conference on Medical Imaging, m-Health and Emerging Communication Systems (MedCom), 45–48.
[51]. Shree, S. V., Pugazhenthi, R., & Chandrasekaran, M. (2017). Statistical Investigation of the Performance Evaluation
in Manufacturing Environment. International Journal of Pure and Applied Mathematics, 114(12), 225–235.
[52]. Taherdoost, H. (2017). Determining sample size; How to calculate survey sample size. International Journal of
Economics and Management Systems, 2(2), 237–239.
[53]. Tay, L., & Jebb, A. (1996). Scale development. In Journal of Health and Social Policy (Vol. 8, Issue 1).
https://doi.org/10.1300/J045v08n01_02
[54]. Tay, L., & Jebb, A. (2017). Scale Development. In S. Rogelberg (Ed), The SAGE Encyclopedia of Industrial and
Organizational Psychology, 2nd edition. Thousand Oaks, CA: Sage.
[55]. Tobias, S., & Carlson, J. E. (2010). Brief Report: Bartlett's Test of Sphericity and Chance Findings in Factor Analysis. Multivariate Behavioral Research, 375–377.
[56]. White, K. (1992). The Durbin-Watson Test for Autocorrelation in Nonlinear Models. The Review of Economics and
Statistics, 74(2), 370-373.
[57]. Worthington, R. L., & Whittaker, T. A. (2006). Scale Development Research: A Content Analysis and
Recommendations for Best Practices. The Counseling Psychologist, 34(6), 806–838.
[58]. Yamane, T. (1967). Statistics, An Introductory Analysis, 2nd Ed., New York: Harper and Row.
Panda, P., Mishra, S., & Behera, B. (2021). Developing a Research Methodology with the Application of Explorative Factor Analysis and Regression. IOSR Journal of Business and Management (IOSR-JBM), 23(04), 23–34.