
Abstract and Figures

Principal component analysis (PCA) is a popular technique for building social indicators in the field of spatial analysis. However, the literature shows that there is no consensus on how to apply PCA to longitudinal studies, and researchers have carried out the analysis using different approaches, varying the way data are combined and the frequency with which the data are sampled. This research explores such approaches with two objectives: to draw attention to the limitations of using PCA in longitudinal analyses, and to show how to overcome these limitations. For this purpose, indicators of urban inequality of eight cities are compared in each approach. The results show that the use of PCA presents limitations for the longitudinal study of urban inequality for three reasons: the evolution of the phenomenon is not always captured, a large part of the indicators does not explain the phenomenon properly, or a change in the calculation of the indicator distorts and exaggerates the differences in urban inequality through the years. An analytical chart is proposed to guide researchers with explanations and justifications that should accompany the use of PCA in longitudinal analyses.
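The first limitation named in the abstract (the evolution of the phenomenon is not always captured) can be illustrated with a minimal sketch. It does not reproduce the authors' procedure or data: the two years, the synthetic variables, and the helper `first_pc_scores` are assumptions made only for illustration. The sketch contrasts a PCA fitted separately per year with a PCA fitted on the pooled years.

```python
import numpy as np

def first_pc_scores(X):
    """Scores on the first principal component of the z-scored columns of X."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    loadings = eigvecs[:, -1]          # eigh sorts eigenvalues in ascending order
    if loadings.sum() < 0:             # PCA signs are arbitrary; fix the orientation
        loadings = -loadings
    return Z @ loadings

# Synthetic data (assumption): 200 spatial units, 4 variables driven by one latent
# factor; every unit improves by the same amount between the two years.
rng = np.random.default_rng(0)
n_units, n_vars = 200, 4
latent_2000 = rng.normal(size=n_units)
latent_2010 = latent_2000 + 1.0
X_2000 = latent_2000[:, None] + rng.normal(scale=0.3, size=(n_units, n_vars))
X_2010 = latent_2010[:, None] + rng.normal(scale=0.3, size=(n_units, n_vars))

# Approach A: one PCA per year. Each year is standardized on its own,
# so the common improvement is removed before the indicator is built.
shift_separate = first_pc_scores(X_2010).mean() - first_pc_scores(X_2000).mean()

# Approach B: one PCA on the pooled years. Both years share the same
# standardization and loadings, so the improvement remains visible.
pooled = first_pc_scores(np.vstack([X_2000, X_2010]))
shift_pooled = pooled[n_units:].mean() - pooled[:n_units].mean()

print(f"mean change, separate PCAs: {shift_separate:.3f}")
print(f"mean change, pooled PCA:    {shift_pooled:.3f}")
```

In this toy setting the per-year indicator shows no average change, because each year's scores are centered on that year's own mean, while the pooled indicator does. Which behavior is appropriate depends on the research question.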
Principal component analysis applied to multidimensional social indicators longitudinal studies: limitations and possibilities
Matheus Pereira Libório · Oseias da Silva Martinuci · Alexei Manso Correa Machado · Thiago Melo Machado-Coelho · Sandro Laudares · Patrícia Bernardes
Accepted: 11 October 2020 / Published online: 21 October 2020
© Springer Nature B.V. 2020
Keywords: Longitudinal analyses · Multidimensional phenomena · Synthesis indicators · Intra-urban inequality · Principal component analysis
Introduction
Phenomena such as development, progress, poverty and inequality are characterized by a combination of variables and assessed from a large amount of data of multiple dimensions (Mazziotta and Pareto 2017). By using a combination of appropriate variables, it is possible to obtain Composite Indicators that facilitate the interpretation of these originally complex phenomena (Saisana and Tarantola 2002). In short, Composite Indicators involve the aggregation of
M. P. Libório (✉) · A. M. C. Machado · S. Laudares · P. Bernardes
Pontifical Catholic University of Minas Gerais, Belo Horizonte 30535-012, Brazil
e-mail: m4th32s@gmail.com
A. M. C. Machado
e-mail: alexeimcmachado@gmail.com
S. Laudares
e-mail: sandrolaudares@gmail.com
P. Bernardes
e-mail: patriciabernardes@pucminas.br
O. da Silva Martinuci
Maringá State University, Maringá, Paraná 87020-900, Brazil
e-mail: osmartinuci@uem.br
A. M. C. Machado · T. M. Machado-Coelho
Federal University of Minas Gerais, Belo Horizonte 31270-901, Brazil
e-mail: thmmcoelho@gmail.com
GeoJournal (2022) 87:1453–1468
https://doi.org/10.1007/s10708-020-10322-0
... The SoVI has gained widespread acceptance in the fields of disaster risk reduction and policy planning, owing to its capacity to deliver a systematic and replicable evaluation of vulnerability [16]. Nonetheless, its dependence on PCA has faced criticism for certain limitations, such as challenges in result interpretation and insufficient consideration of local contexts or stakeholder perspectives [19,40]. ...
... Studies that did not quantify SVI as an index or collected data using qualitative methods, such as surveys, were excluded. When qualitative data collection was included, the composition of the variables or the subject of the study was limited [40], which does not fit the purpose of this study. Studies that used SVI as one of the multiple components for composite results were excluded because most studies that used SVI as a factor to form a comprehensive result no longer focused on SVI [41]. ...
Article
Full-text available
Social vulnerability plays a vital role in understanding how various societal characteristics influence communities during extreme events. This study aimed to systematically identify key indicators and methodological approaches used in Social Vulnerability Index (SVI) research by utilizing the Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR). The study specifically addresses gaps in indicator selection and emphasizes incorporating land-related and diverse variables for improved applications across various contexts. Social variables essential for SVI development were collected, and their applications across studies were analyzed. A total of 30,443 articles were identified from multiple databases, with 72 meeting the inclusion criteria after rigorous evaluation. Key aspects such as methodologies, weighting schemes, and primary variables used in SVI computation were outlined. Principal component analysis emerged as the most commonly employed method, though diverse approaches have gained traction in recent years. Significant variability was observed in the variables across studies, with demographic indicators being the most frequently utilized. The identified variables were categorized into 21 domains, comprising 61 indicators. While the findings of this study focus on improving the understanding of SVI development and its diverse applications, they also hold potential for informing sustainable land management and disaster resilience strategies, particularly in tailoring interventions to region-specific vulnerabilities.
... Unlike traditional linear methods such as PCA [44], autoencoders (AEs) can capture the nonlinear features of GW signals more effectively [45]. While PCA is efficient for simple signals, its linear projections may miss important features in the nonlinear phases of GW evolution [46,47]. In contrast, AEs reduce dimensionality through nonlinear mappings, preserving key physical features such as orbital dynamics, tidal effects, and ringdown. ...
Preprint
Gravitational waves from binary neutron star mergers provide insights into dense matter physics and strong-field gravity, but waveform modeling remains computationally challenging. We develop a deep generative model for gravitational waveforms from BNS mergers, covering the late inspiral, merger, and ringdown, incorporating precession and tidal effects. Using the conditional autoencoder, our model efficiently generates waveforms with high fidelity across a broad parameter space, including component masses $(m_1, m_2)$, spin components $(S_{1x}, S_{1y}, S_{1z}, S_{2x}, S_{2y}, S_{2z})$, and tidal deformability $(\Lambda_1, \Lambda_2)$. Trained on $3 \times 10^5$ waveforms from the IMRPhenomXP_NRTidalv2 waveform model, it achieves an average overlap accuracy of 99.6% on the test set. The model significantly accelerates waveform generation. For a single sample, it requires 0.12 seconds (s), compared to 0.38 s for IMRPhenomXP_NRTidalv2 and 0.62 s for IMRPhenomPv2_NRTidal, making it approximately 3 to 5 times faster. When generating $10^3$ waveforms, the network completes the task in 0.86 s, while traditional waveform approximation methods take over 46–53 s. Our model achieves a total time of 7.48 s to generate $10^4$ such waveforms, making it about 60 to 65 times faster than traditional waveform approximation methods. This speed advantage enables rapid parameter estimation and real-time gravitational wave searches. With higher precision, it will support low-latency detection and broader applications in multi-messenger astrophysics.
... The principal component analysis method is a commonly used method to calculate the social deprivation index [17]. Its basic principle is to use the idea of dimension reduction to convert a group of related indices into another group of unrelated comprehensive indices [51], i.e., main components, and then calculate the social deprivation score through the formula. The steps for calculating the social deprivation value using the principal component analysis method are as follows: 1. Normalize the index data using Eq. ...
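The snippet above is truncated before its equations, so the following is only a hedged sketch of one common way to turn correlated indices into a PCA-based composite score. It is not the cited paper's formula. The z-score normalization, the 80% cumulative-variance cutoff, the variance-share weights, and the function name `pca_composite_score` are all assumptions chosen for illustration.

```python
import numpy as np

def pca_composite_score(X, var_threshold=0.8):
    """Composite score from correlated indicators (rows = areas, columns = indices)."""
    # Step 1 (assumed z-score normalization; the source equation is not shown).
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Step 2: eigen-decomposition of the correlation matrix, largest eigenvalue first.
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()
    # Step 3: keep the fewest components whose cumulative share reaches the threshold.
    k = int(np.searchsorted(np.cumsum(explained), var_threshold)) + 1
    k = min(k, X.shape[1])
    components = Z @ eigvecs[:, :k]        # mutually uncorrelated component scores
    # Step 4 (assumed weighting): weight each component by its share of retained variance.
    weights = explained[:k] / explained[:k].sum()
    return components @ weights, components, weights

# Hypothetical input: 50 areas described by 5 indices.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 5))
score, components, weights = pca_composite_score(X_demo)
print(score[:5], weights)
```

Component signs are arbitrary in PCA, so in practice each retained component has to be oriented (for example, so that higher values mean more deprivation) before the weighted sum is interpreted.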
Article
Full-text available
Background Although significant progress has been made in the health status of Chinese citizens, disparities are still strikingly evident. This paper reveals the interconnection between social deprivation and the health of the Chinese population using the latest census data, and delves into the impact of social deprivation on health outcomes. Methods To assess social deprivation, this study selected 14 indicators from six domains: income, employment, education, housing condition, housing area, and demographic structure. The social deprivation value was calculated using the entropy method, variation coefficient method, CRITIC method, and principal component analysis method, and its spatial distribution was compared. The best models were then selected from ordinary least squares regression models, spatial lag models and spatial error models, according to their performance, to analyze the effect of social deprivation on health outcomes. Results The spatial distribution of social deprivation in China displays notable heterogeneity. The best models indicate that social deprivation is negatively correlated with the mortality rate of Class A and B infectious diseases, average life expectancy and the proportion of healthy elderly, but positively correlated with the incidence rate of Class A and B infectious diseases, the maternal mortality rate, and the prevalence rate of low-weight children. The regression models for the relationship between social deprivation and the incidence rate of infectious diseases, maternal mortality rate, average life expectancy, and proportion of healthy elderly take the spatial lag form. The regression models for the relationship between social deprivation and the mortality rate of Class A and B infectious diseases and the prevalence rate of low-weight children take the spatial error form.
Conclusion Social deprivation impacts the health of different populations, and this influence exhibits correlation and interaction across various regions. Therefore, it is necessary for governments to develop policies, particularly those aimed at enhancing the equality of public health services, to address the imbalance in regional development, allocate resources scientifically, and narrow the gap in economic, social, and healthcare development across regions.
... One cannot think of measuring such different realities with the same indicator and one cannot think that a single value can adequately represent complex structures that have several dimensions to evaluate. The economic dimension must be accompanied by a social and environmental assessment in order to have a balance between all the components of human living (LIBÓRIO et al., 2022). ...
Article
Full-text available
Measuring the well-being of a nation means identifying all the tools that enable its individuals to live well without worsening the lives of their neighbours or those to come in the future. For many years, the focus has been solely on the economic dimension, creating critical problems in the social and environmental spheres that will take years and large investments to remedy. This article gives an overview of the measurements used over the years in the international arena by recognised and respected bodies. The application of these indicators to the realities of countries has made it possible to identify models to be followed in order to enable the growth of the well-being of societies as a whole.
... Iteration) [48]. A detailed comparative analysis of WHO with principal component analysis (PCA) and genetic algorithm (GA) has been performed in Table 2 and Table 3 to validate the superiority of the WHO algorithm as compared to PCA and GA [49][50][51][52][53][54][55][56][57][58]. Numerous PCA constraints may be successfully addressed by using WHO. ...
Article
Full-text available
With the advent of smart grids, advanced information infrastructures, advanced metering facilities, bidirectional exchange of information, and battery storage, home area networks have transformed electricity consumption and energy efficiency. There is a significant shift in the energy management structure from the traditional centralized infrastructure to flexible, demand-side-driven cyber-physical power systems with clean energy and energy storage systems. These changes have significantly evolved the home energy management (HEM) space. Consequently, stakeholders must define their responsibilities, create efficient regulatory frameworks, and test out novel commercial strategies. P2P energy trading appears to be a feasible solution in these circumstances, allowing users to trade electricity with one another before becoming completely reliant on the utility. P2P energy trading offers a more stable platform for energy trade by facilitating the exchange of energy between prosumers and consumers. This research proposes a novel demand and generation prediction technique for P2P HEMS for optimal energy using the Multi-Objective Optimization model. An enhanced Wild Horse Optimization technique was first used to summarize the qualities of historical records. Then, the Bi-LSTM is used to predict the demand and generation values. Furthermore, a Grasshopper optimization (GHO) approach is employed to fine-tune the model's hyperparameters. The P2P HEMS demand and generation prediction framework is offered with a probabilistic and fault evaluation that upholds load flow balance between need and supply for continuous operations. It results in an intelligent community transforming cities into smart ones, opening new avenues for scientific research in terms of technological developments.
... PCA is one of the most applied statistical methods for reducing the dimensionality and regaining understanding of the acquired data (Libório et al., 2022). It aims to transform the original data into a new dataset composed of powerful uncorrelated variables called principal components, which take the maximum variance from the data. ...
Article
Full-text available
Environmental pollution, particularly air pollution, is still a significant concern all over the world and thus requires good prediction models to support its management. Blind Source Separation (BSS), Copula functions, and a Long Short-Term Memory (LSTM) network integrated with the Greylag Goose Optimization (GGO) algorithm have been adopted in this research work to improve air pollution forecasting. The proposed model uses preprocessed data from the urban air quality monitoring dataset, which contains complete environmental and pollutant data. Noise reduction and isolation are carried out with methods such as Blind Source Separation (BSS). Copula functions afford a better estimate of the dependence structure between the variables. Both the BSS and Copula parameters are then estimated using GGO, which notably enhances their performance. Finally, the air pollution levels are forecasted as a time series using LSTM networks optimized by GGO. The results reveal that the GGO-LSTM optimization exhibits the lowest mean squared error (MSE) compared to the other optimization methods of the proposed model. The results underscore that noise reduction, dependence modeling and parameter optimization provide much insight into air quality. Hence, this integrated framework enables a proper approach to monitoring the environment by offering planners and policymakers information to help in articulating efficient air quality management strategies.
... PCA was chosen as it is typically used in the SoVI literature (Aksha et al., 2019; Tate, 2012; Cutter & Finch, 2008). Indeed, PCA is a popular technique in statistical analyses that involve spatial components: it brings multiple variables together into a lower-dimensional set of components while preserving the variation of the original variables with the least possible loss of information, and it facilitates the interpretation of the original concepts (Libório et al., 2022). Once the data are selected (Level 1 and Level 2), we constructed the vulnerability index following standard steps: PCA rotation, PCA component selection, and a weighting scheme (Schmidtlein et al., 2008). ...
Article
Full-text available
We consider the availability of new harmonized data sources and novel machine learning methodologies in the construction of a social vulnerability index (SoVI), a multidimensional measure that defines how individuals and communities may respond to hazards including natural disasters, economic changes, and global health crises. The factors underpinning social vulnerability (namely, economic status, age, disability, language, ethnicity, and location) are well understood from a theoretical perspective, and existing indices are generally constructed based on specific data chosen to represent these factors. Further, the indices' construction methods generally assume structured, linear relationships among input variables and may not capture subtle nonlinear patterns more reflective of the multidimensionality of social vulnerability. We compare a procedure which considers an increased number of variables to describe the SoVI factors with existing approaches that choose specific variables based on consensus within the social science community. Reproducing the analysis across eight countries, and leveraging deep learning methods which in recent years have been found to be powerful for finding structure in data, we demonstrate that wealth-related factors consistently explain the largest variance and are the most common element in social vulnerability.
Article
Full-text available
This study investigates whether local government expenditure correlates with the Quality of Life (QoL) of the local population and whether high public expenditure is indicative of high or low QoL. Data, including information on public expenditure and objective QoL indicators, were gathered for Finnish municipalities from several existing databases and cover the period 2015–2019. A composite indicator was constructed to measure municipal QoL performance. The results indicate that there is a clear correlation between public expenditure and QoL: the higher the public expenditure, the lower the QoL of the local population. This is due to a greater demand for public goods in municipalities that have low QoL levels. Further, QoL and public expenditure levels are fairly constant over time. There was no evidence that changes in public expenditure and QoL would affect each other in the short term. The added value of this paper is that it fills parts of the research gap concerning our knowledge of the empirical links between public expenditure and QoL at the local level.
Article
The recent Covid-19 pandemic has tremendously changed the livelihoods of slum dwellers due to the sudden loss of occupations and this situation increased their vulnerability. The objective of this paper was to assess the slum dwellers’ livelihood vulnerability by implementing LVI with reference to five livelihood capitals which comprise 27 sub-components. The RCC slum area was deliberately selected as the study area and it is one that is categorized into inner, middle and outer zones based on distance from the CBD. In total, 361 households were selected through simple random sampling from twelve slum areas with a 95% precision level. Primary data were gathered from the three stated slum zones using semi-structured questionnaires that investigated health, knowledge and skills, leadership potential, demographic profile, participation and connection, housing and sanitation, income and finance, total land and water as major components to assess LVI. Results revealed that the outer slum zone was the most vulnerable (0.697) based on overall LVI because financial and physical capital vulnerability were found to be higher. As well, LVI reported that inner (0.560) and middle (0.660) slum zones were categorized as moderate. The study also found that the slums located near the CBD were found to be less vulnerable because they managed to receive basic needs from relief efforts during the pandemic. Inner slum zone dwellers’ human capital (health, knowledge and skills, leadership potential) vulnerability proved to be lower than in the middle and the outer zones. Social capital (demographic profile and participation and connection) vulnerability of the inner zone was better than the other two zones. Overall, less access to own/agricultural land or grazing land and water facilities in slum zones was reported in natural capital vulnerability. 
Radar diagrams showed that vulnerability in all livelihood capitals was higher in the outer zone than in the inner zone, except for natural capital. Finally, the central government should devise appropriate guidelines to reduce livelihood vulnerability, which hugely compromises the lives and livelihoods of slum dwellers.
Article
Full-text available
This paper investigates the role of spatial dependence, spatial heterogeneity and spatial scale in principal component analysis for geographically distributed data. It considers spatial heterogeneity by adopting geographically weighted principal component analysis at a fine spatial resolution. Moreover, it focuses on dependence by introducing a novel approach based on spatial filtering. These methods are applied in order to derive a composite indicator of socioeconomic deprivation in the Italian province of Rome while considering two spatial scales: municipalities and localities. The results show that considering spatial information uncovers a range of issues, including neighbourhood effects, which are useful in order to improve local policies.
Article
Full-text available
The process of regional economic development has implications for population dynamics, which in turn has reciprocal effects on development. The aim of this article is to identify the contribution of the urban hierarchy to the decision to migrate in Brazil between 1980 and 2010. To that end, microdata from the demographic censuses for the period under analysis, provided by IBGE, are analyzed, and logistic regression models are estimated for individual migration status. The data analysis shows that recent population movements in Brazil are closely related to the respective levels of urban-regional development and are associated with the advantages of urban centers. New trends seem to emerge with the increase in migration, including return migration, toward regions traditionally unable to retain their populations, which is significantly tied to the expansion of their growth rates and to the complexity of labor markets in the more dynamic regions of the country. Keywords | regional and local development, migration, urban system.
Article
Full-text available
Ecological vulnerability, as an important evaluation method reflecting regional ecological status and the degree of stability, is the key content in global change and sustainable development. Most studies mainly focus on changes of ecological vulnerability concerning the temporal trend, but rarely take arid and semi-arid areas into consideration to explore the spatial heterogeneity of the ecological vulnerability index (EVI) there. In this study, we selected the Ningxia Hui Autonomous Region on the Loess Plateau of China, a typical arid and semi-arid area, as a case to investigate the spatial heterogeneity of the EVI every five years, from 1990 to 2015. Based on remote sensing data, meteorological data, and economic statistical data, this study first evaluated the temporal‒spatial change of ecological vulnerability in the study area by Geo-information Tupu. Further, we explored the spatial heterogeneity of the ecological vulnerability using Getis-Ord Gi*. Results show that: (1) the regions with high ecological vulnerability are mainly concentrated in the north of the study area, which has high levels of economic growth, while the regions with low ecological vulnerability are mainly distributed in the relatively poor regions in the south of the study area. (2) From 1990 to 2015, ecological vulnerability showed an increasing trend in the study area. Additionally, there is significant transformation between different grades of the EVI, where the area of transformation between a slight vulnerability level and a light vulnerability level accounts for 41.56% of the transformation area. (3) Hot-spot areas of the EVI are mainly concentrated in the north of the study area, and cold-spot areas are mainly concentrated in the center and south of the study area. Spatial heterogeneity of ecological vulnerability is significant in the central and southern areas but insignificant in the north of the study area. 
(4) The grassland area is the main driving factor of the change in ecological vulnerability, which is also affected by both arid and semi-arid climates and ecological projects. This study can provide theoretical references for sustainable development to present feasible suggestions on protection measures and management modes in arid and semi-arid areas.
Article
Full-text available
The study assesses the level of development and disparities in terms of living conditions of the households (HHs) in the districts of the Bundelkhand region. To measure the actual living conditions of the HHs, a Composite Index was developed on the basis of 18 indicators. To assess the living conditions of the HHs further, four indices have been developed, namely Housing Index, Physical Capital Index and Asset Index. The level of development of the districts has been categorized on the basis of the Composite Index value. The results show that there is a wide disparity in living conditions across the districts of the Bundelkhand region. The results also show that the northern part of the Bundelkhand region is more developed than the southern part, and that the districts belonging to Madhya Pradesh have better living conditions than those belonging to Uttar Pradesh. The study suggests that authorities should focus on the proper implementation of existing policies, and that more effective planning and policies should be implemented to improve the living conditions of households in the Bundelkhand region.
Article
Full-text available
Overlaying maps at different scales is associated with geometric problems, the ecological fallacy, and the change of spatial support. This research analyzes works published (2014-2019) in the main geography journals of Brazil that fall within the context of the change-of-support problem for geographic data. In addition to providing evidence of the relevance of this problem to geography, this article aims to propose a procedure that, by making maps at different scales compatible, reduces change-of-spatial-support problems. The procedure is carried out in free Geographic Information System (GIS) software and consists of: (i) inserting more vertices into the elements of the Census Tract (SC) maps of the Brazilian Institute of Geography and Statistics (IBGE) and of the street network of the city of Foz do Iguaçu-PR; and (ii) replacing the coordinates of vertices that show possible overlay errors. Besides making 87% of the SCs compatible and showing adjustment errors in 2.81% of cases, the procedure does not require developing algorithms or code, which are necessary in earlier proposals.
Article
Full-text available
The reasons for and against composite indicators are briefly reviewed, as well as the available theories for their construction. After noting the strong normative dimension of these measures—which ultimately aim to ‘tell a story’, e.g. to promote the social discovery of a particular phenomenon, we inquire whether a less partisan use of a composite indicator can be proposed by allowing more latitude in the framing of its construction. We thus explore whether a composite indicator can be built to tell ‘more than one story’ and test this in practical contexts. These include measures used in convergence analysis in the field of cohesion policies and a recent case involving the World Bank’s Doing Business Index. Our experiments are built to imagine different constituencies and stakeholders who agree on the use of evidence and of statistical information while differing on the interpretation of what is relevant and vital.
Article
Full-text available
Rural public health still faces serious challenges in China. These challenges in rural public health reduce peasants’ well-being and social satisfaction. Examining the social factors of rural public health helps improve the public health in rural areas. This study attempts to characterize the relationship between social deprivation and rural public health in China. In particular, 14 indicators are integrated for assessing social deprivation, which is described from five domains: income, employment, education, housing and demographic structure. The analytic hierarchy process, Delphi method, entropy method and coefficient variation method are selected as weight determining methods to evaluate the corresponding weights for the indicators of social deprivation. Then, the best models are selected from ordinary least squares regression models, spatial lag models and spatial error models according to the performances of these models. The results of assessing social deprivation indicate that the spatial distribution of social deprivation has great heterogeneity in rural China. In addition, the relative levels of social deprivation among 31 provinces that are estimated by different weight determination methods remain stable. Finally, the spatial regression models reveal that social deprivation is a positive contributor to the maternal mortality rate and child mortality rate, while social deprivation presents negative relationships with the proportion of healthy elders.
Article
Neighborhood socioeconomic disadvantage is a measure of socio-spatial inequality that has been shown to be associated with a variety of social, economic, and health outcomes. Existing studies that explore the local patterning of disadvantage often construct composite indices that summarize the interactions between multiple dimensions of social status, but do not consider if, and how, disadvantage exhibits spatial structure. This study applies a Bayesian multivariate factor analytic modeling approach to examine the spatial structure of socioeconomic disadvantage in Toronto, Canada. Socioeconomic disadvantage is modeled as an area-based composite index associated with three variables measuring low income, low-educational attainment, and low occupational status, and a series of models with different assumptions regarding the spatial structure of disadvantage are compared. The best-fitting model shows that the prevalence of low-income households has the strongest positive association with disadvantage and that spatial clustering is three times more important than spatial heterogeneity for explaining the spatial structure of disadvantage. The implications of this study for analyzing multivariate spatial data and for understanding the interactions amongst multiple dimensions of disadvantage are discussed.
Article
Chlorophyll-a is an established indexing marker for phytoplankton abundance and biomass amongst primary food producers in an aquatic ecosystem. Understanding and modeling the level of Chlorophyll-a as a function of environmental parameters have been found to be very beneficial for the management of coastal ecosystems. This study developed a mathematical model to predict Chlorophyll-a concentrations based on a data-driven modeling approach. The prediction model was developed using principal component analysis (PCA) and multiple linear regression analysis (MLR) approaches. The predictive success (R²) of the model was found to be ~84.8% for the first approach and ~83.8% for the second approach. A final model was generated using a combined principal component scores (PCS) and MLR approach that involves fewer parameters and has a predictive ability of 83.6%. The PCS-MLR method helped to identify the relationship amongst dependent as well as predictor variables and eliminated collinearity problems. The final model is quite simple and intuitive and can be used to understand real system operations.
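The idea behind a PCS-MLR model (regress the response on principal component scores instead of on collinear raw predictors) can be sketched as follows. This is a generic illustration on synthetic data, not the cited study's model: the predictors, the response, and the choice of two retained components are assumptions.

```python
import numpy as np

# Synthetic data (assumption): four predictors forming two nearly collinear pairs.
rng = np.random.default_rng(1)
n = 300
base = rng.normal(size=(n, 2))
X = np.column_stack([
    base[:, 0], base[:, 0] + 0.05 * rng.normal(size=n),
    base[:, 1], base[:, 1] + 0.05 * rng.normal(size=n),
])
y = 2.0 * base[:, 0] - 1.0 * base[:, 1] + 0.5 * rng.normal(size=n)

# PCA on the standardized predictors, largest eigenvalue first.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
order = np.argsort(eigvals)[::-1]
k = 2                                   # number of retained components (assumed)
scores = Z @ eigvecs[:, order[:k]]      # principal component scores (PCS)

# Multiple linear regression of y on the scores, with an intercept.
A = np.column_stack([np.ones(n), scores])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
pred = A @ coef
r2 = 1.0 - ((y - pred) ** 2).sum() / ((y - y.mean()) ** 2).sum()

# The scores are uncorrelated, so the regression design is free of collinearity.
score_corr = np.corrcoef(scores, rowvar=False)[0, 1]
print(f"R^2 = {r2:.3f}, correlation between retained scores = {score_corr:.1e}")
```

Because the retained scores are mutually uncorrelated, their regression coefficients can be estimated stably even when the original predictors are highly collinear. The cost is that each coefficient refers to a combination of variables rather than to a single predictor.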
Article
We propose a longitudinal principal component analysis method for multivariate longitudinal data using a random-effects eigen-decomposition, where the eigen-decomposition utilizes longitudinal information through nonparametric splines and the multivariate random effects incorporate significant store-wise heterogeneity. Our method can effectively analyze large marketing data containing sales information for products from hundreds of stores over an 11-year time period. The proposed method leads to more accurate estimation and interpretation compared to existing approaches. We illustrate our method through simulation studies and an application to marketing data from IRI.