Chapter

The Cost-Benefit Analysis of Data Accuracy in City Development Strategy: Exploring the Trade-Off Between Accuracy and Cost


Abstract

In the realm of urban planning and infrastructure development, the fusion of expansive data and cutting-edge analytics has ushered in an era where cities can craft their trajectories based on evidence and foresight. This data-driven approach empowers municipalities to optimize resource allocation, enhance operational efficiency, and ultimately elevate the quality of services provided to residents. However, the pursuit of data accuracy, a linchpin in this paradigm, is not without its hurdles, including the challenges of cost, time constraints, and varying data availability. Despite these obstacles, the significance of accurate data cannot be overstated. It serves as the cornerstone for informed decision-making, enabling city governments to pinpoint and rectify disparities, channel resources with precision, and curtail inefficiencies. This paper undertakes a comprehensive examination of the cost-effectiveness of data accuracy within the framework of city development strategy, employing a meticulous cost-benefit analysis. By delineating the nuanced relationship between accuracy and cost, the research aims to identify the optimal threshold for data accuracy. Armed with this knowledge, city governments can make judicious choices, harnessing the transformative potential of accurate data to sculpt more equitable, sustainable, and livable urban landscapes. The findings of this study promise to guide cities toward a future where precision in data informs policy, fostering cities that are not only resilient but also responsive to the diverse needs of their inhabitants.
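The trade-off described in the abstract can be pictured with a toy calculation: net benefit rises while extra accuracy is cheap and falls once the marginal cost of improving data quality outpaces the marginal benefit. The sketch below uses assumed benefit and cost curves, not figures from the paper, to locate such an optimal threshold.

```python
# Illustrative sketch of the accuracy-versus-cost trade-off described above.
# The benefit and cost curves are assumed functional forms, not the paper's data.
import numpy as np

accuracy = np.linspace(0.50, 0.99, 200)           # candidate data-accuracy levels

# Assumed diminishing returns: each extra point of accuracy adds less benefit.
benefit = 100 * np.log1p(10 * (accuracy - 0.5))

# Assumed escalating cost: pushing accuracy toward 100% becomes very expensive.
cost = 20 / (1.0 - accuracy)

net_benefit = benefit - cost
optimal = accuracy[np.argmax(net_benefit)]        # accuracy level with the largest net benefit

print(f"Optimal accuracy threshold (under these assumptions): {optimal:.2f}")
```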


References
Article
Full-text available
Forecasting economic growth is critical for formulating national economic development policies. Neural networks are a type of artificial intelligence that can model complex target functions, and artificial neural networks (ANNs) are among the most effective learning approaches available for certain tasks, such as learning to interpret complex real-world sensor data. This paper proposes a regional economic prediction model based on neural network techniques, in which a Bayesian vector neural network (BVNN) is integrated with a backpropagation (BP) model. The database was collected from the economy of a particular region and was extracted and classified using knowledge-based computer analysis by neural networks; discretization, reduction, importance ranking, and prediction rules are the attributes considered. The extracted important components are then fed into the network as the input training sample, a strategy that improves training speed and prediction accuracy by reducing the structure of the network. WEO, APDREO, and AFRREO are the datasets used, and FWA-SVR and LSTM are the existing methods taken for comparison. For the WEO dataset, GDP prediction of 97% and accuracy of 98% are produced; for the APDREO dataset, accuracy of 92% and GDP prediction of 97% are obtained; and for the AFRREO dataset, accuracy of 98% is produced. The experimental results show that the neural network can tackle nonlinear problems and that the technique is effective and viable with high accuracy, giving the model good reference value for practical application. The proposed model reduces error by increasing the convergence rate and accuracy for each dataset.
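The BVNN/BP hybrid itself is not reproduced here, but the general workflow the abstract describes — feed selected regional indicators into a feedforward network and evaluate the regression fit — can be sketched as follows. All feature names and values are synthetic assumptions, not the WEO/APDREO/AFRREO records.

```python
# Minimal sketch of a feedforward neural-network regressor for an economic
# indicator such as GDP growth; synthetic data stands in for the datasets
# used in the paper, and the BVNN/BP hybrid itself is not reproduced.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))                 # hypothetical regional indicators
y = X @ np.array([0.4, -0.2, 0.1, 0.3, 0.0, 0.2]) + rng.normal(0.0, 0.1, 500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_train)

model = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0)
model.fit(scaler.transform(X_train), y_train)
print("Test R^2:", model.score(scaler.transform(X_test), y_test))
```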
Article
Full-text available
The issue of property evaluation and appraisal has been of high interest to private and public agents involved in the housing industry for the purposes of trade, insurance and tax. This paper investigates how different factors related to the location of a property affect its price over time. The predictive models applied in this research are driven by real estate transaction data for the Tehran Metropolitan Area, captured from open data available to the public. The parameters of the functions that describe the behaviour of the housing market are estimated by applying different types of statistical models, including ordinary least squares (OLS), geographically weighted regression (GWR) and geographically and temporally weighted regression (GTWR), and these models are run to compare their efficiency and accuracy in predicting variations in housing price. The GTWR model showed significantly better performance than the OLS and GWR models, with the goodness-of-fit index (adjusted R²) improving by 22 percent. Spatio-temporal non-stationary modelling is therefore significant in explaining variations in housing value, and the GTWR coefficients were found to be more reliable. Three internal factors (size of building; building age; building quality) and eight external factors (topography; land-use mix; population density; distance to city centre; distance to subway station; distance to regional parks; distance to highway; distance to airport) influence the property price, either positively or negatively. Moreover, using the significant variables extracted from the regression models, the optimal number of housing value clusters is determined using the spatial 'k'luster analysis by tree edge removal (SKATER) method, and five clusters of housing patterns were recognized. The policy implication of this paper is the grouping of Metropolitan Tehran housing value data into five clusters with different characteristics; the factors influencing housing value differ in each cluster, making this data analysis technique useful for policy-makers in the housing sector.
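As a point of reference for the models compared above, an ordinary least squares hedonic regression on a few of the listed factors can be sketched as below. The data are synthetic stand-ins, and the GWR/GTWR variants would require a dedicated spatial library (e.g. mgwr), which is not shown.

```python
# Minimal OLS hedonic-pricing sketch on synthetic records; the GWR/GTWR models
# discussed in the paper require spatial libraries and are omitted here.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 1000
size = rng.uniform(40, 200, n)                # building size (m^2), assumed
age = rng.uniform(0, 40, n)                   # building age (years), assumed
dist_center = rng.uniform(0, 20, n)           # distance to city centre (km), assumed
price = 30 * size - 15 * age - 40 * dist_center + rng.normal(0, 200, n)

X = np.column_stack([size, age, dist_center])
ols = LinearRegression().fit(X, price)
print("Coefficients:", ols.coef_, "R^2:", ols.score(X, price))
```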
Article
Full-text available
The technological landscape of intelligent transport systems (ITS) has been radically transformed by the emergence of big data streams generated by the Internet of Things (IoT), smart sensors, surveillance feeds and social media, as well as growing infrastructure needs. It is timely and pertinent that ITS harness the potential of artificial intelligence (AI) to develop big data-driven smart traffic management solutions for effective decision-making. Existing AI techniques that function in isolation exhibit clear limitations in developing a comprehensive platform, owing to the dynamicity of big data streams, high-frequency unlabeled data generation from heterogeneous data sources, and the volatility of traffic conditions. In this paper, we propose an expansive smart traffic management platform (STMP) based on unsupervised online incremental machine learning, deep learning, and deep reinforcement learning to address these limitations. The STMP integrates heterogeneous big data streams, such as the IoT, smart sensors, and social media, to detect concept drift, distinguish between recurrent and non-recurrent traffic events, and perform impact propagation analysis, traffic flow forecasting, commuter sentiment analysis, and optimized traffic control. The platform is successfully demonstrated on 190 million records of smart sensor network traffic data generated by 545,851 commuters and corresponding social media data on the arterial road network of Victoria, Australia.
Article
Full-text available
In recent years, modern metropolitan areas have become key indicators of a nation's economic growth. In metropolitan areas, the number and frequency of vehicles have increased tremendously, creating issues such as traffic congestion, accidents, environmental pollution, economic losses and unnecessary waste of fuel. In this paper, we propose a traffic management system based on prediction information to reduce these issues in a metropolitan area. The proposed system makes use of static and mobile agents, where the static agent in a region creates and dispatches mobile agents to zones within the metropolitan area. The migrated mobile agents use an emergent intelligence technique to collect and share traffic flow parameters (speed and density), historical data, resource information, spatio-temporal data and so on, which are analysed by the static agent. The emergent intelligence technique at the static agent uses the analysed, historical and spatio-temporal data to monitor and predict the expected patterns of traffic density (commuters and vehicles) and travel times in each zone and region. The static agent then uses the predicted and analysed data to choose optimal routes for diverting traffic, ensuring smooth traffic flow and reducing the frequency of traffic congestion, traffic density and travel time. The performance analysis is carried out in a realistic scenario by integrating NS2, SUMO, OpenStreetMap (OSM) and the MOVE tool, and the effectiveness of the proposed approach is compared with an existing approach.
Chapter
Full-text available
Preemptive measures are of utmost importance for crime prevention, and law enforcement agencies need an agile approach to solve ever-changing crimes. Data analytics has proven effective in the field of crime data analysis, and countries such as the United States of America have benefited from this approach. The Government of India has also taken an initiative to implement data analytics to facilitate crime prevention measures. In this research paper, we use R Studio, an open-source data mining tool, to perform data analysis on the crime dataset shared by the Gujarat Police Department. To develop predictive models and study crime patterns, we used various supervised and unsupervised data mining techniques, such as multiple linear regression, K-Means clustering and association rules analysis. The scope of this research paper is to showcase the effectiveness of data mining in the domain of crime prevention. In addition, an effort has been made to help the Gujarat Police Department analyse their crime records and provide meaningful insights for decision-making when solving the recorded cases.
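The study performed its analysis in R Studio; a comparable K-Means clustering step — one of the techniques listed — might look like the following in Python. The incident coordinates are synthetic, not the Gujarat Police records.

```python
# Hypothetical K-Means clustering of crime incident locations (synthetic data).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
incidents = np.vstack([
    rng.normal([23.03, 72.58], 0.02, size=(100, 2)),   # hypothetical hot spot A
    rng.normal([22.30, 70.80], 0.02, size=(100, 2)),   # hypothetical hot spot B
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(incidents)
print("Cluster centres:", kmeans.cluster_centers_)
```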
Article
Full-text available
Although the literature indicates that big data and predictive analytics (BDPA) convey a distinct organisational capability, little is known about their performance effects in particular contextual conditions (inter alia, national context and culture, and firm size). Grounding our investigation in the dynamic capabilities view and organisational culture, and based on a sample of 205 Indian manufacturing organisations, we empirically investigate the effects of BDPA on social performance (SP) and environmental performance (EP) using variance-based structural equation modelling (i.e. PLS). We find that BDPA has a significant impact on SP/EP; however, we did not find evidence for a moderating role of flexible orientation or control orientation in the links between BDPA and SP/EP. Our findings offer a more nuanced understanding of the performance implications of BDPA, thereby addressing the crucial questions of how and when BDPA can enhance social and environmental sustainability in supply chains.
Article
Full-text available
Recently, Artificial Intelligence (AI) has been used widely in the medicine and health care sector, and classification and prediction are major fields of AI within machine learning. Today, the study of existing predictive models based on machine learning methods is extremely active. Doctors need accurate predictions of the outcomes of their patients' diseases, and for accurate predictions, timing is another significant factor that influences treatment decisions. In this paper, existing predictive models in medicine and health care are critically reviewed. Furthermore, the most prominent machine learning methods are explained, and the confusion between statistical approaches and machine learning is clarified. A review of the related literature reveals that the predictions of existing models differ even when the same data set is used. Therefore, existing predictive models are essential, but current methods must be improved.
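The accuracy comparisons such reviews discuss are usually reported through a confusion matrix and derived scores; a minimal illustration with made-up outcomes and predictions is shown below.

```python
# A minimal illustration of how the accuracy of a clinical prediction model is
# typically reported; labels and predictions here are made up.
from sklearn.metrics import confusion_matrix, accuracy_score, recall_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]   # hypothetical patient outcomes
y_pred = [1, 0, 1, 0, 0, 0, 1, 1, 1, 0]   # hypothetical model predictions

print(confusion_matrix(y_true, y_pred))
print("Accuracy:", accuracy_score(y_true, y_pred))
print("Sensitivity (recall):", recall_score(y_true, y_pred))
```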
Article
Full-text available
A comprehensive assessment of the performance of predictive models is necessary, as they are increasingly employed to generate spatial predictions for environmental management and conservation, and their accuracy is crucial to evidence-informed decision making and policy. In this study, we clarified relevant issues associated with the variance explained by predictive models (VEcv), established the relationships between VEcv and commonly used accuracy measures, and unified these measures under VEcv, which is independent of unit/scale and data variation. We quantified the relationships between these measures and data variation and found that about 65% of compared models, and over 45% of models recommended for generating spatial predictions, explained no more than 50% of the data variance. We classified the predictive models based on VEcv, which provides a tool to directly compare the accuracy of predictive models for data with different units/scales and variation, and establishes a cross-disciplinary context and benchmark for assessing predictive models in future studies.
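Assuming the usual cross-validated definition of variance explained, VEcv is one minus the ratio of residual to total sum of squares, expressed as a percentage; a small helper function makes the measure concrete (toy numbers, not the study's data).

```python
# VEcv as described above: the proportion of data variance explained by the
# model's (cross-validated) predictions, expressed as a percentage.
import numpy as np

def vecv(observed, predicted):
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    sse = np.sum((observed - predicted) ** 2)          # residual sum of squares
    sst = np.sum((observed - observed.mean()) ** 2)    # total sum of squares
    return (1.0 - sse / sst) * 100.0

print(vecv([3.0, 5.0, 7.0, 9.0], [2.8, 5.3, 6.5, 9.4]))  # toy example
```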
Article
Full-text available
While the use of mapping in criminal justice has increased over the last 30 years, most applications are retrospective; that is, they examine criminal phenomena and related factors that have already occurred. While such retrospective mapping efforts are useful, the true promise of crime mapping lies in its ability to identify early warning signs across time and space and to inform a proactive approach to police problem solving and crime prevention. Recently, attempts to develop predictive models of crime have increased, and while many of these efforts are still in the early stages, enough new knowledge has been built to merit a review of the range of methods employed to date. This chapter identifies the various methods, describes what is required to use them, and assesses how accurate they are in predicting future crime concentrations, or "hot spots." Factors such as data requirements and applicability for law enforcement use are also explored, and the chapter closes with recommendations for further research and a discussion of what the future might hold for crime forecasting.
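A purely retrospective hot-spot map of the kind the chapter contrasts with forecasting can be produced by binning past incidents on a grid and ranking cells by count; the sketch below uses synthetic coordinates.

```python
# Naive retrospective hot-spot map: bin past incidents on a grid and rank cells
# by count (synthetic incident coordinates).
import numpy as np

rng = np.random.default_rng(4)
xy = rng.normal(loc=[0.3, 0.7], scale=0.1, size=(500, 2))   # hypothetical incident locations

counts, xedges, yedges = np.histogram2d(xy[:, 0], xy[:, 1], bins=10, range=[[0, 1], [0, 1]])
hot_i, hot_j = np.unravel_index(np.argmax(counts), counts.shape)
print("Hottest cell:", (hot_i, hot_j), "with", int(counts[hot_i, hot_j]), "incidents")
```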
Article
Full-text available
The purpose of this paper is to develop an approach to a resource-allocation problem that typically appears in organizations with a centralized decision-making environment, for example, supermarket chains, banks, and universities. The central unit is assumed to be interested in maximizing the total amount of outputs produced by the individual units by allocating available resources to them. We will develop an interactive formal approach based on data envelopment analysis (DEA) and multiple-objective linear programming (MOLP) to find the most preferred allocation plan. The units are assumed to be able to modify their production in the current production possibility set within certain assumptions. Various assumptions are considered concerning returns to scale and the ability of each unit to modify its production plan. Numerical examples are used to illustrate the approach.
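The interactive DEA/MOLP procedure itself is not reproduced here, but the underlying allocation problem can be illustrated with a single-resource linear program: the central unit splits a fixed budget across units with assumed productivities so that total output is maximised.

```python
# Highly simplified allocation sketch in the spirit of the paper: one resource,
# three units with assumed constant productivities; the full DEA/MOLP interactive
# procedure is not reproduced.
import numpy as np
from scipy.optimize import linprog

productivity = np.array([1.2, 0.9, 1.5])      # output per unit of resource (assumed)
total_resource = 100.0

res = linprog(
    c=-productivity,                           # linprog minimises, so negate to maximise output
    A_ub=[[1.0, 1.0, 1.0]],
    b_ub=[total_resource],
    bounds=[(10.0, 60.0)] * 3,                 # each unit must receive between 10 and 60 units
    method="highs",
)
print("Allocation:", res.x, "Total output:", -res.fun)
```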
Article
Recent advancements in renewable energy provide a clean alternative to fossil fuels. Smart grids (SGs) work by collecting data about customer requests, comparing them to current supply data, computing electricity costs, and so on. Because these processes are time-dependent, a dynamic estimate of SG stability is a critical system need, and analysing fluctuations and disturbances dynamically is critical for SGs to work properly. Recent advancements in data science, machine learning (ML) and deep learning (DL) have facilitated the creation of useful stability prediction models for the SG environment. In this regard, this study provides an Artificial Hummingbird (AHB) algorithm-based feature selection model with optimal DL-enabled stability prediction (AHBFS-ODLSP) for the SG environment. The AHBFS-ODLSP model is primarily concerned with the design of an AHB-based feature selection technique. A prediction system based on a Multi-Headed Self-Attention Long Short-Term Memory (MHSA-LSTM) network is then developed to predict the stability level, and the MHSA-LSTM hyperparameters are adjusted using the symbiotic organism search (SOS) optimization technique. The AHBFS and SOS algorithm designs have a significant impact on the stability-prediction performance of the MHSA-LSTM model. The behaviour of the AHBFS-ODLSP model is demonstrated through a number of simulations, and the outcomes are assessed in a variety of ways. The AHBFS-ODLSP method achieves its maximum performance with an F-score of 99.02%, and a thorough comparative analysis shows that it outperforms other prediction models.
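The attention heads, AHB feature selection and SOS hyperparameter search are specific to the paper and are not reproduced here; a plain LSTM classifier skeleton of the kind such a predictor builds on might look like this (the feature count and sequence length are assumptions).

```python
# Skeleton of a plain LSTM stability classifier; the attention heads, AHB feature
# selection and SOS hyperparameter search from the paper are not reproduced.
import torch
import torch.nn as nn

class StabilityLSTM(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)           # stable vs. unstable

    def forward(self, x):                          # x: (batch, time, features)
        _, (h_n, _) = self.lstm(x)
        return self.head(h_n[-1])                  # logits from the last hidden state

model = StabilityLSTM(n_features=12)
dummy = torch.randn(8, 30, 12)                     # 8 sequences, 30 steps, 12 assumed grid signals
print(model(dummy).shape)                          # -> torch.Size([8, 2])
```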
Chapter
Big data analytics (BDA) is organized around recognizing diverse data group models, data relationships and movements within huge amounts of information. This chapter considers BDA for criminal data, where investigative data analysis is carried out to reconstruct the flow of criminals' activities. The rate of cybercrime on the internet is growing, violating the public's digital access. Traditional crime data cannot be analysed effectively to catch criminals, and the varied structure of criminal data must be analysed to detect criminal activities; BDA is therefore well suited to the huge volumes of information on different criminal activities. Collecting data and relating it to geographic locations is important for secure data analytics. The internet has expanded many kinds of business and individual activity in response to public and private demand, and in this scenario cybercrime incidents occur through several digital media. Identifying crime characteristics is a big challenge because criminals are often smarter than the investigation process. It is therefore necessary to develop technology that identifies criminal activities using Geographic Information Systems (GIS) and machine learning techniques, so that criminals can be caught based on their tracks of activity.
Article
This paper takes the Shenzhen Futian comprehensive transportation junction as a case study and makes use of continuous real-time dynamic traffic information to monitor and analyse the spatial and temporal distribution of passenger flow under different means of transportation, as well as the service capacity of the junction, from multi-dimensional space-time perspectives such as different periods and special periods. A virtual reality geographic information system is employed to present the forecasting results.
Article
House sales are assessed based on the Standard & Poor's Case-Shiller home price indices and the housing price index of the Office of Federal Housing Enterprise Oversight (OFHEO), which reflect the trends of the US housing market. In addition to these indices, the development of a housing price prediction model can greatly assist in forecasting future housing prices and establishing real estate policies. This study uses machine learning algorithms as a research methodology to develop a housing price prediction model. To improve the accuracy of housing price prediction, this paper analyses the housing data of 5359 townhouses in Fairfax County, Virginia, gathered by the Multiple Listing Service (MLS) of the Metropolitan Regional Information Systems (MRIS). We develop a housing price prediction model based on machine learning algorithms such as C4.5, RIPPER, Naïve Bayes and AdaBoost and compare their classification accuracy. We then propose an improved housing price prediction model to assist a house seller or a real estate agent in making better-informed decisions based on house price valuation. The experiments demonstrate that the RIPPER algorithm consistently outperforms the other models in housing price prediction accuracy.
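RIPPER has no scikit-learn implementation, so a rough Python analogue of the comparison step might substitute a decision tree for C4.5 and omit RIPPER; the sketch below runs cross-validated accuracy on synthetic house records, not the Fairfax County MLS data.

```python
# Sketch of the model-comparison step with scikit-learn stand-ins: a decision
# tree in place of C4.5, and RIPPER omitted; the data are synthetic.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
X = rng.normal(size=(800, 5))                          # hypothetical house attributes
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)          # "above/below median price" label

for name, clf in [("Decision tree", DecisionTreeClassifier(random_state=0)),
                  ("Naive Bayes", GaussianNB()),
                  ("AdaBoost", AdaBoostClassifier(random_state=0))]:
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```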
Article
Various models have been developed over the past several decades to predict the dynamic modulus |E*| of hot-mix asphalt (HMA) based on regression analysis of laboratory measurements. The models most widely used in the asphalt community today are the Witczak 1999 and 2006 predictive models. Although the overall predictive accuracies for these existing models as reported by their developers are quite high, the models generally tend to overemphasize the influence of temperature and understate the influence of other mixture characteristics. Model accuracy also tends to fall off at the low and high temperature extremes. Recently, researchers at Iowa State Univ. have developed a novel approach for predicting HMA |E*| using an artificial neural network (ANN) methodology. This paper discusses the accuracy and robustness of the various predictive models (Witczak 1999 and 2006 and ANN-based models) for estimating the HMA |E*| inputs needed for the new mechanistic-empirical pavement design guide. The ANN-based |E*| models using the same input variables exhibit significantly better overall prediction accuracy, better local accuracy at high and low temperature extremes, less prediction bias, and better balance between temperature and mixture influences than do their regression-based counterparts. As a consequence, the ANN models as a group are better able to rank mixtures in the same order as measured |E*| for fixed (e.g., project-specific) environmental and design traffic conditions. The ANN models as a group also produced the best agreement between predicted rutting and alligator cracking computed using predicted versus measured |E*| values for a typical pavement scenario.
Article
This paper investigates the driving forces, emission trends and reduction potential of China's carbon dioxide (CO2) emissions based on a provincial panel data set covering the years 1995 to 2009. A series of static and dynamic panel data models are estimated, and then an optimal forecasting model selected by out-of-sample criteria is used to forecast the emission trend and reduction potential up to 2020. The estimation results show that economic development, technology progress and industry structure are the most important factors affecting China's CO2 emissions, while the impacts of energy consumption structure, trade openness and urbanization level are negligible. The inverted U-shaped relationship between per capita CO2 emissions and economic development level is not strongly supported by the estimation results. The impact of capital adjustment speed is significant. Scenario simulations further show that per capita and aggregate CO2 emissions of China will increase continuously up to 2020 under any of the three scenarios developed in this study, but the reduction potential is large.
Article
Predictive modelling of species geographical distributions is a thriving ecological and biogeographical discipline. Major advances in its conceptual foundation and applications have taken place recently, as well as the delineation of the outstanding challenges still to be met (Araujo & Guisan, 2006; Guisan …
Article
This paper deals with the problem of ranking alternatives under multiple criteria, and a fuzzy credibility relation (FCR) method is proposed. Owing to the vague concepts represented in decision data, the rating of each alternative and the weight of each criterion are expressed as fuzzy numbers. Concordance, discordance and support indices are then defined, and by aggregating the concordance and support indices, a fuzzy credibility relation is calculated to represent the intensity of the preference of one alternative over another. Finally, according to the fuzzy credibility relation, the ranking order of all alternatives can be determined. A numerical example is solved at the end of the paper to illustrate the procedure of the FCR method.
Article
Growing concern over traffic safety has led to research efforts directed towards predicting freeway crashes in an Advanced Traffic Management and Information Systems (ATMIS) environment. This paper aims at developing a crash-likelihood prediction model using real-time traffic-flow variables (measured through a series of underground sensors) and rain data (collected at weather stations) potentially associated with crash occurrence. Archived loop detector data, rain data and historical crash data have been used to calibrate the model, which can then be run on online loop and rain data to identify high crash potential in real time. Principal component analysis (PCA) and logistic regression (LR) have been used to estimate a weather model that determines a rain index based on the rain readings at the weather station in the proximity of the freeway. A matched case-control logit model has also been used to model the crash potential based on traffic loop data and the rain index. The 5-min average occupancy and standard deviation of volume observed at the downstream station, and the 5-min coefficient of variation in speed at the station closest to the crash, all during the 5-10 min prior to the crash, along with the rain index, have been found to affect crash occurrence most significantly.
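The paper's weather model and matched case-control logit are not reproduced here; a simplified pipeline in the same spirit — standardise, reduce with PCA, then fit a plain logistic regression — is sketched below on synthetic traffic and rain features.

```python
# Sketch of a PCA-plus-logistic-regression pipeline on synthetic traffic/rain
# features; the paper's matched case-control design is not reproduced.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(5)
X = rng.normal(size=(1000, 6))                 # e.g. occupancy, speed variation, rain readings (assumed)
y = (X[:, 0] + 0.8 * X[:, 3] + rng.normal(0, 1, 1000) > 1.5).astype(int)   # 1 = crash-prone

pipe = make_pipeline(StandardScaler(), PCA(n_components=3), LogisticRegression())
pipe.fit(X, y)
print("Training accuracy:", pipe.score(X, y))
```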
Ferreira, J., João, P., & Martins, J. (2012). GIS for crime analysis: Geography for predictive models. Electronic Journal of Information Systems Evaluation, 15(1), 36-49.
Li, X., Lv, Z., Hu, J., Zhang, B., Yin, L., Zhong, C., & Feng, S. (2015, May). Traffic management and forecasting system based on 3D GIS. 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 991-998.
Bhuyan, H. K., & Pani, S. K. (2021). Crime predictive model using big data analytics. In Intelligent Data Analytics for Terror Threat Prediction: Architectures, Methodologies, Techniques and Applications, pp. 57-78.