
Gianmarco Alberti
University of Malta · Department of Criminology
PhD
75
Publications
137,717
Reads
409
Citations
Introduction
Senior Lecturer (PhD) in Spatial Forensics at University of Malta. Experienced academic with extensive expertise in applying advanced data analytical techniques across diverse research fields such as criminology, archaeology, and social sciences. Skilled in developing custom statistical software, with a strong track record in teaching, research, and publication. Fellow of the Royal Statistical Society. H-index: 16, with 580+ citations as of January 2025.
Additional affiliations
October 2016 - present
Education
January 2009 - February 2012
January 2004 - December 2007
January 1995 - December 2002
Publications (75)
Johnson's scalar stress theory, describing the mechanics of (and the remedies to) the increase in in-group conflictuality that parallels the increase in groups' size, provides scholars with a useful theoretical framework for the understanding of different aspects of the material culture of past communities (i.e., social organization, communal food...
The present study seeks to understand the determinants of land agricultural suitability in Malta before heavy mechanization. A GIS-based Logistic Regression model is built on the basis of the data from mid-1800s cadastral maps (cabreo). This is the first time that such data are being used for the purpose of building a predictive model. The maps rec...
This book offers a clear and accessible guide to cross-tabulation analysis, transforming a complex subject into an accessible topic. It diverges from traditional statistical texts, adopting a conversational tone that addresses common questions and concerns. The author demystifies intricate concepts, with clear explanations and relatable analogies t...
Landscapes have been shaped and reshaped by humans to meet the changing needs of shifting subsistence strategies and demographic patterns. In the Mediterranean region, a widespread subsistence strategy that has left a major imprint is pastoralism, often tied with transhumance. Pastoralism and the associated tensions between pastoralists and settled...
2x2 Table Association Chatbot is based on Gianmarco Alberti's book, "From Data to Insights: A Beginner's Guide to Cross-Tabulation Analysis." Whether you're a student, researcher, or data enthusiast, this chatbot simplifies the process of understanding and interpreting associations between two variables in a 2x2 contingency table.
The chatbot offe...
https://chat.openai.com/g/g-iIpc95xWl-thematic-analysis-assistant
https://chat.openai.com/g/g-bpedLaM3C-editorial-assistant
Introducing "stratastats", your go-to R package for stratified data analysis. It's packed with advanced tools including odds ratios, chi-squared tests, and the Cochran-Mantel-Haenszel (CMH) and Breslow-Day (BD) tests for assessing association and homogeneity. Coupled with clear interpretation guidelines, it may prove valuable for epidemiologists, m...
This vignette aims at showing the use of the current version of the 'chisquare' package. Analyze categorical data effortlessly, derive insights from cross-tabulations, and visualize significant associations with ease. Perfect for researchers and data enthusiasts! It is available from CRAN.
This vignette shows how to use the current version of the landform package. The package classifies a landscape into categories based on the Topographic Position Index (TPI) and slope. It offers two types of classification: Slope Position Classification (Weiss 2001) and Landform Classification (Weiss 2001; Jenness 2003). The packa...
This document provides an introduction to the caresid R package. The package contains a function of the same name, caresid(), which carries out Correspondence Analysis (CA) on an input contingency table and creates a scatterplot of the row and column points on the selected dimensions. More importantly, the function can add segments to the...
The document provides an introduction to the 'boxplotcluster' R package. The package contains a function of the same name, boxplotcluster, that implements a special clustering method based on boxplot statistics. Version 0.2 will be released on CRAN in June 2023.
This vignette shows how to use the current version of the caplot package. The package performs Correspondence Analysis on the input dataframe and plots the results in a scatterplot that emphasizes the geometric interpretation of the analysis (see Borg-Groenen 2005; Yelland 2010). It is particularly useful for highlighting the rel...
This vignette aims at showing the use of version 0.2 of the brsim package. To hopefully enhance clarity, it is organised as a sequence of tasks. For more details about the brsim() function, its parameters, for the returned values, and for relevant literature, users are referred to the package’s help documentation. Version 0.2 will be released to CR...
The 'brsim' package provides the facility to calculate the Brainerd-Robinson similarity coefficient for the rows of an input table, and to calculate the significance of each coefficient based on a permutation approach. Optionally, hierarchical agglomerative clustering can be performed and the silhouette method is used to identify an optimal number...
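As a rough illustration of the coefficient itself (not of the 'brsim' package, which is written in R), the Brainerd-Robinson similarity between two rows of counts is 200 minus the sum of the absolute differences between their row percentages. The Python sketch below assumes that standard definition; the function name `brainerd_robinson` is hypothetical, and the permutation-based significance and clustering steps are not covered.

```python
def brainerd_robinson(row_a, row_b):
    """Brainerd-Robinson similarity between two rows of counts.

    Each row is converted to percentages; the coefficient is
    200 minus the summed absolute percentage differences, so it
    ranges from 0 (no overlap) to 200 (identical profiles).
    """
    total_a, total_b = sum(row_a), sum(row_b)
    pct_a = [100.0 * x / total_a for x in row_a]
    pct_b = [100.0 * x / total_b for x in row_b]
    return 200.0 - sum(abs(a - b) for a, b in zip(pct_a, pct_b))
```

Two rows with the same proportions score 200 regardless of their totals, while rows with no category in common score 0.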
This vignette aims at showing the use of the current version of the movecost package and of its functions. To hopefully enhance clarity, it is organised as a sequence of tasks. In-built datasets will be used throughout this document. For more details about each function’s parameter, for the values returned by each function, and for relevant literat...
The Digital Transformation of the Real World entails the need to move from analogue to digital to virtual in an attempt to recreate that reality into a digital twin that is enhanced through multi-domain, multi-disciplinary integration systems. The What factor in the W6H model (What, Why, Who, Where, When, How and Why Not) takes centre stage in thi...
One of the most debated and explored periods of the prehistory of Sicily is the Middle Bronze Age (15th-13th century BCE), which is considered a crucial moment for the development of local prehistoric social, economic, and cross-cultural dynamics. The local Thapsos culture is what best represents this chronological period and is characterized at...
The post‐2015 agenda calls for a ‘data revolution in development’ and recognises that statistical capacity amongst the workforce is a prerequisite for achieving it. Universities have a critical role to play in building this capacity. This paper reports insights from in‐depth interviews with development professionals in Malta, Spain, Turkey and the...
Provides the facility to perform the chi-square and G-square tests of independence, calculates a permutation-based p value, and provides measures of association such as Phi, odds ratio with 95 percent CI and p value, adjusted contingency coefficient, Cramer's V, bias-corrected Cramer's V, Cohen's w, Goodman-Kruskal's lambda, gamma, and tau, Cohen's...
With few ground-breaking exceptions, mainly framed in the context of archaeological research, GIS and quantitative methods have not been used so far in Malta to better understand the development and layered making of the landscape in relatively recent historical times. The present work aims at describing the main achievements of the authors’ resear...
The package provides the facility to calculate accumulated cost surfaces, least-cost paths and corridors using a number of human-movement-related cost functions that can be selected by the user. It just requires a Digital Terrain Model, a start location and (optionally) destination locations.
URL: https://cran.r-project.org/package=movecost
Cost-surface and least-cost path analyses are widely used tools to understand the ways in which movement relates to and engages with the surrounding space. They are employed in research fields as diverse as the analysis of travel corridors, land accessibility, site locations, maritime pathways, animal seascape connectivity, transportation, search and...
The package contains many functions useful for univariate outlier detection, permutation-based t-test, permutation-based chi-square test, visualization of residuals, and bootstrap Cramer's V, plotting of the results of the Mann-Whitney and Kruskal-Wallis test, calculation of Brainerd-Robinson similarity coefficient and subsequent clustering, valid...
The package plots a range of information that helps interpret the results of Correspondence Analysis. It provides the facility to plot the contribution of row and column categories to the principal dimensions, the quality of the points' display on selected dimensions, the correlation of row and column categories to selected dimensions,...
Traditionally, simple correspondence analysis is performed by decomposing a matrix of standardised residuals using singular value decomposition where the sum-of-squares of these residuals gives Pearson's chi-squared statistic. Such residuals, which are treated as being asymptotically normally distributed, arise by assuming that the cell frequencies...
Traditionally, simple correspondence analysis is performed by decomposing a matrix of standardised residuals using singular value decomposition where the sum-of-squares of these residuals gives Pearson's chi-squared statistic. Such residuals, which are treated as being asymptotically normally distributed, arise by assuming that the cell frequencie...
Cost-surface analysis in Geographic Information System (GIS) environment has been less frequently used in the study of ancient sail navigation than in other studies of the human past. Navigation cost-surface analysis entails the use of GIS tools that are versatile but not very easy to grasp and to put to work. This article describes an ArcGIS toolb...
The report describes the TRANSIT ArcGIS toolbox that allows estimating the duration of sail-powered navigation from a user-defined starting location. The toolbox, its rationale, use, and application in a worked example, are fully described in the following article, which is in press:
Alberti G. 2017. TRANSIT: a GIS toolbox for estimating the durati...
The report describes an ArcGIS toolbox that provides the facility to calculate the Fuzzy Viewshed as proposed by Ogburn 2006, which modifies the original proposal by Fisher. It produces a Fuzzy Viewshed raster in which, as customary in 'regular' (i.e., binary) viewshed rasters, 0 indicates cells that are not visible, 1 indicates cells that are visib...
’plotJenks’ is an R function that splits a dataset using a user-defined number of breaks and plots the results, along with other relevant information. Implementing Jenks’ natural breaks method, it finds the best arrangement of values into different classes. (Preferentially refer to the linked web address f...
’outlier’ is an R function that performs univariate outlier detection using three different methods. The implemented methods are those described in R.R. Wilcox, Fundamentals of Modern Statistical Methods: Substantially Improving Power and Accuracy, Springer 2010 (2nd edition), at pages 31-35. Two of the three methods are robust, and are...
Correspondence Analysis has become increasingly popular in archaeology to visualize contingency tables and to understand their structure. As an exploratory tool, the technique has found wide use for a variety of purposes, encompassing (but not limited to) activity areas research, analysis of pottery distribution among a variety of contexts, study o...
’ca.percept’ is an R function that plots a variant of the traditional Correspondence Analysis scatterplot designed to make the results easier to interpret. In particular, the function aims at producing what marketing research calls a perceptual map. This function aims at producing that kind of visual representation of the C...
’log.regr’ is an R function that makes it easy to perform binary Logistic Regression and to graphically display the estimated coefficients and odds ratios. It also allows visually checking model diagnostics such as outliers, leverage, and Cook’s distance. Please: preferentially refer to http://cainarchaeology.weebly.com/r-function-for-...
’perm.t.test’ is an R function that performs a permutation-based t-test to compare two independent groups. The test’s results are graphically displayed within the returned chart. A permutation t-test proves useful when the assumptions of the ’regular’ t-test are not met. In particular, when the two groups being compared show a very skewed dist...
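The general idea of a permutation-based t-test can be sketched as follows. This is a minimal Python illustration, not the ’perm.t.test’ R function: the names `welch_t` and `permutation_t_test` are hypothetical, the choice of Welch's statistic is an assumption, and the sketch assumes that every resampled group has non-zero variance.

```python
import random
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t statistic for two independent samples."""
    return (mean(x) - mean(y)) / sqrt(variance(x) / len(x) + variance(y) / len(y))


def permutation_t_test(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p value for the difference between two groups.

    Group labels are shuffled n_perm times; the p value is the share of
    shuffles whose |t| is at least as large as the observed |t|
    (with the usual +1 correction so that p is never exactly 0).
    """
    rng = random.Random(seed)
    observed = welch_t(x, y)
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        t = welch_t(pooled[:len(x)], pooled[len(x):])
        if abs(t) >= abs(observed):
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Because the reference distribution is built from the data themselves, no normality assumption is needed, which is why the approach suits skewed groups.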
description available from: http://cainarchaeology.weebly.com/r-function-for-binary-logistic-regression-internal-validation.html
description available from: http://cainarchaeology.weebly.com/r-function-for-improved-ca-scatterplot.html
Description available from: http://cainarchaeology.weebly.com/r-function-for-post-prob-for-different-relations-btw-2-bayesian-14c-phases.html
description available from: http://cainarchaeology.weebly.com/r-function-for-permutation-based-chi-square-test-of-independence.html
description available from: http://cainarchaeology.weebly.com/r-function-for-scalar-stress-probability-calculation.html
description available from: http://cainarchaeology.weebly.com/r-function-for-optimism-adjusted-auc.html
description available from: http://cainarchaeology.weebly.com/r-function-for-visually-displaying-mann-whitney-test.html
description available from: http://cainarchaeology.weebly.com/r-function-for-posterior-probability-density-plot.html
description available from: http://cainarchaeology.weebly.com/r-function-for-brainerd-robinson-similarity-coefficient.html
description and new improved version (comprising Dunn’s post-hoc test) available from: http://cainarchaeology.weebly.com/r-function-for-visually-displaying-kruskal-wallis-test.html
Correspondence Analysis (CA) is a statistical exploratory technique frequently used in many research fields to graphically visualize the structure of contingency tables. Many programs, both commercial and free, perform CA but none of them as yet provides a visual aid to the interpretation of the results. The ‘CAinterprTools’ package, designed to be...
Among the Cypriot pottery from local Middle Bronze Age tomb contexts of south-eastern Sicily, Base Ring II jugs remain somewhat unexplored from the perspective of both chronology and possible manufacture centre. As to the latter issue, different opinions exist: Levantine, Cypriot, or local (or even Aegean) origin. In the author’s view, a better def...
A number of interesting packages are available to perform Correspondence Analysis in R. To the best of my knowledge, they lack some tools to help users eyeball some critical CA aspects (e.g., contribution of rows/cols categories to the principal axes, quality of the display, correlation of rows/cols categories with dimensions, etc). Besides provi...
CAseriation sorts the rows and columns of the input contingency table according to their scores on the Correspondence Analysis dimension selected by the user. The package also plots the CA scatterplot of selected dimensions and searches for clusters in the dataset. As for seriation, two plots are returned, displ...
This article focuses on the Early–Middle Bronze Age (MBA) transition in Sicily and southern Italy from a Bayesian radiocarbon perspective. The aims are to: (i) estimate the beginning of the MBA (i.e. Thapsos–Milazzese culture in Sicily; Apennine culture in southern Italy) at four key-sites; (ii) assess the existence of a site-wide variability; an...
Through the years Correspondence Analysis has become a valuable tool for archaeologists in that it enables the exploration of patterns of associations in large contingency tables. While commercial statistical programs provide the facility to perform Correspondence Analysis, a number of packages are available for the free R statistical environment....
The use of contingency tables is widespread in archaeology. Cross-tabulations are used in many different studies as a useful tool to synthetically report data, and are also useful when the analyst wishes to look for latent data structures. The latter case is where Correspondence Analysis (CA) comes into play. By graphically displaying the dependence bet...
This study deals with the archaeological evidence from the Middle Bronze Age (ca 1460-1250 BC) settlements on the Acropolis of Lipari, on the Montagnola di Capo Graziano (Filicudi), at Punta Milazzese (Panarea), and Portella (Salina). The work is aimed at the reconstruction of forms of social organization, and is based on the data provided by previ...
This paper deals with radiocarbon determinations from the Middle Bronze Age site of Portella on the island of Salina (Aeolian Archipelago, Italy). The available 14C evidence is taken into account, in a simple Bayesian model, in order to explore the issue of the absolute chronology of both the settlement and the stage of the local cultural sequence...
Two (local) Middle Bronze Age sites in Sicily are known for having yielded Cypriot imports: Thapsos in the southeast, and Cannatello in the south-central part of the island. These imports come from contexts with a strong intercultural character, as they feature items from other Mediterranean areas, especially Mycenaean Greece. M...
This study deals with the ceramic repertoire of the Aeolian Middle Bronze Age culture, the so-called Milazzese facies. The work takes into account the edited documentation from the four main settlements on the Aeolian Archipelago, unearthed by Luigi Bernabò Brea in several excavations between 1940 and 1970. These settlements are on the Montagnola o...
This paper addresses the problem of the relative and absolute chronology of the first two phases of Thapsos’ residential quarter. It is well known that the phasing put forward by the excavator (G. Voza) is in contrast with the traditional Sicilian cultural sequence outlined by L. Bernabò Brea. In Bernabò Brea’s view, the Thapsos period (Middle...
The aim of this work is to sketch a picture of the social stratification of the Thapsos centre (the name-site of the Sicilian Middle Bronze Age), filling a gap in the literature available so far. This analysis is based upon the funerary data provided by the rock-cut tomb cemetery of the centre. Within a strongly context-oriented framework, we shal...
Thapsos is the name-site of the Sicilian Middle Bronze Age. It lies in south-eastern Sicily, in the gulf of Augusta, on a low-lying limestone promontory connected to the mainland by a narrow isthmus. The site was systematically investigated by Paolo Orsi at the end of the 19th century: these first investigations were focused on the cemetery of rock-cut to...
Questions (34)
Dear colleagues,
I'm currently working with Mediterranean Sea current data from Copernicus Marine Service (CMEMS), specifically their gridded products in NetCDF format with 4km resolution.
Does anyone know of similar gridded current datasets for the Mediterranean with higher spatial resolution that are equally manageable in GIS environments?
Thank you
Hello All,
I am having a hard time getting hold of the following article:
Generalization of Cramer's coefficient of association for contingency tables : theory and methods
by Sadao Tomizawa, Nobuko Miyamoto and Hidechika Houya (https://journals.co.za/doi/10.10520/EJC99072).
Does anyone have a copy to share?
Best
Does anyone have access to the following (quite old) article:
It would seem that I have no institutional access to it, and I was wondering if anyone would be willing to share.
Thanks
Hello,
I have compositional data for a number of pottery sherds. For each, I have the elemental composition as percentages (sherds in rows; each row summing to 100%).
I have read that it is not sound to perform PCA directly on this type of data (percentages).
I am wondering what (possibly simple, yet sound) pre-processing or re-expression of the values can be used before running PCA.
Thank you
GmA
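One re-expression often suggested for compositional data is Aitchison's centred log-ratio (clr) transform, in which each part is replaced by the log of its ratio to the row's geometric mean. The Python sketch below only illustrates that transform; the function name `clr` is hypothetical, and it assumes all parts are strictly positive (zeros would need to be handled beforehand, e.g. by a replacement strategy, which is not shown).

```python
from math import log


def clr(row):
    """Centred log-ratio transform of one composition.

    Each part becomes log(part) minus the mean of the logs,
    i.e. the log of the part divided by the row's geometric mean.
    Assumes every part is strictly positive.
    """
    logs = [log(v) for v in row]
    centre = sum(logs) / len(logs)
    return [value - centre for value in logs]


# One sherd with three parts summing to 100%.
transformed = clr([60.0, 30.0, 10.0])
```

The transformed values of each row sum to zero and do not depend on whether the row is expressed as percentages or proportions; PCA would then be run on the transformed table.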
As per the question's title, I am wondering if anyone knows how to implement Tobler's hiking function in R using the 'gdistance' package.
I have indeed read the package's documentation (https://cran.r-project.org/web/packages/gdistance/vignettes/gdistance1.pdf; paragraph 9) in which the function is implemented, but I am interested in a different thing: getting a raster of walking time, not speed. Also, the ultimate goal is to plot isochrones around the start location. The latter would entail producing an accumulated cost-surface with cost defined as walking time.
Thank you
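The logic being asked about (walking time rather than speed, accumulated outward from a start location) can be sketched independently of 'gdistance'. The Python example below is only an illustration of that logic, not a 'gdistance' solution: it uses Tobler's hiking function, speed in km/h = 6 · exp(−3.5 · |slope + 0.05|), converts each step between neighbouring cells into hours, and accumulates time with Dijkstra's algorithm. The function names and the list-of-lists DEM layout are assumptions made for the example.

```python
import heapq
from math import exp, hypot


def tobler_speed_kmh(slope):
    """Tobler's hiking function; slope is rise over run (tangent)."""
    return 6.0 * exp(-3.5 * abs(slope + 0.05))


def walking_time_hours(dem, cell_size_m, start):
    """Accumulated walking time (hours) from start to every cell.

    dem is a list of rows of elevations in metres; moves are allowed
    to the 8 neighbours, with slope taken in the direction of travel.
    """
    rows, cols = len(dem), len(dem[0])
    time = [[float("inf")] * cols for _ in range(rows)]
    time[start[0]][start[1]] = 0.0
    queue = [(0.0, start)]
    while queue:
        t, (r, c) = heapq.heappop(queue)
        if t > time[r][c]:
            continue  # stale queue entry
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr == 0 and dc == 0) or not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                dist_m = hypot(dr, dc) * cell_size_m
                slope = (dem[nr][nc] - dem[r][c]) / dist_m
                step_h = (dist_m / 1000.0) / tobler_speed_kmh(slope)
                if t + step_h < time[nr][nc]:
                    time[nr][nc] = t + step_h
                    heapq.heappush(queue, (t + step_h, (nr, nc)))
    return time
```

Isochrones then amount to thresholding (or contouring) the returned grid, for instance keeping the cells whose accumulated time is at most 0.5 hours.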
I am wondering if there is any package in R containing function(s) to perform viewshed analysis. I am not interested in packages that need the underlying presence of ArcGIS or GRASS; I am looking for self-contained R facilities that do not require those GIS programs.
Hello,
I am looking for suggestions on how to solve the following problem.
Let's assume I have some Thiessen polygons built around some locations (crosses in the attached image). Some points (red dots in the attached image) lie within the polygons. For each polygon I want to calculate the expected number of points, against which to compare the observed number of points, and to calculate the significance (i.e., p-value) of the observed number of points under the Null Hypothesis of a random distribution of points within the study area as a whole.
We actually know (a) the area of each polygon, (b) the total area of the study plot (=sum of the polygons' area), (c) the total number of points, and (d) the number of points falling inside each polygon.
I guess one should use the binomial distribution, but I can't seem to figure out exactly how to implement the calculation. Where I am stuck is in working out the value of p, which in the case of polygons of equal size (i.e., quadrats) should be the reciprocal of the number of quadrats into which the study area is divided. But I do not know how to calculate p in the case of polygons of unequal size.
In case of quadrats (equal sized polygons), we should use a binomial distribution with:
p=1/x (where x is the number of quadrats)
n=number of events in the pattern (i.e., total number of points)
k=number of events in a quadrat.
In the specific case I described, would p be equal to the fraction of each polygon's area relative to the whole study plot (i.e., p=polygon area/sum of all the polygons area)?
Thanks for any suggestion.
Best
Gm
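Under the stated null hypothesis (points scattered uniformly at random over the whole study area), the probability that any single point falls in a given polygon is that polygon's share of the total area, so the count in the polygon follows a binomial distribution with n = total number of points and p = polygon area / total area. A minimal Python sketch of that calculation follows; the function names are hypothetical and the figures in the usage line are invented for illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def polygon_test(area, total_area, n_points, observed):
    """Expected count and one-sided binomial p values for one polygon.

    p is the polygon's fraction of the study area; 'lower' is
    P(X <= observed) and 'upper' is P(X >= observed).
    """
    p = area / total_area
    expected = n_points * p
    lower = sum(binom_pmf(k, n_points, p) for k in range(0, observed + 1))
    upper = sum(binom_pmf(k, n_points, p) for k in range(observed, n_points + 1))
    return expected, lower, upper


# Hypothetical polygon covering 25 of 100 area units, 20 points overall, 9 observed inside.
expected, lower, upper = polygon_test(25.0, 100.0, 20, 9)
```

With equal-sized quadrats the area fraction reduces to 1/x, which matches the quadrat case described in the question. If every polygon is tested, the resulting p values would normally need some adjustment for multiple comparisons.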
As per the title, I was wondering how, in a DEM (i.e., a bare-earth DEM), one can locate depressions that are deeper than the surrounding ground by at least a given value. The main issue is that, quite obviously, a DEM does not store the elevation of points relative to the surrounding ground, but relative to sea level. Accomplishing this entails devising a way to estimate the height of any cell relative to its neighbours. Maybe Focal Statistics can be put to work, but I wonder if anyone has specific suggestions to put me on the right track.
Thank you.
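The focal-statistics idea mentioned in the question can be sketched as follows: for each cell, take the mean elevation of the neighbours within a moving window and flag the cell when that mean exceeds the cell's own elevation by at least the chosen depth. This Python sketch is only one possible reading of the problem; the function name `deep_cells`, the square window, and the use of the focal mean (rather than, say, the focal minimum or a filled-DEM difference) are assumptions, and the grid is assumed to contain at least two cells.

```python
def deep_cells(dem, min_depth, radius=1):
    """Return (row, col) of cells lying at least min_depth below
    the mean elevation of their neighbours in a square window.

    dem is a list of rows of elevations; radius=1 gives a 3x3 window.
    Windows are clipped at the grid edges.
    """
    rows, cols = len(dem), len(dem[0])
    flagged = []
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                dem[i][j]
                for i in range(max(0, r - radius), min(rows, r + radius + 1))
                for j in range(max(0, c - radius), min(cols, c + radius + 1))
                if (i, j) != (r, c)
            ]
            focal_mean = sum(neighbours) / len(neighbours)
            if focal_mean - dem[r][c] >= min_depth:
                flagged.append((r, c))
    return flagged


# Flat surface at 10 m with a single 5 m deep pit in the centre.
pits = deep_cells([[10, 10, 10], [10, 5, 10], [10, 10, 10]], min_depth=3)
```

A larger radius compares each cell with a wider stretch of surrounding ground, which matters when depressions span several cells.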
Hello.
I am wondering if anyone here knows about material culture evidence related to hide working in prehistory. By "material culture" I broadly refer to either specific facilities (be they fixed structures, or particular types of huts or rooms) or specific tool-kits, or both. My question is open to any geographic area. References to contexts dating to more recent periods, or to ethnographic contexts, are welcome as well.
Best
I seem to recall that an ArcGIS toolbox was available somewhere on the web that provided the facility to process a DEM in order to get different types of relief visualizations (e.g., PCA of hillshades). Unfortunately, I can't remember where I got it from. Does anyone have any suggestions?