Insights Through Data
Stefana Belbe
09 June 2022
INTRODUCTION
This article aims to give an overview of the original paper entitled “Spatial Clustering Behaviour of Covid-19 Conditioned by the Development Level: Case Study for the Administrative Units in Romania” written by my research colleague Codruta Mare and me (Cioban & Mare, 2021)*. The paper is the first attempt to interpret the spatial distribution of the infection rate of Covid-19 and its socio-economic determinants at a highly spatially disaggregated level: the administrative territorial units, colloquially known as communes, towns, and cities. The area of interest in this use case is Romania, a developing country which has suffered from the Covid-19 pandemic and reached the third wave of infection between March and May 2021.
By means of exploratory spatial data analysis, cluster analysis, and spatial modelling, we pursued two goals:
- to determine whether, and to what extent, the Covid-19 infection rate forms spatial clusters at the locality level, and to assess how the intensity of this clustering behaviour changes over time;
- to check for the existence and intensity of the spatial effects that the development level has on the infection rate. Our assumption was that the level of development contributes to the spatial spread of Covid-19 through several factors characteristic of developed and highly urbanised areas: the infrastructure, a socially interactive life, and a higher mobility tendency, along with the general compliance with the measures to contain the pandemic. On the other hand, remote and underdeveloped regions with higher unemployment rates either show lower mobility tendencies or simply fail to adequately report infection cases.
DATA
The study uses three main variables, with values for 3181 spatial units representing Romanian administrative territorial units (101 cities, 216 towns and 2864 communes):
- the Covid-19 infection rate between March 1st, 2021, and May 8th, 2021: the total number of reported Covid-19 infections relative to the population;
- the unemployment rate: the total number of unemployed people in February 2021 relative to the population;
- Local Human Development Index (Sandu, 2020) computed using the 2011 census data.
EXPLORATORY SPATIAL DATA ANALYSIS
The exploratory spatial data analysis (ESDA) phase consists of maps of quantile values computed for two key moments of the third Covid-19 wave in Romania: the peak of recorded Covid-19 infections (March 26th, 2021) and the end of the study period, with lower overall values within the spatial units (May 1st, 2021). What we learned from these representations is that Covid-19 infections have a more spatially grouped distribution during the peak of infection and a relatively random quantile distribution at the end of the infection wave. This contrast justified the next step: assessing the clustering behaviour by means of spatial statistics.

Quantile map of the Covid-19 incidence on March 26th, 2021, using three quantile intervals. Source: adapted from Cioban & Mare (2021)
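To make the idea of a quantile map concrete, here is a minimal sketch of how values can be split into three quantile intervals before being coloured on a map. The function name `quantile_classes` and the sample rates are hypothetical, and the rank-based rule used here is only one of several ways to draw quantile breaks; it is not necessarily the procedure behind the original maps.

```python
def quantile_classes(values, k=3):
    """Assign each value to one of k quantile classes (0 = lowest).

    Classes are based on ranks, so each class receives (almost) the same
    number of units. Tied values are split by their position in the list.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    classes = [0] * n
    for rank, i in enumerate(order):
        classes[i] = rank * k // n
    return classes


# Hypothetical infection rates for six localities.
rates = [0.5, 2.1, 0.0, 3.4, 1.2, 0.9]
print(quantile_classes(rates))
```

Each locality then gets the colour of its class, which is why spatially grouped colours on the map hint at clustering.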
CLUSTER ANALYSIS
Following the work of Griffith (2003), we computed the global and local Moran’s I spatial autocorrelation coefficients, which measure how strongly the infection rate in one spatial unit resembles the infection rate in its neighbouring units.
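As an illustration, the global Moran's I can be sketched in a few lines of plain Python. The statistic is I = (n / S0) * sum_ij(w_ij * z_i * z_j) / sum_i(z_i^2), where z_i is the deviation of unit i from the mean and S0 is the sum of all weights. The function name, the toy neighbourhood structure, and the binary contiguity weights below are assumptions made for the example; the article does not state which spatial weights were used.

```python
def morans_i(values, weights):
    """Global Moran's I.

    values  : list of observations, one per spatial unit
    weights : dict of dicts, weights[i][j] = spatial weight between i and j
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]                      # deviations from the mean
    s0 = sum(w for row in weights.values() for w in row.values())
    cross = sum(w * z[i] * z[j]
                for i, row in weights.items()
                for j, w in row.items())                # weighted cross-products
    return (n / s0) * cross / sum(d * d for d in z)


# Toy example: four units in a row, each adjacent to its immediate neighbours.
line_weights = {0: {1: 1}, 1: {0: 1, 2: 1}, 2: {1: 1, 3: 1}, 3: {2: 1}}
print(morans_i([1, 2, 3, 4], line_weights))   # similar neighbours: positive
print(morans_i([1, 4, 1, 4], line_weights))   # alternating values: negative
```

Positive values indicate that similar rates sit next to each other, negative values indicate a checkerboard-like pattern, and values near zero suggest spatial randomness.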
First, we computed a value of the global autocorrelation index for each day of the third Covid-19 wave in the country. All values were positive; hence the infection rate showed a significant global clustering tendency. What is more, the daily evolution of the index followed a quadratic (inverted-U) trend, with high values corresponding to the peak period of the infection wave and low values at the beginning and end of the period.

Daily evolution of the global autocorrelation index for the Covid-19 infection rate in Romania, at the ATU level. Source: adapted from Cioban & Mare (2021)
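The quadratic trend mentioned above can be checked with an ordinary least-squares fit of a second-degree polynomial to the daily index values. The sketch below is only an assumption about how such a fit could be done: the function name `fit_quadratic` and the daily values are made up for illustration and are not the figures from the paper.

```python
def fit_quadratic(ts, ys):
    """Least-squares fit of y = a + b*t + c*t^2; returns (a, b, c)."""
    # Power sums that make up the normal equations.
    s = [sum(t ** k for t in ts) for k in range(5)]
    r = [sum(y * t ** k for t, y in zip(ts, ys)) for k in range(3)]
    # Augmented 3x3 system, solved by Gauss-Jordan elimination
    # with partial pivoting.
    m = [[s[i + j] for j in range(3)] + [r[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [x - factor * p for x, p in zip(m[row], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


# Hypothetical daily Moran's I values over an 11-day window.
days = list(range(11))
index = [0.10, 0.15, 0.20, 0.23, 0.25, 0.25, 0.24, 0.22, 0.19, 0.15, 0.10]
a, b, c = fit_quadratic(days, index)
print("curvature:", c)                 # a negative c means an inverted U
print("estimated peak day:", -b / (2 * c))
```

A negative coefficient on the squared term is what an index that rises towards the peak of the wave and falls afterwards would produce.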
Second, the local clusters (Anselin, 1995) of Covid-19 infection data for March 26th, 2021, were mapped. We discovered that the statistically significant clusters of high values (reds) coincide with developed regions across the country, while the low-value ones (blues) are mostly spread across more remote or less developed regions.

Local spatial autocorrelation map of the Covid-19 incidence on March 26th, 2021, with high-high (red), low-low (blue), high-low (orange) and low-high (yellow) clusters. Source: adapted from Cioban & Mare (2021)
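The four cluster types in the map come from comparing each unit with the average of its neighbours. The sketch below computes the local Moran's I and the quadrant label for every unit, assuming row-standardised contiguity weights and at least one neighbour per unit. The function name and the toy data are hypothetical, and the significance test (usually based on random permutations) that decides which clusters are actually drawn is left out for brevity.

```python
def local_moran(values, neighbours):
    """Return a list of (local_I, quadrant) tuples, one per spatial unit.

    neighbours[i] lists the units adjacent to unit i. Weights are
    row-standardised, so the spatial lag of a unit is the mean deviation
    of its neighbours. Every unit must have at least one neighbour.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n          # variance of the deviations
    results = []
    for i in range(n):
        lag = sum(z[j] for j in neighbours[i]) / len(neighbours[i])
        own = "high" if z[i] > 0 else "low"
        around = "high" if lag > 0 else "low"
        results.append((z[i] * lag / m2, own + "-" + around))
    return results


# Toy example: five units in a row with hypothetical infection rates.
row_neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
for local_i, quadrant in local_moran([9, 8, 7, 1, 2], row_neighbours):
    print(round(local_i, 3), quadrant)
```

In the colour scheme of the map, "high-high" corresponds to red, "low-low" to blue, "high-low" to orange and "low-high" to yellow.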
SPATIAL MODELLING
To confirm the results so far and assess the significance of the impact of the level of development on the infection rate of Covid-19, a spatial modelling phase was performed. We used the Covid-19 rate for March 26th as the dependent variable in a series of spatial and non-spatial regressions in which we introduced the Local Human Development Index and the unemployment rate separately as proxies of the development level in Romania. The modelling started from a simple OLS (Ordinary Least Squares) regression accompanied by spatial diagnostic tests. The tests confirmed the existence of spatial effects in the model and the necessity to respecify the model by taking spatial dependence into account.
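The article does not name the spatial tests that accompanied the OLS model. One common diagnostic, assumed here purely for illustration, is Moran's I computed on the OLS residuals: if the residuals of neighbouring units are correlated, the non-spatial model has missed a spatial effect. The function names and the data are hypothetical, and the weights are row-standardised contiguity weights.

```python
def ols_residuals(x, y):
    """Fit y = a + b*x by ordinary least squares; return (a, b, residuals)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    a = my - b * mx
    return a, b, [yi - (a + b * xi) for xi, yi in zip(x, y)]


def residual_morans_i(residuals, neighbours):
    """Moran's I of regression residuals, row-standardised weights.

    OLS residuals from a model with an intercept already have mean zero,
    so the statistic reduces to e'We / e'e.
    """
    lags = [sum(residuals[j] for j in neighbours[i]) / len(neighbours[i])
            for i in range(len(residuals))]
    return (sum(e * l for e, l in zip(residuals, lags))
            / sum(e * e for e in residuals))


# Hypothetical development index (x) and infection rate (y) for six
# localities arranged in a row.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
x = [0.2, 0.3, 0.4, 0.6, 0.7, 0.9]
y = [1.0, 1.4, 1.5, 2.6, 2.9, 3.1]
a, b, resid = ols_residuals(x, y)
print("slope:", b)
print("Moran's I of residuals:", residual_morans_i(resid, chain))
```

A residual Moran's I far from zero is the kind of evidence that motivates moving from OLS to a model with explicit spatial terms.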
We then estimated two spatially weighted two-stage least squares regressions (Anselin, 2011). The results point to contagion and diffusion processes, with a significant impact of development on the spatial spread of the pandemic:
- when assessing the dependence of the Covid-19 rate on the Local Human Development Index, the model gives a highly significant positive relation; hence, the infections cluster around wealthier and highly urbanised localities;
- when assessing the unemployment impact on the spread of the pandemic, a highly significant negative relation is given; hence, the regions with lower unemployment rates coincide with higher infection probabilities, whilst the regions where the unemployment reaches higher rates are less prone to infection spread.
CONCLUSIONS
Assessing the Covid-19 rate for the communes, towns, and cities in Romania has revealed a significant clustering tendency that follows the temporal pattern of the number of infections themselves. Locally, the clusters of high incidence correspond to the developed regions, where lower unemployment rates are recorded.
The results of the spatial models confirm contagion and diffusion spatial processes conditioned by the local development of the administrative territorial units in Romania. In other words, the level of development of a locality can act as a transmission and conditioning channel for the Covid-19 pandemic.
While such conclusions are the result of a first assessment at the level of localities in a country, more in-depth analyses are needed to investigate these relations at a granular socio-economic level. We intend to continue with further studies in which we enrich our initial database with the individual components of the Local Human Development Index and assess their relation to Covid-19 related indices.
REFERENCES
Anselin, L., 1995. Local Indicators of Spatial Association-LISA. Geogr. Anal., 27(2), 93-115.
Anselin, L., 2011. GMM estimation of spatial error autocorrelation with and without heteroskedasticity. Technical Report. GeoDa Center for Geospatial Analysis and Computation.
Cioban, S., & Mare, C., 2021. Spatial clustering behaviour of Covid-19 conditioned by the development level: Case study for the administrative units in Romania. Spatial Statistics. doi: https://doi.org/10.1016/j.spasta.2021.100558
Griffith, D.A., 2003. Spatial Autocorrelation and Spatial Filtering: Gaining Understanding through Theory and Scientific Visualization. Springer, Berlin, Germany.
Sandu, D., 2020, November. Updating the Local Human Development Index. Substantiation and Applicability. Research Report, World Bank, Washington, DC.
* Please note that I changed my name from Cioban to Belbe since the publication of the original research paper.
Stefana Belbe
Senior Data Scientist
Stefana is a data scientist passionate about Natural Language Processing, Geospatial Technologies, and Predictive Modelling. She is eager to share and gives presentations, trainings, and classes to peers interested in similar topics. She is currently pursuing a PhD in Space-Time Predictive Econometrics and collaborates with the Interdisciplinary Data Science Center of the Babes-Bolyai University in Cluj-Napoca. When Stefana turns off her laptop, she either opens a book, goes to the movies, or spends time with her family and friends.