How does broadband supply affect participation in panel surveys?: Using geospatial broadband data at the district level to analyze mode choice and panel attrition

Springer Nature
Quality & Quantity
Abstract

Good broadband supply is crucial for different aspects of participation behavior in web surveys. In this study, we combined geospatial broadband data and the survey data of 16 waves of the mixed-mode GESIS Panel to investigate mode choice and panel attrition. Since small-scale geospatial data is often unavailable or difficult to access for research purposes, we report here results of a feasibility study investigating whether publicly available broadband data at the district level are sufficient to draw meaningful conclusions about participation behavior. In the first part of the analysis, we investigated the effects of broadband supply on mode choice. The results showed a positive effect of the broadband category with the highest coverage rate on choosing the online mode. We also found effects for internet familiarity, age, and education. In the second part, we investigated the effects of broadband supply on panel attrition within the 16 survey waves. We did not find significant effects of broadband supply on panel attrition. However, we found effects for evaluation of duration, measured duration, and age. Overall, we conclude that the granularity of the geospatial data on broadband supply is not ideal. A detailed discussion on the granularity of the broadband data in the context of the general availability of fine-grained geospatial data is provided at the end of this study.
Quality & Quantity (2024) 58:5805–5828
https://doi.org/10.1007/s11135-024-01912-y
How does broadband supply affect participation inpanel
surveys?: Using geospatial broadband data atthedistrict
level toanalyze mode choice andpanel attrition
MaikelSchwerdtfeger1 · RubenL.Bach2
Accepted: 29 May 2024 / Published online: 12 June 2024
© The Author(s) 2024
Keywords Broadband supply · Mode choice · Panel attrition · Panel data · Geospatial data
* Maikel Schwerdtfeger
maikel.schwerdtfeger@gesis.org
Ruben L. Bach
r.bach@uni-mannheim.de
¹ GESIS – Leibniz Institute for the Social Sciences, P.O. Box 12 21 55, 68072 Mannheim, Germany
² Mannheim Centre for European Social Research (MZES), University of Mannheim,
68131 Mannheim, Germany
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
1 Introduction
Over the last decades, collecting data online via web surveys has become crucial for quan-
titative research in the social sciences. Reflecting this circumstance, several survey studies
(e.g., the German Internet Panel, LISS Panel, Understanding America Study) even pro-
vided participants with internet access and participation devices to enable them to par-
ticipate in their web surveys. With the increasing prevalence of web surveys, many stud-
ies have examined the impact and consequences of the coverage problems resulting from
the fact that not all participants are able or willing to participate online (Scherpenzeel &
Bethlehem 2011; Bandilla etal. 2009; Baur & Florian 2009). From the perspective of this
research on coverage, having broadband access often implies that people can participate
in web surveys without any technical constraints, although the quality and speed of this
broadband access are not considered. In reality, broadband speed can vary massively and
thereby affect the survey experience. The present study aims to expand the survey meth-
odological research by investigating the effects of broadband supply on panel participation
in a feasibility study on the use of publicly available geospatial data. This geospatial broad-
band data is based on the Breitbandatlas of the Federal Ministry for Digital and Transport
(Bundesministerium für Digitales und Verkehr 2023).
We measured the effects of broadband supply on panel participation by using two essen-
tial characteristics of a mixed-mode panel survey: mode choice and panel attrition. This
approach enabled us to examine the impact of broadband supply at two different stages of
participation. First, mode choice refers to a situation in which people in a recruitment inter-
view of a panel survey decide whether they want to participate online via web surveys or
offline via paper-and-pencil surveys. This decision may depend on several factors, such as
respondents’ preferences, access to technology, and comfort levels with the selected partici-
pation mode. Usually, this initial decision sets the participation mode for all subsequent sur-
vey waves. For this early stage of participation, we investigated whether people in regions
with poor broadband supply are more inclined to use the offline mode. Second, among
those who chose the online mode, we investigated whether poor broadband supply led to
increased panel attrition. Increased panel attrition is a severe problem since even moderate
attrition rates can substantially reduce the number of participants over the course of a long-
term panel survey. As a result, panel attrition can undermine statistical power, and selective
panel attrition can lead to biased survey estimates (see, e.g., Lugtig 2014). To assess broad-
band supply, we used geospatial data on regionally available broadband supply at the district
level and combined it with geocoded panel survey data. Therefore, our research objectives
involve exploring how broadband supply affects mode choice in mixed-mode panel surveys
and understanding its impact on the attrition of online participants in panel surveys.
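To illustrate why even moderate attrition matters over a long-running panel, the following minimal sketch compounds a constant per-wave attrition rate across waves. The starting sample of 1,000 panelists and the 5% rate are hypothetical values chosen for illustration; they are not figures from this study, and real attrition rates typically vary between waves.

```python
def remaining_panelists(n_start: int, attrition_rate: float, n_waves: int) -> float:
    """Expected panel size after n_waves, assuming a constant per-wave attrition rate."""
    return n_start * (1.0 - attrition_rate) ** n_waves


# Hypothetical example: 1,000 panelists, 5% attrition per wave, 16 waves.
left = remaining_panelists(1000, 0.05, 16)
print(round(left))  # about 440 panelists remain
```

Under these assumed values, more than half of the initial panel is lost after 16 waves.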
In summary, this study aims to expand methodological research in the context of panel
surveys in two different ways. First, the approach of using geospatial data on broadband
supply is a novelty in survey methodology that overcomes the restrictions of existing sur-
vey data. Geospatial broadband data can replace survey questions about existing broad-
band speed, which can reduce survey time and the respondent burden. Furthermore, broad-
band data is not affected by the motivated or unintentional misreporting of respondents
and can be applied retrospectively. Second, analyzing participation behavior in the context
of regionally available broadband supply enables us to draw conclusions about whether
participants in regions with a poor broadband supply avoid the online mode; and if not,
whether they have a higher probability of unit nonresponse than participants in regions
with good broadband supply. These conclusions can be used to develop targeting strategies
that actively guide the participation mode choice based on the panelists’ residence, thereby
reducing the likelihood of panel attrition.
2 Theoretical background andhypotheses
2.1 Online waiting
Slow broadband speeds cause online waiting situations, although the delay often is just a
matter of seconds. For example, it takes one additional second to load the search results of
a search engine and then two additional seconds to load the suggested website. In total, the
delay is only three additional seconds, but the flow experience is interrupted each time.
The crucial factor in evaluating waiting times is not the absolute duration but the waiting-
time gap. The waiting-time gap is the difference between the time someone is willing to wait
and their perceived waiting time (Chebat et al. 2010). In an environment in which every request
is expected to be processed and answered within seconds, a 20-s wait for a page download
may be perceived as unreasonable, just as a 20-min wait for a comparable traditional service
would be (Chebat et al. 2010). An experiment with Google users
showed that even minimal changes in waiting time matter. Slowing down the search results
page by 400 ms reduced the number of searches per user by 0.6% on average (Brutlag
2009). Thus, even a delay of less than half a second can have a measurable impact on internet users.
The Weber-Fechner Law explains such findings as the ability of the human sensory system
to perceive so-called just noticeable differences (Reichl et al. 2010). A just noticeable
difference describes the minimum difference between two stimuli that is required for a person
to perceive that the stimuli are not the same. Applied to our use case, the following basic
thresholds are established among system programming experts: the limit for having the user
feel that a system reacts instantaneously is about 0.1 s; the limit for keeping the user’s flow
uninterrupted is about 1.0 s, even though the user will notice the delay (Nielsen 1994).
To summarize, the decisive aspect for evaluating online waiting situations is not the
absolute duration of the delay but the noticeable difference between the acceptable and the
perceived waiting time. In online environments, acceptable waiting times are much shorter
than in face-to-face situations. As a result, one could argue that even minimal delays can
lead to worse evaluations or increased dropouts. In the following two sections, we apply
these general mechanisms of online waiting to mode choice and panel attrition.
2.2 Mode choice
According to Smyth etal. (2014), access to and familiarity with a participation mode are
the strongest predictors of mode preference, whereas measures of safety concerns, physical
abilities, and normative concerns remained "unexpectedly weak" predictors. Accordingly,
Herzing and Blom (2019) found that their indicator of digital affinity, which includes digi-
tal access and internet usage, influences whether people participate in online panel surveys.
Both access to and familiarity with the internet can be seen as minimum requirements for
choosing the online mode since people cannot participate online without internet access
and a certain level of internet familiarity. Another study found that the familiarity, comfort,
and convenience with a communication medium reflected a preference for a particular survey
mode (Olson et al. 2012). Thus, in addition to the minimum requirements of internet
access and familiarity, comfort and convenience with the internet play a crucial role in
choosing a participation mode. Similarly, Bretschi and Weiß (2022) found that short-term
mode switching from offline to online mode is positively affected by three dimensions of
internet use: frequency, variety, and number of devices.
With respect to the above-mentioned online waiting mechanisms, we expect that the
degree of comfort and convenience is determined by past experiences with the internet.
Positive experiences, like intuitive navigation and fluid operation, are expected to increase
preference towards choosing the online mode. Negative experiences, like the inability to
do something or unexpectedly long waiting times, are expected to increase preference
towards choosing the offline mode. With regard to broadband supply speeds, we expect that
increased page load times due to poor broadband supply will have a negative impact on the
comfort and convenience with the internet, and thus also on choosing the online mode.
Thus, our first hypothesis is: Living in a region with a better broadband supply increases
the probability of choosing online participation in a mixed-mode panel.
In addition to the influence of the regional broadband supply, we add important predic-
tors from previous mode choice research to strengthen the explanatory power of the model
that we will use to investigate the first hypothesis. As mentioned above, several studies
have found a direct effect of internet familiarity on participation mode preference (Olson
etal. 2012; Smyth etal. 2014). Also, male participants were more likely to choose the web
mode (Diment & Garrett-Jones 2007) and possess lower tolerance, acceptance, and satis-
faction levels for slower system response times (Yu etal. 2020). Finally, participation mode
preferences are affected by age, as older participants have lower preferences for web mode
(Diment & Garrett-Jones 2007; Millar etal. 2009), and education level, as higher education
increases the likelihood of preferring the web mode (Millar etal. 2009; Smyth etal. 2014).
Therefore, we included internet familiarity, gender, year of birth, and education level as
control variables. The hypotheses in this regard are:
Second hypothesis: Being more familiar with the internet increases the probability of
choosing online participation in a mixed-mode panel.
Third hypothesis: Being male increases the probability of choosing online participation
in a mixed-mode panel.
Fourth hypothesis: Younger participants have a higher probability of choosing online
participation in a mixed-mode panel.
Fifth hypothesis: Participants with a higher level of education have a higher probability
of choosing online participation in a mixed-mode panel.
Next, we review the literature on panel attrition and apply it to the mechanism of online
waiting.
2.3 Panel attrition
Galesic (2006) has emphasized that interest in survey questions and the burden expe-
rienced while answering them are the most important aspects of studying survey
behavior. Specifically, interest in the survey and other initial factors of motivation (e.g.,
incentives) become weaker over the course of a panel survey, and the influence of nega-
tive aspects that increase the burden (e.g., boredom) becomes stronger (Galesic 2006).
Consistent with these findings, Gummer and Daikeler (2018) investigated respondents’
experience in a mixed-mode panel survey and found that particularly the perceived
length of surveys influences further participation, regardless of the participation mode.
With respect to survey burden, many studies have focused on the length of a survey as a
crucial factor (see, e.g., Vicente & Reis 2010). Shorter web surveys yield higher response
rates (Deutskens et al. 2004; Marcus et al. 2007) and lower dropout rates (Ganassali
2008). These studies manipulated the number of questions to investigate the effects
of survey length. However, a higher number of questions is not required to increase
the survey length and, thus, the survey burden. Crawford et al. (2001) compared a web
survey affected by server problems with an identical web survey without server
problems. With server problems, the average completion time was longer (21.6 vs.
17.8 min), nonresponse was higher (66.1% vs. 63.9%), and breakoffs were more frequent (10.6% vs. 8.8%).
In view of these findings and the above-described mechanisms of online waiting, we
expect that even short delays in page load time can affect the survey burden negatively
and increase the probability of dropping out of a panel survey.
Therefore, the sixth hypothesis is: Living in a region with a better broadband supply
decreases the risk of attrition in an online panel survey.
In line with the mode choice analysis, we added important predictors from existing
research on panel attrition to the model we used to investigate this hypothesis empiri-
cally. In the first place, many studies on web and mixed-mode surveys have found that
survey duration is a crucial factor of panel attrition (Crawford et al. 2001; Deutskens
et al. 2004; Ganassali 2008; Vicente & Reis 2010). With respect to the sociodemographic
variables, the attrition rate in face-to-face interviews is lower among women
(Behr et al. 2005; Lepkowski & Couper 2002) and people with higher education (Watson
& Wooden 2009). The findings relating to participants’ ages were mixed. For example,
Lipps (2009) measured an increased risk of attrition for the oldest and youngest
participants in face-to-face and telephone interviews, and Struminskaya (2014) found a
small negative effect in an online panel survey. Consequently, we included the following
control variables: evaluation of survey duration, measured survey duration, gender, year
of birth, and education level. The hypotheses in this regard are:
Seventh hypothesis: Evaluating the survey duration as shorter decreases the risk of
attrition in an online panel survey.
Eighth hypothesis: Taking more time to complete the survey increases the risk of attrition
in an online panel survey.
Ninth hypothesis: Being male increases the risk of attrition in an online panel survey.
Tenth hypothesis: Younger participants have a higher risk of attrition in an online panel survey.
Eleventh hypothesis: Participants with a higher level of education have a lower risk of attrition
in an online panel survey.
2.4 Feasibility ofusing publicly available geospatial data
Since small-scale geospatial data is often unavailable or difficult to access for research
purposes, we decided to conduct a feasibility study to test whether publicly available
broadband data at the district level is sufficient for our examination. This is an exploratory
approach for which we have no comparative or empirical values to draw from.
Twelfth hypothesis: Using publicly available geospatial broadband data at the district level
is sufficient to draw meaningful conclusions about participation behavior in a panel survey.
3 Data andmethods
3.1 GESIS Panel survey data
The survey data we used in this paper is the GESIS Panel – Extended Edition version
31.0.0 (GESIS 2019). The GESIS Panel is a probability-based panel survey of the GESIS
– Leibniz Institute for the Social Sciences that started in 2013. It uses two self-administered
participation modes: online via web surveys or offline via paper-and-pencil surveys.
The target population of the GESIS Panel comprises all German-speaking people between
18 and 70 years of age who permanently reside in Germany (Bosnjak et al. 2018). After
the initial sampling, the GESIS Panel drew refreshment samples in 2016 and 2018
from the German General Social Survey (ALLBUS). Schaurer and Weyandt (2018) and
Schaurer et al. (2020) have pointed out that the participants of the ALLBUS were asked
whether they were willing to participate in a subsequent self-administered panel survey. If
they agreed, those who used the internet were nudged to use the online mode. Those participants
who did not use the internet or did not want to participate online were offered the
offline mode.
For this paper, we used the second cohort of the GESIS Panel since the geospatial
broadband data were not available before 2016. This cohort was recruited in 2016 with a
minimum recruitment rate (RECR) of 18.36% (Schaurer & Weyandt 2018). The resulting
data comprises 16 regular waves from June 2016 ("wave dc") to December 2018 ("wave
ff") in addition to the recruitment interviews of the second cohort ("d11" and "d12").
3.2 Broadband supply inGermany
The geospatial data on broadband supply we used in this paper is from Germany’s Fed-
eral Agency for Cartography and Geodesy, which provides a machine-readable map on
its website called Broadband Supply with 50 Mbit/s¹ (Bundesamt für Kartographie und
Geodäsie 2016; cf. Internet Archive 2018). The data is based on the Breitbandatlas of the
Federal Ministry for Digital and Transport (Bundesministerium für Digitales und Verkehr
2023). As shown in Fig.1, the map provides information on the proportion of broadband
supply with at least 50 Mbit/s available in each German administrative district in mid-
2016. The administrative districts comprise 432 units categorized as rural districts
("Landkreise"), independent cities ("Kreisfreie Städte"), districts ("Kreise"), and city districts
("Stadtkreise"). The lowest broadband category (0–25%) means that a maximum of 25% of
the households in such an administrative district have a broadband connection with a data
transmission rate of at least 50 Mbit/s. This 50 Mbit/s speed was considered to be a threshold
value for sufficient data transmission (Bundesamt für Kartographie und Geodäsie 2016;
cf. Internet Archive 2018).
¹ Although the map is no longer available on the website of the Federal Agency for Cartography and
Geodesy, a screenshot of the website from October 2018 can be obtained from Internet Archive (2018).
3.3 Operationalization andmethods
3.3.1 Preparation oftheanalysis data
The geospatial broadband data and the geocoded survey data needed to be combined to
investigate the effects of broadband supply on the mode choice and panel attrition of the
participants of the GESIS Panel. The first step was to retrieve and process the machine-
readable data of the map of broadband supply with a geographic information system (GIS).
We accomplished this by using the open-source software QGIS. We used the resulting
shapefile in the statistical programming language R as a SpatialPolygonsDataFrame (R
Core Team 2021). Next, we transformed the geospatial broadband data and the coordinates
of the GESIS Panel participants to the same coordinate reference system (CRS). The CRS
provides information on the coordinate origins and curvature of the earth. Finally, we gen-
erated a variable with broadband categories for each participant in each survey wave and
appended these variables to the GESIS Panel survey data. The new variables assign each
participant the broadband category of their district for each survey wave.
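The core of this preparation step is a point-in-polygon lookup: each participant's coordinates are matched to the district polygon that contains them. The sketch below illustrates the idea in plain Python with a ray-casting test. It is not the QGIS/R workflow used in the study; the square "districts", their categories, and the participant coordinates are hypothetical, and all coordinates are assumed to be in the same projected CRS already.

```python
from typing import Optional

Point = tuple[float, float]
Polygon = list[Point]


def point_in_polygon(point: Point, polygon: Polygon) -> bool:
    """Ray casting: count crossings of a rightward horizontal ray with the polygon edges."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_category(point: Point, districts: list[tuple[Polygon, int]]) -> Optional[int]:
    """Return the broadband category of the first district containing the point, else None."""
    for polygon, category in districts:
        if point_in_polygon(point, polygon):
            return category
    return None


# Hypothetical districts (polygon, broadband category) and participant locations.
districts = [
    ([(0, 0), (10, 0), (10, 10), (0, 10)], 2),
    ([(10, 0), (20, 0), (20, 10), (10, 10)], 5),
]
participants = {"p1": (3.0, 4.0), "p2": (15.0, 5.0), "p3": (25.0, 5.0)}
categories = {pid: assign_category(xy, districts) for pid, xy in participants.items()}
print(categories)
```

In practice, a GIS library would handle holes, multipart polygons, and points on boundaries, which this sketch ignores. Repeating the lookup with each wave's coordinates gives one category per participant and wave.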
Fig. 1 Broadband supply in Germany in 2016
In the recruitment interview, participants were asked whether they used the internet
for private purposes. For the mode choice analysis, participants who did not have internet
access at all were excluded because they had no actual choice between participation
modes. Furthermore, 15 participants were excluded because they switched their
participation mode between the recruitment interview and the first survey wave. Four of
them switched from online to offline because they did not provide a valid email address.
The other 11 participants switched from offline to online since they used the URL in the
invitation letter of the profile survey and then provided a valid email address. The resulting
dataset for the mode choice analysis contained 1,455 observations. For the panel attrition
analysis, we used the online participants and converted their data into a long format to ana-
lyze the longitudinal panel data of the 16 waves. The resulting dataset for the panel attrition
analysis contained 13,095 observations.
3.3.2 Mode choice analysis
The first set of analyses was concerned with mode choice as the dependent variable.
We used a binomial logistic regression model since mode choice is a binary variable
(0 = Offline, 1 = Online). Choosing the offline mode serves as a reference category.
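As a reading aid for the odds ratios that a binomial logistic regression produces, the sketch below shows how a coefficient translates into an odds ratio and how an odds ratio shifts a baseline probability. The intercept and coefficient are hypothetical values for illustration and are not estimates from this study.

```python
import math


def logistic(z: float) -> float:
    """Inverse logit: map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def apply_odds_ratio(p_baseline: float, odds_ratio: float) -> float:
    """Probability obtained when the baseline odds are multiplied by an odds ratio."""
    new_odds = p_baseline / (1.0 - p_baseline) * odds_ratio
    return new_odds / (1.0 + new_odds)


intercept = -0.5  # hypothetical log-odds of choosing online in the reference category
beta = 0.7        # hypothetical coefficient of one dummy-coded category

odds_ratio = math.exp(beta)            # exponentiated coefficient
p_reference = logistic(intercept)      # probability in the reference category
p_category = logistic(intercept + beta)

# Both routes give the same probability for the non-reference category.
print(math.isclose(p_category, apply_odds_ratio(p_reference, odds_ratio)))
```

Note that an odds ratio multiplies odds, not probabilities: the same odds ratio implies a smaller change in probability when the baseline probability is already close to 0 or 1.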
In the recruitment interview, participants who reported that they used the internet for
private purposes were asked whether it was acceptable for them to answer the panel survey
online. Those who disagreed became offline participants. The main independent variable
of the mode choice analysis was the broadband category at the time of the first wave. It was
classified according to the proportion of broadband supply with at least a 50 Mbit/s speed
in the administrative district of the participant (1 = 0–25% of at least 50 Mbit/s, 2 = 26–50%
of at least 50 Mbit/s, 3 = 51–75% of at least 50 Mbit/s, 4 = 76–95% of at least 50 Mbit/s,
5 = 96–99% of at least 50 Mbit/s).
Additionally, we included four control variables in the mode choice analysis: internet
familiarity (measured as the reported frequency of private internet usage: 1 = Rarer,
2 = About once a week, 3 = More than once a week, 4 = About once a day, 5 = Several times
a day), gender (0 = Female, 1 = Male), year of birth, and the highest level of education
(corresponding to the general school-leaving qualification: 1 = Low, 2 = Medium, 3 = High²).
The lowest categories of broadband, familiarity, and education, as well as female gender,
serve as reference categories. See Table 1 in the Appendix for the descriptive statistics of
all the variables we used in the mode choice analysis.
² Low education level: Individuals without a formal school-leaving qualification or those with a
"Hauptschulabschluss," which is typically obtained after completing lower secondary education.
Medium education level: Participants with a "Realschulabschluss," indicating completion of a
higher level of secondary education. High education level: Individuals with a
"(Fach-)Hochschulreife" or "Abitur," representing the highest level of secondary education in Germany.
3.3.3 Panel attrition analysis
The dependent variable of the second analysis was panel dropout, which is a dichotomous
indicator for attrition in each wave (0 = No dropout, 1 = Dropout). Not dropping out serves
as a reference category. The main independent variable of the panel attrition analysis was
the time-dependent broadband category with a broadband value for each participant in
each wave.³ Consistent with the previous analysis, the independent variable was classified
according to the proportion of broadband supply with at least a 50 Mbit/s speed in
the administrative district of the participant (1 = 0–25% of at least 50 Mbit/s, 2 = 26–50%
of at least 50 Mbit/s, 3 = 51–75% of at least 50 Mbit/s, 4 = 76–95% of at least 50 Mbit/s,
5 = 96–99% of at least 50 Mbit/s).
Furthermore, in the panel attrition analysis, we included five control variables. The first
was the evaluation of duration. At the end of each survey wave, the participants evalu-
ated whether they experienced the questionnaire as long (1 = Not at all, 2 = Rather not,
3 = Partially agree, 4 = Rather, 5 = Very). However, the evaluation of duration could not be
included in the analysis as a time-dependent variable due to missing events (dropouts) in
some of the response categories. The solution was to generate a categorized mean for each
participant by calculating their mean evaluation of the duration across all waves and group-
ing this mean into one of the five original categories. We included the categorized mean
of the evaluation of duration as a time-independent variable in the analysis. The second
control variable was the actual duration in seconds. Due to the same issues as the time-
dependent evaluation of duration, we calculated the mean duration for each participant and
categorized it into four groups by dividing it into quartiles. We included the other three
control variables, namely gender (0 = Female, 1 = Male), year of birth, and education (1 = Low,
2 = Medium, 3 = High), as time-independent variables. The lowest categories of broadband,
evaluation of duration, and education, as well as female gender, serve as reference
categories. See Table 2 in the Appendix for the descriptive statistics of all the variables we
used in the panel attrition analysis.
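The two derived control variables can be sketched as follows. This is an illustration under stated assumptions, not the original implementation: we assume that the per-participant mean evaluation is mapped back to the 1–5 scale by rounding half up, and that quartile cut points come from the default method of Python's `statistics.quantiles`; the text does not specify either rule. All function names and example values are hypothetical.

```python
import math
from bisect import bisect_left
from statistics import mean, quantiles


def categorized_mean(ratings: list[int]) -> int:
    """Mean of one participant's 1-5 ratings across waves, rounded half up (assumed rule)."""
    return math.floor(mean(ratings) + 0.5)


def quartile_groups(mean_durations: dict[str, float]) -> dict[str, int]:
    """Assign each participant to quartile group 1-4 of the mean durations."""
    cuts = quantiles(list(mean_durations.values()), n=4)  # three cut points
    return {pid: bisect_left(cuts, value) + 1 for pid, value in mean_durations.items()}


# Hypothetical ratings of one participant over four waves.
print(categorized_mean([2, 3, 3, 2]))

# Hypothetical mean durations in seconds for eight participants.
durations = {"a": 300, "b": 420, "c": 510, "d": 640,
             "e": 700, "f": 810, "g": 950, "h": 1200}
groups = quartile_groups(durations)
print(groups)
```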
Table 1 Descriptive statistics of the mode choice analysis
Variable              Statistics/Values                      Frequencies (%)
Mode choice: Online   0 = Offline                            394 (27.1%)
                      1 = Online                             1,061 (72.9%)
Broadband category    1 = 0–25% of at least 50 Mbit/s        25 (1.7%)
                      2 = 26–50% of at least 50 Mbit/s       287 (19.7%)
                      3 = 51–75% of at least 50 Mbit/s       542 (37.3%)
                      4 = 76–95% of at least 50 Mbit/s       545 (37.5%)
                      5 = 96–99% of at least 50 Mbit/s       56 (3.8%)
Internet familiarity  1 = Rarer                              31 (2.1%)
                      2 = About once a week                  39 (2.7%)
                      3 = More than once a week              146 (10.0%)
                      4 = About once a day                   265 (18.2%)
                      5 = Several times a day                974 (66.9%)
Gender: Male          0 = Female                             727 (50.0%)
                      1 = Male                               728 (50.0%)
Year of birth         Mean (sd): 1968 (15.7)                 N = 1445
                      min < med < max: 1928 < 1967 < 1997
                      IQR (CV): 25 (0)
Education             1 = Low                                232 (15.9%)
                      2 = Medium                             502 (34.5%)
                      3 = High                               721 (49.6%)
³ The actual number of changes in the broadband category of a participant between waves was low because
they only occurred when a participant changed residence.
We modeled panel attrition with a Cox regression, which is a method that enabled us to fit
multivariate survival models with time-dependent and time-independent covariates. To fit our
Cox regression model with time-dependent covariates, we converted the data to a long format.
Moreover, we split the episodes so that each participant had one row for each wave in which
they participated (Broström 2018, pp. 67–70). The purpose of a Cox regression model is to
evaluate simultaneously the effects of multiple covariates on an event, which is dropout in the
case of the panel attrition analysis. The Cox model is expressed by the hazard function, and
its outcome can be interpreted as the risk of an event at each point in time. As in logistic
regression, it is common to exponentiate the coefficients of the Cox regression,
which reverses the logarithm and yields hazard ratios that are easier to interpret.
A hazard ratio above 1 indicates a covariate that is positively associated with the event
probability and, thus, negatively associated with the length of survival.
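The episode splitting described above can be sketched as a conversion from one record per participant to one row per participant and wave, with the dropout indicator set only in the last observed wave of a participant who dropped out. The field names and the two example participants below are hypothetical; the original analysis was carried out in R.

```python
def to_person_wave_rows(panelists: list[dict]) -> list[dict]:
    """Expand one record per panelist into one row per wave in which they participated."""
    rows = []
    for p in panelists:
        for wave in range(1, p["last_wave"] + 1):
            rows.append({
                "id": p["id"],
                "wave": wave,
                # Time-dependent covariate: the broadband category valid in this wave.
                "broadband": p["broadband_by_wave"][wave - 1],
                # The event is recorded only in the final wave of a panelist who dropped out.
                "dropout": int(p["dropped_out"] and wave == p["last_wave"]),
            })
    return rows


# Hypothetical panelists: p1 moves before wave 3 and drops out afterwards; p2 stays.
panelists = [
    {"id": "p1", "last_wave": 3, "dropped_out": True, "broadband_by_wave": [2, 2, 3]},
    {"id": "p2", "last_wave": 4, "dropped_out": False, "broadband_by_wave": [4, 4, 4, 4]},
]
rows = to_person_wave_rows(panelists)
for row in rows:
    print(row)
```

A layout of this kind yields many rows without an event and few rows with one, which matches the shape of the person-wave data summarized in Table 2.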
A key assumption of the Cox regression model is that the hazard curves for groups of
observations are proportional and do not cross, which is why it is also called the proportional
hazards model (STHDA 2019). Using scaled Schoenfeld residuals, we tested the proportional
hazards (PH) assumption underlying the Cox regression model.
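For reference, the proportional hazards assumption can be read directly from the standard textbook form of the Cox model. The notation below (baseline hazard $h_0(t)$, coefficient vector $\beta$, covariate vectors $x$, $x_a$, $x_b$) is ours and does not appear in the original text.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right),
\qquad
\frac{h(t \mid x_a)}{h(t \mid x_b)} = \exp\!\left(\beta^{\top}(x_a - x_b)\right)
```

Because the baseline hazard $h_0(t)$ cancels in the ratio, the hazard ratio between two covariate profiles does not depend on $t$. For a one-unit increase in a single covariate $x_k$, the ratio reduces to $\exp(\beta_k)$, the hazard ratio reported for that covariate.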
Table 2 Descriptive statistics of the panel attrition analysis

Variable                        Statistics/Values                     Frequencies (%)
Panel dropout                   0 = No dropout                        12,882 (98.4%)
                                1 = Dropout                           213 (1.6%)
Broadband category              1 = 0–25% of at least 50 Mbit/s       192 (1.5%)
                                2 = 26–50% of at least 50 Mbit/s      2,550 (19.5%)
                                3 = 51–75% of at least 50 Mbit/s      4,796 (36.6%)
                                4 = 76–95% of at least 50 Mbit/s      4,950 (37.8%)
                                5 = 96–99% of at least 50 Mbit/s      607 (4.6%)
Evaluation of duration          1 = Long–Not at all                   1,210 (9.2%)
                                2 = Long–Rather not                   7,160 (54.7%)
                                3 = Long–Partially agree              4,057 (31.0%)
                                4 = Long–Rather                       611 (4.7%)
                                5 = Long–Very                         57 (0.4%)
Measured duration (quartiles)   Mean (sd): 2.5 (1.1)                  1: 3,296 (25.2%)
                                min < med < max: 1 < 2 < 4            2: 3,328 (25.4%)
                                IQR (CV): 2 (0.4)                     3: 3,264 (24.9%)
                                                                      4: 3,207 (24.5%)
Gender: Male                    0 = Female                            6,360 (48.6%)
                                1 = Male                              6,735 (51.4%)
Year of birth                   Mean (sd): 1968.8 (15.6)
                                min < med < max: 1933 < 1968 < 1997
                                IQR (CV): 24 (0)
Education                       1 = Low                               1,586 (12.1%)
                                2 = Medium                            4,074 (31.1%)
                                3 = High                              7,435 (56.8%)
N = 13,095
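The five broadband categories in Table 2 amount to a simple binning of each district's coverage share. The sketch below illustrates that binning; the function name is hypothetical, and it assumes that coverage is recorded as a whole-number percentage between 0 and 99, since Table 2 does not define a category for other values.

```python
def broadband_category(coverage_percent):
    """Map a district's share of households with at least 50 Mbit/s to category 1-5.

    Assumes whole-number percentages; values outside 0-99 are rejected
    because Table 2 defines no category for them.
    """
    if not 0 <= coverage_percent <= 99:
        raise ValueError("Table 2 only defines categories for 0-99%")
    upper_bounds = [(25, 1), (50, 2), (75, 3), (95, 4), (99, 5)]
    for upper, category in upper_bounds:
        if coverage_percent <= upper:
            return category


print([broadband_category(share) for share in (10, 40, 60, 90, 97)])
```

Note that the top category spans only four percentage points, whereas the lower categories span about twenty to twenty-five each.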
4 Results
4.1 Mode choice
Figure2 presents the results of the mode choice logistic regression model. The dots repre-
sent the odds ratios of each covariate of the binomial logistic regression for mode choice,
and the horizontal lines represent the 95% confidence intervals. The odds ratio of the fifth
broadband category is 6.405, which means that holding the other covariates at a fixed
value, living in a region where 96–99% of the households have a data transmission rate
of at least 50 Mbit/s (broadband category five) increases the odds of choosing the online
mode by a factor of 6.405 (p < 0.01) compared to a region where 0–25% of the households
have a data transmission rate of at least 50 Mbit/s (broadband category one). However,
broadband categories two, three, and four are not statistically different from broadband cat-
egory one. The odds ratios of internet familiarity ranges from 5.021 in the third category
(private internet usage: more than once a week) to 13.342 in the fifth category (private
internet usage: several times a day), which indicates that having an internet familiarity of
category three, four, or five increases the odds of choosing the online mode significantly
(p < 0.002) compared to an internet familiarity of category one. The control variables year
of birth4 (1.001) and education: 3—which is high education—(2.680) also have a signifi-
cant positive effect on choosing the online mode, whereas the coefficient of gender is not
Fig. 2 Odds ratios of the covariates of the binomial logistic regression on mode choice (online) with 95%
confidence intervals (Model 1). Note: Mode choice (offline), lowest categories of broadband, familiarity,
and education, as well as female gender, serve as reference categories
4 In a separate model, we also tested categorizing year of birth into different generational cohorts but found
no significant effects and no improvement in model fit.
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
5816
M.Schwerdtfeger, R.L.Bach
statistically significant and, therefore, has no relevant effect in this model. Consequently,
hypotheses 1 (broadband supply), 2 (internet familiarity), 4 (age), and 5 (education) cannot
be rejected, while hypothesis 3 (gender) must be rejected.
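The odds ratios reported for Model 1 can be approximately recovered by exponentiating the log-odds coefficients in Table 3. The sketch below does this for four covariates. It assumes that the values in parentheses in Table 3 are standard errors, and it uses Wald-type intervals, which may differ slightly from the intervals drawn in Fig. 2.

```python
import math

# Model 1 coefficients from Table 3, paired with the values in parentheses
# (assumed here to be standard errors).
coefficients = {
    "Broadband: 5": (1.857, 0.683),
    "Internet familiarity: 3": (1.614, 0.497),
    "Internet familiarity: 5": (2.591, 0.480),
    "Education: 3": (0.986, 0.181),
}


def odds_ratio(coef, se, z=1.96):
    """Return (odds ratio, lower bound, upper bound) of a Wald-type 95% interval."""
    return (math.exp(coef), math.exp(coef - z * se), math.exp(coef + z * se))


for name, (coef, se) in coefficients.items():
    ratio, lower, upper = odds_ratio(coef, se)
    print(f"{name}: OR = {ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because the coefficients in Table 3 are rounded to three decimals, the exponentiated values agree with the odds ratios in the text only up to rounding.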
Figure3 depicts the marginal effects of the independent variable broadband on mode
choice. According to the predicted probabilities for broadband, categories one to four,
which cover a data transmission rate of at least 50 Mbit/s for 0–95% of the households,
indicate low probabilities (ranging from about 10% to 15%) of users choosing the online
mode. However, in the highest broadband category, where 96–99% of the households have
a data transmission rate of at least 50 Mbit/s, the probability of choosing the online mode
is over 40%, and the regression model indicates that this effect is significantly different
from the first category. Overall, there is a small difference in gradient between categories
one to four and a large increase with wide confidence intervals in category five. Based on
Fig. 3 Marginal effects of broad-
band category on mode choice
(Model 1)
Fig. 4 Marginal effects of inter-
net familiarity on mode choice
(Model 1)
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
5817
How does broadband supply affect participation inpanel surveys?:…
these findings, we find partial support for the hypothesis that living in a region with a better
broadband supply increases the probability of choosing to participate online in a mixed-
mode panel (hypothesis 1).
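The jump in predicted probability for the highest broadband category is consistent with the reported odds ratio. As a rough check, the sketch below applies the odds ratio of 6.405 to an assumed baseline probability of 10%. That baseline is the lower end of the range quoted for categories one to four and is taken to represent category one, which is an assumption, as the exact value is not stated.

```python
def apply_odds_ratio(baseline_probability, odds_ratio):
    """Convert a baseline probability to the probability implied by an odds ratio."""
    baseline_odds = baseline_probability / (1 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1 + new_odds)


# Assumed baseline of 10% for broadband category one; odds ratio from Model 1.
implied = apply_odds_ratio(0.10, 6.405)
print(f"Implied probability in category five: {implied:.1%}")
```

Under this assumption the implied probability is a little above 40%, in line with the marginal effect described for category five.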
In Fig.4, we see that there is a strong linear relationship between the lowest internet
familiarity of about 10%, which includes participants who use the internet less than once
a week, and the highest internet familiarity of over 60% of online mode choices, which
includes participants who use the internet several times a day. However, we also see large
standard deviations in each category of internet familiarity. The regression model shows
that categories three, four, and five are significantly different from the first category. Over-
all, we find that higher internet familiarity increases the probability of choosing the online
mode in a mixed-mode panel (hypothesis 2).
Figure5 indicates a linear relationship of education on choosing the online mode, with
the lowest level of education having a predicted probability of about 10% and the highest
level of education having a predicted probability of about 22%. Again, we see large stand-
ard deviations in all three education categories, but the regression model shows that the
Fig. 5 Marginal effects of educa-
tion on mode choice (Model 1)
Fig. 6 Interaction of broadband
and internet familiarity on mode
choice (Model 2)
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
5818
M.Schwerdtfeger, R.L.Bach
highest education category has a significantly higher effect on mode choice than the lowest
education category. Overall, we find that a high level of education increases the probability
of choosing the online mode in a mixed-mode panel (hypothesis 5).
In a second model, we include the interaction of broadband and internet familiarity to test whether the effect of broadband depends on internet familiarity. The interaction model shows no true interaction effects between broadband and internet familiarity (see Fig. 6). However, the model shows a tendency that the lower the level of internet familiarity, the higher the uncertainty in mode choice. This finding is consistent with the previously mentioned consideration that internet familiarity functions as a precondition for other predictors of mode choice.
Compared to the model without the interaction effect, the residual deviance decreased by 25.6 to 1,458.6, but the Akaike information criterion (AIC) increased by 0.4 to a value of 1,510.6, which can be interpreted as a deterioration of the model fit. Therefore, the first model fits the data best and serves as the basis for the hypothesis tests. See Table 3 in the Appendix for the complete table of results of both models.
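The trade-off between deviance and AIC can be retraced from the figures in Table 3. For a logistic regression on individual binary outcomes, the residual deviance equals minus twice the log-likelihood, so the AIC is the deviance plus twice the number of estimated parameters. The parameter counts below (13 and 26) are an assumption inferred from Table 3: twelve slopes plus a constant in Model 1, and thirteen additional interaction terms with reported estimates in Model 2.

```python
def aic_from_deviance(residual_deviance, n_parameters):
    """AIC for a binary logistic regression: deviance + 2 * number of parameters."""
    return residual_deviance + 2 * n_parameters


# Residual deviances from Table 3; parameter counts are inferred, not reported.
aic_model_1 = aic_from_deviance(1484.2, 13)
aic_model_2 = aic_from_deviance(1458.6, 26)

print(f"Model 1 AIC: {aic_model_1:.1f}")
print(f"Model 2 AIC: {aic_model_2:.1f}")
print(f"Change in AIC: {aic_model_2 - aic_model_1:+.1f}")
```

The drop in deviance of 25.6 is slightly outweighed by the penalty of 26 for the thirteen extra parameters, which is why the AIC rises by 0.4.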
4.2 Panel attrition
In Fig.7, the dots represent the hazard ratios of each covariate of the Cox regression on
panel dropout, and the horizontal lines represent the 95% confidence intervals. The haz-
ard ratios of broadband categories two to five are not significantly distinct from broad-
band category one. This suggests that the varying proportions of households with a data
transmission rate of at least 50 Mbit/s do not affect panel attrition. The hazard ratios of
the evaluation of duration are at 6.107 (p < 0.001) for the fourth category and at 20.851
(p < 0.001) for the fifth category, indicating that evaluating the survey as long significantly
increases the risk of dropout compared to not evaluating the survey as long (category one).
The control variables measured duration (1.215) and year of birth (1.029) have a signifi-
cant positive effect on panel attrition, whereas the coefficients of gender and education are
not statistically significant and, therefore, have no relevant effects in this model. Conse-
quently, hypotheses 7 (evaluation of duration), 8 (measured duration), and 10 (age) cannot
be rejected, while hypotheses 6 (broadband supply), 9 (gender), and 11 (education) must be
rejected.
Figure8 shows the adjusted survival curves of the five broadband categories on panel
dropout. The survival rates of each broadband category are almost identical, with a differ-
ence of about five percentage points between the first and fifth broadband categories after
16 survey waves. Given these results, we cannot confirm that living in a region with a bet-
ter broadband supply decreases the risk of attrition in an online panel survey (hypothesis
6).
In contrast, the survival rates of the evaluation of duration are widely spread (see Fig. 9). After 16 waves, the survival rate for the participants who rated the survey as not at all long (first category) is about 88%, whereas the survival rate for participants who rated the survey as very long (fifth category) is about 10%. Thus, participants who on average rate the duration as long have a much higher risk of panel dropout than participants who did not rate the duration as long (hypothesis 7).
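The spread between these survival rates follows from the proportional hazards structure, in which the survival function of a group equals the reference survival function raised to the power of the hazard ratio. The sketch below applies this relation to the figures quoted above. It is only a rough plausibility check: the adjusted curves in Fig. 9 also account for the other covariates, so an exact match is not expected, and the function name is hypothetical.

```python
def survival_under_hazard_ratio(reference_survival, hazard_ratio):
    """Proportional hazards relation: S(t | x) = S_0(t) ** hazard_ratio."""
    if not 0.0 < reference_survival <= 1.0:
        raise ValueError("reference survival must lie in (0, 1]")
    return reference_survival ** hazard_ratio


reference_after_16_waves = 0.88  # category one ("not at all long"), from the text
hazard_ratio_very_long = 20.851  # evaluation of duration, category five

implied = survival_under_hazard_ratio(reference_after_16_waves, hazard_ratio_very_long)
print(f"Implied survival after 16 waves: {implied:.2f}")
```

The implied value is below 10%, which is of the same order as the survival rate read from the figure for the fifth category.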
While testing the proportional hazards assumption of model 1, the individual Schoenfeld test results revealed that each covariate had a p-value above 0.05, except for the year of birth and measured duration. As the curves appear to be sufficiently flat and the global Schoenfeld test yielded an insignificant result of p = 0.205, we do not see a problem with the proportionality of the hazards. However, we tested categorizing the year of birth into different generational cohorts in a separate model. We did not find an improvement in the proportionality of the hazards. See Fig. 10 in the Appendix for the results of the proportional hazards tests and respective graphs.

Table 3 Results table of the logistic regression models on mode choice
DV: Mode choice (online)

                                (Model 1)           (Model 2)
Broadband: 2                    0.341 (0.469)       14.778 (1,028.170)
Broadband: 3                    0.339 (0.459)       15.842 (1,028.170)
Broadband: 4                    0.349 (0.460)       0.995 (1,162.783)
Broadband: 5                    1.857** (0.683)     1.189 (0.973)
Internet familiarity: 2         0.287 (0.599)       14.509 (543.076)
Internet familiarity: 3         1.614** (0.497)     16.729 (1,028.170)
Internet familiarity: 4         1.746*** (0.485)    0.584 (1,254.084)
Internet familiarity: 5         2.591*** (0.480)    17.696 (1,028.169)
Gender: Male                    0.182 (0.130)       0.180 (0.131)
Year of birth                   0.009* (0.005)      0.009 (0.005)
Education: 2                    0.335 (0.178)       0.305 (0.181)
Education: 3                    0.986*** (0.181)    0.963*** (0.183)
Broadband: 2 × Familiarity: 2                       −13.594 (543.077)
Broadband: 3 × Familiarity: 2                       −16.641 (543.078)
Broadband: 4 × Familiarity: 2                       NA
Broadband: 5 × Familiarity: 2                       NA
Broadband: 2 × Familiarity: 3                       −14.702 (1,028.170)
Broadband: 3 × Familiarity: 3                       −16.121 (1,028.170)
Broadband: 4 × Familiarity: 3                       −1.592 (1,162.783)
Broadband: 5 × Familiarity: 3                       14.379 (1,455.398)
Broadband: 2 × Familiarity: 4                       0.879 (1,254.084)
Broadband: 3 × Familiarity: 4                       0.466 (1,254.084)
Broadband: 4 × Familiarity: 4                       15.038 (1,366.623)
Broadband: 5 × Familiarity: 4                       16.969 (718.051)
Broadband: 2 × Familiarity: 5                       −14.967 (1,028.169)
Broadband: 3 × Familiarity: 5                       −16.196 (1,028.169)
Broadband: 4 × Familiarity: 5                       −1.162 (1,162.783)
Broadband: 5 × Familiarity: 5
Constant                        −20.172* (8.920)    −34.268 (1,028.209)
Observations                    1,455               1,455
Null deviance                   1,699.6             1,699.6
Residual deviance               1,484.2             1,458.6
AIC                             1,510.2             1,510.6
* p < 0.05; ** p < 0.01; *** p < 0.001

Fig. 7 Hazard ratios of the covariates of the Cox regression on panel dropout with 95% confidence intervals (Model 1). Note: No dropout, lowest categories of broadband, evaluation of duration, and education, as well as female gender, serve as reference categories
In a further model, we tested for interaction effects between the broadband categories and the evaluation of duration categories. The tests for the global statistical significance of the model remained significant, but the model comparison with the LR test revealed that the interaction effects did not improve the model fit compared to the first model. See Table 4 in the Appendix for the complete table of results of both models of the panel attrition analysis.
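The model comparison can be retraced from the LR test statistics in Table 4. The sketch below assumes that the comparison is a likelihood-ratio test on the difference between the two reported LR statistics, with the difference in their degrees of freedom. It uses the closed-form upper tail of the chi-square distribution for even degrees of freedom, so it needs only the standard library; the function name is hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df.

    For df = 2m the survival function has the closed form
    exp(-x/2) * sum((x/2)**i / i! for i in 0..m-1).
    """
    if df <= 0 or df % 2:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half**i / math.factorial(i) for i in range(df // 2))


# LR test statistics and degrees of freedom as reported in Table 4.
lr_model_1, df_model_1 = 133.618, 13
lr_model_2, df_model_2 = 143.848, 27

statistic = lr_model_2 - lr_model_1
df_diff = df_model_2 - df_model_1
p_value = chi2_sf_even_df(statistic, df_diff)
print(f"LR difference = {statistic:.3f}, df = {df_diff}, p = {p_value:.3f}")
```

Under these assumptions the p-value comes out at about 0.745, which matches the model comparison entry in Table 4 and indicates no improvement in fit from the interaction terms.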
As a robustness check, we ran the first model with two different subsets of the dataset. The first subset excluded the participants who used a smartphone for two waves in a row, and the second subset excluded the participants who used a smartphone for three waves in a row. Smartphone participants can use the mobile network to answer the survey, bypassing the broadband network and the limitations associated with poor broadband supply. With these two subsets, we evaluated whether the use of smartphones, which can connect both via a wireless network at home (broadband) and via a mobile network, affected the estimated effects of broadband supply. The results were similar to those of the complete dataset, which indicates that the findings are robust to survey participation via smartphones. See Table 5 in the Appendix for the results of the robustness check.
Fig. 8 Survival rate of broadband categories on panel dropout (Model 1)
Fig. 9 Survival rate of evaluation of duration on panel dropout (Model 1)
5 Conclusion anddiscussion
In the present study, we conducted a feasibility study to determine whether publicly available broadband data at the district level were sufficient to draw meaningful conclusions about participation behavior. The specific research focus was to explain how broadband supply affects the choice of participation mode in a mixed-mode panel survey and how it determines panel attrition. We first summarize the results for the substantive research questions and then discuss the granularity of the broadband data in the context of the general availability of fine-grained geospatial data.
The literature review in the context of mode choice revealed that internet familiarity was an important precondition for choosing the online mode, followed by online experiences, which were closely linked to a fast and stable broadband connection. In light of this review, we expected that living in a region with a better broadband supply would increase the probability of choosing the online mode in a mixed-mode panel survey (hypothesis 1). Our results show that only the highest broadband category, where the district has 96–99% broadband coverage of at least 50 Mbit/s, significantly increases the likelihood of choosing the online mode compared to districts with 0–25% broadband coverage of at least 50 Mbit/s. With these results, we can partially confirm the first hypothesis since it is true, at least for the outermost categories, that living in a region with a better broadband supply increases the probability of choosing the online mode in a mixed-mode panel survey. Furthermore, we can confirm previous research findings, as our model shows that higher internet familiarity (hypothesis 2), younger age (hypothesis 4), and higher education (hypothesis 5) increase the likelihood of choosing the online mode in a mixed-mode panel survey, with the effect of higher internet familiarity appearing to be particularly strong.

Fig. 10 Test of the proportional hazards assumption (Cox regression model 1)

Table 4 Results table of the Cox regression models on panel dropout
DV: Event = Panel dropout

                                  (Model 1)               (Model 2)
Broadband: 2                      0.038 (0.731)           13.471 (1,723.302)
Broadband: 3                      0.147 (0.720)           13.128 (1,723.302)
Broadband: 4                      0.316 (0.718)           12.971 (1,723.302)
Broadband: 5                      0.274 (0.773)           14.554 (1,723.303)
Evaluation of duration: 2         0.237 (0.336)           13.035 (1,723.303)
Evaluation of duration: 3         0.598 (0.341)           14.070 (1,723.303)
Evaluation of duration: 4         1.809*** (0.365)        0.857 (1.167)
Evaluation of duration: 5         3.037*** (0.473)        1.761 (1.435)
Duration (Quartiles)              0.194** (0.066)         0.192** (0.067)
Gender: Male                      −0.254 (0.139)          −0.266 (0.142)
Year of birth                     0.029*** (0.005)        0.029*** (0.005)
Education: 2                      −0.089 (0.242)          −0.042 (0.244)
Education: 3                      −0.420 (0.236)          −0.405 (0.238)
Broadband: 2 × Eval. of Dur.: 2                           −13.461 (1,723.303)
Broadband: 3 × Eval. of Dur.: 2                           −12.883 (1,723.303)
Broadband: 4 × Eval. of Dur.: 2                           −12.292 (1,723.303)
Broadband: 5 × Eval. of Dur.: 2                           −14.102 (1,723.303)
Broadband: 2 × Eval. of Dur.: 3                           −13.824
Broadband: 3 × Eval. of Dur.: 3                           −13.231 (1,723.303)
Broadband: 4 × Eval. of Dur.: 3                           −13.296 (1,723.303)
Broadband: 5 × Eval. of Dur.: 3                           −15.291 (1,723.303)
Broadband: 2 × Eval. of Dur.: 4                           0.781 (1.406)
Broadband: 3 × Eval. of Dur.: 4                           0.659 (1.305)
Broadband: 4 × Eval. of Dur.: 4                           1.352 (1.334)
Broadband: 5 × Eval. of Dur.: 4                           NA (0.000)
Broadband: 2 × Eval. of Dur.: 5                           0.309 (1.750)
Broadband: 3 × Eval. of Dur.: 5                           1.490 (1.599)
Broadband: 4 × Eval. of Dur.: 5                           1.963 (1.695)
Broadband: 5 × Eval. of Dur.: 5                           NA (0.000)
Number of Events                  213                     213
Observations                      13,095                  13,095
Model comparison: Pr(> Chi)                               0.745
Wald Test                         166.160*** (df = 13)    175.360*** (df = 27)
LR Test                           133.618*** (df = 13)    143.848*** (df = 27)
Score (Logrank) Test              234.628*** (df = 13)    271.382*** (df = 27)
The model comparison tests the improvement of model fit compared to the previous model by an LR test
* p < 0.05; ** p < 0.01; *** p < 0.001
The literature review in the context of panel attrition revealed that survey burden and panel experience play a central role in determining panel dropouts. The crucial factor in this context is the perception of burden, which is influenced by flow experience connected to waiting times and the overall perceived survey duration. Therefore, we expected that living in a region with a better broadband supply would decrease the risk of attrition in online panel surveys (hypothesis 6). The sixth hypothesis was rejected by the results, as there was no significant effect of broadband supply on panel attrition. Besides that, the analysis shows that the probability of dropping out increases the more the survey is evaluated as long (hypothesis 7) and the longer the actual duration of the survey is (hypothesis 8). Also, younger respondents have a higher risk of dropping out (hypothesis 10).

Table 5 Robustness check of the Cox regression on panel dropout
DV: Event = Panel dropout

                                  Subset 1 (1)        Subset 2 (2)
Broadband: 2                      −0.176 (0.737)      −0.124 (0.735)
Broadband: 3                      0.011 (0.722)       0.042 (0.721)
Broadband: 4                      0.167 (0.722)       0.187 (0.721)
Broadband: 5                      −0.051 (0.789)      0.185 (0.775)
Evaluation of duration: 2         0.128 (0.339)       0.141 (0.337)
Evaluation of duration: 3         0.567 (0.347)       0.521 (0.345)
Evaluation of duration: 4         1.790*** (0.373)    1.705*** (0.372)
Evaluation of duration: 5         2.702*** (0.509)    2.864*** (0.491)
Duration (Quartiles)              0.205** (0.072)     0.211** (0.070)
Gender: Male                      −0.403** (0.152)    −0.361* (0.146)
Year of birth                     0.039*** (0.006)    0.036*** (0.006)
Education: 2                      −0.126 (0.259)      −0.051 (0.251)
Education: 3                      −0.430 (0.254)      −0.437 (0.249)
Number of events                  182                 195
Observations                      9,874               10,709
Wald test (df = 13)               166.470***          173.010***
LR test (df = 13)                 143.193***          144.015***
Score (Logrank) Test (df = 13)    223.616***          242.010***
Subset 1: Excluded participants that used the smartphone for at least two waves in a row
Subset 2: Excluded participants that used the smartphone for at least three waves in a row
* p < 0.05; ** p < 0.01; *** p < 0.001
Besides examining the effects of broadband supply on mode choice and panel attrition, the present study was designed as a feasibility study to test whether publicly available geospatial broadband data at the district level were sufficient for this type of analysis.
There is an ongoing discussion about the availability of geospatial data for research purposes and the associated privacy concerns (Solymosi et al. 2023). Fine-grained geospatial data offers the potential for precise analyses but also carries a high risk of revealing the identities of individuals. In addition, fine-grained geospatial data is often not available for research purposes or is subject to severe restrictions, while larger-scale geospatial data is more accessible. In the case of the broadband supply data used, the district level was accessible online, whereas we could not gain access to the more detailed grid level when designing the study.
In view of the results, the classification of about 400 administrative districts in Germany into five broadband categories was not ideal for the present analyses. Although the analysis of mode choice shows a significant effect for the highest broadband category, we expect that better broadband data would provide a more accurate picture of the impact of broadband supply. Such better broadband data would include both finer spatial units and more information about the different levels of broadband speed available per unit. Therefore, we cannot reject hypothesis 12 since the district-level data has produced meaningful results. However, we do not recommend focusing solely on district-level data for such research questions, although it was important to test this in a feasibility study. The approach of using publicly available geodata to optimize survey participation behavior has a high potential for improving survey quality with relatively simple means and low costs. When fine-grained geospatial data is limited and restricted, larger-scale geospatial data with fewer barriers can be used to find opportunities for survey optimization. This study shows that it might be reasonable to replicate these analyses, especially the mode choice part, with more fine-grained broadband data to test whether the insignificance of the broadband coefficients is attributable to the coarse granularity of the broadband data.
To conclude, this field of research is becoming more and more important due to the increasing number of web surveys and the ambition of full coverage of the population with web surveys. Survey researchers can derive practical implications from the results of the mode choice analysis. For instance, we can assume that in mixed-mode scenarios, respondents in areas with very good broadband supply are more likely to choose the online mode. Not only does this help in predicting online participation, but it also allows for a targeted approach to increasing online participation. Unfortunately, it is not possible to deduce any practical implications regarding broadband supply from the results of the panel attrition analysis. Therefore, the approach of using geospatial broadband data to investigate participation behavior should be re-examined with more precise data on broadband supply. In the course of this, new types of data, such as digital behavioral data, can also be examined in order to obtain more in-depth insights into, for example, familiarity, internet usage, and online habits. Findings from this research can be especially useful in survey recruitment when it is possible to offer potential participants the most suitable survey mode based on their place of residence, thereby improving participation rates and reducing panel attrition.
Appendix
See Fig. 10.
See Tables 1, 2, 3, 4, 5.
Author contributions All authors contributed to the study conception and design. Data preparation and
analysis were performed by Maikel Schwerdtfeger. The first draft of the manuscript was written by Maikel
Schwerdtfeger and all authors commented on previous versions of the manuscript. All authors read and
approved the final manuscript. A previous version of this manuscript was presented at the virtual General
Online Research Conference (GOR) in 2021.
Funding Open Access funding enabled and organized by Projekt DEAL.
Declarations
Conflict of interest The authors have no relevant financial or non-financial interests to disclose.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
References
Bandilla, W., Kaczmirek, L., Blohm, M., Neubarth, W.: Coverage- und Nonresponse-Effekte bei Online-
Bevölkerungsumfragen. In: Jackob, N., Schoen, H., Zerback, T. (eds.) Sozialforschung im internet:
methodologie und Praxis der Online-Befragung, pp. 129–143. VS Verlag für Sozialwissenschaften,
Wiesbaden (2009). https:// doi. org/ 10. 1007/ 978-3- 531- 91791-7_8
Baur, N., Florian, M.J.: Stichprobenprobleme bei Online-Umfragen. In: Jackob, N., Schoen, H., Zerback,
T. (eds.) Sozialforschung im Internet: Methodologie und praxis der online-befragung, pp. 109–128.
VS Verlag für Sozialwissenschaften, Wiesbaden (2009). https:// doi. org/ 10. 1007/ 978-3- 531- 91791-7_7
Behr, A., Bellgardt, E., Rendtel, U.: Extent and determinants of panel attrition in the European community
household panel. Eur. Sociol. Rev. 21(5), 489–512 (2005)
Bosnjak, M., Dannwolf, T., Enderle, T., Schaurer, I., Struminskaya, B., Tanner, A., Weyandt, K.W.: Estab-
lishing an open probability-based mixed-mode panel of the general population in Germany: the GESIS
Panel. Soc. Sci. Comput. Rev. 36(1), 103–115 (2018)
Bretschi, D., Weiß, B.: How do internet-related characteristics affect whether members of a german mixed-
mode panel switch from the mail to the web mode? Soc. Sci. Comput. Rev. (2022). https:// doi. org/ 10.
1177/ 08944 39322 11172 67
Broström, G.: Event history analysis with R. CRC Press, Boca Raton (2018)
Brutlag, J. (2009). Speed Matters. Google AI Blog. http:// ai. googl eblog. com/ 2009/ 06/ speed- matte rs. html
Bundesamt für Kartographie und Geodäsie. (2016). Breitbandversorgung mit 50 Mbit/s. https:// www. geopo
rtal. de/ Share dDocs/ Karten/ DE/ Theme nkarte_ Breit bandv ersor gung. html
Bundesministerium für Digitales und Verkehr. (2023). Breitbandatlas. https:// gigab itgru ndbuch. bund. de/
GIGA/ DE/ Breit banda tlas/ Vollb ild/ start. html
Chebat, J.-C., Salem, N.H., Poirier, J.-F., Gélinas-Chebat, C.: Reactions to waiting online by men and
women. Psychol. Rep. 106(3), 851–869 (2010). https:// doi. org/ 10. 2466/ pr0. 106.3. 851- 869
Crawford, S.D., Couper, M.P., Lamias, M.J.: Web surveys: perceptions of Burden. Soc. Sci. Comput. Rev.
19(2), 146–162 (2001). https:// doi. org/ 10. 1177/ 08944 39301 01900 202
Deutskens, E., de Ruyter, K., Wetzels, M., Oosterveld, P.: Response rate and response quality of internet-
based surveys: an experimental study. Mark. Lett. 15(1), 21–36 (2004). https:// doi. org/ 10. 1023/B:
MARK. 00000 21968. 86465. 00
Diment, K., Garrett-Jones, S.: How demographic characteristics affect mode preference in a postal/web
mixed-mode survey of Australian researchers. Soc. Sci. Comput. Rev. 25(3), 410–417 (2007)
Galesic, M.: Dropouts on the web: effects of interest and burden experienced during an online survey. J. Off.
Stat. 22(2), 313 (2006)
Ganassali, S.: The influence of the design of web survey questionnaires on the quality of responses. Surv.
Res. Methods 2, 21–32 (2008)
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
5828
M.Schwerdtfeger, R.L.Bach
GESIS. (2019). GESIS Panel—Extended Edition. GESIS Data Archive, Cologne, ZA5664 Data File (Ver-
sion 31.0.0). https:// doi. org/ 10. 4232/1. 13319
Gummer, T., Daikeler, J.: A note on how prior survey experience with self-administered panel surveys
affects attrition in different modes. Soc. Sci. Computer Rev. (2018). https:// doi. org/ 10. 1177/ 08944
39318 816986
Herzing, J.M.E., Blom, A.G.: The influence of a person’s digital affinity on unit nonresponse and attrition in
an online panel. Soc. Sci. Computer Rev. 37(3), 404–424 (2019). https:// doi. org/ 10. 1177/ 08944 39318
774758
Internet Archive. (2018, October 9). Geoportal.de—Geodaten aus Deutschland—Themenkarten—Breit-
bandversorgung mit 50 Mbit/s. https:// web. archi ve. org/ web/ 20181 00923 1108/ https:// www. geopo rtal.
de/ Share dDocs/ Karten/ DE/ Theme nkarte_ Breit bandv ersor gung. html
Lepkowski, J.M., Couper, M.P.: Nonresponse in the second wave of longitudinal household surveys. In: Sur-
vey nonresponse, pp. 259–272. John Wiley, Hoboken (2002)
Lipps, O.: Attrition of households and individuals in panel surveys. Soeppapers 164, 1–15 (2009)
Lugtig, P.: Panel attrition: separating stayers, fast attriters, gradual attriters, and lurkers. Sociol. Methods
Res. 43(4), 699–723 (2014)
Marcus, B., Bosnjak, M., Lindner, S., Pilischenko, S., Schütz, A.: Compensating for low topic interest and
long surveys: a field experiment on nonresponse in web surveys. Soc. Sci. Comput. Rev. 25(3), 372–
383 (2007)
Millar, M.M., O’Neill, A.C., Dillman, D.A.: Are mode preferences real. Pullman: Washington State Univ.
9, 2–41 (2009)
Nielsen, J.: Usability engineering. Morgan Kaufmann, Cambridge (1994)
Olson, K., Smyth, J.D., Wood, H.M.: Does giving people their preferred survey mode actually increase sur-
vey participation rates? An experimental examination. Public Opin. Quarterly 76(4), 611–635 (2012).
https:// doi. org/ 10. 1093/ poq/ nfs024
R Core Team. (2021). R: A language and environment for statistical computing [Computer software]. R
Foundation for Statistical Computing. https:// www.R- proje ct. org/
Reichl, P., Egger, S., Schatz, R., D’Alconzo, A.: The logarithmic nature of QoE and the role of the Weber-
Fechner law in QoE assessment. IEEE Int. Conf. Commun. 2010, 1–5 (2010)
Schaurer, I., Minderop, I., Bretschi, D., & Weyandt, K. (2020). GESIS Panel Technical Report: Recruitment
2018 (Wave f11 and f12) (p. 24). GESIS - Leibniz Institute for the Social Sciences. https:// dbk. gesis.
org/ dbkse arch/ downl oad. asp? id= 68673
Schaurer, I., & Weyandt, K. (2018). GESIS Panel Technical Report: Recruitment 2016 (Wave d11 and d12)
(p. 28). GESIS - Leibniz Institute for the Social Sciences. https:// dbk. gesis. org/ dbkse arch/ downl oad.
asp? id= 63525
Scherpenzeel, A. C., & Bethlehem, J. G. (2011). How representative are online panels? Problems of cover-
age and selection and possible solutions. Social and Behavioral Research and the Internet: Advances
in Applied Methods and Research Strategies, pp. 105–132.
Smyth, J.D., Olson, K., Millar, M.M.: Identifying predictors of survey mode preference. Soc. Sci. Res. 48,
135–144 (2014). https:// doi. org/ 10. 1016/j. ssres earch. 2014. 06. 002
Solymosi, R., Buil-Gil, D., Ceccato, V., Kim, E., Jansson, U.: Privacy challenges in geodata and open data.
Area 55(4), 456–464 (2023)
STHDA. (2019). Cox Proportional-Hazards Model. STHDA - Statistical Tools for Data Analysis and
Visualization. http://www.sthda.com/english/wiki/cox-proportional-hazards-model
Struminskaya, B. (2014). Data quality in probability-based online panels: Nonresponse, attrition, and panel
conditioning [Utrecht University]. https://dspace.library.uu.nl/handle/1874/301751
Vicente, P., Reis, E.: Using questionnaire design to fight nonresponse bias in web surveys. Soc. Sci.
Comput. Rev. 28(2), 251–267 (2010). https://doi.org/10.1177/0894439309340751
Watson, N., Wooden, M.: Identifying factors affecting longitudinal survey response. Methodol. Longitud.
Surv. 1, 157–182 (2009)
Yu, M., Zhou, R., Cai, Z., Tan, C.-W., Wang, H.: Unravelling the relationship between response time and
user experience in mobile applications. Internet Res. 30(5), 1353–1382 (2020)
Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.