
The Portable Document Format: An Analysis of PDF Accessibility

Patricia Acosta-Vargas1,2(✉), Mario Gonzalez1,
Maria Rosa Zambrano1,3, Ana Medina1, Noah Zweig1, and
Luis Salvador-Ullauri2
1 Universidad de Las Américas, Vía a Nayón, Quito, Ecuador
{patricia.acosta,mario.gonzalez.rodriguez,
maria.zambrano.torres,anagabriela.medina, noah.zweig}@udla.edu.ec
2 Universidad de Alicante, San Vicente del Raspeig, Alicante, Spain
lasu1@alu.ua.es
3 Universidad Politécnica de Madrid, Madrid, Spain
Abstract. Today, PDFs are frequently used to preserve historical documents in libraries, and
they are also one of the most widely used formats for sharing information on the web.
Unfortunately, most shared documents are not accessible, especially for users with
disabilities. To address this problem, we propose applying accessibility techniques for PDF
documents in accordance with the Web Content Accessibility Guidelines (WCAG) 2.1. As a case
study, we selected a random sample of 10 documents related to the modern architectural
heritage of Quito. We applied a combined method to check accessibility in PDFs with the help
of the PDF Accessibility Checker version 3.0. The results revealed that the accessibility
barriers repeated in most documents are related to the content and the natural language of
the analyzed PDFs. The analysis applied in this investigation can contribute to future work
on generating more inclusive PDF documents.
Keywords: Accessibility · Digital documents · Portable document · PDF techniques · WCAG 2.1
1 Introduction
Portable Document Format (PDF) files are now an essential medium for distributing
information. PDF documents are increasingly used to preserve historical documents in
libraries and are often shared on the web. Nevertheless, not all PDFs offer universal access.
To address this problem, we apply the PDF techniques of the Web Content Accessibility
Guidelines (WCAG) 2.1 [1]. As a case study, we take a random sample of 10 PDF documents that
refer to the modern architectural heritage of Quito and are stored in digital format. To
evaluate the documents, we use the PDF Accessibility Checker version 3.0; the evaluation
showed that libraries have not been concerned with providing documents that meet minimum
accessibility standards. PDF became one of the earliest digital formats used to distribute
documentation on the Internet; PDF files can integrate various kinds of content, such as
text, images, videos, and forms.
The rest of the article is structured as follows: Sect. 2 presents the background, Sect. 3
describes the methodology and the case study, Sect. 4 presents the results and the
discussion, and Sect. 5 presents our conclusions and proposes future work.
© The Editor(s) (if applicable) and The Author(s), under exclusive license to
Springer Nature Switzerland AG 2020
I. L. Nunes (Ed.): AHFE 2020, AISC 1207, pp. 1–9, 2020.
https://doi.org/10.1007/978-3-030-51369-6_28
2 Background and Related Work
Accessibility refers to how easily users can communicate, interact, and navigate the web.
To improve the level of accessibility, the Web Content Accessibility Guidelines 2.1 (WCAG
2.1) propose 4 principles of accessibility, 13 guidelines, and 78 success criteria, together
with sufficient and advisory techniques. The four principles of web accessibility are
1) perceivable, 2) operable, 3) understandable, and 4) robust [1].
Uebelbacher et al. [2] present the PDF Accessibility Checker 2.0, a tool that automatically
tests the 108 test conditions that can be checked fully automatically. The tool promotes PDF
accessibility among a wider group of users and has the potential to increase the compliance
of PDF documents with the respective accessibility standard.
Furthermore, Ahmetovic et al. [3] argue that accessing mathematical formulas inside digital
documents is a challenge for blind people; in particular, document formats designed for
printing, such as PDF, structure mathematical content for visual access only. While
accessibility features exist for presenting PDF content non-visually, formula support is
limited to alternative text that can be read by a screen reader or shown on a braille bar.
Moreover, inserting the alternative text is left to document creators, who rarely provide
it. At best, descriptions of formulas are supplied, which makes it almost impossible to
convey a detailed understanding of a complex formula.
Devine et al. [4] suggest that, for documents to be accessible, navigation aids such as
bookmarks should be included, which are particularly useful for longer documents. The key to
creating accessible PDF documents is to design the source document with accessibility in
mind; they suggest applying the standard ISO 32000-1:2008.
In their previous studies, Acosta-Vargas et al. [5, 6] state that, for PDF documents to be
universally accessible, the Web Content Accessibility Guidelines (WCAG) 2.0 must be applied.
The authors took as a case study the repositories of the Latin American universities with
the highest academic prestige according to the Webometrics ranking. Their assessment of the
PDFs showed that universities have not been concerned with providing accessible documents.
WCAG 2.1 provides 23 techniques for making a PDF accessible [7]; Table 1 summarizes the
success criteria associated with the PDF techniques. With the techniques recommended by WCAG
2.1, it is possible to examine the reading order of the tags, that is, how the document is
read aloud. Several validators are available for reviewing accessibility in PDFs; they
assess the accessibility of PDFs against WCAG 2.0 and the PDF/UA standard.
Table 1. Summary of the success criteria associated with PDF techniques [7].
| Success criteria | Level | PDF general techniques |
|---|---|---|
| 1.1.1 Non-text content | A | PDF1, PDF4 |
| 1.2.1 Audio-only and video-only | A | General techniques |
| 1.2.2 Captions (prerecorded) | A | General techniques |
| 1.2.3 Audio description or media alternative | A | General techniques |
| 1.2.4 Captions (live) | AA | General techniques |
| 1.2.5 Audio description | AA | General techniques |
| 1.3.1 Info and relationships | A | PDF6, PDF9, PDF10, PDF11, PDF12, PDF17, PDF20, PDF21 |
| 1.3.2 Meaningful sequence | A | PDF3 |
| 1.3.3 Sensory characteristics | A | General techniques |
| 1.4.1 Use of color | A | General techniques |
| 1.4.2 Audio control | A | General techniques |
| 1.4.3 Contrast (minimum) | AA | General techniques |
| 1.4.4 Resize text | AA | G142 |
| 1.4.5 Images of text | AA | PDF7, General techniques |
| 1.4.9 Images of text (no exception) | AAA | PDF7 |
| 2.1.1 Keyboard | A | PDF3, PDF11, PDF23 |
| 2.1.2 No keyboard trap | A | G21 |
| 2.1.3 Keyboard (no exception) | AAA | PDF3, PDF11, PDF23 |
| 2.2.1 Timing adjustable | A | PDF3, G133 |
| 2.2.2 Pause, stop, hide | A | General techniques |
| 2.3.1 Three flashes or below threshold | A | General techniques |
| 2.4.1 Bypass blocks | A | PDF9, General techniques |
| 2.4.2 Page titled | A | PDF18 |
| 2.4.3 Focus order | A | PDF3 |
| 2.4.4 Link purpose (in context) | A | PDF11, PDF13 |
| 2.4.5 Multiple ways | AA | PDF2, General techniques |
| 2.4.6 Headings and labels | AA | General techniques |
| 2.4.7 Focus visible | AA | G149, G165, G195 |
| 2.4.8 Location | AAA | PDF14, PDF17 |
| 2.4.9 Link purpose (link only) | AAA | PDF11, PDF13 |
| 3.1.1 Language of page | A | PDF16, PDF19 |
| 3.1.2 Language of parts | AA | PDF19 |
| 3.1.4 Abbreviations | AAA | PDF8 |
| 3.2.1 On focus | A | General techniques |
| 3.2.2 On input | A | PDF15 |
| 3.2.3 Consistent navigation | AA | PDF14, PDF17, G61 |
| 3.2.4 Consistent identification | AA | General techniques |
| 3.3.1 Error identification | A | PDF5, PDF22 |
| 3.3.2 Labels or instructions | A | PDF5, PDF10 |
| 3.3.3 Error suggestion | AA | PDF5, PDF22 |
| 3.3.4 Error prevention | AA | General techniques |
| 4.1.1 Parsing | A | Not Applicable: PDF |
| 4.1.2 Name, role, value | A | PDF10, PDF12 |
An additional tool is PDF Accessibility Checker 3.0, which is free and validates metadata,
tagging, security settings, bookmarks, reading order, and text contrast. This investigation
applied the PDF Accessibility Checker 3.0 (footnote 1) because it validates PDFs against
PDF/UA-1 (ISO 14289-1) [8] and WCAG 2.1 [4], offers a quick way to test the accessibility of
PDFs, and supports both experts and end users who perform accessibility evaluations.
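To see which success criteria a single PDF technique contributes to, the mapping in Table 1 can be inverted. The following is a minimal sketch that copies only a handful of rows from Table 1; the names `CRITERION_TO_TECHNIQUES` and `invert` are illustrative and are not part of WCAG or of any tool used in this study.

```python
from collections import defaultdict

# A few rows copied from Table 1: success criterion -> PDF techniques.
# This is a subset for illustration, so the inverted view is also partial.
CRITERION_TO_TECHNIQUES = {
    "1.1.1": ["PDF1", "PDF4"],
    "1.3.2": ["PDF3"],
    "2.1.1": ["PDF3", "PDF11", "PDF23"],
    "2.4.2": ["PDF18"],
    "2.4.3": ["PDF3"],
    "2.4.4": ["PDF11", "PDF13"],
    "3.1.1": ["PDF16", "PDF19"],
    "3.1.2": ["PDF19"],
}


def invert(mapping):
    """Return technique -> sorted list of success criteria that list it."""
    inverted = defaultdict(list)
    for criterion, techniques in mapping.items():
        for technique in techniques:
            inverted[technique].append(criterion)
    return {technique: sorted(criteria) for technique, criteria in inverted.items()}


TECHNIQUE_TO_CRITERIA = invert(CRITERION_TO_TECHNIQUES)
print(TECHNIQUE_TO_CRITERIA["PDF3"])   # ['1.3.2', '2.1.1', '2.4.3']
print(TECHNIQUE_TO_CRITERIA["PDF19"])  # ['3.1.1', '3.1.2']
```

In the full table, PDF3 also appears under 2.1.3 and 2.2.1, so a complete inversion would list more criteria than this subset does.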
3 Method and Case Study
The case study is applied to a random sample of 10 documents in PDF format related to the
modern architectural heritage of Quito; Table 2 contains the detail of the documents
evaluated.
Table 2. PDF documents used in accessibility evaluation.
| Id | File | Size (kB) | Title | Language | Tags | Pages |
|---|---|---|---|---|---|---|
| A | prueba_1.pdf | 453 | no title | no language | no tags | 23 |
| B | prueba_2.pdf | 3158 | no title | no language | no tags | 23 |
| C | prueba_3.pdf | 2795 | no title | no language | no tags | 23 |
| D | prueba_4.pdf | 4981 | no title | no language | no tags | 23 |
| E | prueba_5.pdf | 2137 | no title | no language | no tags | 23 |
| F | prueba_6.pdf | 355 | no title | es-ES | 525 | 12 |
| G | prueba_7.pdf | 16670 | no title | no language | no tags | 92 |
| H | prueba_8.pdf | 671 | no title | no language | no tags | 8 |
| I | prueba_9.pdf | 11328 | Yes | es-ES | 5519 | 130 |
| J | prueba_10.pdf | 2910 | no title | es-ES | 50 | 16 |
1 https://www.access-for-all.ch/en/pdf-lab/pdf-accessibility-checker-pac.html.
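The three document-level properties in Table 2 (title, language, and tags) can be screened with a few lines of code before any detailed review. This is a minimal sketch over the values transcribed from Table 2; the `PdfRecord` class and `missing_properties` function are hypothetical names introduced here, and the sketch does not read the PDF files themselves.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PdfRecord:
    doc_id: str
    has_title: bool
    language: Optional[str]  # None when no language is declared
    tags: int                # 0 when the document has no tags


# Values transcribed from Table 2.
RECORDS = [
    PdfRecord("A", False, None, 0),
    PdfRecord("B", False, None, 0),
    PdfRecord("C", False, None, 0),
    PdfRecord("D", False, None, 0),
    PdfRecord("E", False, None, 0),
    PdfRecord("F", False, "es-ES", 525),
    PdfRecord("G", False, None, 0),
    PdfRecord("H", False, None, 0),
    PdfRecord("I", True, "es-ES", 5519),
    PdfRecord("J", False, "es-ES", 50),
]


def missing_properties(record: PdfRecord) -> List[str]:
    """List which of title, language, and tags the document lacks."""
    missing = []
    if not record.has_title:
        missing.append("title")
    if record.language is None:
        missing.append("language")
    if record.tags == 0:
        missing.append("tags")
    return missing


for record in RECORDS:
    print(record.doc_id, missing_properties(record) or "nothing missing")

# Documents that lack all three properties.
LACK_ALL = [r.doc_id for r in RECORDS if len(missing_properties(r)) == 3]
print(LACK_ALL)
```

Only document I declares all three properties; seven of the ten documents declare none of them.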
The method applied to evaluate accessibility in PDFs comprises five phases, as presented in
Fig. 1.
Fig. 1. Method to assess accessibility in PDFs.
Phase 1: Select the random sample of PDF documents. In this phase, we randomly selected ten
documents in PDF format that contain information related to the modern architectural
heritage of Quito; the evaluated documents are detailed in Table 2.
Phase 2: Review with PDF Accessibility Checker. We reviewed each document with PDF
Accessibility Checker 3.0, version 3.0.7.0. The tests performed are available in a data set
in the Mendeley repository (footnote 2).
Phase 3: Record the results. In Table 3, we record the evaluation data; the tests are
available in the Mendeley repository so that the experiment can be reproduced. Table 3
contains the number of barriers found in each evaluated PDF document, broken down by error
category.
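Table 3. PDF documents failed.
| PDF (failed) | A | B | C | D | E | F | G | H | I | J | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Embedded files | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Metadata | 4 | 4 | 4 | 4 | 4 | 6 | 0 | 4 | 4 | 0 | 34 |
| Document settings | 4 | 4 | 4 | 4 | 4 | 2 | 2 | 2 | 14 | 4 | 44 |
| Fonts | 0 | 0 | 0 | 0 | 0 | 6 | 0 | 24 | 32 | 0 | 62 |
| Structure elements | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 332 | 4 | 336 |
| PDF syntax | 22 | 0 | 22 | 22 | 0 | 0 | 186 | 18 | 5791 | 86 | 6147 |
| Structure tree | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 9998 | 4 | 10002 |
| Role mapping | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10752 | 196 | 10948 |
| Alternative descriptions | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 21608 | 198 | 21806 |
| Natural language | 916 | 10724 | 920 | 926 | 10296 | 0 | 0 | 0 | 109872 | 36 | 133690 |
| Content | 918 | 11368 | 922 | 928 | 10770 | 0 | 0 | 4692 | 246230 | 138 | 275966 |
2 https://data.mendeley.com/datasets/83n9xvgfcr/2.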
Phase 4: Analyze the results. In this phase, we analyze the outcomes for the PDFs; Fig. 2
presents a summary of the analyzed PDF documents, showing the parameters that fail and
represent an accessibility barrier for users. The largest number of failures corresponds to
Content, followed by Natural language and Alternative descriptions.
Fig. 2. Parameters of failed PDF documents.
Table 4 shows the checks that pass the accessibility verification for each parameter; the
lowest counts correspond to Embedded files, with zero (0), followed by Metadata.
Table 4. PDF documents passed.
| PDF (passed) | A | B | C | D | E | F | G | H | I | J | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Embedded files | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Metadata | 2 | 2 | 2 | 2 | 2 | 0 | 0 | 4 | 4 | 0 | 18 |
| Document settings | 2 | 2 | 2 | 2 | 2 | 4 | 2 | 2 | 14 | 4 | 36 |
| Structure elements | 0 | 0 | 0 | 0 | 0 | 4 | 0 | 0 | 332 | 4 | 340 |
| Fonts | 0 | 380 | 0 | 0 | 0 | 14 | 0 | 24 | 32 | 0 | 450 |
| PDF syntax | 26 | 48 | 26 | 26 | 26 | 553 | 186 | 18 | 5791 | 86 | 6786 |
| Structure tree | 0 | 0 | 0 | 0 | 0 | 1048 | 0 | 0 | 9998 | 4 | 11050 |
| Role mapping | 0 | 0 | 0 | 0 | 0 | 1146 | 0 | 0 | 10752 | 196 | 12094 |
| Alternative descriptions | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 21608 | 198 | 21806 |
| Natural language | 0 | 0 | 0 | 0 | 0 | 24524 | 0 | 0 | 109872 | 36 | 134432 |
| Content | 916 | 10724 | 920 | 926 | 926 | 49768 | 0 | 4692 | 246230 | 138 | 315240 |
Phase 5: Suggest improvements. To ensure that PDF documents achieve an acceptable degree of
accessibility, we suggest the following: 1) apply the same criteria as on the web, that is,
only images that are not decorative should have alternative text; 2) create the PDF so that
bookmarks are generated automatically, which requires a well-structured source document;
3) tag tables correctly with the tags Table, TR, TH, and TD; 4) define the links before
tagging the document; and 5) include relevant information in headers and footers
consistently throughout the entire document.
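Suggestions 1) and 3) lend themselves to an automated check. The sketch below is only an illustration under stated assumptions: it walks a toy structure tree written as nested Python dictionaries, which is a hypothetical representation and not the API of PDF Accessibility Checker or of any PDF library. It assumes that decorative images are left out of the tree, so every Figure node is expected to carry alternative text, and it checks only direct TR children of a Table and TH or TD children of a TR.

```python
def find_issues(node, path="Document"):
    """Collect violations of suggestions 1) and 3) in a toy structure tree."""
    issues = []
    kind = node.get("type")
    children = node.get("children", [])

    # Suggestion 1: non-decorative images need alternative text.
    if kind == "Figure" and not node.get("alt"):
        issues.append(f"{path}: Figure without alternative text")

    # Suggestion 3: tables are built from TR rows holding TH or TD cells.
    if kind == "Table":
        for child in children:
            if child.get("type") != "TR":
                issues.append(f"{path}: Table contains {child.get('type')}, expected TR")
    if kind == "TR":
        for child in children:
            if child.get("type") not in {"TH", "TD"}:
                issues.append(f"{path}: TR contains {child.get('type')}, expected TH or TD")

    # Recurse, extending the path so each message can be located in the tree.
    for index, child in enumerate(children):
        issues.extend(find_issues(child, f"{path}/{child.get('type')}[{index}]"))
    return issues


# Hypothetical example document: one untagged figure and one malformed row.
EXAMPLE_TREE = {
    "type": "Document",
    "children": [
        {"type": "Figure", "alt": "Facade of a modernist building"},
        {"type": "Figure"},
        {
            "type": "Table",
            "children": [
                {"type": "TR", "children": [{"type": "TH"}, {"type": "TH"}]},
                {"type": "TR", "children": [{"type": "TD"}, {"type": "P"}]},
            ],
        },
    ],
}

ISSUES = find_issues(EXAMPLE_TREE)
for issue in ISSUES:
    print(issue)
```

A real tagged PDF allows further table elements, such as THead, TBody, and TFoot, which this sketch would flag; a production check would need to accept them.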
4 Results and Discussion
In Fig. 2, we observe that the PDF documents are not compliant with PDF/UA; 60% of the
failures are related to Content, 29% to Natural language, and 5% to Alternative
descriptions. Natural language failures, the second most frequent, occur when the language
of a document's content cannot be identified; as a result, speech synthesizers and braille
devices cannot automatically switch to a new language. We also suggest considering the
requirements that make multimedia and image resources accessible to the largest number of
users, as reviewed in [9]. Finally, we suggest considering heuristic methods [10] related to
web accessibility and the type of disability of end users.
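The percentages quoted above can be recomputed from the Total column of Table 3. The following is a minimal sketch; the names `FAILED_TOTALS` and `shares` are illustrative, and the figures are transcribed from Table 3.

```python
# Total column of Table 3 (failed checks per category).
FAILED_TOTALS = {
    "Embedded files": 0,
    "Metadata": 34,
    "Document settings": 44,
    "Fonts": 62,
    "Structure elements": 336,
    "PDF syntax": 6147,
    "Structure tree": 10002,
    "Role mapping": 10948,
    "Alternative descriptions": 21806,
    "Natural language": 133690,
    "Content": 275966,
}


def shares(totals):
    """Return each category's share of all failures, in percent."""
    grand_total = sum(totals.values())
    return {name: 100 * count / grand_total for name, count in totals.items()}


SHARES = shares(FAILED_TOTALS)
top_three = sorted(SHARES.items(), key=lambda item: item[1], reverse=True)[:3]
for name, percent in top_three:
    print(f"{name}: {percent:.1f}%")
# Content: 60.1%
# Natural language: 29.1%
# Alternative descriptions: 4.8%
```

The grand total is 459035 failed checks, and the three shares round to the 60%, 29%, and 5% reported in the text.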
In Fig. 3, we observe that the documents with the largest number of failures are those with
identifiers B, E, and I.
Fig. 3. Parameters of failed PDF documents.
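The ranking of documents by number of failures can be reproduced by summing each column of Table 3. This is a minimal sketch with the values transcribed from Table 3; `FAILED`, `DOCS`, and `failures_per_document` are illustrative names.

```python
DOCS = "ABCDEFGHIJ"

# Rows of Table 3 (failed checks), one value per document A to J.
FAILED = {
    "Embedded files":           [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "Metadata":                 [4, 4, 4, 4, 4, 6, 0, 4, 4, 0],
    "Document settings":        [4, 4, 4, 4, 4, 2, 2, 2, 14, 4],
    "Fonts":                    [0, 0, 0, 0, 0, 6, 0, 24, 32, 0],
    "Structure elements":       [0, 0, 0, 0, 0, 0, 0, 0, 332, 4],
    "PDF syntax":               [22, 0, 22, 22, 0, 0, 186, 18, 5791, 86],
    "Structure tree":           [0, 0, 0, 0, 0, 0, 0, 0, 9998, 4],
    "Role mapping":             [0, 0, 0, 0, 0, 0, 0, 0, 10752, 196],
    "Alternative descriptions": [0, 0, 0, 0, 0, 0, 0, 0, 21608, 198],
    "Natural language":         [916, 10724, 920, 926, 10296, 0, 0, 0, 109872, 36],
    "Content":                  [918, 11368, 922, 928, 10770, 0, 0, 4692, 246230, 138],
}


def failures_per_document(table):
    """Sum every category for each document column."""
    return {
        doc: sum(row[index] for row in table.values())
        for index, doc in enumerate(DOCS)
    }


PER_DOC = failures_per_document(FAILED)
RANKING = sorted(PER_DOC, key=PER_DOC.get, reverse=True)
print(RANKING[:3])  # ['I', 'B', 'E']
print(PER_DOC["I"], PER_DOC["B"], PER_DOC["E"])  # 404633 22100 21074
```

Document I alone accounts for 404633 of the 459035 failures, so the aggregate percentages are dominated by that single 130-page file.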
Figure 4 presents a summary of the documents analyzed with PDF
Accessibility Checker 3.0; the most common errors are related to Content and
Natural language.
Fig. 4. Detail of the documents analyzed.
5 Conclusions and Future Works
This study recommends creating accessible PDFs by applying the PDF techniques of WCAG 2.1.
To generate more inclusive documents, we propose using PDF Accessibility Checker 3.0,
version 3.0.7.0. The study can serve as a starting point for future work on producing more
accessible PDFs. We also suggest conducting accessibility tests and correcting errors in PDF
documents before sharing them in digital repositories. Furthermore, we suggest applying
accessibility tools for PDFs in the design of architectural plans, which would foster
innovation in this area and improve access for a large number of users with disabilities.
Finally, we recommend that libraries develop access to digital documents so that they can
raise accessibility from an international communication viewpoint by applying the criteria
of WCAG 2.1.
Acknowledgments. The researchers thank Universidad de Las Américas - Ecuador, for
funding this study through projects UDLA FGE.PAV.19.11, and ARQ.AMG.1802.
References
1. World Wide Web Consortium: Web Content Accessibility Guidelines (WCAG) 2.1.
https://www.w3.org/TR/WCAG21/
2. Uebelbacher, A., Bianchetti, R., Riesch, M.: PDF accessibility checker (PAC 2): the
first tool to test PDF documents for PDF/UA compliance. In: International
Conference on Computers for Handicapped Persons, pp. 197–201. Springer, Cham
(2014)
3. Ahmetovic, D., Armano, T., Bernareggi, C., Berra, M., Capietto, A., Coriasco, S.,
Murru, N., Ruighi, A., Taranto, E.: Axessibility: a LaTeX package for mathematical
formulae accessibility in PDF documents. In: Proceedings of the 20th International
ACM SIGACCESS Conference on Computers and Accessibility, pp. 352–354.
Association for Computing Machinery, New York, NY, USA (2018)
4. Devine, H., Gonzalez, A., Hardy, M.: Making accessible PDF documents. In:
Proceedings of the 11th ACM Symposium on Document Engineering, pp. 275–276.
Association for Computing Machinery, New York (2011)
5. Acosta-Vargas, P., Luján-Mora, S., Acosta, T.: Accessibility of portable document
format in education repositories. In: ACM International Conference Proceeding
Series, pp. 239–242 (2017)
6. Acosta-Vargas, P., Luján-Mora, S., Acosta, T., Salvador, L.: Accesibilidad de
documentos PDF en repositorios educativos de Latinoamérica. In: Congreso
Internacional sobre Aplicación de Tecnologías de la Información y Comunicaciones
Avanzadas, pp. 239–246 (2017)
7. World Wide Web Consortium (W3C): Techniques for WCAG 2.1.
https://www.w3.org/WAI/WCAG21/Techniques/
8. ISO: Document management applications - Electronic document file format
enhancement for accessibility - Part 1: Use of ISO 32000-1 (PDF/UA-1).
https://www.iso.org/standard/54564.html
9. Acosta-Vargas, P., Esparza, W., Rybarczyk, Y., González, M., Villarreal, S., Jadán,
J., Guevara, C., Sanchez-Gordon, S., Calle-Jimenez, T., Baldeon, J.: Educational
resources accessible on the tele-rehabilitation platform. In: International Conference
on Applied Human Factors and Ergonomics, pp. 210–220. Springer (2018)
10. Acosta-Vargas, P., Salvador-Ullauri, L., Luján-Mora, S.: A heuristic method to
evaluate web accessibility for users with low vision. IEEE Access 7, 125634–125648
(2019). https://doi.org/10.1109/ACCESS.2019.2939068