Universal Accessibility as a Multimodal Design Issue
Zeljko Obrenovic, Julio Abascal, Dusan Starcevic
In recent years, many research activities have focused on design that aims to produce
universally accessible systems, taking into account the special needs of various user groups.
These needs are associated with various factors, including speech, motor, hearing, and
vision impairments, cognitive limitations, emotional and learning disabilities, and aging,
as well as with various environmental factors.
Fields that address this problem, such as Usability, Universal Accessibility, Universal
Design, or Inclusive Design [3], have developed as relatively independent domains, but
they share much with other HCI disciplines. However, researchers and practitioners
are often unaware of the interconnections between the concepts of universal accessibility and
"ordinary" HCI. In view of that, in this paper we show that there is a fundamental
connection between multimodal interface design and universal accessibility, and that
awareness of these links can benefit both disciplines. Researchers in these areas may use
different terminology, but they often mean essentially the same things. Based on these ideas,
we propose a unified generic framework in which these areas can be joined.
Accessibility and Multimodal Interaction
Universal accessibility and related approaches, such as "Inclusive Design" or "Design for All",
aim to produce systems that can be used by everyone, regardless of their physical or
cognitive skills. As this design philosophy tends to enhance the usability of a product, it can
also be extremely valuable for non-disabled users trying to use a system under suboptimal
conditions. The growing interest in accessibility and universal usability for information
and communications technologies has resulted in various solutions that developers can use.
For example, many guidelines about accessibility, especially for Web design, are already
available [10]. In addition, conferences such as the ACM Conference on Universal Usability
(CUU) and the ACM Conference on Assistive Technologies (ASSETS), as well as journals such
as the International Journal on Universal Access in the Information Society, offer good sources of
practical and theoretical work in this area. Developers can also use various practical
solutions and tools, such as Web site compliance checkers, semi-automatic Web site repair
tools, and Web adaptation facilities that transform existing Web content "on the fly". There are
also ongoing activities in developing tools that use guidelines to automatically verify Web
accessibility [1].
Multimodal interaction is a characteristic of everyday human discourse, in which we speak,
shift eye gaze, gesture, and move in an effective flow of communication. Enriching human-
computer interaction with these elements of natural human behavior is the primary task of
multimodal user interfaces. Many studies have explored multimodal interaction from
different viewpoints [2]. Sharon Oviatt gave a practical definition of multimodal systems,
saying that they combine human natural input modalities, such as speech, pen, touch, hand
gestures, eye gaze, and head and body movements, in a coordinated manner with
multimedia system output [6]. Matthew Turk and George Robertson further refined the
difference between multimedia and multimodal systems, saying that multimedia research
focuses on the media, while multimodal research focuses on the human perceptual channels
[9]. Multimodal interfaces can improve accessibility for diverse users and usage contexts,
and advance the performance stability, robustness, expressive power, and efficiency of
human-computer interaction [6].
While multimodal interaction research focuses on adding more natural human
communication channels to human-computer interaction (HCI), accessibility research
looks for substitute ways of communication when some of these channels, due to various
restrictions, are of limited bandwidth. The difference between the two areas lies in the
focus of their research. Many concepts from both areas can therefore be generalized into a
unified, more abstract view. In this way, existing solutions from one domain can find use
in the other.
The Unified Framework
Treating user interfaces as multimodal systems can clearly help design for universal
accessibility, as multimodal interfaces describe human-computer interaction in terms of
modalities, i.e., in terms of communication channels established between the computer and
the user. Limiting environmental characteristics or limited user abilities can then be viewed
as a break in, or a decrease of throughput of, these channels (Figure 1).
Figure 1. Modalities, constraints, and effects. Computers and humans establish various
communication channels over which they exchange messages or effects. Modalities process or
produce these effects, while various interaction constraints reduce or completely eliminate some of
them.
If we describe user interfaces as sets of communication channels, and connect these
descriptions with user, environment, and device profiles, we can easily see whether a
multimodal interface will be appropriate for the user in a specific situation. However, to
create a unified view of multimodal system design and accessibility, we need a semantic
framework in which we can explicitly and formally establish relations among concepts from
both domains.
Therefore, our first step was to formally define a unified modeling framework that
describes multimodal human-computer interaction and various user and environment
characteristics in the same terms. The proposed framework does not define any specific
interaction modality, such as speech, gesture, or graphics, nor any specific constraint, such as
low vision, immobility, or particular environmental conditions; instead, it defines a generic
unified approach for describing such concepts. The framework therefore focuses on the
notions of an abstract modality and an abstract constraint, defining their common
characteristics regardless of their specific manifestations. This work extends our
previous work in modeling multimodal human-computer interaction [4].
The Model of Multimodal HCI and Constraints
Our approach is based on the idea that user interfaces can be viewed as one-shot, higher-
order messages sent from designers to users [7]. When designing a user interface, the
designer defines an interactive language that determines which effects and levels will be
included in the interaction. Therefore, we model user interfaces with the modalities they use,
where we define a modality as a form of interaction designed to engage some human
capability, i.e., to produce effects on users, or to process effects produced by the user
(Figure 2a). In our model, modalities can be simple or complex: a complex modality integrates
other modalities for simultaneous use, e.g. to provide modality fusion or
fission mechanisms, while a simple modality represents a primitive form of interaction. In this
paper we do not focus on a detailed description of multimodal integration, but on the high-
level effects that a modality or some of its parts use. We defined input and output
types of simple modality, using the computer as the reference point. An input modality
requires some user devices to transfer human output into a form suitable for computer
processing, and we classified input modalities into event-based and streaming-based classes.
Event-based input modalities produce discrete events in reaction to user actions, as in the
case of user input via a keyboard or mouse. Streaming-based modalities sample input signals
with some resolution and frequency, producing a time-stamped array of sampled values. We
introduced a special class of streaming modality, the recognition-based modality, which adds
processing over the streaming data, searching for patterns. An output modality
presents data to the user, and this presentation can be static or dynamic. A more elaborate
description of this model can be found in [4].
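To make this model concrete, the following Python sketch shows one possible encoding of the modality hierarchy described above. It is an illustration only, not part of the framework itself; all class and attribute names are our own assumptions.

```python
# Illustrative sketch of the modality model (Figure 2a); names are ours.
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class EffectType(Enum):
    SENSORY = "sensory"
    PERCEPTUAL = "perceptual"
    MOTOR = "motor"
    LINGUISTIC = "linguistic"
    COGNITIVE = "cognitive"


@dataclass
class Effect:
    name: str
    kind: EffectType


@dataclass
class Modality:
    # A form of interaction designed to produce effects on the user,
    # or to process effects produced by the user.
    name: str
    effects: List[Effect] = field(default_factory=list)

    def all_effects(self) -> List[Effect]:
        return list(self.effects)


@dataclass
class ComplexModality(Modality):
    # Integrates other modalities (fusion or fission of simpler ones).
    parts: List[Modality] = field(default_factory=list)

    def all_effects(self) -> List[Effect]:
        collected = list(self.effects)
        for part in self.parts:
            collected.extend(part.all_effects())
        return collected


# Simple modality types, with the computer as the reference point:
class EventBasedInput(Modality):              # discrete events (keyboard, mouse)
    pass

class StreamingInput(Modality):               # time-stamped arrays of samples
    pass

class RecognitionBasedInput(StreamingInput):  # pattern search over a stream
    pass

class OutputModality(Modality):               # static or dynamic presentation
    pass
```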
While we describe human-computer interaction in terms of modalities, we describe various
accessibility issues in terms of interaction constraints (Figure 2b). Interaction constraints can
be viewed as filters on the usage of some effects. Constraints are organized as basic and complex.
We have identified two types of basic constraints: user constraints and external constraints.
User constraints are classified into user features, user states, and user preferences. User features
describe the long-term ability of a user to exploit some of the effects, and this description can
include user disabilities, such as low vision or immobility. A user state constraint,
further classified into emotional and cognitive context, describes the user's temporary ability
to use some effects. User preferences describe how eager the user is to make use of some
effects, i.e., a subjective mark of the effects the user prefers or dislikes.
Figure 2. Simplified model of computing modalities (a) and constraints (b).
External constraints are categorized as device constraints, environment constraints, and social
context. Device constraints describe restrictions on the usage of some effects that are a
consequence of device characteristics. For example, a mouse is limited to capturing movement
in two-dimensional space with some resolution, while output devices, such as screens on
PDAs and other mobile devices, have limited resolution and a limited number of colors.
Environmental constraints describe how the interaction environment influences the effects. For
example, when driving a car, in most situations users are not able to watch the screen,
which greatly reduces the usage of visual effects. In addition, various other environmental
factors, such as lighting or noise, greatly affect the usage of other effects. Social context
describes the social situation in which the interaction occurs. The proposed model allows
flexible definition of various simple and complex constraints of different types.
The resulting constraint in a particular situation will be a combination of the user's state,
abilities, and preferences, as well as of various external factors relevant to that situation.
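Continuing the earlier sketch (with the same caveat that all names and the combination rule are our own assumptions), constraints can be encoded as filters that map each affected effect to the fraction of its usability that remains:

```python
# Illustrative sketch of the constraint model (Figure 2b); names are ours.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Constraint:
    # A filter on the usage of effects: maps an effect name to the fraction
    # of its usability that remains (1.0 = intact, 0.0 = eliminated).
    name: str
    reductions: Dict[str, float] = field(default_factory=dict)

    def factor(self, effect_name: str) -> float:
        return self.reductions.get(effect_name, 1.0)


# Basic constraint types from the model:
class UserFeature(Constraint):            # long-term abilities (e.g., low vision)
    pass

class UserState(Constraint):              # temporary emotional/cognitive context
    pass

class UserPreference(Constraint):         # subjective preference for effects
    pass

class DeviceConstraint(Constraint):       # e.g., a small, low-color PDA screen
    pass

class EnvironmentConstraint(Constraint):  # e.g., noise, lighting
    pass

class SocialContext(Constraint):          # the social situation of the interaction
    pass


@dataclass
class ComplexConstraint(Constraint):
    # Combines simpler constraints; here the remaining usability of an effect
    # is modeled (as an assumption) as the product of the individual factors.
    parts: List[Constraint] = field(default_factory=list)

    def factor(self, effect_name: str) -> float:
        result = self.reductions.get(effect_name, 1.0)
        for part in self.parts:
            result *= part.factor(effect_name)
        return result
```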
Common ground: The effects
Entities that connect modalities and constraints in our model are effects. We have classified
the effects used by modalities and affected by constraints into five main categories:
• Sensory effects,
• Perceptual effects,
• Motor effects,
• Linguistic effects, and
• Cognitive effects.
These effects are based on various sources, such as the World Health Organization's
International Classification of Functioning, Disability and Health. In our model, these
concepts are subclasses of the Effect class presented in Figure 2. Sensory effects describe the
processing of stimuli performed by the human sensory apparatus. Perceptual effects are more
complex effects that the human perceptual system obtains by analyzing data received from
the senses, such as shape recognition, grouping, highlighting, or 3D cues. Motor effects
describe human mechanical action, such as hand movement or pressure. Linguistic effects are
associated with human speech, listening, reading, and writing. Cognitive effects take place at
higher levels of human information processing, such as memory processes or attention.
Effects are often interconnected. For example, all perceptual effects are a consequence of
sensory effects. These relations among effects are important because they let a designer
see which side effects an intention to use some effects will cause.
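This propagation of consequences along effect relations can be sketched as a simple closure computation. The dependency table below is a small hypothetical example of our own, not the framework's actual effect database:

```python
# Illustrative sketch: relations among effects, and the consequences of
# losing an effect. The dependency table is a hypothetical example.
from typing import Dict, Set

# effect -> effects it is built upon (perceptual effects presuppose sensory ones)
DEPENDS_ON: Dict[str, Set[str]] = {
    "shape recognition": {"visual sensory processing"},
    "grouping by proximity": {"visual sensory processing"},
    "audio 3D cues": {"audio sensory processing"},
    "listening": {"audio sensory processing"},
}


def unavailable_effects(lost: Set[str]) -> Set[str]:
    # Close the set of lost effects under the dependency relation.
    result = set(lost)
    changed = True
    while changed:
        changed = False
        for effect, prerequisites in DEPENDS_ON.items():
            if effect not in result and prerequisites & result:
                result.add(effect)
                changed = True
    return result


# Losing audio sensory processing also rules out the dependent perceptual
# effects: audio 3D cues and listening.
print(unavailable_effects({"audio sensory processing"}))
```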
Using the Framework
The proposed unified framework can be used to describe various interaction modalities and
interaction constraints. By combining these descriptions, and by using effects as a common
ground, it is possible to see whether a designed interface will be appropriate for a concrete
situation, and to enable adaptation of user interfaces according to user profiles and
environment conditions.
Describing user interfaces, users and environment
We model user interfaces with their modalities, where we describe each modality with the
effects it uses in order to be operative. For example, Table 1 shows effects produced by some
common simple or complex modalities: simple text presentation, aimed hand movement,
visual menu interaction, and speech interaction.
Table 1. A list of some of the modalities and associated effects.

Simple text presentation (output modality), composed of:
• Pixel: visual sensory processing, correct central field vision, normal vision sharpness (sensory)
• Word: shape recognition of grouped pixels, grouping of letters by proximity, shape recognition of words (perceptual)
• Text line: grouping of words by good continuation (perceptual)
• Paragraph: grouping of lines by proximity, highlighting by the shape of the first line (perceptual)

Aimed hand movement (input modality):
• Hand movement, pressure (motor)
• Highlighting by shape of the cursor, highlighting by motion, highlighting by depth (cursor shadow) (perceptual)

Visual menu interaction, composed of:
• Selection: see aimed hand movement
• Grouping by surrounding of menu borders, grouping of items by proximity (perceptual)
• Highlighting by shape and color (selected item) (perceptual)
• Visual reading of item text, understanding of the menu language (linguistic)

Speech interaction:
• Speech input: speaking (linguistic)
• Speech output: listening (linguistic)
A concrete user interface can then be described using these high-level descriptions of
modalities, while a detailed description of the effects used can be derived automatically
through mappings such as those shown in Table 1. It is also possible to have several alternative
mappings between modalities and effects, according to different theories. For example, simple
textual presentation in Table 1 is described according to Gestalt psychology, but it is also
possible to describe these modalities according to other theories.
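Reusing the classes from the first sketch, a fragment of Table 1 could be encoded as follows (again purely for illustration; the effect names are taken from the table, while the structure is our assumption):

```python
# Illustrative sketch: a concrete interface described at the modality level,
# with its effect list derived automatically (cf. Table 1). Reuses Modality,
# ComplexModality, Effect, and EffectType from the earlier sketch.
text_presentation = ComplexModality(
    name="simple text presentation",
    parts=[
        Modality("pixel", [
            Effect("visual sensory processing", EffectType.SENSORY),
            Effect("correct central field vision", EffectType.SENSORY),
            Effect("normal vision sharpness", EffectType.SENSORY),
        ]),
        Modality("word", [
            Effect("shape recognition of grouped pixels", EffectType.PERCEPTUAL),
            Effect("grouping of letters by proximity", EffectType.PERCEPTUAL),
        ]),
        Modality("text line", [
            Effect("grouping of words by good continuation", EffectType.PERCEPTUAL),
        ]),
    ],
)

# The full set of effects this interface relies on:
required_effects = {e.name for e in text_presentation.all_effects()}
```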
Accessibility issues, such as user abilities and environmental constraints, are described with
the constraints. User abilities can be described in several ways. For example, one approach is
to create an individual profile of each user, associating all the effects with values describing the
user's capability to exploit them. For simplicity, the profile could include only the effects
that differ from those of a typical user. Alternatively, it is possible to define a repository of
user ability categories, where each category is described with the set of effects that it reduces.
These categories can describe some disability, or other factors, such as the average abilities of
different age groups. For example, Table 2 shows effects that are reduced by some
disabilities. In modeling and analyzing user interfaces, the relations among effects play a
very important role. If, for example, we state that the user is not capable of processing sound,
it means that not only the sensory effects but also all the audio perceptual effects will be
inappropriate for that user.
Table 2. A list of some of the disabilities and associated effects (each disability described as a user ability constraint, with the effects it reduces).

• Blindness: absence of all visual stimulus processing, and therefore of the associated visual perceptual effects
• Poor acuity (poor sharpness): reduced visual sharpness
• Clouded vision: reduced visual sharpness
• Tunnel vision: reduced peripheral vision
• Central field loss: reduced central vision
• Color blindness: reduced color sensation and contrast processing
• Deafness: absence of all audio stimulus processing and of the associated audio perceptual effects
• Hard of hearing: reduced audio sensory processing and associated audio perceptual effects
• Weakness: reduced movement and pressure
• Limitations of muscular control: reduced movement and pressure
• Limitations of sensation: reduced pressure
• Joint problems: reduced movement
• Pain associated with movement: reduced movement
• Dyslexia: reduced linguistic effects
• Attention Deficit Disorder: reduced attention
• Memory impairments: reduced memory processes
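A disability category from Table 2 can be sketched as a reusable constraint, combined with the dependency closure from the earlier sketch; the numeric reduction values below are our own assumptions:

```python
# Illustrative sketch: disability categories as reusable constraints
# (cf. Table 2). Reuses UserFeature and unavailable_effects from the
# earlier sketches; reduction values are assumptions.
blindness = UserFeature(
    name="blindness",
    reductions={"visual sensory processing": 0.0},  # effect eliminated
)
hard_of_hearing = UserFeature(
    name="hard of hearing",
    reductions={"audio sensory processing": 0.3},   # assumed residual level
)

# An individual profile lists only deviations from a typical user. The
# dependency closure then yields the perceptual effects lost as a
# consequence of the eliminated sensory effect:
lost = {name for name, factor in blindness.reductions.items() if factor == 0.0}
print(unavailable_effects(lost))
# visual sensory processing, shape recognition, grouping by proximity
```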
In a similar way, we can describe constraints introduced by environmental conditions. For
example, driving a car is a complex constraint that integrates different user and
environmental parameters. Table 3 shows a simplified description of this constraint, in which
we identified that it depends on the traffic situation, weather conditions, noise level, lighting,
the user's current state, and the number of people in the car. Constraints can also be
interconnected: for example, lighting and weather conditions can affect the user's current
state, while the number of people in the car can influence the noise level. Such a description
can be useful in determining which modalities to use in a particular situation. When the car is
stopped, it is possible to use the user's central field vision, as well as other effects, to a
greater extent. A traffic jam, on the other hand, further limits the possible usage of these
effects, allowing their use only at a very low level.
Table 3. Driving a car as a complex interaction constraint, composed of various simpler
constraints, with their influence on the usage of effects.

Traffic situation:
• Car stopped: no specific reductions.
• Normal traffic situation: it is not convenient to require the user to watch the screen; the user's central field vision is directed toward the road.
• Traffic jam: in addition to the normal traffic situation, usage of the user's attention is further limited, as the user is more focused and stressed.

Noise level:
• Insignificant noise level: no specific reductions.
• Normal noise level: audio effects, such as audio 3D cues, can be used, provided that they are of sufficient intensity.
• High noise level: all audio effects are significantly reduced.

Lighting:
• Day: no specific reductions.
• Night: driving conditions are tougher; the user is more focused and stressed.

Weather conditions:
• Dry: no specific reductions.
• Rain or snow: driving conditions are tougher; the user is more focused and stressed.

User current state (emotional context):
• The driver is relaxed: no specific reductions.
• The driver is stressed: limited usage of attention requests and complex interaction modalities.

Number of people in the car:
• The driver is alone: no specific reductions.
• The driver is not alone: other passengers can use the application; can affect the noise level.
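A fragment of Table 3 could be sketched as a complex constraint that combines simpler ones (reusing the constraint classes from the earlier sketch; all numeric values are our own assumptions for illustration):

```python
# Illustrative sketch: "driving a car" as a complex constraint composed of
# simpler constraints, following Table 3. Reduction values are assumptions.
driving = ComplexConstraint(
    name="driving a car",
    parts=[
        EnvironmentConstraint("normal traffic situation",
                              {"correct central field vision": 0.1,
                               "attention": 0.4}),
        EnvironmentConstraint("normal noise level",
                              {"audio 3D cues": 0.5}),
        UserState("driver stressed",
                  {"attention": 0.5}),
    ],
)

# Attention is reduced by both the traffic situation and stress: 0.4 * 0.5 = 0.2
print(driving.factor("attention"))
print(driving.factor("correct central field vision"))  # 0.1
```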
Analysis and Transformations
The presented descriptions of multimodal interfaces and interaction constraints can be used for
various purposes. In their simpler form, they can serve as metadata about a user interface, or
as part of a user profile. With formal descriptions of user interfaces and constraints, however,
it is possible to develop tools that analyze and transform content in order to see whether it is
suitable for some situation or for some user.
We have developed a Web service that formalizes the proposed framework, creating
a database of effects and standard descriptions of modalities and constraints. This service
can receive the description of a user interface, expressed in terms of modalities, and then
evaluate it, for example, by giving the list of effects it uses, a list of potential problems in
some environments, or the list of user groups that could have problems using the
interface. To increase awareness of the importance of accessibility aspects, these reports
also contain data about the percentage of people who suffer from some interaction
limitation (for example, about 8% of men and 0.4% of women have some form of
color blindness).
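The kind of analysis this service performs can be sketched as follows. This is our own simplified stand-in, not the actual service API; it reuses the classes and example objects from the earlier sketches:

```python
# Illustrative sketch: checking a modality description against a constraint
# and reporting effects whose usability falls below a threshold.
from typing import List, Tuple


def problematic_effects(ui: Modality, constraint: Constraint,
                        threshold: float = 0.5) -> List[Tuple[str, float]]:
    # Effects the interface needs whose remaining usability falls below
    # the threshold under the given constraint.
    report = []
    for effect in ui.all_effects():
        remaining = constraint.factor(effect.name)
        if remaining < threshold:
            report.append((effect.name, remaining))
    return report


# Text presentation is problematic while driving:
print(problematic_effects(text_presentation, driving))
# [('correct central field vision', 0.1)]
```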
Various other applications, such as dynamic adaptation and content repurposing, are also
possible. By connecting descriptions of user interfaces, user profiles, and other constraints,
we can analyze and transform content in various ways. The proposed framework, therefore,
can be a good basis for adaptation and content repurposing, which attack the problem of
developing content for various users and devices. A more detailed description of our previous
work in this area can be found in [5].
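As a final illustration (our own minimal sketch, not the repurposing mechanism described in [5]), a basic adaptation step could substitute a modality whose effects survive the active constraints for one whose effects do not:

```python
# Illustrative sketch: pick a modality alternative whose effects remain
# usable under a constraint. Reuses the earlier classes and examples.
from typing import List, Optional


def choose_modality(alternatives: List[Modality], constraint: Constraint,
                    threshold: float = 0.5) -> Optional[Modality]:
    # Return the first alternative all of whose effects remain usable.
    for modality in alternatives:
        if all(constraint.factor(e.name) >= threshold
               for e in modality.all_effects()):
            return modality
    return None


speech_output = Modality("speech output",
                         [Effect("listening", EffectType.LINGUISTIC)])

# Under the driving constraint above, visual text presentation is rejected
# and speech output is chosen instead:
print(choose_modality([text_presentation, speech_output], driving).name)
```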
Discussion and Conclusions
The proposed approach brings developers and researchers several advantages. From the
developer's point of view, one advantage is that it is possible to design more flexible and
more reusable solutions, aimed at a broader set of situations. Most previous work on
designing solutions for people with disabilities concentrated on a specific set of disabilities,
or on specific situations. Bearing in mind the great diversity of disabilities and situations, it is
clear that the development and maintenance of such systems is rather complex. With our
approach, developers can concentrate on more generic effects, providing solutions for
different levels of availability of specific effects. In this way it is possible to create adaptable
solutions that adjust to user features, states, preferences, and environmental conditions.
Another important advantage is that our framework enables treating different situations in the
same way. As user features and preferences are described in the same way as environmental
characteristics, a solution aimed at a user with some disability can be reused for a non-disabled
user in a situation that limits the interaction in the same way the disability does. Besides
providing more universal solutions, this could also ease some ethical problems, as design is
then concerned not with disabilities, a term whose usage often provokes negative reactions,
but with various effects and their constraints. Some constraints are not a consequence of a
user's physical limitations. For example, when using a secondary language, a foreign user
may experience problems similar to those of a user who has cognitive disabilities that affect
linguistic effects. Therefore, these situations do not have to be treated differently, and some
of the solutions from one of the domains can be reused in the other.
References
1. J. Abascal et al., "The use of guidelines to automatically verify Web accessibility", Universal Access
in the Information Society, Vol. 3, No. 1, pp. 71-79 (2004).
2. M.M. Blattner and E.P. Glinert, "Multimodal Integration", IEEE Multimedia, Winter 1996, pp. 14-24.
3. C. Nicolle and J. Abascal (Eds.), Inclusive Design Guidelines for HCI, Taylor & Francis, London (2001).
4. Z. Obrenovic and D. Starcevic, "Modeling Multimodal Human-Computer Interaction", IEEE
Computer, Vol. 37, No. 9, pp. 62-69 (2004).
5. Z. Obrenovic, D. Starcevic, and B. Selic, "A Model-Driven Approach to Content Repurposing", IEEE
Multimedia, Vol. 11, No. 1, pp. 62-71 (2004).
6. S. Oviatt, T. Darrell, and M. Flickner, "Multimodal interfaces that flex, adapt, and persist",
Communications of the ACM, Vol. 47, No. 1, pp. 30-33 (2004).
7. R. Prates, C. de Souza, and S. Barbosa, "A Method for Evaluating the Communicability of User
Interfaces", Interactions, Jan.-Feb. 2000, pp. 31-38.
8. A. Savidis and C. Stephanidis, "Unified User Interface Design: Designing Universally Accessible
Interactions", Interacting with Computers, Vol. 16, No. 2, pp. 243-270 (2004).
9. M. Turk and G. Robertson, "Perceptual user interfaces (introduction)", Communications of the ACM,
Vol. 43, No. 3, pp. 33-35 (2000).
10. Web Accessibility Initiative (WAI), http://www.w3c.org/wai/, last visited September 2004.