The SignSpeak Project - Bridging the Gap Between Signers and Speakers
Philippe Dreuw1, Hermann Ney1, Gregorio Martinez2, Onno Crasborn3,
Justus Piater4, Jose Miguel Moya5, and Mark Wheatley6
1RWTH, Aachen, Germany
dreuw@cs.rwth-aachen.de
2CRIC, Barcelona, Spain
gregorio.martinez@cric.cat
3RUN, Nijmegen, The Netherlands
o.crasborn@let.ru.nl
4ULg, Liege, Belgium
justus.piater@ulg.ac.be
5TID, Granada, Spain
jmml@tid.es
6EUD, Brussels, Belgium
mark.wheatley@eud.eu
Abstract
The SignSpeak project will be the first step to approach sign language recognition and translation at a scientific level already reached in
similar research fields such as automatic speech recognition or statistical machine translation of spoken languages. Deaf communities
revolve around sign languages as they are their natural means of communication. Although deaf, hard of hearing and hearing signers
can communicate without problems amongst themselves, there is a serious challenge for the deaf community in trying to integrate into
educational, social and work environments. The overall goal of SignSpeak is to develop a new vision-based technology for recognizing
and translating continuous sign language to text. New knowledge about the nature of sign language structure from the perspective
of machine recognition of continuous sign language will allow a subsequent breakthrough in the development of a new vision-based
technology for continuous sign language recognition and translation. Existing and new publicly available corpora will be used to evaluate
the research progress throughout the whole project.
1. Introduction
The SignSpeak project¹ is one of the first EU-funded projects that tackles the problem of automatic recognition and translation of continuous sign language.
The overall goal of the SignSpeak project is to develop a new vision-based technology for recognizing and translating continuous sign language (i.e. to provide Video-to-Text technologies), in order to provide new e-Services to the deaf community and to improve their communication with hearing people.
The current rapid development of sign language research is partly due to advances in technology, including of course the spread of the Internet, but especially the advance of computer technology enabling the use of digital video (Crasborn et al., 2007). The main research goals are related to a better scientific understanding and vision-based technological development for continuous sign language recognition and translation:
• understanding sign language requires better linguistic
knowledge
• large vocabulary recognition requires more robust fea-
ture extraction methods and a modeling of the signs at
a sub-word unit level
• statistical machine translation requires large bilingual
annotated corpora and a better linguistic knowledge
for phrase-based modeling and alignment
Therefore, the SignSpeak project combines innova-
tive scientific theory and vision-based technology devel-
opment by gathering novel linguistic research and the
most advanced techniques in image analysis, automatic
speech recognition (ASR) and statistical machine transla-
tion (SMT) within a common framework.
¹ www.signspeak.eu
1.1. Sign Languages in Europe
Signed languages vary like spoken languages do: they are not mutually intelligible, and there is typically one or more signed language in each country.
Although sign languages are used by a significant number of people, only a few member states of the European Union (EU) have recognized their national sign language on a constitutional level: Finland (1995), Slovak Republic (1995), Portugal (1997), Czech Republic (1998 & 2008), Austria (2005), and Spain (2007). The European Union of the Deaf (EUD)², a non-research partner in the SignSpeak project, is a European non-profit organization which aims at establishing and maintaining EU-level dialogue with the “hearing world” in consultation and cooperation with its member National Deaf Associations. The EUD is the only organization representing the interests of Deaf Europeans at European Union level. The EUD has 30 full members (27 EU countries plus Norway, Iceland & Switzerland), and 6 affiliated members (Croatia, Serbia, Bosnia and Herzegovina, Macedonia, Turkey & Israel). Its main goals are the recognition of the right to use an indigenous sign language, empowerment through communication and information, and equality in education and employment. In 2008, the EUD estimated about 650,000 sign language users in Europe, with about 7,000 official sign language interpreters, resulting in approximately 93 sign language users per sign language interpreter (EUD, 2008; Wheatley and Pabsch, 2010). However, the number of sign language users might be much higher, as it is difficult to estimate an exact number: for example, late-deafened or hard of hearing people who need interpreter services are not always counted as deaf people in these statistics.
² www.eud.eu
1.2. Linguistic Research in Sign Languages
Linguistic research on sign languages started in the 1950s, with initial studies of Tervoort (Tervoort, 1953) and
Stokoe (Stokoe et al., 1960). In the USA, the wider recognition of sign languages as an important linguistic research object only started in the 1970s, with Europe following in the 1980s. Only since 1990 has sign language research become a truly world-wide enterprise, resulting in the foundation of the Sign Language Linguistics Society in 2004³. Linguistic research has targeted all areas of linguistics, from phonetics to discourse, from first language acquisition to language disorders.
Vision-based sign language recognition has only been attempted on the basis of small sets of elicited data (corpora) recorded under lab conditions (only from one to three signers and under controlled colour and brightness conditions), without the use of spontaneous signing. The same restriction holds for much linguistic research on sign languages. Due to the extremely time-consuming work of linguistic annotation, studying sign languages has necessarily been confined to small selections of data. Depending on their research strategy, researchers either choose to record small sets of spontaneous signing, which are then transcribed to address the linguistic question at hand, or rely on native signer intuitions about what forms a correct utterance.
1.3. Research and Challenges in Automatic Sign Language Recognition
Reviews of research in sign language and gesture recognition are presented in (Ong and Ranganath, 2005; Wu and Huang, 1999). In the following we briefly discuss the most important topics in building a large vocabulary sign language recognition system.
1.3.1. Languages and Available Resources
Almost all publicly available resources, which have been recorded under lab conditions for linguistic research purposes, have in common that the vocabulary size, the type/token ratio (TTR), and signer/speaker dependency are closely related to the recording and annotation costs. Data-driven approaches with systems being automatically trained on these corpora do not generalize very well, as the structure of the signed sentences has often been designed in advance (von Agris and Kraiss, 2007), or offers small variations only (Dreuw et al., 2008b; Bungeroth et al., 2008), probably resulting in over-fitted language models. Additionally, most self-recorded corpora consist only of a limited number of signers (Vogler and Metaxas, 2001; Bowden et al., 2004).
In the recently very active research area of sign language recognition, a new trend towards broadcast news or weather forecast news can be observed. The problem of aligning an American Sign Language (ASL) sign with an English text subtitle is considered in (Farhadi and Forsyth, 2006). In (Buehler et al., 2009; Cooper and Bowden, 2009), the goal is to automatically learn a large number of British Sign Language (BSL) signs from TV broadcasts. Due to limited preparation time of the interpreters, the grammatical differences between “real-life” sign language and the sign language used in TV broadcast (being closer to Signed Exact English (SEE)) are often significant.
³ www.slls.eu
1.3.2. Environment Conditions and Feature Extraction
Further difficulties for such sign language recognition
frameworks arise due to different environment assump-
tions. Most of the methods developed assume closed-world
scenarios, e.g. simple backgrounds, special hardware like
data gloves, limited sets of actions, and a limited number
of signers, resulting in different problems in sign language
feature extraction or modeling.
1.3.3. Modeling of the Signs
In continuous sign language recognition, as well as in speech recognition, coarticulation effects have to be considered. One of the challenges in the recognition of continuous sign language on large corpora is the definition and modelling of the basic building blocks of sign language. The use of whole-word models for the recognition of sign language with a large vocabulary is unsuitable, as there is usually not enough training material available to robustly train the parameters of the individual word models. A suitable definition of sub-word units for sign language recognition would probably alleviate the burden of insufficient data for model creation.
In ASR, words are modelled as a concatenation of sub-word units. These sub-word units are shared among the different word models and thus the available training material is distributed over all word models. On the one hand, this leads to better statistical models for the sub-word units, and on the other hand it allows words that have never been seen in training to be recognized by means of lexica. According to the linguistic work on sign language by Stokoe (Stokoe et al., 1960), a phonological model for sign language can be defined, dividing signs into their four constituent visemes: hand shapes, hand orientations, types of hand movements, and body locations at which signs are executed. Additionally, non-manual components like facial expression and body posture are used. However, no suitable decomposition of words into sub-word units is currently known for the purposes of a large vocabulary sign language recognition system (e.g. a grapheme-to-phoneme like conversion and use of a pronunciation lexicon).
The most important of these problems are related to the lack of generalization and overfitting of systems (von Agris and Kraiss, 2007), poor scaling (Buehler et al., 2009; Cooper and Bowden, 2009), and unsuitable databases for mostly data-driven approaches (Dreuw et al., 2008b).
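The benefit of sharing sub-word units across word models can be illustrated with a small sketch. Everything in it is hypothetical: the glosses, the sub-unit labels, and the observation counts are invented for illustration, and the paper itself notes that no suitable decomposition of signs is currently known. The sketch only shows how training observations pool at the sub-unit level, and how a sign absent from the training data could still be covered by a lexicon entry built from trained units.

```python
from collections import Counter

# Hypothetical lexicon: gloss -> sequence of sub-unit labels (invented for illustration).
lexicon = {
    "HOUSE": ["flat_hand", "move_down", "chest"],
    "ROOF": ["flat_hand", "move_down", "head"],
    "THINK": ["index_finger", "touch", "head"],
}

# Hypothetical number of training observations per whole sign.
training_counts = {"HOUSE": 12, "ROOF": 3, "THINK": 9}


def subunit_observations(lexicon, training_counts):
    """Pool the observations of every sign onto the sub-units it is built from."""
    totals = Counter()
    for gloss, count in training_counts.items():
        for unit in lexicon[gloss]:
            totals[unit] += count
    return totals


def can_model(units, trained_units):
    """A sign can be modelled if every one of its sub-units has been trained."""
    return all(unit in trained_units for unit in units)


totals = subunit_observations(lexicon, training_counts)

# A sign never seen in training, described only through its lexicon entry.
unseen_sign = ["index_finger", "touch", "chest"]
print(totals["flat_hand"], totals["head"], can_model(unseen_sign, totals))
```

In this toy setting the rarely observed sign ROOF contributes to, and benefits from, sub-units that are also trained on HOUSE and THINK, which is the data-sharing effect described for ASR.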
1.4. Research and Challenges in Statistical Machine Translation of Sign Languages
While the first papers on sign language translation date back only about a decade (Veale et al., 1998) and typically employed rule-based systems, several research groups have recently focussed on data-driven approaches. In (Stein et al., 2006), an SMT system has been developed for German and German sign language in the domain of weather reports. Their work describes the addition of pre- and post-processing steps to improve the translation for this language pairing. The authors of (Morrissey and Way, 2005) have explored example-based MT approaches for the language pair English and sign language of the Netherlands with further developments being made in the area of Irish sign language. In (Chiu et al., 2007), a system is presented for the language pair Chinese and Taiwanese sign language. The optimizing methodologies are shown to outperform a simple SMT model. In the work of (San-Segundo et al., 2006), some basic research is done on Spanish and Spanish sign language with a focus on a speech-to-gesture architecture.
2. Speech and Sign Language Recognition
Automatic speech recognition (ASR) is the conversion of an acoustic signal (sound) into a sequence of written words (text).
Due to the high variability of the speech signal, speech recognition outside lab conditions is known to be a hard problem. Most decisions in speech recognition are interdependent, as word and phoneme boundaries are not visible in the acoustic signal, and the speaking rate varies. Therefore, decisions cannot be drawn independently but have to be made within a certain context, leading to systems that recognize whole sentences rather than single words.
One of the key ideas in speech recognition is to put all ambiguities into probability distributions (so-called stochastic knowledge sources, see Figure 1). Then, by a stochastic modelling of the phoneme and word models, a pronunciation lexicon and a language model, the free parameters of the speech recognition framework are optimized using a large training data set. Finally, all the interdependencies and ambiguities are considered jointly in a search process which tries to find the best textual representation of the captured audio signal. In contrast, rule-based approaches try to solve the problems more or less independently.
In order to design a speech recognition system, four crucial problems have to be solved:
1. preprocessing and feature extraction of the input signal,
2. specification of models and structures for the words to be recognized,
3. learning of the free model parameters from the training data, and
4. search of the maximum probability over all models during recognition (see Figure 1).
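The fourth step, the joint search, can be sketched in a few lines. This is a toy illustration under strong assumptions and not the search used in any actual system: the candidate word sequences and all probabilities are invented, and the candidates are simply enumerated, whereas a real recognizer would explore the search space with dynamic programming. The names `language_model` and `observation_model` are hypothetical stand-ins for the two stochastic knowledge sources.

```python
import math


def global_search(candidates, language_model, observation_model):
    """Return the word sequence maximizing log Pr(w) + log Pr(x | w)."""

    def score(words):
        # Working in the log domain turns the product of probabilities into a sum.
        return math.log(language_model[words]) + math.log(observation_model[words])

    return max(candidates, key=score)


# Hypothetical candidate transcriptions of one utterance.
candidates = [("I", "SEE", "YOU"), ("EYE", "SEE", "YOU"), ("I", "SEA", "YOU")]

# Hypothetical prior probabilities of each word sequence.
language_model = {
    ("I", "SEE", "YOU"): 0.6,
    ("EYE", "SEE", "YOU"): 0.1,
    ("I", "SEA", "YOU"): 0.3,
}

# Hypothetical likelihoods of the observed features given each word sequence.
observation_model = {
    ("I", "SEE", "YOU"): 0.2,
    ("EYE", "SEE", "YOU"): 0.5,
    ("I", "SEA", "YOU"): 0.3,
}

best = global_search(candidates, language_model, observation_model)
print(best)
```

In this example the observation model alone would prefer the second candidate, but combining both knowledge sources selects the first, which illustrates why the ambiguities are resolved jointly rather than independently.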
2.1. Differences Between Spoken Language and Sign Language
The main differences between spoken language and sign language are due to linguistic characteristics such as simultaneous facial and hand expressions, references in the virtual signing space, and grammatical differences, as explained in more detail in (Dreuw et al., 2008c):
Simultaneousness: A major issue in sign language recognition compared to speech recognition is that a signer can use different communication channels (facial expression, hand movement, and body posture) in parallel.
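Figure 1: Sign language recognition system overview. The video input $X_1^T$ is passed to tracking, which yields $u_1^T$, and to feature extraction, which yields $x_1^T := f(X_1^T, u_1^T)$. The global search combines the language model $\Pr(w_1^N)$ with the visual model and word model inventory $\Pr(x_1^T \mid w_1^N)$ and outputs the recognized word sequence $\hat{w}_1^N = \operatorname{argmax}_{w_1^N} \{ \Pr(w_1^N) \cdot \Pr(x_1^T \mid w_1^N) \}$.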
Signing Space: Entities like persons or objects can be
stored in a 3D body-centered space around the signer,
by executing them at a certain location and later just
referencing them by pointing to the space – the chal-
lenge is to define a model for spatial information han-
dling.
Coarticulation and Epenthesis: In continuous sign lan-
guage recognition, as well as in speech recognition,
coarticulation effects have to be considered. Due to
location changes in the 3D signing space, we also have
to deal with the movement epenthesis problem (Vogler
and Metaxas, 2001; Yang et al., 2007). Movement
epenthesis refers to movements which occur regularly
in natural sign language in order to move from the
end state of one sign to the beginning of the next one.
Movement epenthesis conveys no meaning in itself but
contributes phonetic information to the perceiver.
Silence: As opposed to automatic speech recognition, where
the energy of the audio signal is usually used for the
silence detection in the sentences, new spatial features
and models will have to be defined for silence detec-
tion in sign language recognition. Silence cannot be
detected by simply analyzing motion in the video, be-
cause words can be signed by just holding a particular
posture in the signing space over time. Further, the
rest position of the hand(s) may be somewhere in the
signing space.
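The silence problem can be made concrete with a minimal sketch. It assumes a hypothetical 2D track of hand positions in pixels and an arbitrary displacement threshold, neither of which comes from the SignSpeak system. A purely motion-based detector flags the initial rest position, but it equally flags the frames in which a posture is held, although a held posture may itself be a sign.

```python
import math


def motion_energy(track):
    """Displacement of the hand between each pair of consecutive frames."""
    return [math.dist(a, b) for a, b in zip(track, track[1:])]


def low_motion_frames(track, threshold=1.0):
    """Indices of frame transitions whose displacement is below the threshold."""
    return [i for i, energy in enumerate(motion_energy(track)) if energy < threshold]


# Hypothetical hand positions: rest, movement to a location, held posture, movement back.
track = [(0, 0), (0, 0), (5, 5), (10, 10), (10, 10), (10, 10), (5, 5), (0, 0)]

flagged = low_motion_frames(track)
print(flagged)
```

Transition 0 corresponds to the rest position, while transitions 3 and 4 correspond to the posture held at (10, 10). The motion criterion cannot tell the two cases apart, which is why additional spatial features and models are needed.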
3. Towards a Sign-Language-to-Spoken-Language Translation System
The interpersonal communication problem between signers and the hearing community could be resolved by building up a new communication bridge integrating components for sign-, speech-, and text-processing. To build a Sign-Language-to-Spoken-Language translator for a new language, a six-component engine must be integrated (see Figure 2), where each component is in principle language independent, but requires language dependent parameters/models. The models are usually automatically trained but require large annotated corpora.
Figure 2: Complete six-component engine necessary to build a Sign-Language-to-Spoken-Language system (components: automatic sign language recognition (ASLR), automatic speech recognition (ASR), machine translation (MT), and text-to-speech/sign (TTS))
In SignSpeak, a theoretical study will be carried out on how the new communication bridge between deaf and hearing people could be built by analyzing and adapting the ASLR and MT component technologies for sign language processing. The problems described in Section 2 will mainly be tackled by
• analysis of linguistic markers for sub-units and sen-
tence boundaries,
• head and hand tracking of the dominant and non-
dominant hand,
• facial expression and body posture analysis,
• analysis of linguistically- and data-driven sub-word
units for sign modeling,
• analysis of spatio-temporal across-word modeling,
• signer independent recognition by pronunciation mod-
eling, language model adaptation, and speaker adapta-
tion techniques known from ASR
• contextual and multi-modal translation of sign lan-
guage by an integration of tracking and recognition
features into the translation process
Once the different modules are integrated within a com-
mon communication platform, the communication could be
handled over 3G phones, media center TVs, or video tele-
phone devices. The following sign language related appli-
cation scenarios would be possible:
• e-learning of sign language
• automatic transcription of video e-mails, video docu-
ments, or video-SMS
• video subtitling
3.1. Impact on Other Industrial Applications
The novel features of such systems provide new ways of solving industrial problems. The technological breakthrough of SignSpeak will have an impact on other application fields:
Improving human-machine communication by gesture: vision-based systems are opening new paths and applications for human-machine communication by gesture, e.g. PlayStation’s EyeToy or Microsoft Xbox’s Natal Project⁴, which could be interesting for physically disabled individuals or even blind people as well.
Medical sector: new communication methods by gesture are being investigated to improve the communication between the medical staff, the computer, and other electronic equipment. Another application in this sector is related to web- or video-based e-Care / e-Health treatments, or an auto-rehabilitation system which makes it easier to guide a patient through rehabilitation exercises.
Surveillance sector: person detection and recognition of
body parts or dangerous objects, and their tracking
within video sequences or in the context of quality
control and inspection in manufacturing sectors.
4. Experimental Results and Requirements
In order to build a Sign-Language-to-Spoken-Language translator, reasonably sized corpora have to be created for the data-driven approaches. For a limited domain speech recognition task (Verbmobil II), as presented e.g. in (Kanthak et al., 2000), systems with a vocabulary size of up to 10k words have to be trained with at least 700k words to obtain a reasonable performance, i.e. about 70 observations per vocabulary entry. Similar values must be obtained for a limited domain translation task (IWSLT), as presented e.g. in (Mauser et al., 2006).
Similar corpora statistics can be observed for other ASR or MT tasks. The requirements for a sign language corpus suitable for recognition and translation can therefore be summarized as follows:
• annotations should be domain specific (e.g. broadcast news or weather forecasts)
• for a vocabulary size smaller than 4k words, each word should be observed at least 20 times
• the singleton ratio should ideally stay below 40%
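The statistics behind these requirements can be computed directly from gloss annotations. The following sketch makes two assumptions that the text does not spell out: the T/T ratio is taken as running words divided by vocabulary size (consistent with the "70 observations per vocabulary entry" figure quoted for Verbmobil II), and the singleton ratio is taken as the fraction of vocabulary entries observed exactly once. The toy corpus is invented for illustration.

```python
from collections import Counter


def corpus_statistics(sentences):
    """Compute running words, vocabulary size, tokens per type, and singleton ratio."""
    counts = Counter(gloss for sentence in sentences for gloss in sentence)
    running_words = sum(counts.values())
    vocabulary_size = len(counts)
    singletons = sum(1 for count in counts.values() if count == 1)
    return {
        "running_words": running_words,
        "vocabulary_size": vocabulary_size,
        # Assumed definition: average number of observations per vocabulary entry.
        "tokens_per_type": running_words / vocabulary_size if vocabulary_size else 0.0,
        # Assumed definition: share of vocabulary entries seen exactly once.
        "singleton_ratio": singletons / vocabulary_size if vocabulary_size else 0.0,
    }


# Hypothetical gloss annotations from a weather forecast domain.
toy_corpus = [
    ["TODAY", "RAIN", "NORTH"],
    ["TOMORROW", "RAIN", "SOUTH"],
    ["TODAY", "SUN", "SOUTH"],
]

stats = corpus_statistics(toy_corpus)
print(stats)
```

With 1.5 observations per entry and a singleton ratio of 50%, this toy corpus would fall well short of the targets listed above, which is the typical situation for small lab-recorded corpora.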
Existing corpora must be extended to achieve a good performance w.r.t. recognition and translation (Forster et al., 2010). The core of the SignSpeak data will come from the Corpus-NGT⁵. This 72-hour corpus of Sign Language of the Netherlands is the first large open access corpus for sign linguistics in the world. It presently contains recordings from 92 different signers, mirroring both the age variation and the dialect variation present in the Dutch Deaf community (Crasborn et al., 2008).
During the SignSpeak project, the existing RWTH-Phoenix corpus (Stein et al., 2006) and Corpus-NGT (Crasborn et al., 2008) will be extended to meet these demands (see Table 1). For the SignSpeak project, the limited gloss annotations that were present in the first release of 2008 have been considerably expanded, and sentence-level translations have been added. The smaller RWTH-BOSTON-104 database (Dreuw et al., 2007a) has been extended by a groundtruth annotation for both hand and face positions (more than 15,000 frames have been annotated) in order to evaluate the performance of different tracking algorithms. Novel facial features (Piater et al., 2010) developed within the SignSpeak project are shown in Figure 3 and will be analyzed for continuous sign language recognition.
⁴ www.xbox.com/en-US/live/projectnatal/
⁵ www.corpusngt.nl
Table 1: Expected corpus annotation progress of the RWTH-Phoenix and Corpus-NGT corpora in comparison to the limited domain speech (Verbmobil II) and translation (IWSLT) corpora.
                 Boston104  Phoenix        Corpus-NGT     Verbmobil II  IWSLT
year             2007       2009   2011    2009   2011    2000          2006
recordings       201        78     400     116    300     -             -
running words    0.8k       10k    50k     30k    80k     700k          200k
vocabulary size  0.1k       0.6k   2.5k    3k     > 5k    10k           10k
T/T ratio        8          15     20      10     < 20    70            20
Performance      11% WER    -      -       -      -       15% WER       40% TER
Performance figures are taken from (Dreuw et al., 2008a) for Boston104, (Kanthak et al., 2000) for Verbmobil II, and (Mauser et al., 2006) for IWSLT.
For automatic sign language recognition, promising re-
sults have been achieved for continuous sign language
recognition under lab conditions (von Agris and Kraiss,
2007; Dreuw et al., 2007a). Even if the performances of the
automatic learning approaches presented in (Farhadi and
Forsyth, 2006) and (Buehler et al., 2009; Cooper and Bow-
den, 2009) are still quite low, they represent an interesting
approach for further research.
For the task of sign language recognition and translation, promising results on the publicly available benchmark database RWTH-BOSTON-104⁶ have been achieved for automatic sign language recognition (Dreuw et al., 2007a) and translation (Dreuw et al., 2008c; Dreuw et al., 2007b) that can be used as baseline reference for other researchers.
However, the preliminary results on the larger RWTH-
BOSTON-400 database show the limitations of the pro-
posed framework and the need for better visual features,
models, and corpora (Dreuw et al., 2008b).
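The recognition results in Table 1 are reported as word error rate (WER). As a reminder of what this figure measures, the sketch below computes WER in its standard form: the minimum number of substitutions, deletions, and insertions needed to turn the hypothesis into the reference, divided by the reference length. The gloss sequences are invented, and this is not the evaluation tooling used for the reported numbers.

```python
def word_error_rate(reference, hypothesis):
    """Levenshtein distance between two word sequences, divided by the reference length."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    # previous[j] holds the edit distance between the reference prefix
    # processed so far and the first j words of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(reference)


# Hypothetical reference annotation and recognizer output.
reference = ["JOHN", "BUY", "CAR", "YESTERDAY"]
hypothesis = ["JOHN", "BUY", "HOUSE"]

wer = word_error_rate(reference, hypothesis)
print(f"WER = {wer:.0%}")
```

Here one substitution (CAR recognized as HOUSE) and one deletion (YESTERDAY) against four reference words give a WER of 50%.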
5. Acknowledgements
This work received funding from the European Commu-
nity’s Seventh Framework Programme under grant agree-
ment number 231424 (FP7-ICT-2007-3).
6. References
R. Bowden, D. Windridge, T. Kadir, A. Zisserman, and
M. Brady. 2004. A Linguistic Feature Vector for the
Visual Interpretation of Sign Language. In ECCV, vol-
ume 1, pages 390–401, May.
Patrick Buehler, Mark Everingham, and Andrew Zisser-
man. 2009. Learning sign language by watching TV
(using weakly aligned subtitles). In IEEE CVPR, Miami,
FL, USA, June.
Jan Bungeroth, Daniel Stein, Philippe Dreuw, Hermann
Ney, Sara Morrissey, Andy Way, and Lynette van Zijl.
2008. The ATIS Sign Language Corpus. In LREC, Mar-
rakech, Morocco, May.
Y.-H. Chiu, C.-H. Wu, H.-Y. Su, and C.-J. Cheng. 2007. Joint Optimization of Word Alignment and Epenthesis Generation for Chinese to Taiwanese Sign Synthesis. IEEE Trans. PAMI, 29(1):28–39.
⁶ www-i6.informatik.rwth-aachen.de/~dreuw/database.php
Helen Cooper and Richard Bowden. 2009. Learning Signs
from Subtitles: A Weakly Supervised Approach to Sign
Language Recognition. In IEEE CVPR, Miami, FL,
USA, June.
Onno Crasborn, Johanna Mesch, Dafydd Waters, Annika Nonhebel, Els van der Kooij, Bencie Woll, and Brita Bergman. 2007. Sharing sign language data online. Experiences from the ECHO project. International Journal of Corpus Linguistics, 12(4):537–564.
Onno Crasborn, Inge Zwitserlood, and Johan Ros. 2008.
Corpus-NGT. An open access digital corpus of movies
with annotations of Sign Language of the Netherlands.
Technical report, Centre for Language Studies, Radboud
University Nijmegen. http://www.corpusngt.nl.
P. Dreuw, D. Rybach, T. Deselaers, M. Zahedi, and H. Ney.
2007a. Speech Recognition Techniques for a Sign Lan-
guage Recognition System. In ICSLP, Antwerp, Bel-
gium, August. Best paper award.
P. Dreuw, D. Stein, and H. Ney. 2007b. Enhancing a Sign
Language Translation System with Vision-Based Fea-
tures. In Intl. Workshop on Gesture in HCI and Simu-
lation 2007, pages 18–19, Lisbon, Portugal, May.
Philippe Dreuw, Jens Forster, Thomas Deselaers, and Her-
mann Ney. 2008a. Efficient Approximations to Model-
based Joint Tracking and Recognition of Continuous
Sign Language. In IEEE International Conference Au-
tomatic Face and Gesture Recognition, Amsterdam, The
Netherlands, September.
Philippe Dreuw, Carol Neidle, Vassilis Athitsos, Stan Sclaroff, and Hermann Ney. 2008b. Benchmark Databases for Video-Based Automatic Sign Language Recognition. In LREC, Marrakech, Morocco, May.
Philippe Dreuw, Daniel Stein, Thomas Deselaers, David Rybach, Morteza Zahedi, Jan Bungeroth, and Hermann Ney. 2008c. Spoken Language Processing Techniques for Sign Language Recognition and Translation. Technology and Disability, 20(2):121–133, June.
EUD. 2008. Survey about Sign Languages in Europe.
A. Farhadi and D. Forsyth. 2006. Aligning ASL for statistical translation using a discriminative word model. In IEEE CVPR, New York, USA, June.
Jens Forster, Daniel Stein, Ellen Ormel, Onno Crasborn, and Hermann Ney. 2010. Best Practice for Sign Language Data Collections Regarding the Needs of Data-Driven Recognition and Translation. In 4th LREC Workshop on the Representation and Processing of Sign Languages: Corpora and Sign Language Technologies (CSLT), Malta, May.
Figure 3: Facial feature extraction on the Corpus-NGT database (from left to right): three vertical lines quantify features like left eye aperture, mouth aperture, and right eye aperture; the extraction of these features is based on a fitted face model, where the orientation of this model is shown by three axes on the face: red is X, green is Y, blue is Z, origin is the nose tip.
Stephan Kanthak, Achim Sixtus, Sirko Molau, Ralf Schlüter, and Hermann Ney. 2000. Fast Search for Large Vocabulary Speech Recognition, chapter “From Speech Input to Augmented Word Lattices”, pages 63–78. Springer Verlag, Berlin, Heidelberg, New York, July.
Arne Mauser, Richard Zens, Evgeny Matusov, Saša Hasan, and Hermann Ney. 2006. The RWTH Statistical Machine Translation System for the IWSLT 2006 Evaluation. In IWSLT, pages 103–110, Kyoto, Japan, November. Best Paper Award.
S. Morrissey and A. Way. 2005. An Example-based Ap-
proach to Translating Sign Language. In Workshop in
Example-Based Machine Translation (MT Summit X),
pages 109–116, Phuket, Thailand, September.
S. Ong and S. Ranganath. 2005. Automatic Sign Lan-
guage Analysis: A Survey and the Future beyond Lexical
Meaning. IEEE Trans. PAMI, 27(6):873–891, June.
Justus Piater, Thomas Hoyoux, and Wei Du. 2010. Video
Analysis for Continuous Sign Language Recognition. In
4th LREC Workshop on the Representation and Process-
ing of Sign Languages: Corpora and Sign Language
Technologies (CSLT), Malta, May.
R. San-Segundo, R. Barra, L. F. D’Haro, J. M. Montero, R. Córdoba, and J. Ferreiros. 2006. A Spanish Speech to Sign Language Translation System for assisting deaf-mute people. In ICSLP, Pittsburgh, PA, September.
D. Stein, J. Bungeroth, and H. Ney. 2006. Morpho-Syntax
Based Statistical Methods for Sign Language Transla-
tion. In 11th EAMT, pages 169–177, Oslo, Norway,
June.
W. Stokoe, D. Casterline, and C. Croneberg. 1960. Sign
language structure. An outline of the visual communica-
tion systems of the American Deaf (1993 Reprint ed.).
Silver Spring MD: Linstok Press.
B. Tervoort. 1953. Structurele analyse van visueel taalge-
bruik binnen een groep dove kinderen.
T. Veale, A. Conway, and B. Collins. 1998. The Chal-
lenges of Cross-Modal Translation: English to Sign Lan-
guage Translation in the ZARDOZ System. Journal of
Machine Translation, 13, No. 1:81–106.
C. Vogler and D. Metaxas. 2001. A Framework for Recog-
nizing the Simultaneous Aspects of American Sign Lan-
guage. CVIU, 81(3):358–384, March.
U. von Agris and K.-F. Kraiss. 2007. Towards a Video Cor-
pus for Signer-Independent Continuous Sign Language
Recognition. In Gesture in Human-Computer Interac-
tion and Simulation, Lisbon, Portugal, May.
Mark Wheatley and Annika Pabsch. 2010. Sign Language
in Europe. In 4th LREC Workshop on the Representation
and Processing of Sign Languages: Corpora and Sign
Language Technologies (CSLT), Malta, May.
Y. Wu and T.S. Huang. 1999. Vision-based gesture recognition: a review. In Gesture Workshop, volume 1739 of LNCS, pages 103–115, Gif-sur-Yvette, France, March.
Ruiduo Yang, Sudeep Sarkar, and Barbara Loeding. 2007.
Enhanced Level Building Algorithm to the Movement
Epenthesis Problem in Sign Language. In CVPR, MN,
USA, June.