Received December 4, 2017, accepted January 12, 2018, date of publication March 19, 2018, date of current version July 6, 2018.
Digital Object Identifier 10.1109/ACCESS.2018.2817164
A Highly Accurate and Reliable Data Fusion
Framework for Guiding the Visually Impaired
WAFA M. ELMANNAI, (Member, IEEE), AND KHALED M. ELLEITHY, (Senior Member, IEEE)
Department of Computer Science and Engineering, University of Bridgeport, Bridgeport, CT 06604, USA
Corresponding author: Wafa M. Elmannai (welmanna@my.bridgeport.edu)
This work was supported in part by the American Association of University Women and in part by the University of Bridgeport, CT, USA.
ABSTRACT The world has approximately 253 million visually impaired (VI) people according to a report by
the World Health Organization (WHO) in 2014. Thirty-six million people are estimated to be blind. According
to WHO, 217 million people are estimated to have moderate to severe visual impairment. An important factor
that motivated this research is the fact that 90% of VI people live in developing countries. Several systems
were designed to improve the quality of the life of VI people and support their mobility. Unfortunately, none
of these systems are considered to be a complete solution for VI people and these systems are very expensive.
We present in this paper an intelligent framework for supporting VI people. The proposed work integrates
sensor-based and computer vision-based techniques to provide an accurate and economical solution. These
techniques allow us to detect multiple objects and enhance the accuracy of the collision avoidance system.
In addition, we introduce a novel obstacle avoidance algorithm based on the image depth information and
fuzzy logic. By using the fuzzy logic, we were able to provide precise information to help the VI user in
avoiding front obstacles. The system has been deployed and tested in real-time scenarios. An accuracy of 98%
was obtained for detecting objects and 100% accuracy in avoiding the detected objects.
INDEX TERMS Assistive wearable devices, computer vision systems, data fusion algorithm, obstacle
detection and obstacle collision avoidance, sensor-based networks, visual impairment, blindness, and
mobility limitation.
I. INTRODUCTION
In 2014, statistics of 253 million VI people worldwide were
reported by the World Health Organization (WHO) [1];
36 million people are completely blind. In the USA, approx-
imately 8.7 million people are VI, whereas approximately
1.3 million people are blind [2]. Both the National Federation of the Blind [2] and the American Foundation for the Blind [3] reported that 100,000 VI people are students.
During the last decade, public health efforts succeeded in decreasing the number of diseases that cause blindness. Ninety percent of VI people have low incomes and live in developing countries. In addition, 82% of VI people are older than 50 years [1]. This number is estimated to increase by approximately 2 million per decade and, by 2020, is estimated to double [4].
VI people encounter many challenges when performing
most natural activities that are performed by human beings,
such as detecting static or dynamic objects and safely navi-
gating through their paths. These activities are highly difficult
and may be dangerous for VI people, especially if the envi-
ronment is unknown. Therefore, VI people use the same route
every time by remembering unique elements.
The most popular assistance method used by VI people to detect and avoid obstacles along their paths is the white cane; a trained dog is also used for navigation [5]. These
methods are limited with regard to the information that they
provide in real-time scenarios; this information cannot ensure
safe mobility and a clear path to the user as it would for a
sighted person [6], [7]. A white cane is designed to detect
close objects with physical contact requirements. A white
cane can also alert people to the presence of VI people and
enable sighted people to yield the path to VI people. However,
a white cane cannot detect head-level barriers or their danger levels. A dog is a better navigation solution than the white cane, but it is expensive, and intensive training is required for dogs that serve as guide dogs.
Therefore, developing an independent, effective, and assis-
tive device for VI people that provides real-time information
with fine recognition of the surrounding environment within
a reasonable range of detection and indoor and outdoor cov-
erage during day or night becomes a critical challenge.
Many electronic devices (wearable and portable) were
introduced to assist VI people in providing navigational
information, such as ultrasonic obstacle detection glasses,
laser canes, and mobile applications using smart phones.
However, the majority of available systems have two issues: the offered devices are very expensive, whereas VI people predominantly belong to low-income groups, and the capabilities and services of these systems are limited.
Therefore, a complete design of a framework that integrates
all possible and useful sensors with computer vision methods
can overcome these limitations.
We have investigated several solutions that assist VI people. Our intensive study resulted in a taxonomy that provides a technical classification for comparing systems; this taxonomy is presented in a recently published literature survey [8]. None of these
studies provides a complete solution that can assist VI people
in all aspects of their lives. Thus, the objective of this work is
to design an efficient framework that significantly improves
the life of VI people. The framework can overcome the
limitations of previous systems by providing a wider range of
detection that works indoors and outdoors and a navigational
service.
The focus of this paper is to design a novel navigation
assistant and wearable device to support VI people in iden-
tifying and avoiding static/dynamic objects by integrating
computer vision technology and sensor-based technology.
An innovative approach, which is referred to as a proximity
measurement method, is proposed for measuring distance.
This approach is based on an image’s depth. The system has
been deployed and tested in real-time scenarios. This system
enables the user to detect and avoid obstacles by providing
navigational information to recover his/her path in the case of
obstacles. The novelty of this work arises from multisensory
integration and a proposed data fusion algorithm with the help
of computer vision methods. The combination of different
data resources improves the accuracy of the output. Our
platform was evaluated for different scenarios. The validated
results indicated accurate navigational instructions and effec-
tive performance in terms of obstacle detection and avoid-
ance. The system consistently sends warning audio messages
to the user. Thus, this system is designed to assist normal
walkers.
The organization of the paper is as follows: Section 2 presents background on assistive technologies for VI people. A study of state-of-the-art assistive technologies for VI people is presented in Section 3. The proposed framework is described in Section 4. Real-time scenarios and experimental results are presented in Section 5. Section 6 concludes the paper with a discussion, a comparison, and perspectives on future directions.
II. BACKGROUND
Assistive technology was introduced in the 1960s to
solve problems associated with transmitting informa-
tion [9] and mobility assistance, such as orientation and
navigation [10], [11].
FIGURE 1. Demonstration of the interaction between the assistive
technology and the user [13].
Assistive technology includes all services, systems, appli-
ances, and devices that are used to assist disabled people in
their daily lives to facilitate their activities and ensure their
safe mobility [12]. Figure 1 demonstrates the services and
capabilities that are afforded to a disabled person by interac-
tion with assistive technology. The user can communicate and
take actions toward other people, devices, and the surround-
ing environment using either sensors or computer vision tech-
nologies that have been employed by assistive technology.
The user with a disability can individually accomplish his/her
daily tasks and experience an enhanced quality of life that
enables him/her to feel connected to the outside world [13].
FIGURE 2. The taxonomy of assistive technology.
Figure 2 shows the three main subcategories of visual
assistive technology: vision enhancement, vision substitu-
tion, and vision replacement [14], [15]. Using the functions of sensors, this technology became available to users in the form of electronic devices and applications. These systems provide
different services, such as object localization, detection, and
avoidance. Navigation and orientation services are offered
to provide users with a sense of their external environment.
Sensors help VI people with their mobility tasks based on
identifying an object’s properties [6], [16].
The most complex category in this taxonomy is the vision
replacement category, which is related to medical and tech-
nology issues. In terms of the vision replacement category,
the results or information to be displayed will be sent to the
brain's visual cortex or sent via a specific nerve [14]. Vision substitution and vision enhancement are comparable, with a slight difference: the processed data sensed by a sensor in the vision enhancement category will be displayed, whereas the results in the vision substitution category will not be displayed. Alternatively, the output is either auditory or tactile, or it may consist of both auditory and tactile outputs, based on the touch and hearing senses and on whichever option is more convenient for the user.
The visual substitution category, which is our main focus,
is subdivided into three other categories: Electronic Travel Aids (ETAs), Electronic Orientation Aids (EOAs), and Position Locator Devices (PLDs). Each of these categories provides a particular service to enhance the user's mobility, with slight differences among them. Table 1 describes each subcategory of the visual substitution category and its services.
TABLE 1. Visual substitution subcategories.
III. RELATED WORK
Although several solutions were proposed in the last decade,
none of these solutions is a complete solution that can assist
VI people in all aspects of their lives. The following subsec-
tions present some of the work that has been performed.
A. SENSOR-BASED ETA
Sensor-based ETAs are techniques or systems that use sensors to provide the VI person with information about the surrounding environment via vibration, audio messages, or both. These systems primarily rely on the
collected data to detect an object and avoid it by measuring
the velocity of the obstacle and the distance between the
user and the obstacle. Different devices use different types
of sensors and provide different services. Ultrasonic sensors
are the most popular sensors.
Wahab et al. developed an obstacle detection and avoid-
ance system that is based on the ultrasonic technology in [21].
A number of ultrasonic sensors are attached to a cane.
However, a timer for the water detector’s buzzer is needed.
In addition, [22] proposed an embedded device using an
android application to navigate the user through his/her
path. Modified GSM was introduced in this study. A multi-
sensor system was designed and installed on a stick to detect
and avoid front obstacles in three different directions [23].
An electronic cane was designed as a mobility aid to detect
front obstacles with the help of haptics and ultrasonic tech-
nology [24]. An ultrasonic cane was presented in [25] as a
development to the C-5 laser cane [26] to detect both ground
objects and aerial objects. The authors of [27] introduced an
ultrasonic headset as a mobility aid for VI people that detects
and avoids obstacles.
Other systems use different types of sensors and devices to
provide VI people with navigational services. A navigational
system that is based on a laser light and sensors was proposed
in [28] to support the mobility of VI people. A low-cost
navigator for pedestrians was designed using the Raspberry
Pi device and Geo-Coder-US and Mo Nav modules in [29].
An assistive navigator was suggested in [30] to guide the user
through his/her unknown path by adapting GPS and GSM
technologies.
An obstacle avoidance system was proposed in [31] using
a Kinect depth camera and an auto-adaptive thresholding
strategy. The largest peak threshold is defined using the Otsu
method [32]. The idea of the navigator belt, including the
number of cells around the belt, was introduced in [33] based
on the Kinect depth sensor. This belt is designed to detect and
avoid obstacles that are represented in a 3D model. Each cell
represents a different warning message.
Using ultra-wide technology, the SUGAR system was
introduced as an indoor navigator to VI people. This system
navigates the user through an enclosed place that was mapped
in advance [34].
Using a retina-inspired dynamic vision camera, [35] improved the mobility of VI people. The system represents the
environmental information as an audio landscape using
3D sound [36]. The premise is a dynamic vision camera that
resembles the human retina [37], [38].
B. SENSOR SUBSTITUTION-BASED ETA
Sensor substitution-based systems are designed to be an alter-
native to multi-sensory systems. A small wireless device that
is placed on the user’s tongue was proposed to navigate
VI people [39]. The wireless communication between the
glasses (camera placed on the glasses) and the device is estab-
lished using the designed dipole antenna in [40]. Using radio
frequency technology, the Radio Frequency Identification
Walking Stick was introduced in [41] to ensure safe mobility;
the user does not walk beyond the sidewalk boundaries.
A virtual cane was designed in [42] for obstacle detection and avoidance for VI and handicapped people using a laser rangefinder and haptics. An H3DAPI platform was employed [43].
A mobile crowd assistant was implemented in [44] to
navigate the user to his/her desired destination. The user’s
information and volunteer feedback are transferred through
the crowd server.
C. ETA SENSORY-BASED COMPUTER VISION METHODS
Recently, we noticed a rapid propagation of assistive systems
due to the improvement and progress of the computer vision
techniques that add more value and services with flexibility.
The fusion of artificial vision, map matching [46], and GPS produced an enhanced navigational system that supports VI people in their mobility [45]. The SpikNet recognition algorithm was employed for image processing [47]. GPS, a modified geographic information system (GIS) [48], and vision-based positioning are used to provide the properties of obstacles.
A cognitive guidance device was designed by integrating the Kinect sensor's output, vanishing points, and fuzzy decisions to navigate the VI person through a known environment [49]. However, spatial landmarks are not detected by the system.
An independent mobility aid was proposed in [50] for
indoor and outdoor obstacle avoidance [51] for static/
dynamic object detection. Lucas-Kanade, RANSAC, adapted
HOG descriptor, Bag of Visual Words (BoVW), and SVM are
employed for object detection and recognition.
Aladren et al. [52] introduced an object detection system
using an RGB-D Sensor and computer vision techniques.
The classification target is to classify the object as either floor or obstacle. The classification process was based on the use of a Canny edge detector [53], the Hough Line Transform, and RANSAC, which are applied to the RGB-D sensor's output.
An obstacle detection and recognition system that integrates both ultrasonic technology and computer vision algorithms was presented in [54]. SIFT, Lucas-Kanade, RANSAC, and
K-means clustering were applied to a mobile camera’s output.
The recognition part was limited to classifying the object as
either normal or urgent. According to their paper, the users’
feedback indicates that the system is not sufficiently reliable
to be a replacement for the white cane.
Details about the systems presented in this section are
provided in [8]. Based on the study and literature review
presented in [8], no system could fully satisfy all of the user’s
needs to provide him/her with safe mobility indoors and out-
doors. The above-mentioned systems do not fully satisfy the user's needs due to the limitations of the techniques used in these systems [8]. Systems that are based on
both sensors and computer vision provide better solutions.
However, there is no single technique that can be considered
a robust or complete solution to replace a white cane and
provide safe mobility both indoors and outdoors with a wide
range of object detection.
Due to this observation, we propose a novel system
that integrates both sensor-based techniques and computer
vision techniques to provide a complete solution for VI peo-
ple indoors and outdoors with other complementary fea-
tures. The proposed approach is described in the following
section.
IV. PROPOSED FUSION OF SENSOR-BASED
DATA USING COMPUTER VISION
In this paper, we integrate both computer vision and sensor-
based technologies to facilitate the user’s mobility indoors
and outdoors and provide an efficient system for the VI per-
son. The system is affordable for both blind and low-vision people. Figure 3 illustrates the developed device.
FIGURE 3. Designed prototype for proposed system.
This section discusses the object detection and the pro-
posed novel and unique obstacle avoidance technique that
is based on the depth of an image to provide the user with
navigational information in an audio format using a headset.
The results of the proposed obstacle avoidance approach and the proposed data fusion algorithm show a significant improvement and a qualitative advancement in the collision avoidance field, increasing accuracy compared with other existing systems. The proposed obstacle avoidance technique is based on the depth of an image, which is considered a challenging area by many researchers; consequently, the majority of researchers prefer ultrasonic sensor-based technology over image depth despite the inherent limitations of the ultrasonic approach.
Details about the proposed approach are provided in this
section.
FIGURE 4. Proposed methodology of the interaction process among the hardware components.
We propose a framework that significantly improves the
life of the VI person. This framework supports the following
features: obstacle detection [55], [56], navigational guid-
ance, and the proposed distance measurement approach to
provide an accurate collision avoidance system. The sys-
tem performs well both indoors and outdoors. Additional
features can be implemented in the future, including loco-
motion [45], [49], [57]–[63], character recognition and
text reading [64], [65], identifying currency bills [66], note
taking [67], [68], traffic signal detection [69], barcode
scanning, product information retrieval [70], finding lost
items [71], localizing specific objects [72], [73], and mobile
vision [74].
A. PROPOSED METHODOLOGY AND FLOWCHART
OF THE DATA FUSION ALGORITHM
The proposed framework includes hardware and software
components. The hardware design is composed of two cam-
era modules, a compass module, a GPS module, a gyroscope
module, a music (audio output) module, a microphone mod-
ule, and a wi-fi module.
The aim of the software is to develop an efficient data fusion algorithm, based on sensory data and aided by computer vision methods, that provides a highly accurate object detection/avoidance and navigational system for safe mobility. Figure 4 demonstrates the
interaction between the hardware components for the navi-
gational system and the fused data that are received by the
microcontroller board (FEZ Spider) from multiple sensors
and transferred to the remote server.
The system is designed to navigate the user to the desired
location and to avoid any obstacle in front of the user after
it is detected. Based on fused data from multiple sensors and
computer vision methods, the user will receive feedback in
an audio format. Two camera sensors are used for object
detection, which is processed using computer vision methods.
The remote server handles the image processing. Based on
the depth of the image, we can approximately measure the
distance between the obstacle and the VI person. A com-
pass is employed for orientation adjustment. A gyro sensor
is employed for rotation measurement in degrees. A GPS
provides the user’s location. All components are connected
with the microcontroller board, which communicates with a
remote server. Route guidance is provided by a GIS. Thus,
we use a gyro, compass, and GPS to track the user’s direc-
tions, locations and orientations to provide accurate route
guidance information.
Figure 5 demonstrates the flow of the proposed data fusion algorithm and how the fused data received from multiple sensors are processed to provide accurate real-time information.
The proposed system has three modes, as shown in the flowchart of Figure 5. Mode 0 indicates that the system is booting. Mode 1 represents the static and dynamic obstacle detection and avoidance system. Mode 2 is the data fusion mode that combines obstacle detection/avoidance and the navigation system. Once the system is on and the device is placed in the right position, the right and left cameras start to transfer the captured frames to the remote server through the FEZ Spider board. The static and dynamic
FIGURE 5. Flowchart of the proposed data fusion algorithm using multiple sensors.
object detection and proposed avoidance system will be
applied.
Once an object is detected, the remote server will trigger the appropriate audio message, and the FEZ Spider board will send a signal to play the message through the audio module in the case of mode 1. If the user selects mode 2, the user is asked for the destination address through the microphone module. The desired address is then sent to the speech recognition server through the FEZ Spider board and validated.
FIGURE 6. The block diagram of static/dynamic object detection process.
FIGURE 7. (a) Camera Module for object detection, (b) The camera view range.
Information about the user’s location and orientation will
be retrieved from the GPS, gyroscope, and compass sensors.
The three-axis gyroscope is used to provide information about
the changes of the user’s movement and orientation. Hence,
we use the gyroscope as a black box to determine the user's orientation and the tilt of the device from a 90-degree angle, that is, to measure whether the camera is tilted enough to face slightly toward the floor. The camera should face slightly downward so that it captures only a small area within a few meters. For example, when the user is walking outside, the camera might otherwise capture the scene 100 m ahead, which is unnecessary information at that time. We have determined that capturing 9 meters in front of the user is enough and keeps the use of resources under control; otherwise, the system would be slower in processing the captured data and would not be energy efficient. The output of this multi-sensory data will
be fed into the GIS to generate a map and provide guidance
information using audio messages. In the scenario in which an
obstacle appears in this scene and the user should receive nav-
igational information, both messages will be combined into
one message and then sent to the user. The obstacle avoidance warning message precedes the navigational information, and the two are joined with the word "then"; for example, "slight left, then turn right". Thus, the proposed
system performs indoors as a static/dynamic obstacle detec-
tion and avoidance system and outdoors as a combination of a
static/dynamic obstacle detection and avoidance system and
navigator.
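A rough illustration of this tilt check and of how the avoidance and navigation messages are combined with the word "then" is given in the following C# sketch. It is a sketch under stated assumptions: the flat-ground coverage model, the method names, and the tolerance values are ours and are not taken from the authors' firmware.

// Sketch only: estimate the ground distance covered by a downward-tilted camera and
// combine an obstacle-avoidance message with a navigation message using "then".
using System;

public static class GuidanceHelpers
{
    // Approximate ground distance covered when the camera is tilted 'tiltDegrees' below horizontal
    // and mounted 'cameraHeightMeters' above the ground (simple flat-ground model).
    public static double GroundCoverageMeters(double cameraHeightMeters, double tiltDegrees)
    {
        double tiltRadians = tiltDegrees * Math.PI / 180.0;
        return cameraHeightMeters / Math.Tan(tiltRadians);
    }

    // True when the tilt keeps the coverage near the desired range (about 9 m in this work).
    public static bool TiltIsAcceptable(double cameraHeightMeters, double tiltDegrees,
                                        double desiredRangeMeters = 9.0, double toleranceMeters = 1.0)
    {
        double coverage = GroundCoverageMeters(cameraHeightMeters, tiltDegrees);
        return Math.Abs(coverage - desiredRangeMeters) <= toleranceMeters;
    }

    // The obstacle warning precedes the navigation guidance, joined with "then".
    public static string CombineMessages(string avoidanceMessage, string navigationMessage)
    {
        if (string.IsNullOrEmpty(avoidanceMessage)) return navigationMessage;
        if (string.IsNullOrEmpty(navigationMessage)) return avoidanceMessage;
        return avoidanceMessage + ", then, " + navigationMessage;
    }
}

For example, CombineMessages("slight left", "turn right") yields "slight left, then, turn right", matching the message format described above.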
B. STATIC/ DYNAMIC OBSTACLE DETECTION USING
CAMERA MODULES AND COMPUTER
VISION TECHNIQUES
1) EXTRACTION OF INTEREST POINTS USING THE
COMBINATION OF ORIENTED FAST AND
ROTATED BRIEF (ORB) ALGORITHM
Figure 6 illustrates the process of the object detection sys-
tematically using computer vision methods. The camera dis-
played in Figure 7 (a) is from GHI Electronics [75]; it is a
serial camera with a resolution of 320 × 240 and a maximum frame rate of 20 fps. We use two camera modules in our
framework to cover a wider view of the scene and then
stitch the various camera views into one view. Figure 7 (b)
demonstrates the use of two cameras to detect objects on
edges and objects that cannot be noticed when using one
camera.
The Oriented FAST and Rotated BRIEF (ORB) is the
approach that we applied for static/dynamic object detection.
ORB is characterized by a fast computation for panorama
stitching and low power consumption. The ORB algorithm is open source; it was presented by Ethan Rublee, Vincent Rabaud, Kurt Konolige, and Gary R. Bradski in 2011 as a suitable alternative to SIFT due to its effective matching performance and low computing cost [76]. Unlike other
extraction algorithms, ORB has a descriptor. Therefore, ORB
is an integration of the modified Features from Accelerated
Segment Test (FAST) detector [77] and the modified Binary
Robust Independent Elementary Features (BRIEF) descrip-
tor [78]. FAST was chosen because it is sufficiently fast for
real-time applications compared with other detectors. The
modified version of FAST is termed oriented FAST (oFAST).
Key points are selected by FAST [77].
FIGURE 8. Segments test detector [80].
As demonstrated in Figure 8, $P$ is the candidate point and $I_p$ is the intensity of the candidate point. An appropriate threshold $t$ is selected. A circle of 16 pixels around the candidate point is referred to as its neighborhood. Of these 16 pixels, $N$ consecutive pixels need to satisfy the following equation (1):
$$|I_x - I_p| > t \qquad (1)$$
where $I_x$ is the intensity of a pixel on the surrounding circle. The $N$ top points are filtered by the Harris corner measure [79].
The intensity-weighted centroid $C$ is calculated with the located corner at the center to make FAST rotation invariant. The moments of a circular patch of radius $r$ are calculated over its pixels $(x, y)$ as follows:
$$m_{pq} = \sum_{x,y} x^{p} y^{q} I(x, y) \qquad (2)$$
In addition, the centroid $C$ can be calculated by applying (2):
$$C = \left( \frac{m_{10}}{m_{00}}, \frac{m_{01}}{m_{00}} \right) \qquad (3)$$
The orientation is calculated from the direction of the vector from the corner point to the centroid point, as shown in (4):
$$\theta = \operatorname{atan2}(m_{01}, m_{10}) \qquad (4)$$
BRIEF is a binary string representation of an image patch $p$ [78]. $\tau$ is a binary test on $n$ pairs of pixel points, defined as shown in (5):
$$\tau(p; x, y) := \begin{cases} 1 & : p(x) < p(y) \\ 0 & : p(x) \ge p(y) \end{cases} \qquad (5)$$
where $p(x)$ is the intensity of the patch $p$ at point $x$. $\tau$ represents one binary test, whereas $f_n$ represents $n$ binary tests. In (6), $f_n$ is the $n$-bit binary string that forms the descriptor of the feature point:
$$f_n(p) := \sum_{1 \le i \le n} 2^{\,i-1}\,\tau(p; x_i, y_i) \qquad (6)$$
BRIEF can change its direction based on the orientation. For each set of $n$ binary tests of features at locations $(x_i, y_i)$, we define a matrix of size $2 \times n$:
$$S = \begin{pmatrix} x_1, & \ldots, & x_n \\ y_1, & \ldots, & y_n \end{pmatrix} \qquad (7)$$
where $S$ stores the set of pixel coordinates and $S_\theta$ is the rotation of $S$ using the patch orientation $\theta$. The steered version is determined as follows:
$$S_\theta = R_\theta S \qquad (8)$$
The modified (steered) version of BRIEF can be denoted as in (9):
$$g_n(p, \theta) := f_n(p) \mid (x_i, y_i) \in S_\theta \qquad (9)$$
Each angle is a multiple of 12 degrees, and a lookup table of precomputed BRIEF patterns is created. As long as the key point orientation $\theta$ is consistent, the correct set of points $S_\theta$ will be used to compute its descriptor [80].
The descriptors of extracted features will be the output of
this step, which will be fed to the descriptor matcher KNN.
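To make the preceding steps concrete, the following C# sketch illustrates the segment test of equation (1), in the simplified absolute-difference form given above, and the intensity-centroid orientation of equations (2)-(4). It is illustrative only: the class and method names are ours, and a deployed system would normally rely on an existing ORB implementation rather than this code.

// Illustrative sketch of the oFAST ideas in equations (1)-(4).
// The image is a grayscale byte[,] indexed [y, x]; callers must keep (x, y) far enough
// from the borders for the circle (3 pixels) and the orientation patch (radius pixels).
using System;

public static class OFastSketch
{
    // Standard 16-pixel Bresenham circle of radius 3 around the candidate pixel, as {dx, dy}.
    static readonly int[,] Circle =
    {
        {0,-3},{1,-3},{2,-2},{3,-1},{3,0},{3,1},{2,2},{1,3},
        {0,3},{-1,3},{-2,2},{-3,1},{-3,0},{-3,-1},{-2,-2},{-1,-3}
    };

    // Equation (1): the candidate passes if n consecutive circle pixels differ from Ip by more than t.
    public static bool IsFastCorner(byte[,] img, int x, int y, int t, int n = 12)
    {
        int ip = img[y, x];
        int run = 0;
        for (int i = 0; i < 32; i++)                 // wrap around once to catch runs crossing index 0
        {
            int k = i % 16;
            int ix = img[y + Circle[k, 1], x + Circle[k, 0]];
            run = Math.Abs(ix - ip) > t ? run + 1 : 0;
            if (run >= n) return true;
        }
        return false;
    }

    // Equations (2)-(4): patch moments and intensity-centroid orientation around a corner.
    public static double Orientation(byte[,] img, int cx, int cy, int radius)
    {
        double m10 = 0, m01 = 0;
        for (int dy = -radius; dy <= radius; dy++)
            for (int dx = -radius; dx <= radius; dx++)
            {
                if (dx * dx + dy * dy > radius * radius) continue;   // circular patch only
                int intensity = img[cy + dy, cx + dx];
                m10 += dx * intensity;                               // m10 = sum of x * I(x, y)
                m01 += dy * intensity;                               // m01 = sum of y * I(x, y)
            }
        return Math.Atan2(m01, m10);                                 // theta = atan2(m01, m10)
    }
}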
2) DESCRIPTOR MATCHING USING K-NEAREST
NEIGHBOR (KNN) ALGORITHM
We employed the K-Nearest Neighbor (KNN) algorithm to match the descriptors of extracted interest points between two frames to determine an object's presence. In this paper, we use the Brute Force matcher, which is the simple version of KNN. In our case, the Brute Force matcher matches the closest K corresponding descriptors of extracted points with the descriptor of each selected interest point in a frame by trying every corresponding descriptor of interest points in the corresponding frame. The Hamming distance is applied between each pair since the ORB descriptor is a binary string. Each descriptor of an interest point is represented as the vector f, which was generated by BRIEF. For each bit position, if the descriptors of the two interest points are equal, the result is 0; otherwise, the result is 1. The Hamming distance thus ensures correct matching by counting the number of attributes in which the two instances differ.
Let $K = 2$; that is, for each extracted point $p_i$, KNN finds the two nearest matched points $t_{i1}, t_{i2}$ in the next frame. We chose $K = 2$ because we run the algorithm on streaming video, where objects may shift slightly from the reference frame to the next frame. The distances between $p_i$ and $t_{i1}, t_{i2}$ are $d_{i1}, d_{i2}$. We retain the pair $(p_i, t_{i1})$ if a significant difference between $d_{i1}$ and $d_{i2}$ is observed; if the two distances are close, we eliminate the points as mismatches [80]. The corresponding interest points are counted as a correct match if the ratio $d_{i1}/d_{i2}$ is less than 0.8 [81].
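The following C# sketch illustrates this matching step: a bit-count Hamming distance between binary descriptors and a brute-force two-nearest-neighbor search with the 0.8 ratio test. The types and method names are assumptions for illustration, not the implementation used in the framework.

// Sketch of Hamming-distance matching with the ratio test; ORB descriptors are byte arrays.
using System;
using System.Collections.Generic;

public static class MatcherSketch
{
    public static int HammingDistance(byte[] a, byte[] b)
    {
        int distance = 0;
        for (int i = 0; i < a.Length; i++)
        {
            int v = a[i] ^ b[i];                          // bits that differ in this byte
            while (v != 0) { distance++; v &= v - 1; }    // count the set bits
        }
        return distance;
    }

    // Returns (queryIndex, trainIndex) pairs whose best distance d1 satisfies d1/d2 < ratio.
    public static List<Tuple<int, int>> RatioTestMatch(List<byte[]> query, List<byte[]> train,
                                                       double ratio = 0.8)
    {
        var matches = new List<Tuple<int, int>>();
        for (int q = 0; q < query.Count; q++)
        {
            int best = int.MaxValue, secondBest = int.MaxValue, bestIdx = -1;
            for (int t = 0; t < train.Count; t++)
            {
                int d = HammingDistance(query[q], train[t]);
                if (d < best) { secondBest = best; best = d; bestIdx = t; }
                else if (d < secondBest) { secondBest = d; }
            }
            // Keep only matches whose best distance is clearly smaller than the second best.
            if (bestIdx >= 0 && secondBest > 0 && (double)best / secondBest < ratio)
                matches.Add(Tuple.Create(q, bestIdx));
        }
        return matches;
    }
}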
The K-Nearest Neighbor Algorithm finds the best match of
the descriptor of an extracted interest point to the correspond-
ing descriptor. RANSAC then reduces false positive matches, i.e., cases in which the presence of an object is indicated but no actual object exists.
FIGURE 9. The performance of RANSAC for line fitting [82].
3) ELIMINATING FALSE MATCH USING RANDOM
SAMPLE CONSENSUS (RANSAC)
We employed RANSAC to eliminate false matches, which are termed outliers. RANSAC is a robust estimation algorithm that can identify and eliminate outliers even when they constitute a significant portion (more than 50%) of the data. The outliers in Figure 9 are denoted by red dots; they do not have any influence on the result, which is represented in blue. RANSAC assumes that at least one subset of the data satisfies a certain model, whereas the distribution of the outliers does not. Therefore, RANSAC refines the dataset by estimating the model parameters that involve the highest number of good matches [82].
The threshold distance $t$ is calculated by assuming that the probability that a point in the set is an inlier is $\alpha$ and that the distribution of the inlier points is known. The value of the threshold $t$ can be computed if the inlier points follow a Gaussian distribution with variance $\sigma^2$ and zero mean.
As a sum of squared Gaussian variables, the squared distance $d^2$ between two points obeys the chi-square distribution $\chi^2_m$ with $m$ degrees of freedom. A random variable that follows a chi-square distribution has a cumulative probability of being lower than the integral's upper limit, which can be represented as follows:
$$F_m(k^2) = \int_0^{k^2} \chi^2_m(\xi)\, d\xi \qquad (10)$$
The threshold distance is computed as follows:
$$t^2 = F_m^{-1}(\alpha)\,\sigma^2 \qquad (11)$$
Inliers and outliers can then be classified as effective or non-effective points as follows:
$$\text{Inlier: } d^2 < t^2, \qquad \text{Outlier: } d^2 \ge t^2 \qquad (12)$$
$N$ is the number of iterations; it needs to be sufficiently high to obtain a probability $p$ that at least one sample set contains no outliers. Assuming that the probability of a point being an outlier is $v = 1 - u$ and that each sample contains the minimum number of points $m$, the $N$ iterations satisfy:
$$1 - p = (1 - u^m)^N \qquad (13)$$
Thus, $N$ is expressed as
$$N = \frac{\log(1 - p)}{\log\left[1 - (1 - v)^m\right]} \qquad (14)$$
RANSAC is randomly iterated N times to determine the
inliers and outliers.
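The two quantities used above can be computed directly, as in the following C# sketch of the iteration count of equation (14) and the inlier test of equations (11)-(12). The chi-square inverse value is assumed to be supplied as a precomputed constant (for example, roughly 5.99 for m = 2 and alpha = 0.95).

// Sketch of the RANSAC iteration count and inlier threshold.
using System;

public static class RansacSketch
{
    // Equation (14): N = log(1 - p) / log(1 - (1 - v)^m), where p is the desired probability of
    // drawing at least one outlier-free sample, v the outlier ratio, and m the sample size.
    public static int RequiredIterations(double p, double v, int m)
    {
        double denominator = Math.Log(1.0 - Math.Pow(1.0 - v, m));
        return (int)Math.Ceiling(Math.Log(1.0 - p) / denominator);
    }

    // Equations (11)-(12): t^2 = F_m^{-1}(alpha) * sigma^2; a point is an inlier if d^2 < t^2.
    public static bool IsInlier(double squaredDistance, double sigma, double chiSquareInverseAtAlpha)
    {
        double tSquared = chiSquareInverseAtAlpha * sigma * sigma;
        return squaredDistance < tSquared;
    }
}

For example, RequiredIterations(0.99, 0.5, 4) returns 72, i.e., about 72 iterations suffice for a 99% chance of one outlier-free 4-point sample when half of the matches are outliers.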
K-Means clustering will be applied to valid points to create
a cluster for each object based on the detected corners.
4) FOREGROUND OBJECT EXTRACTION USING MODIFIED K-MEANS CLUSTERING
We employ the K-Means clustering technique to cluster the $n$ extracted points of a particular frame. K-Means is a well-known clustering analysis technique; many approaches prefer it due to its simplicity and suitability for large datasets. The purpose of the K-Means technique in this paper is to assign the $n$ extracted points $p_1, p_2, \ldots, p_n$ to $k$ clusters $\{s_1, s_2, \ldots, s_k\}$, where $k$ is the maximum number of clusters and $k < n$, by minimizing
$$\sum_{i=1}^{n} D\!\left(p_i, \text{Center}(S_k)\right), \quad \text{where } p_i \in S_k \qquad (15)$$
$\text{Center}(S_k)$ is the centroid of cluster $S_k$; it is calculated as the mean of the data points assigned to it and depends on the number of desired clusters. The centroid points are initially selected at random. Each feature point is assigned to the closest centroid based on the calculated distance $D$; groups are thereby formed and distinguished from each other. In this study, we set $k = 6$ for each frame based on our observations.
FIGURE 10. Demonstration of merging two clusters into one cluster.
However, more than one cluster may represent the same
object. Therefore, a merging method needs to be applied in
the case of any intersections among the clusters. Figure 10
represents the modification that we made to the K-Means
clustering technique.
FIGURE 11. Representation of the proposed object detection technique; (a) original frame, (b) the frame after applying proposed
sequence of algorithms for object detection and (c) the frame after applying K-means clustering and merging to identify each
object.
Algorithm 1 Modified K-Means Clustering Algorithm
Input: A set of points p1, p2, ..., pn
Output: A set of K clusters s1, s2, ..., sk
N ← number of interest points
K ← number of clusters
Cs ← centroid point of cluster s
While (K < N)
    Assign each point to the closest centroid to form K clusters
    Recompute the centroid of each cluster
End while
While (true)
    For (i = 0; i <= k; i++)
        For (j = i + 1; j <= k; j++)
            If ((Sj ∩ Si ≠ ∅) ∧ (CSj lies within Si))
                Merge Sj into Si
            End if
        End for
    End for
End while
Algorithm 1 shows the steps for clustering the closest neighbors of each centroid and merging the resulting clusters. Two clusters are merged into one cluster if ($S_1 \cap S_2 \ne \emptyset$) AND the centroid $C_{S_2}$ lies within $S_1$; in that case, $S_2$ is merged into $S_1$. Otherwise, merging does not occur even if $S_1 \cap S_2 \ne \emptyset$.
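As an illustration of this merge test, the following C# sketch approximates each cluster's spatial extent by the bounding box of its points, which is an assumption on our part, and merges two clusters only when the boxes intersect and the centroid of one lies inside the other.

// Sketch of the cluster merge test; cluster extent is approximated by a bounding box.
using System.Collections.Generic;
using System.Drawing;

public static class ClusterMergeSketch
{
    public static RectangleF BoundingBox(List<PointF> pts)
    {
        float minX = float.MaxValue, minY = float.MaxValue, maxX = float.MinValue, maxY = float.MinValue;
        foreach (var p in pts)
        {
            if (p.X < minX) minX = p.X;
            if (p.X > maxX) maxX = p.X;
            if (p.Y < minY) minY = p.Y;
            if (p.Y > maxY) maxY = p.Y;
        }
        return RectangleF.FromLTRB(minX, minY, maxX, maxY);
    }

    public static PointF Centroid(List<PointF> pts)
    {
        float sx = 0, sy = 0;
        foreach (var p in pts) { sx += p.X; sy += p.Y; }
        return new PointF(sx / pts.Count, sy / pts.Count);
    }

    // Merge Sj into Si only if the extents intersect and Sj's centroid lies within Si's extent.
    public static bool ShouldMerge(List<PointF> si, List<PointF> sj)
    {
        RectangleF boxI = BoundingBox(si);
        RectangleF boxJ = BoundingBox(sj);
        return boxI.IntersectsWith(boxJ) && boxI.Contains(Centroid(sj));
    }
}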
The result of the combination of adopted algorithms is
represented in Figure 11. The two green dots, which are
represented on the floor in Figure 11 (b), denote the detection
range from where the user is standing.
C. PROPOSED PROXIMITY MEASUREMENT METHOD FOR
A RELIABLE COLLISION AVOIDANCE SYSTEM
Existing systems use sensors to measure the distance between
the user and the obstacle; a technique that supports distance
measurement for this type of system is not available. In this
section, we propose a proximity measurement methodology
to approximately measure the distance between the user
and the obstacle using mathematical models. The proposed
approach is based on the camera that faces a slight angle
down to have a fixed distance between a VI person and the
ground. This view enables us to have a reference to determine
whether an object is an obstruction. We have determined that
the average distance between a VI person and the ground
is 9 meters with the device facing down on an angle. This
result enables us to identify an obstacle within a 9-meter
range; however, a VI person would only need to react to
an object within the 3-meter range. Our proposed method
divides the frame into three areas—left, right, and center—
as shown in Figure 12.
FIGURE 12. Approximate distance measurement for object avoidance.
We have assumed that an object in the upper part of the frame is farther away than an object in the lower half and that an object detected in the lower half is an obstruction to the VI person. We can represent the frame in an xy coordinate system. Let $W$ be the width and $H$ be
FIGURE 13. Flowchart of the collision avoidance algorithm.
the height. The calculation of the right and left is expressed as follows:
$$\left(\tfrac{1}{3}H,\ \tfrac{2}{5}W\right) \;\&\; \left(\tfrac{1}{3}H,\ \tfrac{3}{5}W\right) \qquad (16)$$
Equation (16) represents the corners of the middle area, where we detect objects and inform the VI person that an obstacle is in front of them. Two green dots, placed at a height of $\tfrac{1}{3}H$ of the frame, represent the threshold of the collision-free area. Objects between the two green dots and the start point must be avoided. An object is deemed an obstruction if and when its lower corners, represented by $(x_{min1}, y_{min})$ and $(x_{min2}, y_{min})$, enter below the area defined by equation (16). If an object exists in front of the
VI person, an alternative path is required. We determine this
path by searching for an object on the left or right of the area
enclosed by (16). If no objects are detected on the left side
of equation (16), the system issues a turn left and go straight
command to the VI person. If an object on the left is detected,
then the system searches for an object on the right. If an object
on the right is not detected, then a turn right and go straight
command is issued to the VI person. If objects are detected
on the left and right sides and middle, the system issues a stop
and wait command until a suitable path is identified for the
VI person to continue. We also calculated a 20% margin on each side of the middle area to provide accurate information to the user. If an obstacle lies only within one of these 20% margins of the middle area, but not both, the user does not need to move to the other side.
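For the 320 × 240 frames used in this work, the boundaries implied by equation (16) can be worked out as in the short C# example below; treating the 20% margin as one fifth of the middle area's width is our reading of the text rather than a value stated explicitly.

// Worked example of the region boundaries for a 320 x 240 frame.
using System;

public static class RegionBoundsExample
{
    public static void Main()
    {
        int width = 320, height = 240;

        int rowThreshold = height / 3;          // 1/3 H = 80 (vertical threshold marked by the green dots)
        int xCorner1 = 2 * width / 5;           // 2/5 W = 128 (left edge of the middle area)
        int xCorner2 = 3 * width / 5;           // 3/5 W = 192 (right edge of the middle area)
        int margin = (xCorner2 - xCorner1) / 5; // 20% of the middle area = 12 pixels (integer division)

        Console.WriteLine("middle area: x in [" + xCorner1 + ", " + xCorner2 + "], rows beyond " + rowThreshold);
        Console.WriteLine("20% margins: [" + xCorner1 + ", " + (xCorner1 + margin) + "] and ["
                          + (xCorner2 - margin) + ", " + xCorner2 + "]");
    }
}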
Figure 13 displays a flowchart of the collision avoidance
system. Each preprocessed frame is divided into left, middle,
and right parts. Figure 13 shows the parallel process of simultaneously applying the collision-free approach to the three areas. Although the proposed approach is applied to the middle area (where a collision-free path is needed), a quick scan is run on both the right and left areas to ensure a free path for the user in case an obstacle appears in front of the user in the middle area. Audio feedback is the output of this algorithm. Previous studies indicated that audio feedback is a better choice than tactile feedback because the user becomes habituated to tactile feedback and loses sensitivity in that particular area of the body. This algorithm is applied recursively, comparing each frame with the previous frame; the algorithm considers the previous frame to be the reference frame. Tactile feedback remains a suitable option for people who are hearing impaired.
Table 2 lists the conditions and the audio feedback that the user will receive. We decided to use the left area as the default direction when an obstacle appears in the middle but both the left and right areas are free, to avoid any confusion.
Algorithm 2 demonstrates the proposed distance measure-
ment approach for collision avoidance.
1) FUZZY LOGIC CONTROLLER
In order to implement the abovementioned strategy, we use fuzzy logic to determine the precise action the user should take to avoid front obstacles based on multiple inputs. Figure 14 shows the fuzzy controller system for the obstacle avoidance algorithm, which includes a fuzzifier that converts the inputs into a number of fuzzy sets based on the
TABLE 2. Audio feedback of the obstacle avoidance system based on
certain conditions.
FIGURE 14. The Fuzzy structure for obstacle avoidance system.
defined variables and membership functions; an inference engine that generates fuzzy results based on the fuzzy rules; and a defuzzifier that maps each fuzzy output through the membership functions to obtain the precise output the user should follow [83]. We used MATLAB R2017b software to implement the fuzzy logic rules.
Step 1 (Input and Output Determination): The proposed system has seven input variables. These inputs are based on the position of the detected obstacles, the obstacle range {far, near}, and the user position {the user's location within the frame}. They are denoted as {ObsRange, UserPosition, ObsLeft, Obs20%LeftMid,
ObsMiddle, Obs20%RightMid, and ObsRight}. The output
is the feedback that the user needs for a path to the endpoint
(audio feedback that is sent through headphones).
Step 2 (Fuzzification): We have divided each input into membership functions. Since the user wears the device on his/her chest, there are only three options in terms of the user's position: {Left, Middle, Right}.
Algorithm 2 Object Avoidance Algorithm
Input: An array of detected objects
Output: Warning message in audio format
rows ← firstFrame.Rows / 3
Xcorner1 ← 2 * firstFrame.Cols / 5
Xcorner2 ← 3 * firstFrame.Cols / 5
For.parallel (Objects_i)
    x ← objects_i.x
    y ← objects_i.y
    ymin ← y + objects_i.height
    xmax ← objects_i.x + objects_i.width
    If (ymin >= rows && x >= Xcorner1 && xmax <= Xcorner2)
        Middle ← true
    Else if (ymin >= rows && xmax >= Xcorner2 && x <= Xcorner2)
        twentypercenttoright ← Xcorner2 − twenty
        If (x >= twentypercenttoright)
            MiddleRight ← true
        End if
    Else if (ymin >= rows && xmax >= Xcorner1 && x <= Xcorner1)
        twentypercenttoleft ← Xcorner1 + twenty
        If (xmax <= twentypercenttoleft)
            MiddleLeft ← true
        End if
    Else if (ymin >= rows && (xmax <= Xcorner1 && x <= Xcorner1))
        Left ← true
    Else if (ymin >= rows && (x >= Xcorner2 && xmax >= Xcorner2))
        Right ← true
    End If
End For
For.parallel (Objects_i)
    If (Middle || MiddleLeft || MiddleRight)
        If (Middle || (MiddleRight && MiddleLeft))
            If (!Left)
                Output: "move left"
            Else if (!Right)
                Output: "move right"
            Else
                Output: "Stop"
            End if
            ......... \\ More if statements
        End If
    Else if (MiddleLeft && !MiddleRight)
        Output: "Slight Right then straight"
    End If
End For
However, since we are using two cameras and the processed frames of the two cameras are always stitched into one, the user's position is always in the middle. Therefore, the membership function of the user's position is denoted as shown in Figure 15. The range of this membership function is 300 cm, which is considered to be the width of the scene.
FIGURE 15. Membership function for the user’s position.
TABLE 3. Definition of the user's position variables.
The membership function used is the Gaussian function, which is represented in (17) using the middle value $m$ and $\sigma > 0$. As $\sigma$ gets smaller, the bell gets narrower.
$$G(x) = \exp\left[-\frac{(x - m)^2}{2\sigma^2}\right] \qquad (17)$$
Table 3 describes the terms of the user's position, and the obstacle's position is described in Table 4. The obstacle's range is divided into two membership functions {Near, Far} over the scene's height, which is [0–900 cm]. The threshold is set to 300 cm: the obstacle is near if it lies within the range [0–300 cm] and far if it is farther than 300 cm. Fig. 16 represents the membership function of the obstacle's range within the height of the scene (frame or view). In addition, the obstacle's position is divided into {ObsLeft, Obs20%LeftMid, ObsMiddle, Obs20%RightMid, ObsRight}. To have more control over the fuzzy rules, we divided each part of the obstacle's position into two membership functions indicating whether the obstacle exists or does not exist {ObsEx, Obs_NEx}.
FIGURE 16. Membership function of object presence in two ranges of the scene.
FIGURE 17. Membership function for obstacle’s position in the left side.
Figure 17 represents the membership function of the obstacle's position on the left side of the scene. The same form of function is used for the remaining obstacle-position inputs. Negative values indicate that the obstacle does not exist on that side, whereas positive values indicate the existence of the obstacle on that side. Assume the value of the obstacle's position is $x$, where $x \in R$ for a range $R$. Four parameters $[i, j, k, l]$ are used to express the trapezoidal membership function in the following equation (18):
$$\mu_{trap}(x; i, j, k, l) = \max\left(\min\left(\frac{x - i}{j - i},\ 1,\ \frac{l - x}{l - k}\right),\ 0\right) \qquad (18)$$
The output is divided into six membership functions that are based on the fused input variables. The output can be {MoveLeft, SlightLeftStraight, GoStraight, SlightRightStraight, MoveRight, Stop}. We used the trapezoidal membership function for the MoveRight and
FIGURE 18. Membership function of the output (feedback/directions).
MoveLeft membership values. However, we used the triangular membership function, as shown in Fig. 18, to represent {SlightLeftStraight, GoStraight, SlightRightStraight, MoveLeft, MoveRight, and Stop}. The modal value $m$, lower limit $a$, and upper limit $b$ define the triangular membership function, where $a < m < b$. This function can be expressed as in (19):
$$A(x) = \begin{cases} 0 & \text{if } x \le a \\ \dfrac{x - a}{m - a} & \text{if } x \in (a, m) \\ \dfrac{b - x}{b - m} & \text{if } x \in (m, b) \\ 0 & \text{if } x \ge b \end{cases} \qquad (19)$$
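For reference, the three membership functions of equations (17)-(19) can be implemented directly, as in the short C# sketch below; it is illustrative only, and the parameters passed to these functions are not the paper's actual fuzzy-set parameters.

// Sketch of the Gaussian, trapezoidal, and triangular membership functions.
using System;

public static class MembershipFunctions
{
    // Equation (17): Gaussian membership with center m and width sigma > 0.
    public static double Gaussian(double x, double m, double sigma)
        => Math.Exp(-((x - m) * (x - m)) / (2.0 * sigma * sigma));

    // Equation (18): trapezoidal membership with corners i <= j <= k <= l.
    public static double Trapezoidal(double x, double i, double j, double k, double l)
        => Math.Max(Math.Min(Math.Min((x - i) / (j - i), 1.0), (l - x) / (l - k)), 0.0);

    // Equation (19): triangular membership with lower limit a, modal value m, upper limit b.
    public static double Triangular(double x, double a, double m, double b)
    {
        if (x <= a || x >= b) return 0.0;
        return x < m ? (x - a) / (m - a) : (b - x) / (b - m);
    }
}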
Step 3 (Creating Fuzzy Rules): The fuzzy rules are produced by observing and employing the knowledge introduced in Table 3 and Table 4, the membership functions, and the variables. The rules were implemented using the five conditions of the obstacle's position, the obstacle's range, and the user's position. There are 18 rules in the fuzzy controller system; the implemented rules are presented in Appendix A.
TABLE 4. Definition of the obstacle position’s variables.
We use fuzzy logic operators to connect the membership values: AND represents the minimum of two values, whereas OR represents the maximum of two values. Let $\mu_\gamma$ and $\mu_\delta$ be two membership values; the fuzzy AND is then described as in (20):
$$\mu_\gamma \text{ AND } \mu_\delta = \min(\mu_\gamma, \mu_\delta) \qquad (20)$$
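The following C# fragment sketches how a rule's firing strength follows from equation (20); the example rule is ours and illustrates only the min/max mechanics, not one of the 18 rules listed in Appendix A.

// Sketch of fuzzy AND/OR and of a single illustrative rule's firing strength.
using System;

public static class FuzzyRuleSketch
{
    public static double And(double a, double b) => Math.Min(a, b);   // fuzzy AND, equation (20)
    public static double Or(double a, double b) => Math.Max(a, b);    // fuzzy OR

    // Illustrative rule: IF obstacle is Near AND an obstacle exists in the Middle AND the Left is free
    // THEN MoveLeft; the firing strength is the minimum of the antecedent memberships.
    public static double MoveLeftStrength(double nearDegree, double obstacleMiddleDegree, double leftFreeDegree)
        => And(And(nearDegree, obstacleMiddleDegree), leftFreeDegree);
}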
Step 4 (Defuzzification): Defuzzification is the last step of the fuzzy controller system. The output is produced based on the set of inputs, the membership functions and values, and the fuzzy rules. The defuzzified effect of the user's position and the obstacle's position on the feedback is calculated using the Largest of Maximum (LOM) defuzzification method. Figure 19 illustrates the surface viewer, which displays the output surface for the combinations of the obstacle's position and the user's position. The user thus receives accurate and precise feedback for avoiding front obstacles based on the combination of the described membership values.
FIGURE 19. The Surface Viewer that examines the output surface of an
FIS for obstacle’s position and user’s position using fuzzy logic toolbox.
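A minimal sketch of LOM defuzzification over a sampled output membership is shown below; it returns the largest output value at which the aggregated membership reaches its maximum, consistent with the largest-of-maximum definition (the exact MATLAB behavior is not reproduced here).

// Sketch of Largest of Maximum (LOM) defuzzification over sampled membership values.
using System;

public static class DefuzzificationSketch
{
    // xs: sampled output values; memberships: aggregated membership at each sample.
    public static double LargestOfMaximum(double[] xs, double[] memberships)
    {
        double maxMembership = double.MinValue;
        double result = xs[0];
        for (int i = 0; i < xs.Length; i++)
        {
            // ">=" keeps the largest x among equally maximal memberships.
            if (memberships[i] >= maxMembership)
            {
                maxMembership = memberships[i];
                result = xs[i];
            }
        }
        return result;
    }
}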
In summary, fuzzy logic is used to keep the VI person from colliding with obstacles in front of them. The fuzzy system was built based on the user's position, the obstacle's position, and one output. After the device's initialization step, the information about the obstacle and user positions is fed to the fuzzy controller, and the decision is made based on the 18 fuzzy rules. The resulting feedback is sent to the user through the headphones. The whole process is employed recursively. If no obstacle exists, the user continues along his/her path (straight) with no change.
V. IMPLEMENTATION AND EXPERIMENT SETUP
The aim of the developed device is to detect static and dynamic objects, avoid obstacles, and provide navigational information. This section describes the experimental setup and the test plan.
TABLE 5. Calculation of the power consumption of the components.
A. DESIGN STRUCTURE OF THE PROPOSED SYSTEM
The device was designed to facilitate the user’s mobility by
providing appropriate navigational information. We used the C# programming language. Table 5 shows the power consumption of all modules. The system is built using the .NET
Gadgeteer compatible mainboard and modules from GHI
Electronics [76].
The software implementation is built on top of the
following SDKs using Visual Studio 2013:
NETMF SDK 4.3
NETMF and Gadgeteer Package 2013 R3
Microsoft introduced .NET Gadgeteer as an open-source platform for designing electronic devices by taking advantage of object-oriented programming and integrating Visual Studio and the .NET Micro Framework [84]. .NET Gadgeteer is considered to be a tool for connecting a mainboard with electronic components. A well-known company that offers a variety of mainboards and modules is GHI Electronics.
The FEZ Spider Mainboard is a .NET Gadgeteer-
compatible mainboard from GHI Electronics. The board sup-
ports the features of the .NET Micro Framework core, USB
host, RLP and wi-fi. The mainboard is shown in Figure 20.
B. IMPLEMENTATION AND TEST PLAN
The complete design of our wearable navigational device is
shown in Figure 21. All sensors modules are connected to the
FEZ-Spider mainboard.
We have employed two camera modules for static/dynamic obstacle detection and avoidance. Previous studies have emphasized that wearable devices are more convenient than portable devices for VI people. The wearable device is worn
FIGURE 20. GHI Electronics FEZ Spider Mainboard [76].
FIGURE 21. Hardware architecture.
on the user’s chest. The location of the device on this area of
the user’s body will ensure two things: 1) the device will have
a stable position and will be connected by two belts: the first
belt is on the neck side, and the second belt is on the waist
side. Therefore, the device will not move from its position,
2) this location of the device will enable our system to address
the obstacles under waist level and at head level.
The device was tested in indoor and outdoor scenarios. The
number and shape of the obstacles differed by scenario. Our
system was also tested on a video dataset that was directly fed to the system. A video of the testing experiments is included as supplementary material to this paper.
VI. REAL TIME EXPERIMENTS AND RESULTS
A. REAL TIME SCENARIOS
A set of experiments was performed on the designed device
in indoor and outdoor environments. Simultaneously, frames
FIGURE 22. This figure illustrates different real scenarios that were tested. (a): A snapshot of the first real-time experiment when two objects exist,
(b): A snapshot of the second real-time experiment to avoid dynamic and static objects in a complex environment, (c): A snapshot of the third
real-time experiment for outdoor navigation, (d): A snapshot of the fourth real-time experiment for outdoor navigation using a complex path setup.
will be transmitted to the server through an HTTP request made to the device via its IP address. The GHI system has a built-in web server that can respond to HTTP requests. When the HTTP request is made, the device responds with the video taken by the two camera modules that are mounted on the device.
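A client-side sketch of this frame retrieval is shown below; the URL path is a placeholder, since the exact endpoint exposed by the device's built-in web server is not specified here.

// Sketch of pulling a frame from the device's built-in web server over HTTP.
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class FrameClientSketch
{
    private static readonly HttpClient Client = new HttpClient();

    public static async Task<byte[]> FetchFrameAsync(string deviceIp)
    {
        // Placeholder URL; replace with the path actually served by the device.
        string url = "http://" + deviceIp + "/camera/frame";
        return await Client.GetByteArrayAsync(url);   // raw image bytes returned by the device
    }
}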
We grouped the objects in each scene into two subgroups because some objects should not be considered obstacles unless they are located in front of the user or are blocking his/her path. The first group contains objects that appear in a frame of a particular video but do not obstruct the user; these are termed objects. The second group contains any object with which the user can collide; these we term obstacles. Once obstacles are detected, our proposed measurement method is applied for obstacle avoidance.
Each frame of the streamed video will be framed to three
parts: left, right, and middle areas where the user is standing.
Audio messages will be produced based on the direction that
the user needs to follow.
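The grouping and the three-way split can be sketched as follows; this is a minimal Python illustration, and the normalized bounding-box representation and the assumption that the band nearest the user is the lower third of the image are ours, not the paper's exact formulation.

from dataclasses import dataclass

@dataclass
class Detection:
    x_min: float   # bounding-box coordinates, normalized to [0, 1]
    x_max: float
    y_max: float   # bottom edge of the box (image origin at the top-left corner)

def split_objects(detections, band_start=2.0 / 3.0):
    """Separate plain 'objects' from 'obstacles': only detections whose bottom
    edge falls inside the band closest to the user (assumed here to be the
    lower third of the image) are treated as obstacles."""
    obstacles = [d for d in detections if d.y_max >= band_start]
    objects = [d for d in detections if d.y_max < band_start]
    return objects, obstacles

def region_of(det):
    """Assign a detection to the left, middle, or right third of the frame."""
    center = (det.x_min + det.x_max) / 2.0
    if center < 1.0 / 3.0:
        return "left"
    if center < 2.0 / 3.0:
        return "middle"
    return "right"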
Scenario 1: the first scenario was conducted indoors to
examine the detection algorithm and the proposed method
for avoiding obstacles in a simple environment. The scenario
was conducted in a hall of the Tech building at the University
of Bridgeport. Three obstacles were detected in the scene, whereas other objects were not considered obstacles. While the user was walking through the hall, audio messages were produced to avoid two sequential chairs, as shown in Fig. 22 (a).
Scenario 2: this scenario was also conducted indoors to
detect and avoid objects along the user’s path. Multiple
objects are in this scene. The objective of this scenario is
to test the system for detecting obstacles in close prox-
imity while giving navigational instructions to avoid and
move between the chairs. The chairs were set up in the middle and close to each other to test the accuracy of the avoidance technique in a complex environment, where more than one obstacle is located in close proximity. The arrows in Figure 22(b) illustrate the directions in which the user proceeded while using the device.
Scenario 3: this scenario was conducted outdoors to evalu-
ate outdoor navigation performance. The scenario, which was
conducted outdoors at the University of Bridgeport, is shown
in Figure 22 (c). The objective of this scenario is to test the
sensitivity of the modules to sunlight.
Scenario 4: this scenario was conducted outdoors to evalu-
ate the proposed avoidance algorithm, where multiple objects
exist. This scenario was performed with path planning but
without any setup in a complex outdoor environment with
dynamic objects. The user started from a predefined point
and walked along a path, as shown in Figure 22 (d), to avoid
detected static and dynamic obstacles and safely proceed
along his path.
B. OVERVIEW OF THE FUNCTIONAL CAPABILITY OF
PROPOSED METHOD IN THE SCENARIOS
Table 6 demonstrates the capability and reliability of the static/dynamic object detection algorithm in safely navigating the user through his/her path. The modules, the type of experiment, and the output are included in this table.
C. RESULTS AND EVALUATION
The focus of this evaluation is to provide an efficient and
economically accessible device that assists VI people in navi-
gating indoors and outdoors and detecting dynamic and static
objects. In this section, we present the accuracy based on the results of the real-time scenarios, which are demonstrated in Table 7. Table 8 illustrates the results of examining the
designed system on a video dataset of 30 videos. Each video
has numerous frames to examine the accuracy of the perfor-
mance of the detection system and avoidance of dynamic and
static objects.
TABLE 6. Overview of the results of the four scenarios.
The experiments were run on Windows 7 with a Core i7 processor; the camera has a resolution of 320 × 240 and a maximum frame rate of 20 fps. More intensive testing can be performed
with a larger dataset. Our sequence of algorithms to detect
dynamic and static objects, especially obstacles, yields very
promising results and high accuracy compared with other
algorithms. We have tested our algorithm on a sequence of
videos that is considered to be a challenge for other systems.
The processing time is dependent on the resolution of the
camera; a higher resolution consumes a larger amount of
time. Therefore, we chose a GHI camera module with a rea-
sonable resolution to save time. An accuracy of 96.40% was
obtained for our four pre-prepared scenarios and a small num-
ber of objects. An accuracy of 98.36% was achieved based
on examining the proposed algorithm on a higher number of
videos and a higher number of objects per frame. This finding
indicates that our algorithm adequately performs for crowded
environments with a larger dataset. As shown in Table 7,
scenario 3 has a number of objects that the user does not
encounter and that are not considered to be obstacles. This
scenario was presented to test the sensitivity of the sensors to
sunlight and to test the outdoor performance of our device.
The user was safely guided by the device through his/her
path. Clear and short audio messages were produced within a
reasonable time.
Table 9 describes how closely the microcontroller's decisions match the expected output of the proposed avoidance algorithm. Our tests indicate that the results are promising and accurate in avoiding any obstacles that could cause a collision with the user and in navigating him/her through his/her path to ensure safe mobility.
Figure 23 shows two real-time indoor scenarios that we
recorded while the user employed the system. Figure 24 rep-
resents a real-time outdoor scenario. Snapshots of some
frames at different times were taken to show different outputs.
The figure represents the performance of the system indoors
and outdoors. A blindfolded person was wearing the device.
The system started to give instructions based on detected
obstacles. The user followed the instructions, which were
given through a headset within a reasonable time based on
the user’s report. The user mentioned that the device was
light and easy to use and the instructions were clear; he did
not need any previous knowledge about the surrounding
environment.
D. COMPUTATIONAL ANALYSIS
The collision avoidance approach is used in this framework
to avoid the detected obstacles that the user may collide with.
The user is provided with the avoidance instructions in prede-
fined distance (1/3 height of a frame). Therefore, we have one
scanning level (from the frame start to 1/3 height) where we
scan for a free path to the user. This scanning level is moving
as the user walks.
The scanning level is divided into three areas: left, right, and middle, as shown in Figure 25. The search in each area is based on fuzzy logic and runs within a single ‘‘for loop’’. We first search for the detected obstacles. If there is no obstacle within the scanning level, the user continues straight. If there is an obstacle, the fuzzy rules are applied. Thus, the
proposed obstacle avoidance approach has linear time complexity, O(n), where n is the number of detected front obstacles that need to be avoided. Every time the user passes the threshold (1/3 of the frame height), a new scanning window (1/3 of the frame height) is computed.
TABLE 7. Evaluation results of the proposed framework.
TABLE 8. Evaluation results for the tested dataset.
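Returning to the scanning-level search described above, a minimal, self-contained Python sketch is given below. The crisp left/middle/right rules stand in for the paper's fuzzy rules, whose membership functions are not reproduced here, and the assumption that the scanning level is the lower third of the image is ours.

def avoidance_instruction(obstacle_boxes, band_start=2.0 / 3.0):
    """obstacle_boxes: iterable of (x_center, y_bottom) pairs, normalized to [0, 1].
    One pass over the n detections, hence O(n)."""
    occupied = set()
    for x_center, y_bottom in obstacle_boxes:   # the single 'for loop'
        if y_bottom < band_start:               # outside the scanning level; ignore
            continue
        if x_center < 1.0 / 3.0:
            occupied.add("left")
        elif x_center < 2.0 / 3.0:
            occupied.add("middle")
        else:
            occupied.add("right")
    if "middle" not in occupied:
        return "continue straight"              # path ahead is free
    if "left" not in occupied:
        return "move left"
    if "right" not in occupied:
        return "move right"
    return "stop"                               # all three areas are blocked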
In order to obtain the time complexity of the whole system, we need to add the time complexity of the detection algorithm to the time complexity of the obstacle avoidance algorithm. Most obstacle avoidance algorithms are applied after the SIFT or SURF algorithm for object detection. Our obstacle avoidance approach is applied after the ORB algorithm for object detection, which requires less memory and computation time than the other systems [76]. According to [85] and [86], the time complexity of the ORB algorithm is almost half that of SIFT and SURF. This indicates that our overall system provides a faster and more reliable obstacle avoidance system.
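Written out under this decomposition, with n the number of detected front obstacles:

T_{\text{system}}(n) = T_{\text{detection}} + T_{\text{avoidance}}(n), \qquad T_{\text{avoidance}}(n) = O(n).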
Figure 26 shows the time taken to process five frames, each of which contains a number of obstacles. Furthermore, Figure 26 describes the actual time taken to detect obstacles and avoid them, as well as to send the audio feedback to the user through the headphones. The figure covers the time required by the detection/avoidance algorithms, establishing the HTTP request, and playing the audio feedback. Thus, the complete processing time increases with the number of detected objects, with an overall time complexity of O(n²).
The required processing time for 50 obstacles in one frame is 0.35 s. The serial camera used has a resolution of 320 × 240 and a maximum frame rate of 20 fps. Thus, our system can process roughly three frames per second, which indicates that the proposed system operates in real time, as it was designed for pedestrian use.
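The throughput figure follows directly from the measured per-frame time:

f \approx \frac{1}{0.35\ \text{s/frame}} \approx 2.86\ \text{frames/s},

which is well below the camera's 20 fps ceiling and adequate for pedestrian walking speeds.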
E. DISCUSSION
The objective of this study is to overcome the limitation of
the reviewed systems by designing a new system that sup-
ports missing features in an effective and autonomous design.
Table 10 presents a comparison between the previous systems that were reviewed in Section III and the proposed system; this comparison is based on the user's needs and the engineers' perspectives.
Table 10 focuses on the performance of the systems. The
parameters in Table 10 were chosen based on our in-depth
study [8]. The unavailability of these features can negatively
influence the performance of the systems. The main con-
cerns of the user are the analysis type (real-time or non-
real-time), weight, cost, and performance (outdoors, indoors).
The main concerns of engineers are the type of detected
objects, the range of the detection, and the total accuracy of
the system. Other parameters can be added to Table 10, and
some of the listed parameters can be conjoined for both users
and engineers.
However, the type of the sensors and the techniques
that we discussed in section 3 may explain the limitations.
For example, infrared technology is sensitive to sunlight,
which indicates that systems based on infrared technology
are not suitable for outdoor use [87]. The scope limita-
tion of radio frequency technology makes it less preferable
in this field because the installation of tags is required in
surrounding areas [88]. In addition, systems based on the
Kinect sensor demonstrate a small detection range because
the accuracy of the Kinect sensor decreases when the distance between the object and the camera increases [89], [90].
FIGURE 23. The proposed system is applied to indoor real-time scenarios.
Changes in the environmental parameters can have a significant impact on the performance of ultrasonic sensors [91]. Moreover, ultrasonic sensors have a small detection range.
FIGURE 24. The proposed system is applied to outdoor real-time scenarios.
As shown in Table 10, systems [31], [32], and [41] do
not operate in real time, which indicates that they are in
the research phase. These systems include Silicon Eyes,
RFIWS, and Path Force belts. Approximately 70% of the
reviewed techniques do not fully satisfy the benchmark
table (Table 10). For instance, [22] does not provide indoor
performance, and its detection range is small due to the use of ultrasonic sensors. The integration of sensor-based and computer vision technologies is a solution to these issues: sensor-based systems are limited by the sensors themselves and can behave unpredictably because of the environment's influence, and systems based on computer vision technology have limitations of their own, so combining the two compensates for the weaknesses of each.
TABLE 9. The evaluation of the proposed obstacle avoidance approach.
FIGURE 25. The searching area for the obstacle avoidance approach: the scanning level is divided into left, middle, and right areas.
Furthermore, we have surveyed a large number of published articles, including those that present the rules of O&M [92] for visually impaired people. All of the published work agrees on one point: guidance of the visually impaired requires precise instructions and accurate positioning, and it also needs to be economically accessible [92]. We have designed a system that integrates sensor-based and computer vision techniques. The sequence of (computer vision-based) algorithms used provides an efficient multi-object detection system. Because we are able to locate the user's position and the obstacles' coordinates, the accurate-positioning condition is satisfied.
FIGURE 26. The cost of the proposed approach as a time function for avoiding a number of obstacles in each frame.
TABLE 10. Comparison between the previous systems that were analyzed in section 3 and the proposed system based on the user’s needs.
Based on this study, the O&M instructions, and the benchmark table (Table 10), we suggest that the proposed system stands out in this comparison through its features and its ability to satisfy the expectations of both the user and the engineer with high accuracy, using both sensor-based technology and computer vision technology.
VII. CONCLUSION
In this study, we developed a hardware and software imple-
mentation that provides a framework for a wearable device
that can assist VI people. The system was implemented using
a .NET Gadgeteer-compatible mainboard and modules from
GHI Electronics. This novel electronic travel aid facilitates
the mobility of VI people indoors and outdoors using com-
puter vision and sensor-based approaches.
At the hardware level, the proposed system includes
modules such as GPS, camera, compass, gyroscope, music,
microphone, wi-fi, and a FEZ spider microcontroller. At the
software level, the system was designed based on multi-
sensory data and computer vision approaches to support a
navigational system and produce accurate information.
The proposed measurement method enables us to approximately measure the distance between the user and an object. This method enables the user to safely traverse his/her path without any collisions, based on the change in the size and the bottom (x, y) coordinates of that object in a particular frame.
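As an illustrative sketch only (the equal weighting and the use of box height as the size cue are assumptions made for the example, not the paper's calibration), the proximity cue behind this idea can be thought of as follows:

def proximity_cue(prev_box_height: float, curr_box_height: float,
                  curr_y_bottom: float) -> float:
    """Combine two cues into a rough 'closeness' score in [0, 1]:
    growth of the bounding box between frames and how low its bottom edge sits.
    Heights and y coordinates are normalized to [0, 1]."""
    growth = max(0.0, curr_box_height - prev_box_height) / max(prev_box_height, 1e-6)
    closeness = 0.5 * min(growth, 1.0) + 0.5 * curr_y_bottom
    return min(closeness, 1.0)

# Example: the box grew by 20% and its bottom edge sits at 80% of the frame height.
print(proximity_cue(prev_box_height=0.25, curr_box_height=0.30, curr_y_bottom=0.8))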
An accuracy of 96.40±2% for the static and dynamic
detection system is achieved based on the proposed sequence
of well-known algorithms. Our proposed obstacle avoidance
system enabled the user to traverse his/her path and avoid
100% of the obstacles when they were detected. We con-
ducted numerous experiments to test the accuracy of the
system. The proposed system exhibits outstanding performance when comparing the expected decision with the actual decision.
TABLE 11. Fuzzy rules for the proposed obstacle avoidance system.
Based on the extensive evaluation of other systems, our
system exhibits accurate performance and an improved inter-
action structure with VI people. The following summary
describes the properties of the proposed system:
Performance: the device satisfies the parameters repre-
sented in Table 10, which need to be supported in any device
that assists VI people.
Wireless connectivity: using a wi-fi sensor, the device is
wirelessly connected.
Reliability: the designed device satisfies the software and hardware requirements.
Simplicity: the proposed device has a simple interface
that is user-friendly and does not require previous knowl-
edge (speech recognition and audio feedback for navigational
instructions).
Wearable: from a previous study and review [8], the pro-
posed system can be worn rather than carried, which is more
convenient.
Economically accessible: since most blind people are from low-income backgrounds, the designed system is an economical solution; the current implementation costs less than $250.
The proposed collision avoidance system can be imple-
mented in different applications such as automotive applica-
tions, self-driving vehicles, and military applications.
VIII. FUTURE DIRECTIONS: OBSTACLE DETECTION
USING SENSOR NETWORKS
Walls and large doors may not be detected because of the size of their representation in the frame, which may occupy half of the frame. In this case, distinguishing between the foreground and the background can be difficult. Ultrasonic sensors may be the solution: an ultrasonic module is a reliable source of obstacle detection and can measure distance when it is integrated with computer vision techniques. Therefore, adding ultrasonic sensors can increase the accuracy.
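A hedged sketch of such a fusion rule is shown below; the distance threshold and the way the range reading is obtained are hypothetical, since the paper leaves this integration to future work.

def fused_obstacle_alert(vision_found_obstacle: bool,
                         ultrasonic_distance_m: float,
                         near_threshold_m: float = 1.5) -> bool:
    """Flag an obstacle if either cue fires: the camera pipeline reports one,
    or the ultrasonic range reading falls below a chosen threshold. This lets
    the range sensor cover large, low-texture surfaces such as walls or doors."""
    return vision_found_obstacle or (ultrasonic_distance_m < near_threshold_m)

# Example: the camera misses a wall, but the range sensor reads 0.9 m.
print(fused_obstacle_alert(vision_found_obstacle=False, ultrasonic_distance_m=0.9))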
APPENDIX A
See Table 11.
ACKNOWLEDGMENT
A $20,000 fellowship grant from the American Association of University Women (AAUW) partially supported Wafa Elmannai in conducting this research. The cost of publishing this paper was supported by the University of Bridgeport, CT, USA.
REFERENCES
[1] World Health Organization. Visual Impairment and Blindness.
Accessed: Jun. 2017. [Online]. Available: http://www.who.int/
mediacentre/factsheets/fs282/en/
[2] National Federation of the Blind. Accessed: Jan. 2016. [Online]. Available:
http://www.nfb.org/
[3] American Foundation for the Blind. Accessed: Jan. 2017. [Online]. Avail-
able: http://www.afb.org/
[4] R. Velázquez, ‘‘Wearable assistive devices for the blind,’’ in Wearable
and Autonomous Biomedical Devices and Systems for Smart Environment,
vol. 75. Berlin, Germany: Springer, Oct. 2010, pp. 331–349.
[5] B. Douglas, ‘‘Wayfinding technology: A road map to the future,’’ J. Vis.
Impairment Blindness, vol. 97, no. 10, pp. 612–620, Oct. 2003.
[6] B. B. Blasch, W. R. Wiener, and R. L. Welsh, Foundations of Orientation
and Mobility, 2nd ed. New York, NY, USA: AFB Press, 1997.
[7] C. Shah, M. Bouzit, M. Youssef, and L. Vasquez, ‘‘Evaluation
of RU-Netra—Tactile feedback navigation system for the visually
impaired,’’ in Proc. IEEE Int. Workshop Virtual Rehabil., Aug. 2006,
pp. 72–77.
[8] W. Elmannai and K. Elleithy, ‘‘Sensor-based assistive devices for visually-impaired people: Current status, challenges, and future directions,’’ Sensors, vol. 17, no. 3, p. 565, 2017.
[9] A. H. Marion and A. J. Michael, Assistive Technology for Visually Impaired
and Blind People. London, U.K.: Springer, 2008.
[10] V. Tiponut, D. Ianchis, M. E. Basch, and Z. Haraszy, ‘‘Work directions
and new results in electronic travel aids for blind and visually impaired
people,’WSEAS Trans. Syst., vol. 9, no. 10, pp. 1086–1097, 2011.
[11] V. Tiponut, S. Popescu, I. Bogdanov, and C. Caleanu, ‘‘Obstacles detection
system for visually-impaired guidance,’’ in Proc. New Aspects Syst. 12th
WSEAS Int. Conf. Syst., Heraklion, Greece, Jul. 2008, pp. 350–356.
[12] M. A. Hersh, ‘‘The design and evaluation of assistive technology products
and devices part 1: Design,’’ in International Encyclopedia of Rehabilita-
tion. Buffalo, NY, USA: CIRRIE, 2010.
[13] K. K. Kim, S. H. Han, J. Park, and J. Park, ‘‘The interaction experiences
of visually impaired people with assistive technology: A case study of
smartphones,’Int. J. Ind. Ergonom., vol. 55, pp. 22–33, Sep. 2016.
[14] D. Dakopoulos and N. G. Bourbakis, ‘‘Wearable obstacle avoidance elec-
tronic travel aids for blind: A survey,’IEEE Trans. Syst., Man, Cybern.
C, Appl. Rev., vol. 40, no. 1, pp. 25–35, Jan. 2010.
[15] L. Renier and A. G. De Volder, ‘‘Vision substitution and depth perception:
Early blind subjects experience visual perspective through their ears,’’
Disab. Rehabil. Assistive Technol., vol. 5, no. 3, pp. 175–183, May 2010.
[16] R. Tapu, B. Mocanu, and E. Tapu, ‘‘A survey on wearable devices used
to assist the visual impaired user navigation in outdoor environments,’’
in Proc. IEEE 11th Int. Symp. (ISETC), Timişoara, Romania, Nov. 2014,
pp. 1–4.
[17] J. Liu, J. Liu, L. Xu, and W. Jin, ‘‘Electronic travel aids for the blind
based on sensory substitution,’’ in Proc. 5th Int. Conf. Comput. Sci.
Edu. (ICCSE), Hefei, China, Aug. 2010, pp. 1328–1331.
[18] J. Sánchez and M. Elías, ‘‘Guidelines for designing mobility and ori-
entation software for blind children,’’ in Human-Computer Interaction—
INTERACT (Lecture Notes in Computer Science), vol. 14, C. Baranauskas,
P. Palanque, J. Abascal, and S. D. J. Barbosa, Eds. Rio de Janeiro, Brazil:
Springer, Sep. 2007, pp. 375–388.
[19] R. Farcy, R. Leroux, A. Jucha, R. Damaschini, C. Grégoire, and
A. Zogaghi, ‘‘Electronic travel aids and electronic orientation aids for
blind people: Technical, rehabilitation and everyday life points of view,’’ in
Proc. Conf. Workshop Assistive Technol. People Vis. Hearing Impairments
Technol. Inclusion, vol. 12. Nov. 2006, pp. 1–12.
[20] S. Kammoun, M.-J. Macé, B. Oriola, and C. Jouffrais, ‘‘Toward a better
guidance in wearable electronic orientation aids,’’ in Proc. IFIP Conf.
Hum.-Comput. Interact., Lisbon, Portugal, Sep. 2011, pp. 624–627.
[21] M. H. A. Wahab et al., ‘‘Smart cane: Assistive cane for visually-impaired
people,’Int. J. Comput. Sci. Issues, vol. 8, no. 4, pp. 21–27, Jul. 2011.
[22] S. Bharambe, R. Thakker,H. Patil, and K. M. Bhurchandi, ‘‘Substitute eyes
for blind with navigator using Android,’’ in Proc. India Edu. Conf. (TIIEC),
Bengaluru, India, Apr. 2013, pp. 38–43.
[23] Y. Yi and L. Dong, ‘‘A design of blind-guide crutch based on
multi-sensors,’’ in Proc. IEEE 12th Int. Conf. Fuzzy Syst. Knowl.
Discovery (FSKD), Zhangjiajie, China, Aug. 2015, pp. 2288–2292.
[24] A. R. García, R. Fonseca, and A. Durán, ‘‘Electronic long cane for loco-
motion improving on visual impaired people. A case study,’’ in Proc.
Pan Amer. Health Care Exchanges (PAHCE), Rio de Janeiro, Brazil,
Mar./Apr. 2011, pp. 58–61.
[25] K. Kumar, B. Champaty, K. Uvanesh, R. Chachan, K. Pal, and A. Anis,
‘‘Development of an ultrasonic cane as a navigation aid for the blind
people,’’ in Proc. Int. Conf. Control, Instrum., Commun. Comput.
Technol. (ICCICCT), Kanyakumari, India, Jul. 2014, pp. 475–479.
[26] J. J. M. Benjamin, Jr., ‘‘The laser cane,’’ Bull. Prosthetics Res.,
pp. 443–450, 1973. [Online]. Available: https://www.rehab.research.
va.gov/jour/74/11/2/443.pdf
[27] S. Aymaz and T. Çavdar, ‘‘Ultrasonic assistive headset for visually
impaired people,’’ in Proc. IEEE 39th Int. Conf. Telecommun. Signal
Process. (TSP), Vienna, Austria, Jun. 2016, pp. 388–391.
[28] L. Dunai, B. D. Garcia, I. Lengua, and G. Peris-Fajarnés, ‘‘3D CMOS
sensor based acoustic object detection and navigation system for blind
people,’’ in Proc. 38th Annu. Conf. IEEE Ind. Electron. Soc. (IECON),
Montreal, QC, Canada, Oct. 2012, pp. 4208–4215.
[29] J. Xiao, K. Ramdath, M. Losilevish, D. Sigh, and A. Tsakas, ‘‘A low cost
outdoor assistive navigation system for blind people,’’ in Proc. 8th IEEE
Conf. Ind. Electron. Appl. (ICIEA), Melbourne, VIC, Australia, Jun. 2013,
pp. 828–833.
[30] B. R. Prudhvi and R. Bagani, ‘‘Silicon eyes: GPS-GSM based nav-
igation assistant for visually impaired using capacitive touch braille
keypad and smart SMS facility,’’ in Proc. World Congr. Comput. Inf.
Technol. (WCCIT), Sousse, Tunisia, Jun. 2013, pp. 1–3.
[31] M. R. U. Saputra and P. I. Santosa, ‘‘Obstacle avoidance for visually
impaired using auto-adaptive thresholding on Kinect’s depth image,’’ in
Proc. IEEE 14th Int. Conf. Scalable Comput. Commun. Assoc. Workshops
(UTC-ATC-ScalCom), Bali, Indonesia, Dec. 2014, pp. 337–342.
[32] F. A. Jassim and F. H. Altaani, ‘‘Hybridization of Otsu method and median
filter for color image segmentation,’Int. J. Soft Comput. Eng., vol. 3, no. 2,
pp. 69–74, May 2013.
[33] J. F. Oliveira, ‘‘The path force feedback belt,’’ in Proc. 8th Int. Conf. Inf.
Technol. Asia (CITA), Kuching, Malaysia, Jul. 2013, pp. 1–6.
[34] A. S. Martinez-Sala, F. Losilla, J. C. Sánchez-Aarnoutse, and
J. García-Haro, ‘‘Design, implementation and evaluation of an indoor
navigation system for visually impaired people,’Sensors, vol. 15, no. 2,
pp. 32168–32187, 2015.
[35] L. Everding, L. Walger, V. S. Ghaderi, and J. Conradt, ‘‘A mobility
device for the blind with improved vertical resolution using dynamic
vision sensors,’’ in Proc. IEEE 18th Int. Conf. e-Health Netw., Appl.
Services (Healthcom), Munich, Germany, Sep. 2016, pp. 1–5.
[36] V. S. Ghaderi, M. Mulas, V. F. S. Pereira, L. Everding, D. Weikersdorfer,
and J. A. Conradt, ‘‘A wearable mobility device for the blind using retina-
inspired dynamic vision sensors,’’ in Proc. 37th Annu. Int. Conf. IEEE Eng.
Med. Biol. Soc. (EMBC), Aug. 2015, pp. 3371–3374.
[37] E. Mueggler, C. Forster, N. Baumli, G. Gallego, and D. Scaramuzza,
‘‘Lifetime estimation of events from dynamic vision sensors,’’ in Proc.
IEEE Int. Conf. Robot. Autom. (ICRA), Seattle, WA, USA, May 2015,
pp. 4874–4881.
[38] N. Owano. (Aug. 13, 2016). Dynamic Vision Sensor Tech Works
Like Human Retina. [Online]. Available: http://phys.org/news/2013–08-
dynamic-vision-sensor-tech-human.html
[39] T. H. Nguyen, T. H. Nguyen, T. L. Le, T. T. H. Tran, N. Vuillerme, and
T. P. Vuong, ‘‘A wearable assistive device for the blind using tongue-
placed electrotactile display: Design and verification,’’ in Proc. Int. Conf.
Control, Autom. Inf. Sci. (ICCAIS), Nha Trang, Vietnam, Nov. 2013,
pp. 42–47.
[40] T. H. Nguyen, T. L. Le, T. T. H. Tran, N. Vuillerme, and T. P. Vuong,
‘‘Antenna design for tongue electrotactile assistive device for the blind and
visually-impaired,’’ in Proc. 7th Eur. Conf. Antennas Propag. (EuCAP),
Gothenburg, Sweden, Apr. 2013, pp. 1183–1186.
[41] M. F. Saaid, I. Ismail, and M. Z. H. Noor, ‘‘Radio frequency identi-
fication walking stick (RFIWS): A device for the blind,’’ in Proc. 5th
Int. Colloq. Signal Process. Appl., Kuala Lumpur, Malaysia, Mar. 2009,
pp. 250–253.
[42] I. Ahlmark, D. H. Fredriksson, and K. Hyyppa, ‘‘Obstacle avoidance using
haptics and a laser rangefinder,’’ in Proc. IEEE Workshop Adv. Robot. Soc.
Impacts (ARSO), Tokyo, Japan, Nov. 2013, pp. 76–81.
[43] SenseGrapics AB. Open Source Haptics—H3D. Accessed: Jun. 18, 2013.
[Online]. Available: http://www.h3dapi.org/
[44] G. Olmschenk, C. Yang, Z. Zhu, H. Tong, and W. H. Seiple, ‘‘Mobile
crowd assisted navigation for the visually impaired,’’ in Proc. IEEE
12th Int. Conf. Ubiquitous Intell. Comput., IEEE 12th Int. Conf. Auto.
Trusted Comput., IEEE 15th Int. Conf. Scalable Comput. Commun. Assoc.
Workshops (UIC-ATC-ScalCom), Aug. 2015, pp. 324–327.
[45] A. Brilhault, S. Kammoun, O. Gutierrez, P. Truillet, and C. Jouffrais,
‘‘Fusion of artificial vision and GPS to improve blind pedestrian position-
ing,’’ in Proc. 4th IFIP Int. Conf. New Technol., Mobility Secur. (NTMS),
Paris, France, Feb. 2011, pp. 1–57.
[46] C. E. White, D. Bernstein, and A. L. Kornhauser, ‘‘Some map matching
algorithms for personal navigation assistants,’Transp. Res. C, Emerg.
Technol., vol. 8, no. 1, pp. 91–108, Dec. 2000.
[47] J. M. Loomis, R. G. Golledge, R. L. Klatzky, J. M. Speigle, and J. Tietz,
‘‘Personal guidance system for the visually-impaired,’’ in Proc. 1st Annu.
ACM Conf. Assistive Technol., Marina Del Rey, CA, USA, Oct./Nov. 1994,
pp. 85–91.
[48] A. Delorme and S. J. Thorpe, ‘‘SpikeNET: An event-driven simulation
package for modelling large networks of spiking neurons,’’ Netw., Comput.
Neural Syst., vol. 14, no. 4, pp. 613–627, 2003.
[49] A. Landa-Hernández and E. Bayro-Corrochano, ‘‘Cognitive guidance sys-
tem for the blind,’’ in Proc. IEEE World Autom. Congr. (WAC), Puerto
Vallarta, Mexico, Jun. 2012, pp. 1–6.
[50] R. Tapu, B. Mocanu, and T. Zaharia, ‘‘A computer vision system that
ensure the autonomous navigation of blind people,’’ in Proc. IEEE
E-Health Bioeng. Conf. (EHB), Iasi, Romania, Nov. 2013, pp. 1–4.
[51] R. Tapu, B. Mocanu, and T. Zaharia, ‘‘Real time static/dynamic obstacle
detection for visually impaired persons,’’ in Proc. IEEE Int. Conf. Con-
sumer Electron. (ICCE), Las Vegas, NV, USA, Jan. 2014, pp. 394–395.
[52] A. Aladrén, G. López-Nicolás, L. Puig, and J. J. Guerrero, ‘‘Navigation
assistance for the visually impaired using RGB-D sensor with range expan-
sion,’IEEE Syst. J., vol. 10, no. 3, pp. 922–932, Sep. 2016.
[53] N. Kiryati, Y. Eldar, and A. M. Bruckstein, ‘‘A probabilistic Hough trans-
form,’Pattern Recognit., vol. 24, no. 4, pp. 303–316, 1991.
[54] B. Mocanu, R. Tapu, and T. Zaharia, ‘‘When ultrasonic sensors and com-
puter vision join forces for efficient obstacle detection and recognition,’
Sensors, vol. 16, no. 11, p. 1807, 2016.
[55] F. Dramas, S. J. Thorpe, and C. Jouffrais, ‘‘Artificial vision for the blind:
A bio-inspired algorithm for objects and obstacles detection,’Int. J. Image
Graph., vol. 10, no. 4, pp. 531–544, 2010.
[56] R. Vlasov, K.-I. Friese, and F.-E. Wolter, ‘‘Haptic rendering of volume
data with collision determination guarantee using ray casting and implicit
surface representation,’’ in Proc. IEEE Int. Conf. Cyberworlds (CW),
Sep. 2012, pp. 91–98.
[57] N. Bourbakis, R. Keefer, D. Dakopoulos, and A. Esposito, ‘‘A multimodal
interaction scheme between a blind user and the tyflos assistive prototype,’’
in Proc. IEEE Int. Conf. Tools Artif. Intell. (ICTAI), vol. 2. Nov. 2008,
pp. 487–494.
[58] D. J. Calder, ‘‘Travel aids for the blind—The digital ecosystem solution,’
in Proc. 7th IEEE Int. Conf. Ind. Inf. (INDIN), Jun. 2009, pp. 149–154.
[59] P. Strumillo, ‘‘Electronic interfaces aiding the visually impaired in envi-
ronmental access, mobility and navigation,’’ in Proc. IEEE 3rd Conf. Hum.
Syst. Int. (HSI), May 2010, pp. 17–24.
[60] S. Kammoun, F. Dramas, B. Oriolaand, and C. Jouffrais, ‘‘Route selec-
tion algorithm for Blind pedestrian,’’ in Proc. Int. Conf. Control Autom.
Syst. (ICCAS), 2010 pp. 2223–2228.
[61] K. Koiner, H. Elmiligi, and F. Gebali, ‘‘GPS waypoint applica-
tion,’’ in Proc. 7th Int. Conf. Broadband, Wireless Comput., Commun.
Appl. (BWCCA), Nov. 2012, pp. 397–401.
[62] M. C. Le, S. L. Phung, and A. Bouzerdoum, ‘‘Pedestrian lane detection
for assistive navigation of blind people,’’ in Proc. 21st Int. Conf. Pattern
Recognit. (ICPR), Nov. 2012, pp. 2594–2597.
[63] C. Jacquet, Y. Bellik, and Y. Bourda, ‘‘Electronic locomotion aids for
the blind: Towards more assistive systems,’’ in Intelligent Paradigms
for Assistive and Preventive Healthcare, vol. 19, N. Ichalkaranje,
A. Ichalkaranje, and L. C. Jain, Eds. Berlin, Germany: Springer, 2006,
pp. 133–163.
[64] C. Yi and Y. Tian, ‘‘Assistive text reading from complex background for
blind persons,’’ in Camera-Based Document Analysis and Recognition.
Berlin, Germany: Springer, 2012, pp. 15–28.
[65] A. Tripathy, A. Pathak, A. Rodrigues, and C. Chaudhari, ‘‘VIMPY
A Yapper for the visually impaired,’’ in Proc. IEEE World Congr. Inf.
Commun. Technol. (WICT), Oct./Nov. 2012, pp. 167–172.
[66] R. Parlouar, F. Dramas, M. Macé, and C. Jouffrais, ‘‘Assistive device for
the blind based on object recognition: an application to identify currency
bills,’’ presented at the 11th Int. ACM SIGACCESS Conf. Comput. Acces-
sibility, Pittsburgh, PA, USA, 2009.
[67] J. A. Black and D. S. Hayden, ‘‘The note-taker: An assistive technology
that allows students who are legally blind to take notes in the classroom,’’ in
Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit. Workshops
(CVPRW), Jun. 2010, pp. 1–8.
[68] S. Hayden, L. Zhou, M. J. Astrauskas, and J. Black, ‘‘Note-taker 2.0:
The next step toward enabling students who are legally blind to take notes
in class,’’ presented at the 12th Int. ACM SIGACCESS Conf. Comput.
Accessibility, Orlando, FL, USA, 2010.
[69] P. Angin, B. Bhargava, and S. Helal, ‘‘A mobile-cloud collaborative traffic
lights detector for blind navigation,’’ in Proc. 7th Int. Conf. Mobile Data
Manage. (MDM), 2010, pp. 396–401.
[70] V. Kulyukin and A. Kutiyanawala, ‘‘From ShopTalk to ShopMobile:
Vision-based barcode scanning with mobile phones for independent blind
grocery shopping,’’ in Proc. Rehabil. Eng. Assistive Technol. Soc. North
Amer. Conf. (RESNA), vol. 703. Las Vegas, NV, USA, 2010, pp. 1–5.
[71] B. Schauerte, M. Martinez, A. Constantinescu, and R. Stiefelhagen,
‘‘An assistive vision system for the blind that helps find lost things,’’ in
Computers Helping PeopleWith Special Needs. Berlin, Germany: Springer,
2012, pp. 566–572.
[72] F. Dramas, B. Oriola, B. G. Katz, S. J. Thorpe, and C. Jouffrais,
‘‘Designing an assistive device for the blind based on object local-
ization and augmented auditory reality,’’ presented at the 10th Int.
ACM SIGACCESS Conf. Comput. Accessibility, Halifax, NS, Canada,
2008.
[73] C. Yi, R. W. Flores, R. Chincha, and Y. Tian, ‘‘Finding objects for assist-
ing blind people,’Netw. Model. Anal. Health Inf. Bioinf., vol. 2, no. 2,
pp. 71–79, 2013.
[74] R. Manduchi, ‘‘Mobile vision as assistive technology for the blind: An
experimental study,’’ in Proc. Int. Conf. Comput. Handicapped Persons,
2012, pp. 9–16.
[75] G. H. I. Electronics. (May 5, 2013). Catalog | NET Gadgeteer—
GHI Electronics. [Online]. Available: http://www.ghielectronics.
com/catalog/category/265/
[76] E. Rublee, V. Rabaud, K. Konolige, and G. Bradski, ‘‘ORB: An efficient
alternative to SIFT or SURF,’’ in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Nov. 2011,
pp. 2564–2571.
[77] E. Rosten and T. Drummond, ‘‘Machine learning for high-speed corner
detection,’’ in Proc. Eur. Conf. Comput. Vis., vol. 1. 2006, pp. 430–443.
[78] M. Calonder, V. Lepetit, C. Strecha, and P. Fua, ‘‘BRIEF: Binary robust
independent elementary features,’’ in Proc. Eur. Conf. Comput. Vis.
(ECCV), 2010, pp. 778–792.
[79] P. L. Rosin, ‘‘Measuring corner properties,’’ Comput. Vis. Image
Understand., vol. 73, no. 2, pp. 291–307, 1999.
[80] L. Zhang, P. Shen, G. Zhu, W. Wei, and H. Song, ‘‘A fast robot identifi-
cation and mapping algorithm based on kinect sensor,’’ Sensors, vol. 15,
no. 8, pp. 19937–19967, 2015.
[81] L. Yu, Z. Yu, and Y. Gong, ‘‘An improved ORB algorithm of extracting
and matching features,’Int. J. Signal Process., Image Process. Pattern
Recognit., vol. 8, no. 5, pp. 117–126, 2015.
[82] A. Vinay, A. S. Rao, V. S. Shekhar, A. Kumar, K. N. B. Murthy,
and S. Natarajan, ‘‘Feature extraction using ORB-RANSAC for face recognition,’’ Procedia Comput. Sci., vol. 70, pp. 174–184,
Jan. 2015.
[83] L. Zadeh, ‘‘Fuzzy sets, fuzzy logic, and fuzzy systems,’’ in Advances in
Fuzzy Systems—Applications and Theory, L. A. Zadeh, Ed. Hoboken, NJ,
USA: World Scientific, 1996, pp. 94–102.
[84] Microsoft. (May 5, 2013). Home—Gadgeteer. [Online]. Available:
http://www.netmf.com/gadgeteer/
[85] A. Canclini, M. Cesana, A. Redondi, M. Tagliasacchi, J. Ascenso, and
R. Cilla, ‘‘Evaluation of low-complexity visual feature detectors and
descriptors,’’ in Proc. IEEE 18th Int. Conf. Digit. Signal Process. (DSP),
2013, pp. 1–7.
[86] L. Yang and Z. Lu, ‘‘A new scheme for keypoint detection and description,’’
Math. Problems Eng., vol. 2015, May 2015, Art. no. 310704.
[87] H. Photonics. Characteristics and use of Infrared Detectors.
Solid State Division. Accessed: Jan. 2017. [Online]. Available:
http://www.hamamatsu.com/resources/pdf/ssd/infrared_kird9001e.pdf?_
ga=2.110454413.948372403.1521579305-1740956368.1521579305
[88] L. McCathie, ‘‘The advantages and disadvantages of barcodes and radio
frequency identification in supply chain management,’’ Ph.D. dissertation,
Faculty Eng. Inf. Sci., Univ. Wollongong: Wollongong, NSW, Australia,
2004.
[89] M. R. Andersen et al., ‘‘Kinect depth sensor evaluation for computer vision
applications,’Tech. Rep. Electron. Comput. Eng., vol. 1, no. 6, p. 1–34,
Feb. 2012.
[90] L. B. Neto et al., ‘‘A kinect-based wearable face recognition system to aid
visually impaired users,’IEEE Trans. Human-Mach. Syst., vol. 47, no. 1,
pp. 52–64, Feb. 2016.
[91] Overview for Applying Ultrasonic Technology (AirducerTM Catalog),
AIRMAR Technol. Corp., Milford, NH, USA, 2001.
[92] M. A. Williams, A. Hurst, and S. K. Kane, ‘‘‘Pray before you step out’:
Describing personal and situational blind navigation behaviors,’’ in Proc.
15th Int. ACM SIGACCESS Conf. Comput. Accessib. (ASSETS), 2013,
p. 28.
WAFA M. ELMANNAI received the M.S. degree
from the University of Bridgeport (UB) in 2012,
where she is currently pursuing the Ph.D. degree
with the Department of Computer Science and
Engineering. She is currently a Graduate and
Research Assistant with the Department of Com-
puter Science and Engineering, UB.
She received the bachelor’s degree (Hons.) in
computer science from the College of Electronic
Technology, Tripoli, Libya, in 2005. She started
her career as a Computer Science Teacher from 2005 to 2009. She also
supervised a number of associate degree projects with the Jeel Altakadom
Institution, Tripoli, Libya, from 2007 to 2009. She published over 20 research
papers in prestigious national/international conferences and journals. Her research interests include mobile communications, wireless sensor networks, the design of mobile applications, and health assistive devices. She was a recipient of over 12 awards from national organizations.
Mrs. Elmannai serves as the President of UPE Honor Society, UB chapter,
a Vice President of SWE Society, and AAUW ambassador. She is a member
of other technical and honorary societies. She has been a member of the IEEE
computer society since 2012, also has been a member of the honor society of
Phi Kappa Phi University of UB Chapter since 2012, also has been a member
of the IEEE Communications Society since 2014, and also a member of Arab
American Association of Engineers and Architects in 2017.
KHALED M. ELLEITHY is currently the Asso-
ciate Vice President for graduate studies and
research with the University of Bridgeport. He is
also a Professor of computer science and engi-
neering. His research interests include wireless
sensor networks, mobile communications, net-
work security, quantum computing, and formal
approaches for design and verification. He has
published over three hundred fifty research papers
in national/international journals and conferences
in his areas of expertise. He is a fellow of the African Academy of Sciences.
He is the Editor or Co-Editor for 12 books published by Springer.
He received the B.Sc. degree in computer science and automatic control
and the M.S. degree in computer networks from Alexandria University
in 1983 and 1986, respectively, and the M.S. and Ph.D. degrees in computer
science from the Center for Advanced Computer Studies, University of
Louisiana at Lafayette, in 1988 and 1990, respectively.
He has over 30 years of teaching experience. His teaching evaluations
were distinguished in all the universities he joined. He supervised hundreds
of senior projects, M.S. theses, and Ph.D. dissertations. He developed and
introduced many new undergraduate/graduate courses. He also developed
new teaching/research laboratories in his area of expertise. He was a recip-
ient of the Distinguished Professor of the Year, University of Bridgeport,
from 2006 to 2007. His students have received over twenty prestigious
national/international awards from the IEEE, ACM, and ASEE.
Dr. Elleithy is a member of the technical program committees of many
international conferences as recognition of his research qualifications.
He served as a guest editor for several international journals. He was the
Chairperson of the International Conference on Industrial Electronics, Tech-
nology and Automation. Furthermore, he is the Co-Chair and Co-Founder of
the Annual International Joint Conferences on Computer, Information, and
Systems Sciences, and Engineering Virtual Conferences from 2005 to 2014.
He is a member of several technical and honorary societies. He is a
Senior Member of the IEEE computer society. He has been a member
of the Association of Computing Machinery (ACM) since 1990, also has
been a member of ACM Special Interest Group on Computer Architecture
since 1990, also has been a member of the honor society of Phi Kappa Phi
University of South Western Louisiana Chapter since 1989, also has been a
member of the IEEE Circuits and Systems society since 1988, also has been
a member of the IEEE Computer Society since 1988, and also has been a
lifetime member of the Egyptian Engineering Syndicate since 1983.
... Other cameras used in the reviewed studies included stereo cameras (n = 8), USB cameras (n = 4), infrared cameras (n = 3), high-sensitivity cameras (n = 2), micro cameras (n = 2), and smartphone cameras (n = 2). Studies that used computer vision-based technologies reported 99% precision in detecting main structural elements [25] an accuracy of 98% in detecting obstacles and 100% in avoiding them [35]. A decrease in navigation time was also reported in studies using high-sensitivity cameras [41], RGB-D cameras [49], and RealSense [73]. ...
... A decrease in navigation time was also reported in studies using high-sensitivity cameras [41], RGB-D cameras [49], and RealSense [73]. In addition, a reduction in the number of collisions was also reported [35], [49], [57], [73], [75]. ...
... In the 36 studies (59.02%) that used a combination of technologies, 29 (47.54%) reported using an integration of computer vision and sensor-based technologies for obstacle detection. Combining computer vision and sensor-based technologies can improve obstacle detection, increase accuracy, and provide efficient and safer mobility in both indoors and outdoors environments [35]. Examples include Bai et al. [27] and Mocanu et al. [54] that added an ultrasonic sensor to compensate for the limitations of the camera (transparent objects, larger obstructions like walls or doors). ...
Article
Full-text available
Wearable devices have been developed to improve the navigation of blind and visually impaired people. With technological advancements, the use and research of wearable devices have been increasing. This systematic review aimed to explore existing literature on technologies used in wearable devices intended to provide independent and safe mobility for visually impaired people. Searches were conducted in six electronic databases (PubMed, Web of Science, Scopus, Cochrane, ACM Digital Library and SciELO). Our systematic review included 61 studies. The results show that the majority of studies used audio information as a feedback interface and a combination of technologies for obstacle detection - especially the integration of sensor-based and computer vision-based technologies. The findings also showed the importance of including visually impaired individuals during prototype usage testing and the need for including safety evaluation which is currently lacking. These results have important implications for developing wearable devices for the safe mobility of visually impaired people.
... Some of our previous work includes novel methods for detection of crosswalks and stairs in assistance systems for the visually impaired persons [5]. Similarly, some authors address the detection of other obstacles in such assistance systems [6], [7]. Furthermore, those methods are broaden with sound guiding technique to help the users in their movement [8]. ...
Preprint
Visually impaired persons have significant problems in their everyday movement. Therefore, some of our previous work involves computer vision in developing assistance systems for guiding the visually impaired in critical situations. Some of those situations includes crosswalks on road crossings and stairs in indoor and outdoor environment. This paper presents an evaluation framework for computer vision-based guiding of the visually impaired persons in such critical situations. Presented framework includes the interface for labeling and storing referent human decisions for guiding directions and compares them to computer vision-based decisions. Since strict evaluation methodology in this research field is not clearly defined and due to the specifics of the transfer of information to visually impaired persons, evaluation criterion for specific simplified guiding instructions is proposed.
... In Brazil, around 0.75% of the population is blind [2]. Visual impairment can seriously impact people's quality of life, as they encounter many challenges in most daily activities [3]. One of the biggest challenges faced by such people is associated with secure and efficient navigation [4], such as obstacles, stairs, traffic corners, signposts on the pavement, and slippery paths [5,6]. ...
Conference Paper
Full-text available
Efficient navigation is a challenge for visually impaired people. Several technologies combine sensors, cameras, or feedback channels to increase the autonomy and mobility of visually impaired people. Still, many existing systems are expensive and complex to a blind person's needs. This work presents a dataset for indoor navigation purposes with annotated ground-truth representing real-world situations. We also performed a study on the efficiency of deep-learning-based approaches on such dataset. These results represent initial efforts to develop a real-time navigation system for visually impaired people in uncontrolled indoor environments. We analyzed the use of video-based object recognition algorithms for the automatic detection of five groups of objects: i) fire extinguisher ; ii) emergency sign; iii) attention sign; iv) internal sign, and v) other. We produced an experimental database with 20 minutes and 6 seconds of videos recorded by a person walking through the corridors of the largest building on campus. In addition to the testing database, other contributions of this work are the study on the efficiency of five state-of-the-art deep-learning-based models (YOLO-v3, YOLO-v3 tiny, YOLO-v4, YOLO-v4 tiny, and YOLO-v4 scaled), achieving results above 82% performance in uncontrolled environments, reaching up to 93% with YOLO-v4. It was possible to process between 62 and 371 Frames Per Second (FPS) concerning the speed, being the YOLO-v4 tiny architecture, the fastest one. Code and dataset available at: https://github.com/ICDI/navigation4blind.
... Elmannai W. M. et al. [45] present a system to avoid front obstacles utilizing sensorbased and computer vision-based techniques, as well as image depth information and fuzzy logic. It consists of (1) a FEZ spider microcontroller, (2) two camera modules, (3) a compass module, (4) a GPS module, (5) a gyroscope module, (6) a music (audio output) module, (7) a microphone module and (8) a wi-fi module. ...
Article
Full-text available
Navigation assistive technologies have been designed to support the mobility of people who are blind and visually impaired during independent navigation by providing sensory augmentation, spatial information and general awareness of their environment. This paper focuses on the extended Usability and User Experience (UX) evaluation of BlindRouteVision, an outdoor navigation smartphone application that tries to efficiently solve problems related to the pedestrian navigation of visually impaired people without the aid of guides. The proposed system consists of an Android application that interacts with an external high-accuracy GPS sensor tracking pedestrian mobility in real-time, a second external device specifically designed to be mounted on traffic lights for identifying traffic light status and an ultrasonic sensor for detecting near-field obstacles along the route of the blind. Moreover, during outdoor navigation, it can optionally incorporate the use of Public Means of Transport, as well as provide multiple other uses such as dialing a call and notifying the current location in case of an emergency. We present findings from a Usability and UX standpoint of our proposed system conducted in the context of a pilot study, with 30 people having varying degrees of blindness. We also received feedback for improving both the available functionality of our application and the process by which the blind users learn the features of the application. The method of the study involved using standardized questionnaires and semi-structured interviews. The evaluation took place after the participants were exposed to the system’s functionality via specialized user-centered training sessions organized around a training version of the application that involves route simulation. The results indicate an overall positive attitude from the users.
... WafaM.Elmannai, et al. [1] proposes a method intended to assist the visually impaired. The system combines sensor-based techniques with computer vision concepts to achieve an economically viable solution. ...
Article
Full-text available
This research introduces a blind aid that uses a live object recognition system. People who are blind or partially sighted rely significantly on their other senses, such as touch and auditory cues, to comprehend their surroundings. There is a need to deploy a technology that assists visually impaired persons in their daily routines, as there is now very little aid. Existing solutions, such as Screen Reading software and Braille devices, assist visually impaired individuals in reading and gaining access to numerous gadgets. However, these technologies are rendered worthless When the blind need to perform basic activities like recognizing the situation before them, such as recognizing people or objects, technologies become ineffective. This method will benefit blind or visually challenged all across the world. The goal is to help a person with total or partial blindness obtain a second set of eyesight without the assistance of a guardian, allowing them to live a better and more independent life. This project outlines working to create a more welcoming and inclusive environment, focusing on assistive technology that provides services, resources, and information to those with visual disabilities.
... Despite the wide range of possibilities currently available, some papers opted for the development of specific algorithms tailored to their research objectives. They are used to perform specific tasks of the overall development such as list construction and object detection in [33], or object extraction and obstacle avoidance in [47]. ...
Article
Full-text available
We present in this paper the state of the art and an analysis of recent research work and achievements performed in the domain of AI-based and vision-based systems for helping blind and visually impaired people (BVIP). We start by highlighting the recent and tremendous importance that AI has acquired following the use of convolutional neural networks (CNN) and their ability to solve image classification tasks efficiently. After that, we also note that VIP have high expectations about AI-based systems as a possible way to ease the perception of their environment and to improve their everyday life. Then, we set the scope of our survey: we concentrate our investigations on the use of CNN or related methods in a vision-based system for helping BVIP. We analyze the existing surveys, and we study the current work (a selection of 30 case studies) using several dimensions such as acquired data, learned models, and human–computer interfaces. We compare the different approaches, and conclude by analyzing future trends in this domain.
... IR, ultrasonic, etc.), to take data from the surrounding environment, perform object detection. and give feedback to the user by means of vibration, sound, or both [11]. Electronic Long Cane (ELC) [12] is an electronic device that guides VIPs for the detection of an object. ...
Article
Full-text available
Visually impaired persons (VIPs) comprise a significant portion of the population and they are present in all corners of the world. In recent times, the technology proved its presence in every domain of life and innovative devices are assisting humans in all fields especially, artificial intelligence has dominated and outperformed the rest of the trades. VIPs need assistance in performing daily life tasks like object/obstacle detection and recognition, navigation, and mobility, particularly in indoor and outdoor environments. Moreover, the protection and safety of these people are of prime concern. Several devices and applications have been developed for the assistance of VIPs. Firstly, these devices take input from the surrounding environment through different sensors e.g. infrared radiation, ultrasonic, imagery sensor, etc. In the second stage, state of the art machine learning techniques process these signals and extract useful information. Finally, feedback is provided to the user through auditory and/or vibratory means. It is observed that most of the existing devices are constrained in their abilities. The paper presents a comprehensive comparative analysis of the state-of-the-art assistive devices for VIPs. These techniques are categorized based on their functionality and working principles. The main attributes, challenges, and limitations of these techniques have also been highlighted. Moreover, a score-based quantitative analysis of these devices is performed to highlight their feature enrichment capability for each category. It may help to select an appropriate device for a particular scenario.
... The assistive system proposed by [3] learns from RGBD data and predicts semantic maps to support the obstacle avoidance task. [4] integrated sensor based, computer vision-based, and fuzzy logic techniques to detect objects for collision avoidance. ...
Preprint
Full-text available
Blind and visually challenged face multiple issues with navigating the world independently. Some of these challenges include finding the shortest path to a destination and detecting obstacles from a distance. To tackle this issue, this paper proposes ViT Cane, which leverages a vision transformer model in order to detect obstacles in real-time. Our entire system consists of a Pi Camera Module v2, Raspberry Pi 4B with 8GB Ram and 4 motors. Based on tactile input using the 4 motors, the obstacle detection model is highly efficient in helping visually impaired navigate unknown terrain and is designed to be easily reproduced. The paper discusses the utility of a Visual Transformer model in comparison to other CNN based models for this specific application. Through rigorous testing, the proposed obstacle detection model has achieved higher performance on the Common Object in Context (COCO) data set than its CNN counterpart. Comprehensive field tests were conducted to verify the effectiveness of our system for holistic indoor understanding and obstacle avoidance.
Article
Full-text available
In this study, we propose an assistive system for helping visually impaired people walk outdoors. This assistive system contains an embedded system—Jetson AGX Xavier (manufacture by Nvidia in Santa Clara, CA, USA) and a binocular depth camera—ZED 2 (manufacture by Stereolabs in San Francisco, CA, USA). Based on the CNN neural network FAST-SCNN and the depth map obtained by the ZED 2, the image of the environment in front of the visually impaired user is split into seven equal divisions. A walkability confidence value for each division is computed, and a voice prompt is played to guide the user toward the most appropriate direction such that the visually impaired user can navigate a safe path on the sidewalk, avoid any obstacles, or walk on the crosswalk safely. Furthermore, the obstacle in front of the user is identified by the network YOLOv5s proposed by Jocher, G. et al. Finally, we provided the proposed assistive system to a visually impaired person and experimented around an MRT station in Taiwan. The visually impaired person indicated that the proposed system indeed helped him feel safer when walking outdoors. The experiment also verified that the system could effectively guide the visually impaired person walking safely on the sidewalk and crosswalks.
Article
Full-text available
The World Health Organization (WHO) reported that there are 285 million visually-impaired people worldwide. Among these individuals, there are 39 million who are totally blind. There have been several systems designed to support visually-impaired people and to improve the quality of their lives. Unfortunately, most of these systems are limited in their capabilities. In this paper, we present a comparative survey of the wearable and portable assistive devices for visually-impaired people in order to show the progress in assistive technology for this group of people. Thus, the contribution of this literature survey is to discuss in detail the most significant devices that are presented in the literature to assist this population and highlight the improvements, advantages, disadvantages, and accuracy. Our aim is to address and present most of the issues of these systems to pave the way for other researchers to design devices that ensure safety and independent mobility to visually-impaired people.
Article
In the most recent report published by the World Health Organization concerning people with visual disabilities, it is highlighted that by the year 2020 the number of completely blind people worldwide will reach 75 million, while the number of visually impaired (VI) people will rise to 250 million. Within this context, the development of dedicated electronic travel aid (ETA) systems, able to increase the safe mobility of VI people in indoor and outdoor spaces while providing additional cognition of the environment, becomes of utmost importance. This paper introduces a novel wearable assistive device designed to facilitate the autonomous navigation of blind and VI people in highly dynamic urban scenes. The system exploits two independent sources of information: ultrasonic sensors and the video camera embedded in a regular smartphone. The underlying methodology exploits computer vision and machine learning techniques and makes it possible to accurately identify both static and highly dynamic objects present in a scene, regardless of their location, size, or shape. In addition, the proposed system is able to acquire information about the environment, semantically interpret it, and alert users about possible dangerous situations through acoustic feedback. To determine the performance of the proposed methodology, we performed an extensive objective and subjective experimental evaluation with the help of 21 VI subjects from two blind associations. The users pointed out that our prototype is highly helpful in increasing mobility, while being friendly and easy to learn.
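The two-source fusion described above (ultrasonic range plus camera-based detection feeding acoustic alerts) can be outlined in a few lines. This is a minimal sketch, not the authors' implementation: the sensor reads and detector are stubbed, and the distance threshold and alert levels are assumptions.

```python
# Hedged sketch: combine an ultrasonic range reading with vision detections
# to choose an acoustic alert level.
from dataclasses import dataclass

@dataclass
class Detection:
    label: str
    confidence: float
    is_dynamic: bool  # e.g. pedestrian/vehicle vs. static obstacle

def alert_level(ultrasonic_cm: float, detections: list) -> str:
    dynamic_near = any(d.is_dynamic and d.confidence > 0.5 for d in detections)
    if ultrasonic_cm < 80 and dynamic_near:
        return "urgent"   # close range and a moving object confirmed by vision
    if ultrasonic_cm < 80 or dynamic_near:
        return "warning"  # only one modality signals danger
    return "clear"

print(alert_level(60.0, [Detection("pedestrian", 0.9, True)]))  # -> urgent
```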
Article
In this paper, we introduce a real-time face recognition (and announcement) system targeted at aiding blind and low-vision people. The system uses a Microsoft Kinect sensor as a wearable device, performs face detection, and uses temporal coherence along with a simple biometric procedure to generate a sound associated with the identified person, virtualized at his or her estimated 3-D location. Our approach uses a variation of the k-nearest neighbors algorithm over histogram of oriented gradients (HOG) descriptors dimensionally reduced by principal component analysis. The results show that our approach, on average, outperforms traditional face recognition methods while requiring far fewer computational resources (memory, processing power, and battery life) than existing techniques in the literature, making it suitable for wearable hardware constraints. We also show the performance of the system in the dark, using depth-only information acquired with the Kinect's infrared camera. The validation uses a new dataset, available for download, with 600 videos of 30 people, containing variations in illumination, background, and movement patterns. Experiments with existing datasets in the literature are also considered. Finally, we conducted user experience evaluations with both blindfolded and visually impaired users, showing encouraging results.
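The descriptor pipeline named in this abstract (HOG features, PCA reduction, k-nearest-neighbors classification) can be reproduced in a few lines with scikit-image and scikit-learn. This is a generic sketch on a bundled face dataset, not the paper's Kinect data; the HOG settings, 50 PCA components, and k=3 are assumptions.

```python
# Hedged sketch of a HOG -> PCA -> KNN face recognition pipeline.
import numpy as np
from skimage.feature import hog
from sklearn.datasets import fetch_olivetti_faces
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

faces = fetch_olivetti_faces()               # 400 grayscale 64x64 face images
X = np.array([hog(img, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
              for img in faces.images])      # one HOG descriptor per image
X_tr, X_te, y_tr, y_te = train_test_split(X, faces.target, random_state=0)

clf = make_pipeline(PCA(n_components=50), KNeighborsClassifier(n_neighbors=3))
clf.fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```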
Book
Equal access to services and public places is now required by law in many countries. In the case of the visually impaired, it is often the use of assistive technology that facilitates their full participation in many societal activities, ranging from meetings and entertainment to the more personal activities of reading books or making music. In this volume, the engineering techniques and design principles used in many solutions for vision-impaired and blind people are described and explained. Features:
• a new comprehensive assistive technology model structures the volume into groups of chapters on vision fundamentals, mobility, communications and access to information, daily living, education and employment, and finally recreational activities;
• contributions by international authors from the diverse engineering and scientific disciplines needed to describe and develop the necessary assistive technology solutions;
• systematic coverage of the many different types of assistive technology devices, applications and solutions used by visually impaired and blind people;
• chapters open with learning objectives and close with sets of test questions and details of practical projects that can be used for student investigative work and self-study.
Assistive Technology for Vision-impaired and Blind People is an excellent self-study and reference textbook for assistive technology and rehabilitation engineering students and professionals. The comprehensive presentation also allows engineers and health professionals to update their knowledge of recent assistive technology developments for people with sight impairment and loss.
Conference Paper
We propose an improved version of a wearable, lightweight device that supports visually impaired people in their everyday lives by facilitating autonomous navigation and obstacle avoidance. The system deploys two retina-inspired Dynamic Vision Sensors for visual information gathering. These sensors are characterized by very low power consumption, low latency, and a drastically reduced data rate in comparison with regular CMOS/CCD cameras, which makes them well suited for real-time mobile applications. Event-based algorithms operating on the visual data stream extract depth information in real time, which is translated into the acoustic domain. Spatial auditory signals are simulated at the computed origin of visual events in the real world. These sounds are modulated according to their position in the field of view, which the user can change by moving their head. Here, different tests with eleven subjects were conducted to evaluate the performance of the system. These tests show that the modulation significantly improves object localization performance in comparison to prior experiments. Further trials estimate the visual acuity a user of the device would have, using the Landolt C test. The low power consumption of all integrated components in a final system will allow for a long-lasting battery life in a small portable device, which might ultimately combine perceived visual information and environmental knowledge to provide a higher quality of life for the visually impaired.
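The position-to-sound modulation described above can be sketched with simple stereo synthesis: the horizontal position of an event in the field of view sets the left/right balance and the estimated depth sets the loudness. This is a minimal NumPy sketch under assumed mapping laws; the device's actual spatial-audio rendering is considerably more sophisticated.

```python
# Hedged sonification sketch: event position -> stereo pan, depth -> loudness.
import numpy as np

def event_to_stereo(x_norm, depth_m, freq=880.0, dur=0.15, sr=44100):
    """x_norm in [0, 1] (left..right of the field of view), depth_m in metres."""
    t = np.linspace(0, dur, int(sr * dur), endpoint=False)
    tone = np.sin(2 * np.pi * freq * t)
    loudness = np.clip(1.0 / max(depth_m, 0.3), 0.0, 1.0)   # nearer -> louder (assumed law)
    left, right = (1.0 - x_norm) * loudness, x_norm * loudness
    return np.stack([left * tone, right * tone], axis=1)     # (samples, 2) stereo buffer

buf = event_to_stereo(x_norm=0.8, depth_m=1.2)   # object to the right, fairly close
print(buf.shape, float(buf.max()))
```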
Article
Globally, the number of visually impaired people is large and increasing. Many assistive technologies are being developed to help visually impaired people; however, these users still have difficulty with assistive technologies that have been developed from a technology-driven perspective. This study applied a user-centered perspective to gain a different and, hopefully, deeper understanding of their interaction experiences. More specifically, it focused on identifying the unique interaction experiences of visually impaired people when they use a camera application on a smartphone. Twenty participants conducted usability testing using the retrospective think-aloud technique. The unique interaction experiences of visually impaired people with the camera application and the relevant implications for designing assistive technologies were analyzed. Relevance to industry: the considerations for conducting usability testing and the results of this study are expected to contribute to the design and evaluation of new smartphone-based assistive technologies.