Social-audio features are just as important as conscious content in determining human behavior, which makes audio processing tools for extracting them essential. Psychologists speculate that these features may have evolved to establish hierarchy and group cohesion because they function as a subconscious discussion about relationships, resources, risks, and rewards. In this paper, we present the design, implementation, and deployment of a wearable computing platform capable of automatically extracting and analyzing social-audio signals. Unlike conventional research that concentrates on data recorded under constrained conditions, our data were recorded in completely natural and unpredictable situations. In particular, we benchmarked a set of integrated algorithms (speech detection and classification, sound-level meter calculation, voice/nonvoice segmentation, speaker segmentation, and prediction) to obtain speech and environmental social-audio signals using an in-house-built wearable device. In addition, we derive a novel method that combines a recently published audio feature extraction technique based on power-normalized cepstral coefficients (PNCC) with gap statistics for speaker segmentation and prediction. The performance of the proposed integrated platform is robust to natural and unpredictable situations. Experiments show that the method segments natural speech with 89.6% accuracy.
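The gap statistic chooses the number of clusters (here, speakers) at which the data's within-cluster dispersion falls furthest below that of a uniform reference. The following is a minimal numpy sketch of that idea only; the PNCC feature extraction and the paper's actual segmentation pipeline are not reproduced, and `estimate_num_speakers` and its k-means helper are illustrative names.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's k-means with random-point initialization."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

def dispersion(X, labels, centers):
    """Within-cluster sum of squared distances W_k."""
    return sum(float(((X[labels == j] - c) ** 2).sum())
               for j, c in enumerate(centers))

def estimate_num_speakers(X, k_max=4, n_ref=10, seed=0):
    """Pick the k maximizing gap(k) = E[log W_k(uniform ref)] - log W_k(data)."""
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)
    gaps = []
    for k in range(1, k_max + 1):
        labels, centers = kmeans(X, k)
        log_w = np.log(dispersion(X, labels, centers))
        ref_log_w = []
        for _ in range(n_ref):
            R = rng.uniform(lo, hi, size=X.shape)  # reference over bounding box
            ref_log_w.append(np.log(dispersion(R, *kmeans(R, k))))
        gaps.append(np.mean(ref_log_w) - log_w)
    return int(np.argmax(gaps)) + 1
```

With feature frames from two well-separated speakers, the gap curve peaks at two clusters.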
A methodology to determine the interactive effects of weapon recoil on both the weapon and the shooter, in synchrony, during shoulder-fired small arms target engagement scenarios has yet to be established and widely applied. Recoil energy can be measured using devices such as instrumented weapon mounts, or calculated from factors such as weapon weight, center of mass, projectile mass, and muzzle velocity. However, recoil also acts on the shooter and thereby affects shooting performance. Perceived recoil may be defined as a mental representation of the impact intensity experienced by the shooter: a subjective estimate that encompasses pain, discomfort, propensity to flinch, and other factors. Methods to quantify and mitigate the recoil energy experienced by the shooter are discussed, as well as proposed concepts for improved recording of the interaction between the physical and psychological correlates of recoil in small arms shoulder-fired weapon use.
Gaze tracking has been suggested as an alternative to traditional computer pointing mechanisms. However, the accuracy limitations of gaze estimation algorithms, and the fatigue imposed on users when the visual perceptual channel is overloaded with a motor control task, have prevented the widespread adoption of gaze as a pointing modality. Rather than using gaze as a complete pointing mechanism, this study investigates the use of gaze to complement traditional keyboard/mouse cursor positioning during standard human-computer interaction (HCI). With this approach, bringing the mouse/keyboard cursor to a target still requires a manual action, but the mouse movement amplitude or number of keystrokes required is substantially reduced. This is accomplished by warping the cursor from its original position on the screen to the user's estimated point of regard, as determined by video-oculography gaze tracking, whenever a keystroke or mouse movement event is detected. The user then performs the final fine-grained positioning of the cursor manually. We carried out a user study on the effects of cursor warping in common computer input operations that involve cursor repositioning, using one or several monitors, as well as on its learning dynamics over time. The results show that cursor warping can speed up, and/or reduce the physical effort required to complete, tasks such as mouse/trackpad target acquisition, keyboard text cursor positioning, mouse/keyboard-based text selection, and drag-and-drop operations.
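The warping mechanic itself reduces to simple arithmetic: on an input event the cursor jumps to the gaze estimate, and only the residual gaze-estimation error must be covered manually. A tiny sketch (function names are illustrative, not from the paper):

```python
import math

def warp_cursor(cursor, gaze_estimate):
    """On a keystroke or mouse-movement event, jump the cursor from its
    current position to the gaze-estimated point of regard."""
    return gaze_estimate

def manual_amplitude_saved(cursor, gaze_estimate, target):
    """Manual movement required without warping, minus the residual
    fine-positioning left after warping to the gaze estimate."""
    return math.dist(cursor, target) - math.dist(gaze_estimate, target)
```

For a cursor at the origin, a target at (800, 600), and a gaze estimate 10 px off the target, warping saves 990 px of the 1000 px manual movement.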
Given a video sequence, the task of action recognition is to identify the most similar action among the action sequences learned by the system. Such human action recognition is based on evidence gathered from videos, and it has wide applications, including surveillance, video indexing, biometrics, telehealth, and human–computer interaction. Vision-based human action recognition faces several challenges due to view changes, occlusion, variation in execution rate, anthropometry, camera motion, and background clutter. In this survey, we provide an overview of existing methods according to their ability to handle these challenges, how well they generalize, and their ability to detect abnormal actions. Such systematic classification will help researchers identify the methods best suited to each challenge, along with their limitations. In addition, we identify the publicly available datasets and the challenges they pose. From this survey, we draw conclusions regarding how well each challenge has been addressed, and we identify potential research areas that require further work.
A survey of the adaptive controllers deployed to address major inherent control issues in robotic teleoperation systems is carried out. The study in particular explores the application of adaptive controllers in dealing with master and slave model uncertainties, operator and environment force model uncertainties, unknown external disturbances, and communication delay. The reviewed literature is structured according to the objectives envisaged for the adaptive controllers. Some adaptive methods deployed in human–robot interaction, where robots collaborate with people and actively support them, and in local robot control, where robot manipulators are controlled at the same location as the operator, are also considered, as they can be applied to teleoperation with minor adjustments. A comparison of the strengths, deficiencies, and requirements of the methods in each category is carried out. The study indicates that the majority of the proposed methods either require additional hardware, such as sensors, or assume an accurate model of the system under study. Possible future research directions are outlined based on the gaps identified in the survey.
For an engaging human–machine interaction, machines need to be equipped with affective communication abilities. Such abilities enable interactive machines to recognize the affective expressions of their users and respond appropriately through different modalities, including movement. This paper focuses on bodily expressions of affect and presents a new computational model for affective movement recognition that is robust to kinematic, interpersonal, and stochastic variations in affective movements. The proposed approach derives a stochastic model of the affective movement dynamics using hidden Markov models (HMMs). The resulting HMMs are then used to derive a Fisher score representation of the movements, which is subsequently used to optimize affective movement recognition using support vector machine classification. In addition, this paper presents an approach to obtain a minimal discriminative representation of the movements using supervised principal component analysis (SPCA) based on the Hilbert–Schmidt independence criterion in the Fisher score space. The dimensions of the resulting SPCA subspace consist of intrinsic movement features salient to affective movement recognition. These salient features enable a low-dimensional encoding of observed movements during a human–machine interaction, which can be used to recognize and analyze human affect that is displayed through movement. The efficacy of the proposed approach in recognizing affective movements and identifying a minimal discriminative movement representation is demonstrated using two challenging affective movement datasets.
In order to incorporate both affective and cognitive factors in the decision-making process, a user experience (UX) evaluation function based on cumulative prospect theory is proposed for three different affective states and two different types of products (affect-rich versus affect-poor). In order to tackle multiple parameters involved in the UX evaluation function, a hierarchical Bayesian model is proposed with a technique called “Markov chain Monte Carlo.” It estimates parameters that represent different cognitive tendencies and affective influences for customers at the individual and group levels by generating posterior probability density functions of the parameters to incorporate inherent uncertainty. An experiment with four hypotheses was designed to test the proposed model. We found that: 1) anxious participants tend to be more risk-averse than those in joy and excitement; 2) joyful and excited participants tend to be more risk-seeking than those in anxiety in UX-related choice decision making; 3) all participants tend to be averse to unpleasant UX; and 4) participants tend to value by feeling for affect-rich products and value by calculation for affect-poor products. Furthermore, the models of five different types can predict choice decision making between product profiles with around 80% accuracy. In summary, the results explain affective-cognitive decision-making behavior in the complex domain of UX design and, thus, illustrate the potential and feasibility of the proposed method.
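The cumulative prospect theory components underlying such a UX evaluation function can be sketched as follows. Note the hedges: the parameter values are Tversky and Kahneman's classic estimates, not the posteriors fitted by the paper's hierarchical Bayesian model, and the scoring function simplifies full CPT by weighting individual rather than cumulative rank-ordered probabilities.

```python
def cpt_value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Value function: concave for gains, convex and steeper for losses
    (lam > 1 encodes loss aversion)."""
    return x ** alpha if x >= 0 else -lam * (-x) ** beta

def cpt_weight(p, gamma=0.61):
    """Inverse-S probability weighting: small probabilities are
    overweighted, large ones underweighted."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

def prospect_score(outcomes):
    """Weighted sum over (probability, outcome) pairs -- a simplification;
    full CPT applies the weights to cumulative probabilities."""
    return sum(cpt_weight(p) * cpt_value(x) for p, x in outcomes)
```

Under these defaults a 50/50 gamble over a gain and an equal loss scores negative, reflecting loss aversion; a more risk-averse (e.g., anxious) agent would be modeled with a smaller alpha.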
The human factors literature on intelligent systems was reviewed in relation to the following: efficient human supervision of multiple robots, appropriate human trust in the automated systems, maintenance of the human operator's situation awareness, individual differences in human–agent (H-A) interaction, and retention of human decision authority. A number of approaches, from flexible automation to autonomous agents, were reviewed, and their advantages and disadvantages were discussed. In addition, two key human performance issues (trust and situation awareness) related to H-A teaming for multirobot control were discussed, along with some promising user interface design solutions to address these issues. Some major individual differences factors (operator spatial ability, attentional control ability, and gaming experience) were identified that may impact H-A teaming in the context of robotics control.
Patients undergoing interfacility transfers are at potentially greater risk of adverse or critical events than those in hospital, and efficient transfers play a significant role in reducing mortality and morbidity. Medical dispatchers rely on accurate estimations of transfer time in determining the most appropriate method of transportation, typically a helicopter, a land ambulance, or both, in situations characterized by high time pressure and uncertainty. In this paper, we propose the design of a data-driven decision support tool to improve dispatchers' transport mode decision making. We studied the dispatch process of the air and land medical transport system in Ontario, Canada through onsite observations and developed a tool that generates transfer time estimates based on historical data. We found that dispatchers have large estimation errors and are biased toward greater underestimation for air transfers than for land transfers. In contrast, the proposed tool produced estimates with significantly less error than dispatcher estimates; the tool's estimation error was on average 21 min less, a practically significant difference in urgent patient care. Through onsite observations and the relevant literature, we also identified factors that may influence the collaboration between the dispatcher and the tool. This research is a first attempt to study how decisions are made for interfacility medical transfers and to evaluate the accuracy of human operator estimates of these transfer times. It is also the first to demonstrate a tool's utility in comparison with existing procedures for estimating transfer times.
Neuroimaging technologies, such as functional near-infrared spectroscopy (fNIR), could provide performance metrics directly from brain-based measures to assess the safety and performance of operators in high-risk fields. In this paper, we objectively and subjectively examine the cognitive workload of air traffic control specialists utilizing a next-generation conflict resolution advisory. Credible differences were observed between continuously increasing workload levels that were induced by increasing the number of aircraft under control. At higher aircraft counts, a possible saturation in brain activity was observed in the fNIR data. A learning effect was also analyzed across a three-day, nine-session training period: the difference between Day 1 and Day 2 was credible, while the difference between Day 2 and Day 3 was not. The results presented in this paper indicate some advantages of objective cognitive workload assessment with fNIR cortical imaging over subjective assessment with a workload keypad.
Airport surface congestion control has the potential to mitigate the increase in taxi times and fuel burn at major airports. One possible class of congestion control strategies predicts the departure throughput and recommends a rate at which to release aircraft pushbacks from the gate. This paper describes the field testing of these strategies at Boston Logan International Airport, focusing on the communication of the suggested rate to the air traffic controller and on additional support for its implementation. Two Android tablet computers were used for the field tests: one to input the data and the other to display the recommended rate to the air traffic controllers. Two potential decision-support displays were tested: a rate control display that presented only a color-coded suggested pushback rate, and a volume control display that additionally showed controllers the number of aircraft that had called ready and had been released. A survey of controllers showed that they found the decision-support tool easy to use, especially the additional functionality provided by the aircraft volume control display. The field tests also yielded significant operational benefits, showing that such a congestion control strategy could be effective in practice.
This paper advocates a novel approach to recommending texts at various levels of difficulty based on a proposed measure, the algebraic complexity of texts (ACT). Unlike traditional complexity measures that mainly focus on surface features, such as the number of syllables per word, characters per word, or words per sentence, ACT draws on the perspective of human concept learning, which can reflect the complex semantic relations inside texts. To cope with the high cost of measuring ACT, the Degree-2 Hypothesis of ACT is proposed to reduce the measurement from unrestricted dimensions to three dimensions. Based on the principle of the "mental anchor," an extension of ACT and its general edition [denoted extension of text algebraic complexity (EACT) and general extension of text algebraic complexity (GEACT)] are developed, which take the complexities of keywords and association rules into account. Finally, using scores given by humans as a benchmark, we compare our proposed methods with linguistic models. The experimental results show the order GEACT > EACT > ACT > linguistic models: GEACT performs the best, while the linguistic models perform the worst. Additionally, GEACT with lower convex functions is best at measuring the algebraic complexity of text understanding, which may indicate that the human complexity curve tends to be lower convex rather than linear.
Anticipation of future events is recognized to be a significant element of driver competence. Indeed, guiding one's behavior through the anticipation of future traffic states provides potential gains in recognition and reaction times. However, the role of anticipation in driving has not been systematically studied. In this paper, we identify the characteristics of anticipation in driving and provide a working definition. In particular, we distinguish it from driving goals, such as eco or defensive driving, and define it as a high-level competence for efficiently positioning the vehicle to facilitate these goals. We also present a driving simulator study assessing the relation between driver experience and anticipation. Thirty drivers from three experience categories (low, medium, and high) completed five scenarios, each involving several pre-event cues designed to allow the anticipation of an event. The results showed that more experienced drivers demonstrated more pre-event actions than less experienced drivers. While pre-event actions resulted in improved safety on certain occasions, the effects were often not significant. Future research should further investigate the mechanisms underlying anticipation, particularly how drivers make use of the temporal and spatial gains obtained through the recognition of pre-event cues.
This paper provides a critical review of laboratory-based studies of spatial attention. We highlight a number of ways in which such studies fail to capture the key factors and constraints that have been shown to give rise to an increased risk of vehicular accidents in real-world situations. In particular, we discuss limitations related to the design of the attentional capture task itself, as well as limitations concerning the demographics and current state of the participants tested in these laboratory studies. A list of recommendations is made concerning the areas on which laboratory-based spatial attention research could focus in the future in order to make its results more relevant to those working in applied settings and thus enhance translational research.
Early detection of fall risk can reduce health costs associated with surgery, rehabilitation, imaging studies, hospitalizations, and medical evaluations. This paper proposes a measurement-focused study to evaluate a new methodology for assessing fall risk using low-cost, off-the-shelf devices. The proposed methodology consists of a data acquisition system, a data analysis system, and a fall risk assessment system. The data acquisition system is composed of a standard notebook computer and video game input devices: a Kinect, a Wii balance board, and two Wii motion controllers. The data analysis system and the fall risk assessment system, in turn, use signal processing, data mining, and computational intelligence methods to analyze the acquired data and determine the fall risk of the subject under analysis. The methodology includes six static and two dynamic tests. Experiments were conducted on a population of 37 subjects: 16 with a history of falls and 21 without, with the same age distribution across the two groups. Because nonlinear binary classification techniques were used, confidence-interval-based methodologies were not applicable, so tenfold cross-validation was used to estimate accuracy. The methodology classifies fall risk as high or low with an accuracy of 89.2%, and it allows the construction of low-cost, portable, replicable, objective, and reliable fall risk assessment systems.
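The tenfold cross-validation procedure used in place of confidence intervals can be sketched as follows; the nearest-centroid classifier here is a toy stand-in for the paper's computational intelligence methods, not the actual model.

```python
import numpy as np

def kfold_accuracy(X, y, train_and_predict, n_folds=10, seed=0):
    """Shuffle, split into n_folds, hold each fold out once, and return
    the pooled classification accuracy over all held-out samples."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, n_folds)
    correct = 0
    for i, test in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        pred = train_and_predict(X[train], y[train], X[test])
        correct += int((pred == y[test]).sum())
    return correct / len(y)

def nearest_centroid(X_train, y_train, X_test):
    """Toy classifier: assign each test point to the nearest class mean."""
    classes = np.unique(y_train)
    centroids = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
    d = ((X_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[d.argmin(axis=1)]
```

Every sample is scored exactly once as held-out data, so the returned accuracy is an unbiased estimate of out-of-sample performance for this training-set size.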
In this paper, an assessment of deictic-command driving assistance for a smart wheelchair is proposed. This equipment enables the user to move by giving a series of indications on an interface that displays a view of the environment and triggers automatic movement of the wheelchair. Two sets of tests were implemented to assess the advantages of this type of assistance compared with conventional wheelchair control. The first set evaluated the performance of the human–machine system based on course time analysis, observation of users' actions, and an estimation of driving comfort. The second test assessed the cognitive requirements of the driving task, specifically the attentional and executive processes required when driving in assisted mode; a dual-task method was used to achieve this. The results show that the driving assistance brings about a decrease in physical load for the same level of comfort as manual driving, but requires additional cognitive effort from the user, especially in terms of executive abilities.
Choosing clothes with complex patterns and colors is a challenging task for visually impaired people. Automatic clothing pattern recognition is also a challenging research problem due to rotation, scaling, illumination, and, especially, large intraclass pattern variations. We have developed a camera-based prototype system that recognizes clothing patterns in four categories (plaid, striped, patternless, and irregular) and identifies 11 clothing colors. The system integrates a camera, a microphone, a computer, and a Bluetooth earpiece for audio description of clothing patterns and colors. A camera mounted on a pair of sunglasses captures clothing images, and the clothing patterns and colors are described to blind users verbally. The system can be controlled by speech input through the microphone. To recognize clothing patterns, we propose a novel Radon Signature descriptor and a schema to extract statistical properties from wavelet subbands to capture the global features of clothing patterns. These are combined with local features to recognize complex clothing patterns. To evaluate the effectiveness of the proposed approach, we used the CCNY Clothing Pattern dataset; our approach achieves 92.55% recognition accuracy, significantly outperforming state-of-the-art texture analysis methods on clothing pattern recognition. The prototype was also used by ten visually impaired participants, most of whom felt that such a system would support greater independence in their daily lives, though they also made suggestions for improvement.
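The intuition behind using projection variance to characterize pattern directionality can be illustrated with just the 0° and 90° projections; the actual Radon Signature samples many angles of the full Radon transform and is combined with wavelet statistics and local features, none of which is reproduced in this sketch.

```python
import numpy as np

def radon_signature_0_90(img):
    """Variance of the parallel projections at 0 and 90 degrees
    (column sums and row sums). A striped pattern produces near-zero
    variance along the stripe direction and high variance across it."""
    return [float(np.var(img.sum(axis=0))),   # project onto the x-axis
            float(np.var(img.sum(axis=1)))]   # project onto the y-axis
```

For a horizontally striped image, every column sums to the same value (variance near zero), while the row sums alternate between extremes (large variance), exposing the stripe orientation.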
The allocation of visual attention is a key factor for humans operating complex systems under time pressure with multiple information sources. In some situations, attentional tunneling is likely to appear, leading to excessive focus and poor decision making. In this study, we propose a formal approach, based on machine learning techniques, to detect the occurrence of this attentional impairment. An experiment was conducted to provoke attentional tunneling, during which psycho-physiological and oculomotor data were collected from 23 participants. Data from 18 participants were used to train an adaptive neuro-fuzzy inference system (ANFIS). From a machine learning point of view, the classification performance of the trained ANFIS proved the validity of the approach, and the resulting classification rules were consistent with the attentional tunneling literature. Finally, the classifier robustly detected attentional tunneling on test data from four participants.
Expert judgment is widely used for activity duration estimation in software project management. Because expert judgment-based estimation has both advantages and disadvantages, we propose the use of fuzzy inference rules for semi-automatic estimation that reduces its potential negative aspects. Fourteen fuzzy inference rules are introduced to elicit and adjust expert tacit knowledge, and expert judgment-based estimation results are complemented by the fuzzy inference rules. The combined expert judgment and fuzzy inference results are compared with the expert judgment-based approach alone through surveys and one-on-one interviews with project managers from different disciplines, and through analyses of data from past software projects. The use of fuzzy inference rules improves the estimation accuracy of the expert judgment-based approach by 39.35%. The proposed approach helps experts derive more realistic and reliable activity duration estimates in software project management.
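A Mamdani-style fuzzy inference rule for adjusting an expert's duration estimate might look like the following sketch. The two rules and the 1.3/0.8 adjustment factors are invented purely for illustration; they are not among the paper's fourteen rules.

```python
def high(x, a=4.0, b=8.0):
    """Shoulder membership on a 0-10 scale: 0 below a, 1 above b,
    linear ramp in between."""
    return min(1.0, max(0.0, (x - a) / (b - a)))

def adjust_estimate(expert_days, complexity, familiarity):
    """Two hypothetical rules, defuzzified by a weighted average:
      IF complexity IS high  THEN inflate the expert estimate (x1.3)
      IF familiarity IS high THEN deflate the expert estimate (x0.8)
      otherwise keep the estimate unchanged."""
    r_inflate = high(complexity)
    r_deflate = high(familiarity)
    r_keep = 1.0 - max(r_inflate, r_deflate)
    num = r_inflate * 1.3 + r_deflate * 0.8 + r_keep * 1.0
    return expert_days * num / (r_inflate + r_deflate + r_keep)
```

When neither rule fires, the expert's estimate passes through unchanged; partial memberships blend the adjustments smoothly, which is how such rules can temper over- or under-confident expert judgments.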
Human–automation interaction (HAI) is often a contributor to failures in complex systems, frequently due to system interactions that were not anticipated by designers and analysts. Model checking is a method of formal verification that automatically proves whether or not a formal system model adheres to desirable specification properties. Task analytic models can be included in formal system models so that HAI can be evaluated with model checking. However, previous work in this area has required analysts to manually formulate the properties to check, a practice prone to analyst error and oversight that can leave unexpected, dangerous HAI conditions undiscovered. To address this, this paper presents a method for automatically generating specification properties from task models, enabling analysts to use formal verification to check for system HAI problems they may not have anticipated. This paper describes the design and implementation of the method. An example (a pilot performing a before-landing checklist) is presented to illustrate its utility. Limitations of this approach and future research directions are discussed.
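At its core, a model checker exhaustively explores a model's reachable states and reports a counterexample trace when a property fails. The following is a toy explicit-state sketch of that mechanism only; the paper's contribution, automatically generating the properties themselves from task models, is not shown, and the two-variable landing model is hypothetical.

```python
from collections import deque

def check_invariant(initial, successors, invariant):
    """Explicit-state reachability check: breadth-first search over the
    state space; returns a counterexample path to the first reachable
    state violating the invariant, or None if it always holds."""
    queue = deque([(initial, (initial,))])
    seen = {initial}
    while queue:
        state, path = queue.popleft()
        if not invariant(state):
            return path                    # counterexample trace
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + (nxt,)))
    return None

# Hypothetical two-variable model of a before-landing task:
# state = (gear_down, landed); nothing forces the checklist to run first.
def successors(state):
    gear_down, landed = state
    if landed:
        return []
    return [(True, landed),      # pilot extends the landing gear
            (gear_down, True)]   # aircraft lands (checklist may be skipped)
```

Checking the invariant "landed implies gear down" on this model yields a counterexample in which the aircraft lands with the gear up, exactly the kind of unanticipated HAI condition formal verification is meant to surface.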
This paper describes and experimentally demonstrates a new approach to shared-adaptive control of human–machine systems. Motivated by observed human proclivity toward fields of safe travel rather than specific trajectories, our approach is rooted in the planning and enforcement of constraints rather than the more traditional reference paths. This approach identifies path homotopies, bounds a desired homotopy with constraints, and allocates control as necessary to ensure that these constraints remain satisfied without unduly restricting the human operator. We present a summary of this framework's technical background and analyze its effect both with and without driver feedback on the performance and confidence of 20 different drivers teleoperating an unmanned (teleoperated) vehicle through an outdoor obstacle course. In 1200 trials, constraint-based semiautonomy was shown to increase the operator speed by 26% while reducing the occurrence of collisions by 78%, and improving overall user confidence and sense of control by 44% and 12%, respectively—all the while assuming less than 43% control of the vehicle.
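The constraint-based allocation idea, intervening only when the operator's command would violate a bound, can be reduced to a one-dimensional sketch. This is a hypothetical simplification: the paper's actual method identifies path homotopies and enforces richer constraints than a fixed corridor.

```python
def shared_control(pos, human_cmd, lo, hi, dt=0.1):
    """1-D sketch of constraint enforcement: the human's velocity command
    passes through untouched while the predicted position stays inside
    the corridor [lo, hi]; otherwise the minimal correction keeping the
    vehicle inside is applied. Returns (command, intervention flag)."""
    predicted = pos + human_cmd * dt
    clamped = min(max(predicted, lo), hi)
    cmd = (clamped - pos) / dt
    return cmd, clamped != predicted
```

Because the controller constrains rather than tracks, the operator retains full authority everywhere inside the safe region, which is consistent with the low (under 43%) control share reported above.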
Monitoring human movements using wireless wearable sensors finds applications in a variety of domains, including healthcare and wellness. In these systems, sensory devices are tightly integrated with the human body and infer the status of the user through signal and information processing. Typically, highly accurate observations can be made only by deploying a sufficiently large number of sensors, which in turn increases the energy consumption of the system and reduces adherence to using it. Therefore, optimizing the power consumption of the system while maintaining acceptable accuracy plays a crucial role in realizing these resource-constrained systems. In this paper, we present an activity monitoring approach that minimizes the power consumption of the system subject to a lower bound on classification accuracy. The system utilizes computationally simple template-matching blocks that perform classification on individual sensor nodes. It further employs a boosting approach to enhance the accuracy of the distributed classifier by selecting a subset of sensors optimized in terms of power consumption and capable of achieving a given lower-bound accuracy criterion. A proof-of-concept evaluation with three participants performing 14 transitional actions was conducted, in which the collected signals were segmented and labeled manually for each action. The results indicate that the proposed approach provides more than a 65% reduction in the power consumption of the signal processing while maintaining 80% sensitivity in classifying human movements.
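The per-node template-matching step can be sketched as follows, assuming segments are already aligned and of equal length; the boosting-based sensor selection stage and the action labels used here are illustrative, not the paper's.

```python
import numpy as np

def make_template(examples):
    """Per-action template: pointwise mean of aligned training segments."""
    return np.mean(examples, axis=0)

def classify_segment(segment, templates):
    """Label the segment by minimum Euclidean distance to the stored
    templates -- the kind of cheap matcher a low-power sensor node can
    run locally."""
    return min(templates,
               key=lambda lbl: float(np.linalg.norm(segment - templates[lbl])))
```

Because each node only computes distances to a few fixed templates, the per-sample cost is a handful of multiply-adds, which is what makes trading a little accuracy for large power savings feasible.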
This paper addresses recommending presentation sessions at smart conferences to participants. We propose a venue recommendation algorithm: socially aware recommendation of venues and environments (SARVE). SARVE computes correlation and social characteristic information of conference participants. In order to model a recommendation process using distributed community detection, SARVE further integrates the current context of both the smart conference community and participants. SARVE recommends presentation sessions that may be of high interest to each participant. We evaluate SARVE using a real-world dataset. In our experiments, we compare SARVE with two related state-of-the-art methods, namely context-aware mobile recommendation services and conference navigator (recommender) model. Our experimental results show that in terms of the utilized evaluation metrics, i.e., precision, recall, and f-measure, SARVE achieves more reliable and favorable social (relations and context) recommendation results.
Mouse dynamics is the process of identifying individual users on the basis of their mouse operating behaviors. Mouse dynamics analysis techniques do not yet provide an acceptable level of accuracy, perhaps due to behavioral variability. This study presents a dimensionality-reduction-based approach to mitigate the behavioral variability of mouse dynamics and improve the performance of mouse-dynamics-based continuous authentication. Variability was measured over the schematic features and motor-skill features extracted from each mouse behavior data session. A unified framework employing dimensionality reduction methods (Multidimensional Scaling, Laplacian Eigenmap, Isometric Feature Mapping, and Local Linear Embedding) was developed to reduce behavioral variability by obtaining predominant characteristics from the original feature space. Classification techniques (Random Forest, Support Vector Machine, Neural Network, and Nearest Neighbor) were applied to the transformed feature space to perform the authentication task. Analyses were conducted using data from 840 half-hour sessions of 28 participants. Results indicated that, for sufficiently long sequences, the transformed feature spaces had much less variability and yielded better authentication performance than the original feature space, with improvements of up to 89.6% in the false-acceptance rate and 77.4% in the false-rejection rate. Additionally, an investigation of the relationships among variability, authentication error rates, and detection time indicated that variability and authentication error rates decrease greatly as detection time increases. For the data collected, the approach fared better than state-of-the-art approaches. These findings suggest that variability reduction can improve mouse dynamics and may thus enhance current authentication mechanisms.
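The pipeline of reducing the feature space and then authenticating by distance can be sketched with plain PCA as a linear stand-in for the manifold methods named above (MDS, Laplacian Eigenmap, Isomap, LLE); the threshold and distance rule are likewise illustrative.

```python
import numpy as np

def pca_fit(X, k):
    """Mean and top-k principal directions via SVD: a simple linear
    dimensionality reduction keeping the predominant characteristics."""
    mu = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, vt[:k]

def pca_transform(X, mu, w):
    """Project (centered) sessions onto the retained directions."""
    return (X - mu) @ w.T

def authenticate(probe, enrolled, threshold):
    """Accept the probe session if its mean distance to the genuine
    user's enrolled sessions, in the reduced space, is below threshold."""
    return bool(np.linalg.norm(enrolled - probe, axis=1).mean() < threshold)
```

In the reduced space, a genuine user's sessions cluster tightly while an impostor's land far away, so a single distance threshold separates acceptance from rejection.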
Radiographic imaging modalities, such as computerized tomography, positron emission tomography, and magnetic resonance imaging, play a major role in the diagnosis and prognosis of cancer. Gene and protein expression patterns from the tumor genome are seen to facilitate the individualized selection of therapies. Along with breakthroughs in biotechnology applicable to cancer radiation biology, a new research field called radiogenomics has emerged in radiation oncology. Associating genotypes with imaging phenotypes holds promise for personalized, optimal treatment. Segmentation and feature selection from the region of interest in an image are followed by correlation with the gene expression profile of the tumor in order to determine its noninvasive surrogates. This paper highlights the roles of quantitative imaging, genomics, and radiogenomics in patient-specific tumor management.