Letters
OBSERVATION
A Machine Learning Approach for Automated Facial Measurements in Facial Palsy
An ongoing problem in the management of facial neuromotor disorders is the lack of a universal outcome measurement system.1 Objective systems have been developed to quantify facial symmetry.1-3 However, these systems are time consuming, require expensive and complicated hardware, and can be difficult to implement in clinical practice. Recently, machine learning techniques have been developed that enable automatic localization of facial landmarks using large data sets of facial photographs.4-6 We have leveraged these technological advances to develop Emotrics, a simple, high-throughput software platform that enables automatic facial landmark localization and computation of facial measurements.
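The kind of landmark localization the letter builds on can be illustrated with the open-source dlib toolkit and its pretrained 68-point shape predictor (references 4 and 5). The following minimal Python sketch shows that generic pipeline; it is an illustration, not the Emotrics source, and the image file name is a placeholder.

```python
# Minimal sketch: automatic 68-point facial landmark localization with dlib,
# using ensemble-of-regression-trees face alignment (references 4 and 5).
import dlib
import cv2  # opencv-python, used here only to load the photograph

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

image = cv2.imread("frontal_photo.jpg")        # placeholder file name
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

faces = detector(gray)                          # detected face bounding boxes
if faces:
    shape = predictor(gray, faces[0])           # 68 landmarks for the first face
    landmarks = [(shape.part(i).x, shape.part(i).y) for i in range(68)]
```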
Emotrics is designed for use with frontal-view clinical photographs, automatically placing facial landmark dots on an uploaded image. Emotrics automatically generates multiple facial measurements by scaling iris diameter to pixel width in each image. This measurement technique uses a mean human population iris diameter of 11.77 mm; this value is comparable to that used in the Massachusetts Eye and Ear Infirmary's FACE-gram software (Sir Charles Bell Society). However, Emotrics has 2 important advantages over FACE-gram. Emotrics rapidly computes multiple relevant facial measurements simultaneously, with full analysis of one image taking less than 5 seconds on average. Emotrics can also analyze the differences between 2 photographs, which allows automated calculation of smile excursion.
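The iris-based calibration just described is a simple proportional scaling: any pixel distance is converted to millimeters against the detected iris diameter. A minimal sketch follows; the function name and the example numbers are illustrative, not Emotrics internals.

```python
# Sketch of the pixel-to-millimeter conversion described above. The detected
# iris diameter in pixels serves as an in-image ruler, taken to correspond
# to the mean human population iris diameter of 11.77 mm.
MEAN_IRIS_DIAMETER_MM = 11.77

def pixels_to_mm(distance_px: float, iris_diameter_px: float) -> float:
    """Convert a distance measured in pixels to millimeters."""
    return distance_px * (MEAN_IRIS_DIAMETER_MM / iris_diameter_px)

# Example: a palpebral fissure spanning 94 px in an image whose iris
# measures 110 px across corresponds to roughly 10.1 mm.
fissure_mm = pixels_to_mm(94, 110)
```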
Functionalities | The user can manually reposition any facial landmark dot requiring refinement (red dots). Facial landmark dots should outline the superior border of the brow, the free margin of the upper and lower eyelids, the nasal midline, the nasal base, the mucosal edge and vermillion-cutaneous junction of the upper and lower lips, and the lower two-thirds of the face. The locations of the eye centers (green dots) and iris borders (green circles) may also be finely adjusted, which provides additional flexibility and accuracy. The facial midline may be displayed to ensure that the user is satisfied with its position (Figure 1); this line is computed as a line perpendicular to the interpupillary plane at the midpupillary point.
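That midline construction reduces to simple vector geometry: the line through the midpupillary point whose direction is perpendicular to the interpupillary axis. A sketch under that definition follows; pupil coordinates are assumed to come from the fitted iris centers, and the names are illustrative.

```python
import numpy as np

def facial_midline(left_pupil, right_pupil):
    """Return a point and unit direction vector for the facial midline,
    defined as the perpendicular to the interpupillary axis at the
    midpupillary point."""
    left = np.asarray(left_pupil, dtype=float)
    right = np.asarray(right_pupil, dtype=float)
    midpoint = (left + right) / 2.0           # midpupillary point
    axis = right - left                       # interpupillary direction
    normal = np.array([-axis[1], axis[0]])    # rotated 90 degrees in-plane
    return midpoint, normal / np.linalg.norm(normal)

# Example: pupils at (420, 510) and (640, 500); the midline passes
# through (530, 505) and is roughly vertical in the image.
point, direction = facial_midline((420, 510), (640, 500))
```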
Emotrics automatically computes a literature-established set of facial measurements relevant to facial palsy (Figure 1). Brow symmetry, palpebral fissure width, smile excursion, and smile symmetry outputs are automatically generated. During future development of Emotrics, outputs will appear in an even more user-friendly fashion. Emotrics can also measure differences between any 2 images manually selected by the user (Figure 2), although ultimately this process will be automated. By comparing images obtained before and after intervention, users can rapidly obtain measures of treatment effects. Similarly, comparing the facial position during maximum smile effort with the resting facial position enables automated measurement of smile excursion.
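As a concrete illustration of the two-image comparison, the excursion of an oral commissure can be taken as its millimetric displacement, relative to the midpupillary point, between the resting and smiling photographs, with each image normalized by its own iris diameter. The sketch below assumes the standard iBUG 68-point indexing (mouth corners at indices 48 and 54) and illustrates the measurement concept, not the Emotrics implementation.

```python
import numpy as np

MEAN_IRIS_DIAMETER_MM = 11.77  # same calibration constant as above

def commissure_position_mm(landmarks, pupils, iris_px, corner_idx=48):
    """Oral commissure position relative to the midpupillary point, in mm.

    landmarks: (68, 2) array in the iBUG 68-point scheme (corners at 48, 54)
    pupils: pair of (x, y) pupil centers; iris_px: iris diameter in pixels
    """
    lm = np.asarray(landmarks, dtype=float)
    mid = (np.asarray(pupils[0], float) + np.asarray(pupils[1], float)) / 2.0
    return (lm[corner_idx] - mid) * (MEAN_IRIS_DIAMETER_MM / iris_px)

# Each photograph is normalized by its own iris diameter; excursion is the
# displacement of the commissure between the two images:
#   rest_pos  = commissure_position_mm(rest_lm,  rest_pupils,  rest_iris_px)
#   smile_pos = commissure_position_mm(smile_lm, smile_pupils, smile_iris_px)
#   excursion_mm = float(np.linalg.norm(smile_pos - rest_pos))
```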
Figure 1. Graphical User Interface of Emotrics and Facial Measurements Computed From the Facial Landmarks. A, Patient image with landmarks. The icon bar displays the different functionalities: load image, load patient, compare 2 images, change image, fit image to window, resize iris diameter, find facial midline, toggle landmarks, show facial measurements, take screenshot, save results, and exit. Each function is fully described in the video tutorial. The window of Emotrics displays the facial photograph with the 68 facial landmarks (red dots) and bilateral iris positions (green circles). The facial midline (vertical green line) can be easily estimated and displayed as a reference. B, Facial measurements. Pixel width is automatically normalized using a mean iris diameter of 11.77 mm to produce millimetric values. A complete description of each facial measurement is available within Emotrics by clicking on the help icon.
Discussion | Emotrics was created using a database of normal faces, which renders the program prone to localization errors in cases of gross facial asymmetry. User verification of landmarks is recommended for photographs of patients with facial palsy. An additional limitation is that the accuracy of the measurements is dependent on image quality. Image resolution of at least 1 megapixel is ideal; at least 1 iris should be visible, and there should be no head yaw or tilt (head roll is acceptable). If the eyes are closed, the user can import pupil positions from a previous photograph in the same clinical series, which would enable measurements in these images. Where there is strabismus or orbital dystopia, the user may manually select an ideal pupil position that enables facial measurements.
An objective means of characterizing facial displacements is essential to the management of facial palsy. Emotrics harnesses recent advances in machine learning to automatically compute facial displacements from standard photographs. This tool may facilitate communication of disease severity and outcomes within the field of facial reanimation by means of shared, objective, data-driven language. Emotrics is freely available and can be downloaded, along with detailed tutorial videos, from the Sir Charles Bell Society website (http://www.sircharlesbell.org/).
Diego L. Guarin, PhD
Joseph Dusseldorp, MD
Tessa A. Hadlock, MD
Nate Jowett, MD, FRCSC
Author Affiliations: Department of Otolaryngology, Massachusetts Eye and Ear Infirmary, Harvard Medical School, Cambridge (Guarin, Dusseldorp, Hadlock, Jowett); University of Sydney, Sydney, Australia (Dusseldorp).
Corresponding Author: Diego L. Guarin, PhD, Massachusetts Eye and Ear Infirmary, Harvard Medical School, 243 Charles St, Boston, MA 02114 (diego_guarin@meei.harvard.edu).
Published Online: March 15, 2018. doi:10.1001/jamafacial.2018.0030
Conflict of Interest Disclosures: None reported.
Additional Contributions: We thank the patient for granting permission to
publish this information.
1. Hadlock TA, Urban LS. Toward a universal, automated facial measurement tool in facial reanimation. Arch Facial Plast Surg. 2012;14(4):277-282.
2. Bray D, Henstrom DK, Cheney ML, Hadlock TA. Assessing outcomes in facial reanimation: evaluation and validation of the SMILE system for measuring lip excursion during smiling. Arch Facial Plast Surg. 2010;12(5):352-354.
3. Coulson SE, Croxson GR, Gilleard WL. Three-dimensional quantification of “still” points during normal facial movement. Ann Otol Rhinol Laryngol. 1999;108(3):265-268.
4. Kazemi V, Sullivan J. One millisecond face alignment with an ensemble of regression trees. In: 27th IEEE Conference on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE; 2014:1867-1874.
5. King DE. Dlib-ml: a machine learning toolkit. J Mach Learn Res. 2009;10:1755-1758.
6. Sagonas C, Tzimiropoulos G, Zafeiriou S, Pantic M. A semi-automatic methodology for facial landmark annotation. In: 26th IEEE Conference on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE; 2013:896-903.
Figure 2. Case Example of a Patient With Aberrant Recovery From Left-Sided Bell Palsy. A, Before recovery. B, After recovery. C, Difference in facial measurements after recovery. This example shows photographs taken before (A) and after (B) recovery. The facial midline is shown in both patient images, computed as a line perpendicular to the interpupillary plane at the midpupillary point. The difference in facial measurements between photographs obtained before and after recovery (for both sides of the face) is shown in panel C.