Conference Paper

A real-time computer vision system for workers’ PPE and posture detection in actual construction site environment

Abstract

Real-time video detection remains challenging, especially for detecting construction site workers together with their personal protective equipment (PPE; helmets and safety gear) and postures, because the construction site environment presents multiple complications such as varying illumination levels, shadows, complex activities, and a wide range of PPE designs and colours. This paper proposes a novel computer vision (CV) system to detect construction workers’ PPE and postures in real time. Four recording sessions were carried out to build a dataset of 95 videos using a novel design of site cameras. PPE detection covered eight different types of helmets and gear, and posture detection consisted of nine classes. A Python data-labelling tool was used to annotate the selected datasets, and the labelled datasets were used to build a detection model in the TensorFlow environment. The proposed method consists of two layers of decision trees and was tested and validated on two videos of 2000 frames. The proposed model achieves identification and recall ratios of over 83% and 95%, respectively. It also achieved posture classification accuracies of over 72% and 64% in model testing and validation, respectively. The proposed model can support improvements in the application of real-time video analysis under actual site conditions.
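
As an illustration only, the sketch below shows how ratios of the kind reported in the abstract could be computed from detection counts. It assumes that the "identification ratio" can be read as precision, which the abstract does not confirm; the function name and the counts are hypothetical and are not taken from the paper.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) from true-positive, false-positive
    and false-negative detection counts."""
    # Precision: share of reported detections that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of ground-truth objects that were detected.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts, chosen only to exercise the function.
precision, recall = precision_recall(tp=8, fp=2, fn=2)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```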

... Before deep learning [13] was widely-adopted outside the computer vision community, researchers developed systems for PPE detection based on conventional machine learning or statistical methods [5,7,14,16,17,18,19,20,27,29]. It seems that the majority of systems, even from the recent literature [14,16,17,18,29], still rely on handcrafted methods. ...
... Before deep learning [13] was widely-adopted outside the computer vision community, researchers developed systems for PPE detection based on conventional machine learning or statistical methods [5,7,14,16,17,18,19,20,27,29]. It seems that the majority of systems, even from the recent literature [14,16,17,18,29], still rely on handcrafted methods. In order to detect helmets, Shrestha et al. [27] employed a standard face detector. ...
... Nevertheless, these methods are tested on very few images, making it hard to draw generic conclusions. Moohialdin et al. [18] employed decision trees for PPE and posture detection. Decision trees are part of the standard machine learning methods, which cannot model the complexity of data affected by large variations. ...
... Nevertheless, these methods are tested on very few images, making it hard to draw generic conclusions. Moohialdin et al. [18] employed decision trees for PPE and posture detection. Decision trees are part of the standard machine learning methods, which cannot generalize well when data is affected by large variations. ...
Preprint
We propose a deep learning method to automatically detect personal protective equipment (PPE), such as helmets, surgical masks, reflective vests, boots and so on, in images of people. Typical approaches for PPE detection based on deep learning are (i) to train an object detector for items such as those listed above or (ii) to train a person detector and a classifier that takes the bounding boxes predicted by the detector and discriminates between people wearing and people not wearing the corresponding PPE items. We propose a novel and accurate approach that uses three components: a person detector, a body pose estimator and a classifier. Our novelty consists in using the pose estimator only at training time, to improve the prediction performance of the classifier. We modify the neural architecture of the classifier by adding a spatial attention mechanism, which is trained using supervision signal from the pose estimator. In this way, the classifier learns to focus on PPE items, using knowledge from the pose estimator with almost no computational overhead during inference.
Article
Timely and accurate monitoring of onsite construction operations can bring an immediate awareness on project specific issues. It provides practitioners with the information they need to easily and quickly make project control decisions. Despite their importance, the current practices are still time-consuming, costly, and prone to errors. To facilitate the process of collecting and analyzing performance data, researchers have focused on devising methods that can semi-automatically or automatically assess ongoing operations both at project level and operation level. A major line of work has particularly focused on developing computer vision techniques that can leverage still images, time-lapse photos and video streams for documenting the work in progress. To this end, this paper extensively reviews these state-of-the-art vision-based construction performance monitoring methods. Based on the level of information perceived and the types of output, these methods are mainly divided into two categories (namely project level: visual monitoring of civil infrastructure or building elements vs. operation level: visual monitoring of construction equipment and workers). The underlying formulations and assumptions used in these methods are discussed in detail. Finally the gaps in knowledge that need to be addressed in future research are identified.
Article
Purpose – Heat stress, having caused preventable and lamentable deaths, is hazardous to construction workers in the hot and humid summers of Hong Kong. The purpose of this paper is to develop a heat stress model, based on the Wet Bulb Globe Temperature (WBGT) index. Design/methodology/approach – Field studies were conducted during the summer time in Hong Kong (July to September 2010). Based upon 281 sets of synchronized meteorological and physiological data collected from construction workers in four different construction sites between July and September 2010, physiological, work-related, environmental and personal parameters were measured to construct and verify the heat stress model. Findings – It is found that drinking habit, age and work duration are the top three significant predictors to determine construction workers' physiological responses. Other predictors include percentage of body fat, resting heart rate, air pollution index, WBGT, smoking habit, energy consumption, and respiratory exchange rate. The accuracy of the model is verified against data which have not been used in developing the model. The accuracy of the heat stress model is found to be statistically acceptable (Mean Absolute Percentage Error=5.6 percent, Theil's U inequality coefficients=0.003). Practical implications – Based on these findings, appropriate work-rest pattern can be designed to safeguard the well being of workers when working in a hot and humid environment. Originality/value – The model reported in this paper provides a more scientific and reliable prediction of the reality which may benefit the industry to produce solid guidelines for working in hot weather.
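
For readers unfamiliar with the error measure quoted above, Mean Absolute Percentage Error (MAPE) can be sketched as follows. This is a minimal illustration of the standard definition; the function name and the sample values are hypothetical and are not data from the study.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    actual and predicted are equal-length sequences of numbers;
    every actual value must be non-zero.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Average of the absolute relative errors, scaled to percent.
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical measured vs. predicted values, for illustration only.
print(mape([100.0, 200.0], [110.0, 190.0]))
```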
Conference Paper
This paper presents an automated and real-time algorithm for recognition and 2D tracking of construction workers and equipment from site video streams. In recent years, several research studies have proposed semi-automated vision-based methods for tracking of construction workers and equipment. Nonetheless, there is still a need for automated initial recognition and real-time tracking of these resources in video streams. To address these limitations, a new algorithm based on Histograms of Oriented Gradients (HOG) is proposed. The method uses HOG features with a new multiple binary Support Vector Machine (SVM) classifier to automatically recognize and differentiate workers and equipment. These resources are tracked in real-time using a new GPU-based implementation of the detector and classifier. Experimental results are presented on a comprehensive set of video streams on excavators, trucks, and workers collected from different projects. Our preliminary results indicate the applicability of the proposed approach for automated recognition and real-time 2D tracking of workers and equipment from a single video camera. Unlike other methods, our algorithm can enable automated and real-time construction performance assessment (including detection of idle resources) and does not need manual or semi-automated initialization of the resources in 2D video frames. The preliminary experimental results and perceived benefits of the proposed method are discussed in detail.
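
The core idea behind HOG features can be sketched as a gradient-orientation histogram for a single image cell. The code below is a simplified illustration, not the authors' GPU implementation: it omits the vote interpolation and block normalization used in full HOG descriptors as well as the SVM stage, and the function name and the 9-bin default are assumptions.

```python
import math


def cell_histogram(cell, bins=9):
    """Unsigned gradient-orientation histogram (0 to 180 degrees) for one cell.

    cell is a 2D list of grayscale intensities; border pixels are skipped
    because central differences need both neighbours.
    """
    height, width = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central-difference gradients in x and y.
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            # Fold the direction into [0, 180) so opposite gradients share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # Each pixel votes for its orientation bin, weighted by magnitude.
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


# A hypothetical 4x4 cell containing a vertical edge.
edge_cell = [[0, 0, 10, 10] for _ in range(4)]
print(cell_histogram(edge_cell))
```

In a complete pipeline, histograms from many cells would be concatenated into one feature vector and passed to a classifier such as an SVM.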
Article
In construction, about 80%-90% of accidents are associated with workers' unsafe acts. Nevertheless, the measurement of workers' behavior has not been actively applied in practice, due to the difficulties in observing workers on jobsites. In an effort to provide a robust and automated means for worker observation, this paper proposes a framework of vision-based unsafe action detection for behavior monitoring. The framework consists of (1) the identification of critical unsafe behavior, (2) the collection of relevant motion templates and site videos, (3) the 3D skeleton extraction from the videos, and (4) the detection of unsafe actions using the motion templates and skeleton models. For a proof of concept, experimental studies are undertaken to detect unsafe actions during ladder climbing (i.e., reaching far to a side) in motion datasets extracted from videos. The result indicates that the proposed framework can potentially perform well at detecting predefined unsafe actions in videos.
Article
For construction safety and health, continuous monitoring of unsafe conditions and action is essential in order to eliminate potential hazards in a timely manner. As a robust and automated means of field observation, computer vision techniques have been applied for the extraction of safety related information from site images and videos, and regarded as effective solutions complementary to current time-consuming and unreliable manual observational practices. Although some research efforts have been directed toward computer vision-based safety and health monitoring, its application in real practice remains premature due to a number of technical issues and research challenges in terms of reliability, accuracy, and applicability. This paper thus reviews previous attempts in construction applications from both technical and practical perspectives in order to understand the current status of computer vision techniques, which in turn suggests the direction of future research in the field of computer vision-based safety and health monitoring. Specifically, this paper categorizes previous studies into three groups—object detection, object tracking, and action recognition—based on types of information required to evaluate unsafe conditions and acts. The results demonstrate that major research challenges include comprehensive scene understanding, varying tracking accuracy by camera position, and action recognition of multiple equipment and workers. In addition, we identified several practical issues including a lack of task-specific and quantifiable metrics to evaluate the extracted information in safety context, technical obstacles due to dynamic conditions at construction sites and privacy issues. These challenges indicate a need for further research in these areas. 
Accordingly, this paper provides researchers insights into advancing knowledge and techniques for computer vision-based safety and health monitoring, and offers fresh opportunities and considerations to practitioners in understanding and adopting the techniques.
Article
This study aimed to (1) quantify the respective physical workloads of bar bending and fixing; and (2) compare the physiological and perceptual responses between bar benders and bar fixers. Field studies were conducted during the summer in Hong Kong from July 2011 to August 2011 over six construction sites. Synchronized physiological, perceptual, and environmental parameters were measured from construction rebar workers. The average duration of the 39 field measurements was 151.1 ± 22.4 min in a hot environment (WBGT = 31.4 ± 2.2 °C). Energy expenditure of overall rebar work, bar bending, and bar fixing was 2.57, 2.26 and 2.67 kcal/min (179, 158 and 186 W), respectively. Bar fixing induced significantly higher physiological responses in heart rate (113.6 vs. 102.3 beats/min, p < 0.05), oxygen consumption (9.53 vs. 7.14 ml/min/kg, p < 0.05), and energy expenditure (2.67 vs. 2.26 kcal/min, or 186 vs. 158 W, p < 0.05) as compared to bar bending. Perceptual response was higher in bar fixing, but the difference was not statistically significant. Findings of this study enable the calculation of daily energy expenditure of rebar work.
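
The paired energy-expenditure figures above are related by a unit conversion, which the short sketch below reproduces. It assumes the thermochemical kilocalorie (4184 J); the function and constant names are illustrative.

```python
KCAL_TO_JOULE = 4184.0  # assumption: thermochemical kilocalorie


def kcal_per_min_to_watts(kcal_per_min):
    """Convert an energy expenditure rate from kcal/min to watts (J/s)."""
    return kcal_per_min * KCAL_TO_JOULE / 60.0


# Rates quoted in the abstract: overall rebar work, bar bending, bar fixing.
for rate in (2.57, 2.26, 2.67):
    print(rate, "kcal/min ->", round(kcal_per_min_to_watts(rate)), "W")
```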
Article
Technology application is deemed an effective way to further construction safety management. Various technologies have been adopted for construction safety, including information communication technology (ICT), sensor-based technology, 3S (GIS/GPS/RS) technology, radio frequency identification (RFID) and virtual reality. A review of previous studies in the area of technology applications for construction safety would be indispensable for the main stakeholders in this field to share innovative research findings and gain access to future research trends. A three-step method was used to obtain relevant publications (119 papers met the ultimate selection criteria) and compile a database of the findings. The results present a general review of technology application for construction safety from the aspects of number of papers published annually, publication type, publication name, country/region of distribution, research level, project phase and project type. Corresponding analysis was performed with the collected data and the radar chart was used for analysing the trend of technology application for construction safety and the trend of research topics. Five research gaps were identified in the review process. The trends and gaps can serve as motivation for researchers and practitioners to work on the next generation of studies and the development of future effective measures, which can ensure a safe construction environment.
Article
Automatically monitoring construction progress or generating Building Information Models using site image collections – beyond point cloud data – requires semantic information such as construction materials and inter-connectivity to be recognized for building elements. In the case of materials, such information can only be derived from appearance-based data contained in 2D imagery. Currently, the state-of-the-art texture recognition algorithms which are often used for recognizing materials are very promising (reaching over 95% average accuracy), yet they have mainly been tested in strictly controlled conditions and often do not perform well with images collected from construction sites (dropping to 70% accuracy and lower). In addition, there is no benchmark that validates their performance under real-world construction site conditions. To overcome these limitations, we propose a new vision-based method for material classification from single images taken under unknown viewpoint and site illumination conditions. In the proposed algorithm, material appearance is modeled by a joint probability distribution of responses from a filter bank and principal Hue-Saturation-Value color values and classified using a multiple one-vs.-all χ2 kernel Support Vector Machine classifier. Classification performance is compared with the state-of-the-art algorithms both in computer vision and AEC communities. For experimental studies, a new database containing 20 typical construction materials with more than 150 images per category is assembled and used for validation. Overall, for material classification an average accuracy of 97.1% for 200×200 pixel image patches is reported. In cases where image patches are smaller, our method can synthetically generate additional pixels and maintain a competitive accuracy to those reported above (90.8% for 30×30 pixel patches). 
The results show the promise of the applicability of the proposed method and expose the limitations of the state-of-the-art classification algorithms under real world conditions. It further defines a new benchmark that could be used to measure the performance of future algorithms.
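
To make the χ2 kernel mentioned above concrete, the sketch below computes one common form, the exponential χ2 kernel between two non-negative histograms. The abstract does not state which variant or which parameter values the authors used, so the formula choice, the function name and the gamma default are assumptions.

```python
import math


def chi2_kernel(x, y, gamma=1.0):
    """Exponential chi-square kernel between two non-negative histograms:
    k(x, y) = exp(-gamma * sum((x_i - y_i)^2 / (x_i + y_i)))."""
    if len(x) != len(y):
        raise ValueError("histograms must have the same length")
    distance = 0.0
    for a, b in zip(x, y):
        # Bins that are empty in both histograms contribute nothing.
        if a + b > 0:
            distance += (a - b) ** 2 / (a + b)
    return math.exp(-gamma * distance)


# Identical histograms give the maximum similarity of 1.0.
print(chi2_kernel([0.5, 0.5], [0.5, 0.5]))
# Histograms with no overlapping mass give a much smaller value.
print(chi2_kernel([1.0, 0.0], [0.0, 1.0]))
```

A one-vs.-all SVM would then train one binary classifier per material class on kernel values like these.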
Article
Videotaping is an effective and inexpensive technique that has long been used in construction to conduct productivity analyses. However, as schedules of modern construction projects become more and more compressed, the limitation of video-based analysis (an intensive manual reviewing process) contrasts sharply with the need for effortless data analysis methods. This paper presents a study on developing a video interpretation model to interpret videos of construction operations automatically into productivity information. More specifically, this research formalizes key concepts and procedures of video interpretation within the construction domain. It focuses on designing a mechanism for furthering the crosstalk between the prior knowledge of construction operations and computer vision techniques. It uses this mechanism to guide the detection and tracking of project resources as well as work state classifications and abnormal production scenario identifications. The resulting approach has the potential to provide a common base for developing automated video interpretation procedures that can greatly improve current data collection and analysis practices in construction. Experimental results from preliminary studies have shown the potential of the proposed video interpretation method as an improved productivity data analysis method.
Article
Physical work in hot and humid environments imposes health risks, productivity losses and safety problems on workers. Protecting workers from heat-related problems requires quantitative heat stress assessment of the workplace. In this paper, a new index, equivalent temperature (ET), is proposed to measure the environmental heat stress in indoor hot and humid environments. A climate chamber was built to simulate the indoor hot and humid environment, and the safe working time of 144 male volunteers was studied under different climatic conditions in the chamber. The Cox regression method is adopted to obtain the impacts of variables on the safe working time. The new index, ET, is then proposed based on the Cox regression results. The correlations between the ET and the commonly used indexes are determined to test the validity of this new index. Finally, the safe working time associated with the ET is summarized. The results show that the new index has physiological correlates and physical meaning. The ET developed in this paper has the potential to be a practical index to measure the environmental heat stress in indoor hot and humid environments.
Highlights
  • We build a climate chamber to simulate the indoor hot and humid environment.
  • We obtain safe working time.
  • We adopt the Cox regression method to study the safe working time.
  • A new index, equivalent temperature (ET), is defined.
  • The ET shows specific physiological correlates and physical meaning.
Article
This paper is an extension to a paper previously published in the journal Building and Environment. Having determined an optimal recovery time in a controlled climatic environment, this paper aims to investigate the real impact on construction rebar workers by replicating the clinical experimentation to a series of field studies. Field studies were conducted during the summer time in Hong Kong. Nineteen rebar workers performed tasks of fixing and bending steel reinforcement bars on two building construction sites until voluntary exhaustion and were allowed to recover on site until their physiological conditions returned to the pre-work level or lower. Physiological Strain Index (PSI) was used as a yard-stick to determine the rate of recovery. A total of 411 sets of meteorological and physiological data collected over fourteen working days between July and August of 2011 were collated to derive the optimal recovery time. It was found that on average a rebar worker could achieve 94% recovery in 40 min; 93% in 35 min; 92% in 30 min; 88% in 25 min; 84% in 20 min; 78% in 15 min; 68% in 10 min; and 58% in 5 min. Curve estimation results showed that recovery time is a significant variable to predict the rate of recovery (R² = 0.99, P < 0.05). Additional rest times should be introduced between works in extreme hot weather to enable workers to recover from heat stress. Frequency and duration of each rest time should be agreed among different stakeholders based on the cumulative recovery curve.
Article
Visual recording devices such as video cameras, CCTVs, or webcams have been broadly used to facilitate work progress or safety monitoring on construction sites. Without human intervention, however, both real-time reasoning about captured scenes and interpretation of recorded images are challenging tasks. This article presents an exploratory method for automated object identification using standard video cameras on construction sites. The proposed method supports real-time detection and classification of mobile heavy equipment and workers. The background subtraction algorithm extracts motion pixels from an image sequence, the pixels are then grouped into regions to represent moving objects, and finally the regions are identified as a certain object using classifiers. For evaluating the method, the formulated computer-aided process was implemented on actual construction sites, and promising results were obtained. This article is expected to contribute to future applications of automated monitoring systems of work zone safety or productivity.
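
The first two stages described above (extracting motion pixels, then grouping them into regions) can be sketched as follows. This is a minimal illustration that assumes a static background frame and simple thresholded differencing; the article does not specify its background subtraction algorithm, and the function names and threshold are hypothetical. The final classification stage is omitted.

```python
from collections import deque


def motion_mask(background, frame, threshold=25):
    """Mark pixels whose intensity differs from the background by more
    than threshold. Both inputs are 2D lists of grayscale values."""
    return [[abs(f - b) > threshold for f, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]


def connected_regions(mask):
    """Group motion pixels into 4-connected regions and return one
    bounding box (min_y, min_x, max_y, max_x) per region."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill starting from an unvisited motion pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            ys, xs = [], []
            while queue:
                cy, cx = queue.popleft()
                ys.append(cy)
                xs.append(cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < height and 0 <= nx < width and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((min(ys), min(xs), max(ys), max(xs)))
    return boxes


# Hypothetical 4x5 frames: two separate moving blobs on an empty background.
background = [[0] * 5 for _ in range(4)]
frame = [row[:] for row in background]
frame[0][0] = frame[0][1] = 100
frame[3][4] = 100
print(connected_regions(motion_mask(background, frame)))
```

Each bounding box would then be cropped from the frame and passed to a classifier to decide whether it is a worker or a piece of equipment.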
Gatti, U., Migliaccio, G., Bogus, S. M., Priyadarshini, S., & Scharrer, A. (2013). Using Workforce's Physiological Strain Monitoring to Enhance Social Sustainability of Construction. Journal of Architectural Engineering, 19(3), 179-185. https://doi.org/10.1061/(ASCE)AE.1943-5568.0000110