Conference Paper

Human Posture Detection on Lightweight DCNN and SVM in a Digitalized Healthcare System

... This can assist in determining various Salat postures with more accuracy and efficiency. In research [18], MobileNet and Xception, two lightweight deep convolutional neural networks, were modified for better recognition performance while decreasing hyperparameters and addressing data shortages. This was accomplished by employing approaches such as regularization, transfer learning, and neural architecture search. ...
... Furthermore, the lightweight and efficient technique created in this project is consistent with recent efforts [18] aimed at optimizing deep learning architectures such as MobileNet and Xception for posture identification. While their attention was on enhancing recognition performance and reducing model complexity, the emphasis on achieving high accuracy while preserving efficiency reflects the need for resource-efficient solutions in posture recognition systems. ...
Article
Full-text available
Salat, a fundamental act of worship in Islam, is performed five times daily. It entails a specific set of postures and has both spiritual and bodily benefits. Many people, notably novices and the elderly, may have trouble maintaining proper posture and remembering the sequence. Resources, instruction, and practice help address these issues, emphasizing the need for sincerity in prayer. Our contribution in this research is twofold: we have developed a new dataset for Salat posture detection and a hybrid MediaPipe+3DCNN model. The dataset covers 46 individuals performing each of the three compulsory Salat postures of Qayyam, Rukku and Sajdah, and the model was trained and tested with 14,019 images. Our current research offers a solution for correct posture detection that can be used for all ages. We adopted the MediaPipe library design as our methodology, which leverages a multistep detector machine learning pipeline that has proven to work in our research. Using a detector, the pipeline first locates the person's region of interest (ROI) within the frame. The tracker then predicts the pose landmarks and segmentation mask within the ROI, using the ROI-cropped frame as input. A 3D convolutional neural network (3DCNN) was then used for feature extraction and classification from the key points retrieved from the MediaPipe architecture. With real-time evaluation, the newly built model provided 100% accuracy and promising results. We analyzed different evaluation metrics such as loss, precision, recall, F1-score, and area under the curve (AUC) to give the validation process authenticity; the results are 0.03, 1.00, 0.01, 0.99, 1.00 and 0.95, respectively.
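As a rough illustration of the keypoint-extraction step described above, the following sketch uses the MediaPipe Pose solution to turn a single image into a vector of body landmarks; it assumes the `mediapipe` and `opencv-python` packages and is not the authors' code.

```python
# Minimal sketch: extract pose landmarks from one image with MediaPipe Pose.
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose

def extract_landmarks(image_path):
    """Return a list of (x, y, z, visibility) pose landmarks, or None if no person is found."""
    image = cv2.imread(image_path)
    with mp_pose.Pose(static_image_mode=True) as pose:
        results = pose.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    if results.pose_landmarks is None:
        return None
    return [(lm.x, lm.y, lm.z, lm.visibility)
            for lm in results.pose_landmarks.landmark]

# Sequences of such keypoint frames could then be stacked into a tensor and
# fed to a 3D CNN classifier for Qayyam / Rukku / Sajdah recognition.
```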
Article
Full-text available
Radio frequency (RF) spectrum sensing is critical for applications requiring precise object and posture detection and classification. This survey aims to provide a focused review of context-aware RF-based sensing, emphasizing its principles, advancements, and challenges. It specifically examines state-of-the-art techniques such as phased array radar, synthetic aperture radar, and passive RF sensing, highlighting their methodologies, data input domains, and spatial diversity strategies. The paper evaluates feature extraction methods and machine learning approaches used for detection and classification, presenting their accuracy metrics across various applications. Additionally, it investigates the integration of RF sensing with other modalities, such as inertial sensors, to enhance context awareness and improve performance. Challenges like environmental interference, scalability, and regulatory constraints are addressed, with insights into real-world mitigation strategies. The survey concludes by identifying emerging trends, practical applications, and future directions for advancing RF sensing technologies.
Article
Full-text available
Many individuals worldwide pass away as a result of inadequate procedures for prompt illness identification and subsequent treatment. A valuable life can be saved, or at least extended, with the early identification of serious illnesses such as various cancers and other life-threatening conditions. The development of the Internet of Medical Things (IoMT) has made it possible for healthcare technology to offer the general public efficient medical services and make a significant contribution to patients’ recoveries. By using IoMT to diagnose and examine BreakHis v1 400× breast cancer histology (BCH) scans, disorders can be quickly identified and appropriate treatment given to a patient. This can be achieved with imaging equipment capable of auto-analyzing acquired pictures. However, the majority of deep learning (DL)-based image classification approaches involve a large number of parameters and are unsuitable for deployment in IoMT-centered imaging sensors. The goal of this study is to create a lightweight deep transfer learning (DTL) model suited for BCH scan examination that maintains a good level of accuracy. In this study, a lightweight DTL-based model, “MobileNet-SVM”, which is the hybridization of MobileNet and a Support Vector Machine (SVM), is presented for auto-classifying BreakHis v1 400× BCH images. When tested against a real dataset of BreakHis v1 400× BCH images, the suggested technique achieved 100% accuracy on the training dataset. It also obtained an accuracy of 91% and an F1-score of 91.35 on the test dataset. Considering how complicated BCH scans are, the findings are encouraging. The MobileNet-SVM model is well suited to IoMT imaging equipment in addition to having a high degree of precision. According to the simulation findings, the suggested model requires little computational power and time.
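A minimal sketch of the MobileNet-plus-SVM idea: a pre-trained MobileNet (ImageNet weights, classification head removed) produces pooled features and a scikit-learn SVM classifies them. The image and label arrays here are random placeholders, not the BreakHis data.

```python
import numpy as np
import tensorflow as tf
from sklearn.svm import SVC

# MobileNet as a fixed feature extractor (global average pooled output).
extractor = tf.keras.applications.MobileNet(
    weights="imagenet", include_top=False, pooling="avg", input_shape=(224, 224, 3))

def features(images):
    """images: float array (N, 224, 224, 3) with pixel values in [0, 255]."""
    x = tf.keras.applications.mobilenet.preprocess_input(images)
    return extractor.predict(x, verbose=0)

# Placeholder data standing in for histology images and their labels.
X_train = np.random.rand(8, 224, 224, 3).astype("float32") * 255
y_train = np.array([0, 1] * 4)

svm = SVC(kernel="rbf")
svm.fit(features(X_train), y_train)
print("training accuracy:", svm.score(features(X_train), y_train))
```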
Article
Full-text available
Posture detection, aimed at providing assessments for monitoring human health and welfare, has been of great interest to researchers from different disciplines. The use of computer vision systems for posture recognition might result in useful improvements in healthy aging and support for elderly people in their daily activities in the field of health care. The computer vision and pattern recognition communities are particularly interested in automated fall recognition. Human sensing and artificial intelligence have both paid great attention to human posture detection (HPD). The health status of elderly people can be remotely monitored using human posture detection, which can distinguish between positions such as standing, sitting, and walking. The most recent research identified posture using both deep learning (DL) and conventional machine learning (ML) classifiers. However, these techniques do not identify postures effectively, and the models tend to overfit. Therefore, this study suggests a deep convolutional neural network (DCNN) framework to examine and classify human posture in health monitoring systems. This study proposes a feature selection technique, a DCNN, and a machine learning technique to address the previously mentioned problems. The InceptionV3 DCNN model is hybridized with an SVM, and its performance is compared. Furthermore, the performance of the proposed system is validated against other transfer learning (TL) techniques such as InceptionV3, DenseNet121, and ResNet50. This study uses least absolute shrinkage and selection operator (LASSO)-based feature selection to enhance the feature vector. The study also used various techniques, such as data augmentation, dropout, and early stopping, to overcome the problem of model overfitting. The performance of this DCNN framework is tested using the benchmark Silhouettes of Human Posture dataset, and a classification accuracy, loss, and AUC value of 95.42%, 0.01, and 99.35% are attained, respectively. Furthermore, the results of the proposed technology offer a promising solution for indoor monitoring systems.
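The LASSO-based feature selection step can be sketched as below with scikit-learn, where a synthetic matrix stands in for pooled InceptionV3 activations; this is an assumption-laden illustration, not the paper's pipeline.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.feature_selection import SelectFromModel
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2048))   # stand-in for 2048-dim deep features
y = rng.integers(0, 3, size=200)   # stand-in posture labels

model = make_pipeline(
    SelectFromModel(Lasso(alpha=0.01)),  # keep features with non-zero LASSO weights
    SVC(kernel="rbf"),                   # classify the reduced feature vector
)
model.fit(X, y)
print("selected features:", model[0].get_support().sum())
```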
Article
Full-text available
Human posture classification (HPC) is the process of identifying a human pose from a still image or moving image recorded by a digital camera. This makes it easier to keep a record of people’s postures, which is helpful for many applications. The intricate surroundings depicted in the image, such as occlusion and the camera view angle, make HPC a difficult process. Consequently, the development of a reliable HPC system is essential. This study proposes “DeneSVM”, an innovative deep transfer learning-based classification model that pulls characteristics from image datasets to detect and classify human postures. The model is intended to classify the four primary postures of lying, bending, sitting, and standing. The Silhouettes for Human Posture Recognition dataset has been used to train, validate, test, and analyze the suggested model. The DeneSVM model attained the highest test precision (94.72%), validation accuracy (93.79%) and training accuracy (97.06%). When the efficiency of the suggested model was validated using the testing dataset, it also achieved a good accuracy of 95%.
Article
Full-text available
With the advancement in pose estimation techniques, human posture detection has recently received considerable attention in many applications, including ergonomics and healthcare. When using neural network models, overfitting and poor performance are prevalent issues. Recently, convolutional neural networks (CNNs) were successfully used for human posture recognition from human images due to their superior multiscale high-level visual representations over hand-engineered low-level characteristics. However, calculating millions of parameters in a deep CNN requires a significant number of annotated examples, which prevents many deep CNNs such as AlexNet and VGG16 from being used on problems with minimal training data. We propose a new three-phase model for decision support that integrates CNN transfer learning, image data augmentation, and hyperparameter optimization (HPO) to address this problem. The model is used as part of a new decision support framework for the optimization of hyperparameters for the AlexNet, VGG16, CNN, and multilayer perceptron (MLP) models to accomplish optimal classification results. The AlexNet and VGG16 transfer learning algorithms with HPO are used for human posture detection, while the CNN and MLP models are used as standard classifiers for contrast. HPO methods are essential for machine learning and deep learning algorithms because they directly influence the behavior of training algorithms and have a major impact on the performance of machine learning and deep learning models. We used an image data augmentation technique to increase the number of images available for model training, reducing model overfitting and improving classification performance for the AlexNet, VGG16, CNN, and MLP models. The optimal combination of hyperparameters was found for the four models using a random-based search strategy. The MPII human posture dataset was used to test the proposed approach. The proposed models achieved an accuracy of 91.2% using AlexNet, 90.2% using VGG16, 87.5% using CNN, and 89.9% using MLP. This is the first HPO study executed on the MPII human pose dataset.
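Random-search HPO of the kind described can be illustrated with scikit-learn's `RandomizedSearchCV`; the search space, the MLP baseline, and the synthetic data below are hypothetical stand-ins rather than the paper's settings.

```python
import numpy as np
from scipy.stats import loguniform
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 64))      # stand-in for flattened posture features
y = rng.integers(0, 4, size=300)    # stand-in posture labels

# Randomly sample 10 hyperparameter combinations and keep the best by 3-fold CV.
search = RandomizedSearchCV(
    MLPClassifier(max_iter=300),
    param_distributions={
        "hidden_layer_sizes": [(64,), (128,), (128, 64)],
        "alpha": loguniform(1e-5, 1e-1),
        "learning_rate_init": loguniform(1e-4, 1e-1),
    },
    n_iter=10, cv=3, random_state=0)
search.fit(X, y)
print("best hyperparameters:", search.best_params_)
```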
Article
Full-text available
Yoga is a centuries-old style of exercise followed by sports personnel, patients, and physiotherapists as part of their regime. Correct posture and technique are the key to reaping its maximum benefits. Hence, developing a model to classify yoga postures correctly is an emerging research topic. This paper presents a novel architecture that aims to classify various yoga poses. The proposed model estimates and classifies yoga poses into five broad categories with low latency. In the proposed architecture, the images are skeletonized before being input to the model. The skeletonization process is done using the MediaPipe library for body keypoint detection. The paper compares the performance of various deep learning models with and without skeletonization. The learning models achieved their best results when trained on skeletonized images, and the comparison is drawn to establish the positive impact of skeletonization on the results obtained by the various models. VGG16 achieves the highest validation accuracy on non-skeletonized images (95.6%), followed by InceptionV3, NASNetMobile, YogaConvo2d (the proposed model) (89.9%), and lastly InceptionResNetV2. In contrast, the proposed YogaConvo2d model using skeletonized images reports a validation accuracy of 99.62%, followed by VGG16, InceptionResNetV2, NASNetMobile, and InceptionV3.
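A hedged sketch of the skeletonization step: landmarks detected by MediaPipe are redrawn on a blank canvas so that a classifier sees only the body skeleton. It assumes the `mediapipe`, `opencv-python`, and `numpy` packages and is illustrative only.

```python
import cv2
import numpy as np
import mediapipe as mp

mp_pose = mp.solutions.pose
mp_draw = mp.solutions.drawing_utils

def skeletonize(image_bgr, size=(256, 256)):
    """Return a skeleton-only image for the person detected in image_bgr."""
    with mp_pose.Pose(static_image_mode=True) as pose:
        results = pose.process(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
    canvas = np.zeros((size[0], size[1], 3), dtype=np.uint8)   # blank background
    if results.pose_landmarks is not None:
        mp_draw.draw_landmarks(canvas, results.pose_landmarks,
                               mp_pose.POSE_CONNECTIONS)        # draw the skeleton
    return canvas   # this image, not the raw photo, is fed to the classifier
```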
Article
Full-text available
Ambient assisted living is a good way to look after the ageing population, as the number of older adults is increasing at a rapid pace; it enables us to detect humans' activities of daily living (ADLs) and postures. Posture detection is used to provide assessments for monitoring the activity of elderly people. Most of the existing approaches exploit dedicated sensing devices such as cameras, thermal sensors, accelerometers, gyroscopes, magnetometers and so on. Traditional methods record data using these sensors and then train and test machine learning classifiers to identify various human postures. This paper exploits data recorded using ubiquitous devices such as the smartphones we use on a daily basis and classifies different human activities such as standing, sitting, lying, walking, walking downstairs and walking upstairs. Moreover, we have used machine learning and deep learning classifiers including random forest, KNN, logistic regression, multilayer perceptron, decision tree, QDA, SVM, convolutional neural network and long short-term memory as ground truth, and we propose a novel ensemble classification algorithm to classify each human activity. The proposed algorithm demonstrates a classification accuracy of 98%, outperforming the other algorithms.
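The ensemble idea can be sketched with scikit-learn's `VotingClassifier`; the base learners and the synthetic sensor features below are placeholders, not the paper's exact algorithm.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 30))     # stand-in for smartphone sensor features
y = rng.integers(0, 6, size=600)   # six activity classes

# Soft voting: average the class probabilities of the base classifiers.
ensemble = VotingClassifier(
    estimators=[("rf", RandomForestClassifier()),
                ("knn", KNeighborsClassifier()),
                ("lr", LogisticRegression(max_iter=1000)),
                ("svm", SVC(probability=True))],
    voting="soft")
ensemble.fit(X, y)
print("training accuracy:", ensemble.score(X, y))
```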
Article
Full-text available
In this paper, four different strategies are explored for the classification task. Both ResNet50 and VGG-16 are utilized and re-trained to classify our diffusion-weighted magnetic resonance imaging (DWI) database to clarify the existence of prostate cancer (PCa) or not. Transfer learning and data augmentation are applied for ResNet50 and VGG-16 to solve the problem of a lack of labelled data and increase system efficiency. The last fully connected layer is replaced by a Support Vector Machine (SVM) classifier to achieve better accuracy. Both transfer learning and data augmentation are performed for the SVM to increase the performance of our framework. In addition, k-fold cross-validation is applied to test our models' performance. Our proposed techniques are trained and evaluated on a given DWI dataset that involves 1765 patients, of whom 845 are with PCa and 920 are without. This paper employs end-to-end fully convolutional neural networks without any preprocessing or post-processing. The proposed technique based on ResNet50 hybridized with SVM achieves the best performance with 98.79% accuracy, 98.91% area under the curve (AUC), 98.43% sensitivity, 97.99% precision, 95.92% F1 score and a computational time of 2.345 s.
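The k-fold evaluation of an SVM over deep features might look like the following sketch, with random numbers standing in for pooled ResNet50 activations of DWI slices; this is not the authors' code or data.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2048))   # stand-in for pooled ResNet50 features
y = rng.integers(0, 2, size=200)   # PCa vs. non-PCa labels (placeholder)

# 5-fold stratified cross-validation of the SVM head on the extracted features.
scores = cross_val_score(
    SVC(kernel="rbf"), X, y,
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0))
print("fold accuracies:", scores.round(3), "mean:", scores.mean().round(3))
```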
Article
Full-text available
Yoga is an ancient science and discipline that originated in India 5000 years ago. It is used to bring harmony to both body and mind with the help of asanas, meditation and various other breathing techniques, and it brings peace to the mind. Due to the increase of stress in the modern lifestyle, yoga has become popular throughout the world. There are various ways through which one can learn yoga: by attending classes at a yoga centre, through home tutoring, or by self-learning with the help of books and videos. Most people prefer self-learning, but it is hard for them to find the incorrect parts of their yoga poses by themselves. Using the proposed system, the user can select the pose that he/she wishes to practice and then upload a photo of themselves doing the pose. The pose of the user is compared with the pose of the expert, and the difference in angles of various body joints is calculated. Based on this difference of angles, feedback is provided to the user so that he/she can improve the pose.
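The joint-angle comparison can be sketched in a few lines: compute the angle at a joint from three keypoints, then report how far the user's angle deviates from the expert's. The coordinates below are hypothetical.

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle (degrees) at point b formed by the segments b->a and b->c."""
    a, b, c = map(np.asarray, (a, b, c))
    v1, v2 = a - b, c - b
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

# Hypothetical (hip, knee, ankle) keypoints for the expert and the user.
expert_knee = joint_angle((0.40, 0.50), (0.42, 0.70), (0.45, 0.90))
user_knee = joint_angle((0.40, 0.50), (0.48, 0.70), (0.45, 0.90))
print(f"knee angle difference: {abs(expert_knee - user_knee):.1f} degrees")
```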
Conference Paper
Full-text available
Human activity recognition (HAR) is considered one of the most difficult and challenging issues nowadays, and many experiments are in progress regarding this problem. Among the many human activities, mostly six are considered for research in this area. This activity recognition problem can be addressed with the help of smartphones and smartphone sensors, along with connected Internet of Things (IoT) devices. In this research, an improved deep learning scheme is proposed for the recognition of human activities. A customized Neural Network (NN) model was designed and tested for the research. The proposed model obtained 96.47% accuracy on the HAR with smartphones dataset, which is better than most other analyzed models. Sensors such as the accelerometer and gyroscope are the focus of the data analysis portion of this research work. This article gives a clear idea of the dataset, the machine learning algorithms, and the effect of the proposed algorithm.
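A small, hypothetical Keras network of the kind described (not the authors' architecture) could classify the 561-dimensional smartphone HAR feature vectors into six activities, for example:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(561,)),             # UCI HAR feature vector length
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(6, activation="softmax"),   # six activity classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(X_train, y_train, epochs=30, validation_split=0.1)  # data prepared elsewhere
```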
Article
Full-text available
This paper is concerned with posture recognition using ensemble convolutional neural networks (CNNs) in home environments. With the increasing number of elderly people living alone at home, posture recognition is very important for helping elderly people cope with sudden danger. Traditionally, to recognize posture, it was necessary to obtain the coordinates of the body points, depth, frame information of video, and so on. In conventional machine learning, there is a limitation in recognizing posture directly using only an image. However, with advancements in the latest deep learning, it is possible to achieve good performance in posture recognition using only an image. Thus, we performed experiments based on VGGNet, ResNet, DenseNet, InceptionResNet, and Xception as pre-trained CNNs using five types of preprocessing. On the basis of these deep learning methods, we finally present an ensemble deep model combined by majority and average voting. The experiments were performed on a posture database constructed at the Electronics and Telecommunications Research Institute (ETRI), Korea. This database consists of 51,000 images with 10 postures from 51 home environments. The experimental results reveal that the ensemble system based on InceptionResNetV2 with five types of preprocessing shows good performance in comparison to other combination methods and the pre-trained CNNs themselves.
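The two combination rules, average (soft) voting and majority (hard) voting, can be sketched directly on per-model softmax outputs; the probabilities below are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
# probs: (n_models, n_samples, n_classes) softmax outputs of the ensemble members.
probs = rng.dirichlet(np.ones(10), size=(5, 4))        # 5 models, 4 samples, 10 postures

# Average voting: mean of the class probabilities, then argmax.
avg_pred = probs.mean(axis=0).argmax(axis=1)

# Majority voting: each model's hard decision, then the most frequent class per sample.
votes = probs.argmax(axis=2)                            # shape (n_models, n_samples)
majority_pred = np.array([np.bincount(v, minlength=10).argmax() for v in votes.T])

print("average voting:", avg_pred, "majority voting:", majority_pred)
```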
Article
Full-text available
The world’s population is aging: the expansion of the older adult population with multiple physical and health issues is now a huge socio-economic concern worldwide. Among these issues, the loss of mobility among older adults due to musculoskeletal disorders is especially serious as it has severe social, mental and physical consequences. Human body joint monitoring and early diagnosis of these disorders will be a strong and effective solution to this problem. A smart joint monitoring system can identify and record important musculoskeletal-related parameters. Such devices can be utilized for continuous monitoring of joint movements during the normal daily activities of older adults and the healing process of joints (hips, knees or ankles) during the post-surgery period. A viable monitoring system can be developed by combining miniaturized, durable, low-cost and compact sensors with the advanced communication technologies and data processing techniques. In this study, we have presented and compared different joint monitoring methods and sensing technologies recently reported. A discussion on sensors’ data processing, interpretation, and analysis techniques is also presented. Finally, current research focus, as well as future prospects and development challenges in joint monitoring systems are discussed.
Article
Full-text available
Deep learning has demonstrated tremendous success in a variety of application domains in the past few years. This new field of machine learning has been growing rapidly and has been applied to most application domains with some new modalities, which helps to open up new opportunities. Different methods have been proposed across categories of learning approaches, including supervised, semi-supervised and unsupervised learning. Experimental results show state-of-the-art performance of deep learning over traditional machine learning approaches in the fields of image processing, computer vision, speech recognition, machine translation, art, medical imaging, medical information processing, robotics and control, bioinformatics, natural language processing (NLP), cyber security, and many more. This report presents a brief survey of the development of DL approaches, including the Deep Neural Network (DNN), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN) including Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU), Auto-Encoder (AE), Deep Belief Network (DBN), Generative Adversarial Network (GAN), and Deep Reinforcement Learning (DRL). In addition, we include recent developments of advanced variant DL techniques based on the mentioned DL approaches. Furthermore, DL approaches that have been explored and evaluated in different application domains are also covered in this survey. We also cover recently developed frameworks, SDKs, and benchmark datasets used for implementing and evaluating deep learning approaches. Some surveys have been published on deep learning in neural networks [1, 38], along with a survey on RL [234]. However, those papers have not discussed the individual advanced techniques for training large-scale deep learning models or the recently developed methods of generative models [1].
Article
Full-text available
Remarkable progress has been made in image recognition, primarily due to the availability of large-scale annotated datasets and the revival of deep CNNs. CNNs enable learning data-driven, highly representative, layered hierarchical image features from sufficient training data. However, obtaining datasets as comprehensively annotated as ImageNet in the medical imaging domain remains a challenge. There are currently three major techniques that successfully employ CNNs for medical image classification: training the CNN from scratch, using off-the-shelf pre-trained CNN features, and conducting unsupervised CNN pre-training with supervised fine-tuning. Another effective method is transfer learning, i.e., fine-tuning CNN models pre-trained on natural image datasets for medical image tasks. In this paper, we exploit three important, but previously understudied, factors of employing deep convolutional neural networks for computer-aided detection problems. We first explore and evaluate different CNN architectures. The studied models contain 5 thousand to 160 million parameters and vary in number of layers. We then evaluate the influence of dataset scale and spatial image context on performance. Finally, we examine when and why transfer learning from pre-trained ImageNet (via fine-tuning) can be useful. We study two specific computer-aided detection (CADe) problems, namely thoraco-abdominal lymph node (LN) detection and interstitial lung disease (ILD) classification. We achieve state-of-the-art performance on mediastinal LN detection, with 85% sensitivity at 3 false positives per patient, and report the first five-fold cross-validation classification results on predicting axial CT slices with ILD categories. Our extensive empirical evaluation, CNN model analysis and valuable insights can be extended to the design of high-performance CAD systems for other medical imaging tasks.
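A hedged sketch of the transfer-learning-by-fine-tuning strategy discussed above: load ImageNet weights, replace the classification head, and train the new head first. The two-class head and the input size are illustrative assumptions, not the paper's configuration.

```python
import tensorflow as tf

# Pre-trained backbone with the ImageNet classification head removed.
base = tf.keras.applications.ResNet50(weights="imagenet", include_top=False,
                                      pooling="avg", input_shape=(224, 224, 3))
base.trainable = False                      # freeze pre-trained features initially

inputs = tf.keras.Input(shape=(224, 224, 3))
x = tf.keras.applications.resnet50.preprocess_input(inputs)
x = base(x, training=False)
outputs = tf.keras.layers.Dense(2, activation="softmax")(x)   # new task-specific head
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# After the head converges, some top layers of `base` can be unfrozen and the
# whole model fine-tuned with a small learning rate.
```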
Conference Paper
Full-text available
Human pose estimation has made significant progress during the last years. However, current datasets are limited in their coverage of the overall pose estimation challenges. Still, these serve as the common sources to evaluate, train and compare different models on. In this paper we introduce a novel benchmark, "MPII Human Pose", that makes a significant advance in terms of diversity and difficulty, a contribution that we feel is required for future developments in human body models. This comprehensive dataset was collected using an established taxonomy of over 800 human activities [1]. The collected images cover a wider variety of human activities than previous datasets, including various recreational, occupational and householding activities, and capture people from a wider range of viewpoints. We provide a rich set of labels including positions of body joints, full 3D torso and head orientation, occlusion labels for joints and body parts, and activity labels. For each image we provide adjacent video frames to facilitate the use of motion information. Given these rich annotations we perform a detailed analysis of leading human pose estimation approaches and gain insights into the successes and failures of these methods.
Article
Posture identification systems have been extensively useful in areas such as physical activities, ecological awareness events, man-machine interfacing applications, control room panel monitoring systems and autonomous elderly care systems. This paper describes a sensor-based posture detection and guidance system for yoga exercise. In our day-to-day life, doing yoga exercise has become a well-known discipline that helps people to be physically fit and have good mental health. Practicing yoga is a recognized way for human beings to improve their physical and mental activities; it helps regulate an individual’s mind, body and soul, and as a result flexibility, muscle power and body image increase. Existing methods have approached posture analysis through virtual reality and exergames. We propose a suit that is designed and implemented through a virtual master. The suit is integrated with various types of accelerometer sensors fixed on flexible bands, which are light in weight and are used to measure the orientation of the limbs indirectly. The sensor-integrated suit is also very helpful for measuring the direction of tilt. Predefined calibrated values are loaded onto the microcontroller and compared with the measured values; when there is a match, the information is sent to a mobile application. The mobile app identifies whether or not the user is performing the posture correctly based on the information provided, guiding the yoga exercise practice.
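The compare-against-calibration step could be sketched as follows: estimate a limb's tilt from an accelerometer sample and check it against a calibrated target within a tolerance. The target angle, tolerance, and sample values are invented for illustration.

```python
import math

def tilt_deg(ax, ay, az):
    """Tilt of the sensor's x-axis relative to gravity, in degrees."""
    return math.degrees(math.atan2(ax, math.sqrt(ay * ay + az * az)))

CALIBRATED_TILT = 45.0   # hypothetical target limb angle for the current posture
TOLERANCE = 10.0         # acceptable deviation in degrees

measured = tilt_deg(0.6, 0.1, 0.78)   # example accelerometer sample (in g)
ok = abs(measured - CALIBRATED_TILT) <= TOLERANCE
print(f"measured tilt {measured:.1f} deg ->", "posture OK" if ok else "adjust posture")
```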
Article
Human posture detection allows the capture of kinematic parameters of the human body, which is important for many applications, such as assisted living, healthcare, physical exercising and rehabilitation. This task can greatly benefit from recent developments in deep learning and computer vision. In this paper, we propose a novel deep recurrent hierarchical network (DRHN) model based on MobileNetV2 that allows for greater flexibility by reducing or eliminating posture detection problems related to limited visibility of the human torso in the frame, i.e., the occlusion problem. The DRHN network accepts RGB-Depth frame sequences and produces a representation of semantically related posture states. We achieved 91.47% accuracy at a 10 fps rate for sitting posture recognition.
Chapter
In this chapter, we propose a technique for the classification of yoga poses/asanas by learning the 3D landmark points in human poses obtained from a single image. We apply an encoder architecture followed by a regression layer to estimate pose parameters like shape, gesture, and camera position, which are later mapped to 3D landmark points by the SMPL (Skinned Multi-Person Linear) model. The 3D landmark points of each image are the features used for the classification of poses. We experiment with different classification models, including k-nearest neighbors (kNN), support vector machine (SVM), and some popular deep neural networks such as AlexNet, VGGNet, and ResNet. Since this is the first attempt to classify yoga asanas, no dataset is available in the literature. We propose an annotated dataset containing images of yoga poses and validate the proposed method on the newly introduced dataset.
Article
In this paper, the ventricular fibrillation (VF) rhythm is detected using a new approach involving the support vector machine (SVM), adaptive boosting (AdaBoost) and differential evolution (DE) algorithms with the help of an optimal variable combination. The proposed methodology has been validated on training and testing sets obtained from three databases, namely the MIT-BIH malignant ventricular arrhythmia database, the arrhythmia database, and the CUDB database. In the evaluation phase, the proposed methodology shows superior performance in detection of the VF rhythm compared with competing methods: an accuracy of 98.20%, a sensitivity of 98.25%, and a specificity of 98.18% using 5 s ECG segments. Another advantage of our method is that it needs less memory and can be implemented in real time.
Article
Posture detection has received considerable attention in the fields of human sensing and artificial intelligence. Posture detection can be used to remotely monitor the health status of the elderly by identifying postures such as standing, sitting and walking. Most of the current studies used traditional machine learning classifiers to identify posture. However, these methods do not perform well in detecting postures accurately. Therefore, in this study, we propose a novel hybrid approach based on machine learning classifiers (i.e., support vector machine (SVM), logistic regression, k-nearest neighbors (KNN), decision tree, Naive Bayes, random forest, linear discriminant analysis and quadratic discriminant analysis) and deep learning classifiers (i.e., 1D-convolutional neural network (1D-CNN), 2D-convolutional neural network (2D-CNN), LSTM and bidirectional LSTM) for posture detection. The proposed hybrid approach uses the predictions of machine learning (ML) and deep learning (DL) models to improve the performance of the ML and DL algorithms. Experimental results on a widely used benchmark dataset are presented, achieving an accuracy of more than 98%.
Article
Vision-based monocular human pose estimation, one of the most fundamental and challenging problems in computer vision, aims to obtain the posture of the human body from input images or video sequences. Recent developments in deep learning techniques have brought significant progress and remarkable breakthroughs to the field of human pose estimation. This survey extensively reviews the recent deep learning-based 2D and 3D human pose estimation methods published since 2014. The paper summarizes the challenges, main frameworks, benchmark datasets, evaluation metrics, and performance comparisons, and discusses some promising future research directions.
Article
This paper gives an overview of some ways in which our understanding of performance evaluation measures for machine-learned classifiers has improved over the last twenty years. I also highlight a range of areas where this understanding is still lacking, leading to ill-advised practices in classifier evaluation. This suggests that in order to make further progress we need to develop a proper measurement theory of machine learning. I then demonstrate by example what such a measurement theory might look like and what kinds of new results it would entail. Finally, I argue that key properties such as classification ability and data set difficulty are unlikely to be directly observable, suggesting the need for latent-variable models and causal inference.
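For concreteness, the standard measures discussed above can be computed with scikit-learn on a tiny made-up set of labels and scores:

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

y_true  = [0, 0, 1, 1, 1, 0, 1, 0]
y_score = [0.1, 0.4, 0.8, 0.7, 0.3, 0.2, 0.9, 0.6]   # predicted probabilities
y_pred  = [int(s >= 0.5) for s in y_score]            # threshold at 0.5

print("accuracy ", accuracy_score(y_true, y_pred))
print("precision", precision_score(y_true, y_pred))
print("recall   ", recall_score(y_true, y_pred))
print("F1       ", f1_score(y_true, y_pred))
print("ROC AUC  ", roc_auc_score(y_true, y_score))    # threshold-free measure
```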
Article
Object detection, including objectness detection (OD), salient object detection (SOD), and category-specific object detection (COD), is one of the most fundamental yet challenging problems in the computer vision community. Over the last several decades, great efforts have been made by researchers to tackle this problem, due to its broad range of applications for other computer vision tasks such as activity or event recognition, content-based image retrieval and scene understanding, etc. While numerous methods have been presented in recent years, a comprehensive review for the proposed high-quality object detection techniques, especially for those based on advanced deep-learning techniques, is still lacking. To this end, this article delves into the recent progress in this research field, including 1) definitions, motivations, and tasks of each subdirection; 2) modern techniques and essential research trends; 3) benchmark data sets and evaluation metrics; and 4) comparisons and analysis of the experimental results. More importantly, we will reveal the underlying relationship among OD, SOD, and COD and discuss in detail some open questions as well as point out several unsolved challenges and promising future works.
Article
Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers---8x deeper than VGG nets but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
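A minimal sketch of the residual idea: a block learns a residual function F(x) and outputs F(x) + x via a skip connection, so the identity mapping is easy to represent. The Keras block below is illustrative, not the exact architecture from the paper.

```python
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters):
    shortcut = x
    y = layers.Conv2D(filters, 3, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.Add()([y, shortcut])     # skip connection: output = F(x) + x
    return layers.ReLU()(y)

inputs = tf.keras.Input(shape=(56, 56, 64))
outputs = residual_block(inputs, 64)
model = tf.keras.Model(inputs, outputs)
model.summary()
```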
Real-time Recognition of Yoga Poses using Computer Vision for Smart Health Care
  • A Sharma
  • Y Shah
  • Y Agrawal
  • P Jain
A Sonification Method using human body movements
  • F Albu
  • M Nicolau
  • F Pirvan
  • D Hagiescu