IET Computer Vision
Volume 12, Issue 1, February 2018
Guest Editorial: Computer Vision in Healthcare and Assisted Living
- Source: IET Computer Vision, Volume 12, Issue 1, p. 1–2
- DOI: 10.1049/iet-cvi.2017.0628
- Type: Article
Review of constraints on vision-based gesture recognition for human–computer interaction
- Author(s): Biplab Ketan Chakraborty ; Debajit Sarma ; M.K. Bhuyan ; Karl F. MacDorman
- Source: IET Computer Vision, Volume 12, Issue 1, p. 3–15
- DOI: 10.1049/iet-cvi.2017.0052
- Type: Article
The ability of computers to recognise hand gestures visually is essential for progress in human–computer interaction. Gesture recognition has applications ranging from sign language to medical assistance to virtual reality. However, gesture recognition is extremely challenging not only because of its diverse contexts, multiple interpretations, and spatio-temporal variations but also because of the complex non-rigid properties of the hand. This study surveys major constraints on vision-based gesture recognition occurring in detection and pre-processing, representation and feature extraction, and recognition. Current challenges are explored in detail.
Skeleton-based human activity recognition for elderly monitoring systems
- Author(s): Youssef Hbali ; Sara Hbali ; Lahoucine Ballihi ; Mohammed Sadgal
- Source: IET Computer Vision, Volume 12, Issue 1, p. 16–26
- DOI: 10.1049/iet-cvi.2017.0062
- Type: Article
There is a significantly increasing demand for elderly-monitoring systems in the health-care sector. As the population ages, patient privacy concerns and the cost of elderly assistance have driven the research community toward computer vision and image processing to design and deploy new systems that monitor the elderly and turn their homes into smart environments. Exploiting recent advances in, and the low cost of, three-dimensional (3D) depth sensors such as the Microsoft Kinect, the authors propose a new skeleton-based approach that describes the spatio-temporal aspects of a human activity sequence using the Minkowski and cosine distances between 3D joints. They trained and validated their approach on the Microsoft MSR 3D Action and MSR Daily Activity 3D datasets using the Extremely Randomised Trees algorithm. The results are very promising, demonstrating that the trained model can be used to build an elderly-monitoring system with open-source libraries and a low-cost depth sensor.
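The joint-distance features the abstract describes can be sketched as follows; the joint list, the Minkowski order p, and the pairwise feature layout are illustrative assumptions, not the authors' exact descriptor (the learned classifier step is omitted).

```python
import math

def minkowski(a, b, p=2):
    """Minkowski distance of order p between two 3D joint positions."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def cosine_distance(a, b):
    """Cosine distance between two (non-zero) 3D joint position vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)

def frame_features(joints, p=2):
    """Pairwise joint distances for one skeleton frame -> feature vector."""
    feats = []
    for i in range(len(joints)):
        for j in range(i + 1, len(joints)):
            feats.append(minkowski(joints[i], joints[j], p))
            feats.append(cosine_distance(joints[i], joints[j]))
    return feats
```

Per-frame vectors of this kind could then be stacked over time and fed to a classifier such as Extremely Randomised Trees.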
Two-person activity recognition using skeleton data
- Author(s): Alessandro Manzi ; Laura Fiorini ; Raffaele Limosani ; Paolo Dario ; Filippo Cavallo
- Source: IET Computer Vision, Volume 12, Issue 1, p. 27–35
- DOI: 10.1049/iet-cvi.2017.0118
- Type: Article
Human activity recognition is an important and active field of research with a wide range of applications, including ambient-assisted living (AL). Although most research focuses on a single user, the ability to recognise two-person interactions is perhaps more important for its social implications. This study presents a two-person activity recognition system that uses skeleton data extracted from a depth camera. Human actions are encoded using a small set of basic postures obtained with an unsupervised clustering approach. Multiclass support vector machines are used to build models on the training set, whereas the X-means algorithm is employed to find the optimal number of clusters for each sample dynamically during the classification phase. The system is evaluated on the Institute of Systems and Robotics (ISR)-University of Lincoln (UoL) and Stony Brook University (SBU) datasets, reaching overall accuracies of 0.87 and 0.88, respectively. Although the results show that the system's performance is comparable with the state of the art, recognition improvements are obtained for activities related to health-care environments, showing promise for applications in the AL realm.
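A minimal sketch of the posture-encoding step: a skeleton sequence becomes a sequence of basic-posture symbols by nearest-centroid assignment, assuming the posture centroids were already produced by clustering (the X-means and SVM stages are not reproduced here).

```python
def nearest_centroid(x, centroids):
    """Index of the closest posture centroid (squared Euclidean)."""
    def d2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(range(len(centroids)), key=lambda i: d2(centroids[i]))

def encode_sequence(frames, centroids):
    """Encode skeleton frames as basic-posture symbols,
    collapsing consecutive repeats of the same posture."""
    symbols = [nearest_centroid(f, centroids) for f in frames]
    out = []
    for s in symbols:
        if not out or out[-1] != s:
            out.append(s)
    return out
```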
Energy expenditure estimation using visual and inertial sensors
- Author(s): Lili Tao ; Tilo Burghardt ; Majid Mirmehdi ; Dima Damen ; Ashley Cooper ; Massimo Camplani ; Sion Hannuna ; Adeline Paiement ; Ian Craddock
- Source: IET Computer Vision, Volume 12, Issue 1, p. 36–47
- DOI: 10.1049/iet-cvi.2017.0112
- Type: Article
Deriving a person's energy expenditure accurately forms the foundation for tracking physical activity levels across many health and lifestyle monitoring tasks. In this study, the authors present a method for estimating calorific expenditure from combined visual and accelerometer sensors by way of an RGB-Depth camera and a wearable inertial sensor. The proposed individual-independent framework fuses information from both modalities, which leads to improved estimates beyond the accuracy of single-modality and manual metabolic equivalents of task (MET) lookup-table-based methods. For evaluation, the authors introduce a new dataset, SPHERE_RGBD + Inertial_calorie, for which visual and inertial data are obtained simultaneously with indirect calorimetry ground-truth measurements based on gas exchange. Experiments show that fusing visual and inertial data reduces the estimation error by 8% and 18% compared with using visual or inertial sensors alone, respectively, and by 33% compared with a MET-based approach. The authors conclude that the proposed approach is suitable for home monitoring in a controlled environment.
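In the simplest form, fusing the two modalities amounts to combining per-window estimates; the sketch below uses a fixed weighted average purely as a placeholder, which is an assumption, not the authors' learned fusion scheme.

```python
def fused_calorie_estimate(visual_est, inertial_est, w_visual=0.5):
    """Late fusion of per-window calorie estimates from the two
    modalities; w_visual is an illustrative fixed weight."""
    return [w_visual * v + (1.0 - w_visual) * a
            for v, a in zip(visual_est, inertial_est)]
```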
Event-driven system for fall detection using body-worn accelerometer and depth sensor
- Author(s): Michal Kepski and Bogdan Kwolek
- Source: IET Computer Vision, Volume 12, Issue 1, p. 48–58
- DOI: 10.1049/iet-cvi.2017.0119
- Type: Article
The authors present efficient and effective algorithms for fall detection based on sequences of depth maps and data from a wireless inertial sensor worn by the monitored person. A set of descriptors is discussed that permits distinguishing between accidental falls and activities of daily living. Experimental validation is carried out on a freely available dataset consisting of synchronised depth and accelerometric data. Extensive experiments are conducted in scenarios with a static camera facing the scene and an active camera observing the same scene from above. Several experiments comprising real-time person detection, tracking, and fall detection are carried out to show the efficiency and reliability of the proposed solutions. The experimental results show that the developed fall-detection algorithms have high sensitivity and specificity.
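The basic event-driven idea, an acceleration spike confirmed by depth-derived evidence of lying, can be sketched as follows; the thresholds and the head-height cue are illustrative assumptions and not the paper's actual descriptors.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def accel_magnitude(ax, ay, az):
    """Magnitude of a 3-axis accelerometer sample."""
    return math.sqrt(ax * ax + ay * ay + az * az)

def detect_fall(accel_samples, head_heights,
                impact_thresh=2.5 * G, lying_height=0.40):
    """Flag a fall when an acceleration spike (inertial sensor) is
    followed by the tracked head staying near the floor (depth maps).
    Thresholds here are placeholders, not validated values."""
    impact = any(accel_magnitude(*s) > impact_thresh for s in accel_samples)
    lying = bool(head_heights) and min(head_heights) < lying_height
    return impact and lying
```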
Vision-based game design and assessment for physical exercise in a robot-assisted rehabilitation system
- Author(s): Huseyin Erdogan ; Yunus Palaska ; Engin Masazade ; Duygun Erol Barkana ; Hazım Kemal Ekenel
- Source: IET Computer Vision, Volume 12, Issue 1, p. 59–68
- DOI: 10.1049/iet-cvi.2017.0122
- Type: Article
Engagement is a key factor in gaming. In gamification applications especially, users' engagement levels must be assessed to determine the usability of the developed games. The authors first present a computer vision-based game design for physical exercise, with all games played via gesture controls. They conduct user studies to evaluate the perception of the games using a game engagement questionnaire; participants state that the games are interesting and that they want to play them again. Next, as a use case, the authors integrate one of these games into a robot-assisted rehabilitation system. They perform additional user studies employing the self-assessment manikin to assess difficulty levels ranging from boredom to excitement, observing that users' arousal increases with difficulty. Additionally, they analyse the participants' psychophysiological signals during game play under two distinct difficulty levels, deriving features from blood volume pulse (BVP), skin conductance, and skin temperature sensors. Through analysis of variance and sequential forward selection, the authors find that changes in temperature and in the frequency content of BVP provide useful information for estimating players' engagement.
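The sequential forward selection step mentioned above can be sketched generically; here `score` stands in for any subset-quality measure (e.g. cross-validated classification accuracy), which is an assumption for illustration.

```python
def sequential_forward_selection(features, score, k):
    """Greedy SFS: repeatedly add the feature that most improves
    `score` of the selected subset; stop at k features or when no
    candidate improves the score."""
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: score(selected + [f]))
        if score(selected + [best]) <= score(selected):
            break  # no candidate improves the subset
        selected.append(best)
        remaining.remove(best)
    return selected
```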
Gait features fusion for efficient automatic age classification
- Author(s): Nabila Mansouri ; Mohammed Aouled Issa ; Yousra Ben Jemaa
- Source: IET Computer Vision, Volume 12, Issue 1, p. 69–75
- DOI: 10.1049/iet-cvi.2017.0055
- Type: Article
At a great distance from the camera, image resolution is significantly degraded and the subject cannot cooperate with the acquisition equipment, so classical intrusive biometric approaches cannot be applied. As a non-intrusive biometric, gait analysis has gained the attention of the computer vision community for a number of potential applications, such as age estimation. Since gait is very sensitive to ageing, gait analysis is a suitable solution for age estimation far from the camera. Given the complexity of this task, the authors propose a new approach based on a cascade of descriptors, fusing several efficient contour and silhouette descriptors. They first introduce a new descriptor based on a silhouette projection model (SM), and then merge it with the best existing descriptors to enhance classification performance. Although age classification from gait is very challenging, experiments conducted on the OU-ISIR database show that the proposed descriptor-fusion approach considerably enhances the recognition rate.
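A silhouette projection profile of the kind such descriptors build on can be sketched as follows; the authors' SM descriptor is more elaborate, so this only illustrates the underlying idea on a binary silhouette mask.

```python
def silhouette_projections(mask):
    """Horizontal and vertical projection profiles of a binary
    silhouette mask (a rectangular list of rows of 0/1 values):
    row sums and column sums of foreground pixels."""
    rows = [sum(r) for r in mask]
    cols = [sum(c) for c in zip(*mask)]
    return rows, cols
```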
Rigid blocks matching method based on contour curves and feature regions
- Author(s): Fuqun Zhao ; Mingquan Zhou ; Guohua Geng ; Lipin Zhu
- Source: IET Computer Vision, Volume 12, Issue 1, p. 76–85
- DOI: 10.1049/iet-cvi.2016.0392
- Type: Article
This study proposes a blocks-matching method based on contour curves and feature regions that improves the precision and speed with which rigid blocks of a specified thickness in point clouds are matched. The method comprises two steps: coarse matching and fine matching. In the coarse matching step, the rigid blocks are first segmented into a series of surfaces and the fracture surfaces are distinguished. Then the contour curves of the fracture surfaces are extracted using an improved boundary-growth method, and the rigid blocks are coarsely matched with them. In the fine matching step, feature regions are first extracted from the fracture surfaces. Then the centroid of each feature region is calculated, and fine matching of the rigid blocks against the centroid sets is completed using an improved iterative closest point (ICP) algorithm. The improved ICP algorithm integrates a rotation-angle constraint and a dynamic iteration coefficient into a probabilistic ICP algorithm, which significantly improves matching precision and speed. Experiments conducted on public blocks and Terracotta Warriors blocks indicate that the proposed method matches rigid blocks more accurately and rapidly than various conventional methods.
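For intuition, a basic 2-D ICP loop is sketched below: alternate nearest-neighbour matching with a closed-form rigid alignment. The authors' version adds a rotation-angle constraint, a dynamic iteration coefficient, and a probabilistic formulation, none of which are shown here.

```python
import math

def centroid(pts):
    """Mean of a list of 2-D points."""
    n = len(pts)
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)

def nearest(p, pts):
    """Nearest neighbour of p among pts (squared Euclidean)."""
    return min(pts, key=lambda q: (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

def icp_2d(src, dst, iters=20):
    """Basic 2-D ICP: match each source point to its nearest target
    point, then apply the closed-form rigid transform (2-D Kabsch)."""
    cur = list(src)
    for _ in range(iters):
        pairs = [(p, nearest(p, dst)) for p in cur]
        ca = centroid([a for a, _ in pairs])
        cb = centroid([b for _, b in pairs])
        # Optimal rotation angle from centred correspondences.
        num = sum((a[0] - ca[0]) * (b[1] - cb[1]) - (a[1] - ca[1]) * (b[0] - cb[0])
                  for a, b in pairs)
        den = sum((a[0] - ca[0]) * (b[0] - cb[0]) + (a[1] - ca[1]) * (b[1] - cb[1])
                  for a, b in pairs)
        t = math.atan2(num, den)
        c, s = math.cos(t), math.sin(t)
        cur = [(c * (x - ca[0]) - s * (y - ca[1]) + cb[0],
                s * (x - ca[0]) + c * (y - ca[1]) + cb[1]) for x, y in cur]
    return cur
```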
Moving object detection zone using a block-based background model
- Author(s): Omar Elharrouss ; Abdelghafour Abbad ; Driss Moujahid ; Hamid Tairi
- Source: IET Computer Vision, Volume 12, Issue 1, p. 86–94
- DOI: 10.1049/iet-cvi.2017.0136
- Type: Article
Background modelling is critical for background-subtraction-based approaches and for a wide range of applications. Background generation becomes difficult when the scene is complex or an object stays in the scene for a long time. Here, the authors propose block-based background initialisation, using the sum of absolute differences (SAD), and modelling, using a block-based entropy evaluation, with a computational cost low enough to be feasible on embedded platforms. In general, many background-subtraction approaches are sensitive to sudden illumination changes in the scene and cannot update the background image accordingly. The proposed background-modelling approach addresses this illumination-change problem. The moving-object detection mask is computed using a threshold selected as the mean of the SAD between the background blocks and the blocks of the current frame. Qualitative and quantitative results comparing the authors' approach with several existing methods show that it is effective for background generation and moving-object detection.
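The block-SAD masking step described above can be sketched as follows; the flat row-major frame layout and per-block mean threshold are illustrative simplifications of the paper's pipeline (initialisation and entropy evaluation are omitted).

```python
def block_sad(block_a, block_b):
    """Sum of absolute differences between two equal-sized pixel blocks."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b))

def split_blocks(frame, w, h, bs):
    """Split a flat grayscale frame (row-major, length w*h) into
    non-overlapping bs x bs blocks (w and h divisible by bs)."""
    blocks = []
    for by in range(0, h, bs):
        for bx in range(0, w, bs):
            blocks.append([frame[(by + y) * w + (bx + x)]
                           for y in range(bs) for x in range(bs)])
    return blocks

def moving_mask(background, frame, w, h, bs):
    """Per-block foreground mask: a block is foreground when its SAD
    against the background exceeds the mean SAD over all blocks."""
    bg_blocks = split_blocks(background, w, h, bs)
    fr_blocks = split_blocks(frame, w, h, bs)
    sads = [block_sad(b, f) for b, f in zip(bg_blocks, fr_blocks)]
    thresh = sum(sads) / len(sads)
    return [s > thresh for s in sads]
```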
Evaluation of shadow features
- Author(s): Liangqiong Qu ; Jiandong Tian ; Huijie Fan ; Wentao Li ; Yandong Tang
- Source: IET Computer Vision, Volume 12, Issue 1, p. 95–103
- DOI: 10.1049/iet-cvi.2017.0159
- Type: Article
Shadow features such as colour ratio, texture, and chromaticity have proved quite effective in shadow detection, and many detection methods have been proposed on the basis of different features. However, previous work on shadow detection mainly focuses on designing an effective classifier for existing shadow features, paying less attention to the analysis of the features themselves. The majority of studies simply report final shadow detection results rather than evaluating each feature. Readers often do not know which features are more effective or whether these shadow features are complementary. The following problems remain unsolved: the robustness of each feature, which feature plays the most important role in a detection method, and the best performance current features can reach. The purpose of this study is to answer these questions, and the authors hope it can offer guidance for future shadow detection algorithms via an evaluation of frequently used shadow features. Several useful and interesting conclusions are drawn from extensive comparison experiments on a large dataset.
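One frequently used feature of the kind this study evaluates, chromaticity constancy under a luminance drop, can be illustrated as below; the thresholds are arbitrary placeholders, not values from the paper.

```python
def chromaticity(pixel):
    """Normalised rg chromaticity of an (R, G, B) pixel."""
    r, g, b = pixel
    s = (r + g + b) or 1  # guard against a black pixel
    return r / s, g / s

def shadow_candidate(pixel, bg_pixel, lum_drop=0.6, chroma_tol=0.05):
    """A pixel is a shadow candidate if it is noticeably darker than
    the background yet keeps roughly the same chromaticity."""
    lum = sum(pixel) / 3.0
    bg_lum = sum(bg_pixel) / 3.0
    darker = lum < lum_drop * bg_lum
    cr, cg = chromaticity(pixel)
    br, bg_g = chromaticity(bg_pixel)
    similar = abs(cr - br) < chroma_tol and abs(cg - bg_g) < chroma_tol
    return darker and similar
```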
Non-concept density estimation via kernel regression for concept ranking in weakly labelled data
- Author(s): Liantao Wang ; Qingwu Li ; Jianfeng Lu ; Qiong Wang
- Source: IET Computer Vision, Volume 12, Issue 1, p. 104–109
- DOI: 10.1049/iet-cvi.2017.0036
- Type: Article
Automatic object annotation for weakly labelled images and videos has attracted great research interest. In the literature, the idea of negative mining has been proposed for this task. Following existing work, the authors start with image/video over-segmentation. Under the assumption that the noisy segments in the concept images and the strongly labelled non-concept segments are drawn from the same distribution, the authors estimate the non-concept distribution and apply it to the ambiguous segments to generate a concept ranking. Although this idea was proposed in existing work and shown to be ineffective when combined with a naive kernel density estimation strategy, in this study the authors explore improved density estimation techniques for the ranking and propose a kernel regression model whose parameters are estimated by maximum likelihood. Experimental results validate the effectiveness of their method.
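The ranking idea, score ambiguous segments by their estimated non-concept density and put low-density (least negative-like) segments first, can be sketched with the naive 1-D Gaussian KDE baseline; the study's improved kernel regression estimator is not reproduced here, and the 1-D features are an illustrative simplification.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a 1-D Gaussian kernel density estimator fit to `samples`."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2 * math.pi))
    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                          for s in samples)
    return density

def concept_ranking(ambiguous, negatives, bandwidth=0.5):
    """Rank ambiguous segment features: lowest non-concept density
    first, i.e. the segments least like the labelled negatives."""
    density = gaussian_kde(negatives, bandwidth)
    return sorted(ambiguous, key=density)
```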
Human action recognition using similarity degree between postures and spectral learning
- Author(s): Wenwen Ding ; Kai Liu ; Hao Chen ; Fengqin Tang
- Source: IET Computer Vision, Volume 12, Issue 1, p. 110–117
- DOI: 10.1049/iet-cvi.2017.0031
- Type: Article
In recent years, there has been renewed interest in developing methods for skeleton-based human action recognition. This study addresses the challenging problem of measuring the degree of similarity between skeleton-based human postures. A posture is described by screw motions between 3D rigid bodies, which can be expressed as a relation matrix of 3D rigid bodies (RMRB3D). A linear subspace, i.e. a point on a Grassmannian manifold, is spanned by an orthonormal basis of the RMRB3D matrix, and the similarity degree between postures is computed as the geodesic distance between the corresponding points on the manifold. Representative postures are then extracted through spectral clustering, and an action is represented by a symbol sequence generated with a global linear eigenfunction constructed by spectral embedding. Finally, dynamic time warping and a hidden Markov model (HMM) are used to classify these action sequences. Experimental evaluations on several challenging 3D action datasets show that the proposed approaches achieve promising results compared with other skeleton-based human action recognition algorithms.
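For intuition, on the simplest Grassmannian, Gr(1, n), the geodesic distance reduces to the principal angle between two lines; the paper works with higher-dimensional subspaces, where all principal angles are combined, so this sketch only illustrates the degenerate one-dimensional case.

```python
import math

def unit(v):
    """Normalise a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def geodesic_gr1(a, b):
    """Geodesic distance on Gr(1, n): the principal angle between the
    two lines spanned by a and b (sign of the spanning vector is
    irrelevant, hence the absolute value)."""
    ua, ub = unit(a), unit(b)
    c = abs(sum(x * y for x, y in zip(ua, ub)))
    return math.acos(min(1.0, c))  # clamp guards float round-off
```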