IET Computer Vision
Volume 8, Issue 4, August 2014
Robust local stereo matching under varying radiometric conditions
- Author(s): Yufu Qu ; Jixiang Jiang ; Xiangjin Deng ; Yanhong Zheng
- Source: IET Computer Vision, Volume 8, Issue 4, pp. 263–276
- DOI: 10.1049/iet-cvi.2013.0117
- Type: Article
The authors present a local stereo matching algorithm whose performance is insensitive to changes in radiometric conditions between the input images. First, a prior on the disparities is built by combining the DAISY descriptor and Census filtering. Then, a Census-based cost aggregation with a self-adaptive window is performed. Finally, maximum a posteriori estimation is carried out to compute the disparity. The authors' algorithm is compared with both local and global stereo matching algorithms (NLCA, ELAS, ANCC, AdaptWeight and CSBP) on the Middlebury datasets. The results show that the proposed algorithm achieves high-accuracy dense disparity estimates and is more robust to radiometric differences between the input images than the other algorithms.
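The abstract does not include code; as an illustrative aside, the standard Census transform it builds on can be sketched as follows. Census codes depend only on the ordering of intensities within a window, which is why Census-based costs survive monotonic radiometric changes. The function names and window radius below are illustrative, not from the paper:

```python
import numpy as np

def census_transform(img, r=1):
    # Each pixel becomes a bit string recording, for every neighbour in a
    # (2r+1)x(2r+1) window, whether that neighbour is darker than the centre.
    # Only the ordering of intensities matters, not their absolute values,
    # which is what makes Census costs robust to radiometric changes.
    h, w = img.shape
    code = np.zeros((h, w), dtype=np.int64)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dy == 0 and dx == 0:
                continue
            shifted = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
            code = (code << 1) | (shifted < img)
    return code

def hamming_cost(c1, c2):
    # Matching cost between two Census codes: the number of differing bits.
    x = np.bitwise_xor(c1, c2)
    return np.array([bin(int(v)).count("1") for v in x.ravel()]).reshape(x.shape)
```

Because a monotonic intensity change (such as a gamma shift) preserves the within-window ordering, it leaves the Census codes, and hence the matching cost, unchanged.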
Recognising human interaction from videos by a discriminative model
- Author(s): Yu Kong ; Wei Liang ; Zhen Dong ; Yunde Jia
- Source: IET Computer Vision, Volume 8, Issue 4, pp. 277–286
- DOI: 10.1049/iet-cvi.2013.0042
- Type: Article
This study addresses the problem of recognising interactions between two people. The main difficulties lie in the partial occlusion of body parts and the motion ambiguity in interactions. The authors observed that the interdependencies existing at both the action level and the body-part level can greatly help disambiguate similar individual movements and facilitate human interaction recognition. Accordingly, they propose a novel discriminative method, which models the action of each person by a large-scale global feature and local body part features, to capture such interdependencies for recognising the interaction of two people. A variant of the multi-class AdaBoost method is proposed to automatically discover class-specific discriminative three-dimensional body parts. The proposed approach is tested on the authors' newly introduced BIT-Interaction dataset and the UT-Interaction dataset. The results show that the proposed model is quite effective in recognising human interactions.
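The paper's multi-class AdaBoost variant is not given in the abstract; as a hedged illustration of the boosting mechanism it builds on, here is a minimal binary AdaBoost with one-feature threshold stumps. The binary simplification and all names are ours, not the authors':

```python
import numpy as np

def adaboost_stumps(X, y, rounds=5):
    # Minimal binary AdaBoost with threshold stumps. Each round fits the
    # stump with the lowest weighted error, then reweights the samples to
    # emphasise the mistakes, so later stumps focus on harder examples --
    # the mechanism by which boosting selects discriminative features
    # (body parts, in the paper's setting).
    n, d = X.shape
    w = np.full(n, 1.0 / n)
    learners = []
    for _ in range(rounds):
        best = None
        for f in range(d):
            for t in np.unique(X[:, f]):
                for sign in (1, -1):
                    pred = np.where(sign * (X[:, f] - t) >= 0, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, f, t, sign, pred)
        err, f, t, sign, pred = best
        err = min(max(err, 1e-10), 1 - 1e-10)    # avoid log(0)
        alpha = 0.5 * np.log((1 - err) / err)    # stump weight
        w = w * np.exp(-alpha * y * pred)        # upweight mistakes
        w = w / w.sum()
        learners.append((alpha, f, t, sign))
    return learners

def boosted_predict(learners, X):
    # Weighted vote of all stumps.
    score = sum(a * np.where(s * (X[:, f] - t) >= 0, 1, -1)
                for a, f, t, s in learners)
    return np.sign(score)
```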
Gradient descent with adaptive momentum for active contour models
- Author(s): Guoqi Liu ; Zhiheng Zhou ; Huiqiang Zhong ; Shengli Xie
- Source: IET Computer Vision, Volume 8, Issue 4, pp. 287–298
- DOI: 10.1049/iet-cvi.2013.0089
- Type: Article
In active contour models (snakes), replacing the gradient of the original external energy in the equations of motion with various vector force fields is a popular way to extract the object boundary. The gradient descent method is usually used to obtain the equations of motion by minimising the energy functional. However, it often becomes trapped in local minima when extracting complex geometries because the functional is non-convex. A gradient descent method with an adaptive momentum term is proposed in this study. First, an acceleration function of the evolution is defined. Then, the adaptive momentum term is obtained as the product of the edge stopping function and the defined acceleration function. Finally, the adaptive momentum term is incorporated into the snake evolution. The edge stopping function decides the influence region of the momentum, whereas the defined acceleration function determines its magnitude. When the adaptive momentum is added to snakes such as gradient vector field or vector field convolution snakes, complex geometries (such as deep concavities) can be extracted. The proposed method also accelerates the rate of convergence, and it can be applied to extract a single object in real images. The experimental results show that the proposed method is effective and efficient.
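As an aside, the generic heavy-ball update underlying any momentum-augmented gradient descent can be sketched as below. A constant coefficient `beta` stands in for the paper's spatially adaptive term (edge stopping function times acceleration function), so this illustrates the mechanism, not the authors' scheme:

```python
import numpy as np

def gradient_descent_momentum(grad, x0, steps=200, lr=0.1, beta=0.9):
    # Heavy-ball gradient descent: the velocity v accumulates past
    # gradients, letting the iterate coast through flat or mildly
    # non-convex regions instead of stalling -- the role the adaptive
    # momentum plays for snakes evolving into deep concavities.
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(steps):
        v = beta * v - lr * grad(x)   # momentum update
        x = x + v                     # move along the velocity
    return x
```

In the paper, the constant `beta` is replaced by a per-pixel product of the edge stopping function and the acceleration function, so momentum acts only where the contour is far from edges.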
Global and local exploitation for saliency using bag-of-words
- Author(s): Zhenzhu Zheng ; Yun Zhang ; Luxin Yan
- Source: IET Computer Vision, Volume 8, Issue 4, pp. 299–304
- DOI: 10.1049/iet-cvi.2013.0132
- Type: Article
The guidance of attention helps the human visual system to detect objects rapidly. In this study, the authors present a new saliency detection algorithm based on the bag-of-words (BOW) representation. They regard salient regions as arising from globally rare features and from regions that locally differ from their surroundings. The approach consists of three stages. First, the global rarity of visual words is calculated: a vocabulary (a group of visual words) is generated from the given image, and a rarity factor for each visual word is derived from its frequency of occurrence. Second, local contrast is calculated: each local patch is represented by a histogram of words, and local contrast is computed as the difference between the BOW histograms of a patch and its surroundings. Finally, saliency is measured as the combination of global rarity and local patch contrast. The authors compare their model with previous methods on natural images, and the experimental results demonstrate good performance and fair consistency with human eye fixations.
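A minimal sketch of the two quantities the abstract combines, under illustrative choices the paper does not specify (negative log frequency for rarity, chi-square distance for histogram contrast):

```python
import numpy as np

def word_rarity(labels, n_words):
    # Global rarity of each visual word: words occurring rarely in the
    # image receive a high rarity factor (negative log frequency here).
    counts = np.bincount(labels, minlength=n_words).astype(float)
    freq = counts / counts.sum()
    return -np.log(freq + 1e-12)

def local_contrast(patch_hist, surround_hist):
    # Local contrast between the BoW histogram of a patch and that of
    # its surroundings (chi-square distance as an illustrative choice).
    p = np.asarray(patch_hist, float); p = p / p.sum()
    s = np.asarray(surround_hist, float); s = s / s.sum()
    return 0.5 * np.sum((p - s) ** 2 / (p + s + 1e-12))
```

Per-pixel saliency would then be some combination (e.g. a product) of the rarity of the pixel's word and the contrast of its patch.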
Kernel sparse tracking with compressive sensing
- Author(s): Qingsen Yan and Linsheng Li
- Source: IET Computer Vision, Volume 8, Issue 4, pp. 305–315
- DOI: 10.1049/iet-cvi.2013.0095
- Type: Article
Online tracking is a challenging task that requires effective and efficient models to account for appearance change. However, most tracking algorithms consider only holistic or local information and do not make full use of the appearance information. In this study, a novel tracking algorithm based on sparse representation is proposed, in which an online classifier is learned to discriminate the target from the background. To reduce the visual drift problem encountered in object tracking, a two-stage sparse representation method is proposed: the holistic information is used to estimate the initial tracking position, and the local information is used to determine the final tracking position. To improve the performance of the classifier and the robustness of the algorithm, a kernel function is applied to the sparse representation. Moreover, the dimension of the target is reduced via compressive sensing. In addition, a simple and effective method for dictionary update is proposed. Both qualitative and quantitative evaluations on challenging image sequences demonstrate that the proposed algorithm performs favourably against several state-of-the-art algorithms.
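The compressive-sensing dimension reduction mentioned above is typically realised as a random projection; a minimal sketch follows. The Gaussian sensing matrix and its scaling are a standard choice, not necessarily the paper's:

```python
import numpy as np

def random_projection(features, k, seed=0):
    # Project d-dimensional appearance features onto k random directions.
    # By the Johnson-Lindenstrauss lemma, pairwise distances are
    # approximately preserved with high probability, so the classifier
    # can operate in the compressed space at much lower cost.
    rng = np.random.default_rng(seed)
    d = features.shape[1]
    R = rng.standard_normal((d, k)) / np.sqrt(k)  # sensing matrix
    return features @ R
```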
Locally adaptive combining colour and depth for human body contour tracking using level set method
- Author(s): Yuhua Xu ; Mao Ye ; Zunhua Tian ; Xiaohu Zhang
- Source: IET Computer Vision, Volume 8, Issue 4, pp. 316–328
- DOI: 10.1049/iet-cvi.2013.0164
- Type: Article
In this study, the authors present a novel human body contour tracking method which adaptively combines the colour and depth cues of RGB-D images in a level set framework. The body is modelled by an active contour. When the body is far away from the background objects, it is relatively easy to separate it from the background using the depth cue; in this case, the depth cue should dominate the evolution of the active contour. When some part of the body is very close to the background objects, the discriminability of depth decreases rapidly; in such local regions the colour cue should dominate the evolution, whereas the depth cue should still play an important role elsewhere. To achieve these objectives, the authors propose a superpixel-based locally adaptive weight map to determine the importance of the depth cue. Moreover, to obtain more accurate contours and to avoid error drifting, the authors exploit two novel properties of the human body surface in depth images and propose two simple but effective algorithms to refine the tracking results of the level set method. The promising results demonstrate the performance of the proposed method.
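The locally adaptive weighting can be illustrated with a scalar stand-in: a logistic weight on the local depth margin decides how much the depth term contributes. The logistic form, the threshold `tau` and the per-pixel force arrays are our illustrative assumptions; the paper computes its weight map per superpixel:

```python
import numpy as np

def combine_forces(colour_force, depth_force, depth_margin, tau=0.1):
    # Where the body is far from the background in depth (large margin),
    # the weight w -> 1 and the depth term drives the contour; where the
    # margin shrinks below tau, w -> 0 and the colour term takes over.
    w = 1.0 / (1.0 + np.exp(-(depth_margin - tau) / (0.1 * tau)))
    return w * depth_force + (1.0 - w) * colour_force
```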
Personalised face neutralisation based on subspace bilinear regression
- Author(s): Ying Chen ; Ruilin Bai ; Chunjian Hua
- Source: IET Computer Vision, Volume 8, Issue 4, pp. 329–337
- DOI: 10.1049/iet-cvi.2013.0212
- Type: Article
Expression face neutralisation helps to improve the performance of expressive face recognition when only a single neutral sample per subject is available in the gallery. In learning-based expression neutralisation, the virtual neutral face relies entirely on the training samples, which removes person-specific characteristics from the neutralised face. A bilinear kernel rank reduced regression (BKRRR) algorithm is designed in a virtual subspace to simultaneously and efficiently generate both virtual expressive and virtual neutral images from training samples. An expression mask is then established using the grey-level and gradient differences of the two images. The test expression image is transformed to a neutral template by a piece-wise affine warp (PAW). Using the virtual BKRRR neutral image as the source, the PAW image as the destination and the area covered by the expression mask as the clone area, an image fusion strategy based on the Poisson equation is designed, which produces a virtual neutralised face image with person-specific characteristics preserved. Experiments on the CMU Multi-PIE database show that the neutral faces synthesised by the proposed method effectively approximate the real ground-truth neutral faces and greatly improve the performance of classic face recognition algorithms on expression-variant problems.
Exact image representation via a number-theoretic Radon transform
- Author(s): Shekhar Chandra and Imants Svalbe
- Source: IET Computer Vision, Volume 8, Issue 4, pp. 338–346
- DOI: 10.1049/iet-cvi.2013.0101
- Type: Article
This study presents an integer-only algorithm to exactly recover an image from its discrete projected views, with the same computational complexity as the fast Fourier transform (FFT). Most discrete transforms for image reconstruction rely on the FFT, via the Fourier slice theorem (FST), to compute reconstructions with low computational complexity. Consequently, complex arithmetic and floating-point representations are needed, the latter of which is susceptible to round-off errors. This study shows that the slice theorem is valid within integer fields, via modulo arithmetic, using a circulant theory of the Radon transform (RT). The resulting number-theoretic RT (NRT) provides a representation of images as discrete projections that is always exact and real-valued. The NRT is ideally suited as part of a discrete tomographic algorithm, as part of an encryption scheme, or for cases where numerical overflow is likely, such as when computing a large number of convolutions on the projections. The low computational complexity of the NRT algorithm also provides an efficient method to generate discrete projected views of image data.
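The key enabler described here, a slice theorem over integer fields, rests on the number-theoretic transform (NTT): the DFT evaluated modulo a prime, with a primitive root replacing the complex exponential. A naive O(n²) sketch follows; the paper's NRT achieves FFT-like complexity, and the prime p = 17 with primitive root g = 3 is purely illustrative:

```python
def ntt(a, p=17, g=3):
    # Number-theoretic transform: the DFT over the integers modulo a
    # prime p, with a power of the primitive root g playing the role of
    # the complex n-th root of unity. All arithmetic is exact integer
    # arithmetic, so there are no round-off errors -- the property the
    # NRT exploits.
    n = len(a)
    assert (p - 1) % n == 0, "n must divide p - 1"
    w = pow(g, (p - 1) // n, p)   # primitive n-th root of unity mod p
    return [sum(a[j] * pow(w, i * j, p) for j in range(n)) % p
            for i in range(n)]

def intt(A, p=17, g=3):
    # Inverse transform: use w^{-1} and scale by n^{-1}, both computed
    # as modular inverses via Fermat's little theorem (x^{p-2} mod p).
    n = len(A)
    w_inv = pow(pow(g, (p - 1) // n, p), p - 2, p)
    n_inv = pow(n, p - 2, p)
    return [(n_inv * sum(A[j] * pow(w_inv, i * j, p) for j in range(n))) % p
            for i in range(n)]
```

Because every step is exact modular arithmetic, a forward and inverse transform round-trips with no floating-point error at all.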
Video face recognition via combination of real-time local features and temporal–spatial cues
- Author(s): Gaopeng Gou ; Di Huang ; Yunhong Wang
- Source: IET Computer Vision, Volume 8, Issue 4, pp. 347–357
- DOI: 10.1049/iet-cvi.2013.0025
- Type: Article
Video-based face recognition has attracted much attention and made great progress in the past decade. However, it still faces two main problems: efficiently representing faces in frames and sufficiently exploiting the temporal–spatial constraints between frames. The authors investigate the existing real-time features for face description and compare their performance. Moreover, a novel approach is proposed to model temporal–spatial information, which is then combined with real-time features to further enforce the consistency constraints between frames and improve recognition performance. The approach is validated on three video face databases, and the results demonstrate that temporal–spatial cues combined with the most powerful real-time features largely improve the recognition rate.
Hierarchical tone mapping based on image colour appearance model
- Author(s): Jinsheng Xiao ; Wenhao Li ; Guoxiong Liu ; Shih-Lung Shaw ; Yongqin Zhang
- Source: IET Computer Vision, Volume 8, Issue 4, pp. 358–364
- DOI: 10.1049/iet-cvi.2013.0230
- Type: Article
To address the low efficiency and poor visual quality of current tone mapping methods for high dynamic range images, the authors propose a hierarchical tone mapping algorithm based on a colour appearance model. A discrete Gaussian kernel is used to speed up the bilateral filter. Tone compression is performed in the RGB colour space to correct colour casts, and the extreme pixel values are adjusted in the detail layer. Moreover, after tone mapping, the colour saturation is enhanced in image regions with rich details and sharp edges. Experimental results show that the proposed algorithm reduces the halo effect significantly at a lower computational cost and achieves natural colour and rich detail. It outperforms state-of-the-art methods in terms of visual quality and objective indicators.
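The bilateral filter being accelerated here can be sketched in brute force on a 1-D signal (the 2-D image case is analogous; the Gaussian parameters below are illustrative, not the paper's):

```python
import numpy as np

def bilateral_filter_1d(signal, sigma_s=2.0, sigma_r=0.2, radius=4):
    # Brute-force bilateral filter: each sample becomes a weighted mean of
    # its neighbours, with weights given by a spatial Gaussian (distance
    # along the signal) times a range Gaussian (difference in value).
    # The range term is what preserves sharp edges while smoothing.
    n = len(signal)
    out = np.empty(n)
    offsets = np.arange(-radius, radius + 1)
    spatial = np.exp(-0.5 * (offsets / sigma_s) ** 2)
    for i in range(n):
        idx = np.clip(i + offsets, 0, n - 1)        # clamp at borders
        range_w = np.exp(-0.5 * ((signal[idx] - signal[i]) / sigma_r) ** 2)
        w = spatial * range_w
        out[i] = np.sum(w * signal[idx]) / np.sum(w)
    return out
```

In tone mapping, this filter splits the image into a smooth base layer (which is compressed) and a detail layer (which is preserved), and the paper's discrete Gaussian kernel speeds up exactly this step.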