IET Computer Vision
Volume 12, Issue 5, August 2018
- Author(s): Suliang Yu ; Dongmei Niu ; Liangliang Zhang ; Mingjun Liu ; Xiuyang Zhao
- Source: IET Computer Vision, Volume 12, Issue 5, p. 563 –569
- DOI: 10.1049/iet-cvi.2017.0566
- Type: Article
Content-based image retrieval (CBIR) is an active research area. To improve the performance of a CBIR system, especially its retrieval accuracy, this work proposes a method that uses a soft hypergraph combined with a weighted adjacent structure (WAS) to retrieve images. In this method, the similarities between images are computed and a similarity matrix is constructed by a conjoined colour difference histogram and micro-structure descriptor method. A novel WAS and a soft hypergraph model are then utilised to further improve retrieval precision. The proposed method is compared with other methods on several datasets, and the experimental results demonstrate its performance and robustness.
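The similarity-matrix construction described above can be illustrated with a toy sketch; plain histogram intersection stands in here for the authors' conjoined colour difference histogram and micro-structure descriptor:

```python
def histogram_intersection(h1, h2):
    """Similarity between two normalised histograms (higher = more similar)."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def similarity_matrix(histograms):
    """Pairwise similarity matrix over a list of per-image histograms."""
    n = len(histograms)
    return [[histogram_intersection(histograms[i], histograms[j])
             for j in range(n)] for i in range(n)]

# Toy 4-bin colour histograms for three images; images 0 and 2 are identical
hists = [
    [0.25, 0.25, 0.25, 0.25],
    [0.40, 0.30, 0.20, 0.10],
    [0.25, 0.25, 0.25, 0.25],
]
S = similarity_matrix(hists)
```

A graph-based method such as the authors' would then build its (hyper)graph on top of a matrix like `S`.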
- Author(s): Garima Joshi ; Renu Vig ; Sukhwinder Singh
- Source: IET Computer Vision, Volume 12, Issue 5, p. 570 –577
- DOI: 10.1049/iet-cvi.2017.0394
- Type: Article
A sign language recognition system classifies signs made by hand gestures. An adequate number of features is required to represent the shape variations of sign language. Compared with an individual feature set, a combination of features can be more effective because each feature set captures different shape information. However, simple concatenation results in a large feature vector and increases the computational complexity of classification. Discriminant correlation analysis (DCA)-based unimodal feature-level fusion has been applied to Indian sign language datasets with both uniform and complex backgrounds. DCA is a feature-level fusion technique that takes class associations into account while combining the feature sets: it maximises the inter-class separability of the two feature sets while minimising the intra-class separability during fusion. The objective of DCA-based unimodal feature fusion is to combine different feature sets into a single feature vector with greater discriminative power. The performance of the proposed framework is compared with individual orthogonal-moment-based feature sets and with canonical correlation analysis (CCA)-based feature fusion. Results show that, compared with individual features and CCA-based fused features, DCA is more effective in terms of accuracy, feature vector size, and classification time.
- Author(s): Ilia Petrov ; Vlad Shakhuro ; Anton Konushin
- Source: IET Computer Vision, Volume 12, Issue 5, p. 578 –585
- DOI: 10.1049/iet-cvi.2017.0382
- Type: Article
The authors consider the problem of human pose estimation using probabilistic convolutional neural networks. They explore ways to improve pose estimation accuracy on the standard MPII Human Pose and Leeds Sports Pose (LSP) benchmarks using frameworks for probabilistic deep learning. Such frameworks transform a deterministic neural network into a probabilistic one and allow sampling of independent and equiprobable hypotheses (different outputs) for a given input. Overlapping body parts and body joints hidden under clothes or other obstacles make human pose estimation ambiguous. In this context, to obtain accurate estimates of joint positions, they use the uncertainty in the network's predictions, represented by the variance of the hypotheses provided by a probabilistic convolutional neural network, while confidence is characterised by their mean. Their work is based on current CNN cascades for pose estimation. They propose and evaluate three probabilistic convolutional neural networks built on top of deterministic ones with two probabilistic deep learning frameworks, DISCO networks and Bayesian SegNet. The authors evaluate their models on standard pose estimation benchmarks and show that the proposed probabilistic models outperform their deterministic counterparts.
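The hypothesis sampling described above can be sketched in miniature; the stochastic predictor below is hypothetical, standing in for a dropout-enabled forward pass of a network predicting one joint coordinate:

```python
import random
import statistics

def sample_hypotheses(predict, n=100):
    """Draw n stochastic forward passes; the mean characterises the
    estimate (confidence) and the variance its uncertainty."""
    hyps = [predict() for _ in range(n)]
    return statistics.mean(hyps), statistics.variance(hyps)

random.seed(0)
# Hypothetical stochastic predictor of a joint coordinate around x = 10.0
mean, var = sample_hypotheses(lambda: 10.0 + random.gauss(0, 0.5))
```

An ambiguous joint (e.g. occluded by clothing) would yield a larger `var`, flagging an unreliable estimate.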
- Author(s): Wafa Damak ; Randa Boukhris Trabelsi ; Masmoudi Alima Damak ; Dorra Sellami
- Source: IET Computer Vision, Volume 12, Issue 5, p. 586 –595
- DOI: 10.1049/iet-cvi.2017.0613
- Type: Article
Region of interest (ROI) extraction is an important step in hand vein recognition systems. The main challenges for accurate extraction of the vein region are variability in hand size, lighting conditions, orientation, and appearance, noisy backgrounds, and non-uniform grey levels in the foreground region. Here, we propose a new dynamic hand vein ROI extraction method that preserves the whole vein area, together with a hand segmentation process that is robust to these challenges and contributes to an accurate definition of the hand edge delimitation. Our approach is validated on both the dorsal vein Bosphorus database and the palm vein Vera database, achieving an accuracy of ∼98% on Bosphorus and 90% on Vera. To illustrate the efficiency of the proposed ROI extraction, we insert it as the first block in a hand vein recognition system. A system-level comparison with recent approaches is then carried out, showing an improvement of the whole system's area under the curve by 12% and 2% for the Bosphorus and Vera databases, respectively. Mean run times of 0.73 s for the Bosphorus database and 1.2 s for the Vera database demonstrate that the proposed method is suitable for real-time applications.
- Author(s): Wenkai Chang ; Guodong Yang ; Junzhi Yu ; Zize Liang
- Source: IET Computer Vision, Volume 12, Issue 5, p. 596 –602
- DOI: 10.1049/iet-cvi.2017.0591
- Type: Article
The inspection of fragile insulators is critical to grid operation, and insulator segmentation is the basis of inspection. However, segmenting the various types of insulator remains difficult because of large differences in colour and shape, as well as cluttered backgrounds. Traditional insulator segmentation algorithms require many hand-tuned thresholds, which limits their adaptability. A compact end-to-end neural network, trained in the framework of conditional generative adversarial networks, is proposed for real-time pixel-level segmentation of insulators. The input image is mapped to a visual saliency map, and insulators with different poses are filtered out at the same time. The proposed two-stage training procedure and empty samples are also used to improve segmentation quality. Extensive experiments and comparisons on many real-world images demonstrate superior segmentation and real-time performance. The effectiveness of the proposed training strategies and the trade-off between performance and speed are also analysed in detail.
- Author(s): Abuobayda M.M. Shabat and Jules-Raymond Tapamo
- Source: IET Computer Vision, Volume 12, Issue 5, p. 603 –608
- DOI: 10.1049/iet-cvi.2017.0340
- Type: Article
The local binary pattern (LBP) is currently one of the most common feature extraction methods for texture analysis. However, LBP is sensitive to random noise because it depends directly on image intensity. Recently, a more stable descriptor, the local directional pattern (LDP), was introduced; it uses the gradient space instead of pixel intensities and generates a code from the edge response values of the Kirsch masks. Despite its success, LDP has two drawbacks. The first is the static choice of the number of most significant bits used for code generation. The second is that the original method computes the code from the 8-neighbourhood only, ignoring the value of the centre pixel. This study presents the angled local directional pattern (ALDP), an improved version of LDP for texture analysis. Experimental results on two texture datasets, using six different classifiers, show that ALDP substantially outperforms both LDP and LBP. ALDP has also been evaluated for facial expression recognition, where it achieves a very high recognition rate. An added advantage is that ALDP selects the number of significant bits adaptively, as opposed to LDP.
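The Kirsch-mask code generation that LDP builds on can be sketched as follows; this is a minimal illustration, and ALDP's angled neighbourhoods and adaptive bit selection are not reproduced here:

```python
# Eight Kirsch edge masks (rotations of the east mask)
KIRSCH = [
    [[-3, -3, 5], [-3, 0, 5], [-3, -3, 5]],    # E
    [[-3, 5, 5], [-3, 0, 5], [-3, -3, -3]],    # NE
    [[5, 5, 5], [-3, 0, -3], [-3, -3, -3]],    # N
    [[5, 5, -3], [5, 0, -3], [-3, -3, -3]],    # NW
    [[5, -3, -3], [5, 0, -3], [5, -3, -3]],    # W
    [[-3, -3, -3], [5, 0, -3], [5, 5, -3]],    # SW
    [[-3, -3, -3], [-3, 0, -3], [5, 5, 5]],    # S
    [[-3, -3, -3], [-3, 0, 5], [-3, 5, 5]],    # SE
]

def ldp_code(patch, k=3):
    """LDP code of a 3x3 patch: one bit per direction, set for the k
    strongest absolute Kirsch edge responses (k is the static choice
    that ALDP replaces with an adaptive one)."""
    responses = [abs(sum(mask[r][c] * patch[r][c]
                         for r in range(3) for c in range(3)))
                 for mask in KIRSCH]
    top = sorted(range(8), key=lambda i: responses[i], reverse=True)[:k]
    return sum(1 << i for i in top)

# A strong vertical edge: the east response should dominate
edge_patch = [[0, 0, 9], [0, 0, 9], [0, 0, 9]]
code = ldp_code(edge_patch)
```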
- Author(s): Oluwakorede M. Oluyide ; Jules-Raymond Tapamo ; Serestina Viriri
- Source: IET Computer Vision, Volume 12, Issue 5, p. 609 –615
- DOI: 10.1049/iet-cvi.2017.0226
- Type: Article
Lung segmentation isolates the lungs from the surrounding anatomy in an image, ensuring that all parts of the lungs are considered during pulmonary image analysis. Research has shown that computed tomography (CT) images greatly improve the accuracy of a physician's diagnosis in lung cancer detection. Inspired by the success of Graph Cut in image segmentation, and given that manual analysis of CT images is tedious and time-consuming, an automatic segmentation method based on Graph Cut is proposed that makes use of a distance-constrained energy (DCE). Graph Cut produces globally optimal solutions by modelling the image data and the spatial relationships among pixels. However, several anatomical regions in a thoracic CT image have pixel intensities similar to the lungs, so the lung tissue and these regions are all included in the segmentation result. The global energy function is therefore further constrained using the distance of pixels from a coarsely segmented region of the CT image containing the lungs. The proposed method, utilising the DCE function, shows significant improvement over the unconstrained energy function in segmenting the lungs from CT images using Graph Cut.
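A minimal sketch of how a distance constraint can enter the unary (data) term of a Graph Cut energy; `mu_lung`, `sigma`, and `alpha` are hypothetical parameters for illustration, not the paper's:

```python
def unary_cost(intensity, mu_lung, sigma, dist_to_coarse_lung, alpha=0.1):
    """Data term for labelling a pixel 'lung': a Gaussian intensity
    penalty plus a term growing with distance from a coarse lung mask,
    so lung-like intensities far from the lungs are discouraged."""
    intensity_term = (intensity - mu_lung) ** 2 / (2.0 * sigma ** 2)
    distance_term = alpha * dist_to_coarse_lung
    return intensity_term + distance_term

# Same lung-like intensity, inside vs. far from the coarse lung region
near = unary_cost(-700.0, mu_lung=-700.0, sigma=100.0, dist_to_coarse_lung=0.0)
far = unary_cost(-700.0, mu_lung=-700.0, sigma=100.0, dist_to_coarse_lung=50.0)
```

The min-cut then balances these unary costs against pairwise smoothness costs between neighbouring pixels.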
- Author(s): Liang Liang Su ; Jun Tang ; Dong Liang ; Ming Zhu
- Source: IET Computer Vision, Volume 12, Issue 5, p. 616 –622
- DOI: 10.1049/iet-cvi.2017.0465
- Type: Article
As a promising alternative to traditional search techniques, hashing-based approximate nearest neighbour search provides an applicable solution for big data. Most existing efforts are devoted to finding better projections that preserve the neighbouring structure of the original data points in Hamming space, but they ignore the quantisation procedure, which may break down the neighbouring structure maintained in the projection stage. To address this issue, the authors propose a novel multi-bit quantisation (MBQ) method using a Matthews correlation coefficient (MCC) term and a regularisation term. The method uses the neighbouring relationships and the distribution of the original data points, instead of the projection dimension used in previous MBQ methods, to adaptively learn optimal quantisation thresholds, and it allocates multiple bits per projection dimension according to the learned thresholds. Experiments on two typical image datasets demonstrate that the proposed method effectively preserves the similarity between data points in the original feature space and outperforms state-of-the-art quantisation methods.
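Multi-bit quantisation of a projected value can be sketched as below, with fixed illustrative thresholds standing in for the thresholds the method learns from the data distribution:

```python
GRAY = [0b00, 0b01, 0b11, 0b10]  # adjacent regions differ in one bit

def quantise_2bit(value, thresholds):
    """Assign a 2-bit code to a projected value using three thresholds
    (fixed here for illustration; the method learns them adaptively)."""
    region = sum(value > t for t in sorted(thresholds))
    return GRAY[region]

def hamming(a, b):
    """Hamming distance between two binary codes."""
    return bin(a ^ b).count("1")

# Four projected values falling into the four regions of one dimension
codes = [quantise_2bit(v, [-1.0, 0.0, 1.0]) for v in (-2.0, -0.5, 0.5, 2.0)]
```

With the Gray-style codes above, neighbouring regions stay at Hamming distance 1, which is the property a quantiser needs so that nearby projections remain nearby in Hamming space.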
- Author(s): Meriama Mahamdioua and Mohamed Benmohammed
- Source: IET Computer Vision, Volume 12, Issue 5, p. 623 –633
- DOI: 10.1049/iet-cvi.2017.0190
- Type: Article
The scale invariant feature transform (SIFT), proposed by David Lowe, is a powerful method that extracts and describes local features, called keypoints, from images. These keypoints are invariant to scale, translation, and rotation, and partially invariant to illumination variation. Despite this robustness, strong lighting variation is a difficult challenge for SIFT-based facial recognition systems, where significant degradation of performance has been reported. To develop a robust system under these conditions, variation in lighting must first be eliminated. Additionally, the default values of the SIFT parameters that remove unstable and inadequately matched keypoints are not well suited to images with illumination variation, and SIFT keypoints can be incorrectly matched when using the original SIFT matching method. To overcome these issues, the authors propose a method for removing illumination variation in images and correctly setting SIFT's main parameter values (contrast threshold, curvature threshold, and match threshold) to enhance SIFT feature extraction and matching. The proposed method is based on an estimation of comparative image lighting quality, evaluated through an automatic estimation of the gamma correction value. Facial recognition experiments yield significant results that clearly illustrate the value of the proposed robust recognition system.
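One common way to estimate a gamma correction value automatically, broadly in the spirit of the lighting-quality estimation described above (though not the authors' exact procedure), is to choose the gamma that maps the image's mean intensity toward mid-grey. This sketch assumes a [0, 1]-normalised image:

```python
import math

def estimate_gamma(pixels):
    """Pick gamma so that the mean of a [0, 1]-normalised image maps
    to mid-grey: mean ** gamma = 0.5 (requires 0 < mean < 1)."""
    mean = sum(pixels) / len(pixels)
    return math.log(0.5) / math.log(mean)

def gamma_correct(pixels, gamma):
    return [p ** gamma for p in pixels]

dark = [0.05, 0.10, 0.20, 0.15]   # an under-exposed toy "image"
g = estimate_gamma(dark)           # g < 1 brightens the image
bright = gamma_correct(dark, g)
```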
- Author(s): Issam Elafi ; Mohamed Jedra ; Noureddine Zahid
- Source: IET Computer Vision, Volume 12, Issue 5, p. 634 –639
- DOI: 10.1049/iet-cvi.2017.0359
- Type: Article
Tracking objects in infrared video sequences has become an important challenge for many current tracking algorithms owing to complex situations such as illumination variation, night vision, and occlusion. This study proposes a new tracker that uses a set of invariant parameters, calculated via co-occurrence moments, to better describe the target object. The co-occurrence moments make it possible to exploit information about the target's texture to enhance the robustness of tracking, which is performed without any learning or clustering phase. Qualitative and quantitative studies on challenging sequences demonstrate that the results obtained by the proposed algorithm are very competitive with several state-of-the-art methods.
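The co-occurrence statistics underlying such moments can be sketched with a grey-level co-occurrence matrix and one Haralick-style moment (contrast); this is a generic illustration, not the authors' exact invariant parameters:

```python
def glcm(image, dx=1, dy=0, levels=4):
    """Grey-level co-occurrence matrix for offset (dx, dy), normalised
    to sum to 1; texture moments can then be read off it."""
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]

def contrast(p):
    """One co-occurrence moment: expected squared grey-level difference."""
    return sum(p[i][j] * (i - j) ** 2
               for i in range(len(p)) for j in range(len(p)))

img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [2, 2, 3, 3],
       [2, 2, 3, 3]]
P = glcm(img)
```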
- Author(s): Howard Wang ; Sing Kiong Nguang ; Jiwei Wen
- Source: IET Computer Vision, Volume 12, Issue 5, p. 640 –650
- DOI: 10.1049/iet-cvi.2017.0404
- Type: Article
This study proposes a novel robust video tracking algorithm consisting of target detection, multi-feature fusion, and extended Camshift. Firstly, a novel target detection method integrating the Canny edge operator, three-frame differencing, and improved Gaussian mixture model (IGMM)-based background modelling is provided to detect targets. The IGMM-based background modelling divides video frames into meshes to avoid pixel-wise processing, and the output of the target detection is used to initialise the IGMM and accelerate the convergence of the iterations. Secondly, low-dimensional regional covariance matrices are introduced to describe video targets by fusing multiple features such as pixel location, colour index, rotation- and scale-invariant features, uniform local binary patterns, and directional derivatives. Thirdly, an extended Camshift based on adaptive kernel bandwidth and robust H∞ state estimation is proposed to predict the states of fast-moving targets and reduce the number of mean-shift iterations. Finally, the effectiveness of the proposed tracking algorithm is demonstrated experimentally.
- Author(s): Ting-An Chang and Jar-Ferr Yang
- Source: IET Computer Vision, Volume 12, Issue 5, p. 651 –658
- DOI: 10.1049/iet-cvi.2017.0336
- Type: Article
A texture image plus its associated depth map is the simplest representation of three-dimensional image and video signals and can be further encoded for effective transmission. Since it contains fewer variations, a depth map can be coded at much lower resolution than a texture image; furthermore, the resolution of depth capture devices is usually lower as well. A low-resolution depth map with possible noise therefore requires appropriate interpolation to restore it to full resolution and remove the noise. In this study, the authors propose potency-guided upsampling and adaptive gradient fusion filters to enhance erroneous depth maps. The proposed depth map enhancement system can simultaneously suppress noise, fill missing values, sharpen foreground objects, and smooth background regions. Experimental results show that the proposed methods perform better in terms of both objective and subjective metrics than the classic methods and achieve results visually comparable with those of some time-consuming methods.
- Author(s): Shujian Wang ; Deyan Xie ; Fang Chen ; Quanxue Gao
- Source: IET Computer Vision, Volume 12, Issue 5, p. 659 –665
- DOI: 10.1049/iet-cvi.2017.0302
- Type: Article
Locality preserving projection (LPP) is one of the most representative linear manifold learning methods and exploits the intrinsic structure of data well. However, the performance of LPP degrades remarkably in the presence of outliers. To alleviate this problem, the authors propose a robust LPP, namely LPP-L21, which employs the L2-norm as the distance metric over the spatial (feature) dimensions of the data and the L1-norm as the metric over different data points. Moreover, the authors employ the L1-norm to construct the similarity graph, which helps to improve the robustness of the algorithm. An efficient iterative algorithm is presented to solve LPP-L21. The proposed method not only suppresses outliers well but also retains some of LPP's nice properties. Experimental results on several image datasets show its advantages.
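The L2,1 flavour of distance metric described above (L2 within a data point, L1 across data points) can be written directly:

```python
import math

def l21_norm(X):
    """L2,1-norm of a data matrix: an L2-norm across each row (one data
    point's features), then an L1-style sum over rows, so a single
    outlying row contributes only linearly to the objective."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in X)

X = [[3.0, 4.0], [0.0, 0.0], [5.0, 12.0]]   # three 2-D data points
```

Compared with a squared Frobenius objective, where an outlying row contributes quadratically, this linear growth is what gives L2,1-based formulations their robustness.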
- Author(s): Rilwan Remilekun Basaru ; Chris Child ; Eduardo Alonso ; Gregory Slabaugh
- Source: IET Computer Vision, Volume 12, Issue 5, p. 666 –678
- DOI: 10.1049/iet-cvi.2017.0227
- Type: Article
Hand pose is emerging as an important interface for human–computer interaction. This study presents a data-driven method to estimate a high-quality depth map of a hand from a stereoscopic camera input by introducing a novel superpixel-based regression framework that takes advantage of the smoothness of the depth surface of the hand. To this end, the authors introduce conditional regressive random forest (CRRF), a method that combines a conditional random field (CRF) and an RRF to model the mapping from a stereo red, green and blue image pair to a depth image. The RRF provides a unary term that adaptively selects different stereo-matching measures as it implicitly determines matching pixels in a coarse-to-fine manner. While the RRF makes depth prediction for each superpixel independently, the CRF unifies the prediction of depth by modelling pairwise interactions between adjacent superpixels. Experimental results show that CRRF can generate a depth image more accurately than the leading contemporary techniques using an inexpensive stereo camera.
- Author(s): Shahram Taheri and Önsen Toygar
- Source: IET Computer Vision, Volume 12, Issue 5, p. 679 –685
- DOI: 10.1049/iet-cvi.2017.0079
- Type: Article
A real-world animal biometric system that detects and describes animal life in image and video data is an emerging subject in machine vision. Such systems develop computer vision approaches for the classification of animals. A novel method for animal face classification is presented, based on score-level fusion of the currently popular convolutional neural network (CNN) features and appearance-based descriptor features. The method fuses, at score level, two different approaches: one uses a CNN, which automatically extracts, learns, and classifies features; the other uses kernel Fisher analysis (KFA) in its feature extraction phase. The proposed method may also be used in other areas of image classification and object recognition. The experimental results show that the automatic feature extraction in the CNN outperforms simpler feature extraction techniques (both local and appearance-based features), and that an appropriate score-level combination of CNN and simple features can achieve even higher accuracy than the CNN alone. The score-level fusion of CNN-extracted features and the appearance-based KFA method thus has a positive effect on classification accuracy: the proposed method achieves a 95.31% classification rate on animal faces, significantly better than the other state-of-the-art methods.
- Author(s): Yan Sun ; Jonathon S. Hare ; Mark S. Nixon
- Source: IET Computer Vision, Volume 12, Issue 5, p. 686 –692
- DOI: 10.1049/iet-cvi.2017.0429
- Type: Article
In some forms of gait analysis, it is important to capture when heel strikes occur; in video analysis of gait, it is also important to localise where the heel strikes the floor. In this study, a new motion descriptor, acceleration flow, is introduced for detecting heel strikes. The key frame of a heel strike can be determined by the quantity of acceleration flow within the region of interest, and the position of the strike can be found from the centre of rotation caused by radial acceleration. Our approach has been tested on a number of databases, recorded indoors and outdoors with multiple views and walking directions, to evaluate the detection rate under various environments. Experiments show the ability of our approach in both temporal detection and spatial positioning. The immunity of this new approach to three anticipated types of noise in real CCTV footage is also evaluated in our experiments. Our acceleration flow detector is shown to be less sensitive to Gaussian white noise than other techniques, whilst remaining effective on low-resolution images and with incomplete body position information.
- Author(s): Ozan Arslan
- Source: IET Computer Vision, Volume 12, Issue 5, p. 693 –701
- DOI: 10.1049/iet-cvi.2017.0549
- Type: Article
Extracting metric information from a single image has been a long-standing issue in the relevant disciplines. To perform metric measurements on a single image, two methods based on identifying vanishing points (VPs) were employed in this study. The first is based on the cross ratio and the computation of VPs; the second integrates robust statistical estimators and close-range techniques in a multidisciplinary concept. Estimating object distance from a single image taken in an outdoor environment was the main practical concern of the study. Accordingly, an accuracy analysis of the distance measurements was performed to compare the performance of the two techniques, and the experimental results are presented.
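The cross ratio the first method relies on is the classic projective invariant of four collinear points. A small sketch, including a check that the value survives a hypothetical projective map (standing in for the camera projection):

```python
def cross_ratio(a, b, c, d):
    """Cross ratio of four collinear points given by 1-D coordinates;
    it is preserved under perspective projection, which is what allows
    metric distances to be recovered from a single image via VPs."""
    return ((a - c) * (b - d)) / ((a - d) * (b - c))

def project(x):
    """A hypothetical 1-D projective map, standing in for the camera."""
    return (2.0 * x + 1.0) / (x + 4.0)

points = (0.0, 1.0, 2.0, 3.0)
cr = cross_ratio(*points)
cr_mapped = cross_ratio(*(project(x) for x in points))
```

Because `cr_mapped` equals `cr`, three known distances along a line in the image determine the fourth, which is the basis of cross-ratio distance estimation.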
- Author(s): Divya Sharma and Chiranjoy Chattopadhyay
- Source: IET Computer Vision, Volume 12, Issue 5, p. 702 –709
- DOI: 10.1049/iet-cvi.2017.0581
- Type: Article
Owing to the massive growth of the real estate industry, the number of online platforms designed for finding homes and furnished properties has increased. For retrieval, query by example is often preferred to descriptive words. Floor plans are the basic 2D representation giving an idea of the building structure at a particular level. The authors propose a framework for the retrieval of similar architectural floor plans under the query-by-example paradigm, with a novel algorithm for extracting high-level semantic features from an architectural floor plan. Fine-grained retrieval using a weighted sum of the features is proposed, in which a feature can be given more preference over the others during retrieval. Experiments were performed on a publicly available dataset containing 510 floor plans and compared with existing state-of-the-art techniques; the proposed method outperforms them in both qualitative and quantitative terms.
- Author(s): Oussama Zayene ; Sameh Masmoudi Touj ; Jean Hennebert ; Rolf Ingold ; Najoua Essoukri Ben Amara
- Source: IET Computer Vision, Volume 12, Issue 5, p. 710 –719
- DOI: 10.1049/iet-cvi.2017.0468
- Type: Article
This study presents a novel approach for Arabic video text recognition based on recurrent neural networks. Embedded texts in videos represent a rich source of information for indexing and automatically annotating multimedia documents. However, video text recognition is a non-trivial task owing to many challenges, such as the variability of text patterns and the complexity of backgrounds. In the case of Arabic, the presence of diacritic marks, the cursive nature of the script, and the non-uniform intra- and inter-word distances introduce many additional challenges. The proposed system is a segmentation-free method that relies on a multi-dimensional long short-term memory network coupled with a connectionist temporal classification layer. It is shown that an efficient pre-processing step and a compact representation of Arabic character models bring robust performance and yield a lower error rate than other recently published methods. The authors' system is trained and evaluated on the public AcTiV-R dataset under different evaluation protocols, and it also outperforms current state-of-the-art approaches on the public ALIF dataset in terms of recognition rates at both character and line levels.
- Author(s): Khomsun Singhirunnusorn ; Farbod Fahimi ; Ramazan Aygun
- Source: IET Computer Vision, Volume 12, Issue 5, p. 720 –727
- DOI: 10.1049/iet-cvi.2017.0407
- Type: Article
Recently, the mirage pose estimation method was proposed for multi-camera systems. Multi-camera mirage analytically solves a system of linear equations for the six pose parameters in O(n) time; it promises real-time execution with high accuracy and shows lower rotational and translational errors than eight other well-known perspective-n-point (PnP) methods. However, simulated tests and real experiments showed that, in the case of a single camera, the analytical system of linear equations is not solvable, owing to the reduced rank of the linear system obtained by the formulation. In this study, an important revision of mirage is proposed to support single-camera systems properly. The results of simulations and real experiments demonstrate smaller pose estimation errors than a group of eight well-known state-of-the-art PnP methods.
- Author(s): Du Yong Kim
- Source: IET Computer Vision, Volume 12, Issue 5, p. 728 –734
- DOI: 10.1049/iet-cvi.2017.0600
- Type: Article
In multi-object tracking applications, model parameter tuning is a prerequisite for reliable performance. In particular, the statistics of false measurements are difficult to know because of varying sensing conditions and changes in the field of view. In this study, the authors design a multi-object tracking algorithm that handles an unknown false measurement rate. The recently proposed robust multi-Bernoulli filter is employed for clutter estimation, while the generalised labelled multi-Bernoulli filter is used for target tracking. Performance evaluation with real videos demonstrates the effectiveness of the tracking algorithm in real-world scenarios.
- Author(s): Hai-Hong Phan ; Ngoc-Son Vu ; Vu-Lam Nguyen ; Mathias Quoy
- Source: IET Computer Vision, Volume 12, Issue 5, p. 735 –743
- DOI: 10.1049/iet-cvi.2017.0282
- Type: Article
Here, the authors introduce a novel system that incorporates the discriminative motion of oriented magnitude patterns (MOMP) descriptor into simple yet efficient techniques. The descriptor both investigates the relations of the local gradient distributions among neighbouring frames of an image sequence and characterises information changing across different orientations. The proposed system makes two main contributions: (i) feature post-processing with principal component analysis followed by vector of locally aggregated descriptors (VLAD) encoding, to de-correlate the MOMP descriptor and reduce its dimension, thereby speeding up the algorithm; and (ii) feature selection (statistical dependency, mutual information, and minimal-redundancy maximal-relevance) to find the best feature subset, improving performance and decreasing the computational expense of classification with support vector machines. Experimental results on four datasets, Weizmann (98.4%), KTH (96.3%), UCF Sport (82.0%), and HMDB51 (31.5%), prove the efficiency of the algorithm.
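The mutual-information criterion used in the feature-selection stage can be sketched for a single discretised feature against the class labels; this is a generic illustration, not the authors' pipeline:

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in bits) between a discretised feature and
    the class labels; features scoring higher are kept."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    return sum((c / n) * math.log2((c / n) / (px[x] * py[y] / n ** 2))
               for (x, y), c in pxy.items())

labels    = [0, 0, 0, 0, 1, 1, 1, 1]
feat_good = [0, 0, 0, 0, 1, 1, 1, 1]   # perfectly predictive feature
feat_bad  = [0, 1, 0, 1, 0, 1, 0, 1]   # independent of the label
```

Ranking features by this score and keeping the top subset is the "mutual information" selection mentioned in the abstract.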
- Author(s): Shanmukhappa Angadi and Sanjeevakumar Hatture
- Source: IET Computer Vision, Volume 12, Issue 5, p. 744 –752
- DOI: 10.1049/iet-cvi.2017.0053
- Type: Article
In previously reported work, the user's hand was represented as a weighted, undirected, completely connected graph, and the spectral properties of the graph were extracted and used as feature vectors. To reduce the complexity of representing the hand image as a completely connected graph and to achieve a higher identification rate, the hand image is here represented as a minimal edge connected graph. In this study, an innovative peg-free hand-geometry-based user identification system using the spectral properties of a minimal edge connected graph representation of the hand image is proposed. Experiments are conducted separately on 16 empirically selected topologies of the minimal edge connected graph to investigate the performance of the hand-geometry system, and the prominent edges of the hand image graph are identified experimentally by computing the identification rate. A multiclass support vector machine is employed to identify the claimed user; the geometrical information embedded in the prominent edges contributes to a better identification rate. The experimentation is carried out on two databases, namely the GPDS150 hand database and the hand images of the VTU-BEC-DB multimodal database. The minimal edge connected graph with 30 prominent edges of the hand image graph achieves better identification at a faster rate.
- Author(s): Jianwei Zhao ; Tiantian Sun ; Feilong Cao
- Source: IET Computer Vision, Volume 12, Issue 5, p. 753 –761
- DOI: 10.1049/iet-cvi.2017.0153
- Type: Article
This study proposes a novel super-resolution regularisation model based on adaptive sparse representation and self-learning frameworks. The fidelity term in the model ensures that the reconstructed image is consistent with the observed image. The adaptive sparsity regularisation term constrains the reconstructed image with an adaptive sparse representation, which harmonises the sparse representation and the collaborative representation adaptively by producing suitable coefficients. To construct a more effective dictionary, high-frequency features are extracted from the underlying image patches, and dictionary learning and sparse representation are integrated. The alternating minimisation algorithm is used to divide the model into three subproblems, which are solved with the alternating direction method of multipliers and the iterative back-projection method. Experiments on generic images illustrate the effectiveness of the proposed method: compared with some state-of-the-art algorithms, it achieves better results in terms of both visual quality and noise immunity.
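The iterative back-projection step mentioned above can be sketched in one dimension, with average-pool downsampling and nearest-neighbour upsampling as illustrative operators (not the paper's):

```python
def downsample(x, f=2):
    """Average-pool by factor f (a stand-in imaging model)."""
    return [sum(x[i:i + f]) / f for i in range(0, len(x), f)]

def upsample(x, f=2):
    """Nearest-neighbour upsample by factor f."""
    return [v for v in x for _ in range(f)]

def back_project(low, f=2, iters=20):
    """Iterative back-projection: repeatedly push the low-resolution
    residual back into the high-resolution estimate until the estimate,
    when downsampled, reproduces the observation."""
    high = [0.0] * (len(low) * f)          # crude initial estimate
    for _ in range(iters):
        residual = [l - s for l, s in zip(low, downsample(high, f))]
        high = [h + e for h, e in zip(high, upsample(residual, f))]
    return high

low = [1.0, 3.0]                            # observed low-res signal
hr = back_project(low)
```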
Articles in this issue:
- Colour image retrieval based on the hypergraph combined with a weighted adjacent structure
- DCA-based unimodal feature-level fusion of orthogonal moments for Indian sign language dataset
- Deep probabilistic human pose estimation
- Dynamic ROI extraction method for hand vein images
- Real-time segmentation of various insulators using generative adversarial networks
- Angled local directional pattern for texture analysis with an application to facial expression recognition
- Automatic lung segmentation based on Graph Cut using a distance-constrained energy
- Multi-bit quantisation for similarity-preserving hashing
- Automatic adaptation of SIFT for robust facial recognition in uncontrolled lighting conditions
- Tracking objects with co-occurrence matrix and particle filter in infrared video sequences
- Robust video tracking algorithm: a multi-feature fusion approach
- Precise depth map upsampling and enhancement based on edge-preserving fusion filters
- Dimensionality reduction by LPP-L21
- Data-driven recovery of hand depth using CRRF on stereo images
- Animal classification using facial images with score-level fusion
- Detecting heel strikes for gait analysis through acceleration flow
- Accuracy assessment of single viewing techniques for metric measurements on single images
- High-level feature aggregation for fine-grained architectural floor plan retrieval
- Multi-dimensional long short-term memory networks for artificial Arabic text recognition in news video
- Single-camera pose estimation using mirage
- Visual multiple-object tracking for unknown clutter rate
- Action recognition based on motion of oriented magnitude patterns and feature selection
- Hand geometry based user identification using minimal edge connected hand image graph
- Image super-resolution via adaptive sparse representation and self-learning