IET Image Processing
Volume 13, Issue 4, 28 March 2019
- Author(s): Monoj K. Pradhan ; Sonajharia Minz ; Vimal K. Shrivastava
- Source: IET Image Processing, Volume 13, Issue 4, p. 549 –555
- DOI: 10.1049/iet-ipr.2018.5104
- Type: Article
Owing to the undulating and complex nature of the earth's surface, obtaining training samples for remote sensing data is time-consuming and expensive. Therefore, it is highly desirable to design a model that uses as few labelled samples as possible and reduces the computational time. Several active learning (AL) algorithms have been proposed in the literature for the classification of hyperspectral images (HSIs). However, their performance in terms of computational time has not yet received attention. Here, the authors propose an AL approach based on extreme learning machine (ELM) that effectively decreases the computational time while maintaining the classification accuracy. Further, the effectiveness of the proposed approach is shown by comparing its performance with state-of-the-art AL algorithms in terms of both classification accuracy and computational time. Experiments with ELM-based AL using different query strategies were conducted on two HSI data sets. The proposed approach achieves a classification accuracy of up to 90%, which is comparable to that of the support vector machine-based AL approach, while reducing the computational time by a factor of 1000. Thus, the proposed system shows encouraging results, with adequate classification accuracy and a drastic reduction in computation time.
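The abstract above mentions query strategies without naming them. As a rough illustration of what a query strategy does in active learning, the sketch below ranks unlabelled samples by margin (the gap between the two most probable classes) and picks the most uncertain ones for labelling. Margin sampling is an assumed example, not necessarily a strategy used in the paper, and the names `margin`, `select_queries` and the probability values are hypothetical.

```python
from typing import List, Sequence


def margin(probs: Sequence[float]) -> float:
    """Gap between the two largest class probabilities (small = uncertain).

    Assumes at least two classes.
    """
    top_two = sorted(probs, reverse=True)[:2]
    return top_two[0] - top_two[1]


def select_queries(pool_probs: Sequence[Sequence[float]], batch_size: int) -> List[int]:
    """Return indices of the `batch_size` most uncertain unlabelled samples."""
    ranked = sorted(range(len(pool_probs)), key=lambda i: margin(pool_probs[i]))
    return ranked[:batch_size]


# Hypothetical class probabilities for four unlabelled pixels, three classes.
pool = [
    [0.90, 0.05, 0.05],  # confident prediction, large margin
    [0.40, 0.35, 0.25],  # most ambiguous
    [0.50, 0.40, 0.10],  # second most ambiguous
    [0.70, 0.20, 0.10],
]
queries = select_queries(pool, batch_size=2)
print(queries)
```

In a full loop, the selected samples would be labelled, added to the training set, and the classifier retrained before the next round of selection.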
- Author(s): Zhuo Su ; Jiaming Guo ; Gengwei Zhang ; Xianghui Luo ; Ruomei Wang ; Fan Zhou
- Source: IET Image Processing, Volume 13, Issue 4, p. 556 –565
- DOI: 10.1049/iet-ipr.2018.5494
- Type: Article
Clothing parsing is significant to many clothing applications. Recently, a lot of clothing parsing methods have been presented, which explore the innovation of the parsing pipeline or try to find more specific prior information. Although these methods perform well in some benchmarks, a few challenging problems have not been solved yet, such as the complicated mutual interference among labels. In this study, the authors propose a Conditional Progressive Network to parse clothing in different scales and prevent the mutual interference among labels. The authors’ solution consists of three sub-networks, including Conditional Parsing Network (CPN), Pose Estimation Network (PEN) and Label Transform Network (LTN). Specifically, the CPN module generates the intermediate parsing result in the form of the multiple progressive stages, which combines with the previous outputs in each stage and the specific prior conditions. The PEN module provides a series of heat maps about the human pose information. The LTN module suppresses the redundant labels to avoid the mutual interference among labels. They demonstrate their solution in parsing the fashion clothing cases on the ATR and the Fashion dataset. In their experiments, their method obtains a better performance than the state-of-the-art methods.
- Author(s): Zhongyun Bao and Shan Gai
- Source: IET Image Processing, Volume 13, Issue 4, p. 566 –575
- DOI: 10.1049/iet-ipr.2018.5409
- Type: Article
Traditional colour image sparse models ignore the relationship among the three separate colour channels. The authors propose a novel colour image sparse model that employs a reduced quaternion matrix, which can treat the independent colour channels as a whole. In addition, reduced quaternion matrix singular value decomposition is employed to design the corresponding dictionary learning algorithm. To make the proposed model robust and tractable, a reduced quaternion split Bregman iteration is developed to solve the minimisation problem. The proposed model can not only preserve inherent colour structures but also avoid the hue bias issue efficiently. Extensive experiments on colour image de-noising, in-painting, and super-resolution demonstrate that the proposed sparse representation model outperforms the state-of-the-art schemes.
- Author(s): Chuan He ; Changhua Hu ; Naixin Qi ; Xiaofei Zhu ; Lianxiong Liu
- Source: IET Image Processing, Volume 13, Issue 4, p. 576 –582
- DOI: 10.1049/iet-ipr.2018.5078
- Type: Article
A fast algorithm is proposed to tackle the constrained total generalised variation (TGV)-based image-restoration and reconstruction problems. The proposed algorithm proceeds by splitting: the non-smooth constrained TGV model is first decomposed into several sub-problems easier to solve, and then the linear gradient or proximity operators, including projections and shrinkages, of the sub-problems are individually called without inner iteration. The algorithm is highly parallel since most of its steps can be executed simultaneously. Image-restoration and reconstruction experiments demonstrate that the proposed algorithm outperforms several state-of-the-art TV-based methods both in accuracy and high-speed efficiency. Besides, the proposed method efficiently suppresses staircase effects and presents better visual impression.
- Author(s): Jianwei Zhang ; Junting He ; Tianfu Chen ; Zhenmei Liu ; Danni Chen
- Source: IET Image Processing, Volume 13, Issue 4, p. 583 –590
- DOI: 10.1049/iet-ipr.2018.6032
- Type: Article
Automation-assisted cervical screening via liquid-based cytology has achieved great success using segmentation and classification methods. This work performs abnormal region detection on field-of-view cervical cell images based on deep learning, which is a novel way to solve the cervical cytological screening problem. Since some abnormal nuclei gather in groups, the proposed method chooses abnormal regions instead of abnormal nuclei as the detection targets, in order to locate the abnormal regions for further diagnosis by pathologists. In this study, a novel abnormal region detection approach for cervical screening is proposed based on a size-sensitive fully convolutional network (R-FCN). Owing to the regular feature distribution, a convolutional neural backbone network with fewer layers is designed for more efficient feature extraction and shorter running time. In addition, a new measure named hit degree is defined to describe how closely each detected region matches the corresponding ground truth. Experimental results show that an average precision of 93.2% is achieved for abnormal region detection in cervical smear images. The proposed method is promising for the development of computer-aided systems in clinical cervical cytological screening.
- Author(s): Kangfu Mei ; Aiwen Jiang ; Juncheng Li ; Bo Liu ; Jihua Ye ; Mingwen Wang
- Source: IET Image Processing, Volume 13, Issue 4, p. 591 –599
- DOI: 10.1049/iet-ipr.2018.6057
- Type: Article
Single image super-resolution (SISR) has attracted great attention and made great progress in recent years. Since SISR is an ill-posed inverse problem, most researchers concentrate on learning effective and reasonable mapping functions from a low-resolution observation to its potential high-resolution (HR) counterpart. In this study, the authors propose a deep residual refining based pseudo-multi-frame network for efficient SISR. A channel-wise attention mechanism is employed for residual refinement. It can ease the residual learning process by explicitly modelling non-linear dependencies between channels using global information embedding. Multiple potential HR images from different deconvolutional layers are further artificially learned, and then adaptively fused into the final desired HR image. The authors call this strategy pseudo-multi-frame SR. It can make full use of the redundant information available in hierarchical layers. They have evaluated the proposed network on several popular benchmark datasets. The experimental results show that the two proposed highlights consistently boost final performance. The proposed network can outperform most of the state-of-the-art methods with acceptably fewer parameters.
- Author(s): R. Mathusoothana S. Kumar
- Source: IET Image Processing, Volume 13, Issue 4, p. 600 –606
- DOI: 10.1049/iet-ipr.2018.5268
- Type: Article
The task of face recognition in real-world scenarios is still a challenging one. Many techniques exist for the recognition of faces in videos. The recognition task may be computationally easier, but it is susceptible to pose variations, lighting conditions, etc. This paper focuses on the recognition of faces from multi-view videos using a combination of particle filtering with an Immune Genetic Algorithm (IGA) and HSH, which is insensitive to pose variations. Particle filtering together with the IGA efficiently tracks the target using an immune system mechanism, and the recognition phases are then carried out using HSH. For video recognition, the ensemble feature similarity is calculated, which can be measured with the limiting Bhattacharyya distance of features in the reproducing kernel Hilbert space. The proposed system with HSH provides better performance than one using Spherical Harmonics (SH) for recognising the face of the target, and its performance is analysed against existing face recognition techniques.
- Author(s): Xiangjian Chen ; Di Li ; Xun Wang ; Xibei Yang ; Hongmei Li
- Source: IET Image Processing, Volume 13, Issue 4, p. 607 –614
- DOI: 10.1049/iet-ipr.2018.5597
- Type: Article
In recent years, the clinical application of magnetic resonance (MR) images has become increasingly extensive and in-depth. However, image segmentation is a bottleneck that restricts the clinical application of MR imaging. The segmentation of brain MR images is confronted with uncertainty and noise, and various algorithms have been proposed to handle this problem. In this study, a hybrid clustering algorithm combined with a new intuitionistic fuzzy factor and local spatial information is proposed, where type-2 fuzzy logic can handle randomness, the rough set can deal with vagueness, and the intuitionistic fuzzy logic can address the external noises. Finally, experimental tests have been carried out to demonstrate the superiority of the proposed technique.
- Author(s): Ali Javed ; Aun Irtaza ; Hafiz Malik ; Muhammad Tariq Mahmood ; Syed Adnan
- Source: IET Image Processing, Volume 13, Issue 4, p. 615 –622
- DOI: 10.1049/iet-ipr.2018.5589
- Type: Article
Sports broadcasters generate an enormous amount of video content in cyberspace owing to massive viewership all over the world. Analysing and consuming this huge repository pushes broadcasters to apply video summarisation, which extracts the exciting segments from the entire video to capture users' interest and to reap storage and transmission benefits. Therefore, in this study an automatic method for key-event detection and summarisation based on audio-visual features is presented for cricket videos. Acoustic local binary pattern features are used to capture the excitement level in the audio stream and to train a binary support vector machine (SVM) classifier. The trained SVM classifier is used to label each audio frame as excited or non-excited. Excited audio frames are used to select candidate key video frames. A decision tree-based classifier is trained to detect key events in the input cricket videos, which are then used for video summarisation. The performance of the proposed framework has been evaluated on a diverse dataset of cricket videos belonging to different tournaments and broadcasters. Experimental results indicate that the proposed method achieves an average accuracy of 95.5%, which signifies its effectiveness.
- Author(s): Cong Jin and Shu-Wei Jin
- Source: IET Image Processing, Volume 13, Issue 4, p. 623 –633
- DOI: 10.1049/iet-ipr.2018.5371
- Type: Article
Currently, multi-label automatic image annotation (MAIA) approaches based on machine learning are widely applied and developed. Since extreme learning machine (ELM) has the advantages of simple structure, fast learning speed, better generalisation ability and so on, it is used for MAIA in this study. Several improvements are designed and implemented to enhance the annotation performance and generalisation ability of MAIA. First, a novel distance metric learning method based on cost-sensitive learning is proposed for the MAIA task to reduce the impact of class imbalance among samples. Second, an improved ELM approach based on singular value decomposition is proposed for implementing the MAIA task. Finally, a selection of training samples (STS) strategy based on error correlation is proposed to improve the generalisation ability and annotation performance of MAIA. Based on the above work, a novel MAIA approach is implemented. The experimental results confirm that the proposed cost-sensitive distance metric learning method, improved cost-sensitive ELM and STS obtain good generalisation ability and achieve better annotation performance than the existing MAIA approaches.
- Author(s): Gulraiz Khan ; Aiman Siddiqi ; Muhammad Usman Ghani Khan ; Samyan Qayyum Wahla ; Sahar Samyan
- Source: IET Image Processing, Volume 13, Issue 4, p. 634 –643
- DOI: 10.1049/iet-ipr.2018.5728
- Type: Article
Recent times have witnessed an exponential increase in multimedia, specifically visual content. Emotions are an essential consideration when extracting facial features and evaluating expressions, and predicting a person's emotions has consequently become a trending topic. A methodology based on still images and consecutive video frames is proposed to predict emotions. Facial action coding system (FACS) standards are used worldwide in the development of automated vision-based emotion detection systems. Employing FACS, the authors estimated facial muscle movement by computing 24 landmark points, 16 mutual distances between them and the wrinkles caused by changing expressions. Canny edge detection is used to calculate the intensity of wrinkles. Geometric positions and optical flow are the key methods used in the implemented methodology. The methodology was evaluated on a self-generated dataset, the JAFFE dataset and EmotioNet.
- Author(s): Zelong Wang and Jubo Zhu
- Source: IET Image Processing, Volume 13, Issue 4, p. 644 –652
- DOI: 10.1049/iet-ipr.2018.5329
- Type: Article
To reduce the size of spectral data, compressive sensing imaging systems have been developed that take fewer measurements than Nyquist-rate sampling requires, from which the original data can be recovered by an optimisation model and algorithm. However, this is not a cheap option when real-time acquisition of spectral information is required. To solve this problem, the authors propose a novel sensing approach for spectral features that combines sampling, recovery and feature extraction. Inspired by the spectral feature representation, the sampling (sensing) matrix is designed from the training spectral samples to sense the spectral features of the imaging scene, which can be utilised directly for classification and recognition. In addition, the physical realisation of the sensing matrix for compressive spectral imaging systems is demonstrated by designing new modulation patterns for the digital micro-mirror device. The experimental results on real spectral data show the feasibility of the proposed scheme and its robustness to quantisation error and measurement noise. Moreover, the proposed sensing approach can greatly reduce the cost of computation and time by removing the sparse recovery and feature extraction steps.
- Author(s): Niladri Chakraborty ; Priyambada Subudhi ; Susanta Mukhopadhyay
- Source: IET Image Processing, Volume 13, Issue 4, p. 653 –662
- DOI: 10.1049/iet-ipr.2018.5652
- Type: Article
Coherence enhancing shock filters combine shock filtering with the orientation estimation of structure tensors, thus enhancing coherent flow-like structures. The basic operations defined here are dilation and erosion, which take place in the zones of influence. However, to achieve the goal of texture enhancement, the authors have extended this notion in the proposed method to define an opening and closing based shock filter. Subsequently, the open–close filtered image is employed to locate and highlight the bright and dark texture features over the entire image. Combining these feature images with the original image in a specific way produces an image with enhanced texture features. Furthermore, the authors have performed these operations at different scales to achieve better enhancement of the texture features. The method has been formulated, implemented and tested on a number of synthetic and natural texture images, and the experimental results establish the efficacy of the proposed method in enhancing the prominent texture parts of the image proportionately more than the non-prominent texture parts.
- Author(s): Debashis Nandi ; Jayashree Karmakar ; Amish Kumar ; Mrinal Kanti Mandal
- Source: IET Image Processing, Volume 13, Issue 4, p. 663 –672
- DOI: 10.1049/iet-ipr.2018.5139
- Type: Article
This study presents a novel sparse-representation based multi-frame super-resolution (SR) technique to reconstruct a high-resolution (HR) frame from multiple noisy low-resolution (LR) frames by using registration in sub-pixel accuracy and adaptive weighted feature operators. First, the registration of multiple frames in sub-pixel level and the mapping of pixels from LR frames to HR grid puts more information into the reconstructed image with respect to the conventional sparse representation based single image SR technique. This improves the overall resolution of the output image. Second, the introduction of adaptive weighted feature operators in the reconstruction process has significantly improved the robustness of the algorithm to noisy input frames. Hence, from the outputs, it can be seen that the proposed method outperforms the recent techniques in terms of noise robustness even in the higher noise level in the input image. The performance of the proposed algorithm is evaluated and quantified through a set of well-defined quality metrics and compared with some recently developed techniques. The results of the proposed technique confirm the claims of the authors.
- Author(s): Bhaskar Dey and Malay Kumar Kundu
- Source: IET Image Processing, Volume 13, Issue 4, p. 673 –679
- DOI: 10.1049/iet-ipr.2018.5985
- Type: Article
With modern socio-economic development, the number of vehicles in metropolitan cities is growing rapidly. Therefore, obtaining real-time traffic volume estimates is very important for making use of the limited road space and traffic infrastructure. In this study, the authors present a video-based method for traffic volume and direction estimation at road intersections. To discriminate vehicles from the remaining foreground objects, vehicle recognition is performed by training a deep-learning architecture from a pre-trained model. This method, called transfer learning, primarily circumvents the requirement for huge labelled datasets and the time needed to train the network. Moving foreground regions or patches are first detected in the video sequence. The trained model is subsequently used to classify the vehicles. The vehicles are tracked, and trajectory patterns are clustered using standard techniques. The number and direction of vehicles are noted and later compared with manually observed values. All experiments were performed on real-life surveillance sequences recorded at four different traffic intersections in the city of Kolkata.
- Author(s): Linwei Fan ; Ran Meng ; Qiang Guo ; Miaowen Shi ; Caiming Zhang
- Source: IET Image Processing, Volume 13, Issue 4, p. 680 –691
- DOI: 10.1049/iet-ipr.2018.6357
- Type: Article
Low-rank approximation has shown great potential in various image tasks. It is found that there is a specific functional relationship about singular values between the original image and a series of noisy images, which can be used to construct the singular values of a noise-free image. In this study, the authors propose a novel denoising method based on the above facts and low-rank approximation theory. Firstly, they estimate the noise energy distribution of the group matrix in the singular value decomposition (SVD) domain using the energy characteristics of the image with different noise levels. The energy distribution of the noise is shrunk to obtain the energy distribution of the true signal. Then, based on the optimal energy compaction property of SVD, the low-rank property of matrix is constrained in the SVD domain to obtain the low-rank approximation of the matrix. Moreover, an iterative back projection method is adopted in this study to suppress residual noise. A new noise standard deviation estimation approach, targeted at the back projection process, is proposed to effectively optimise the denoising results during the iteration. Experimental results show that the authors’ method efficiently decreases the noise and achieves comparable denoising performance to the state-of-the-art methods regarding both quantitative measurement and visual effect.
Fast active learning for hyperspectral image classification using extreme learning machine
Conditional progressive network for clothing parsing
Reduced quaternion matrix-based sparse representation and its application to colour image processing
Fast proximal splitting algorithm for constrained TGV-regularised image restoration and reconstruction
Abnormal region detection in cervical smear images based on fully convolutional network
Deep residual refining based pseudo-multi-frame network for effective single image super-resolution
Robust multi-view videos face recognition based on particle filter with immune genetic algorithm
Rough intuitionistic type-2 fuzzy c-means clustering algorithm for MR image segmentation
Multimodal framework based on audio-visual features for summarisation of cricket videos
Multi-label automatic image annotation approach based on multiple improvement strategies
Geometric positions and optical flow based emotion detection using MLP and reduced dimensions
Compressive spectral feature sensing
Shock filter-based morphological scheme for texture enhancement
Sparse representation based multi-frame image super-resolution reconstruction using adaptive weighted features
Turning video into traffic data – an application to urban intersection analysis using transfer learning
Image denoising by low-rank approximation with estimation of noise energy distribution in SVD domain
- Author(s): Altaf ur Rahman ; Muhammad Sajid ; Syed Omer Gillani ; Emad Uddin ; Ibrahim Hassan
- Source: IET Image Processing, Volume 13, Issue 4, p. 692 –697
- DOI: 10.1049/iet-ipr.2018.5684
- Type: Article
Bubble detection and tracking, which is essential for the enhancement of gas–liquid two-phase flow applications, is difficult due to optical noise surrounding bubble boundaries. Two-phase flow applications involving unconventional wavy channels further complicate these tasks by introducing non-linearity. In this work, the authors apply a path-based approach to transform a sinusoidal channel into a straightened horizontal channel for bubble detection and tracking. Segmented morphological operations are applied to the linear channel to identify each bubble as a single entity and to eliminate noise caused by illumination conditions. The bubbles are detected through blob analysis and are associated across frames based on motion estimated by a Kalman filter, both in the original sinusoidal channel and in the path-based straightened channel. The results show that the proposed path-based straightened channel gives more accurate results for the benchmark parameters of bubble count and bubble velocity than the sinusoidal channel, with a percentage error reduction of 31.61% for bubble count and 27.36% for velocity.
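To make the motion-estimation step in the abstract above more concrete, here is a minimal sketch of a constant-velocity Kalman filter tracking one bubble's position along a straightened (one-dimensional) channel. The state model, the noise parameters `q` and `r`, the initial covariance and the function name `kalman_track` are all assumptions chosen for illustration; the abstract does not give the filter design used in the paper.

```python
from typing import List, Tuple


def kalman_track(measurements: List[float], q: float = 1e-3, r: float = 1.0) -> Tuple[float, float]:
    """Constant-velocity Kalman filter over 1-D positions, one per frame.

    q is the assumed process noise, r the assumed measurement noise.
    Returns the final (position, velocity) estimate.
    """
    p, v = 0.0, 0.0  # state: position and velocity (per frame)
    # 2x2 covariance stored entry by entry; large values mean "unsure".
    p00, p01, p10, p11 = 100.0, 0.0, 0.0, 100.0

    for z in measurements:
        # Predict: position advances by the velocity (time step of one frame).
        p = p + v
        n00 = p00 + p01 + p10 + p11 + q
        n01 = p01 + p11
        n10 = p10 + p11
        n11 = p11 + q
        # Update with the measured position z (only position is observed).
        s = n00 + r
        k0, k1 = n00 / s, n10 / s
        residual = z - p
        p += k0 * residual
        v += k1 * residual
        p00, p01 = (1 - k0) * n00, (1 - k0) * n01
        p10, p11 = n10 - k1 * n00, n11 - k1 * n01
    return p, v


# Hypothetical bubble centroids moving 2 pixels per frame along the channel.
positions = [2.0 * k for k in range(1, 31)]
pos, vel = kalman_track(positions)
```

In a multi-bubble setting, the predicted position of each track would typically be matched to the nearest detected blob in the next frame before the update step.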
- Author(s): Omar Hellel ; Mohammed Beladgham ; Abdelmounaim Moulay Lakhdar
- Source: IET Image Processing, Volume 13, Issue 4, p. 698 –706
- DOI: 10.1049/iet-ipr.2018.5345
- Type: Article
The purpose of data compression, in general, is to encode information using fewer bits than the original representation. In video coding, the goal is to reduce the bitrate of a video while preserving an acceptable quality for visual perception. In this study, the authors describe and study the performance of a proposed video coding framework that utilises a second-generation wavelet transform, specifically Spline 5/3. This framework allows for unequal error protection and performs better than many well-known video coding systems, especially for low-bitrate applications. The proposed framework allows bit rate control according to users' needs at both the encoder end and the decoder end, which suits networks with variable conditions such as wireless IP networks, where noise can reduce the available bandwidth and the number of users can vary over time.
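Second-generation wavelets are usually built with the lifting scheme, so a small sketch may help show what a 5/3 transform does to one row of samples. The code below is one level of the reversible integer 5/3 lifting transform with mirrored boundaries, in the form commonly associated with lossless JPEG 2000 coding. Whether the paper uses integer rounding, this boundary handling or this exact variant is an assumption, and the function names are hypothetical.

```python
from typing import List, Tuple


def forward_53(x: List[int]) -> Tuple[List[int], List[int]]:
    """One level of the integer 5/3 lifting transform (even-length input).

    Returns (s, d): the low-pass approximation and high-pass detail bands.
    """
    assert len(x) % 2 == 0 and len(x) >= 2
    even, odd = x[0::2], x[1::2]
    m = len(odd)
    # Predict step: detail = odd sample minus the average of its even neighbours.
    d = [odd[i] - (even[i] + even[min(i + 1, m - 1)]) // 2 for i in range(m)]
    # Update step: smooth the even samples using the neighbouring details.
    s = [even[i] + (d[max(i - 1, 0)] + d[i] + 2) // 4 for i in range(m)]
    return s, d


def inverse_53(s: List[int], d: List[int]) -> List[int]:
    """Undo forward_53 exactly by running the lifting steps in reverse."""
    m = len(d)
    even = [s[i] - (d[max(i - 1, 0)] + d[i] + 2) // 4 for i in range(m)]
    odd = [d[i] + (even[i] + even[min(i + 1, m - 1)]) // 2 for i in range(m)]
    x = [0] * (2 * m)
    x[0::2], x[1::2] = even, odd
    return x


# Hypothetical row of pixel values.
row = [10, 12, 14, 13, 9, 7, 8, 8]
low, high = forward_53(row)
restored = inverse_53(low, high)
```

Because each lifting step is undone exactly, the transform is lossless; a scalable coder can then decide how many of the detail coefficients to transmit for a given bit rate.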
Detection and tracking of bubbles in two-phase air water flow for non-convergent sinusoidal channel
Study of performance of a ‘second-generation wavelet video encoder with a scalable rate’