IET Image Processing
Volume 11, Issue 7, July 2017
- Author(s): Yuanping Zhu and Kuang Zhang
- Source: IET Image Processing, Volume 11, Issue 7, p. 455–464
- DOI: 10.1049/iet-ipr.2016.0914
- Type: Article
Text segmentation is important for text image analysis and recognition; however, it is challenging due to noise and complex backgrounds in natural scenes. Superpixel-based image representation can enhance robustness to noise and local disturbances, but conventional superpixel algorithms struggle to obtain complete stroke regions and accurate boundaries for text images. In this study, a text segmentation method based on superpixel clustering is proposed. First, to generate accurate superpixels for text images, an adaptive simple linear iterative clustering (SLIC)-based text superpixel generation algorithm is proposed. The adaptive superpixel size and compactness are calculated to enhance boundary adherence. Second, to improve the coverage of complete strokes by superpixels, superpixel clustering merges homogeneous superpixels into larger regions for both strokes and the background, using a modified density-based spatial clustering of applications with noise (DBSCAN) algorithm. Finally, stroke superpixel verification assigns each region to a stroke or to the background and the text segmentation result is obtained. The proposed method shows promising robustness to noise and complex background textures. Experimental results on the Korea Advanced Institute of Science and Technology (KAIST) scene text dataset, the International Conference on Document Analysis and Recognition (ICDAR) 2003 natural scene text image dataset and the Street View Text dataset verify that this method is effective and significantly outperforms existing methods.
- Author(s): Huicong Wu ; Liang Xiao ; Hiuk Jae Shim ; Songze Tang
- Source: IET Image Processing, Volume 11, Issue 7, p. 465–474
- DOI: 10.1049/iet-ipr.2016.0645
- Type: Article
This study proposes a robust approach to stabilising videos with a new variational minimisation model. In video stabilisation, accumulation error often occurs in cascaded transformation chain-based methods. To alleviate accumulation error, a new total warping variation (TWV) model is proposed, which describes the smoothness of stabilised camera motion and calculates all the warping transformations efficiently. After estimating the original motion parameters based on a 2D similarity transformation model, the corresponding warping parameters are calculated under the TWV minimisation framework, where the separable property of the motion parameters is utilised to obtain a closed-form solution. The proposed method provides robust, smooth and precise motion trajectories after stabilisation. Furthermore, an iterative TWV method is introduced to reduce high-frequency jitters as well as low-frequency motions. Moreover, an online TWV method is presented for streaming long video sequences by adopting a sliding-window approach. Experimental results on various shaky video sequences show the effectiveness of the proposed method.
- Author(s): Amir Reza Sadri ; Sepideh Azarianpour ; Maryam Zekri ; Mehmet Emre Celebi ; Saeid Sadri
- Source: IET Image Processing, Volume 11, Issue 7, p. 475–482
- DOI: 10.1049/iet-ipr.2016.0681
- Type: Article
A new computer-aided diagnosis (CAD) system for detecting malignant melanoma from dermoscopy images based on a fixed grid wavelet network (FGWN) is proposed. This novel approach is unique in at least three ways: (i) the FGWN is a fixed WN which does not require gradient-type algorithms for its construction, (ii) the construction of FGWN is based on a new regressor selection technique: D-optimality orthogonal matching pursuit (DOOMP), and (iii) the entire CAD system relies on the proposed FGWN. These characteristics enhance the integrity and reliability of the results obtained from different stages of automatic melanoma diagnosis. The DOOMP algorithm optimises the network model approximation ability rapidly while improving the model adequacy and robustness. This FGWN is then used to build a CAD system, which performs image enhancement, segmentation, and classification. To classify the images, in the first stage, 441 features with respect to colour, texture, and shape of each lesion are extracted. By means of feature selection, these 441 features are then reduced to 10. The proposed CAD system achieved an accuracy of 91.82%, sensitivity of 92.61%, specificity of 91%, and area under the curve value of 0.944 on a challenging set of 1039 dermoscopy images.
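The accuracy, sensitivity and specificity figures quoted above follow the usual confusion-matrix definitions. A minimal sketch of that arithmetic, assuming malignant is the positive class; the function name and the counts are hypothetical and are not taken from the paper:

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion counts."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total      # fraction of all cases classified correctly
    sensitivity = tp / (tp + fn)      # true-positive rate (malignant found)
    specificity = tn / (tn + fp)      # true-negative rate (benign cleared)
    return accuracy, sensitivity, specificity


# Hypothetical counts for illustration only.
acc, sens, spec = diagnostic_metrics(tp=90, fn=10, tn=80, fp=20)
```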
- Author(s): Mohsen Biglari ; Ali Soleimani ; Hamid Hassanpour
- Source: IET Image Processing, Volume 11, Issue 7, p. 483–491
- DOI: 10.1049/iet-ipr.2016.0969
- Type: Article
Fine-grained recognition is a challenge currently facing the computer vision community. In this problem, the main category of the object is known and the goal is to determine its subcategory, or fine-grained category. Vehicle make and model recognition (VMMR) is a hard fine-grained classification problem, due to the large number of classes, the substantial intra-class distance and the small inter-class distance. In this study, a novel approach to VMMR based on a latent SVM formulation is proposed. This approach automatically finds a set of discriminative parts in each class of vehicles by employing a novel greedy parts localisation algorithm, while learning a model per class using both the features extracted from these parts and the spatial relationship between them. An effective and practical multi-class data mining method is proposed to filter out hard negative samples in the training procedure. Employing these trained individual models together, the authors’ system can classify vehicle make and model with high accuracy. For evaluation purposes, a new dataset including more than 5000 vehicles of 28 different makes and models has been collected and fully annotated. The experimental results on this dataset and the CompCars dataset indicate the outstanding performance of the authors’ approach.
- Author(s): Xu Qiao ; Xiaoqing Liu ; Yen-wei Chen ; Zhi-Ping Liu
- Source: IET Image Processing, Volume 11, Issue 7, p. 492–501
- DOI: 10.1049/iet-ipr.2016.0795
- Type: Article
Linear coding is widely used to represent data sets concisely by discovering basis functions that capture high-level features. However, the efficient identification of linear codes for representing multi-dimensional data remains very challenging. In this study, the authors address the problem by proposing a linear tensor coding algorithm that represents multi-dimensional data succinctly via a linear combination of tensor-formed bases without data expansion. Motivated by the amalgamation of linear image coding and multi-linear algebra, each basis function in the authors’ algorithm captures some specific variabilities. The basis-associated coefficients can be used for data representation, compression and classification. When the authors apply the algorithm to both simulated phantom data and real facial data, the experimental results demonstrate that their algorithm not only preserves the original information of the input data, but also produces localised bases with concrete physical meanings.
- Author(s): Hong Liu ; Meng Yan ; Enmin Song ; Yuejing Qian ; Xiangyang Xu ; Renchao Jin ; Lianghai Jin ; Chih-Cheng Hung
- Source: IET Image Processing, Volume 11, Issue 7, p. 502–511
- DOI: 10.1049/iet-ipr.2016.0988
- Type: Article
The multi-atlas patch-based label fusion method (MAS-PBM) has emerged as a promising technique for magnetic resonance imaging (MRI) image segmentation. The state-of-the-art MAS-PBM approach measures the patch similarity between the target image and each atlas image using features extracted from image intensity only. Each atlas, however, consists of both an MRI image and a labelled image (also called the map), so the map information is not used in calculating the similarity in the existing MAS-PBM. To improve the segmentation result, the authors propose an enhanced MAS-PBM in which the maps are used in the similarity measure. In the first step of the proposed method, an initial segmentation result (i.e. an appropriate map for the target) is obtained by using either the non-local patch-based label fusion method (NPBM) or the sparse patch-based label fusion method (SPBM) based on the grey scales of patches. Then, the SPBM is applied again to obtain a finer segmentation based on the labels of patches. The authors call these two versions of the proposed fusion method MAS-PBM-NPBM and MAS-PBM-SPBM. Experimental results show that more accurate segmentation results are achieved compared with those of majority voting, NPBM, SPBM, STEPS and the hierarchical multi-atlas label fusion with multi-scale feature representation and label-specific patch partition.
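To make the two baselines named above concrete, here is a minimal sketch of majority voting and of an intensity-only, similarity-weighted vote in the spirit of non-local patch-based fusion. It is not the authors' enhanced method: the function names, the Gaussian weighting with bandwidth `h`, and the toy patches are all assumptions for illustration.

```python
import math


def majority_vote(labels):
    """Baseline fusion: the most frequent atlas label wins."""
    counts = {}
    for lab in labels:
        counts[lab] = counts.get(lab, 0) + 1
    return max(counts, key=counts.get)


def weighted_patch_vote(target_patch, atlas_patches, atlas_labels, h=1.0):
    """Weight each atlas label by the intensity similarity of its patch
    to the target patch (assumed Gaussian kernel with bandwidth h)."""
    scores = {}
    for patch, lab in zip(atlas_patches, atlas_labels):
        dist2 = sum((t - a) ** 2 for t, a in zip(target_patch, patch))
        scores[lab] = scores.get(lab, 0.0) + math.exp(-dist2 / h)
    return max(scores, key=scores.get)


# Toy data: one atlas patch matches the target, two do not.
target = [10, 12, 11]
patches = [[10, 12, 11], [30, 31, 29], [28, 33, 30]]
labels = [1, 0, 0]

by_majority = majority_vote(labels)
by_similarity = weighted_patch_vote(target, patches, labels)
```

In this toy case the two rules disagree: majority voting follows the two dissimilar atlases, while the weighted vote follows the single matching patch.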
- Author(s): Ba Thai ; Mukhalad Al-nasrawi ; Guang Deng ; Zhuo Su
- Source: IET Image Processing, Volume 11, Issue 7, p. 512–521
- DOI: 10.1049/iet-ipr.2016.0418
- Type: Article
The bilateral filter (BF) is a non-linear filter that spatially smooths images with awareness of large structures such as edges. The level of smoothness applied to a pixel is constrained by a photometric weight, which can be obtained from the same image to be filtered (in the case of the original BF) or from a guided image (in the case of the joint/cross BF). In this study, the authors propose a new filter called the semi-guided BF, which is derived from solving a non-linear constrained least-squares problem. The proposed filter's photometric weight incorporates information from both the image to be filtered and the guided image. They propose a fast implementation of the filter based on layer approximation. They also study the iterative application of the proposed filter and show that the filter can preserve large structures while smoothing out small structures. This makes the proposed filter an efficient and effective tool for structure-aware image smoothing. Experimental results demonstrate that the performance of the proposed filter is comparable to that of state-of-the-art algorithms.
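The difference between the original BF and the joint/cross BF described above is only where the photometric weight comes from. A minimal 1D sketch, assuming Gaussian spatial and photometric kernels; the function name, parameters and test signal are hypothetical, and the semi-guided weight proposed in the paper is not reproduced here:

```python
import math


def bilateral_1d(signal, guide=None, radius=2, sigma_s=1.0, sigma_r=10.0):
    """1D bilateral filter; pass `guide` for the joint/cross variant."""
    if guide is None:
        guide = signal  # original BF: photometric weight from the input itself
    n = len(signal)
    out = []
    for i in range(n):
        num = den = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            w_spatial = math.exp(-((i - j) ** 2) / (2 * sigma_s ** 2))
            w_photo = math.exp(-((guide[i] - guide[j]) ** 2) / (2 * sigma_r ** 2))
            w = w_spatial * w_photo
            num += w * signal[j]
            den += w
        out.append(num / den)  # den >= 1 because j == i always contributes weight 1
    return out


step = [0, 0, 0, 0, 100, 100, 100, 100]
smoothed = bilateral_1d(step)                  # the edge survives
blurred = bilateral_1d(step, guide=[0] * 8)    # flat guide: plain Gaussian blur
```

With a flat guide every photometric weight equals one, so the filter degenerates to Gaussian smoothing and blurs the step, which is why the choice of guide matters.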
- Author(s): Tingya Yang and Houshou Chen
- Source: IET Image Processing, Volume 11, Issue 7, p. 522–529
- DOI: 10.1049/iet-ipr.2016.0655
- Type: Article
This study presents a modified majority-logic decoding algorithm of Reed–Muller (RM) codes for matrix embedding (ME) in steganography. An ME algorithm uses a linear block code to improve the embedding efficiency in steganography. The optimal embedding algorithm in steganography is equivalent to the maximum likelihood decoding (MLD) algorithm in error-correcting codes. The main disadvantage of ME is that the equivalent MLD algorithm for lengthy embedding codes makes embedding highly complex. This study uses RM codes to embed data in binary host images. The authors propose a novel low-complexity embedding algorithm that uses a modified majority-logic algorithm to decode RM codes, in which a message-passing algorithm (i.e. sum-product, min-sum, or bias propagation) is performed on the highest order of information bits in the RM codes. The experimental results indicate that integrating bias propagation into the proposed scheme achieves superior embedding efficiency (relative to when the sum-product or min-sum algorithm is used) and can even achieve the embedding bound of RM codes.
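The idea of matrix embedding can be illustrated with a much smaller code than the RM codes used in the paper. The sketch below swaps in the [7,4] Hamming code, for which optimal (maximum likelihood) embedding reduces to a syndrome lookup: 3 message bits are hidden in 7 cover bits by flipping at most one bit. Function names and the cover bits are hypothetical.

```python
def syndrome(bits):
    """Syndrome of 7 bits under the [7,4] Hamming parity-check matrix
    whose i-th column (1-based) is the binary expansion of i."""
    s = 0
    for pos, bit in enumerate(bits, start=1):
        if bit:
            s ^= pos
    return s


def embed(cover, message):
    """Hide a 3-bit message (integer 0..7) in 7 cover bits,
    changing at most one cover bit."""
    stego = list(cover)
    diff = syndrome(cover) ^ message
    if diff:                  # diff names the single position to flip
        stego[diff - 1] ^= 1
    return stego


def extract(stego):
    """The receiver recovers the message as the syndrome of the stego bits."""
    return syndrome(stego)


cover = [1, 0, 1, 1, 0, 0, 1]
stego = embed(cover, 5)
recovered = extract(stego)
```

Flipping the bit at position `diff` changes the syndrome by exactly `diff`, which is why the extracted syndrome equals the message. For longer codes such as RM codes this lookup is no longer trivial, which is the complexity problem the abstract describes.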
- Author(s): K. Raghesh Krishnan and Sudhakar Radhakrishnan
- Source: IET Image Processing, Volume 11, Issue 7, p. 530–538
- DOI: 10.1049/iet-ipr.2016.1072
- Type: Article
This study presents a computer-based approach to classifying ten different kinds of focal and diffused liver disorders using ultrasound images. The diseased portion is isolated from the ultrasound image by applying an active contour segmentation technique. The segmented region is further decomposed into horizontal, vertical and diagonal component images by applying a biorthogonal wavelet transform. From these wavelet-filtered component images, grey-level run-length matrix features are extracted and classified using random forests with a ten-fold cross-validation strategy. The results are compared with spatial feature extraction techniques such as intensity histogram and invariant moment features, and with spatial texture features such as grey-level co-occurrence matrices, grey-level run-length matrices and fractal texture features. The proposed technique, which is an application of texture feature extraction to transform-domain images, gives an overall classification accuracy of 91% for a combination of ten classes of similar-looking diseases, which is appreciably better than spatial-domain-only techniques for liver disease classification from ultrasound images.
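The grey-level run-length matrix mentioned above counts, for each grey level, how many runs of each length occur along a direction. A minimal sketch for horizontal runs, assuming an already quantised image given as non-empty rows of integer levels; the function names and the toy image are hypothetical, and the paper applies the features to wavelet component images rather than to a raw patch like this one:

```python
def glrlm_horizontal(image, levels):
    """Grey-level run-length matrix for horizontal runs.
    Entry [g][r - 1] counts runs of grey level g with length r."""
    width = max(len(row) for row in image)
    matrix = [[0] * width for _ in range(levels)]
    for row in image:
        run_value, run_len = row[0], 1
        for value in row[1:]:
            if value == run_value:
                run_len += 1
            else:
                matrix[run_value][run_len - 1] += 1   # close the finished run
                run_value, run_len = value, 1
        matrix[run_value][run_len - 1] += 1           # close the last run in the row
    return matrix


def short_run_emphasis(matrix):
    """Classic run-length feature: mean of 1 / r^2 over all runs."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(count / (r + 1) ** 2
                   for row in matrix for r, count in enumerate(row))
    return weighted / total


patch = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [2, 2, 2, 2],
]
rlm = glrlm_horizontal(patch, levels=3)
sre = short_run_emphasis(rlm)
```

Features such as short run emphasis would then be fed to a classifier; the random forest and cross-validation steps are outside this sketch.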
- Author(s): Zhihua Chen ; Zhenzhu Wang ; Bin Sheng ; Chao Li ; Ruimin Shen ; Ping Li
- Source: IET Image Processing, Volume 11, Issue 7, p. 539–549
- DOI: 10.1049/iet-ipr.2016.0989
- Type: Article
The cyan, magenta, yellow, black (CMYK) colour model, the standard colour space used by printers, is a subtractive colour space that describes the printing process. Existing CMYK conversion methods rely on a static conversion table, which may not preserve the subtle visual structures of images because of the local visual contrast loss caused by static colour mapping. Therefore, the authors propose a novel dynamic red, green, blue (RGB)-to-CMYK colour conversion, which utilises the weighted entropy to extract the pixels whose filter response changes dramatically. They obtain the image activity map by combining these pixels with high skin probability regions, and optimise the colour conversion of each pixel so that ink is saved while visual contrast is preserved. In this way, their proposed technique achieves dynamic CMYK colour conversion, in which ink consumption is reduced without loss of visual contrast. The experimental results show that their dynamic CMYK colour conversion reduces ink consumption by 10–25% compared with the static conversion method, while maintaining high visual quality in the converted images.
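For contrast with the dynamic method, a static conversion applies the same mapping to every pixel regardless of image content. The sketch below uses the common textbook RGB-to-CMYK formula as a stand-in for such a fixed mapping; real printing pipelines typically use device-specific tables, and the function names here are hypothetical.

```python
def rgb_to_cmyk(r, g, b):
    """Naive static conversion from 8-bit RGB to CMYK fractions in [0, 1]."""
    if (r, g, b) == (0, 0, 0):
        return 0.0, 0.0, 0.0, 1.0          # pure black: key ink only
    rp, gp, bp = r / 255, g / 255, b / 255
    k = 1 - max(rp, gp, bp)                # key (black) replaces the common dark part
    c = (1 - rp - k) / (1 - k)
    m = (1 - gp - k) / (1 - k)
    y = (1 - bp - k) / (1 - k)
    return c, m, y, k


def ink_total(cmyk):
    """Total ink coverage of one pixel, as a simple proxy for consumption."""
    return sum(cmyk)


red = rgb_to_cmyk(255, 0, 0)
white = rgb_to_cmyk(255, 255, 255)
black = rgb_to_cmyk(0, 0, 0)
```

A content-aware method such as the one in the abstract would adjust these per-pixel values using the activity map; that optimisation is not attempted here.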
- Author(s): Sagar Shriram Salwe and Karamtot Krishna Naik
- Source: IET Image Processing, Volume 11, Issue 7, p. 550–558
- DOI: 10.1049/iet-ipr.2016.0779
- Type: Article
Vertical handover (VHO) plays an important role in providing seamless connectivity between heterogeneous wireless networks. The authors propose a VHO mechanism that uses an image as dynamic discrete data for transmission and a received signal strength (RSS)-based switching mechanism. The novelty of the work lies in the image data used for simulation, the RSS calculation using the free-space propagation model, the receive delay calculation and the sample-based time series analysis of received data. The results show that when the VHO mechanism is carried out in devices operating in the ISM band, it provides seamless connectivity between diverse communication protocols. Simulation results exhibit continuous transmission of data and synchronisation of the received image data.
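The RSS calculation with a free-space propagation model, followed by RSS-based switching, can be sketched as follows. This is an illustration under stated assumptions, not the authors' simulation: antenna gains are taken as unity, the hysteresis margin and the network names are hypothetical, and all function names are invented for the example.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def free_space_path_loss_db(distance_m, freq_hz):
    """Free-space path loss: 20 * log10(4 * pi * d * f / c)."""
    return 20 * math.log10(4 * math.pi * distance_m * freq_hz / SPEED_OF_LIGHT)


def rss_dbm(tx_power_dbm, distance_m, freq_hz):
    """Received signal strength, assuming unity antenna gains."""
    return tx_power_dbm - free_space_path_loss_db(distance_m, freq_hz)


def choose_network(rss_by_network, current, hysteresis_db=3.0):
    """Hand over only if another network beats the current one by the margin."""
    best = max(rss_by_network, key=rss_by_network.get)
    if best != current and rss_by_network[best] - rss_by_network[current] > hysteresis_db:
        return best
    return current


# Two hypothetical 2.4 GHz ISM-band links at different distances.
readings = {
    "net_a": rss_dbm(20.0, 10.0, 2.4e9),
    "net_b": rss_dbm(20.0, 100.0, 2.4e9),
}
selected = choose_network(readings, current="net_b")
```

The hysteresis margin is one common way to avoid rapid back-and-forth switching when two RSS values are close; whether the paper uses such a margin is not stated in the abstract.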
Text segmentation using superpixel clustering
Video stabilisation with total warping variation model
WN-based approach to melanoma diagnosis from dermoscopy images
Part-based recognition of vehicle make and model
Multi-dimensional data representation using linear tensor coding
Label fusion method based on sparse patch representation for the brain MRI image segmentation
Semi-guided bilateral filter
Matrix embedding in steganography with binary Reed–Muller codes
Hybrid approach to classification of focal and diffused liver disorders using ultrasound images with wavelets and texture features
Dynamic RGB-to-CMYK conversion using visual contrast optimisation
Discrete image data transmission in heterogeneous wireless network using vertical handover mechanism