IET Image Processing
Volume 13, Issue 12, 17 October 2019
- Author(s): D. Latha and Y. Jacob Vetha Raj
- Source: IET Image Processing, Volume 13, Issue 12, p. 2031–2044
- DOI: 10.1049/iet-ipr.2018.5797
- Type: Article
Content-based image retrieval (CBIR) is an image retrieval technique that retrieves images by matching their feature set values. This research presents a novel hybrid CBIR method using statistical, Discrete Wavelet Transform (DWT)-Entropy and Peak-oriented Octal Pattern-derived Majority Voting (POPMV)-based feature sets (CBIR_SWPOPMV) to efficiently retrieve relevant colour images from a colour image dataset. The core of the proposed method is a novel texture descriptor, POPMV, an octal pattern based on histogram peak information, which yields a majority voting-based feature set and three histogram-based feature sets. To further improve retrieval accuracy, a DWT-based entropy feature set and a statistical feature set are also included. Finally, Euclidean distance-based matching returns the images most relevant to the query image. The proposed methodology is experimentally compared with recent CBIR methods on seven standard databases, namely Corel-1k, USPTex, MIT-VisTex, KTH-TIPS, KTH-TIPS2a, KTH-TIPS2b and Colored Brodatz, and on a user-contributed database named DB_VEG.
Hybrid CBIR method using statistical, DWT-Entropy and POPMV-based feature sets
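The final matching stage, ranking database images by Euclidean distance between concatenated feature vectors, can be sketched as follows; the toy vectors and the `euclidean`/`retrieve` helpers are illustrative stand-ins, not the paper's actual descriptors.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve(query, database, k=3):
    """Rank database images by distance of their (hypothetical) concatenated
    feature vectors to the query vector; return the k nearest image ids."""
    ranked = sorted(database, key=lambda img: euclidean(query, database[img]))
    return ranked[:k]
```

In the paper the vectors would concatenate the POPMV, histogram, DWT-entropy and statistical feature sets; the matching step itself is unchanged.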
- Author(s): Abhinav Gupta and Divya Singhal
- Source: IET Image Processing, Volume 13, Issue 12, p. 2045–2057
- DOI: 10.1049/iet-ipr.2018.6074
- Type: Article
Median filtering forensics in images is a subject under intense study nowadays. Existing median filtering detectors are developed based on hand-crafted features or convolutional neural networks (CNNs). Among hand-crafted detectors, performance mostly deteriorates for low-resolution images compressed with low quality factors, whereas CNN-based detectors are found to be more robust at the expense of a large database and long training time. In this study, the authors propose a robust median filtering detector that exploits the statistics of the Pearson parameter, defined as a polynomial ratio of skewness and kurtosis. To capture the fingerprints of median filtering, this parameter is determined for the median filtered residual (MFR) of the images to construct a novel feature set of 23 dimensions. The efficacy of the proposed feature set against existing hand-crafted and CNN-based detectors is established by a series of experiments on global median filtering detection. Results reveal that the proposed feature set exhibits a performance gain of 2–4% over existing hand-crafted detectors and an approximate gain of 4% over a CNN-based detector for the detection of low-resolution median filtered images compressed with low quality factors.
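As a loose sketch of the statistic behind the detector, the skewness and kurtosis entering the Pearson parameter can be computed from a median filtered residual; a 1-D signal and 3-sample window stand in for the 2-D image case, and the combination into the actual 23-dimensional feature set is not shown.

```python
import statistics

def median_filter_1d(x, w=3):
    """Sliding-window median, a 1-D stand-in for the image median filter."""
    h = w // 2
    return [statistics.median(x[max(0, i - h):i + h + 1]) for i in range(len(x))]

def skew_kurt(r):
    """Sample skewness and kurtosis, the two moments combined into the
    Pearson parameter (a polynomial ratio of the two)."""
    n = len(r)
    m = sum(r) / n
    m2 = sum((v - m) ** 2 for v in r) / n
    m3 = sum((v - m) ** 3 for v in r) / n
    m4 = sum((v - m) ** 4 for v in r) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2

signal = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
mfr = [a - b for a, b in zip(signal, median_filter_1d(signal))]  # the residual
```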
- Author(s): Chou-Yuan Liu ; Chin-Chen Chang ; Der-Lor Way ; Wen-Kai Tai
- Source: IET Image Processing, Volume 13, Issue 12, p. 2058–2066
- DOI: 10.1049/iet-ipr.2018.5298
- Type: Article
Here, the authors present a fuzzy-based approach for image resizing that improves both the gradient map and the saliency map. The approach first constructs structural elements of three different sizes, which are used in close–open filtering to produce three smoothed versions of the input image. Edge detection is then applied to these images, and the results are merged into an improved gradient map. In addition, a saliency map of the input image is calculated; hedges in fuzzy logic strengthen the values of significant pixels and reduce those of background pixels, yielding an improved saliency map. A weighting technique then produces weighted gradient and saliency maps, which a fuzzy-based combination turns into an importance map. This map guides seam carving when adjusting the image size. Experimental results show that the approach using the importance map produces better image resizing results.
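The close–open smoothing at three structuring-element sizes can be illustrated in one dimension, where grey-scale dilation and erosion reduce to windowed max and min; the widths 3, 5 and 7 are assumptions for illustration, not values from the paper.

```python
def dilate(x, w):
    """Grey-scale dilation with a flat structuring element: windowed max."""
    h = w // 2
    return [max(x[max(0, i - h):i + h + 1]) for i in range(len(x))]

def erode(x, w):
    """Grey-scale erosion with a flat structuring element: windowed min."""
    h = w // 2
    return [min(x[max(0, i - h):i + h + 1]) for i in range(len(x))]

def close_open(x, w):
    """Closing (dilate then erode) followed by opening (erode then dilate)."""
    closed = erode(dilate(x, w), w)
    return dilate(erode(closed, w), w)

row = [0, 0, 9, 0, 0, 8, 8, 8, 0, 0]
smooth = [close_open(row, w) for w in (3, 5, 7)]   # three smoothing scales
```

Narrow spikes and gaps smaller than the structuring element are removed, so each scale yields a progressively smoother signal for the subsequent edge detection.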
- Author(s): Reza Moradi ; Reza Berangi ; Behrooz Minaei
- Source: IET Image Processing, Volume 13, Issue 12, p. 2067–2076
- DOI: 10.1049/iet-ipr.2018.6620
- Type: Article
In the image processing domain of deep learning, the large size and complexity of visual data require a large number of learnable variables, so the training process consumes enormous computation and memory resources. Based on residual modules, the authors develop a new model architecture with a minimal number of parameters and layers, enabling the classification of tiny images at much lower computation and memory cost. The summation of correlations between pairs of feature maps is also added as a penalty to the objective function; this encourages the kernels to be learned so that they elicit uncorrelated representations from the input images. Employing fractional pooling allows deeper networks and consequently more informative representations, and periodic learning rate curves let multiple machines be trained at a lower total cost. In the training phase, random augmentation of the input data prevents the model from overfitting. On the MNIST and CIFAR-10 datasets, the proposed model achieves classification accuracies of 99.72% and 93.98%, respectively.
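The decorrelation penalty, a summation of correlations between pairs of feature maps added to the objective, might look like this on flattened feature maps (plain Pearson correlation on non-constant maps is assumed here as the correlation measure).

```python
import itertools
import math

def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

def decorrelation_penalty(feature_maps):
    """Sum of absolute pairwise correlations between flattened feature
    maps; added to the loss with some weight, it pushes the kernels
    towards uncorrelated representations."""
    return sum(abs(pearson(a, b))
               for a, b in itertools.combinations(feature_maps, 2))
```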
- Author(s): Samiran Das ; Aurobinda Routray ; Alok Kanti Deb
- Source: IET Image Processing, Volume 13, Issue 12, p. 2077–2085
- DOI: 10.1049/iet-ipr.2018.5426
- Type: Article
The availability of a large number of application-specific spectral libraries has generated a great deal of interest in semi-blind unmixing of hyperspectral images in both the remote sensing and signal processing communities. This study presents a novel, semi-supervised, parameter-free algorithm that employs sparsity measures for library pruning. The overall algorithm includes sparsity-criteria-based library pruning and a sparse inversion method for abundance computation. In the pruning process, each library element is removed from the spectral library in turn and the corresponding sparse abundance matrix is computed. The library elements that lead to higher sparsity are adjudged to be image endmembers, based on the assumption that elimination of an actual image endmember enhances the sparsity level. The authors also present a detailed exploration of standard sparsity measures. They compute the abundances of the pruned library by maximising the Gini index or pq-norm sparsity, which satisfy the desirable sparsity properties and are easier to compute. The abundance calculation task is solved using the adaptive direction method of multipliers. Experimental results on several real and synthetic image datasets demonstrate the computational efficiency and proficiency of the authors' method in the presence of noise and a highly coherent spectral library.
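A common form of the Gini sparsity index that could score each leave-one-out pruning looks like the following (the Hurley–Rickard formulation is assumed here; the paper may use a different normalisation, and the sparse solver is only a placeholder).

```python
def gini_index(c):
    """Gini sparsity index of a non-negative abundance vector: 0 for a
    perfectly uniform vector, approaching 1 as it becomes maximally sparse."""
    c = sorted(abs(v) for v in c)
    n, s = len(c), sum(c)
    return 1 - 2 * sum(v / s * (n - k - 0.5) / n for k, v in enumerate(c))

def leave_one_out_scores(library, solve_abundance):
    """Sparsity score after removing each library element in turn;
    `solve_abundance` stands in for the sparse inversion step (solved with
    the adaptive direction method of multipliers in the paper)."""
    return [gini_index(solve_abundance(library[:i] + library[i + 1:]))
            for i in range(len(library))]
```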
- Author(s): Abdelrahman Karawia
- Source: IET Image Processing, Volume 13, Issue 12, p. 2086–2097
- DOI: 10.1049/iet-ipr.2018.5142
- Type: Article
Recently, a few researchers have investigated image encryption algorithms using different chaotic economic maps (CEMs), examining the effect of these maps on the encryption of the plain image. In the current study, an image encryption algorithm based on Fisher–Yates shuffling (FYS) combined with a three-dimensional (3D) CEM is given. FYS generates a random permutation of a finite sequence; it is first used to shuffle the rows and the columns of the plain image. Second, the 3D CEM is used in the substitution stage to confuse the pixels of the shuffled image. The proposed algorithm is applied to several types of images, and many measurements are performed to check its security and performance. In addition, numerical simulations and experimental results verify that the proposed algorithm can resist different types of attack.
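The permutation stage can be sketched directly: a Fisher–Yates shuffle applied to the row and column indices of the plain image. In the paper the permutation is driven by the chaotic economic map; a seeded PRNG stands in for it below.

```python
import random

def fisher_yates_perm(n, rng):
    """Random permutation of range(n) built with the Fisher-Yates shuffle."""
    p = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randint(0, i)
        p[i], p[j] = p[j], p[i]
    return p

def shuffle_image(img, key):
    """Permute the rows and then the columns of `img` (a list of rows)."""
    rng = random.Random(key)                 # chaotic-map stand-in
    rows = fisher_yates_perm(len(img), rng)
    cols = fisher_yates_perm(len(img[0]), rng)
    return [[img[r][c] for c in cols] for r in rows]
```

The shuffle only relocates pixels (the histogram is unchanged), which is why the scheme follows it with the CEM-driven substitution stage.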
- Author(s): Zhuo Zhao ; Bing Li ; Lei Chen ; Meiting Xin ; Fei Gao ; Qiang Zhao
- Source: IET Image Processing, Volume 13, Issue 12, p. 2098–2105
- DOI: 10.1049/iet-ipr.2018.5824
- Type: Article
In this study, a novel interest point detection algorithm that combines image intensity variation and edge contour information is proposed. Firstly, the Canny detector is used to extract the edge map. Secondly, the imaginary parts of multi-scale Gabor filters are applied to smooth the input image, from which normalised information entropies at various scales are acquired. Finally, the product of the normalised information entropies at different scales serves as a new measure for interest point detection. This method has two advantages: on the one hand, detection accuracy is greatly improved because combined information, including contour shape and the grey-level variation of edge pixels and their neighbours, is used to extract interest points; on the other hand, non-interest points are well inhibited because the multi-scale product serves as the interest point measure. Desirable noise robustness and time efficiency are also validated through experiments. Compared with four other state-of-the-art methods, the proposed method shows excellent performance in terms of geometric transformations and localisation accuracy of repeated interest points.
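The multi-scale entropy product can be sketched as follows, with histogram-based Shannon entropy over local patches; the 8-bin histogram and patch handling are illustrative assumptions, not the paper's exact formulation.

```python
import math

def entropy(patch, bins=8):
    """Normalised Shannon entropy of grey values in [0, 256), via a
    histogram (the bin count is an illustrative choice)."""
    hist = [0] * bins
    for v in patch:
        hist[v * bins // 256] += 1
    n = len(patch)
    h = -sum(c / n * math.log(c / n) for c in hist if c)
    return h / math.log(bins)          # scale into [0, 1]

def interest_measure(patches_per_scale):
    """Product of normalised entropies of the same neighbourhood at each
    smoothing scale: the measure stays high only where every scale still
    shows grey-level variation, which inhibits spurious responses."""
    m = 1.0
    for patch in patches_per_scale:
        m *= entropy(patch)
    return m
```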
- Author(s): Yun-Qiu Lv ; Kai Liu ; Fei Cheng ; Wei Li
- Source: IET Image Processing, Volume 13, Issue 12, p. 2106–2115
- DOI: 10.1049/iet-ipr.2018.6517
- Type: Article
Deep learning has been widely used in many visual recognition tasks owing to its powerful representation ability. However, online learning is a bottleneck that obstructs the application of deep learning in visual tracking. Although many algorithms discard online learning during tracking, they then demonstrate poor robustness because they cannot adapt online to appearance changes of the target. In this study, the authors design a tree structure specifically for online learning, which enables the appearance model to be updated smoothly; once the target appearance has changed severely, a new branch is generated to avoid a fuzzy classification boundary. In addition, an active learning technique and artificial data are employed in the update to make the best of the limited knowledge about the object of interest during tracking. The proposed algorithm is evaluated on the OTB2013 and VOT2017 benchmarks and outperforms many state-of-the-art methods.
- Author(s): Pabitra Pal ; Biswapati Jana ; Jaydeb Bhaumik
- Source: IET Image Processing, Volume 13, Issue 12, p. 2116–2129
- DOI: 10.1049/iet-ipr.2018.6638
- Type: Article
In this study, the authors employ a special type of periodic-boundary cellular automaton (a CA attractor) for image authentication and tamper detection through a watermarking scheme. An authentication code (AC) is generated by applying secure hash algorithm-512 to the watermark image. The cover image (CI) is subdivided into four sub-sampled interpolated images, in which the AC and secret watermark bits are embedded to enhance capacity, quality and security. At the receiver end, the secret watermark information and the CI are extracted without any distortion from the four sub-sampled watermarked images (WIs). Additionally, the proposed scheme can detect any type of distortion within the WI caused by various steganographic attacks. Compared with similar existing schemes, better results in terms of capacity and quality are obtained. The scheme thus offers valuable properties for image authentication and ownership identification, benefiting numerous government and private-sector applications, including health care, commercial security, defence and intellectual property rights.
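The AC generation step maps onto the standard SHA-512 primitive; the 64-bit truncation below is an illustrative choice, since the excerpt does not state how many digest bits are embedded.

```python
import hashlib

def authentication_code(watermark, n_bits=64):
    """Authentication code: the first n_bits of SHA-512 over the watermark
    image bytes, returned as a bit list ready for embedding."""
    digest = hashlib.sha512(watermark).digest()
    bits = []
    for byte in digest:
        for k in range(7, -1, -1):      # most significant bit first
            bits.append((byte >> k) & 1)
            if len(bits) == n_bits:
                return bits
    return bits
```

At the receiver, recomputing the AC from the extracted watermark and comparing it with the embedded bits reveals whether the image has been tampered with.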
- Author(s): Rahul Sarkar ; Chandra Churh Chatterjee ; Animesh Hazra
- Source: IET Image Processing, Volume 13, Issue 12, p. 2130–2142
- DOI: 10.1049/iet-ipr.2018.6669
- Type: Article
Melanoma is one of the four major types of skin cancer, caused by malignant growth of the melanocyte cells. It is the rarest, accounting for only 1% of all skin cancer cases, yet it is the deadliest of all skin cancer types, and its rarity makes efficient diagnosis rather difficult. Here, a deep depthwise separable residual convolutional network is introduced to perform binary melanoma classification on a dermoscopic skin lesion image dataset. Prior to training, noise is removed from the images using a non-local means filter, followed by enhancement using contrast-limited adaptive histogram equalisation over a discrete wavelet transform. Images are fed to the model as multi-channel matrices, with channels chosen across multiple colour spaces according to their ability to optimise the model's performance. The model's lesion detection and classification abilities are verified by monitoring gradient-weighted class activation maps and saliency maps, respectively, and its generality is shown through its performance on multiple skin lesion image datasets. The proposed model achieved accuracies of 99.50% on the international skin imaging collaboration (ISIC), 96.77% on PH2, 94.44% on DermIS and 95.23% on MED-NODE datasets.
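Building a multi-channel input whose channels come from more than one colour space can be sketched per pixel; the RGB + HSV combination here is an assumption for illustration, since the paper selects its channels empirically.

```python
import colorsys

def multi_colourspace_channels(pixel_rgb):
    """Stack channels drawn from two colour spaces for one pixel; the
    network input would be one such stack per pixel position."""
    r, g, b = (v / 255 for v in pixel_rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return [r, g, b, h, s, v]
```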
- Author(s): Jianxin Liao ; Baoran Li ; Di Yang ; Jingyu Wang ; Qi Qi ; Jing Wang
- Source: IET Image Processing, Volume 13, Issue 12, p. 2143–2151
- DOI: 10.1049/iet-ipr.2018.6644
- Type: Article
Hashing has been widely deployed for approximate nearest neighbour search in large-scale multimedia retrieval tasks owing to its storage and retrieval efficiency. State-of-the-art supervised hashing methods for image retrieval construct deep structures that simultaneously learn the image representation and generate good hash codes. Existing methods use a similarity loss and a regularity loss to train deep hashing systems, but these two terms do not always cooperate, which may lead to inadequate performance of the whole system. In this study, a new method for training a deep hashing system to learn compact binary codes is presented. The deep supervised hashing network with integrated regularisation (DSHIR) introduces a zero-division restriction as a new part of the loss function, which solves the problem of cooperatively guiding the system to generate similarity-preserving binary codes. DSHIR also modifies the similarity loss to better extract features from image data, improving performance compared with existing end-to-end deep hashing systems. Experiments show that DSHIR yields about 10% higher mean average precision on the CIFAR-10 dataset and also improves other evaluation indexes compared with state-of-the-art systems.
- Author(s): Xin Lai ; Le Zhou ; Zeyu Fu ; Syed Mohsen Naqvi ; Jonathon Chambers
- Source: IET Image Processing, Volume 13, Issue 12, p. 2152–2161
- DOI: 10.1049/iet-ipr.2018.6322
- Type: Article
To obtain the best pooling effect and higher accuracy in image recognition, an improved method based on optimal search theory is proposed for the pooling layer of convolutional neural networks (CNNs). The purpose is to solve the problems of the traditional pooling method, namely that it is too simplistic and struggles to extract effective features. The basic principle and network structure of the CNN are introduced, a new optimum-pooling method is proposed, and the authors study how to maximise the probability of detecting the target function under the constrained condition. Comparison experiments with different pooling methods are performed on three widely used datasets: LFW, CIFAR-10, and ImageNet. The experimental results show that the proposed method extracts features more effectively, adapts widely, and leads to higher accuracy and a lower error rate in image recognition.
- Author(s): Daniela Moctezuma ; Eric S. Tellez ; Sabino Miranda-Jiménez ; Mario Graff
- Source: IET Image Processing, Volume 13, Issue 12, p. 2162–2168
- DOI: 10.1049/iet-ipr.2019.0083
- Type: Article
Intelligent surveillance systems in multi-camera environments pose a hard open problem for computer vision. The way people look changes within and across cameras, so the person re-identification task can be largely improved by collecting data about people already identified and taking advantage of it as time advances in the surveillance video. Furthermore, a camera change or a slight change in the subject's traits may require the complete re-formulation of the appearance models. In this paper, we propose several heuristics for updating the appearance model in a multi-camera surveillance environment, so that a subject's appearance model is updated across different times and environmental conditions. The update process is carried out primarily in three ways: 1) based on time lapses, 2) based on camera changes, and 3) based on the automatic selection of the most representative samples through the classifier's decision functions. The proposed system focuses on video surveillance environments; that is, the objective is to identify an individual across the set of cameras in the surveillance area, and the comparison considers only those people that share time and space. We used four public benchmarks to test our claims; the results confirm the importance of continuously updating the appearance model.
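The three update triggers can be condensed into a single heuristic gate; the threshold values below (30 frames, 0.8 decision score) are illustrative assumptions, not the paper's settings.

```python
def should_update(sample_score, frames_since_update, camera_changed,
                  lapse=30, margin=0.8):
    """Gate for refreshing a subject's appearance model: trigger on a time
    lapse, on a camera change, or when the classifier's decision score
    marks the current sample as highly representative."""
    return (frames_since_update >= lapse
            or camera_changed
            or sample_score >= margin)
```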
- Author(s): Davood Akbari
- Source: IET Image Processing, Volume 13, Issue 12, p. 2169–2175
- DOI: 10.1049/iet-ipr.2018.5693
- Type: Article
This study proposes a modified approach for improving the spectral–spatial classification of hyperspectral images. The spatial information is obtained by an enhanced marker-based hierarchical segmentation (MHS) algorithm. A weighted genetic algorithm is first employed to obtain a subspace of the hyperspectral data, and the obtained features are fed into a multi-layer perceptron (MLP) neural network classifier. The MHS algorithm is then applied in order to increase the accuracy of less accurately classified land-cover types. In the proposed approach, named MLP-MHS, the markers are extracted from the classification maps obtained by the MLP and support vector machine classifiers. Experiments on two benchmark hyperspectral datasets, Pavia University and Berlin, validate the soundness of the proposed approach compared with the MLP and original MHS algorithms.
- Author(s): Anubha Pearline Sundara Sobitha Raj and Sathiesh Kumar Vajravelu
- Source: IET Image Processing, Volume 13, Issue 12, p. 2176–2182
- DOI: 10.1049/iet-ipr.2019.0346
- Type: Article
Plant species recognition is performed using a dual deep learning architecture (DDLA) approach, consisting of the MobileNet and DenseNet-121 architectures. The feature vectors obtained from the individual architectures are concatenated to form a final feature vector. The extracted features are then classified using machine learning (ML) classifiers such as linear discriminant analysis, multinomial logistic regression (LR), naive Bayes, classification and regression trees, k-nearest neighbour, a random forest classifier, a bagging classifier and a multi-layer perceptron. The datasets considered are the standard Flavia, Folio and Swedish Leaf datasets and a custom-collected dataset (Leaf-12). The MobileNet and DenseNet-121 architectures are also used standalone as feature extractor and classifier. The DDLA with the LR classifier produced the highest accuracies of 98.71, 96.38, 99.41 and 99.39% for the Flavia, Folio, Swedish Leaf and Leaf-12 datasets, respectively. The observed accuracy for DDLA + LR is higher than that of the other approaches (DDLA + other ML classifiers, MobileNet + ML classifiers, DenseNet-121 + ML classifiers, MobileNet + fully connected layer (FCL), DenseNet-121 + FCL), and it is achieved in comparable computation time.
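The defining DDLA step, concatenating the two backbone feature vectors before handing them to an ML classifier, amounts to the following (the vector contents are illustrative; the actual descriptors come from the pretrained backbones):

```python
def ddla_features(mobilenet_vec, densenet_vec):
    """Concatenate the MobileNet and DenseNet-121 descriptors into the
    final DDLA feature vector used by the downstream classifier."""
    return list(mobilenet_vec) + list(densenet_vec)
```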
- Author(s): Mingfeng Jiang ; Liang Lu ; Yi Shen ; Long Wu ; Yinglan Gong ; Ling Xia ; Feng Liu
- Source: IET Image Processing, Volume 13, Issue 12, p. 2183–2189
- DOI: 10.1049/iet-ipr.2018.5614
- Type: Article
Compressed sensing magnetic resonance imaging (CS-MRI) is an effective way of reducing the sampled data in k-space and shortening the scanning time. Motivated by the high performance of directional tensor product complex tight framelets (TPCTFs) in image denoising, the authors propose a novel framework that integrates TPCTFs for sparse representation with the projected fast iterative soft-thresholding algorithm (pFISTA) for CS-MRI reconstruction. Furthermore, to exploit the cross-scale relations in the wavelet tree of frame coefficients, a bivariate shrinkage (BS) function with local variance estimation is adopted for shrinkage thresholding. TPCTFs provide sparse directional representations well suited to MR images. Compared with other state-of-the-art CS-MRI algorithms in numerical experiments, the proposed TPCTF-BS method achieves higher reconstruction quality with respect to image edge preservation and artefact suppression.
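The bivariate shrinkage rule, which exploits a coefficient together with its parent at the coarser scale, can be sketched as below; the Sendur–Selesnick form of BS is assumed, and the paper's exact variant with local variance estimation may differ in the threshold.

```python
import math

def bivariate_shrink(w, w_parent, noise_var, local_std):
    """Shrink frame coefficient w given its parent w_parent at the next
    coarser scale; local_std is the locally estimated signal deviation."""
    mag = math.sqrt(w * w + w_parent * w_parent)
    thr = math.sqrt(3) * noise_var / max(local_std, 1e-12)
    return w * max(mag - thr, 0.0) / max(mag, 1e-12)
```

Small coefficients with small parents are suppressed, while coefficients that are large at both scales pass nearly unchanged, which is the cross-scale behaviour the abstract refers to.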
- Author(s): Shaheera Rashwan ; Nicolas Dobigeon ; Walaa Sheta ; Hanan Hassan
- Source: IET Image Processing, Volume 13, Issue 12, p. 2190–2195
- DOI: 10.1049/iet-ipr.2018.5094
- Type: Article
The spatial resolution of common multispectral and hyperspectral sensors is generally not sufficient to prevent multiple elementary materials from contributing to the observed spectrum of a single pixel. To alleviate this limitation, spectral unmixing is a by-pass procedure that decomposes the observed spectra of these mixed pixels into a set of component spectra, or endmembers, and a set of corresponding proportions, or abundances, representing the proportion of each endmember in those pixels. In this study, a spectral unmixing technique is proposed to handle the challenging scenario of non-linear mixtures. The algorithm relies on a dedicated implementation of multiple-kernel learning using a self-organising map as a solver for the non-linear unmixing problem. Based on a priori knowledge of the endmember spectra, it estimates their relative abundances without specifying the non-linear model under consideration. Experiments comparing it to state-of-the-art algorithms on synthetic yet realistic images and on real hyperspectral images assess the potential and effectiveness of this unmixing strategy. Finally, the relevance and potential parallel implementation of the proposed method are demonstrated.
- Author(s): Noor Baha Aldin ; Mahmut Aykaç ; Shaima Baha Aldin
- Source: IET Image Processing, Volume 13, Issue 12, p. 2196–2203
- DOI: 10.1049/iet-ipr.2018.5908
- Type: Article
Although the high-efficiency video codec plays a leading role among video coding standards, its superior video quality and coding performance come at the cost of considerable memory resources and processing time. The computation of rate-distortion optimisation is one of the causes of the encoder's bottleneck. Therefore, a new technique is introduced here based on an entropy threshold and the coding unit size, which reduces the coding time and bit rate, making the codec more suitable for real-time processing. According to the obtained results, the encoding time is improved by 56.596%, with a bit-rate reduction reaching 74% on average.
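The early-termination idea, skipping the full rate-distortion search when a coding unit's grey-level entropy is low, can be sketched as follows; the histogram entropy and the threshold value are illustrative assumptions, not the paper's exact criterion.

```python
import math

def block_entropy(block):
    """Shannon entropy (bits) of the grey-level histogram of a CU block."""
    n = len(block)
    h = 0.0
    for v in set(block):
        p = block.count(v) / n
        h -= p * math.log2(p)
    return h

def split_cu(block, threshold=1.0):
    """Early decision: run the costly rate-distortion search only when the
    block's entropy reaches the threshold; homogeneous blocks are kept
    whole, saving encoding time."""
    return block_entropy(block) >= threshold
```

In practice the threshold would depend on the coding unit size, as the abstract indicates.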
- Author(s): Shaojun Qu ; Qiaoliang Li ; Ming Chen
- Source: IET Image Processing, Volume 13, Issue 12, p. 2204–2212
- DOI: 10.1049/iet-ipr.2018.6241
- Type: Article
Effective and efficient image segmentation is an important task in computer vision. As fully automatic segmentation of natural images is usually difficult, interactive schemes are an attractive solution. Here, to overcome the quality and speed limitations of SSNCut, the authors propose a new interactive image segmentation method based on superpixels, must-link and cannot-link constraints, and improved normalised cuts. The main contributions of their work are as follows: first, the similarity between two superpixel regions is calculated using the Bhattacharyya distance; second, the weights of the must-link and cannot-link constraints are adaptively modified. Compared with SSNCut, their method greatly improves segmentation accuracy. Comparative experiments on open datasets show that the proposed method obtains better results than SSNCut, GrabCut in one cut, interactive segmentation using a binary partition tree, interactive graph cut, seeded region growing, and simple interactive object extraction.
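The region-similarity step can be sketched with the standard Bhattacharyya distance between normalised colour histograms of two superpixels (the histograms here are illustrative):

```python
import math

def bhattacharyya(p, q):
    """Bhattacharyya distance between two normalised histograms: 0 for
    identical distributions, growing as their overlap shrinks."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))   # Bhattacharyya coefficient
    return -math.log(max(bc, 1e-12))                    # guard against log(0)
```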
- Author(s): Xin Zhao ; Fan Guo ; Yuxiang Mai ; Jin Tang ; Xuanchu Duan ; Beiji Zou ; Lingzi Jiang
- Source: IET Image Processing, Volume 13, Issue 12, p. 2213–2223
- DOI: 10.1049/iet-ipr.2019.0137
- Type: Article
Glaucoma is a chronic eye disease that leads to irreversible vision loss and is called the 'silent thief of sight'. An automatic glaucoma screening pipeline, from optic disc (OD) localisation to glaucoma risk prediction, is therefore proposed in this study. The pipeline consists of three main phases. Firstly, the OD is localised by morphological processing and sliding-window methods. Secondly, a novel U-shaped convolutional neural network with a concatenating path and a fusion loss function is developed to segment the OD and optic cup (OC) simultaneously. Thirdly, clinical measurements, including the optic cup-to-disc ratio (CDR) and neuroretinal rim-related features, are combined with hidden features, including statistical moments, entropy and energy, to train glaucoma classifiers. According to the experimental results, the proposed segmentation network achieves the best performance on both OD and OC segmentation, and the proposed CDR calculation method performs similarly to an ophthalmologist on CDR measurement. Moreover, the authors' glaucoma classification model obtains the best sensitivity and area-under-the-curve score in comparison with existing methods.
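A minimal sketch of CDR computation from the two segmentation masks, assuming the vertical cup-to-disc ratio is taken as the ratio of the masks' vertical extents (the paper's exact measurement may differ):

```python
def vertical_cup_to_disc_ratio(cup_mask, disc_mask):
    """CDR from binary segmentation masks (lists of rows of 0/1): the ratio
    of the vertical extents of the optic cup and the optic disc."""
    def vertical_extent(mask):
        rows = [i for i, row in enumerate(mask) if any(row)]
        return rows[-1] - rows[0] + 1 if rows else 0
    return vertical_extent(cup_mask) / vertical_extent(disc_mask)
```

A large CDR (commonly above roughly 0.6 in clinical practice) is one of the indicators feeding the glaucoma classifier.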
- Author(s): Changzhi Yu ; Hengjian Li ; Xiyu Wang
- Source: IET Image Processing, Volume 13, Issue 12, p. 2224–2232
- DOI: 10.1049/iet-ipr.2018.5912
- Type: Article
An image compression, encryption, and identity authentication scheme based on singular value decomposition (SVD) is proposed here. The scheme can not only encrypt image data to be stored in the cloud but also implement identity authentication. The authors use the SVD to decompose the image data into three parts: the left singular matrix, the right singular matrix, and the singular value matrix. As the left and right singular matrices are less critical than the singular value matrix, a logistic-tent-sine chaotic system is proposed to encrypt them. The scheme also includes a novel authentication value calculation algorithm, which computes the authentication value from the related data. Because the authentication value is calculated from the ciphertext, the algorithm shows excellent authentication performance, even in scenarios where the image is cropped or corrupted by noise. Theoretical analysis and empirical evaluations show that the proposed system achieves good compression performance, satisfactory security, and low computational complexity.
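The encryption of the less critical SVD factors can be sketched as a keystream XOR; a plain logistic map stands in for the paper's logistic-tent-sine system, and the seed, map parameter and transient length below are illustrative assumptions.

```python
def logistic(x, r=3.99):
    """One logistic-map step, standing in for the logistic-tent-sine system."""
    return r * x * (1 - x)

def chaotic_keystream(seed, n, burn_in=200):
    """Byte keystream from the chaotic orbit; early iterations are
    discarded so the output does not depend on the transient."""
    x, out = seed, []
    for _ in range(burn_in + n):
        x = logistic(x)
        out.append(int(x * 256) % 256)
    return out[burn_in:]

def xor_encrypt(data, seed=0.3571):
    """XOR the data (e.g. quantised singular-matrix bytes) with the
    keystream; applying it twice with the same seed decrypts."""
    return bytes(d ^ k for d, k in zip(data, chaotic_keystream(seed, len(data))))
```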
- Author(s): Meryem Souaidi and Mohamed El Ansari
- Source: IET Image Processing, Volume 13, Issue 12, p. 2233–2244
- DOI: 10.1049/iet-ipr.2019.0415
- Type: Article
Wireless capsule endoscopy (WCE) has proven to be a robust technology for examining the entire digestive tract, including the small intestine. An automatic computer-aided diagnosis method is proposed in this study to differentiate between ulcerous and normal WCE images. A multi-scale analysis based on the grey-level co-occurrence matrix (GLCM) is conducted: the GLCM is computed from each sub-band of a Laplacian pyramid decomposition so as to extract the common Haralick features. Moreover, the p-value and area under the curve are used to select the relevant characteristics from the feature descriptor. The approach was applied separately to the components of the CIELab colour space, and ulcer detection was performed using a support vector machine. The findings demonstrate encouraging detection performance of 95.38% accuracy and 97.42% sensitivity on the first dataset, and an average accuracy of 99.25% and sensitivity of 98.51% on the second dataset.
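The per-sub-band texture step can be sketched for a single-offset GLCM and one Haralick feature (contrast); the pyramid decomposition, the remaining features and the feature selection are omitted, and the tiny 2-level image is illustrative.

```python
def glcm(img, dx=1, dy=0, levels=4):
    """Grey-level co-occurrence matrix for one pixel offset, normalised
    to joint probabilities."""
    h, w = len(img), len(img[0])
    m = [[0.0] * levels for _ in range(levels)]
    total = 0
    for y in range(h - dy):
        for x in range(w - dx):
            m[img[y][x]][img[y + dy][x + dx]] += 1
            total += 1
    return [[v / total for v in row] for row in m]

def haralick_contrast(p):
    """Haralick contrast feature of a normalised GLCM."""
    return sum(p[i][j] * (i - j) ** 2
               for i in range(len(p)) for j in range(len(p)))
```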
- Author(s): Yunyun Yang and Wenjing Jia
- Source: IET Image Processing, Volume 13, Issue 12, p. 2245–2254
- DOI: 10.1049/iet-ipr.2019.0698
- Type: Article
Accurate segmentation of medical images plays a very important role in clinical diagnosis, so segmentation technology for medical images is attracting more and more attention. However, most medical images suffer from severe intensity inhomogeneity, which makes accurate segmentation difficult. In this study, the authors propose an efficient and robust active contour model for simultaneous image segmentation and correction. The proposed model can accurately segment images with severe intensity inhomogeneity and serious noise, and can also eliminate the intensity-varying information to obtain homogeneous corrected images. The authors first present the level set formulation of the two-phase model, which is then extended to a multi-phase formulation, and the split Bregman method is applied to efficiently minimise the proposed energy functionals. The model is tested on numerous synthetic and medical images with promising results: it accurately segments and corrects inhomogeneous images with serious noise, and quantitative comparisons with other models show it to be more accurate and more efficient. Moreover, the proposed model is insensitive to the initial contour and robust to noise.
- Author(s): Farnaam Samadi ; Gholamreza Akbarizadeh ; Hooman Kaabi
- Source: IET Image Processing, Volume 13, Issue 12, p. 2255 –2264
- DOI: 10.1049/iet-ipr.2018.6248
- Type: Article
In solving the change detection problem, unsupervised methods are usually preferred to their supervised counterparts due to the difficulty of producing labelled data. Nevertheless, in this study, a supervised deep learning-based method is presented for change detection in synthetic aperture radar (SAR) images. A deep belief network (DBN) was employed as the deep architecture in the proposed method, and the training process of this network comprised unsupervised feature learning followed by supervised network fine-tuning. From a general perspective, the trained DBN produces a change detection map as its output. Studies on DBNs demonstrate that they do not produce ideal output without a proper training dataset. Therefore, the proposed method provides a dataset with appropriate volume and diversity for training the DBN, built from the input images and from images obtained by applying morphological operators to them. High computational load and time-consuming simulation are drawbacks of deep learning-based algorithms. To overcome these disadvantages, a method was introduced to greatly reduce the computations without compromising the performance of the trained DBN. Experimental results indicated that the proposed method had an acceptable implementation time in addition to its desirable performance and high accuracy.
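The morphological data-enrichment idea — deriving extra training variants from a binary change map with erosion and dilation — can be sketched as follows. The DBN itself is omitted; `make_training_variants` and the square structuring element are illustrative assumptions, not the paper's exact operators.

```python
import numpy as np

def erode(img, k=1):
    """Binary erosion with a (2k+1)x(2k+1) square structuring element."""
    out = img.copy()
    for dy in range(-k, k + 1):
        for dx in range(-k, k + 1):
            out = out & np.roll(np.roll(img, dy, 0), dx, 1)
    return out

def dilate(img, k=1):
    """Binary dilation with the same structuring element."""
    out = img.copy()
    for dy in range(-k, k + 1):
        for dx in range(-k, k + 1):
            out = out | np.roll(np.roll(img, dy, 0), dx, 1)
    return out

def make_training_variants(change_map):
    """Augment a coarse change map with morphological variants, giving the
    network more diverse training examples (the paper's idea, simplified)."""
    return [change_map, erode(change_map), dilate(change_map)]
```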
- Author(s): Dong Zhang ; Junhua Zhang ; Zheng Wang ; Meijun Sun
- Source: IET Image Processing, Volume 13, Issue 12, p. 2265 –2270
- DOI: 10.1049/iet-ipr.2018.5398
- Type: Article
Tongue diagnosis is an important concept in Traditional Chinese Medicine (TCM). The tongue colour and coating can aid understanding of the body's physiological mechanisms, as well as the pathology of diseases. Existing research has focused on digital images and tongue colour classification, without considering the other visible bands of information in the tongue. In this study, a visible hyperspectral imaging system, with an approximate spectral range of 400–1000 nm, was used to predict the tongue colour values and the coating position in TCM, and a stacked autoencoder (SAE) prediction model based on spectral–spatial features was employed to digitise the tongue colour space and the coating. The experimental results show the effectiveness of the spectral–spatial features with the SAE model in predicting the CIELAB values of L and a and the coating position; thus, the authors provide a new technique for the objective, digitised development of TCM.
- Author(s): Bhakti Palkar and Dhirendra Mishra
- Source: IET Image Processing, Volume 13, Issue 12, p. 2271 –2280
- DOI: 10.1049/iet-ipr.2018.5609
- Type: Article
Image fusion is the process of merging multiple images to generate a single image, called the ‘fused image’, which is more informative than the input images in terms of human perception and machine processing. In medical applications, images of the same or different modalities are fused to generate a new image which helps clinicians in reliable and accurate diagnosis. A fused image of mono-modal medical images is used to compare pre- and post-operative results. Multi-modal medical images are fused for treatment or surgical planning. In this study, the authors have focused on the fusion of lumbar spine images of two completely different modalities: computed tomography (CT) and magnetic resonance imaging (MRI). CT provides bony details whereas MRI provides soft-tissue details. Since the two images are captured using two different machines, they need to be strictly aligned with each other before fusion. Kekre's hybrid wavelet transform (KHWT) is used to fuse the registered images using combinations of six different orthogonal transforms with four different transform sizes. It is compared with five other fusion methods both qualitatively and quantitatively. The overall comparison indicates that the fused image generated using KHWT is better than the input images in terms of content, quality and contrast.
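Coefficient-level fusion can be illustrated with a plain one-level Haar DWT standing in for KHWT (which combines six orthogonal transforms). The numpy sketch below averages the approximation band and keeps the larger-magnitude detail coefficients — a common fusion rule assumed here for illustration, not necessarily the paper's.

```python
import numpy as np

def haar2(x):
    """One-level 2-D Haar transform: returns (LL, LH, HL, HH)."""
    a = (x[0::2] + x[1::2]) / 2.0            # row averages
    d = (x[0::2] - x[1::2]) / 2.0            # row details
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def ihaar2(ll, lh, hl, hh):
    """Exact inverse of haar2."""
    a = np.empty((ll.shape[0], 2 * ll.shape[1]))
    d = np.empty_like(a)
    a[:, 0::2], a[:, 1::2] = ll + lh, ll - lh
    d[:, 0::2], d[:, 1::2] = hl + hh, hl - hh
    x = np.empty((2 * ll.shape[0], 2 * ll.shape[1]))
    x[0::2], x[1::2] = a + d, a - d
    return x

def fuse(img1, img2):
    """Fuse two registered images: average the approximation band,
    keep the larger-magnitude detail coefficients."""
    c1, c2 = haar2(img1.astype(float)), haar2(img2.astype(float))
    fused = [(c1[0] + c2[0]) / 2.0]
    for b1, b2 in zip(c1[1:], c2[1:]):
        fused.append(np.where(np.abs(b1) >= np.abs(b2), b1, b2))
    return ihaar2(*fused)
```

For production use, `pywt.dwt2`/`pywt.idwt2` provide multi-wavelet, multi-level decompositions.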
- Author(s): Mengmeng Liao and Xiaodong Gu
- Source: IET Image Processing, Volume 13, Issue 12, p. 2281 –2293
- DOI: 10.1049/iet-ipr.2018.5263
- Type: Article
Commonly used sparse-based methods, such as the sparse representation classifier, have achieved good recognition results in face recognition. However, several problems exist in those methods. First, they assume that every atom is equally important in representing the query samples. This is unreasonable because different atoms contain different amounts of information, so their importance should differ when they jointly represent a query sample. Second, those methods cannot meet real-time requirements when dealing with large datasets. In this study, on the one hand, the authors propose a fast extended sparse-weighted representation classifier (FESWRC) by considering the different importance of atoms and using the primal augmented Lagrangian method as well as principal component analysis. On the other hand, the authors propose a distinctive feature descriptor, named the logarithmic-weighted sum (LWS) feature descriptor. FESWRC and LWS are combined for face recognition; the resulting method is called the face recognition algorithm based on a feature descriptor and weighted linear sparse representation (FDWLSR). Experimental results show that FDWLSR can realise real-time recognition, with recognition rates of 100.0, 100.0, 91.6, 93.4 and 87.4%, respectively, on the Yale, Olivetti Research Laboratory (ORL), Faculdade de Engenharia Industrial (FEI), Face Recognition Technology (FERET) and Labelled Faces in the Wild datasets.
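The weighted-atom idea can be illustrated with a closed-form ℓ2 (collaborative) variant instead of FESWRC's ℓ1 problem and its augmented Lagrangian solver — a deliberate simplification for the sketch. Each atom gets its own regularisation weight, and the query is assigned to the class with the smallest reconstruction residual.

```python
import numpy as np

def wcrc_classify(D, labels, y, w=None, lam=1e-3):
    """Weighted regularised representation classifier: solve
    min_x ||y - D x||^2 + lam ||W x||^2 (W diagonal, one weight per atom),
    then assign y to the class with the smallest reconstruction residual."""
    w = np.ones(D.shape[1]) if w is None else np.asarray(w, float)
    x = np.linalg.solve(D.T @ D + lam * np.diag(w ** 2), D.T @ y)
    residuals = {}
    for c in set(labels):
        mask = np.array([l == c for l in labels])
        # residual using only this class's atoms and coefficients
        residuals[c] = np.linalg.norm(y - D[:, mask] @ x[mask])
    return min(residuals, key=residuals.get)
```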
- Author(s): Xuan Fei ; Lei Li ; Heling Cao ; Jianyu Miao ; Renping Yu
- Source: IET Image Processing, Volume 13, Issue 12, p. 2294 –2303
- DOI: 10.1049/iet-ipr.2019.0295
- Type: Article
Compressed sensing (CS) multi-camera network reconstruction has attracted much attention in the field of distributed CS networks. However, many CS-based multi-camera network reconstructions recover every image separately; the inter-view dependency and geometrical structure among the multi-view images are rarely considered in this way, which results in unsatisfactory joint reconstruction. Here, the authors extract the multiple-view geometry from the multi-view images to construct a view-dependency observation model. Based on the proposed parametric transformation observation model, they propose a novel CS joint reconstruction method for multi-view images guided by spatial correlation and low-rank background constraints. The eventual optimisation model can be relaxed to a series of convex optimisation problems, which can be efficiently solved by combining variable splitting and alternate iteration techniques. Extensive experimental results indicate that the proposed method achieves a remarkable improvement in both objective criteria and visual fidelity compared with other competitive reconstruction methods.
- Author(s): Chuan Lin ; Fuzhang Li ; Yijun Cao ; Haojun Zhao
- Source: IET Image Processing, Volume 13, Issue 12, p. 2304 –2313
- DOI: 10.1049/iet-ipr.2019.0214
- Type: Article
Relevant physiological studies have revealed that the response of the classical receptive field (CRF) to visual stimuli can be suppressed by non-CRF (nCRF) inhibition in the primary visual cortex (V1). Based on this mechanism, many bio-inspired contour detection models have been proposed, which are mainly achieved through the calculation of CRF responses and nCRF surround inhibition. In fact, the dynamic characteristics of neurons play an important role in contour detection in biological vision. Inspired by these visual mechanisms, the authors propose a contour detection model that emulates these dynamic characteristics. By introducing a multi-bandwidth Gabor filter, they can effectively adjust the weight ratios of the filter according to the target image, protecting contours while filtering background textures in the calculation of CRF responses. Additionally, they logarithmically modulate the nCRF inhibition kernel to make texture suppression more flexible and effective, thus improving the accuracy of the detection algorithm as a whole. Compared with existing bio-inspired contour detection models, the proposed model is more effective at contour detection, which will aid engineering applications that utilise pattern recognition in machine vision.
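A minimal version of the CRF-response-plus-surround-inhibition pipeline can be sketched in numpy. This assumes a zero-mean cosine Gabor bank, a box-filter surround instead of the paper's logarithmically modulated kernel, and fixed illustrative parameters throughout.

```python
import numpy as np

def gabor_kernel(sigma, theta, lam, size=9):
    """Cosine Gabor kernel; made zero-mean so flat regions give no response."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    k = np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / lam)
    return k - k.mean()

def conv2(img, k):
    """Direct 2-D convolution with edge-replicating padding."""
    half = k.shape[0] // 2
    pad = np.pad(img, half, mode='edge')
    out = np.zeros_like(img, dtype=float)
    for i in range(k.shape[0]):
        for j in range(k.shape[1]):
            out += k[i, j] * pad[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def contour_response(img, n_theta=4, alpha=1.0):
    """CRF response = orientation-wise maximum of |Gabor| responses;
    nCRF texture suppression subtracts a smoothed surround copy."""
    resp = np.max([np.abs(conv2(img, gabor_kernel(2.0, t * np.pi / n_theta, 6.0)))
                   for t in range(n_theta)], axis=0)
    surround = conv2(resp, np.ones((9, 9)) / 81.0)   # simplified surround inhibition
    return np.maximum(resp - alpha * surround, 0.0)
```

On a step-edge image, the suppressed response stays high along the edge while flat regions are driven to zero.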
- Author(s): Chengfeng Jian ; Junjie Li ; Meiyu Zhang
- Source: IET Image Processing, Volume 13, Issue 12, p. 2314 –2320
- DOI: 10.1049/iet-ipr.2019.0650
- Type: Article
In the field of continuous hand-gesture trajectory recognition, handwriting trajectories contain considerable noise, and multiple continuous hand gestures are difficult to segment accurately. To address these problems, a long short-term memory-based dynamic probability (DP-LSTM) method is proposed. First, the classification result for each sub-period of the whole time period is obtained using an LSTM; second, the classification results are clustered by a non-maximum suppression algorithm for trajectories to eliminate the interference of invalid subsets; finally, the end point of the valid trajectory is obtained according to the characteristics of the probability change, thus realising dynamic trajectory segmentation and recognition. To evaluate the performance of DP-LSTM, the method is tested on an Arabic-numerals gesture database. The experiments show that DP-LSTM has a high recognition rate for continuous hand gestures and can recognise them in real time.
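The second step — clustering window-level classifications by non-maximum suppression over the time axis — can be sketched without the LSTM, assuming fixed-length windows and `(start_time, class_id, prob)` triples (the representation is an assumption for this sketch).

```python
def nms_trajectory(windows, win_len=10, overlap=0.5):
    """windows: list of (start_time, class_id, prob) sub-period predictions.
    Greedy non-maximum suppression: keep the most confident window, drop
    windows whose temporal overlap with it exceeds `overlap`, repeat."""
    def iou(s1, s2):
        inter = max(0, min(s1, s2) + win_len - max(s1, s2))
        return inter / (2 * win_len - inter)
    remaining = sorted(windows, key=lambda w: -w[2])   # most confident first
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        remaining = [w for w in remaining if iou(best[0], w[0]) <= overlap]
    return sorted(kept)
```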
- Author(s): Mehrnaz Aghanouri ; Ali Ghaffari ; Nasim Dadashi Serej ; Hossein Rabbani ; Peyman Adibi
- Source: IET Image Processing, Volume 13, Issue 12, p. 2321 –2327
- DOI: 10.1049/iet-ipr.2018.6366
- Type: Article
Localisation of an active capsule endoscope inside the stomach presents several challenges. One is the estimation of the capsule's roll angle. Another is adjusting the distance between the capsule and the stomach to achieve high-quality imaging in the region of interest. In this study, an optimised image-guided localisation (O-Localisation) method is proposed to estimate the roll angle and the scale factor between consecutive frames. The distance between the capsule and the stomach walls can be adjusted using the suggested fuzzy adjuster, which is developed based on the estimated scale factors and calibration parameters. This new method is based solely on visual information extracted from wireless capsule endoscope video frames. The results show that the method can accurately estimate rotation angles and scale factors, with errors <0.2% for angles up to 90° and <0.3% for scales up to 5, respectively. The method is robust to brightness changes of up to 80%, with a maximum error of 0.3%. The computational time is about 1 s, which can be considered near real-time for this application. Accordingly, the O-Localisation method, as a real-time, robust and precise capsule localisation method, can help provide more efficient, controllable and steerable capsule endoscopes.
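Roll-angle and scale estimation between consecutive frames can be illustrated with a least-squares similarity fit over matched feature points. The abstract does not specify the authors' estimator, so the complex-number formulation below is a standard textbook approach used purely as an assumption.

```python
import numpy as np

def estimate_roll_scale(p, q):
    """Least-squares similarity transform q ≈ s·R(θ)·p + t from matched 2-D
    point sets p, q (shape (n, 2)); returns (theta_degrees, scale)."""
    pc = p - p.mean(axis=0)          # subtracting centroids removes translation
    qc = q - q.mean(axis=0)
    # complex form: rotation+scale is one complex unknown a = s·e^{iθ}
    zp = pc[:, 0] + 1j * pc[:, 1]
    zq = qc[:, 0] + 1j * qc[:, 1]
    a = (np.conj(zp) @ zq) / (np.conj(zp) @ zp)
    return np.degrees(np.angle(a)), np.abs(a)
```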
- Author(s): Guoqing Xu ; Chen Li ; Qi Wang
- Source: IET Image Processing, Volume 13, Issue 12, p. 2328 –2334
- DOI: 10.1049/iet-ipr.2018.6551
- Type: Article
Leaf image identification is a significant and challenging research topic. Here, a unified multi-scale method is proposed to capture leaf geometric information for plant leaf classification and image retrieval. For each point on the leaf contour, the unified multi-scale method utilises a simple yet effective three-step strategy to locate corresponding neighbour points. The descriptor extracted using these neighbour points provides a coarse-to-fine description of leaf contours and is intrinsically multi-scale. More importantly, there is no scale parameter to be adjusted in the method, and hence no optimisation procedure is required. The proposed method is applied to three well-known contour features to capture the geometric information of leaves: angle, arch height, and triangle-area representation. The FFT is applied to the features in the unified multi-scale method for convenient and fast leaf matching. Leaf classification and image retrieval experiments are conducted on four challenging leaf datasets, evaluated using three standard performance metrics. The experimental results and comparisons with state-of-the-art methods indicate that the unified multi-scale method has remarkable performance.
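One of the three contour features, the triangle-area representation, together with FFT-based matching, can be sketched as follows. A single fixed neighbour step stands in for the paper's parameter-free multi-scale neighbour selection; the FFT magnitude makes the descriptor independent of the contour's starting point.

```python
import numpy as np

def triangle_area_signature(contour, step=3):
    """Signed triangle area at each contour point, using neighbours
    `step` positions away (one scale of the TAR descriptor)."""
    prev = np.roll(contour, step, axis=0)
    nxt = np.roll(contour, -step, axis=0)
    return 0.5 * ((prev[:, 0] - contour[:, 0]) * (nxt[:, 1] - contour[:, 1])
                - (prev[:, 1] - contour[:, 1]) * (nxt[:, 0] - contour[:, 0]))

def fft_descriptor(sig, keep=8):
    """Low-frequency FFT magnitudes: invariant to the contour's start point."""
    mag = np.abs(np.fft.fft(sig))[:keep]
    return mag / (np.linalg.norm(mag) + 1e-12)     # crude scale normalisation
```

Matching two leaves then reduces to comparing their descriptors with, e.g., a Euclidean distance.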
- Author(s): Rohit Bhargav and Parag Deshpande
- Source: IET Image Processing, Volume 13, Issue 12, p. 2335 –2345
- DOI: 10.1049/iet-ipr.2018.6237
- Type: Article
A license plate (LP) helps identify a motor vehicle. However, no common LP standard exists across countries, and even within a country, significant LP variations are observed. In addition, environmental factors cause uncontrolled plate and character variations. It is, therefore, a challenge to design a robust and universal license plate recognition (LPR) system that works across countries, vehicle types, and LP styles. This study presents a novel approach for locating LPs based on multiple clustering and filtering techniques applied to the geometrical properties of LP characters. The proposed approach is independent of the size, rotation, and colour of the LP and can be used to locate single or multiple LPs of different styles, from different vehicles, and from different countries. The approach has been validated using the standard Media-lab and application-oriented license plate (AOLP) datasets, as well as on datasets of vehicles from other countries. It achieved an average success ratio of 93.42% for locating LPs on both the Media-lab and AOLP datasets, which is higher than the results of previously published methods evaluated on the same datasets.
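The clustering-and-filtering idea — grouping candidate character boxes by similar height and vertical alignment, then discarding groups too small to be a plate — can be sketched in pure Python. The thresholds and the greedy grouping are illustrative assumptions, not the paper's exact chain of techniques.

```python
def cluster_character_boxes(boxes, h_tol=0.25, y_tol=0.5):
    """boxes: list of (x, y, w, h) candidate character regions, in pixels.
    Greedily group boxes whose height and vertical position match the
    cluster's most recent member; drop clusters with fewer than 3 boxes,
    since a licence plate needs several aligned characters."""
    clusters = []
    for box in sorted(boxes):                 # left-to-right scan
        placed = False
        for cl in clusters:
            x, y, w, h = cl[-1]
            bx, by, bw, bh = box
            if abs(bh - h) <= h_tol * h and abs(by - y) <= y_tol * h:
                cl.append(box)
                placed = True
                break
        if not placed:
            clusters.append([box])
    return [cl for cl in clusters if len(cl) >= 3]
```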
- Author(s): Xiaofen Jia ; Yongcun Guo ; Baiting Zhao ; Yourui Huang
- Source: IET Image Processing, Volume 13, Issue 12, p. 2346 –2357
- DOI: 10.1049/iet-ipr.2019.0624
- Type: Article
To protect edge and texture information when removing salt-and-pepper (SP) noise from grayscale images, a support vector machine (SVM) denoising method is employed. First, a mapping relation between the neighborhood signal pixels and the central pixel is designed. The neighborhood is a 5 × 5 region with a signal pixel at its center. In this region, a 25-dimensional input sample is constructed using the correlation between the neighborhood pixels and the eight-direction fractional integral operators. The center signal pixel acts as the corresponding output sample, providing a training sample. Then, the SVM is trained with all training samples to obtain the SVM denoising model. Next, the center pixel value is estimated using the SVM denoising model in every 5 × 5 region with a noise pixel at the center. Finally, the noise pixel values are replaced with the values estimated by the SVM. The experiments demonstrate that the best denoising effect is obtained when the fractional integral order is in the range 1.8 ± 0.1. The proposed method produces a visually pleasing denoised image and obtains superior image quality assessment indicators. It has significant advantages over state-of-the-art denoisers when a low level of noise is present.
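The sample-construction and re-estimation loop can be sketched as follows. A closed-form ridge regressor stands in for the SVM, and the fractional-integral feature weighting is omitted, so this is only a structural illustration of the train-on-clean-windows, re-estimate-noisy-centres scheme.

```python
import numpy as np

def build_samples(img, noise_mask):
    """Training pairs from 5x5 windows containing no noise at all:
    the 24 neighbours are the input, the centre pixel is the target."""
    X, y = [], []
    H, W = img.shape
    for r in range(2, H - 2):
        for c in range(2, W - 2):
            if not noise_mask[r-2:r+3, c-2:c+3].any():
                patch = img[r-2:r+3, c-2:c+3].astype(float).ravel()
                X.append(np.delete(patch, 12))     # drop the centre pixel
                y.append(patch[12])
    return np.array(X), np.array(y)

def denoise(img, noise_mask, lam=1e-3):
    """Fit a regressor on clean windows, then re-estimate each noisy
    centre pixel from its 24 neighbours."""
    X, y = build_samples(img, noise_mask)
    # ridge-regression stand-in for the paper's SVM regressor
    w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
    out = img.astype(float).copy()
    H, W = img.shape
    for r in range(2, H - 2):
        for c in range(2, W - 2):
            if noise_mask[r, c]:
                patch = img[r-2:r+3, c-2:c+3].astype(float).ravel()
                out[r, c] = np.delete(patch, 12) @ w
    return out
```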
Global median filtering forensic method based on Pearson parameter statistics
Image resizing using fuzzy inferences
OrthoMaps: an efficient convolutional neural network with orthogonal feature maps for tiny image classification
Sparsity measure based library aided unmixing of hyperspectral image
Image encryption based on Fisher-Yates shuffling and three dimensional chaotic economic map
Interest point detection method based on multi-scale Gabor filters
Visual tracking with tree-structured appearance model for online learning
Robust watermarking scheme for tamper detection and authentication exploiting CA
Diagnosis of melanoma from dermoscopic images using a deep depthwise separable residual convolutional network
Deep supervised hashing network with integrated regularisation
Enhanced pooling method for convolutional neural networks based on optimal search theory
Appearance model update based on online learning and soft-biometrics traits for people re-identification in multi-camera environments
Improved neural network classification of hyperspectral imagery using weighted genetic algorithm and hierarchical segmentation
DDLA: dual deep learning architecture for classification of plant species
Directional tensor product complex tight framelets for compressed sensing MRI reconstruction
Non-linear unmixing of hyperspectral images using multiple-kernel self-organising maps
RHEVC intra-prediction mode
Supervised image segmentation based on superpixel and improved normalised cuts
Glaucoma screening pipeline based on clinical measurements and hidden features
SVD-based image compression, encryption, and identity authentication algorithm on cloud
Multi-scale analysis of ulcer disease detection from WCE images
Efficient and robust segmentation and correction model for medical images
Change detection in SAR images using deep belief network: a new training approach based on morphological images
Tongue colour and coating prediction in traditional Chinese medicine based on visible hyperspectral imaging
Fusion of multi-modal lumbar spine images using Kekre's hybrid wavelet transform
Face recognition algorithm based on feature descriptor and weighted linear sparse representation
View's dependency and low-rank background-guided compressed sensing for multi-view image joint reconstruction
Bio-inspired contour detection model based on multi-bandwidth fusion and logarithmic texture inhibition
LSTM-based dynamic probability continuous hand gesture trajectory recognition
New image-guided method for localisation of an active capsule endoscope in the stomach
Unified multi-scale method for fast leaf classification and retrieval using geometric information
Locating multiple license plates using scale, rotation, and colour-independent clustering and filtering techniques
Fractional-integral-operator-based improved SVM for filtering salt-and-pepper noise
-
- Author(s): Michał Tomaszewski ; Paweł Michalski ; Bogdan Ruszczak ; Sławomir Zator
- Source: IET Image Processing, Volume 13, Issue 12, p. 2358 –2366
- DOI: 10.1049/iet-ipr.2018.6284
- Type: Article
The massive growth of technologies used to register and process digital images allows for their application in evaluating the technical condition of power lines. However, this is not possible without a set of dedicated methods for obtaining diagnostic information from the registered video data. The method described here details the detection of power line insulators in digital images featuring diversified backgrounds, using laser spots. The insulator detection algorithm is based on testing the digital signal of pixel-intensity profiles read between subsequent pairs of laser points in the image. The method comprises the following stages: importing the image with laser spots, detecting the spots in the image, and pattern classification of each image profile calculated for each detected pair of laser spots. The evaluated profiles depicting an insulator were characterised by regular patterns that reflect the target structure. To classify profiles as either containing or not containing an insulator, several steps are followed: averaging the signal, removing the linear trend, and finding the alternating minima and maxima. The performance of the proposed method was verified using an open-access dataset comprising various scenes featuring high-voltage power line insulators.
Detection of power line insulators on digital images with the use of laser spots
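The profile-classification steps above (averaging the signal, removing the linear trend, finding the alternating minima and maxima) can be sketched directly; the smoothing window, the amplitude guard, and the extremum count threshold below are illustrative assumptions, not the paper's tuned values.

```python
import numpy as np

def profile_features(profile, smooth_win=3):
    """Average-smooth the profile, remove its linear trend, and count
    alternating extrema — regular alternation suggests insulator sheds."""
    p = np.convolve(profile, np.ones(smooth_win) / smooth_win, mode='valid')
    t = np.arange(p.size)
    p = p - np.polyval(np.polyfit(t, p, 1), t)       # detrended signal
    # amplitude guard: a trend-only profile leaves only numerical residue
    if np.abs(p).max() < 0.05 * (profile.max() - profile.min() + 1e-12):
        return p, 0
    sign = np.sign(np.diff(p))
    sign = sign[sign != 0]
    turning = int(np.count_nonzero(np.diff(sign) != 0))  # local extrema count
    return p, turning

def looks_like_insulator(profile, min_extrema=6):
    """Classify a profile by whether it shows enough alternating extrema."""
    _, n = profile_features(profile)
    return n >= min_extrema
```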