IET Image Processing
Volume 12, Issue 12, December 2018
Local multiscale blur estimation based on toggle mapping for sharp region extraction
- Author(s): Luc Gillibert ; Théodore Chabardès ; Beatriz Marcotegui
- Source: IET Image Processing, Volume 12, Issue 12, p. 2138 –2146
- DOI: 10.1049/iet-ipr.2017.0095
- Type: Article
In this study, a multiscale local blur estimation method is proposed, based on an existing local focus measure that combines gradient and toggle mapping. The method evaluates image quality regardless of content (i.e. outside an autofocus context) and can predict optical character recognition accuracy from local blur. The resulting approach outperforms state-of-the-art blur detection methods; quantitative results are given on the DIQA database. Moreover, the authors demonstrate its usefulness for extracting a region of interest from partially blurry images. Results are shown on images acquired by a project devoted to smartphone-based text extraction for visually impaired people, where sharp region extraction is essential: it allows warning users when their picture is unusable, and it saves computing time.
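For readers who want a concrete starting point, the following is a minimal sketch of a toggle-mapping blur cue in Python, assuming SciPy; the structuring-element and window sizes and the normalised-residue score are illustrative choices, not the authors' exact multiscale measure.

```python
import numpy as np
from scipy.ndimage import grey_dilation, grey_erosion, uniform_filter

def local_blur_estimate(image, size=3, window=32, eps=1e-6):
    """Hedged sketch of a toggle-mapping blur cue (not the paper's exact measure).

    Blurry transitions leave pixels midway between the local dilation and
    erosion, so the normalised distance to the nearer extremum is a blur cue.
    """
    img = image.astype(np.float64)
    dil = grey_dilation(img, size=(size, size))
    ero = grey_erosion(img, size=(size, size))
    gradient = dil - ero                         # morphological gradient
    residue = np.minimum(dil - img, img - ero)   # distance to nearer extremum
    blur = residue / (gradient + eps)            # ~0 on sharp edges, ~0.5 in blur
    return uniform_filter(blur, size=window)     # local average over a window
```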
Dynamical stochastic resonance for non-uniform illumination image enhancement
- Author(s): Yongbin Zhang ; Hongjun Liu ; Nan Huang ; Zhaolu Wang
- Source: IET Image Processing, Volume 12, Issue 12, p. 2147 –2152
- DOI: 10.1049/iet-ipr.2018.5634
- Type: Article
Images taken under poor illumination have low contrast and dark tones, and general dark-image enhancement algorithms cannot effectively enhance them without introducing over-enhancement, detail loss, and noise amplification. In this study, a simple and fast enhancement technique for non-uniform illumination images is proposed based on dynamical stochastic resonance (DSR). Low-contrast images are enhanced through a nonlinear iteration that solves a monostable Langevin equation. The iteration parameters are adjusted dynamically according to the intensity distribution of the original image, which balances visibility and naturalness across the entire image, and a threshold is defined to select the optimal outputs automatically. The enhanced image is obtained by fusing the DSR result, the original component, and an illumination compensation component. Computational time, no-reference perceptual quality, and lightness order error are measured to evaluate the experimental results. Subjective and objective comparisons with state-of-the-art methods show that the proposed method enhances non-uniform illumination images well at low computational complexity.
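As a rough illustration of how a DSR-style iteration can raise contrast, the sketch below applies the widely used discretised Langevin update to normalised intensities; the bistable form and all parameter values here are generic assumptions from the DSR literature, not the authors' monostable, adaptively parameterised scheme.

```python
import numpy as np

def dsr_enhance(image, a=2.0, b=1.0, dt=0.01, n_iter=50):
    """Generic dynamic-stochastic-resonance iteration (illustrative only).

    Iterates the discretised Langevin update x <- x + dt*(a*x - b*x^3 + s),
    treating the low-contrast intensities s as the weak input signal.
    a, b, dt and n_iter are illustrative assumptions.
    """
    x = image.astype(np.float64) / 255.0
    signal = x.copy()
    for _ in range(n_iter):
        x = x + dt * (a * x - b * x ** 3 + signal)
    x = (x - x.min()) / (x.max() - x.min() + 1e-12)  # rescale to [0, 1]
    return (255 * x).astype(np.uint8)
```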
Classification of EEG signals for detection of epileptic seizure activities based on feature extraction from brain maps using image processing algorithms
- Author(s): Sairamya Nanjappan Jothiraj ; Thomas George Selvaraj ; Balakrishnan Ramasamy ; Narain Ponraj Deivendran ; Subathra M.S.P
- Source: IET Image Processing, Volume 12, Issue 12, p. 2153 –2162
- DOI: 10.1049/iet-ipr.2018.5418
- Type: Article
This study presents a novel feature extraction approach based on image processing algorithms for the automated detection of epileptic seizure activity in brain-map representations of electroencephalography (EEG) signals, using an efficient classification technique. The proposed technique uses independent component analysis to extract independent components (ICs) from the EEG signal, and each extracted IC is transformed into an image termed a brain map. Two feature extraction techniques, the closed neighbourhood gradient pattern (CNGP) and the combined texture pattern (CTP), are proposed for the automatic elimination of artefact brain maps. The extracted features are fed into a least-squares support vector machine (LSSVM) for automatic detection of epileptic brain maps. Extensive experiments against existing image processing techniques in the literature demonstrate that the texture pattern representations of CNGP and CTP yield better features and enhance texture classification performance. The results show that the LSSVM classifier with a Gaussian RBF kernel detects epileptic brain maps with a high accuracy rate. The results are reliable and can assist neurologists in diagnosing epileptic signals effortlessly by visually locating the brain areas affected by seizure activity.
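To make the first stage concrete, here is a small sketch of IC extraction with scikit-learn's FastICA (the paper does not name its ICA implementation; the channel count and data below are synthetic placeholders):

```python
import numpy as np
from sklearn.decomposition import FastICA

# Sketch: extract independent components from multichannel EEG with FastICA,
# a standard ICA stand-in. eeg has shape (n_samples, n_channels).
rng = np.random.default_rng(0)
eeg = rng.standard_normal((5000, 19))            # 19-channel toy EEG
ica = FastICA(n_components=19, random_state=0)
sources = ica.fit_transform(eeg)                 # ICs, (n_samples, n_components)
mixing = ica.mixing_                             # scalp projection of each IC
# Each column of `mixing` can be interpolated over electrode positions to
# render the "brain map" image that the CNGP/CTP descriptors operate on.
```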
Retinal blood vessel segmentation using the elite-guided multi-objective artificial bee colony algorithm
- Author(s): Bilal Khomri ; Argyrios Christodoulidis ; Leila Djerou ; Mohamed Chaouki Babahenini ; Farida Cheriet
- Source: IET Image Processing, Volume 12, Issue 12, p. 2163 –2171
- DOI: 10.1049/iet-ipr.2018.5425
- Type: Article
Retinal vessel segmentation constitutes an essential part of computer-assisted tools for the diagnosis of ocular diseases. In this study, the authors propose an unsupervised retinal blood vessel segmentation approach based on the elite-guided multi-objective artificial bee colony (EMOABC) algorithm. The proposed method exploits several criteria simultaneously to improve the accuracy of the segmentation results. An energy curve function is used to calculate the values of the thresholding criteria, in order to reduce the noise response from lesions and select the optimal thresholds that separate the blood vessels from the background. To achieve a computational speed-up, a stopping-criterion method is used to adjust the parameters of the EMOABC algorithm. The proposed method is computationally simple and faster than most available unsupervised algorithms, demonstrating fast convergence to the final segmentation. Additionally, it outperforms the metaheuristic vessel segmentation algorithms reported in the literature. The achieved mean discrepancy metrics are 94.5% accuracy, 97.4% specificity, and 73.9% sensitivity on the DRIVE database, and 94% accuracy, 96.2% specificity, and 73.7% sensitivity on the STARE database.
Modified mean shift algorithm
- Author(s): Youness Aliyari Ghassabeh and Frank Rudzicz
- Source: IET Image Processing, Volume 12, Issue 12, p. 2172 –2177
- DOI: 10.1049/iet-ipr.2018.5600
- Type: Article
The mean shift (MS) algorithm is an iterative method introduced for locating modes of a probability density function. Although the MS algorithm has been widely used in many applications, its convergence has not yet been proven. In this study, the authors modify the MS algorithm in order to guarantee its convergence. They prove that the sequence generated by the proposed modified algorithm converges, and that the density estimate values along it are monotonically increasing and convergent. In contrast to the MS algorithm, the modified version does not require setting a stopping criterion a priori; instead, it guarantees convergence after a finite number of iterations and provides an upper bound on the number of iterations, which the MS algorithm lacks. The authors also present the matrix form of the proposed algorithm and show that, in contrast to the MS algorithm, the weight matrix needs to be computed only once, in the first iteration. The performance of the modified version is compared with the MS algorithm, and simulations show that it can be used successfully to estimate cluster centres.
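For reference, the baseline Gaussian-kernel MS update that the authors modify looks like the following sketch; the convergence safeguards of the modified algorithm are not reproduced here, and the bandwidth and tolerance values are assumptions.

```python
import numpy as np

def mean_shift_mode(data, x0, bandwidth=1.0, max_iter=100, tol=1e-6):
    """Standard mean shift iteration with a Gaussian kernel.

    `tol` is exactly the kind of a priori stopping criterion that the
    modified algorithm in the paper removes.
    """
    x = np.asarray(x0, dtype=np.float64)
    for _ in range(max_iter):
        diff = data - x                          # (n, d) offsets to all samples
        w = np.exp(-np.sum(diff ** 2, axis=1) / (2 * bandwidth ** 2))
        x_new = (w[:, None] * data).sum(axis=0) / w.sum()
        if np.linalg.norm(x_new - x) < tol:      # a priori stopping rule
            break
        x = x_new
    return x
```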
Robust image hashing using exact Gaussian–Hermite moments
- Author(s): Khalid M. Hosny ; Yasmeen M. Khedr ; Walid I. Khedr ; Ehab R. Mohamed
- Source: IET Image Processing, Volume 12, Issue 12, p. 2178 –2185
- DOI: 10.1049/iet-ipr.2018.5661
- Type: Article
In this work, a new method for robust image hashing is presented; the objectives of an image hash are robustness and uniqueness. Exact Gaussian–Hermite moments and their invariants are used to extract highly accurate features from grey-scale images. The sender estimates the hash value from the features of the given image and appends it to the image before transmission, and the receiver checks the authenticity of the received image by decrypting the hash value. To increase security, a pre-shared key between the sender and the receiver is used to encrypt the hash value before it is attached to the image. The similarity between different hashes is measured by Euclidean distance. Numerical simulations confirm the robustness of the proposed method against different kinds of attacks while preserving the image content. Hashes of different images exhibit a very low collision probability, which proves the suitability of the proposed method for robust image hashing. Comparisons with existing hash methods clearly show the superiority of the proposed method.
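A minimal sketch of the hash-comparison side of such a pipeline is shown below, with HMAC-SHA256 standing in for the unspecified secret-key encryption and the Gaussian–Hermite feature extractor omitted; the key and function names are placeholders.

```python
import hashlib
import hmac
import numpy as np

SECRET_KEY = b"pre-shared-key"   # placeholder for the pre-shared key

def make_tag(features: np.ndarray) -> bytes:
    """Keyed tag over a quantised feature vector. HMAC stands in for the
    paper's unspecified encryption of the hash value."""
    quantised = np.round(features, 4).tobytes()
    return hmac.new(SECRET_KEY, quantised, hashlib.sha256).digest()

def hash_distance(f1: np.ndarray, f2: np.ndarray) -> float:
    """Euclidean distance between two feature-based hashes, as in the
    paper's similarity test; a small distance suggests matching content."""
    return float(np.linalg.norm(f1 - f2))
```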
Multilevel magnetic resonance imaging compression using compressive sensing
- Author(s): Tariq Tashan and Maher Al-Azawi
- Source: IET Image Processing, Volume 12, Issue 12, p. 2186 –2191
- DOI: 10.1049/iet-ipr.2018.5611
- Type: Article
In this study, a multilevel compressive sensing (CS) compression scheme for magnetic resonance imaging (MRI) images is presented. The proposed algorithm divides the image into frames of equal size, transforms the pixels inside each frame into the sparse domain, and then applies CS compression to each frame with a different level of compression. Four levels of compression are suggested, based on how sparse the information inside the frame is. The proposed algorithm is evaluated using six real MRI images showing different parts of the human body. The experimental results show a significant improvement of 7.03 dB in peak signal-to-noise ratio and 23.76% in compression level (CL) compared with a uniform-CL algorithm.
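The per-frame logic might look like the following sketch, where a DCT stands in for the unspecified sparsifying transform, and the four measurement rates and sparsity threshold are illustrative assumptions:

```python
import numpy as np
from scipy.fft import dctn

def compress_frame(frame, rates=(0.15, 0.3, 0.5, 0.7), thresh=0.01):
    """Sketch of per-frame CS with a sparsity-adaptive measurement rate.

    The frame's compression level is chosen by how sparse its DCT
    coefficients are; `rates` and `thresh` are illustrative assumptions.
    """
    x = dctn(frame, norm="ortho").ravel()
    # Fraction of coefficients that are "significant" for this frame.
    sparsity = np.mean(np.abs(x) > thresh * np.abs(x).max())
    level = min(int(sparsity * len(rates)), len(rates) - 1)
    m = max(1, int(rates[level] * x.size))          # measurements to take
    phi = np.random.randn(m, x.size) / np.sqrt(m)   # Gaussian sensing matrix
    y = phi @ x                                     # compressed measurements
    return y, phi, level
```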
Tool for automatic tuning of binarisation techniques
- Author(s): Abderrahmane Kefali and Toufik Sari
- Source: IET Image Processing, Volume 12, Issue 12, p. 2192 –2203
- DOI: 10.1049/iet-ipr.2018.5132
- Type: Article
Most proposed binarisation methods include parameters that must be set correctly before use. These parameter values are usually determined manually after several trials. However, the optimum values differ from one image to another, so parameterisation should be carried out for each image separately. As this task is very difficult, even impossible, for large collections of images, tuning is usually done once for the entire collection. In this study, the authors propose a tool for automatic and adaptive parameterisation of binarisation techniques for each image separately. The adopted methodology uses an artificial neural network (ANN) to learn, from image features, the optimal parameter values of a binarisation method on a training set of images, and then uses the trained ANN to determine the optimal parameter values for unseen images. Several experiments were conducted on images of degraded documents, and the results obtained are encouraging.
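A minimal sketch of the idea is shown below, using scikit-learn's MLPRegressor to map simple global image statistics to per-image binarisation parameters (e.g. a Sauvola window size and k found by grid search); the five features are illustrative, not the paper's feature set.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def image_features(img):
    """Simple global statistics as ANN inputs (the paper's feature set is
    richer; these five are illustrative)."""
    g = np.asarray(img, dtype=np.float64)
    return [g.mean(), g.std(), np.median(g),
            np.percentile(g, 10), np.percentile(g, 90)]

def train_parameter_net(training_images, best_params):
    """Fit an MLP mapping image features to each image's best-found
    binarisation parameters (found offline, e.g. by grid search)."""
    X = np.array([image_features(im) for im in training_images])
    y = np.asarray(best_params)          # shape (n_images, n_params)
    net = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
    return net.fit(X, y)

def predict_parameters(net, image):
    """Predict binarisation parameters for an unseen image."""
    return net.predict([image_features(image)])[0]
```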
Improvement of global motion estimation in two-dimensional digital video stabilisation methods
- Author(s): Marcos Roberto e Souza ; Luiz Fernando Rodrigues da Fonseca ; Helio Pedrini
- Source: IET Image Processing, Volume 12, Issue 12, p. 2204 –2211
- DOI: 10.1049/iet-ipr.2018.5445
- Type: Article
A large amount of video content is produced by compact and portable cameras, and many applications have benefited from this growth of multimedia data, such as telemedicine, business conferencing, surveillance and security, entertainment, distance learning, and robotics. Video stabilisation is the process of detecting and removing undesired motion or instabilities from a video stream, introduced during acquisition by camera handling. In this work, the authors introduce and analyse a novel approach that identifies failures in the global motion estimation of the camera by means of local features. Moreover, they propose an optimisation method for computing a new estimate of the corrected motion. Experiments conducted on different video sequences demonstrate the effectiveness of the developed method. Results of the stabilisation process are compared against the state-of-the-art YouTube method.
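A baseline version of feature-based global motion estimation with a crude failure check might look like the following OpenCV sketch; the thresholds and the RANSAC affine model are assumptions, not the authors' optimisation method.

```python
import cv2
import numpy as np

def estimate_global_motion(prev_gray, curr_gray):
    """Baseline global motion estimate from tracked local features.

    If too few features survive tracking or too few are inliers, the
    estimate is flagged as unreliable; thresholds are illustrative.
    """
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=400,
                                  qualityLevel=0.01, minDistance=8)
    if pts is None:
        return None, False
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, pts, None)
    good_old = pts[status.ravel() == 1]
    good_new = nxt[status.ravel() == 1]
    if len(good_new) < 20:                  # too few matches: likely failure
        return None, False
    M, inliers = cv2.estimateAffinePartial2D(good_old, good_new,
                                             method=cv2.RANSAC)
    ok = M is not None and inliers.sum() > 0.5 * len(good_new)
    return M, ok
```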
Comparison of level set models in image segmentation
- Author(s): Roushanak Rahmat and David Harris-Birtill
- Source: IET Image Processing, Volume 12, Issue 12, p. 2212 –2221
- DOI: 10.1049/iet-ipr.2018.5796
- Type: Article
Image segmentation is one of the most important tasks in modern imaging applications, underpinning shape reconstruction, volume estimation, object detection, and classification. Among the most popular active segmentation models are level set models, an important category of modern segmentation techniques with many variants available for different imaging applications. Level sets are designed to overcome the topology problems that arise during curve evolution in segmentation, which earlier algorithms could not handle effectively. As a result, considerable investigation is often needed into the performance of several level set models for a given segmentation problem, so it is helpful to know the characteristics of a range of level set models before applying them. In this study, the authors review a range of level set models and their application to image segmentation, and explain their properties in detail for practical use.
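As a reference point for the mechanics that the reviewed models share, one explicit update of the basic curvature-driven level set PDE can be sketched as follows; the speed term and step size are placeholders that individual models derive from image data.

```python
import numpy as np

def level_set_step(phi, speed=1.0, dt=0.1):
    """One explicit update of phi_t = speed * |grad(phi)| * curvature.

    This is the shared mechanism of the reviewed models; each model
    differs mainly in how `speed` is built from the image. Finite
    differences and parameters here are illustrative.
    """
    gy, gx = np.gradient(phi)
    mag = np.sqrt(gx ** 2 + gy ** 2) + 1e-8
    nx, ny = gx / mag, gy / mag                 # unit normal of level sets
    curvature = np.gradient(nx, axis=1) + np.gradient(ny, axis=0)
    return phi + dt * speed * mag * curvature
```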
Detection of vehicle wheels from images using a pseudo-wavelet filter for analysis of congested traffic
- Author(s): Eugene J. OBrien ; Colin C. Caprani ; Serena Blacoe ; Dong Guo ; Abdollah Malekjafarian
- Source: IET Image Processing, Volume 12, Issue 12, p. 2222 –2228
- DOI: 10.1049/iet-ipr.2018.5369
- Type: Article
There is potential for significant savings if the safety of existing bridges can be more accurately assessed. For long-span bridges, congestion is the governing traffic load condition. The current methods of simulating congestion make assumptions about the axle-to-axle gaps maintained between vehicles. There is potential for improvement in congestion models if accurate data on axle-to-axle gaps can be obtained. In this study, the use of a camera to collect this information is put forward. A new image processing technique is proposed to detect wheels in variable light conditions. The method is based on a pseudo-wavelet filter that amplifies circles, in conjunction with an algorithm that weights features in the image according to their circularity. This new approach is compared with the Hough transform, template matching and the deformable part-based model (DPM) methods previously developed. In a sample set of 80 images, 96.9% of wheels are detected, considerably more than with the Hough transform and template matching methods. It also provides the same level of accuracy as DPM without requiring a training process.
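The circularity-weighting idea can be illustrated with the standard contour measure 4πA/P², as in the sketch below; the pseudo-wavelet pre-filter is not reproduced here, and the area threshold is an assumption.

```python
import cv2
import numpy as np

def circularity_scores(binary_image, min_area=50):
    """Weight detected contours by circularity = 4*pi*A / P^2, which is
    1.0 for a perfect circle. A simple stand-in for the paper's
    circularity weighting; `binary_image` must be 8-bit single channel.
    """
    contours, _ = cv2.findContours(binary_image, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    scores = []
    for c in contours:
        area = cv2.contourArea(c)
        perim = cv2.arcLength(c, closed=True)
        if area >= min_area and perim > 0:
            scores.append((c, 4 * np.pi * area / perim ** 2))
    return scores   # contours scoring near 1 are wheel candidates
```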
Efficient coarser-to-fine holistic traffic sign detection for occlusion handling
- Author(s): Yawar Rehman ; Jameel Ahmed Khan ; Hyunchul Shin
- Source: IET Image Processing, Volume 12, Issue 12, p. 2229 –2237
- DOI: 10.1049/iet-ipr.2018.5424
- Type: Article
In this study, the authors present a new efficient method based on discriminative patches (d-patches) for holistic traffic sign detection with occlusion handling. Traffic sign detection is an important part of autonomous driving, but is usually hampered by occlusions encountered on roads. The proposed method upgrades d-patches by integrating vocabulary-learning features, so that d-patches are trained more discriminatively for robust occlusion handling. In addition, a holistic classifier trained on d-patches identifies the regions where occlusion exists, resulting in higher confidence scores for regions containing traffic signs and lower scores for regions containing occlusions. Furthermore, the authors propose a new coarser-to-fine (CTF) approach to speed up detection. CTF minimises the use of the traditional sliding window for object detection: it relies on colour variance to find regions with a high probability of traffic sign presence, and the sliding window is applied only to those selected regions. The proposed method achieves 100% detection on the German Traffic Sign Detection Benchmark and performs 2.2% better than the previous state-of-the-art methods on the Korean Traffic Sign Detection dataset under partially occluded settings. With the CTF approach, a five-times speed-up is achieved with a marginal loss in accuracy.
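The colour-variance region proposal of the coarse stage might be sketched as follows; the grid size and keep ratio are illustrative assumptions:

```python
import numpy as np

def high_variance_regions(image_rgb, cell=32, keep=0.2):
    """Coarse stage of a coarser-to-fine search: keep only the grid cells
    whose colour variance is high, then slide a detection window inside
    those cells only. `cell` and `keep` are illustrative assumptions.
    """
    h, w, _ = image_rgb.shape
    cells = []
    for y in range(0, h - cell + 1, cell):
        for x in range(0, w - cell + 1, cell):
            patch = image_rgb[y:y + cell, x:x + cell].astype(np.float64)
            variance = patch.reshape(-1, 3).var(axis=0).sum()
            cells.append(((y, x), variance))
    cells.sort(key=lambda t: t[1], reverse=True)
    return [pos for pos, _ in cells[: int(keep * len(cells))]]
```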
Multiple-parameter fractional quaternion Fourier transform and its application in colour image encryption
- Author(s): Beijing Chen ; Ming Yu ; Yuhang Tian ; Leida Li ; Dingcheng Wang ; Xingming Sun
- Source: IET Image Processing, Volume 12, Issue 12, p. 2238 –2249
- DOI: 10.1049/iet-ipr.2018.5440
- Type: Article
In this study, using quaternion algebra, the multiple-parameter fractional quaternion Fourier transform (MPFrQFT) is proposed to generalise the conventional multiple-parameter fractional Fourier transform (MPFrFT) to quaternion signal processing in a holistic manner. First, the new transform and its inverse are defined. An efficient discrete implementation of the MPFrQFT is then proposed, which exploits the relationship between the MPFrQFT of a quaternion signal and the MPFrFT of its four components. Finally, a new colour image encryption algorithm based on the proposed MPFrQFT and the double random phase encoding technique is proposed to evaluate its performance. Experimental results demonstrate that: (i) the computational time of the proposed implementation is almost half that of the direct method; (ii) the proposed MPFrQFT-based encryption algorithm performs better overall than eight compared algorithms in security and robustness tests: it is more secure than the compared frequency-based algorithms, owing to its larger key space and more sensitive 'transform orders' key, and more robust than the compared spatial-domain algorithms.
Video denoising algorithm based on improved dual-domain filtering and 3D block matching
- Author(s): Jinsheng Xiao ; Wentao Zou ; Shangyue Zhang ; Junfeng Lei ; Wen Wang ; Yuan-Fang Wang
- Source: IET Image Processing, Volume 12, Issue 12, p. 2250 –2257
- DOI: 10.1049/iet-ipr.2018.5563
- Type: Article
This study introduces a video denoising algorithm based on improved dual-domain filtering and 3D block matching. Wavelet thresholding based on 3D block matching is introduced to make full use of the correlation within the video sequence, so that dual-domain filtering can be applied to video. A layered approach denoises both a base layer and a detail layer: the result of the 3D-block-matching wavelet thresholding serves as a guide image that makes the base layer smoother, while shrinkage of short-time Fourier transform coefficients further reduces noise in the detail layer. Experimental results show that the authors' algorithm generates better base and detail layers than traditional dual-domain filtering algorithms, and subjective and objective comparisons of different algorithms confirm that the proposed algorithm performs better for video denoising.
Scene invariant crowd counting using multi-scales head detection in video surveillance
- Author(s): Tianjun Ma ; Qingge Ji ; Ning Li
- Source: IET Image Processing, Volume 12, Issue 12, p. 2258 –2263
- DOI: 10.1049/iet-ipr.2018.5368
- Type: Article
With the soaring application of video surveillance in daily life, crowd density estimation has become a hot field; crowd counting is closely related to traffic planning, pedestrian analysis, and emergency warning. Here, a novel crowd counting method based on multi-scale head detection is proposed. The authors' approach first uses gradient differences to extract the foreground of the images and applies overlapping patches at different scales to split the input images. The patches are then selected and classified into groups according to their gradient distributions, and features are extracted for training. Finally, from the prediction results, density maps at different scales are computed and summed with the perspective map. In particular, the method overcomes the low accuracy of traditional detection methods under perspective transformation. Experiments demonstrate that the proposed method not only achieves high counting accuracy but also shows outstanding robustness on the authors' data sets.
Poisson image denoising by piecewise principal component analysis and its application in single-particle X-ray diffraction imaging
- Author(s): Qiyu Jin ; Osamu Miyashita ; Florence Tama ; Jie Yang ; Slavica Jonic
- Source: IET Image Processing, Volume 12, Issue 12, p. 2264 –2274
- DOI: 10.1049/iet-ipr.2018.5145
- Type: Article
This study describes an improved method for Poisson image denoising based on a state-of-the-art approach known as non-local principal component analysis (NLPCA). The new method is referred to as piecewise principal component analysis (PWPCA). In PWPCA, the given image is first split into pieces, NLPCA is run on each piece, and the entire image is then reconstituted by a weighted combination of the NLPCA-processed pieces. Using standard test images with Poisson noise, the authors show that PWPCA restores images more effectively than state-of-the-art Poisson denoising approaches. In addition, and to the best of their knowledge, they show the first application of such approaches to single-particle X-ray free-electron laser (XFEL) data, demonstrating that the resolution of three-dimensional reconstruction from XFEL diffraction images improves when the data are preprocessed with PWPCA. XFELs are currently under rapid development to allow high-resolution biomolecular structure determination at near-physiological conditions; data analysis method developments follow these technological advances and are expected to have a high impact in structural biology and drug design, and this study contributes to them. As little experimental single-particle XFEL data is yet available, the XFEL experiments shown here were performed with simulated data.
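The split-process-recombine skeleton of PWPCA can be sketched as below, with any denoiser callable standing in for NLPCA and uniform blending weights as an assumption (the paper's weighting may differ):

```python
import numpy as np

def piecewise_denoise(image, denoiser, piece=64, step=32):
    """Run a denoiser (NLPCA in the paper; any callable here) on
    overlapping pieces and blend the results.

    Uniform blending weights and the piece/step sizes are illustrative
    assumptions, not the paper's exact scheme.
    """
    out = np.zeros_like(image, dtype=np.float64)
    weight = np.zeros_like(out)
    h, w = image.shape
    for y in range(0, max(h - piece, 0) + 1, step):
        for x in range(0, max(w - piece, 0) + 1, step):
            tile = image[y:y + piece, x:x + piece]
            out[y:y + piece, x:x + piece] += denoiser(tile)
            weight[y:y + piece, x:x + piece] += 1.0
    return out / np.maximum(weight, 1.0)
```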
Salient region detection using feature extraction in the non-subsampled contourlet domain
- Author(s): Masoumeh Rezaei Abkenar ; Hamidreza Sadreazami ; M. Omair Ahmad
- Source: IET Image Processing, Volume 12, Issue 12, p. 2275 –2282
- DOI: 10.1049/iet-ipr.2018.5479
- Type: Article
The human visual system is attracted to the most dominant part of an image, called the salient region, and there has been a surge of interest in recent years in detecting salient regions efficiently. In this study, a new salient region detection method is proposed using the non-subsampled contourlet transform, which is capable of providing a multiscale, multi-directional, and translation-invariant decomposition of images. The proposed method extracts various local and global features from the non-subsampled contourlet coefficients of the colour channels, and a saliency map is obtained from a linear combination of the local features and the distribution of the global features. To better preserve the structure and boundaries of objects and to obtain a more uniformly highlighted salient region, the saliency map is abstracted using an optimisation framework. Several experiments conducted on sets of natural images show that the proposed method is superior to existing methods in terms of precision-recall performance, F-measure, and mean absolute error.
Single-pixel compressive imaging based on motion compensation
- Author(s): Zelong Wang and Jubo Zhu
- Source: IET Image Processing, Volume 12, Issue 12, p. 2283 –2291
- DOI: 10.1049/iet-ipr.2018.5741
- Type: Article
Compressive sensing (CS) theory has made single-pixel compressive imaging (SPCI) popular in optical remote sensing, since a single-pixel imager not only acquires and recovers sparse images at sampling rates significantly below the classical Nyquist rate, but also uses special detectors to improve sensitivity, dynamic range, spectral range, and so on. However, its requirement of a static scene during the time-sequential measurements makes it difficult to apply in remote sensing, where the imaging platform is always in motion relative to the scene. In this study, the authors develop a new method for SPCI on a moving platform in optical remote sensing. Instead of ignoring the motion during sampling, the proposed method first builds a compressive sampling model based on motion compensation during sampling, and then reconstructs the image by frame-by-frame and joint recovery methods, respectively, within the CS framework; the recovery condition is also analysed. To validate the basic principle and the physical feasibility of motion compensation, the authors carry out numerical simulations and optical experiments under a range of conditions. In addition, the proposed method can also be extended to other applications, such as 360° annular imaging, multi-pixel compressive imaging, and super-resolution imaging.
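For readers unfamiliar with the recovery side, a standard CS solver such as ISTA reconstructs the image from single-pixel measurements y = Φx as in the following sketch; the motion-compensated sampling model itself is not reproduced, and the regularisation weight and iteration count are assumptions.

```python
import numpy as np

def ista_recover(y, phi, lam=0.05, n_iter=200):
    """Recover a sparse signal x from measurements y = phi @ x via ISTA,
    a standard CS solver (not the paper's recovery method).
    """
    L = np.linalg.norm(phi, 2) ** 2             # Lipschitz constant of gradient
    x = np.zeros(phi.shape[1])
    for _ in range(n_iter):
        grad = phi.T @ (phi @ x - y)            # gradient of 0.5*||phi x - y||^2
        z = x - grad / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x
```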
Multi-frame super-resolution algorithm using common vector approach
- Author(s): Erol Seke ; Yıldıray Anagün ; Nihat Adar
- Source: IET Image Processing, Volume 12, Issue 12, p. 2292 –2299
- DOI: 10.1049/iet-ipr.2018.5168
- Type: Article
Super-resolution (SR) applications aim to use information within one or more low-resolution (LR) images to obtain high-resolution (HR) images; the LR images may well be consecutive frames of a video sequence. When multiple LR images are used as input, motion estimation (ME) between portions of images is an important step in solving this ill-posed problem. In this study, the authors employ translational optical flow for ME, followed by the common vector approach (CVA) for HR image reconstruction from multiple sources. CVA provides a way to reduce outliers caused by noise, occlusion, shadows, and incorrect ME. HR image blocks are obtained by combining the common and difference vectors of each block's class, which are handled separately; noise in the difference vectors is reduced by a known noise reduction method before combining. Separate handling of the common and difference parts yields better results and greatly reduces artefacts. Experimental results confirm the improvement over the state of the art by visual inspection, peak signal-to-noise ratio, and structural similarity index criteria.
Method based on bitonic filtering decomposition and sparse representation for fusion of infrared and visible images
- Author(s): Changda Xing ; Zhisheng Wang ; Quan Ouyang ; Chong Dong
- Source: IET Image Processing, Volume 12, Issue 12, p. 2300 –2310
- DOI: 10.1049/iet-ipr.2018.5554
- Type: Article
Edge-preserving fusion of infrared and visible images can produce fused results with clear outlines. With traditional edge-preserving decomposition, however, performance degrades when some edges in the data are smaller than the noise level. To remedy this deficiency, a method based on bitonic filtering decomposition and sparse representation (BFSR) is proposed for the fusion of infrared and visible images. The BFSR method consists of three steps: multi-scale bitonic filtering decomposition, merging of base and detail layers, and reconstruction of the fused result. Compared with traditional edge-preserving image fusion, the BFSR method contains no data-level-sensitive parameters and can locally adapt to the signal and noise levels in an image. Moreover, it exploits the sparsity of images when fusing details, which helps analyse the explanatory factors hidden behind the data. As demonstrated in the experimental results, the proposed BFSR method achieves better fusion performance than other commonly used image fusion methods.
Fast codebook design method for image vector quantisation
- Author(s): Hang-Yu Fan and Zhe-Ming Lu
- Source: IET Image Processing, Volume 12, Issue 12, p. 2311 –2318
- DOI: 10.1049/iet-ipr.2018.5443
- Type: Article
Vector quantisation (VQ) is a widely used method for data compression and data clustering. This study proposes a fast codebook design method for image VQ. General particle swarm optimisation (PSO)-K-means hybrid methods for codebook design take too much time; to deal with this, the authors present a new clustering method that takes very little time and generates a good codebook. The method contains four parts: pooling, the PSO algorithm, reverse cast, and the K-means algorithm. Pooled images are used for the PSO process while the original images are used for K-means fine-tuning, with the two processes connected by the reverse-cast step. In the authors' experiments, the method can dramatically reduce computation time with large pooling windows or enhance codebook quality with small ones. It reduces the calculation time to almost one-tenth of that of the PSO-K-means hybrid method and is also faster than the K-means algorithm. Experimental results demonstrate that the main advantages of the proposed algorithm are reduced computation time and improved codebook quality.
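A pool-then-refine skeleton in the spirit of this pipeline is sketched below, with plain K-means standing in for the PSO stage and nearest-neighbour upsampling as an assumed reading of the reverse-cast step:

```python
import numpy as np
from sklearn.cluster import KMeans

def fast_codebook(blocks, codebook_size=256, pool=2):
    """Pool-then-refine codebook design sketch (K-means replaces PSO here).

    Coarse clustering runs on pooled (downsampled) blocks; the coarse
    centroids are cast back to full resolution by repetition and refined
    on the original blocks. Blocks are assumed square with a side
    divisible by `pool`.
    """
    n, d = blocks.shape
    side = int(np.sqrt(d))
    imgs = blocks.reshape(n, side, side)
    pooled = imgs.reshape(n, side // pool, pool,
                          side // pool, pool).mean(axis=(2, 4))
    coarse = KMeans(n_clusters=codebook_size, n_init=1, random_state=0)
    coarse.fit(pooled.reshape(n, -1))
    # "Reverse cast": upsample coarse centroids to full block size.
    cb = coarse.cluster_centers_.reshape(-1, side // pool, side // pool)
    init = cb.repeat(pool, axis=1).repeat(pool, axis=2).reshape(codebook_size, d)
    fine = KMeans(n_clusters=codebook_size, init=init, n_init=1, max_iter=10)
    return fine.fit(blocks).cluster_centers_
```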
Hierarchical CNN-based real-time fatigue detection system by visual-based technologies using MSP model
- Author(s): Wang Huan Gu ; Yu Zhu ; Xu Dong Chen ; Lin Fei He ; Bing Bing Zheng
- Source: IET Image Processing, Volume 12, Issue 12, p. 2319 –2329
- DOI: 10.1049/iet-ipr.2018.5245
- Type: Article
Visual-based technologies are very useful for driver fatigue detection. In this study, the authors present a multi-task hierarchical CNN scheme for a fatigue detection system and propose a convolutional neural network model with multi-scale pooling (MSP-Net). 'Multi-task' covers three tasks: face detection, eye and mouth state detection, and fatigue detection. First, a pre-trained multi-task CNN is used for face detection and for extracting the eye and mouth regions. Then, as the main contribution of this study, eye and mouth state detection is performed by MSP-Net, which handles multi-resolution input images captured by different cameras well. In the third step, the percentage of eyelid closure over the pupil over time (PERCLOS) and the frequency of open mouth (FOM), the latter proposed by the authors, are used to detect fatigue. The authors also port the system to an embedded platform (the NVIDIA Jetson TX2 development board) and test it on real driving scenes. The results show that the system performs well, is robust to complex environments, and meets real-time requirements.
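The two fatigue indicators are straightforward to compute from per-frame state flags, as in the following sketch; all window lengths and decision thresholds are illustrative assumptions, not the paper's values.

```python
import numpy as np

def perclos(eye_closed_flags, fps, window_s=60):
    """PERCLOS over the last `window_s` seconds: fraction of frames in
    which the eyes are classified as closed."""
    recent = np.asarray(eye_closed_flags[-int(fps * window_s):], dtype=float)
    return recent.mean()

def fom(mouth_open_flags, fps, window_s=60):
    """Frequency of open mouth: closed-to-open onsets per minute."""
    recent = np.asarray(mouth_open_flags[-int(fps * window_s):], dtype=int)
    onsets = np.sum(np.diff(recent) == 1)       # closed -> open transitions
    return onsets * 60.0 / window_s

def is_fatigued(p, f, p_thr=0.25, f_thr=8.0):
    """Flag fatigue when either indicator exceeds its (assumed) threshold."""
    return p > p_thr or f > f_thr
```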
GrabCut algorithm for dental X-ray images based on full threshold segmentation
- Author(s): Jiafa Mao ; Kaihui Wang ; Yahong Hu ; Weiguo Sheng ; Qixin Feng
- Source: IET Image Processing, Volume 12, Issue 12, p. 2330 –2335
- DOI: 10.1049/iet-ipr.2018.5730
- Type: Article
Teeth are difficult to destroy owing to their corrosion resistance, high melting point, and hardness. Dental biometrics can therefore assist human forensic identification, especially of unknown corpses. One of the key issues in dental-based human identification is the segmentation of dental X-ray images. In this paper, a novel segmentation algorithm based on full threshold segmentation is proposed for this purpose. The outline image set I_whole^n and crown image set I_crown^m of the complete target tooth are first obtained, and a morphological opening operation is applied to the difference images of I_whole^n and I_crown^m. Subsequently, the most complete target tooth image and its corresponding crown image are selected, and an independent target tooth image I_contour and its crown image I_crown are extracted from them. Median filtering is applied to the synthetic image of I_contour and I_crown, and the resulting image is used as the mask for GrabCut to obtain the target tooth image. Experimental results show that the proposed algorithm effectively overcomes the problems of uneven grey-scale distribution and adhesion of adjacent crowns in dental X-ray images. It also achieves high segmentation accuracy and outperforms the compared methods.
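The final GrabCut step, seeded from a binary mask, can be sketched with OpenCV as follows; how the mask is built from the contour and crown images is application specific and not reproduced here.

```python
import cv2
import numpy as np

def grabcut_with_mask(image_bgr, mask_binary, n_iter=5):
    """Run GrabCut in mask mode, seeded from a binary mask such as the
    median-filtered synthetic image described above.

    Nonzero mask pixels are marked probable foreground, the rest probable
    background; `n_iter` is an illustrative choice.
    """
    mask = np.where(mask_binary > 0, cv2.GC_PR_FGD,
                    cv2.GC_PR_BGD).astype(np.uint8)
    bgd = np.zeros((1, 65), np.float64)          # background GMM buffer
    fgd = np.zeros((1, 65), np.float64)          # foreground GMM buffer
    cv2.grabCut(image_bgr, mask, None, bgd, fgd, n_iter,
                cv2.GC_INIT_WITH_MASK)
    fg = np.isin(mask, (cv2.GC_FGD, cv2.GC_PR_FGD)).astype(np.uint8)
    return image_bgr * fg[:, :, None]            # segmented target tooth
```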
Deep learning features for robust facial kinship verification
- Author(s): Amina Tidjani ; Abdelmalik Taleb-Ahmed ; Djamel Samai ; Aiadi Kamal Eddine
- Source: IET Image Processing, Volume 12, Issue 12, p. 2336 –2345
- DOI: 10.1049/iet-ipr.2018.5552
- Type: Article
Automatic facial kinship verification is a new and challenging research problem in computer vision: it automatically examines facial attributes to predict whether two persons have a biological kin relation. In this study, the authors introduce a novel learning method for kinship verification consisting of four main stages. (i) A discrete cosine transform network (DCTNet) is applied to each face image to extract the most significant inherited facial features through convolutional layers based on a 2D DCT filter bank. (ii) The response of the last layer is binarised and partitioned into non-overlapping block-wise histograms. (iii) Tied-rank normalisation is used to eliminate the disparity of the DCTNet histogram vectors. (iv) The last stage distinguishes between the different pairs, making the distances between data points in the same class (positive pairs) as small as possible and the distances between data points in different classes (negative pairs) as large as possible. Experiments conducted on three public databases (UBKinFace, KinFaceW-I, and KinFaceW-II) show significant performance improvements compared to state-of-the-art methods.
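Stages (i)-(ii) can be illustrated with a data-independent 2D DCT filter bank, as in the sketch below; the filter size, filter count, and zero-threshold binarisation are assumptions, not the paper's exact configuration.

```python
import numpy as np
from scipy.fft import idct
from scipy.signal import convolve2d

def dct_filter_bank(k=8, n_filters=8):
    """2D DCT filters as outer products of 1D DCT basis vectors, the kind
    of data-independent filters a DCTNet convolves with."""
    basis = idct(np.eye(k), norm="ortho", axis=0)    # columns: 1D DCT basis
    filters = [np.outer(basis[:, i], basis[:, j])
               for i in range(k) for j in range(k)]
    return filters[1:n_filters + 1]                  # skip the DC filter

def dctnet_layer(img, filters):
    """Convolve with the bank, then binarise the responses (step (ii))."""
    responses = [convolve2d(img, f, mode="same") for f in filters]
    return [(r > 0).astype(np.uint8) for r in responses]
```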
Denoising of ultrasound images affected by combined speckle and Gaussian noise
- Author(s): Mehdi Mafi ; Solale Tabarestani ; Mercedes Cabrerizo ; Armando Barreto ; Malek Adjouadi
- Source: IET Image Processing, Volume 12, Issue 12, p. 2346 –2351
- DOI: 10.1049/iet-ipr.2018.5292
- Type: Article
This study introduces an image denoising method for combined speckle and Gaussian noise. The dual-tree complex wavelet transform is applied to the image to obtain coefficients characterising these types of noise; the extracted coefficients are removed by thresholding, and an inverse wavelet transform is applied to obtain the reconstructed image. A comparison between dual-tree and standard wavelet-based denoising filters is provided on the basis of different structural metrics. Finally, to remove any remaining noise, a spatial denoising filter is applied to the image. The results obtained on medical ultrasound images corrupted by this noise combination support the authors' assertion of the method's resilience to the combined effects of speckle and Gaussian noise, and are compared to well-known, highly effective speckle–Gaussian denoising filters.
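As a rough illustration of wavelet-domain thresholding, the sketch below uses PyWavelets with a plain DWT standing in for the dual-tree complex wavelet transform; the universal threshold and MAD noise estimate are common defaults, not the paper's rule.

```python
import numpy as np
import pywt

def wavelet_threshold_denoise(img, wavelet="db4", level=3):
    """Wavelet-domain soft thresholding with PyWavelets.

    A plain DWT stands in for the dual-tree complex wavelet transform;
    threshold choice and noise estimate are common defaults.
    """
    coeffs = pywt.wavedec2(img.astype(np.float64), wavelet, level=level)
    # Estimate noise sigma from the finest diagonal subband (MAD estimator).
    sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745
    thr = sigma * np.sqrt(2 * np.log(img.size))      # universal threshold
    new = [coeffs[0]] + [tuple(pywt.threshold(c, thr, mode="soft")
                               for c in band) for band in coeffs[1:]]
    return pywt.waverec2(new, wavelet)
```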