IET Image Processing
Volume 14, Issue 15, 15 December 2020
- Source: IET Image Processing, Volume 14, Issue 15, p. 3633 –3634
- DOI: 10.1049/iet-ipr.2020.1284
- Type: Article
- Author(s): Jinxing Niu ; Yajie Jiang ; Yayun Fu
- Source: IET Image Processing, Volume 14, Issue 15, p. 3635 –3638
- DOI: 10.1049/iet-ipr.2019.1588
- Type: Article
Owing to low light intensity, the image detector is underexposed and the colour and contrast of scene images are altered, so it is of great significance to study image sharpening in weak-illumination environments. In this study, the causes of image blurring are first analysed using histogram equalisation (HE) and retinex theory. A method combining HE and multi-scale retinex with colour recovery is then proposed. Experimental simulation results show that the proposed algorithm gives the most pronounced enhancement: local details and dark areas are clearly visible, the colour is rich, and the visual effect is the best, giving the method practical application value.
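The HE half of the pipeline can be sketched in a few lines of NumPy (a generic textbook histogram equalisation, not the authors' combined HE/multi-scale-retinex method; the function name and the 8-bit assumption are ours):

```python
import numpy as np

def equalize_histogram(img):
    """Classic histogram equalisation: map each grey level through the
    normalised cumulative histogram so output levels spread over 0-255."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()          # lowest occupied level maps to 0
    scale = 255.0 / max(img.size - cdf_min, 1)
    lut = np.clip(np.floor((cdf - cdf_min) * scale + 0.5), 0, 255).astype(np.uint8)
    return lut[img]
```

On a 2x2 image holding the levels 0, 1, 2, 3 this stretches them to 0, 85, 170, 255, which is the contrast-spreading behaviour the abstract relies on.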
- Author(s): Yongcun Cao ; Saisai Ji ; Yong Lu
- Source: IET Image Processing, Volume 14, Issue 15, p. 3639 –3650
- DOI: 10.1049/iet-ipr.2020.0111
- Type: Article
The artificial bee colony (ABC) algorithm is a biologically inspired optimisation algorithm proposed by Karaboga. Since its solution search equation is good at exploration but poor at exploitation, the ABC algorithm converges slowly and easily falls into local optima. Inspired by opposition-based learning (OBL), the authors propose an improved ABC algorithm called opposition-based learning ABC (OLABC). In OLABC, firstly, the population is initialised using OBL. Secondly, to preserve population diversity during the iterative process, the solution search equation used in the employed-bee phase is improved: an opposite solution is generated whenever the fitness value of a newly generated solution is smaller than that of the current solution, and the greedy selection strategy is then applied to update the solution. Thirdly, an adaptive weight strategy dynamically adjusts the weight, balancing the global exploration and local exploitation capabilities of the algorithm. Experiments on a set of benchmark functions show that OLABC has better convergence speed and optimisation precision than the compared algorithms.
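The opposition step that OLABC builds on is simple to sketch (a generic opposition-based initialisation, not the authors' full OLABC; the function name, bounds, and fitness function are illustrative placeholders): for x in [lb, ub], the opposite solution is lb + ub − x, and the fitter half of the combined population survives.

```python
import numpy as np

rng = np.random.default_rng(0)

def obl_initialise(pop_size, lb, ub, fitness):
    """Opposition-based initialisation: sample a random population, form the
    opposite population x_opp = lb + ub - x, and keep the fitter half
    (lower fitness = better, as for minimisation benchmarks)."""
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    pop = lb + rng.random((pop_size, lb.size)) * (ub - lb)
    opp = lb + ub - pop                          # opposite solutions
    both = np.vstack([pop, opp])
    scores = np.array([fitness(x) for x in both])
    return both[np.argsort(scores)[:pop_size]]   # best pop_size survive
```

Because each opposite point stays inside [lb, ub], the step costs one extra fitness evaluation per individual while doubling the initial coverage of the search space.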
- Author(s): Qingshan Hou and Jinsheng Xing
- Source: IET Image Processing, Volume 14, Issue 15, p. 3651 –3661
- DOI: 10.1049/iet-ipr.2020.0077
- Type: Article
Considering that the single shot multibox detector (SSD) algorithm misses, or even falsely detects, small- and medium-sized objects, this study proposes a Kullback–Leibler single shot multibox detection (KSSD) object detection algorithm to improve the accuracy of small- and medium-sized object detection. Firstly, the details of the detection process are visualised with gradient-weighted class activation mapping technology, and the details of each detection layer are shown as class activation maps. It is then noted that the false or missed detection of small- and medium-sized objects in the SSD algorithm is related to the regression loss function. Accordingly, a Kullback–Leibler border regression loss strategy is adopted and a non-maximum suppression algorithm is used to output the final prediction boxes. Experimental results show that, compared with existing detection algorithms, the improved algorithm has higher accuracy and stability and significantly improves detection of small- and medium-sized objects.
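One common form of a Kullback–Leibler border regression loss (from the wider detection literature, not necessarily the exact loss used in KSSD) treats each predicted box coordinate as a Gaussian and the ground truth as a Dirac delta; the KL divergence then penalises error scaled by the predicted variance:

```python
import numpy as np

def kl_box_loss(pred, target, log_var):
    """KL-style regression loss for one box coordinate: the network predicts
    a Gaussian N(pred, sigma^2), the ground truth is a Dirac delta, and up
    to constants KL = (target - pred)^2 / (2 sigma^2) + 0.5 * log sigma^2.
    Predicting log(sigma^2) keeps the loss numerically stable."""
    inv_var = np.exp(-log_var)
    return 0.5 * inv_var * (target - pred) ** 2 + 0.5 * log_var
```

With log_var = 0 this reduces to a plain squared error; raising the predicted uncertainty discounts the penalty on hard, ambiguous borders, which is the mechanism such losses use to stabilise small-object regression.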
- Author(s): Ying Sun ; Yaoqing Weng ; Bowen Luo ; Gongfa Li ; Bo Tao ; Du Jiang ; Disi Chen
- Source: IET Image Processing, Volume 14, Issue 15, p. 3662 –3668
- DOI: 10.1049/iet-ipr.2020.0148
- Type: Article
With the rapid development of sensor technology and artificial intelligence, video gesture recognition technology under the background of big data makes human–computer interaction more natural and flexible, bringing a richer interactive experience to teaching, on-board control, electronic games, etc. To perform robust recognition under illumination change, background clutter, rapid movement, and partial occlusion, an algorithm based on multi-level feature fusion of a two-stream convolutional neural network is proposed, which includes three main steps. Firstly, a Kinect sensor acquires red–green–blue-depth (RGB-D) images to establish a gesture database, and data augmentation is performed on the training and test sets. Then, a multi-level feature-fusion model of a two-stream convolutional neural network is established and trained. Experiments show that the proposed network model can robustly track and recognise gestures under complex backgrounds (such as similar skin colour, illumination changes, and occlusion); compared with the single-channel model, the average detection accuracy is improved by 1.08% and the mean average precision by 3.56%.
- Author(s): A. Kavipriya and A. Muthukumar
- Source: IET Image Processing, Volume 14, Issue 15, p. 3669 –3675
- DOI: 10.1049/iet-ipr.2020.0145
- Type: Article
Biometrics is the branch of science that measures an individual's behavioural and physiological characteristics. For years, biometric technology has been regarded as a unique tool for security purposes. Among biometric modalities, ear recognition and finger knuckle print (FKP) have emerged as booming research areas attracting many researchers in recent years, with a wide range of personal and law-enforcement applications. Authenticating a combination of multiple human attributes is considered a competent strategy in a multimodal personal authentication system. This paper studies the extraction of local and global features in the time-frequency domain. The proposed scheme analyses two biometric modalities, ear and finger knuckle print, which are combined at the feature-level fusion stage. Feature extraction for the two biometric patterns generates local and global feature information that helps refine the alignment of the two biometric images during matching: the discrete orthonormal Stockwell transform for ear recognition and band-limited phase-only correlation for finger knuckle print. Experimental results on ear and FKP data demonstrate improved recognition accuracy.
- Author(s): Jingjing Yang ; Jinzhao Wu ; Xiaojing Wang
- Source: IET Image Processing, Volume 14, Issue 15, p. 3676 –3681
- DOI: 10.1049/iet-ipr.2020.0078
- Type: Article
Privacy information leaks have become a major problem hindering the development of convolutional neural networks and deep learning. Differential privacy protection has been applied to deep learning by more and more scholars to protect image training sets. The differentially private SGD (DP-SGD) algorithm adds Gaussian noise of a fixed level, which causes the accuracy of the model to increase only slowly as training proceeds. To solve this problem, this study presents a convolutional neural network based on differential privacy with an exponential attenuation mode. Firstly, the attenuation coefficient of the Gaussian noise is linked with the number of training iterations and the accuracy of the model. Secondly, the DP-SGD algorithm in exponential attenuation mode is proposed, and the calculation method of the differential privacy budget in this mode is given. Finally, experiments on the MNIST dataset and X-ray images verify the feasibility of the proposed differentially private convolutional neural network with exponentially attenuated Gaussian noise.
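A minimal sketch of the idea, assuming the attenuation follows sigma_t = sigma0 · exp(−decay · t) (the paper also couples the coefficient to model accuracy, which is omitted here; all parameter names are placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_gradient(grad, step, sigma0=1.0, decay=0.05, clip_norm=1.0):
    """DP-SGD-style update with exponentially attenuated Gaussian noise:
    clip the gradient to clip_norm, then add N(0, sigma_t^2) noise with
    sigma_t = sigma0 * exp(-decay * step)."""
    norm = np.linalg.norm(grad)
    clipped = grad * min(1.0, clip_norm / max(norm, 1e-12))
    sigma_t = sigma0 * np.exp(-decay * step)
    return clipped + rng.normal(0.0, sigma_t, size=grad.shape), sigma_t
```

Early iterations receive strong noise (large privacy protection per step), while later iterations receive less, which is what lets accuracy keep improving with training time; the privacy budget must of course be accounted over the whole schedule.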
- Author(s): Yuan Cui ; Yang Yu ; Zhuyang Chen ; Bo Xue
- Source: IET Image Processing, Volume 14, Issue 15, p. 3682 –3688
- DOI: 10.1049/iet-ipr.2020.0081
- Type: Article
On flat-panel-display industrial production lines, power consumption is an important quantity determined by the displays' different driving modes and light-emitting principles. Based on industrial control systems, this paper analyses the power consumption of flat panel displays, studies the main factors affecting it, and analyses the main methods for improving light efficiency and reducing power consumption. The paper also discusses the focal issues in current domestic and international standards for flat-panel-display power measurement. According to the characteristics of different flat panel displays, it discusses the dual indexes of power consumption and efficiency in power measurement, and how to account for the influence of screen brightness and viewing angle in the measurement.
- Author(s): Li Huang ; Meiling He ; Chong Tan ; Du Jiang ; Gongfa Li ; Hui Yu
- Source: IET Image Processing, Volume 14, Issue 15, p. 3689 –3697
- DOI: 10.1049/iet-ipr.2020.0088
- Type: Article
Image semantic segmentation has always been a research hotspot in the field of robotics. Its purpose is to assign different semantic category labels to objects by segmenting different objects. However, in practical applications, in addition to knowing the semantic category of objects, robots also need to know the position of objects to complete more complex visual tasks. Aiming at complex indoor environments, this study designs an image semantic segmentation network framework jointly with target detection. By adding a semantic segmentation branch that operates in parallel with the target detection network, it implements a multi-task vision pipeline combining object classification, detection, and semantic segmentation. By designing a new loss function, adjusting training using the idea of transfer learning, and finally verifying on a self-built indoor scene data set, the experiments prove that the method is feasible, effective, and robust.
- Author(s): Mingyue Qian ; Zhaoting Zhang ; Jiechun Chen
- Source: IET Image Processing, Volume 14, Issue 15, p. 3698 –3704
- DOI: 10.1049/iet-ipr.2019.1629
- Type: Article
The diagnosis and prevention of Alzheimer's disease play an important role in improving patient cognition. The current situation shows that the diagnosis of Alzheimer's disease remains poor because it is affected by many factors. On this basis, and combined with the symptoms of Alzheimer's disease, this study used computer-aided methods to diagnose patients' symptoms. First of all, this study analyses classical machine learning and chooses the appropriate model for diagnosis. Next, it constructs a diagnostic system based on a Gaussian mixture model and uses that model to predict the probability under different distribution assumptions. Finally, it designs experiments to analyse the performance of the diagnostic models. The studies show that the Gaussian mixture model has a good effect on the automatic diagnosis of Alzheimer's disease and can provide a theoretical reference for subsequent related research.
Guest Editorial: Evolutionary Computation for Image Processing
Research on image sharpening algorithm in weak illumination environment
Improved artificial bee colony algorithm with opposition-based learning
KSSD: single-stage multi-object detection algorithm with higher accuracy
Gesture recognition algorithm based on multi-scale feature fusion in RGB-D images
Innovative approach for multimodal fusion recognition based feature extraction using band-limited phase-only correlation and discrete orthonormal Stockwell transform
Convolutional neural network based on differential privacy in exponential attenuation mode for image classification
Research on measurement and energy efficiency improvement of flat panel display based on industrial control
Jointly network image processing: multi-task image semantic segmentation of indoor scene based on CNN
Combined mixed Gaussian model with pattern recognition in the automatic diagnosis of Alzheimer's disease
- Author(s): Prasun Chandra Tripathi and Soumen Bag
- Source: IET Image Processing, Volume 14, Issue 15, p. 3705 –3717
- DOI: 10.1049/iet-ipr.2020.0383
- Type: Article
Segmentation of tissues in brain magnetic resonance (MR) images has a crucial role in computer-aided diagnosis (CAD) of various brain diseases. However, due to the complex anatomical structure and the presence of intensity non-uniformity (INU) artefact, the segmentation of brain MR images is considered a complicated task. In this study, the authors propose a novel locally influenced fuzzy C-means (LIFCM) clustering for segmentation of tissues in brain MR images. The proposed method incorporates local information in the clustering process to achieve accurate labelling of pixels. A novel local influence factor is proposed, which estimates the influence of a neighbouring pixel on the centre pixel. Furthermore, they introduce a kernel-induced distance in LIFCM, which deals with complex brain MR data and produces effective segmentation. To evaluate the performance of the proposed method, they use one simulated and one real MRI data set. Extensive experimental findings suggest that the authors' method not only produces effective segmentation but also retains crucial image details. A statistical significance test has further been conducted to support the experimental observations.
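For reference, the standard FCM membership update that LIFCM extends looks like this (without the local influence factor or the kernel-induced distance, which are the paper's contributions; the fuzzifier m and the data are placeholders):

```python
import numpy as np

def fcm_memberships(X, centres, m=2.0):
    """Standard fuzzy C-means membership update: for pixel x_k and centre
    v_i, u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1)), so memberships over the
    centres sum to 1 and shrink with distance."""
    # d[i, k] = distance from pixel k to centre i
    d = np.linalg.norm(X[None, :, :] - centres[:, None, :], axis=2)
    d = np.maximum(d, 1e-12)                       # avoid division by zero
    p = 2.0 / (m - 1.0)
    return 1.0 / np.sum((d[:, None, :] / d[None, :, :]) ** p, axis=1)
```

LIFCM's local influence factor would enter this update as an extra neighbourhood-weighted term; the plain form above assigns each pixel purely by its own intensity distance.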
- Author(s): Kuryati Kipli ; Mohammed Enamul Hoque ; Lik Thai Lim ; Tengku Mohd Afendi Zulcaffle ; Siti Kudnie Sahari ; Muhammad Hamdi Mahmood
- Source: IET Image Processing, Volume 14, Issue 15, p. 3718 –3724
- DOI: 10.1049/iet-ipr.2020.0336
- Type: Article
Image processing applications contribute remarkably to modern ophthalmology. This technology is designed to analyse the characteristics of human eye microvasculature images. The retinal microvasculature is an excellent non-invasive screening window for the assessment of systemic diseases such as diabetes, hypertension, and stroke. Retinal microvasculature characteristics such as widening vessel diameter are recognised as analysable features for predicting the progression of stroke or transient ischaemic attack. Thus, in this study, a computer-assisted method has been developed for this task applying the Euclidean distance transform (EDT) technique. The algorithm computes the Euclidean distance of the remaining white pixels in the area of interest. The Central Light Reflex Image Set (CLRIS) and Vascular Disease Image Set (VDIS) of the Retinal Vessel Image set for Estimation of Width database were used for performance evaluation, and the proposed algorithm achieved 98.1% and 97.7% accuracy on CLRIS and VDIS, respectively. The high accuracy of this vessel diameter quantification algorithm indicates excellent potential for further development, evaluation, validation, and integration into ophthalmic diagnostic instruments.
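The core EDT computation can be sketched with a brute-force NumPy implementation (fine for illustration; a real pipeline would use a linear-time transform such as `scipy.ndimage.distance_transform_edt`). The maximum distance inside a segmented vessel mask gives a half-width estimate:

```python
import numpy as np

def edt(mask):
    """Brute-force Euclidean distance transform: for every vessel (True)
    pixel, the distance to the nearest background (False) pixel."""
    fg = np.argwhere(mask)
    bg = np.argwhere(~mask)
    out = np.zeros(mask.shape)
    for r, c in fg:
        out[r, c] = np.sqrt(((bg - [r, c]) ** 2).sum(axis=1)).min()
    return out
```

For a horizontal vessel stripe three pixels wide, the transform peaks along the centreline, so the peak distance tracks the vessel half-width, which is the quantity the diameter measurement is built from.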
- Author(s): Zujing Yan ; Yunhong Xin ; Yixuan Zhang
- Source: IET Image Processing, Volume 14, Issue 15, p. 3725 –3732
- DOI: 10.1049/iet-ipr.2020.1157
- Type: Article
Local contrast measure (LCM) has proved to be an effective method for infrared small target detection. However, the detection performance of LCM decreases dramatically when the background contains strong edges and pixel-sized noises with high brightness (PNHB). Based on an analysis of the inherent causes of the poor performance of LCM in extremely complex backgrounds, this study presents an effective LCM with an iterative error. The contributions are as follows: first, a two-dimensional least mean square (TDLMS) filter with an adaptive parameter is applied to roughly suppress background clutter in each multiscale window. Then, the partial maximum pixel mean is applied to the LCM to optimise the sub-block statistical parameters, which achieves excellent suppression of strong edges. Finally, the iteration error generated by TDLMS and the sub-block weight matrix are updated alternately to further optimise the statistical parameters of the contrast measure, making it more effective at suppressing PNHB. Experimental results demonstrate that the proposed approach is not only superior to the compared methods in terms of high detection efficiency and low false alarm rate but also has satisfactory adaptability under extremely complex backgrounds.
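A stripped-down 2D LMS predictor illustrates the first stage (a fixed step size and a 4-pixel causal support; the paper's adaptive parameter and iteration-error feedback are omitted): background pixels are predicted well from their neighbours, so a small bright target shows up as a large prediction residual.

```python
import numpy as np

def tdlms_residual(img, mu=0.05):
    """Simplified 2D LMS: predict each pixel from 4 causal neighbours with
    weights w, record the prediction error, and adapt w by w += mu * e * x.
    Smooth background yields small residuals; a small target yields a spike."""
    h, w_img = img.shape
    w = np.zeros(4)
    res = np.zeros_like(img, dtype=float)
    for r in range(1, h - 1):
        for c in range(1, w_img - 1):
            x = np.array([img[r-1, c-1], img[r-1, c], img[r-1, c+1], img[r, c-1]])
            e = img[r, c] - w @ x       # prediction error (the residual map)
            res[r, c] = e
            w += mu * e * x             # LMS weight update
    return res
```

On a flat background with one bright pixel, the residual is near zero everywhere except at the target, which is exactly the clutter-suppression behaviour the TDLMS pre-filter provides to the contrast measure.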
- Author(s): Vikash K. Mishra and Triloki Pant
- Source: IET Image Processing, Volume 14, Issue 15, p. 3733 –3741
- DOI: 10.1049/iet-ipr.2020.1078
- Type: Article
In the present study, Landsat-8 data have been used for water identification at the Sangam region, the confluence of the Ganges and Yamuna rivers. Since the water of the two rivers varies in characteristics, it is necessary to explore the most appropriate band of Landsat-8 imagery for the identification of water bodies. Accordingly, a twofold study has been performed: selection of a suitable band to identify water bodies, and monitoring of the water level at the Sangam region from 2013 to 2018. The study observes that band 7 is the most suitable band for water identification, and it is used for the classification and monitoring of water. The classification results are highly significant, with overall accuracy ranging between 95.49 and 100%. The monitoring of water level is based on the classification results, mapping the water area in real units, which can be useful for the prediction of flood situations. The achievement of the present study is the mapping of water area in real units from image pixels, one of the prominent applications of image processing in the earth sciences.
- Author(s): Saeed Najafi Khanbebin and Vahid Mehrdad
- Source: IET Image Processing, Volume 14, Issue 15, p. 3742 –3750
- DOI: 10.1049/iet-ipr.2020.0394
- Type: Article
Local binary patterns (LBPs) are a common way of gathering local features for face recognition algorithms. Although LBPs are widely applied in recognition tasks, these methods have limited accuracy because of their threshold value: two different regions with diverse pixel neighbourhoods can yield the same value, which corrupts the feature vector and decreases discriminative power. In this study, the authors propose a modified LBP that addresses these disadvantages. The proposed approach is arithmetic-coded LBP (ACLBP), which uses an arithmetic coding process during LBP calculation instead of the original thresholds, addressing the problem of returning the same LBP value for two different patches. Moreover, the proposed method modifies LBP by using a different threshold when calculating pixel differences. Using this algorithm, the authors conducted genetic-based feature fusion combining LBP, histogram of oriented gradients, and ACLBP. The proposed approach performs better on the LFW, ORL, and Yale face datasets, showing the improvement of ACLBP over the earlier version of LBP.
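The classic 3 × 3 LBP code that ACLBP modifies is computed as follows (the clockwise neighbour ordering is one common convention, not necessarily the authors'):

```python
import numpy as np

# Clockwise from the top-left neighbour.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(patch):
    """Classic 3x3 LBP: threshold the 8 neighbours against the centre pixel
    and read the resulting bits as one byte (0-255)."""
    c = patch[1, 1]
    bits = [(1 if patch[1 + dr, 1 + dc] >= c else 0) for dr, dc in OFFSETS]
    return sum(b << i for i, b in enumerate(bits))
```

The thresholding step is exactly where the limitation described above arises: any neighbour even slightly below the centre contributes a 0, so quite different patches can collapse to the same code, which is what ACLBP's arithmetic-coding replacement aims to avoid.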
- Author(s): Wanneng Wu ; Min Xu ; Qiaokang Liang ; Li Mei ; Yu Peng
- Source: IET Image Processing, Volume 14, Issue 15, p. 3751 –3761
- DOI: 10.1049/iet-ipr.2020.0757
- Type: Article
Accurate ball tracking in sports is vital for automatic sports analysis, yet it is challenging, mainly due to the ball's small size and occlusions. This study proposes a novel multi-camera 3D ball tracking (MBT) framework for sports video. The proposed framework consists of four parts: 2D ball detection, 2D ball tracking, 3D position fusion, and 3D ball tracking. In the 2D stage, multi-scale features are introduced to enhance ball detection, and 2D tracking is improved by exploiting cross-view information to handle occlusion and by timely updating the tracking model with detection results to alleviate tracking drift. In the 3D stage, a novel position fusion method is proposed to optimise the ball position, and a 3D tracking approach with an improved Kalman filter is applied to ensure a smooth 3D trajectory. Moreover, compared to existing commercial products, the proposed framework does not require any special equipment and is thus low cost. Extensive 2D and 3D experiments on a public dataset demonstrate that the proposed framework is robust for ball tracking in sports video, even in the presence of environmental interference, substantial occlusions, and calibration errors.
- Author(s): Hiwa Sufi karimi and Karim Mohammadi
- Source: IET Image Processing, Volume 14, Issue 15, p. 3762 –3773
- DOI: 10.1049/iet-ipr.2019.1621
- Type: Article
The biologically inspired hierarchical model and X (HMAX) has been one of the superior techniques for object recognition. HMAX is robust to several image variations, including illumination, scale, and location changes. However, the performance of HMAX deteriorates sharply if the orientation of the images used in training differs from the orientation of the testing images. In this study, the authors propose rotational invariant HMAX (RIMAX) to overcome the recognition problems caused by rotations in images. To this end, they embed two new layers into the structure of the standard HMAX. In the first added layer, non-accidental properties (e.g. corners and edges) are extracted as features, leading to a repeatable object recognition process. The second added layer makes RIMAX robust to image rotations by normalising the dominant orientation of the extracted features. Furthermore, the imposed computational load is considerably reduced by modifying the template matching strategy and removing multiple scales of the Gabor filter. Simulation results on the Caltech101, TUD, Caltech5, and GRAZ-02 databases validate that RIMAX outperforms the standard HMAX and other mathematical approaches in terms of robustness, accuracy, repeatability, and speed.
- Author(s): Mohammad Amin Mehralian and Mohsen Soryani
- Source: IET Image Processing, Volume 14, Issue 15, p. 3774 –3780
- DOI: 10.1049/iet-ipr.2020.0606
- Type: Article
In real-world applications, the perspective-n-point (PnP) problem should generally be applied to a sequence of images in which a set of drift-prone features is tracked over time. In this study, the authors consider both the temporal dependency of camera poses and the uncertainty of features for vision-only sequential camera pose estimation. Using the extended Kalman filter (EKF), an a priori estimate of the camera pose is calculated from the camera motion model and then corrected by minimising the reprojection error of the reference points. The probabilistic approach also provides the covariance of the pose parameters, which helps measure the reliability of the estimates. Experimental results, using both synthetic and real data, demonstrate that the proposed method improves the robustness of camera pose estimation in the presence of tracking error and feature-matching outliers, compared to the state of the art.
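The predict/correct cycle can be sketched generically (a linearised update with an identity motion model; the paper's actual state holds the camera pose and its measurement model is the reprojection of tracked points, so every symbol below is a simplification):

```python
import numpy as np

def ekf_step(x, P, z, h, H, Q, R):
    """One EKF predict/update cycle with an identity (constant-pose) motion
    model: predict inflates the covariance by the process noise Q, and the
    measurement z = h(x) + noise corrects the state through the linearised
    model H. The returned P quantifies the reliability of the estimate."""
    # Predict: the state stays, the uncertainty grows.
    P = P + Q
    # Update: Kalman gain weighs prediction against measurement noise R.
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - h(x))
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P
```

After each update the covariance shrinks, which is exactly the per-parameter reliability information the abstract refers to.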
- Author(s): Sherin Shibi and Gayathri Rajagopal
- Source: IET Image Processing, Volume 14, Issue 15, p. 3781 –3790
- DOI: 10.1049/iet-ipr.2020.0344
- Type: Article
Target object detection is an important research direction in the area of hyperspectral imaging (HSI), as it aims to detect anomalies or objects in HSI. Some existing target object detection methods are suitable only for low-resolution HSI, as they cannot be applied directly to high-resolution HSI. Therefore, an effective target detection method named chicken social-based deep belief network (CS-based DBN) is proposed to achieve an automatic target object detection framework for high-resolution HSI. The proposed CS-based DBN is developed by integrating chicken swarm optimisation with the social ski-driver algorithm. The optimal solution for detecting the target object is revealed through a fitness function that accepts the minimal error value as the best solution. Moreover, the weights of the DBN classifier are optimally trained with the proposed algorithm to render an accurate and optimal solution for detecting the target objects. The proposed CS-based DBN obtains better performance thanks to stochastic exploration of the search space. The results achieved using the proposed model in terms of accuracy, specificity, and sensitivity are 0.8950, 0.8940, and 0.9, respectively.
- Author(s): Mehdi Mafi ; Walter Izquierdo ; Harold Martin ; Mercedes Cabrerizo ; Malek Adjouadi
- Source: IET Image Processing, Volume 14, Issue 15, p. 3791 –3801
- DOI: 10.1049/iet-ipr.2019.0931
- Type: Article
This study utilises a deep convolutional neural network (CNN) implementing regularisation and batch normalisation for the removal of mixed, random, impulse, and Gaussian noise of various levels from digital images. The deep CNN achieves minimal loss of detail yet yields an optimal estimation of structural metrics when dealing with both known and unknown noise mixtures. Moreover, a comprehensive comparison of denoising filters across different structural metrics is provided to highlight the merits of the proposed approach. Optimal denoising results were obtained by using a 20-layer network with 40 × 40 patches trained on 400 images of size 180 × 180 from the Berkeley segmentation data set (BSD) and tested on the BSD100 data set and an additional 12 images of general interest to the research community. The comparative results provide credence to the merits of the proposed filter, and the comprehensive assessment highlights the novelty and performance of this CNN-based approach.
- Author(s): Lili Fan ; Hongwei Zhao ; Haoyu Zhao
- Source: IET Image Processing, Volume 14, Issue 15, p. 3802 –3811
- DOI: 10.1049/iet-ipr.2020.0454
- Type: Article
Deep image embedding learns how to map images onto feature vectors, and image retrieval performance is often used to evaluate embedding quality. In this study, the authors propose a wise deep image embedding optimisation (WDIEO) algorithm based on informative pair weighting and ranked list learning (IPWRLL) for network optimisation in fine-grained image retrieval. First, a hard-sample mining method, Top-k, is proposed to select positive and negative samples. Then, for a selected query sample, a ranking list is obtained by comparing the similarity between the samples in the data set and the query, and each sample is labelled according to that similarity. Finally, for positive samples, two optimisation rules with different functions are used, addressing two key issues: instance weighting and intra-class data distribution. For negative samples, differing from widely adopted methods based on sample-information weights, the algorithm's weights are set according to the ranking list, which keeps the inter-class data distribution and the optimisation direction consistent with the loss reduction direction. The WDIEO-IPWRLL model is an end-to-end optimisation that can share parameters at test time. Experiments show that the proposed model achieves state-of-the-art performance on the benchmark data set.
- Author(s): Hajar Danesh ; Keivan Maghooli ; Alireza Dehghani ; Rahele Kafieh
- Source: IET Image Processing, Volume 14, Issue 15, p. 3812 –3818
- DOI: 10.1049/iet-ipr.2020.0075
- Type: Article
Limited labelled data is a challenge in the field of medical imaging, where large amounts of labelled data are paramount both for training machine learning algorithms and for measuring the performance of image processing algorithms. The purpose of this study is to construct synthetic, labelled optical coherence tomography (OCT) data to address the problems of accessing accurately labelled data and of evaluating processing algorithms. A modified active shape model is used which considers the anatomical features of available images, such as the number and thickness of the layers and their associated brightness, the location of retinal blood vessels, and shadow information with respect to speckle noise. The algorithm can also provide data sets with varying noise levels. The validity of the proposed method for the synthesis of retinal images is measured by two methods: qualitative assessment and quantitative analysis.
- Author(s): Pei Jiang ; Shiwen He ; Hufei Yu ; Yaoxue Zhang
- Source: IET Image Processing, Volume 14, Issue 15, p. 3819 –3828
- DOI: 10.1049/iet-ipr.2020.0444
- Type: Article
With the rapid development of the Internet, watermarks are widely used in images to protect copyright, which makes the robustness of watermarks very important. In recent years, several studies have evaluated watermark performance by removing the watermark. Among them, some methods need the watermark position marked in advance, and some require multiple images with the same watermark. Moreover, when the colour of the watermark is similar to that of the background, existing methods can hardly remove the watermark from the watermarked image. In this work, the authors present a watermark removal structure consisting of watermark extraction and image inpainting to address these issues. In particular, an extraction network extracts the watermark from the watermarked image, and an inpainting network inpaints the image to produce a better watermark-removed result. Finally, the authors train and test the developed network architecture on two constructed data sets, i.e. a white watermarked image data set (WW-data set) and a colour watermarked image data set (CW-data set). The proposed method not only outperforms the latest methods on the WW-data set but also effectively removes watermarks on the CW-data set, where other methods almost entirely fail.
- Author(s): Zihan Yuan ; Qingtang Su ; Decheng Liu ; Xueting Zhang ; Tao Yao
- Source: IET Image Processing, Volume 14, Issue 15, p. 3829 –3838
- DOI: 10.1049/iet-ipr.2019.1740
- Type: Article
To solve the copyright protection problem for colour images, a new blind colour image watermarking method performing a discrete cosine transform (DCT) computation in the spatial domain is presented in this study. The scheme makes full use of the advantages of both spatial-domain and frequency-domain watermarking algorithms. Based on different quantisation steps in the red, green, and blue layers, watermark embedding and blind extraction are completed in the spatial domain without computing a real DCT. The scheme is realised by using the unique features of the direct current (DC) coefficient and the relations between DC coefficients of adjacent pixel blocks. This effectively solves the problems of large-capacity colour image watermarking algorithms, such as long running time and weak robustness. Compared with other advanced watermarking algorithms, the presented scheme has better invisibility, stronger robustness, and higher real-time performance.
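The trick that makes this possible is that, for an orthonormal 2D DCT of an n × n block, the DC coefficient equals the pixel sum divided by n, so quantising the DC only requires shifting the block mean and no DCT need ever be computed. A sketch (the even/odd quantisation rule below is a hypothetical illustration, not the paper's exact scheme):

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis; the DC coefficient of C @ B @ C.T equals
    B.sum() / n, i.e. n times the block mean."""
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0] *= np.sqrt(1.0 / n)
    C[1:] *= np.sqrt(2.0 / n)
    return C

def embed_bit(block, bit, step=24.0):
    """Shift every pixel equally so the DC coefficient lands in an even or
    odd quantisation bin -- entirely in the spatial domain."""
    n = block.shape[0]
    dc = block.sum() / n
    target = np.floor(dc / step) * step + (0.75 if bit else 0.25) * step
    return block + (target - dc) / n

def extract_bit(block, step=24.0):
    """Blind extraction: read the bin of the (spatially computed) DC."""
    dc = block.sum() / block.shape[0]
    return int((dc % step) > step / 2)
```

Because embedding and extraction touch only the block mean, the scheme gets DCT-domain robustness at spatial-domain cost, which is the running-time advantage the abstract claims.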
- Author(s): Yahong Xie ; Hailin Wang ; Jianjun Wang
- Source: IET Image Processing, Volume 14, Issue 15, p. 3839 –3850
- DOI: 10.1049/iet-ipr.2020.0834
- Type: Article
Recently, deep learning methods have brought remarkable improvements to the recovery stage of compressed sensing for images. In the compressed measurement stage, existing methods measure block by block, owing to the huge measurement dictionary required for whole images and the resulting high computational complexity. In this work, a novel deep convolutional neural network (DCNN) named the Convolutional Measurement Compressed Sensing network (CMCS-net) is proposed for image compressed sensing, considering both convolutional measurement (CM) and a sparse prior. Unlike existing works, the convolution operation is adopted in both the measurement phase and the reconstruction phase, which better retains the structural information of images. Simultaneously, the size of the measurement matrix is no longer limited by data dimensions. In particular, by unfolding the CM process into a Toeplitz-type matrix, theoretical support for the convolutional compressed measurement is provided. In addition, in the recovery phase, the authors exploit the sparse prior of natural images by embedding the truncated hierarchical projection algorithm into their architecture to solve the problem of multilayered convolutional sparse coding. Extensive experiments demonstrate that the proposed CMCS-net can reconstruct images well and fully remove the block artefact.
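The theoretical device mentioned above, unfolding a convolutional measurement into a Toeplitz-type matrix, can be illustrated in one dimension. This is a hedged sketch only: the paper works with 2-D convolutions, and the kernel here is arbitrary.

```python
import numpy as np

def conv_as_toeplitz(kernel, n):
    # Valid 1-D sliding correlation with a length-k kernel, written as
    # multiplication by an (n-k+1) x n Toeplitz-type matrix: each row is
    # the kernel shifted one position to the right.
    k = len(kernel)
    T = np.zeros((n - k + 1, n))
    for i in range(n - k + 1):
        T[i, i:i + k] = kernel
    return T
```

With this construction, `conv_as_toeplitz(h, len(x)) @ x` matches `np.correlate(x, h, mode='valid')`, which is why results about structured measurement matrices transfer to the convolutional measurement setting.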
- Author(s): Rahul Singh ; Aditya Goel ; D.K. Raghuvanshi
- Source: IET Image Processing, Volume 14, Issue 15, p. 3851 –3858
- DOI: 10.1049/iet-ipr.2020.0908
- Type: Article
This work aims at developing an automated ensemble-based glioma grade classification framework that classifies glioma into low-grade glioma (LGG) and high-grade glioma (HGG). Discriminant features are extracted using the Gabor filter bank and concatenated in vectorised form. The feature set is then divided into k subsets of features. An ensemble of base classifiers known as rotation forest is employed for classification. Independent component analysis (ICA) is applied to every feature subset and independent features are extracted. Each classifier in the ensemble is trained with these independent features from all the feature subsets. These k feature subsets are responsible for different rotations during the training phase, which results in classifier diversity in the ensemble. The synthetic minority over-sampling technique, a data-augmentation method, is applied to oversample minority-class samples and alleviate the class-imbalance problem. Extensive experiments are conducted on the benchmark BraTS 2017 data set, and comparative analysis reveals that the proposed framework outperforms competitive techniques in terms of various performance metrics. The proposed classification framework achieves an accuracy of 98.63%, a dice similarity coefficient of 0.98 and a sensitivity of 0.96. The authors conduct different comparative experiments with state-of-the-art ensemble-based, deep learning-based and traditional machine learning-based classification approaches to validate the performance of the proposed framework.
- Author(s): Sara Daas ; Amira Yahi ; Toufik Bakir ; Mouna Sedhane ; Mohamed Boughazi ; El-Bay Bourennane
- Source: IET Image Processing, Volume 14, Issue 15, p. 3859 –3868
- DOI: 10.1049/iet-ipr.2020.0491
- Type: Article
Recognition systems using multimodal biometrics attract attention because they offer improved recognition efficiency and a higher security level than unimodal biometric systems. In this study, the authors present a secure multimodal biometric recognition system based on deep learning using convolutional neural networks (CNNs). The authors propose two multimodal architectures using finger knuckle print (FKP) and finger vein (FV) biometrics with different levels of fusion: feature-level fusion and score-level fusion. Feature extraction for FKP and FV is performed using transfer-learning CNN architectures: AlexNet, VGG16, and ResNet50. The key step is to select separate feature descriptors from each unimodal biometric modality. The authors then combine them using the proposed fusion approaches, with a support vector machine or Softmax applied as the classifier, to increase the security of the proposed system. The efficiency of the proposed algorithms is tested on publicly available biometric databases. The experimental results show that the proposed fusion architectures achieve an accuracy of 99.89% and an equal error rate of 0.05%. The obtained results indicate that the biometric recognition system using deep learning is secure, robust, and reliable.
- Author(s): Rini Smita Thakur ; Ram Narayan Yadav ; Lalita Gupta
- Source: IET Image Processing, Volume 14, Issue 15, p. 3869 –3879
- DOI: 10.1049/iet-ipr.2020.0717
- Type: Article
Convolutional neural networks (CNNs) based on the discriminative learning model have been widely used for image denoising. In this study, a feed-forward denoising CNN (DnCNN) with a parametric rectified linear unit (PReLU) is used to improve denoising performance. PReLU enhances the model fitting of the DnCNN network without affecting computational cost. The network learns the leaky parameter applied to negative inputs in the activation function and therefore finds a proper slope in the negative direction. The proposed denoising network is based on residual learning, comprising repeated convolutional and PReLU units along with batch normalisation. Residual learning with batch normalisation accelerates network training and enables blind Gaussian denoising. In this network, feature maps are processed by principal component analysis and transferred to subsequent convolution layers. An adaptive bilateral filter further processes the output image of the proposed CNN for image smoothing and sharpening; the mean and variance of the adaptive filter's Gaussian kernel vary from pixel to pixel. The performance of this network is analysed on the BSD-68 and Set-12 datasets, and it exhibits improvements in peak signal-to-noise ratio, structural similarity index metric, and visual quality over other state-of-the-art methods.
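The PReLU activation at the heart of the modification is simple to state: identity for positive inputs and a learned slope for negative ones. A sketch follows; in the network, the slope `a` is a trainable parameter rather than the fixed constant used here.

```python
import numpy as np

def prelu(x, a=0.25):
    # PReLU(x) = x for x > 0, a * x otherwise; a is learned during training,
    # unlike the fixed slope of leaky ReLU
    return np.where(x > 0, x, a * x)
```

Because `a` can settle anywhere the data demands, the network is free to find "a proper slope in the negative direction", which is the fitting advantage the abstract claims over a plain ReLU.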
- Author(s): Ching-Ta Lu ; Ruei-Han Chen ; Ling-Ling Wang ; Jia-An Lin
- Source: IET Image Processing, Volume 14, Issue 15, p. 3880 –3889
- DOI: 10.1049/iet-ipr.2020.0560
- Type: Article
An image may be disturbed by impulse noise during transmission or acquisition. Effectively restoring the disturbed image is important for image processing applications. This study aims at enhancing disturbed images by using a convolutional neural network (CNN) to identify similar patterns for the restoration of noisy pixels. In the training phase, each noisy pixel is analysed and compared with the noise-free image to find the closest neighbouring pixels. The pixels in a local window form a micro-pattern. All the captured micro-patterns whose centre pixel is noisy become a dataset for training a position CNN. For each micro-pattern, the neighbouring pixel of the noisy image closest to the centre pixel of the noise-free image at the same position is selected as the target. In the enhancement phase, a noisy micro-pattern, whose centre pixel is noisy, is input into the trained position CNN. The top N pixels are recognised and averaged to replace the grey level of the centre pixel, yielding an enhanced pixel. The experimental results show that the position CNN can reliably recognise similar neighbouring pixels and effectively enhance the noisy pixels in an image disturbed by salt-and-pepper noise.
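The replacement step described above, averaging the top-N recognised neighbours of a corrupted centre pixel, can be sketched without the CNN. In this stand-in, the ranking that the position CNN learns is replaced by closeness to the window median, and salt-and-pepper pixels are assumed to take the extreme values 0 or 255; both are illustrative assumptions, not the authors' method.

```python
import numpy as np

def restore_center(window, n=3):
    # The centre pixel of the window is assumed salt-and-pepper corrupted.
    # Keep only noise-free neighbours (not 0 or 255), rank them by
    # closeness to the window median (a crude proxy for the CNN's
    # learned ranking), and average the top n.
    flat = window.flatten()
    neigh = np.delete(flat, len(flat) // 2)        # drop the centre pixel
    clean = neigh[(neigh != 0) & (neigh != 255)]   # discard impulse values
    order = np.argsort(np.abs(clean - np.median(clean)), kind='stable')
    return clean[order[:n]].mean()
```

Averaging several well-chosen neighbours, rather than taking a single median, is what lets the scheme preserve local texture around the restored pixel.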
- Author(s): Ricardo Batista das Neves Junior ; Estanislau Lima ; Byron L.D. Bezerra ; Cleber Zanchettin ; Alejandro H. Toselli
- Source: IET Image Processing, Volume 14, Issue 15, p. 3890 –3898
- DOI: 10.1049/iet-ipr.2020.0532
- Type: Article
The offer of online, automated, and impersonal services demands that users upload scanned copies of their documents to organisations. As a consequence of this decentralisation, the documents present more challenges to the already complex process of image processing and information extraction. To address this problem, the authors present an optimised fully convolutional neural network model for document segmentation that works on mobile devices to detect the region of the document in the captured image. They performed experiments on three representative datasets, comparing the proposed method with the Geodesic Object Proposals, U-net, Mask R-CNN, and OctHU-PageScan algorithms. They also compared the proposed model with all competitors of the ICDAR2015 Competition on smartphone document capture. Furthermore, they performed a qualitative and comparative analysis with the CamScanner software, a popular app for Android and iOS smartphones used by more than 100 million users in over 200 countries. The proposed approach achieved significant performance compared with the current state-of-the-art methods, providing a powerful approach for document segmentation in photos and scanned images.
- Author(s): Jungang Yang ; Chao Xiao ; Yingqian Wang ; Chengjin An ; Wei An
- Source: IET Image Processing, Volume 14, Issue 15, p. 3899 –3908
- DOI: 10.1049/iet-ipr.2019.0081
- Type: Article
Camera array image refocusing changes the in-focus region so that objects lying on a specified plane are in focus, whereas objects lying off this plane are blurred. Existing refocusing methods for camera array or light field images usually involve two interpolations. Since interpolation brings distortion, especially at sharp edges in the images, existing methods are not sufficiently precise. To improve the quality of the refocusing result, the authors propose a high-precision method to refocus camera array images. They first back-project the pixel coordinates to the corresponding locations on the focal plane in world coordinates. They then reproject the world coordinates to pixel coordinates using the parameters of each camera in the array. After that, they align the images with the focal plane by interpolating according to the acquired pixel coordinates. Finally, they obtain the synthetic image refocused on the focal plane by averaging the resulting images. The proposed method uses only one interpolation, which alleviates the quality degradation of the refocused image compared to existing methods. Experiments on real-world scenes (captured by their self-developed light field devices) demonstrate that the method yields better results than existing methods.
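For context, the classical baseline this paper refines is shift-and-average refocusing: translate each view in proportion to its baseline offset and a focus slope, then average. The sketch below uses crude integer shifts with wrap-around; the authors' contribution is precisely to replace this kind of naive resampling with a single, projectively correct interpolation per view.

```python
import numpy as np

def refocus(views, offsets, slope):
    # Shift-and-average refocusing: objects whose disparity matches
    # `slope` align across views and come into focus; everything else
    # is averaged out of alignment and blurs.
    acc = np.zeros_like(views[0], dtype=float)
    for img, (dx, dy) in zip(views, offsets):
        sx, sy = int(round(slope * dx)), int(round(slope * dy))
        acc += np.roll(img, shift=(sy, sx), axis=(0, 1))
    return acc / len(views)
```

Sweeping `slope` moves the synthetic focal plane through the scene, which is the "change the in-focus region" behaviour the abstract opens with.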
- Author(s): Radhesyam Vaddi and Prabukumar Manoharan
- Source: IET Image Processing, Volume 14, Issue 15, p. 3909 –3919
- DOI: 10.1049/iet-ipr.2020.0728
- Type: Article
A hyperspectral image (HSI) consists of hundreds of contiguous spectral bands, which can be used to classify different objects on the earth. Including both spectral and spatial features is essential for achieving high classification accuracy. However, incorporating spectral and spatial information without preserving the intrinsic structure of the data degrades classification accuracy. To address this issue, the proposed method uses unsupervised spectral band selection based on three major constraints: (i) low reconstruction error with neighbourhood bands, (ii) low noise, and (iii) high information entropy. In addition, a structure-preserving recursive filter is used to extract spatial features. Finally, classification is performed using convolutional neural networks (CNNs) with different sets of convolutional, pooling, and fully connected layers. To test the performance of the proposed method, experiments have been carried out on three benchmark HSI datasets: Indian Pines, University of Pavia, and Salinas. These experiments reveal that the proposed method offers better classification accuracy than state-of-the-art methods in terms of standard metrics such as overall accuracy (OA), average accuracy, and kappa coefficient (K). The proposed method attains OAs of 99.9, 98.9, and 99.93% for the three datasets, respectively.
- Author(s): Marziye Rahmati ; Mansoor Fateh ; Mohsen Rezvani ; Alireza Tajary ; Vahid Abolghasemi
- Source: IET Image Processing, Volume 14, Issue 15, p. 3920 –3931
- DOI: 10.1049/iet-ipr.2019.0728
- Type: Article
Optical character recognition (OCR) has been widely used due to high demand across different technologies. Currently, most existing OCR systems focus on Latin languages. In recent studies, OCR systems for non-Latin texts involving cursive styles have also been introduced, despite the challenges they pose. In this study, the authors propose an OCR system based on long short-term memory neural networks for the Persian language. The authors also investigate the effects of variations in the parameters involved in this approach. The proposed OCR system solves the false recognition of the sub-words ‘LA’ and ‘LA’. Moreover, the authors present a preprocessing algorithm to remove ‘justification’ using image processing. A new comprehensive collated data set is introduced, comprising five million images in eight popular Persian fonts and ten font sizes. The evaluations show that the accuracy of the proposed OCR is increased by 2% compared to the existing Persian OCR system. The experimental results indicate that the proposed system has an average accuracy of 99.69% at the letter level, an accuracy of 98.1% for ‘zero-width non-breaking space’, and 98.64% for ‘LA’ at the word level.
- Author(s): Jiaquan Shen ; Ningzhong Liu ; Han Sun
- Source: IET Image Processing, Volume 14, Issue 15, p. 3932 –3940
- DOI: 10.1049/iet-ipr.2020.0841
- Type: Article
With the rapid development of the electronics industry, defect detection for printed circuit board (PCB) components is becoming increasingly important. The types of PCB components are diverse and accompanied by complex character information, which is difficult to identify. Traditional detection methods are inefficient and unable to effectively perform diversified category detection of PCB components and character recognition in complex scenes. Deep convolutional neural networks have clear advantages in object detection and character recognition and can be used to implement a PCB component defect detection system. In this study, the authors establish a lightweight PCB type detection model called LD-PCB, which performs real-time detection while improving detection accuracy. In addition, for character detection on PCBs, they establish a fast and robust character recognition model called CR-PCB, which effectively improves the accuracy of irregular character recognition. Finally, they establish and publish a dataset of PCB components, and combine LD-PCB and CR-PCB to realise a PCB defect detection system. This system supports defect detection, wrong-insertion and missing-insertion detection, and character recognition in industrial PCB production. The results show that the proposed method can effectively detect defects in PCB components.
- Author(s): Vivekraj V K ; Debashis Sen ; Balasubramanian Raman
- Source: IET Image Processing, Volume 14, Issue 15, p. 3941 –3956
- DOI: 10.1049/iet-ipr.2020.0234
- Type: Article
Dynamic video summarisation (video skimming) is a process of generating a shorter video (video skim) as a summary of a given video, which helps in its easier and quicker comprehension. In this study, an efficient dynamic summarisation approach for user videos is proposed using vector ordering for ranking video units (frames/shots). User videos are casually shot unscripted videos, where skimming involves the selection of its interesting part(s) ignoring many uninteresting ones. The concept of R-ordering of vectors is employed to find a representative frame, which is used to perform relative ranking of the video frames. It is theoretically shown that significance is given to each element of a frame's feature vector while computing the importance scores that lead to the frame ranks used for skimming. Furthermore, the allocation of different weights to the features involved is also achieved using linear and Gaussian process regressions. Through extensive experiments considering several standard datasets with human-labelled ground truth, the proposed approach is demonstrated to be efficient and to perform better than the relevant state-of-the-art.
- Author(s): Mohammed Alghaili ; Zhiyong Li ; Hamdi A.R. Ali
- Source: IET Image Processing, Volume 14, Issue 15, p. 3957 –3964
- DOI: 10.1049/iet-ipr.2020.0199
- Type: Article
Attention to gender classification has been increasing recently, as gender carries rich information related to male and female social activities. Extracting discriminating visual representations for gender classification is challenging, especially with covered or camouflaged faces. In this work, the authors propose a network that uses a combination of inception modules with a variational feature learning (VFL) loss function. The proposed network recognises the gender of normal or covered/camouflaged faces through the middle part of the face. The network is trained on the middle part of the face containing both eyes, with a small margin from the top-left corner to the bottom-right corner of the eye region. Experimental results show that the proposed network achieves state-of-the-art performance on five public data sets: FEI, SCIEN, AR FACES, LFW, and ADIENCE. The authors also evaluated their network on a newly collected data set of covered and camouflaged faces and obtained encouraging outcomes.
- Author(s): Yanzhu Hu ; Dongdong Zhu ; Xinbo Ai ; Yabo Xu
- Source: IET Image Processing, Volume 14, Issue 15, p. 3965 –3974
- DOI: 10.1049/iet-ipr.2020.0640
- Type: Article
Fully supervised object detection requires training data annotated with both object categories and locations. In contrast, weakly supervised object localisation needs only category annotations for training, yet it can perform the classification task and the object localisation task at the same time. Inspired by the attention-based dropout layer (ADL) method, this study designs a category-wise feature extractor (CFE), which explicitly obtains a localisation map indicating the object location that is directly related to the category output of the classification task. Although its computational cost is slightly higher than that of the ADL method, its performance is better than ADL in some tasks. In addition to the standard CFE method, this study also designs a lightweight CFE-tiny method, which adopts a split-attention mechanism and requires much less computation than the ADL method.
- Author(s): Ning Xiao ; Yan Qiang ; Zijuan Zhao ; Juanjuan Zhao ; Jianhong Lian
- Source: IET Image Processing, Volume 14, Issue 15, p. 3975 –3981
- DOI: 10.1049/iet-ipr.2020.0496
- Type: Article
The prediction of lung tumour growth is key to early treatment of lung cancer. However, the lack of intuitive and clear judgments about the future development of a tumour often leads patients to miss the best treatment opportunities. Combining the characteristics of the variational autoencoder and recurrent neural networks, this study proposes tumour growth prediction via a conditional recurrent variational autoencoder. The proposed model uses a variational autoencoder to reconstruct tumour images at different times, while recurrent units infer the relationship between tumour images in chronological order. Since tumour development varies across patients, the patient's condition is incorporated to achieve personalised prediction. To address the problem of blurred results, the authors add a total variation regularisation term to the objective function. The proposed method was tested on longitudinal studies, the National Lung Screening Trial and a cooperative hospital dataset, with three points on lung tumours. The precision, recall, and dice similarity coefficient reach 82.22, 79.89 and 82.49%, respectively. Both quantitative and qualitative experimental results show that the proposed method can produce realistic tumour images.
- Author(s): Zhaorun Zhou and Zhenghao Shi
- Source: IET Image Processing, Volume 14, Issue 15, p. 3982 –3988
- DOI: 10.1049/iet-ipr.2020.1153
- Type: Article
Image haze removal is highly desired for computer vision applications. This study proposes a novel context-guided generative adversarial network (CGGAN) for single image dehazing, in which a novel encoder–decoder is employed as the generator. The generator consists of a feature-extraction net, a context-extraction net, and a fusion net in sequence. The feature-extraction net acts as an encoder and is used for extracting haze features. The context-extraction net is a multi-scale parallel pyramid decoder, used for extracting the deep features of the encoder and generating a coarse dehazed image. The fusion net is a decoder used for obtaining the final haze-free image. To obtain better dehazing results, multi-scale information obtained during the decoding process of the context-extraction decoder is used to guide the fusion decoder. By introducing an extra coarse decoder to the original encoder–decoder, the CGGAN makes better use of the deep feature information extracted by the encoder. To ensure that the proposed CGGAN works effectively for different haze scenarios, different loss functions are employed for the two decoders. Experimental results show the advantage and effectiveness of the proposed CGGAN, with clear improvements over existing state-of-the-art methods.
- Author(s): Qianqian Du ; Xueting Ren ; Jiawen Wang ; Yan Qiang ; Xiaotang Yang ; Ntikurako Guy-Fernand Kazihise
- Source: IET Image Processing, Volume 14, Issue 15, p. 3989 –3999
- DOI: 10.1049/iet-ipr.2020.1056
- Type: Article
This study proposes a GAN-based reconstruction method, the cascaded data consistency generative adversarial network (CDCGAN), to recover high-quality PET images from filtered back projection PET images with streaking artefacts and high noise. First, the authors embed a data consistency layer (DC layer) in their generator network to constrain the reconstruction process and accurately adjust the generated PET images. Second, to improve reconstruction accuracy, the generator network is built iteratively to achieve better performance with simple structures. The proposed CDCGAN preserves fine anomalous features while eliminating the streaking artefacts and noise. Experimental results show that the PET images reconstructed by their method perform comparably to other state-of-the-art methods but at a faster speed. A clinical experiment was also performed to show the validity of the CDCGAN for artefact reduction.
- Author(s): Ram Krishna Pandey and Ramakrishnan Angarai Ganesan
- Source: IET Image Processing, Volume 14, Issue 15, p. 4000 –4011
- DOI: 10.1049/iet-ipr.2019.1244
- Type: Article
The authors propose architectures that learn end-to-end mapping functions to improve the spatial resolution of input natural images. The models are unique in forming non-linear combinations of three image interpolation techniques using a convolutional neural network. Another proposed architecture uses a skip connection with nearest-neighbour interpolation, achieving almost similar results. The architectures have been carefully designed to ensure that the reconstructed images lie precisely in the manifold of high-resolution images, thereby preserving the high-frequency components with fine details. The authors compared their methods with state-of-the-art and recent deep learning-based natural image super-resolution techniques and found that their methods preserve the sharp details in the image while obtaining comparable or better peak signal-to-noise ratio values. Since their methods use image interpolations and a shallow convolutional neural network (CNN) with fewer, smaller filters, the computational cost is kept low. They report the results of the best two proposed architectures on five standard data sets for an upscale factor of 2. Their methods generalise well in most cases, which is evident from the better results obtained with increasingly complex data sets. For four-times upscaling, they designed similar architectures for comparison with other methods.
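The nearest-neighbour skip connection mentioned in this abstract adds an interpolation path around the CNN, so the network only has to learn the residual detail. The interpolation itself is one line in NumPy; this sketches the skip path only, not the authors' full network.

```python
import numpy as np

def nearest_upscale(img, factor=2):
    # Nearest-neighbour interpolation: each pixel becomes a factor x factor
    # block. In the skip-connection architecture, this coarse upscaled
    # image is added to the CNN output, so the CNN learns only the
    # high-frequency residual.
    return np.kron(img, np.ones((factor, factor)))
```

Letting the cheap interpolation carry the low frequencies is what allows a shallow CNN with fewer, smaller filters to remain competitive on peak signal-to-noise ratio.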
Segmentation of brain magnetic resonance images using a novel fuzzy clustering based method
Retinal image blood vessel extraction and quantification with Euclidean distance transform approach
Local contrast measure with iterative error for infrared small target detection
Water level monitoring using classification techniques on Landsat-8 data at Sangam region, Prayagraj, India
Genetic-based feature fusion in face recognition using arithmetic coded local binary patterns
Multi-camera 3D ball tracking framework for sports video
Rotational invariant biologically inspired object recognition
EKFPnP: extended Kalman filter for camera pose estimation in a sequence of images
Target object detection using chicken social-based deep belief network with hyperspectral imagery
Deep convolutional neural network for mixed random impulse and Gaussian noise reduction in digital images
Wise optimisation: deep image embedding by informative pair weighting and ranked list learning
Automatic production of synthetic labelled OCT images using an active shape model
Two-stage visible watermark removal architecture based on deep learning
Fast and robust image watermarking method in the spatial domain
CMCS-net: image compressed sensing with convolutional measurement via DCNN
Ensemble-based glioma grade classification using Gabor filter bank and rotation forest
Multimodal biometric recognition systems using deep learning based on the finger vein and finger knuckle print fusion
PReLU and edge-aware filter-based image denoiser using convolutional neural network
Image enhancement using convolutional neural network to identify similar patterns
HU-PageScan: a fully convolutional neural network for document page crop
High-precision refocusing method with one interpolation for camera array images
Hyperspectral remote sensing image classification using combinatorial optimisation based un-supervised band selection and CNN
Printed Persian OCR system using deep learning
Defect detection of printed circuit board based on lightweight deep convolution network
Vector ordering and regression learning-based ranking for dynamic summarisation of user videos
Deep feature learning for gender classification with covered/camouflaged faces
Category-wise feature extractor based on ADL method for weak-supervised object localisation
Tumour growth prediction of follow-up lung cancer via conditional recurrent variational autoencoder
CGGAN: a context-guided generative adversarial network for single image dehazing
Iterative PET image reconstruction using cascaded data consistency generative adversarial network
DeepInterpolation: fusion of multiple interpolations and CNN to obtain super-resolution
Most cited content for this Journal
- Medical image segmentation using deep learning: A survey
- Author(s): Risheng Wang ; Tao Lei ; Ruixia Cui ; Bingtao Zhang ; Hongying Meng ; Asoke K. Nandi
- Type: Article
- Block-based discrete wavelet transform-singular value decomposition image watermarking scheme using human visual system characteristics
- Author(s): Nasrin M. Makbol ; Bee Ee Khoo ; Taha H. Rassem
- Type: Article
- Classification of malignant melanoma and benign skin lesions: implementation of automatic ABCD rule
- Author(s): Reda Kasmi and Karim Mokrani
- Type: Article
- Digital image watermarking method based on DCT and fractal encoding
- Author(s): Shuai Liu ; Zheng Pan ; Houbing Song
- Type: Article
- Tomato leaf disease classification by exploiting transfer learning and feature concatenation
- Author(s): Mehdhar S. A. M. Al‐gaashani ; Fengjun Shang ; Mohammed S. A. Muthanna ; Mashael Khayyat ; Ahmed A. Abd El‐Latif
- Type: Article