IET Image Processing
Volume 14, Issue 1, 10 January 2020
Vehicle detection in intelligent transport system under a hazy environment: a survey
- Author(s): Agha Asim Husain ; Tanmoy Maity ; Ravindra Kumar Yadav
- Source: IET Image Processing, Volume 14, Issue 1, p. 1–10
- DOI: 10.1049/iet-ipr.2018.5351
- Type: Article
Developing an intelligent transport system (ITS) has attracted much attention in the recent past. With the growing number of vehicles on the road, most nations are adopting an ITS to handle issues such as traffic flow density, queue length, average traffic speed, and the total number of vehicles passing through a point in a specific time interval. By capturing traffic images and videos through cameras, an ITS helps traffic control centres monitor and manage the traffic. Efficient and unfailing vehicle detection is a crucial step for the ITS. This study reviews different techniques and applications used around the world for vehicle detection under various environmental conditions based on video processing systems. It also discusses the types of cameras used for vehicle detection and the classification of vehicles for traffic monitoring and control. Finally, it highlights the problems encountered during surveillance under extreme weather conditions.
Image-independent optimal non-negative integer bit allocation technique for the DCT-based image transform coders
- Author(s): Vikrant Singh Thakur ; Kavita Thakur ; Shubhrata Gupta ; Kamisetty R. Rao
- Source: IET Image Processing, Volume 14, Issue 1, p. 11–24
- DOI: 10.1049/iet-ipr.2018.6302
- Type: Article
The optimum non-negative integer bit allocation (ONIBA) is an important technique that provides optimal quantisation of transform coefficients for image transform coders (ITCs). However, existing ONIBA algorithms are still not popular for discrete cosine transform (DCT)-based ITCs because of their image-dependent nature and additional side-information requirements. Therefore, this study presents a novel image-independent ONIBA (IIONIBA) technique to achieve efficient quantisation for DCT-ITCs. Initially, an image-dependent ONIBA algorithm is proposed, which is then mapped into the desired image-independent solution via a prepared combined image and a proposed modified step-size mapping technique. Thereafter, a new lookup table for the elements of the quantisation tables obtained from the proposed IIONIBA technique is established using non-linear regression analysis, reducing the need for additional side information. Several experiments evaluate the performance of the proposed IIONIBA technique based on the visual quality of reconstructed images and on the image quality indexes peak signal-to-noise ratio (PSNR) and mean structural similarity index (MSSIM). The results show that the proposed IIONIBA technique delivers better quantisation and provides significant gains in the image quality indexes compared with recent quantisation techniques.
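The image-dependent baseline that an ONIBA-style scheme starts from can be illustrated with the classical greedy integer bit-allocation heuristic; this is a minimal pure-Python sketch of that textbook procedure under the standard high-resolution distortion model v·2^(−2b), not the authors' IIONIBA algorithm:

```python
def greedy_bit_allocation(variances, total_bits):
    """Classical greedy integer bit allocation: repeatedly give one bit to
    the coefficient whose quantisation error would drop the most.

    For a coefficient with variance v and b bits, the high-resolution
    distortion model is v * 2**(-2*b); the benefit of one extra bit is
    proportional to that same quantity, so ranking by v * 2**(-2*b)
    suffices.
    """
    bits = [0] * len(variances)
    for _ in range(total_bits):
        # current per-coefficient distortion (proportional to bit benefit)
        gains = [v * 2.0 ** (-2 * b) for v, b in zip(variances, bits)]
        i = max(range(len(gains)), key=gains.__getitem__)
        bits[i] += 1
    return bits
```

With coefficient variances 16, 4 and 1 and a budget of 6 bits, the routine assigns 3, 2 and 1 bits respectively — more bits to the higher-variance (lower-frequency) DCT coefficients, which is the behaviour a quantisation table encodes.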
Binary tomography on the isometric tessellation involving pixel shape orientation
- Author(s): Benedek Nagy and Tibor Lukić
- Source: IET Image Processing, Volume 14, Issue 1, p. 25–30
- DOI: 10.1049/iet-ipr.2019.0099
- Type: Article
In this study, a tomography reconstruction problem for binary images is considered on the isometric grid. On this grid, the triangular pixels have two orientations; accordingly, the authors call them delta or nabla pixels. The proposed reconstruction method uses projection data from three natural directions, the lane directions of the triangular tessellation (somewhat analogous to the row/column directions on rectangular grids). The projection ray penetrating a grid lane does not pass through the middle of the pixels (i.e. through the middle line of the triangular pixels), as is usually assumed, but is shifted slightly from the middle, parallel to the lane. This provides exact information about the number of nabla and delta pixels in each lane of the image, and this additional information is included in the reconstruction process to improve its quality. The authors formulate the suggested model as an energy-minimisation problem and apply a gradient-based approach for its minimisation. They show and analyse various experimental results on test images. The presented approach yields both better-quality reconstructions and shorter running times than earlier approaches.
Image sorting via a reduction in travelling salesman problem
- Author(s): Smaragda Markaki ; Costas Panagiotakis ; Dimitra Lasthiotaki
- Source: IET Image Processing, Volume 14, Issue 1, p. 31–39
- DOI: 10.1049/iet-ipr.2018.5880
- Type: Article
The authors define and approximately solve the problem of unsupervised image sorting, considered as a kind of content-based image clustering. Content-based image sorting creates a route that passes through all the images exactly once, in such an order that each image has content similar to the previous one. In the end, an image ordering (e.g. a slideshow) is automatically produced in which images with similar content are close to each other. This problem resembles the 'travelling salesman problem' (TSP) known in the literature. In this work, the authors propose two classes of methods (nearest-neighbour and genetic methods) that have also been applied to the TSP. Their benefits in computational efficiency and accuracy are discussed over six datasets created from the GHIM-10K dataset. The experimental results demonstrate that the proposed methods efficiently solve the image sorting problem, producing image sequences that almost agree with human intuition.
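The nearest-neighbour class of methods can be sketched directly; this toy example, with hypothetical 2-bin colour histograms standing in for image features, shows the greedy TSP-style route construction:

```python
def nearest_neighbour_sort(features, start=0):
    """Greedy nearest-neighbour heuristic: starting from one image, always
    visit the unvisited image whose feature vector is closest, yielding a
    route in which consecutive images have similar content."""
    def dist(a, b):
        # squared Euclidean distance between feature vectors
        return sum((x - y) ** 2 for x, y in zip(a, b))

    route = [start]
    remaining = set(range(len(features))) - {start}
    while remaining:
        last = route[-1]
        nxt = min(remaining, key=lambda j: dist(features[last], features[j]))
        route.append(nxt)
        remaining.remove(nxt)
    return route

# toy "images" described by hypothetical 2-bin colour histograms
hists = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
```

Starting from image 0, the route visits 0 → 2 → 3 → 1, keeping the two reddish histograms adjacent and the two bluish ones adjacent — exactly the slideshow-like ordering described above.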
Colour image encryption scheme based on enhanced quadratic chaotic map
- Author(s): Djamel Herbadji ; Aissa Belmeguenai ; Nadir Derouiche ; Hongjung Liu
- Source: IET Image Processing, Volume 14, Issue 1, p. 40–52
- DOI: 10.1049/iet-ipr.2019.0123
- Type: Article
In this study, an enhanced quadratic map (EQM) is proposed and applied in a new colour image encryption scheme. Performance evaluations show that the EQM has a better Lyapunov exponent and larger chaotic ranges than the classical quadratic map. The sequences generated from the EQM are used in a new colour image encryption scheme with excellent confusion and diffusion properties. The encryption structure is based on the permutation–diffusion process; compared with the classical permutation, it is characterised by a high speed of diffusion, which enables the three components of the plaintext image to be encrypted at the same time, with the encrypted components simultaneously related to each other. The proposed scheme is tested on the USC-SIPI image dataset and on a real-life image dataset, and its effectiveness is compared with five recently proposed image encryption schemes. The simulation results indicate that the proposed scheme has a large key space, weaker correlation between neighbouring pixels, higher key sensitivity, greater randomness of pixels and the capacity to withstand statistical analysis, plaintext/chosen-plaintext attacks and differential attacks, so it offers higher security and is appropriate for image encryption.
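For illustration, here is a minimal sketch of the permutation stage of a generic permutation–diffusion cipher driven by the classical quadratic map x_{k+1} = r − x_k² (the baseline that the EQM improves on); the parameter values are hypothetical and this is not the proposed EQM scheme:

```python
def quadratic_map_sequence(x0, r, n, burn_in=100):
    """Iterate the classical quadratic map x_{k+1} = r - x_k**2.
    For r near 2 the orbit is chaotic; a burn-in discards transients so
    the keystream depends sensitively on the key (x0, r)."""
    x = x0
    for _ in range(burn_in):
        x = r - x * x
    seq = []
    for _ in range(n):
        x = r - x * x
        seq.append(x)
    return seq

def permutation_from_chaos(seq):
    """Permutation stage of a permutation-diffusion cipher: the ranking
    of the chaotic samples defines a pixel-shuffling permutation."""
    return sorted(range(len(seq)), key=seq.__getitem__)
```

The same key always yields the same permutation (so the receiver can invert the shuffle), while the chaotic sensitivity to (x0, r) is what makes the shuffle hard to predict without the key.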
Real-time least-squares ensemble visual tracking
- Author(s): Ridong Zhu ; Xiaoyuan Yang ; Jingkai Wang ; Zhengze Li
- Source: IET Image Processing, Volume 14, Issue 1, p. 53–61
- DOI: 10.1049/iet-ipr.2018.6037
- Type: Article
In this study, the authors present a novel ensemble tracking system by formulating the tracking task as a linear regression, i.e. a least-squares problem. A set of weak classifiers is trained using least squares, solved efficiently using the Moore–Penrose inverse. These weak classifiers are then combined into a strong classifier using bagging. The strong classifier is used to recognise the target and locate its position, which is obtained efficiently in the Fourier domain. To obtain a good ensemble, a novel sampling strategy is proposed to train accurate and diverse weak classifiers. By exploiting historical targets to monitor the training process, the method handles pose change and occlusion well. The proposed method is extensively evaluated using a variety of evaluation protocols on the recent standard datasets OTB50, OTB100 and VOT2016. Experimental results show that the proposed methodology performs favourably against state-of-the-art methods in terms of efficiency, accuracy and robustness.
Automatic optical-to-SAR image registration using a structural descriptor
- Author(s): Sourabh Paul and Umesh C. Pati
- Source: IET Image Processing, Volume 14, Issue 1, p. 62–73
- DOI: 10.1049/iet-ipr.2019.0389
- Type: Article
Optical-to-synthetic aperture radar (SAR) image registration is a challenging task in remote sensing, as the images have significant non-linear intensity variations as well as large geometric differences. Moreover, the influence of speckle noise in the SAR image further affects the registration result. Structural descriptors are very effective in handling the non-linear intensity variations between optical and SAR images. Although a number of optical-to-SAR image registration methods based on the structural information of the image have been proposed in the past few years, most of them are ineffective for images with large geometric differences. To address these problems, a novel optical-to-SAR image registration algorithm using a new structural descriptor is proposed. Initially, corner features are extracted from the optical and SAR images. Then, the proposed structural descriptor is constructed for the extracted features. Finally, feature matching is performed between the optical and SAR images and the correct matches are identified. The proposed method is very effective in registering optical and SAR images with significant intensity variations and large geometric differences, and it increases the number of correct matches between the images. Experiments on six sets of optical and SAR image pairs demonstrate the effectiveness of the proposed method.
Automatic rectification of warped Bangla document images
- Author(s): Arpan Garai ; Samit Biswas ; Sekhar Mandal ; Bidyut B. Chaudhuri
- Source: IET Image Processing, Volume 14, Issue 1, p. 74–83
- DOI: 10.1049/iet-ipr.2019.0831
- Type: Article
In this study, a robust algorithm for dewarping camera-captured document images, mainly in Bangla script, is proposed. The algorithm can handle warped document images generated by different types of document surfaces (convex, concave or multi-folded) and is independent of font type, font size, font style and camera view angle. After initial preprocessing, the method first demarcates the text lines present in the document image. Then, the headline (shirorekha) position of each text line is estimated, and each text line is dewarped based on the headline position and shape. If the document is highly warped, distorted text (e.g. thinner and shorter characters) is generated after dewarping; special care is taken to minimise this distortion based on the least-distorted character information. Exhaustive testing shows the robustness and shape improvement of the proposed algorithm. Finally, some new measures are defined for shape-quality evaluation.
Gradient-based kernel selection technique for tumour detection and extraction of medical images using graph cut
- Author(s): Jyotsna Dogra ; Shruti Jain ; Meenakshi Sood
- Source: IET Image Processing, Volume 14, Issue 1, p. 84–93
- DOI: 10.1049/iet-ipr.2018.6615
- Type: Article
Magnetic resonance imaging is a powerful, ubiquitous imaging technique that provides detailed, high-contrast images differentiating soft tissues. A low radio-frequency bias field creates intensity inhomogeneity and low contrast, which often hampers quantitative and qualitative analyses. Segmentation aids the analysis of changes occurring in the brain, where the bias effect severely degrades performance. Graph-cut (GC) segmentation provides supervised computer-assisted diagnosis and treatment, but its interactive nature requires manual selection of kernels for initialisation, and its shrinkage behaviour causes inaccurate and fallacious extraction. To address these problems, this study proposes a gradient-based kernel selection GC method that removes the shrinkage problem and locates the tumour in the image, eliminating human interaction and providing accurate segmentation even for bias-field images. The method emphasises the directive inclination of the intensity scales of the symmetrical halves of the image. It is evaluated on high-grade and low-grade glioma images with and without bias field; the average performance metrics for these images show remarkable improvement over existing techniques. The proposed technique is further validated on a real dataset of tumour images obtained from the State Government Hospital, Shimla, India.
Regularised IHS-based pan-sharpening approach using spectral consistency constraint and total variation
- Author(s): Mohammad Khateri ; Fahim Shabanzade ; Fardin Mirzapour
- Source: IET Image Processing, Volume 14, Issue 1, p. 94–104
- DOI: 10.1049/iet-ipr.2019.0283
- Type: Article
In this study, the authors address the fusion of a low-resolution multi-spectral image with the corresponding high-resolution panchromatic image to produce a high-resolution multi-spectral (HRM) image, i.e. pan-sharpening. Intensity–hue–saturation (IHS)-based pan-sharpening methods are popular because they are simple, efficient and of high spatial quality; however, they inevitably introduce spectral distortion. To reduce this distortion, the proposed method uses a spectral consistency constraint. Moreover, to stabilise the fusion results of the ill-posed pan-sharpening problem and to preserve the smoothness of the HRM image, a total variation regularisation term is added. These considerations are formulated as a non-quadratic optimisation problem, which is solved with a variable-splitting method known as half-quadratic approximation together with an alternating optimisation procedure to reconstruct the HRM image. To gain convenient control over the local spectral and spatial information, and to reduce the required memory, a patch-based strategy is employed in the optimisation stage. The proposed method was tested on two datasets acquired by the GeoEye-1 and Pleiades satellites, and evaluated by visual assessment as well as quantitative comparison with different pan-sharpening methods.
XNORCONV: CNNs accelerator implemented on FPGA using a hybrid CNNs structure and an inter-layer pipeline method
- Author(s): Lin Zhang ; Xiaokang Bu ; Bing Li
- Source: IET Image Processing, Volume 14, Issue 1, p. 105–113
- DOI: 10.1049/iet-ipr.2019.0385
- Type: Article
Convolutional neural networks (CNNs) have become a research hotspot because of their high performance in computer vision and pattern recognition. However, owing to the high energy consumption of traditional graphics processing unit-based CNNs, it is difficult to deploy them in portable devices. To deal with this problem, a hybrid CNN structure (XNORCONV) was proposed and implemented on a field-programmable gate array (FPGA) in this study. Two improvements are applied in XNORCONV. First, the multiplications in the convolutional layer (CONV) are replaced by XNOR operations to save multipliers and reduce computational complexity. Second, an inter-layer pipeline is designed to further accelerate the computation. XNORCONV was implemented on a Xilinx Zynq-7000 xc7z020clg400-1 at a clock frequency of 150 MHz and tested on the MNIST dataset. The experimental results show that XNORCONV achieves 98.4% recognition accuracy on MNIST. Compared with the traditional LeNet-5 on different platforms, XNORCONV reduces multiplications by 85.6% with only 0.4% accuracy loss.
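The XNOR-for-multiplication trick at the heart of XNORCONV can be shown in a few lines: when activations and weights are binarised to {−1, +1} and packed as bits, a dot product reduces to an XNOR followed by a popcount. This is a generic software sketch of that identity, not the FPGA implementation:

```python
def xnor_dot(a_bits, w_bits, n):
    """Binary 'multiply-accumulate' over n lanes: with activations and
    weights constrained to {-1, +1} and packed as bits (1 -> +1, 0 -> -1),
    the dot product becomes a popcount of the XNOR:

        matches = popcount(~(a ^ w) & mask)   # positions that agree
        dot     = 2 * matches - n             # rescale to [-n, n]

    No multiplier is needed, which is what the CONV layer exploits."""
    mask = (1 << n) - 1
    matches = bin(~(a_bits ^ w_bits) & mask).count("1")
    return 2 * matches - n
```

For example, `xnor_dot(0b1011, 0b1110, 4)` encodes (+1, −1, +1, +1)·(+1, +1, +1, −1): two lanes agree and two disagree, so the dot product is 0 — the same result a multiplier array would produce, using only bitwise logic.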
Low-rank tensor completion for visual data recovery via the tensor train rank-1 decomposition
- Author(s): Xiaohua Liu ; Xiao-Yuan Jing ; Guijin Tang ; Fei Wu ; Xiwei Dong
- Source: IET Image Processing, Volume 14, Issue 1, p. 114–124
- DOI: 10.1049/iet-ipr.2018.6594
- Type: Article
In this study, the authors study the problem of tensor completion, in particular for three-dimensional arrays such as visual data. Previous works have shown that the low-rank constraint can produce impressive performance for tensor completion; these works often rely on the Tucker rank, which, however, does not capture the intrinsic correlation of the tensor entries. Therefore, the authors propose a new proximal operator for the approximation of tensor nuclear norms based on the tensor-train rank-1 decomposition via the singular value decomposition; the proximal operator performs a soft-thresholding operation on the tensor singular values. In addition, while the low-rank constraint captures the global structure of the data well, it does not exploit the local smoothness of visual data, so total variation is integrated as a regularisation term into the low-rank tensor completion. Finally, a primal–dual splitting is used for optimisation. Experimental results show that the proposed method preserves the multi-dimensional nature inherent in the data and thus provides superior results over many state-of-the-art tensor completion techniques.
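The soft-thresholding operation that the proximal operator performs on singular values is the standard shrinkage step of nuclear-norm minimisation; a minimal sketch of just that step (in the full operator the values would come from an SVD and the factors would be recombined afterwards):

```python
def soft_threshold_singular_values(sigmas, tau):
    """Shrinkage step of the nuclear-norm proximal operator: each singular
    value is moved toward zero by tau and clipped at zero, so small
    (noise-dominated) components vanish while the dominant low-rank
    structure survives.  The full operator then rebuilds
    U @ diag(shrunk) @ V.T from the shrunk values."""
    return [max(s - tau, 0.0) for s in sigmas]
```

With threshold 0.5, a spectrum (5.0, 1.25, 0.25) becomes (4.5, 0.75, 0.0): the smallest component is discarded entirely, which is how the operator promotes low rank.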
Higher precision range estimation for context-based adaptive binary arithmetic coding
- Author(s): Sio-Kei Im and Ka-Hou Chan
- Source: IET Image Processing, Volume 14, Issue 1, p. 125–131
- DOI: 10.1049/iet-ipr.2018.6602
- Type: Article
Lagrangian rate-distortion optimisation is widely employed in modern video encoders such as high-efficiency video coding (H.265/HEVC). In this work, the authors propose a more accurate context-based adaptive binary arithmetic coding look-up table that enhances compression quality and provides substantially better accuracy of range estimation by employing one more bit, giving 64 probability states. For the hardware implementation, they propose a higher-precision look-up table in place of the HEVC Test Model (HM) standard table, and define a new finite-state machine to handle the probability changes in real time. The BD-RATE gain of the proposed context modelling is up to 6.0% for all-intra mode and 13.0% for inter mode. The finite-state machine introduces no divergence from the H.265/HEVC standards and can be used in current systems.
Automatic detection of acute lymphoblastic leukaemia based on extending the multifractal features
- Author(s): Mohamadreza Abbasi ; Saeed Kermani ; Ardeshir Tajebib ; Morteza Moradi Amin ; Manije Abbasi
- Source: IET Image Processing, Volume 14, Issue 1, p. 132–137
- DOI: 10.1049/iet-ipr.2018.5910
- Type: Article
The main purpose of this study is to introduce a new class of features to improve the efficiency of diagnosing acute lymphoblastic leukaemia from microscopic images. First, the authors segmented nuclei using the k-means and watershed algorithms. They extracted three sets of geometrical, statistical and chaotic features from the nuclei images. Six chaotic features were extracted by calculating the fractal dimension of five sub-images derived from the nuclei images with modified grey levels. The authors classified the images into binary and multiclass types via the support vector machine algorithm, conducted principal component analysis for dimensionality reduction of the feature space, and then evaluated the proposed algorithm for overfitting. The overall results show 99% accuracy, 99% specificity and 97% sensitivity in the classification of six cell groups. The difference between the training and test errors was <3%, which shows that the classification performance was improved by using the multifractal features.
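The fractal-dimension computation underlying such chaotic features is typically done by box counting; this is a generic pure-Python sketch of the box-counting estimator on a 2-D point set, not the authors' exact feature extractor:

```python
import math

def box_counting_dimension(points, sizes):
    """Estimate the fractal (box-counting) dimension of a set of 2-D
    points: count the occupied boxes N(s) at each box size s, then fit
    the slope of log N(s) against log(1/s) by least squares."""
    logs = []
    for s in sizes:
        boxes = {(int(x // s), int(y // s)) for x, y in points}
        logs.append((math.log(1.0 / s), math.log(len(boxes))))
    # closed-form least-squares slope of v against u
    n = len(logs)
    mu = sum(u for u, _ in logs) / n
    mv = sum(v for _, v in logs) / n
    num = sum((u - mu) * (v - mv) for u, v in logs)
    den = sum((u - mu) ** 2 for u, _ in logs)
    return num / den
```

As a sanity check, a densely sampled filled square gives a dimension near 2 and a sampled line segment near 1; a nucleus boundary with fragmented texture falls between these values, which is what makes the estimate a useful discriminative feature.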
Image retrieval based on ASIFT features in a Hadoop clustered system
- Author(s): Yin-Fu Huang and Huan-Yu Wu
- Source: IET Image Processing, Volume 14, Issue 1, p. 138–146
- DOI: 10.1049/iet-ipr.2019.0229
- Type: Article
For image matching, the scale-invariant feature transform (SIFT) algorithm is commonly used. SIFT features are invariant to image rotation and scale zooming, and partially invariant to changes in illumination and 3D camera viewpoint. Affine SIFT (ASIFT) is an extension of SIFT that handles images captured at different angles; however, ASIFT has higher computational complexity than SIFT owing to the huge number of features in the images. Therefore, in this study, a Hadoop-based image retrieval system is proposed to overcome ASIFT's high computational cost using MapReduce. The system uses a combination of the bag-of-words method and a support vector machine. The experimental results verify that the proposed method is more effective than other state-of-the-art methods on a variety of datasets.
Integrated system for automatic detection of representative video frames in wireless capsule endoscopy using adaptive sliding window singular value decomposition
- Author(s): Abbas Biniaz ; Fatemeh Abdolali ; Reza Aghaeizadeh Zoroofi
- Source: IET Image Processing, Volume 14, Issue 1, p. 147–153
- DOI: 10.1049/iet-ipr.2019.0251
- Type: Article
Wireless capsule endoscopy (WCE) is a non-invasive diagnostic method that records a video as the capsule travels through the gastrointestinal (GI) tract. Its practical drawback is the long clinical video produced, whose review by an experienced specialist is tedious. Automated summarisation methods can reduce both the experts' evaluation time and the errors of manual interpretation. The proposed approach consists of three main steps. First, an adaptive sliding-window singular value decomposition is employed to extract representative video frames. Then, adaptive contrast diffusion is utilised to increase the visibility of WCE frames. Finally, a novel knowledge-based method is developed to segment video frames into four topographic zones of the GI tract: oesophagus, stomach, small intestine and large intestine. The authors evaluated the proposed framework on 30 local datasets as well as the publicly available KID database. The average recall and precision were estimated at 0.86 and 0.83 on the local datasets, and 0.82 and 0.83 on the KID database, respectively. The results reveal that a significant reduction in review time is feasible using the proposed technique, and quantitative summarisation results show that the proposed method is more effective than three methods in the literature.
Correction of complex purple fringing by green-channel compensation and local luminance adaptation
- Author(s): Parveen Malik and Kannan Karthik
- Source: IET Image Processing, Volume 14, Issue 1, p. 154–167
- DOI: 10.1049/iet-ipr.2019.0732
- Type: Article
In natural photography, defects in the camera imaging pipeline often result in some form of colour noise or distortion. The nature of this distortion is generally intertwined with scene-dependent variables such as the positioning, intensity and composition of the light source and the local object colour reflectivity. One such defect is purple fringing aberration (PFA). PFA problems are of two types: a localised fringing effect near high-contrast zones (termed isolated PFA, or IS-PFA) and a widespread semi-transparent purple haze over a large part of the natural scene (termed complex PFA, or C-PFA). Most PFA-correction solutions have targeted IS-PFA and very few C-PFA. Based on the premise that in C-PFA the green channel is heavily suppressed and noisy while the colour information in the red and blue channels is largely conserved, the authors propose a green-channel compensation algorithm for restoring the true natural colours in the fringe-affected region. To correct the white tufts produced by the compensation algorithm, they also devise a localised luminance adaptation procedure to equalise the perceived changes in the luminance profile. Comparisons with state-of-the-art methods devised to combat this purple haze effect yield promising results for a majority of test cases.
Novel image restoration method based on multi-frame super-resolution for atmospherically distorted images
- Author(s): Yinhao Li ; Katsuhisa Ogawa ; Yutaro Iwamoto ; Yen-Wei Chen
- Source: IET Image Processing, Volume 14, Issue 1, p. 168–175
- DOI: 10.1049/iet-ipr.2019.0319
- Type: Article
In this study, the authors propose a novel multi-frame super-resolution method using frame selection and multiple fusions to enhance the quality of atmospherically distorted, zoomed-in images. When a small part of an image of a target several kilometres from a fixed camera is enlarged, its quality is poor owing to low resolution, spatial deformation and noise caused mainly by the long distance and atmospheric turbulence. The authors therefore propose an adaptive frame selection method that selects only a few frames with small blur, based on the corresponding images with relatively clear edges. Further, they propose multiple fusion schemes to reconstruct the selected frames, thereby suppressing the influence of deformation. By converting all the frames into high resolution based on each frame and integrating them, the multiple fusion scheme removes deformation and noise effectively without high computational cost. The proposed method exhibits superior performance to state-of-the-art image super-resolution methods in terms of accuracy, efficiency and ease of implementation, making it suitable for enhancing the quality of images captured with a general digital camera or a smartphone.
Fusing HOG and convolutional neural network spatial–temporal features for video-based facial expression recognition
- Author(s): Xianzhang Pan
- Source: IET Image Processing, Volume 14, Issue 1, p. 176–182
- DOI: 10.1049/iet-ipr.2019.0293
- Type: Article
Video-based facial expression recognition (VFER) is a fundamental component of various computer-vision applications. Visual features are the key factor in facial expression recognition; however, the gap between visual features and emotions is large. To bridge this gap, the proposed method combines convolutional neural networks (CNNs) and the histogram of oriented gradients (HOG) to obtain a more comprehensive feature for VFER. First, it extracts shallow features from each video frame through a number of convolutional kernels in the CNN, giving invariance to displacement, scale and deformation. Then, HOG features, which are strongly correlated with facial expressions, are extracted from the CNN's shallow features. Finally, a support vector machine (SVM) performs the facial expression recognition. Extensive experiments on the RML, CK+ and AFEW5.0 databases show that this framework achieves promising performance and outperforms the state of the art.
Level set based shape prior and deep learning for image segmentation
- Author(s): Yongming Han ; Shuheng Zhang ; Zhiqing Geng ; Qin Wei ; Zhi Ouyang
- Source: IET Image Processing, Volume 14, Issue 1, p. 183–191
- DOI: 10.1049/iet-ipr.2018.6622
- Type: Article
Deep convolutional neural networks can effectively extract hidden patterns in images and learn realistic image priors from the training set, and fully convolutional networks (FCNs) have achieved state-of-the-art performance in image segmentation. However, these methods suffer from noise, boundary roughness and the lack of a prior shape. Therefore, this study proposes a level-set method with a deep prior for image segmentation, based on priors learned by FCNs. The FCNs learn high-level semantic patterns from the training set, and their output represents the high-level semantic information as a probability map; a global affine transformation yields the optimal affine transformation of the intrinsic prior shape. The improved level-set method then integrates the original image, the probability map and the corrected prior shape to achieve the segmentation. Compared with the traditional level-set method for simple scenes, the proposed method overcomes the disadvantage of FCNs by using the high-level semantic information to segment images of complex scenes. Finally, the Portrait dataset is used to verify the effectiveness of the proposed method. The experimental results show that the proposed method obtains more accurate segmentation results than traditional FCNs.
DRU-net: a novel U-net for biomedical image segmentation
- Author(s): Xuegang Hu and Hongguang Yang
- Source: IET Image Processing, Volume 14, Issue 1, p. 192–200
- DOI: 10.1049/iet-ipr.2019.0025
- Type: Article
With the wide application of biomedical images in the medical field, biomedical image segmentation plays an important role in clinical diagnosis, pathological analysis and medical intervention. Fully convolutional neural networks, especially U-net, have greatly improved segmentation performance in recent years. However, owing to their regular geometric structure, the standard convolutions they use are inherently limited in dealing with geometric transformations, while biomedical objects vary hugely in shape and size. In this study, the authors propose DRU-net, a novel U-net with a deformable-convolution encoder and a reshaping-upsampling-convolution decoder, for biomedical image segmentation. First, deformable convolutional networks are applied and improved to enhance the encoder's ability to learn geometric transformations. Second, a novel upsampling method named reshape upsampling convolution is proposed to better restore resolution and fuse features. Furthermore, focal loss is used to address class imbalance, which can otherwise overwhelm the model in biomedical image segmentation tasks. Theoretical analysis and experimental results show that the proposed algorithm not only reduces the number of parameters of U-net but also produces competitive results compared with state-of-the-art algorithms in terms of various quantitative measures on the Drosophila electron microscopy dataset and the Warwick-QU dataset.
Optimisation of linear dependence energy for object co-segmentation in a set of images with heterogeneous contents
- Author(s): Hager Merdassi ; Walid Barhoumi ; Ezzeddine Zagrouba
- Source: IET Image Processing, Volume 14, Issue 1, p. 201–210
- DOI: 10.1049/iet-ipr.2018.5176
- Type: Article
This work proposes a framework for simultaneously segmenting foreground objects in a collection of images with heterogeneous contents. Rather than resorting to image co-segmentation, which segments similar objects in multiple images and requires categorised images, the authors' idea disseminates segmentation information within the images so that foreground objects can be detected in all of them simultaneously, whether the images are similar or different. General information on foregrounds as well as backgrounds is aggregated from the set of images for joint segmentation of category-independent objects. The key idea is to estimate the linear dependence of the foreground histograms of the input images in order to optimise a Markov random field-based energy function; iterative optimisation over each image then enhances the final segmentation results. Extensive experiments demonstrate that the proposed method (PM) enables full-object segmentation of foreground objects within a collection of images composed of different classes. Validation of the accuracy on five challenging datasets (iCoseg, Oxford Flowers, Microsoft Research Cambridge (MSRC), Caltech101 and Berkeley) shows that the PM achieves satisfactory results compared with state-of-the-art methods. Besides, it can efficiently deal with uncategorised objects.