Optical, image and video signal processing
Filter by subject:
- Optical, image and video signal processing [10240]
- Electrical and electronic engineering [10220]
- Communications [10220]
- Information and communication theory [10220]
- Computer and control engineering [8338]
- Computer hardware [8171]
- Logic design and digital techniques [8154]
- Digital signal processing [8136]
- Computer vision and image processing techniques [7326]
- General topics, engineering mathematics and materials science [4328]
Filter by content type:
- Article [6968]
- ConferencePaper [2883]
- Chapter [361]
- E-Book [28]
- Appendix [18]
- E-First Article [7]
- ReferenceWorks [1]
Filter by publication date:
- 2019 [721]
- 2020 [623]
- 2018 [614]
- 2015 [606]
- 2012 [603]
- 2017 [441]
- 2013 [438]
- 2011 [395]
- 2009 [385]
- 2006 [365]
- 2008 [353]
- 1999 [346]
- 2016 [313]
- 2005 [292]
- 2007 [285]
- 2010 [277]
- 2003 [243]
- 2000 [202]
- 2004 [187]
- 2014 [176]
- 2002 [149]
- 2001 [144]
- 1998 [70]
- 1985 [3]
- 1993 [2]
Filter by author:
- M. Ghanbari [36]
- E. Izquierdo [32]
- A.M. Kondoz [31]
- Zegang Ding [30]
- Cheng Hu [27]
- E.R. Davies [24]
- D.R. Bull [23]
- Haifeng Hu [22]
- S.A. Velastin [22]
- Tao Zeng [22]
- Mengdao Xing [21]
- Ming Li [21]
- J. Jiang [20]
- J. Yang [20]
- Jian Yang [20]
- J. Li [19]
- Zheng Bao [19]
- Teng Long [18]
- X. Li [18]
- Christoph Busch [17]
- Jie Yang [17]
- Jun Wang [17]
- S.-J. Ko [17]
- Y. Liu [17]
- Z. Bao [17]
- A.H. Sadka [16]
- Di Yao [16]
- F. Deravi [16]
- He Chen [16]
- Hong Liu [16]
- J. Zhang [16]
- Liang Chen [16]
- Y. Wang [16]
- A. Bouridane [15]
- C.N. Canagarajah [15]
- Daiyin Zhu [15]
- F. Bremond [15]
- J. Kim [15]
- L. Zhang [15]
- S. Lee [15]
- Wanggen Wan [15]
- Wei Zhang [15]
- Wen Hong [15]
- Yong Wang [15]
- A. Hilton [14]
- Andreas Uhl [14]
- M.S. Nixon [14]
- Marco Martorella [14]
- W.A.C. Fernando [14]
- Wei Liu [14]
- X. Wang [14]
- X. Yang [14]
- Chao Wang [13]
- E. Jones [13]
- E.A.B. da Silva [13]
- Fukun Bi [13]
- Jun Zhang [13]
- Lingjiang Kong [13]
- M. Glavin [13]
- M. Xing [13]
- M.C. Fairhurst [13]
- T. Vlachos [13]
- Tian Jin [13]
- Wei Wang [13]
- Xingzhao Liu [13]
- Zhe-Ming Lu [13]
- A. Kumar [12]
- Lei Zhang [12]
- Qun Zhang [12]
- S.K. Mitra [12]
- Weiming Tian [12]
- Xiang Li [12]
- Yizhuang Xie [12]
- D. Crookes [11]
- J.J. Soraghan [11]
- Jian Wang [11]
- Jing Zhang [11]
- M. Fleury [11]
- M. Martorella [11]
- M. Thonnat [11]
- M.N.S. Swamy [11]
- P. Lombardo [11]
- Rae-Hong Park [11]
- S. Worrall [11]
- Y. Li [11]
- Y. Zhao [11]
- Yong Li [11]
- A.T.S. Ho [10]
- Baojun Zhao [10]
- C. Abhayaratne [10]
- Christian Rathgeb [10]
- Fei Gao [10]
- Gang Li [10]
- J. Jeong [10]
- J. Xu [10]
- L. Liu [10]
- Lei Wang [10]
- M.A. Joshi [10]
- R. Cucchiara [10]
- Rui Wang [10]
In this research, a digital image processing method (DIPM) is used as an innovative approach to precisely predict the shape of electrical trees (ETs) in cross-linked polyethylene (XLPE) power cables in the presence of air voids, based on field calculation using the finite-element method (FEM). With the help of the DIPM, two case studies are conducted to detect the accurate parameters of either the first initiated major branch or the tips of the major branches of the ET. A hyperbolic needle-to-plane simulation model is proposed to illustrate the ET inception and propagation stages. The non-uniform electric fields that accompany the electrical treeing phenomenon are calculated using FEM, one of the most effective numerical methods for dealing with non-uniform shapes. The predicted shapes of ET initiation and growth are provided in an innovative manner through the implemented hybrid connection between FEM and the DIPM for the two proposed case studies. A direction branching approach and a deviation angle branching approach are provided in this work to predict the shape and direction of ET branched voids. The validity of the proposed model is assessed with the help of available previous experimental and simulation data.
The advancement in sensing technologies and infrastructure allows real-time condition monitoring of wind turbines (WTs), which helps improve power generation efficiency and lower the maintenance costs of wind farms (WFs). In practice, real-time measurements can be unavailable at the Supervisory Control and Data Acquisition (SCADA) end due to unintended events such as sensor faults and communication loss, which significantly degrades condition monitoring and fault detection performance. To mitigate the impact of missing data on data-driven WF applications, this study develops a robust anomaly detection approach for WT fault detection using a denoising variational autoencoder. In the presence of missing measurements, the proposed approach not only sustains high fault detection performance but also recovers the missing data as an auxiliary function. The proposed approach is tested on a realistic offshore WF and compared with other autoencoder variants and traditional anomaly detection methods. The testing results verify the outstanding robustness of the proposed approach against missing-data events and demonstrate its great potential in missing data recovery.
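The following is a minimal sketch of the reconstruction-error scoring idea behind such an approach, using a plain denoising autoencoder rather than the authors' denoising variational autoencoder; the feature dimension, noise level and network sizes are illustrative assumptions.

```python
# Minimal sketch: reconstruction-error anomaly scoring with a denoising
# autoencoder (the paper uses a denoising *variational* AE on SCADA data;
# all sizes and the scoring rule here are hypothetical).
import torch
import torch.nn as nn

class DenoisingAE(nn.Module):
    def __init__(self, n_features=16, latent=4):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, 32), nn.ReLU(),
                                     nn.Linear(32, latent))
        self.decoder = nn.Sequential(nn.Linear(latent, 32), nn.ReLU(),
                                     nn.Linear(32, n_features))

    def forward(self, x):
        # Corrupt the input during training so the model learns to denoise;
        # at test time, reconstruction error on (possibly incomplete) inputs
        # serves as the anomaly score.
        if self.training:
            x = x + 0.1 * torch.randn_like(x)
        return self.decoder(self.encoder(x))

def anomaly_score(model, x):
    """Per-sample reconstruction error; high values flag potential WT faults."""
    model.eval()
    with torch.no_grad():
        recon = model(x)
    return ((recon - x) ** 2).mean(dim=1)
```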
This work introduces and evaluates a model for predicting driver behaviour, namely turning or proceeding straight, at traffic light intersections from the driver's three-dimensional gaze data and traffic light recognition. Based on vehicular data, this work relates the traffic light position, the driver's gaze, head movement and distance from the centre of the traffic light to build a model of driver behaviour. The model can be used to predict the expected driver manoeuvre 3 to 4 s before arrival at the intersection. As part of this study, a framework for driving scene understanding based on driver gaze is presented. The outcomes of this study indicate that this deep learning framework for measuring, accumulating and validating different driving actions may be useful in developing models for predicting driver intent before intersections and perhaps in other key driving situations. Such models are an essential part of advanced driving assistance systems that help drivers in the execution of manoeuvres.
Handheld gun detection has important applications in both video forensics and surveillance, because a gun is operated by hand when a crime is committed with it. Significant applications include vulnerable places such as airports, marketplaces and shopping malls. In view of the non-availability of a relevant public data set, this study provides a newly created mimicked video data set for detecting a gun carried by a person, entitled the Tripura University Video Data set for Crime-Scene-Analysis (TUVD-CSA). The effects of illumination, occlusion, rotation, pan, tilt and scaling of the gun are demonstrated in it. Moreover, the authors propose an Iterative Model Generation Framework (IMGF) for gun detection that is immune to scaling and rotation. Instead of locating the best-matched object (gun) in the whole reference image against a query model via exhaustive search, IMGF searches only where the moving person carrying the gun appears, which drastically reduces the computational overhead associated with a general template matching scheme. This is achieved by a background subtraction algorithm. Experimental results demonstrate that the proposed IMGF performs gun detection efficiently with fewer true negatives compared with state-of-the-art methods.
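A minimal sketch of the search-space reduction idea, using OpenCV background subtraction to confine template matching to the moving-person region; the file names, template and threshold are placeholders, not the authors' IMGF implementation.

```python
# Minimal sketch: restrict template matching to the moving region found by
# background subtraction, instead of searching the whole frame.
import cv2

cap = cv2.VideoCapture("surveillance.mp4")      # hypothetical input video
template = cv2.imread("gun_template.png", 0)    # hypothetical query model
subtractor = cv2.createBackgroundSubtractorMOG2()

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    mask = subtractor.apply(frame)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        continue
    # Keep only the largest moving blob (assumed to be the person).
    x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
    roi = gray[y:y + h, x:x + w]
    if roi.shape[0] < template.shape[0] or roi.shape[1] < template.shape[1]:
        continue
    # Template matching only inside the moving region.
    result = cv2.matchTemplate(roi, template, cv2.TM_CCOEFF_NORMED)
    _, score, _, loc = cv2.minMaxLoc(result)
    if score > 0.7:  # hypothetical detection threshold
        print("possible gun at", (x + loc[0], y + loc[1]), "score", score)
```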
A novel learning-based end-to-end network for stereo matching, named Multi-path Attention Stereo Matching network (MPA-Net), is introduced in this study. Different from existing methods, a multi-path attention aggregation module (MPA) is designed first: a unified structure using three different parallel layers, each with its own attention mechanism, to extract multi-scale informational features. Second, the method of cost volume construction is extended, differing from traditional stereo matching methods: the absolute difference between the two input feature maps is computed. Furthermore, a U-shaped structure with a 3D attention gate is selected as the encoder-decoder module. Specifically, this module fuses the encoding features with their corresponding decoding features under the supervision of the authors' attention gate with skip connections, and thus exploits more significant information for matching cost regularisation and disparity prediction. Finally, experiments are conducted to evaluate the network on the SceneFlow, KITTI2012 and KITTI2015 data sets. The results show that the method achieves better disparity map prediction compared with some existing state-of-the-art methods on the KITTI benchmark.
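A minimal sketch of cost volume construction from the absolute difference of left and right feature maps, as described for MPA-Net; tensor sizes and the maximum disparity are illustrative.

```python
# Minimal sketch: absolute-difference cost volume over candidate disparities.
import torch

def abs_diff_cost_volume(feat_l, feat_r, max_disp):
    """feat_l, feat_r: [B, C, H, W] feature maps; returns [B, C, D, H, W]."""
    b, c, h, w = feat_l.shape
    volume = feat_l.new_zeros(b, c, max_disp, h, w)
    for d in range(max_disp):
        if d == 0:
            volume[:, :, d] = (feat_l - feat_r).abs()
        else:
            # Shift the right features by d pixels and compare where they overlap.
            volume[:, :, d, :, d:] = (feat_l[:, :, :, d:] - feat_r[:, :, :, :-d]).abs()
    return volume

cost = abs_diff_cost_volume(torch.randn(1, 32, 64, 128),
                            torch.randn(1, 32, 64, 128), max_disp=48)
```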
In recent years, consumer depth cameras have been widely used in digital entertainment and human-machine interaction owing to their real-time performance and low cost. Facial depth maps have shown great potential in 3D-face-related studies. However, their low resolution and precision limit further applications. In this work, the authors propose an edge-guided convolutional neural network for single facial depth map super-resolution. It consists of two parts: an edge prediction sub-network and a depth reconstruction sub-network. The edge prediction sub-network generates an edge guidance map that guides the depth reconstruction sub-network to recover sharp edges and fine structures. Effective data augmentation methods are proposed as well. The network is patch-based and able to cope with input depth maps of any size. In addition, it is insensitive to face pose, since the synthetic training data set the authors generated covers a wide range of face poses. The proposed method is validated on three data sets: a synthetic facial depth data set, a real Kinect V2 facial depth data set and the Middlebury Stereo data set. Experimental results show that it outperforms the state-of-the-art methods on all three data sets.
In image retrieval the query image is usually a simple, single object, whereas the reference images in the database usually contain many distractions. The precision of image retrieval can be greatly improved if the target regions in the database images are extracted during retrieval. This paper therefore proposes a BoW image retrieval method based on SSD target detection. First, the training gallery is manually annotated to record location and size information. Second, the SSD target detection model is trained with the labelled training gallery to obtain the target-object SSD model. Third, the SSD model is used to locate the similar target regions of the reference images and the query image. Finally, the target region information is mapped into the convolutional features, and these feature vectors are used for image similarity matching. The performance of the proposed method is evaluated on the Paris6k, Oxford5k, Paris106k and Oxford105k databases. The experimental results show that the accuracy of image retrieval is greatly improved by adding optimisation methods in the proposed image retrieval framework, and the retrieval accuracy of this method is higher than that of similar methods from recent years.
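A minimal sketch of the region-to-feature mapping step: an SSD-detected box is projected into convolutional feature-map coordinates and average-pooled into an L2-normalised descriptor for similarity matching; the stride and shapes are assumptions, not the paper's exact configuration.

```python
# Minimal sketch: map a detected box into feature-map coordinates and pool a
# descriptor from that region only, rather than from the whole image.
import numpy as np

def region_descriptor(feature_map, box, stride=16):
    """feature_map: [C, Hf, Wf] array; box: (x1, y1, x2, y2) in image pixels."""
    x1, y1, x2, y2 = [int(round(v / stride)) for v in box]
    c, hf, wf = feature_map.shape
    x1, y1 = max(0, x1), max(0, y1)
    x2 = min(wf, max(x2, x1 + 1))
    y2 = min(hf, max(y2, y1 + 1))
    region = feature_map[:, y1:y2, x1:x2]
    vec = region.mean(axis=(1, 2))               # average-pool the region
    return vec / (np.linalg.norm(vec) + 1e-12)   # L2-normalise for cosine matching

query_vec = region_descriptor(np.random.rand(512, 32, 32), (100, 80, 260, 240))
```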
With the development of artificial intelligence and image processing technology, more and more intelligent diagnosis technologies are used in cervical cancer screening. Among them, the detection of cervical lesions by thin liquid-based cytology is the most common method for cervical cancer screening. At present, most cervical cancer detection algorithms use object detection technology developed for natural images, often with only minor modifications, ignoring the specificity of the complex application scenario of cervical lesion detection in cervical smear images. In this study, the authors combine the domain knowledge of cervical cancer detection and the characteristics of pathological cells to design a network and propose a booster for cervical cancer detection (CCDB). The booster mainly consists of two components: a refinement module and a spatial-aware module. The characteristics of cancer cells are fully considered in the booster, which is lightweight and transplantable. As far as the authors know, they are the first to design a CCDB according to the characteristics of cervical cancer cells. Compared with the baseline (RetinaNet), the sensitivity at four false positives per image and the average precision of the proposed method are improved by 2.79% and 7.2%, respectively.
Currently, the existing literature on multipurpose watermarking employs multiple watermarks to achieve multiple security objectives simultaneously. This approach can be problematic, since a watermark targeting tampering might fail to detect tampering of another watermark inserted against copyright violation. Moreover, most current schemes follow a non-blind approach where the original watermark is needed for authentication. A question that naturally arises is whether multipurpose watermarking can be realised with the insertion of a single watermark in a blind way. The goal of the authors' study is to provide an affirmative answer to this important question. They show how a cryptographic primitive called verifiable threshold secret sharing can be used to build a generic construction for blind multipurpose watermarking that inserts a single watermark into the host image to achieve copyright protection, authentication and tamper localisation simultaneously. The generic property of the proposed scheme provides flexibility in choosing the embedding/extraction of the watermark based on the desired level of fidelity, robustness and capacity. Experimental results are provided to confirm the superiority of the proposed technique compared with existing approaches.
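For context, a minimal sketch of plain (t, n) Shamir threshold secret sharing, the core primitive behind verifiable threshold secret sharing; a verifiable variant (e.g. Feldman's scheme) additionally publishes commitments to the polynomial coefficients, which is omitted here.

```python
# Minimal sketch of (t, n) Shamir threshold secret sharing over a prime field.
import random

PRIME = 2**127 - 1  # a Mersenne prime larger than any secret shared here

def make_shares(secret, t, n):
    # Random degree-(t-1) polynomial with the secret as constant term.
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(t - 1)]
    def poly(x):
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
    return [(x, poly(x)) for x in range(1, n + 1)]

def recover(shares):
    # Lagrange interpolation at x = 0 over GF(PRIME).
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret

shares = make_shares(123456789, t=3, n=5)
assert recover(shares[:3]) == 123456789   # any 3 of the 5 shares suffice
```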
It is extremely challenging to achieve excellent accuracy in gesture recognition with an approach whose computational cost at recognition time is low. This study compares the hand gesture recognition accuracy of a single-viewpoint set-up with the proposed two-viewpoint set-up for different classification techniques. The efficacy of the presented approach is verified practically with various image processing, feature extraction and classification techniques. A two-camera system makes geometry learning and a three-dimensional (3D) view feasible compared with a single-camera system. Geometrical features from the additional viewpoint contribute to 3D view estimation of the hand gesture and also improve the classification accuracy. Experimental results demonstrate that the proposed method shows an increased recognition rate compared with the single-camera system, and also performs well with simple classifiers such as nearest neighbour and decision tree. Classification within 1 s is considered real-time in this study.
Retinal vessel segmentation has important application value in clinical diagnosis. If experts segment the retinal vessels manually, the workload is heavy and the result is highly subjective. However, some existing automatic segmentation methods suffer from incomplete vessel segmentation and low segmentation accuracy. To solve these problems, this study proposes a retinal vessel segmentation method based on a task-driven generative adversarial network (GAN). In the generative model, a U-Net network is used to segment the retinal vessels. In the discriminative model, multi-scale discriminators with different receptive fields are used to guide the generative model to generate more detail. Furthermore, in view of the uncontrollable characteristics of the data generated by a traditional GAN, a task-driven model based on perceptual loss is added to the traditional GAN for feature matching, which makes the generated image more task-specific. Experimental results show that the accuracy, sensitivity, specificity and area under the receiver operating characteristic curve of the proposed method on the digital retinal images for vessel extraction data set are 96.83%, 80.66%, 98.97% and 0.9830, respectively.
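A minimal sketch of a perceptual (feature-matching) loss of the kind that can drive such a task-driven GAN, computed on pretrained VGG16 features; the layer choice and loss form are illustrative rather than the authors' exact configuration, and a recent torchvision is assumed.

```python
# Minimal sketch: perceptual loss comparing feature activations instead of pixels.
import torch
import torch.nn.functional as F
from torchvision.models import vgg16

class PerceptualLoss(torch.nn.Module):
    def __init__(self, layer_idx=16):
        super().__init__()
        # Frozen VGG16 feature extractor (assumes 3-channel, ImageNet-normalised inputs).
        features = vgg16(weights="IMAGENET1K_V1").features[:layer_idx]
        for p in features.parameters():
            p.requires_grad = False
        self.features = features.eval()

    def forward(self, generated, target):
        # L1 distance between deep feature maps of the generated and target images.
        return F.l1_loss(self.features(generated), self.features(target))
```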
The water quality, contaminant migration characteristics and quantity of pollutant emissions in a basin have a great impact on aquatic creatures, agricultural irrigation, human life and so on. In the aquaculture industry, because water colour can reflect the species and number of phytoplankton in the water, the water quality type can be obtained by analysing the colour of the aquaculture water using image processing techniques. Therefore, this study proposes an intelligent monitoring approach for water quality. The critical features of water colour images are extracted, and then, using machine learning methods, an intelligent system for water quality monitoring is established based on a fused random vector functional link network (RVFL) and group method of data handling (GMDH) model. The proposed approach shows superior performance relative to other state-of-the-art methods, achieving an average prediction accuracy of 96.19% on the feature dataset. Experimental findings demonstrate the validity of the proposed approach and show that it can be applied efficiently to water quality monitoring.
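A minimal sketch of the RVFL half of the fused model: random hidden weights, direct input-to-output links, and output weights solved in closed form by ridge regression; the hidden-layer size and regularisation strength are illustrative assumptions.

```python
# Minimal sketch of a random vector functional link (RVFL) network.
import numpy as np

class RVFL:
    def __init__(self, n_hidden=100, lam=1e-3, seed=0):
        self.n_hidden, self.lam = n_hidden, lam
        self.rng = np.random.default_rng(seed)

    def fit(self, X, Y):
        n_features = X.shape[1]
        # Hidden weights and biases are random and never trained.
        self.W = self.rng.normal(size=(n_features, self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = np.hstack([X, np.tanh(X @ self.W + self.b)])   # direct links + hidden layer
        # Closed-form ridge solution for the output weights.
        self.beta = np.linalg.solve(H.T @ H + self.lam * np.eye(H.shape[1]), H.T @ Y)
        return self

    def predict(self, X):
        H = np.hstack([X, np.tanh(X @ self.W + self.b)])
        return H @ self.beta

# Usage with placeholder data (Y one-hot for classification of water quality types).
model = RVFL().fit(np.random.rand(50, 8), np.eye(3)[np.random.randint(0, 3, 50)])
```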
Existing single image super-resolution methods based on deep learning cannot handle multiple degradations well, and the generated image tends to be blurred and over-smoothed due to poor generalisation ability. In this study, the authors propose a method based on a generative adversarial network (GAN) to deal with multiple degradations. In the generator network, the blur kernel and noise level are used as input through a dimensionality-stretching preprocessing strategy to make full use of prior knowledge. In addition, three discriminators with different scales are used in the discriminator network to attend to the reconstruction of image details while focusing on the global consistency of the image. To address the vanishing gradient and mode collapse problems of GAN-based methods, a gradient penalty term is added to the loss function. Extensive experiments demonstrate that the proposed method not only handles multiple degradations to obtain state-of-the-art performance but also delivers visually credible results in real scenes.
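A minimal sketch of the gradient-penalty term added to the adversarial loss (WGAN-GP style) to mitigate vanishing gradients and mode collapse; the penalty weight is illustrative.

```python
# Minimal sketch: gradient penalty on interpolations between real and fake images.
import torch

def gradient_penalty(discriminator, real, fake, lam=10.0):
    # Random interpolation between real and generated samples.
    alpha = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = (alpha * real + (1 - alpha) * fake).requires_grad_(True)
    scores = discriminator(interp)
    # Gradient of the critic scores with respect to the interpolated images.
    grads = torch.autograd.grad(outputs=scores, inputs=interp,
                                grad_outputs=torch.ones_like(scores),
                                create_graph=True)[0]
    grads = grads.view(grads.size(0), -1)
    # Penalise deviation of the gradient norm from 1.
    return lam * ((grads.norm(2, dim=1) - 1) ** 2).mean()
```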
This study presents insights into the computational complexity of fractal image compression (FIC) algorithms. Unlike JPEG, a fractal encoder requires far more CPU time than the decoder. The study examines the various factors that affect the encoder and its computational cost. Many researchers have worked on fractal encoding to overcome the computational cost of the FIC algorithm; here, these approaches are surveyed from the perspective of time complexity. The automated baseline fractal compression algorithm is studied to explain where the delay in the encoder arises. The study establishes how the various approaches trade off decoder quality, compression ratio and CPU time. The experimental section quantifies this trade-off against the fidelity criteria of the baseline algorithm.
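A minimal sketch of the exhaustive range/domain search at the heart of baseline fractal encoding, which is what makes the encoder so much slower than the decoder; 8x8 range blocks, 16x16 domain blocks and a simple contrast/brightness fit are the usual textbook choices, not a specific scheme from this study.

```python
# Minimal sketch: best-matching domain block for one 8x8 range block.
import numpy as np

def encode_block(range_block, image, domain_size=16, step=8):
    """Exhaustive search for the domain block and affine grey-level map (s, o)."""
    rb = range_block.astype(float).ravel()          # 8x8 range block, flattened
    best = None
    h, w = image.shape
    for y in range(0, h - domain_size + 1, step):
        for x in range(0, w - domain_size + 1, step):
            d = image[y:y + domain_size, x:x + domain_size].astype(float)
            # Average 2x2 neighbourhoods to shrink the domain to range size.
            d = d.reshape(domain_size // 2, 2, domain_size // 2, 2).mean(axis=(1, 3)).ravel()
            # Least-squares fit of rb ~ s * d + o (contrast s, brightness o).
            dm, rm = d.mean(), rb.mean()
            var = ((d - dm) ** 2).sum()
            s = ((d - dm) * (rb - rm)).sum() / var if var > 0 else 0.0
            o = rm - s * dm
            err = np.sum((s * d + o - rb) ** 2)
            if best is None or err < best[0]:
                best = (err, (y, x), s, o)
    return best   # the per-block search over all domains dominates encoding time
```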
Currently, principal component analysis (PCA) is widely used in many neural networks and has become a crucial part of convolutional neural network (CNN) feature extraction. However, whether PCA is suitable for this process remains to be elucidated. The authors propose a new method called the balanced principal component (BPC), which generates a balanced local feature and is combined with a CNN as a layer to cope with the fusion problem. Specifically, the BPC layer includes a regionalisation module and an average compression PCA (AC-PCA) module. First, the regionalisation module is used to generate sub-regions that focus on the local feature in each view. Second, the AC-PCA module is a computational process that enlarges the feature matrix by PCA and finally compacts the matrix into a one-dimensional (1D) vector by AC. Next, all 1D vectors are compacted by AC to obtain a multi-dimensional balance. Finally, the layer is designed with an end-to-end trainable structure to support the feature extraction task of the CNN. 3D shapes are addressed using a projection method, with pre-training on ImageNet and transfer learning on the ModelNet dataset. Compared with state-of-the-art networks, a significant gain in retrieval and classification performance is achieved.
Crowd counting is attracting more and more attention because it can effectively help prevent safety problems. However, due to scale variations and background noise in the image, such as buildings and trees, obtaining an accurate count from an image is difficult. To address these problems, this work introduces a new multi-scale supervised network. The proposed model uses part of the VGG16 model as the backbone to extract features. During training, a multi-scale dilated convolution module is added at the end of each stage of the backbone network to generate attention maps at different resolutions that help the model focus on head areas in the feature map. In addition, the dilated convolution adopts three dilation ratios to fit different head sizes in the image. Finally, to obtain a high-quality, high-resolution density map, the authors employ an upsampling operation to restore the density map to a quarter of the original image size. Extensive experiments on four data sets show that the proposed network greatly improves counting accuracy over many existing methods.
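A minimal sketch of a multi-scale dilated-convolution attention head of the kind described, producing an attention map from backbone features and reweighting them; the channel counts are illustrative and the three dilation ratios follow the text only in number.

```python
# Minimal sketch: three parallel dilated convolutions fused into one attention map.
import torch
import torch.nn as nn

class MultiScaleDilatedAttention(nn.Module):
    def __init__(self, in_ch=512, dilations=(1, 2, 3)):
        super().__init__()
        # One 3x3 branch per dilation ratio; padding = dilation keeps spatial size.
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, in_ch // 4, kernel_size=3, padding=d, dilation=d)
            for d in dilations
        ])
        self.to_attention = nn.Conv2d(len(dilations) * (in_ch // 4), 1, kernel_size=1)

    def forward(self, x):
        multi = torch.cat([torch.relu(b(x)) for b in self.branches], dim=1)
        attn = torch.sigmoid(self.to_attention(multi))   # values in (0, 1)
        return x * attn, attn    # reweighted backbone features and the attention map

features, attention = MultiScaleDilatedAttention()(torch.randn(1, 512, 32, 32))
```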
To improve the detection of blurred vehicles and small vehicles, an improved multitask cascaded convolutional neural network (IMC-CNN) based on mixed image enhancement is proposed. First, contrast-limited adaptive histogram equalisation and multi-scale Retinex are used to enhance the images. Mixed image enhancement can effectively address image blurring, low contrast and uneven illumination when the imaging environment is not ideal. IMC-CNN comprises two stages: object location and object classification. The object location network, based on multi-layer feature fusion, can locate and extract objects from complex backgrounds and outputs regions containing only a single vehicle object. The object classification network is a lightweight convolutional neural network with only two convolutional layers, which effectively reduces information loss and improves the classification accuracy of small and blurred objects. In addition, an online hard example mining algorithm and a focal loss function are adopted in network training; these strategies address the imbalance between positive and negative samples. To verify the validity of the proposed algorithm, experiments are performed on the SYIT-Fuzzy and COCO-Vehicle data sets. Compared with Faster R-CNN, YOLO v4 and other recent models, the average classification accuracy of the proposed method is significantly higher.
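A minimal sketch of the mixed enhancement step: CLAHE on the luminance channel combined with a multi-scale Retinex estimate; the CLAHE parameters, Retinex scales and blend weight are assumptions, not the paper's exact settings.

```python
# Minimal sketch: CLAHE + multi-scale Retinex, blended into one enhanced image.
import cv2
import numpy as np

def mixed_enhance(bgr, sigmas=(15, 80, 250), blend=0.5):
    # CLAHE on the L channel of the LAB representation.
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    l = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8)).apply(l)
    clahe_img = cv2.cvtColor(cv2.merge((l, a, b)), cv2.COLOR_LAB2BGR).astype(np.float32)

    # Multi-scale Retinex: average of log(image) - log(Gaussian-blurred image).
    img = bgr.astype(np.float32) + 1.0
    msr = np.zeros_like(img)
    for s in sigmas:
        msr += np.log(img) - np.log(cv2.GaussianBlur(img, (0, 0), s) + 1.0)
    msr /= len(sigmas)
    msr = cv2.normalize(msr, None, 0, 255, cv2.NORM_MINMAX)

    # Blend the two enhancements and return an 8-bit image.
    return cv2.convertScaleAbs(blend * clahe_img + (1 - blend) * msr)
```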
Computer-aided diagnosis (CAD) is a common tool for the detection of diseases, particularly different types of cancers, based on medical images. Digital image processing thus plays a significant role in processing and analysing medical images for disease identification and detection purposes. In this study, an efficient CAD system for acute lymphoblastic leukaemia (ALL) detection is proposed. The proposed approach entails two phases. In the first phase, the white blood cells (WBCs) are segmented from the microscopic blood image. The second phase involves extracting important features, such as shape and texture features, from the segmented cells. Finally, Naïve Bayes and k-nearest neighbour classifiers are applied to the extracted features to classify the segmented cells as normal or abnormal. The performance of the proposed approach has been assessed through comprehensive experiments carried out on the well-known ALL-IDB data set of microscopic blood images. The experimental results demonstrate the superior performance of the proposed approach over the state of the art in terms of accuracy, achieving 98.7%.
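A minimal sketch of the second phase: simple shape and texture features from a segmented cell fed to Naïve Bayes and k-nearest neighbour classifiers; the feature list and placeholder data are illustrative, and ALL-IDB loading is not shown.

```python
# Minimal sketch: shape/texture features per segmented WBC, then NB and k-NN.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

def cell_features(mask, gray_patch):
    """mask: boolean segmentation of one cell; gray_patch: same-sized grey image."""
    area = mask.sum()
    ys, xs = np.nonzero(mask)
    extent = area / ((ys.ptp() + 1) * (xs.ptp() + 1))   # fill ratio of bounding box
    mean, std = gray_patch[mask].mean(), gray_patch[mask].std()
    return [area, extent, mean, std]

# X: rows from cell_features, y: 0 = normal, 1 = abnormal (placeholder data here).
X = np.random.rand(60, 4)
y = np.random.randint(0, 2, 60)
for clf in (GaussianNB(), KNeighborsClassifier(n_neighbors=5)):
    clf.fit(X[:40], y[:40])
    print(type(clf).__name__, "accuracy:", clf.score(X[40:], y[40:]))
```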
Outdoor images have several applications, including autonomous vehicles, geo-mapping and surveillance. Images captured outdoors are commonly prone to degradation arising from natural and man-made extreme atmospheric conditions such as haze, fog and smog. In autonomous vehicle navigation in particular, recovering the ground-truth image is essential for the system to make better decisions. Estimating the transmission map and the air-light is crucial in recovering the ground-truth image. In this study, the authors propose a new method to estimate the transmission map based on a mean channel prior (MCP), which represents the depth map used to estimate the transmission map, together with a deep neural network that identifies hazy images for the subsequent dehazing process. The study thus presents two novel contributions: first, MCP-based image dehazing and, second, deep neural network-based identification of hazy images as a pre-processing block in the proposed end-to-end system. The proposed deep learning network, implemented on the TensorFlow platform, provided a validation accuracy of 93.4% for hazy image classification. Finally, the proposed MCP-based dehazing network showed better performance in terms of peak signal-to-noise ratio, structural similarity index and computational time than existing methods.
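A minimal sketch of transmission-map estimation from a mean channel prior, written by analogy with the dark channel prior: the per-pixel mean over the colour channels is locally averaged and scaled against a crude air-light estimate. The omega value, patch size and air-light estimate are assumptions; the authors' exact formulation may differ.

```python
# Minimal sketch: mean-channel-based transmission map and simple haze removal.
import cv2
import numpy as np

def estimate_transmission(bgr, omega=0.95, patch=15):
    img = bgr.astype(np.float32) / 255.0
    airlight = img.reshape(-1, 3).max(axis=0)            # crude air-light estimate
    mean_channel = (img / airlight).mean(axis=2)          # per-pixel mean over channels
    local_mean = cv2.blur(mean_channel, (patch, patch))   # local averaging
    return np.clip(1.0 - omega * local_mean, 0.1, 1.0)    # transmission map

def dehaze(bgr, t):
    img = bgr.astype(np.float32) / 255.0
    airlight = img.reshape(-1, 3).max(axis=0)
    recovered = (img - airlight) / t[..., None] + airlight
    return np.clip(recovered * 255, 0, 255).astype(np.uint8)
```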
Passive millimetre-wave (PMMW) imaging frequently suffers from blurring and low resolution due to the long wavelengths involved. In addition, the observed images are inevitably disturbed by noise. Traditional image deblurring methods are sensitive to image noise, and even a small amount will greatly reduce the quality of the point spread function (PSF) estimation. In this paper, we propose a blind deblurring and denoising method for passive millimetre-wave images based on a learned deep denoising convolutional neural network (DnCNN) denoiser prior and an adaptive regularised gradient prior. First, a blind deblurring restoration model based on the DnCNN denoising prior constraint is established. Second, the adaptive regularised gradient prior is incorporated into the model to estimate the latent clear image, and the PSF is estimated in the gradient domain. In a multi-scale framework, alternating iterative denoising and deblurring are used to obtain the final PSF and noise estimates. Ultimately, the final clear image is restored by non-blind deconvolution. The experimental results show that the algorithm not only has good detail recovery ability but is also more stable across different noise levels. The proposed method is superior to state-of-the-art methods in terms of both subjective measures and visual quality.
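A minimal sketch of the final non-blind deconvolution step using a frequency-domain Wiener filter given the estimated PSF and a noise-to-signal ratio; the paper does not specify its non-blind solver, so this is a standard stand-in rather than the authors' method.

```python
# Minimal sketch: Wiener deconvolution of a blurred image with a known PSF.
import numpy as np

def wiener_deconvolve(blurred, psf, nsr=0.01):
    """blurred: 2-D image; psf: small 2-D kernel; nsr: noise-to-signal ratio."""
    h, w = blurred.shape
    psf_pad = np.zeros((h, w))
    ph, pw = psf.shape
    psf_pad[:ph, :pw] = psf / psf.sum()
    # Centre the PSF so the restored image is not shifted.
    psf_pad = np.roll(psf_pad, (-(ph // 2), -(pw // 2)), axis=(0, 1))
    H = np.fft.fft2(psf_pad)
    G = np.fft.fft2(blurred)
    # Wiener filter: conj(H) / (|H|^2 + NSR) applied in the frequency domain.
    F = np.conj(H) / (np.abs(H) ** 2 + nsr) * G
    return np.real(np.fft.ifft2(F))
```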