Training CNNs with image patches for object localisation

Convolutional neural networks (CNNs) have recently shown strong performance on a range of computer vision problems, including object detection and localisation. A novel training approach is proposed for CNNs to localise animal species whose bodies have distinctive patterns, such as leopards and zebras. To learn the characteristic patterns, models are trained on small patches taken from different body parts of the animals. To find the object location in a test image, all locations are visited in a sliding-window fashion: crops are fed into the trained CNN and their classification scores are combined into a heat map. The heat maps are then converted to bounding box estimates at varying confidence scores. The localisation performance of the patch-based training approach is compared with that of Faster R-CNN, a state-of-the-art CNN-based object detection and localisation method. Experimental results reveal that patch-based training outperforms Faster R-CNN, especially for classes with distinctive patterns.
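The sliding-window and heat-map steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how scores are combined or how a heat map becomes a box, so averaging the scores of overlapping windows and taking the tight box around all pixels above a threshold are both assumptions. The names `heat_map`, `bounding_box` and `mean_intensity` are hypothetical, and `mean_intensity` is only a stand-in for the trained patch classifier.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Image = Sequence[Sequence[float]]  # greyscale image as rows of pixel values
Box = Tuple[int, int, int, int]    # (top, left, bottom, right), inclusive


def heat_map(image: Image,
             score_patch: Callable[[List[Sequence[float]]], float],
             patch: int = 4,
             stride: int = 2) -> List[List[float]]:
    """Visit every window position and average the scores covering each pixel.

    Averaging overlapping scores is an assumption; the source only says the
    classification scores are "combined" into a heat map.
    """
    h, w = len(image), len(image[0])
    total = [[0.0] * w for _ in range(h)]
    count = [[0] * w for _ in range(h)]
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            crop = [row[left:left + patch] for row in image[top:top + patch]]
            score = score_patch(crop)  # a trained CNN would be called here
            for y in range(top, top + patch):
                for x in range(left, left + patch):
                    total[y][x] += score
                    count[y][x] += 1
    return [[total[y][x] / count[y][x] if count[y][x] else 0.0
             for x in range(w)] for y in range(h)]


def bounding_box(heat: List[List[float]], threshold: float) -> Optional[Box]:
    """Tight box around all pixels scoring at least `threshold` (assumed rule)."""
    hits = [(y, x) for y, row in enumerate(heat)
            for x, value in enumerate(row) if value >= threshold]
    if not hits:
        return None
    ys = [y for y, _ in hits]
    xs = [x for _, x in hits]
    return (min(ys), min(xs), max(ys), max(xs))


def mean_intensity(crop: List[Sequence[float]]) -> float:
    """Hypothetical stand-in classifier: fraction of bright pixels in the crop."""
    values = [v for row in crop for v in row]
    return sum(values) / len(values)


# Toy 8x8 image with a bright 4x4 "object" at rows 2-5, columns 2-5.
toy = [[1.0 if 2 <= y <= 5 and 2 <= x <= 5 else 0.0 for x in range(8)]
       for y in range(8)]
toy_heat = heat_map(toy, mean_intensity)
for confidence in (0.3, 0.5, 0.9):
    print(confidence, bounding_box(toy_heat, confidence))
```

Sweeping the threshold, as in the final loop, mirrors the idea of producing bounding box estimates at varying confidence levels: a lower threshold gives a looser box, a higher one a tighter box or none at all.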

