Mid-level deep Food Part mining for food image recognition

There has been growing interest in food image recognition for a wide range of applications. Among existing methods, mid-level image part-based approaches show promising performance owing to their suitability for modelling deformable food parts (FPs). However, their achievable accuracy is limited by FP representations built on low-level features. Benefiting from the capacity to learn powerful features from labelled data, deep learning approaches have achieved state-of-the-art performance in several food image recognition problems. Mid-level part-based approaches and deep convolutional neural networks (DCNNs) each have their respective advantages, and, most importantly, the two can be considered complementary. The authors therefore propose a novel framework that better utilises DCNN features for food images by jointly exploiting the advantages of both the mid-level-based and the DCNN approaches. They further tackle the challenge of training a DCNN model with unlabelled mid-level part data by designing a clustering-based FP label mining scheme that generates part-level labels from the unlabelled data. Experiments on three benchmark food image datasets demonstrate that the proposed approach achieves competitive performance compared with existing food image recognition approaches.
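The core of the label mining scheme described above is to cluster the features of unlabelled mid-level part proposals and treat each part's cluster index as a pseudo-label for supervised training. The sketch below illustrates that idea only in miniature: plain k-means on synthetic 4-D "part features" standing in for DCNN activations. The function name `mine_part_labels`, the farthest-point initialisation, and all parameter choices are hypothetical illustrations, not the authors' actual scheme.

```python
import numpy as np

def mine_part_labels(part_features, n_clusters=3, n_iters=20):
    """Cluster unlabelled part features with k-means and return each
    part's cluster index as its mined pseudo-label (illustrative toy)."""
    X = np.asarray(part_features, dtype=float)
    # Farthest-point initialisation: start from the first part, then
    # repeatedly add the part farthest from all chosen centroids.
    centroids = [X[0]]
    for _ in range(n_clusters - 1):
        d = np.min(np.linalg.norm(X[:, None, :] - np.asarray(centroids)[None, :, :],
                                  axis=2), axis=1)
        centroids.append(X[int(d.argmax())])
    centroids = np.asarray(centroids)
    # Standard k-means iterations: assign, then recompute centroids.
    for _ in range(n_iters):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(n_clusters):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels

# Toy data: three well-separated groups of ten 4-D feature vectors.
rng = np.random.default_rng(1)
feats = np.vstack([rng.normal(loc=c, scale=0.1, size=(10, 4))
                   for c in (0.0, 5.0, 10.0)])
pseudo_labels = mine_part_labels(feats, n_clusters=3)
```

The mined `pseudo_labels` would then play the role of part-level supervision when fine-tuning a DCNN on the part patches; in practice the features would be DCNN activations of part proposals rather than synthetic vectors.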
