Driving posture recognition by convolutional neural networks

IET Computer Vision

Driver fatigue and inattention have long been recognised as main contributing factors in traffic accidents. This study presents a novel system that applies a convolutional neural network (CNN) to automatically learn and predict pre-defined driving postures. The main idea is to monitor the driver's hand position and extract discriminative information to predict safe/unsafe driving postures. In contrast to previous approaches, CNNs can automatically learn discriminative features directly from raw images. In the authors' work, a CNN model was first pre-trained by an unsupervised feature-learning method called sparse filtering and subsequently fine-tuned for classification. The approach was verified on the Southeast University driving posture dataset, which comprises video clips covering four driving postures: normal driving, responding to a cell-phone call, eating, and smoking. Compared with other popular approaches using different image descriptors and classification methods, the authors' scheme achieves the best performance, with an overall accuracy of 99.78%. To evaluate its effectiveness and generalisation in more realistic conditions, the method was further tested on two additional, specially designed datasets that take into account poor illumination and different road conditions, achieving overall accuracies of 99.3% and 95.77%, respectively.
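The unsupervised pre-training step mentioned above uses sparse filtering, which fits feature weights by minimising the L1 norm of a doubly normalised feature matrix. The sketch below shows the standard sparse-filtering objective (Ngiam et al., 2011) in NumPy; it is a minimal illustration of the criterion, not the authors' exact implementation, and the toy data shapes and function name are assumptions for the example.

```python
import numpy as np

def sparse_filtering_objective(W, X, eps=1e-8):
    """Sparse-filtering objective (Ngiam et al., 2011).

    W : (n_features, n_inputs) weight matrix being learned
    X : (n_inputs, n_examples) data matrix (e.g. vectorised image patches)
    Returns the scalar loss: the summed L1 norm of the
    row- then column-normalised feature matrix.
    """
    # Soft-absolute activation of the linear features
    F = np.sqrt((W @ X) ** 2 + eps)
    # Normalise each feature (row) across all examples
    F = F / np.linalg.norm(F, axis=1, keepdims=True)
    # Normalise each example (column) across all features
    F = F / np.linalg.norm(F, axis=0, keepdims=True)
    # Sparsity: minimise the summed absolute values
    return np.abs(F).sum()

# Toy usage: random filters on random patch data (shapes are illustrative)
rng = np.random.default_rng(0)
W = rng.standard_normal((16, 64))   # 16 filters over 64-dim patches
X = rng.standard_normal((64, 100))  # 100 vectorised patches
loss = sparse_filtering_objective(W, X)
```

In practice this objective is minimised with an off-the-shelf optimiser (e.g. L-BFGS) over W, and the learned filters initialise the CNN's first layer before supervised fine-tuning.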

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cvi.2015.0175