Detección e identificación de características de vehículos utilizando algoritmos de machine learning / Feature extraction from images of vehicles using machine learning algorithms

Mariotti, Enrique N. (2020) Detección e identificación de características de vehículos utilizando algoritmos de machine learning / Feature extraction from images of vehicles using machine learning algorithms. Maestría en Ingeniería, Universidad Nacional de Cuyo, Instituto Balseiro.

PDF (Thesis)
Available under a Creative Commons licence: Attribution - NonCommercial - ShareAlike.

Spanish
18 MB

Abstract (Spanish)

In this Master of Science in Engineering thesis, a modular implementation is proposed to solve the problem of automatic licence plate recognition (ALPR), based on a detection-detection-recognition scheme built entirely with deep learning. The implementation is intended for practical applications involving visual vehicle control of any kind. Throughout this work a modular approach was used, dividing the main objective into progressive stages. Individual modules were developed that implement a class architecture based on the object-oriented programming (OOP) paradigm, which makes them generic and easily modifiable. In turn, the individual modules or objects can be used for other purposes, such as reading printed text or traffic signs in urban traffic images. During this thesis, transfer learning on deep convolutional neural networks (CNNs) was prioritized, in order to adapt to limited resource availability and to considerably reduce training times. The primary segmentation stage was based on an open-source detection scheme commonly known as YOLOv3, which was re-trained with natural images of vehicles from a publicly accessible dataset. The text recognition stage is carried out by a recurrent-convolutional network (RCNN) that performs recognition of the character sequence. Finally, the link between these two networks is provided by a commercially used detection scheme called CRAFT. This work presents all the implementation details of the scheme and, towards the end, quantitatively evaluates the system's performance using different metrics against OpenALPR, software widely used in the field of licence plate recognition.

Abstract (English)

In this Master of Science in Engineering thesis, a modular implementation is proposed to solve the problem of automatic licence plate recognition (ALPR), based on a detection-detection-recognition scheme implemented entirely with deep learning. The scheme presented here is intended for practical applications involving visual vehicle control of any kind. A modular approach was used throughout this work, dividing the main objective into progressive stages. Individual modules were developed that implement a class architecture based on the object-oriented programming (OOP) paradigm, which makes them generic and easily modifiable. In turn, the individual modules or objects can be used for other purposes, such as reading printed text or traffic signs in images of urban traffic. During this thesis, the use of transfer learning on deep convolutional neural networks (CNNs) was prioritized, in order to adapt to a limited availability of resources and to considerably reduce the time associated with training. The primary segmentation stage was based on an open-source detection scheme commonly known as YOLOv3, which was re-trained with natural images of vehicles belonging to a publicly accessible dataset. The text recognition stage is carried out by means of a recurrent-convolutional network (RCNN) that performs recognition of the character sequence. Finally, the link between these two networks is provided by a commercially used detection scheme called CRAFT. This thesis presents all the details of the scheme's implementation and, near the end, quantitatively evaluates the performance of the system using different metrics against OpenALPR, software widely used in the field of automatic licence plate recognition.
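The detection-detection-recognition pipeline can be sketched as a chain of interchangeable OOP modules, which is the modularity the abstract describes. The class names and stub outputs below are hypothetical illustrations, not the thesis code; the real stages would wrap a re-trained YOLOv3 (vehicle/plate detection), CRAFT (text-region detection) and an RCNN recognizer.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Box:
    x: int
    y: int
    w: int
    h: int

class Stage:
    """Generic pipeline stage: each module exposes one process() call."""
    def process(self, image, regions):
        raise NotImplementedError

class VehicleDetector(Stage):       # would wrap a re-trained YOLOv3
    def process(self, image, regions):
        return [Box(10, 10, 200, 100)]          # stub vehicle box

class TextRegionDetector(Stage):    # would wrap CRAFT
    def process(self, image, regions):
        return [Box(b.x + 40, b.y + 60, 90, 25) for b in regions]

class PlateRecognizer(Stage):       # would wrap the RCNN recognizer
    def process(self, image, regions):
        return ["AB123CD" for _ in regions]     # stub plate string

class ALPRPipeline:
    """Chains the stages; any stage can be swapped, e.g. for sign OCR."""
    def __init__(self, stages: List[Stage]):
        self.stages = stages

    def run(self, image):
        out = None
        for stage in self.stages:
            out = stage.process(image, out)
        return out

pipeline = ALPRPipeline([VehicleDetector(), TextRegionDetector(), PlateRecognizer()])
print(pipeline.run(image=None))  # → ['AB123CD']
```

Because every stage shares the same `process()` interface, replacing the plate recognizer with, say, a traffic-sign reader changes one constructor argument rather than the pipeline code.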

Object type: Thesis (Master's in Engineering)
Keywords: Machine learning; Aprendizaje automático; Automatic licence plate recognition; Reconocimiento automático de patentes vehiculares; Deep learning; Aprendizaje profundo; Convolutional neural networks; Redes neuronales convolucionales; Object-oriented programming; Programación orientada a objetos
Subjects: Engineering > Computer vision
Divisions: INVAP
ID code: 986
Deposited by: Marisa G. Velazco Aldao
Deposited on: 05 Oct 2021 15:31
Last modified: 12 Oct 2021 15:18
