License plate localization in unconstrained scenes using a two-stage CNN-RNN

Jingjing Zhang, Yuanyuan Li, Teng Li (Corresponding author), Lina Xun, Caifeng Shan

Research output: Contribution to journal › Article › Academic › peer-review

20 Citations (Scopus)
1 Downloads (Pure)


Recent deep object detection methods neglect the intrinsic properties of the license plate, which limits detection performance in unconstrained scenes. In this paper, we propose a two-stage deep learning-based method to locate license plates in unconstrained scenes, especially license plates degraded by fouling, occlusion, and similar conditions. A deep network combining a convolutional neural network (CNN) and a recurrent neural network (RNN) is designed. In the first stage, fine-scale proposals are detected according to the characteristics of license plate characters, with the CNN extracting local character features. A vertical anchor mechanism is designed to jointly predict the position and confidence of each fixed-width character. Furthermore, the sequential context of the characters is modeled with a bi-directional long short-term memory (BLSTM) network, which greatly improves the localization rate of license plates in complex scenes. In the second stage, the whole license plate is obtained by connecting the fine-scale proposals. Experimental results show that the proposed method not only locates license plates of different countries accurately but is also robust to illumination variation, noise distortion, and blur. The average precision reaches 97.11% on multi-country license plates, and the precision and recall reach 99.10% and 98.68%, respectively, on Chinese license plate images.
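The second stage described above (linking fixed-width character proposals into a whole plate) can be sketched as a simple grouping step. The following is a minimal illustrative sketch, not the paper's implementation: the proposal layout `(x, y, w, h, score)`, the function name, and the thresholds are all hypothetical.

```python
def connect_proposals(proposals, conf_thresh=0.7, max_gap=24):
    """Merge fixed-width character proposals into one plate box.

    Hypothetical sketch of the paper's stage two: each proposal is
    (x, y, w, h, score); thresholds are illustrative, not from the paper.
    Returns the bounding box (x, y, w, h) of the longest chain of
    horizontally adjacent high-confidence proposals, or None.
    """
    # Keep confident proposals and scan them left to right.
    kept = sorted((p for p in proposals if p[4] >= conf_thresh),
                  key=lambda p: p[0])
    if not kept:
        return None

    # Link proposals whose horizontal gap is small into chains.
    chains = [[kept[0]]]
    for p in kept[1:]:
        prev = chains[-1][-1]
        if p[0] - (prev[0] + prev[2]) <= max_gap:
            chains[-1].append(p)
        else:
            chains.append([p])

    # Take the union box of the longest chain as the plate.
    best = max(chains, key=len)
    x0 = min(p[0] for p in best)
    y0 = min(p[1] for p in best)
    x1 = max(p[0] + p[2] for p in best)
    y1 = max(p[1] + p[3] for p in best)
    return (x0, y0, x1 - x0, y1 - y0)
```

For example, three adjacent 16-pixel-wide proposals at x = 0, 18, 36 form one chain, while a distant proposal at x = 200 starts its own and is discarded in favor of the longer chain.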

Original language: English
Article number: 8643978
Pages (from-to): 5256-5265
Number of pages: 10
Journal: IEEE Sensors Journal
Issue number: 13
Publication status: Published - 1 Jul 2019


  • Convolutional neural networks
  • License plate localization
  • Recurrent neural networks

