CLC number: TP391    Document code: A
Traffic sign recognition combined with attention network
Ma Ping, Yang Xingcai
Department of Automation, North China Electric Power University, Baoding, Hebei 071003
Abstract: Unmanned driving within intelligent transportation systems has been a research hotspot in recent years, both for traditional automobile manufacturers and for the major Internet companies. Current autonomous and assisted driving technologies rely mainly on computer vision to capture road traffic sign signals, and an analysis system then processes and classifies the signs. The prevailing traditional approach extracts hand-crafted features such as HOG and SIFT from the image and feeds them into an SVM or Bayesian classifier to extract and classify traffic signs. In recent years the rapid development of neural networks has also contributed to traffic sign recognition, and networks such as CNNs and Faster R-CNN have been applied to the task. To address the low recognition rate of road traffic signs encountered when building intelligent transportation systems, this paper introduces an attention mechanism into the neural network to recognize traffic sign images effectively. The method extracts features from the input data with a VGG network and adds a progressive attention network that magnifies the attended region and extracts its details, allowing the network to focus more effectively on fine-grained regions. Applied to the Belgian traffic sign dataset, the network achieves excellent recognition results, with a final test-set classification accuracy of 98.2%.
Key words: attention; traffic signs; intelligent transportation; neural network; convolutional network
1 Background and Significance
In recent years, unmanned driving has not only set off a wave of interest in academia but has also become a goal pursued by Internet companies and the traditional automobile industry. The mainstream research direction is autonomous and assisted driving based on machine vision. Traffic signs embody the driving rules people follow in daily travel, and only by obeying them can vehicles travel safely, orderly and efficiently. Accurate recognition of road traffic signs is therefore of great importance both for fully unmanned driving and for today's driver-assistance systems: once a sign is recognized accurately, passing the result to the vehicle's decision system or to the driver helps ensure that the vehicle travels safely and efficiently.
The optimization of the network weights involves two parts: the classification network and the attention network. The first part is the classification loss $LOSS_{class}$, abbreviated $L_{cls}$. The cross-entropy function in Equation (5) is used to compute the classification loss, which drives the adjustment and optimization of the parameters of the feature-extraction network and of the subsequent fully connected classification layers.
$L_{cls}\left(Y^{(n)}, Y^{*(n)}\right) = -\sum_{k} Y_k^{*(n)} \log Y_k^{(n)}$  (5)

where $Y^{(n)}$ is the predicted probability vector output by the $n$-th scale network and $Y^{*(n)}$ is the corresponding ground-truth label vector.
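For illustration, Equation (5) is the standard cross-entropy objective, accumulated over the scales of the network. The following minimal PyTorch sketch is not the authors' code; the tensor names `logits_per_scale` and `labels` are assumptions made for the example.

```python
import torch
import torch.nn.functional as F

def classification_loss(logits_per_scale, labels):
    """Sum the cross-entropy loss of Equation (5) over all attention scales.

    logits_per_scale: list of tensors, each of shape (batch, num_classes),
                      one entry per scale of the network.
    labels:           tensor of shape (batch,) with ground-truth class ids.
    """
    # F.cross_entropy combines log-softmax with the negative log-likelihood,
    # i.e. -sum_k Y*_k log Y_k for the one-hot ground truth Y*.
    return sum(F.cross_entropy(logits, labels) for logits in logits_per_scale)
```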
The weights of the attention network are optimized according to the relative weight of the outputs of the different scales. $p_t^{(1)}$ denotes the prediction output of the first-scale network; when it passes through the softmax layer we keep not only the label at the index of the maximum value but also the probability share assigned to that label, i.e. its ranking. The optimization criterion of the attention network is therefore defined as $LOSS_{rank}$, written $L_{rank}$, whose expression is given in Equation (6).
$L_{rank}\left(p_t^{(n)}, p_t^{(n+1)}\right) = \max\left(0,\; 1 + p_t^{(n)} - p_t^{(n+1)}\right)$  (6)
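Equation (6) is a pairwise hinge on the true-class probabilities of adjacent scales: it is zero only when the finer scale is more confident than the coarser one by at least the margin. A minimal sketch follows; the margin value and tensor names are assumptions for illustration, not the paper's exact settings.

```python
import torch

def rank_loss(p_t_n, p_t_next, margin=1.0):
    """Hinge ranking loss of Equation (6).

    p_t_n, p_t_next: tensors of shape (batch,) holding the softmax probability
                     of the true class t at scale n and at scale n + 1.
    The loss penalizes the network whenever the finer scale (n + 1) is not more
    confident than the coarser scale (n) by at least `margin`.
    """
    return torch.clamp(margin + p_t_n - p_t_next, min=0.0).mean()
```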
The attention region is then selected and updated through Equation (7).
$\dfrac{\partial L_{rank}}{\partial a_x} \propto D_{top} \odot \dfrac{\partial M(a_x, a_y, a_l)}{\partial a_x}$  (7)
Equation (7) describes how the horizontal attention parameter $a_x$ is updated; $a_y$ and $a_l$ are iterated and updated for the selected region in the same way, thereby training and optimizing the weight parameters of the APN.
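For the gradient in Equation (7) to reach $(a_x, a_y, a_l)$, the crop mask $M$ must be differentiable in those parameters; one common way, assumed in the simplified sketch below (which is not the exact APN used in the paper), is to build the boxcar mask from smooth sigmoid step functions.

```python
import torch

def attention_mask(ax, ay, al, height, width, k=10.0):
    """Differentiable boxcar mask M(ax, ay, al) over an image grid.

    ax, ay: centre of the attended square region (in pixels).
    al:     half side length of the region.
    k:      sigmoid steepness; larger k gives sharper box edges.
    Because the mask is built from sigmoids, d(mask)/d(ax, ay, al) exists and
    the ranking loss can move and resize the region as in Equation (7).
    """
    ys = torch.arange(height, dtype=torch.float32).view(-1, 1)
    xs = torch.arange(width, dtype=torch.float32).view(1, -1)
    box_x = torch.sigmoid(k * (xs - (ax - al))) - torch.sigmoid(k * (xs - (ax + al)))
    box_y = torch.sigmoid(k * (ys - (ay - al))) - torch.sigmoid(k * (ys - (ay + al)))
    return box_y * box_x  # shape (height, width), close to 1 inside the box
```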
4 Classification Results and Figures
The experiments were run on a workstation with an Intel i7 CPU at 3.6 GHz, 16 GB of RAM, a GTX 1080 GPU, and the Ubuntu 14.04 operating system.
For the RA-CNN before our improvement, the recognition rate of the first two scales is relatively high, but the output of the third attention scale shows a noticeably lower accuracy. We conjecture that this is because the third scale focuses only on one particular location and neglects the global information, which degrades the recognition result.
The loss curves are shown in Figures 4 and 5. Because the two networks are trained alternately, flat segments appear in the curves. The figures show that the loss decreases effectively, and compared with the network before optimization it converges more quickly.
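The alternating schedule mentioned above (the source of the flat segments) can be sketched as follows. This is only an illustration of the alternation pattern: the toy modules, optimizers, dummy data and the placeholder objective standing in for $L_{rank}$ are all assumptions, not the paper's actual model or training schedule.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-ins for the two alternately trained parts (illustrative only).
backbone_cls = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 62))  # features + classifier
apn = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 3), nn.Sigmoid())  # predicts (ax, ay, al)

opt_cls = torch.optim.SGD(backbone_cls.parameters(), lr=1e-2)
opt_apn = torch.optim.SGD(apn.parameters(), lr=1e-2)

x = torch.randn(8, 3, 32, 32)        # dummy batch of traffic-sign crops
y = torch.randint(0, 62, (8,))       # dummy labels

for step in range(10):
    # Phase A: update the classification part with L_cls; the APN is not
    # optimized here, so its loss curve stays flat during this phase.
    opt_cls.zero_grad()
    F.cross_entropy(backbone_cls(x), y).backward()
    opt_cls.step()

    # Phase B: update the APN only; in the real network this step would use
    # the ranking loss on adjacent scales (a placeholder objective is used
    # here to keep the sketch short and self-contained).
    opt_apn.zero_grad()
    apn(x).mean().backward()
    opt_apn.step()
```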
The accuracy curves are shown in Figures 6 and 7. The recognition accuracy of the second and third scales is clearly higher than that of the network without the attention mechanism. After the improvement, the recognition result of the third scale improves markedly: by fusing global and detail information, its accuracy rises above that of the second scale, reaching a final accuracy of 98.2%, an improvement of 6% over the original network.
5 Conclusion
This paper introduces a fine-grained classification method into traffic sign classification, enabling the deep neural network to extract more useful classification information from the image. Through the improved attention mechanism and attention loss function, the details of the traffic sign are progressively magnified, and the regions magnified at different scales show a clear accuracy gap at the final output. The network thus achieves effective classification on the BTSD dataset. However, owing to the limited pixel resolution of the dataset images, deeper attention-region feature extraction cannot bring further gains, which is a direction for future improvement.
References:
[1] Yao C, Wu F, Chen H J, et al. Traffic sign recognition using HOG-SVM and grid search[C]. Hangzhou: International Conference on Signal Processing, 2015.
[2] Gim J W, Hwang M C, Ko B C, et al. Real-time speed-limit sign detection and recognition using spatial pyramid feature and boosted random forest[C]. Genoa: International Conference on Image Analysis and Recognition, 2015.
[3] Deng Zhijie, Wang Yong, Tao Xiaoling. Method of network traffic classification using Naive Bayes based on FPGA[C]. Chongqing: 13th IEEE Joint International Computer Science and Information Technology Conference, 2011.
[4] Wu Yihui, Liu Yulong, Li Jianmin, et al. Traffic sign detection based on convolutional neural networks[C]. Dallas: International Joint Conference on Neural Networks, 2014.
[5] Girshick R, Donahue J, Darrell T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]. Columbus: IEEE Conference on Computer Vision and Pattern Recognition, 2013.
[6] Girshick R. Fast R-CNN[C]. Santiago: IEEE International Conference on Computer Vision (ICCV), 2015.
[7] Yuan Xiaoping, Wang Gang, Wang Yefeng, Wang Zheyuan, Sun Hui. Traffic sign recognition method based on an improved convolutional neural network[J/OL]. Electronic Science and Technology, 2019(11): 1-5 [2019-04-07]. http://kns.cnki.net/kcms/detail/61.1291.TN.20181229.1448.024.html. (in Chinese)
[8] Zhu Zhe, Liang Dun, Zhang Songhai, et al. Traffic-sign detection and classification in the wild[C]. Las Vegas: IEEE Conference on Computer Vision and Pattern Recognition, 2016.
[9] Wang Ziheng. A survey of road traffic sign detection: datasets and algorithms[A]. Proceedings of the 2018 Annual Congress of the China Society of Automotive Engineers[C]. China Society of Automotive Engineers, 2018: 7. (in Chinese)
[10] Wei X S, Luo J H, Wu J, et al. Selective convolutional descriptor aggregation for fine-grained image retrieval[J]. IEEE Transactions on Image Processing, 2017, 26(6): 2868-2881.
[11] Fu J, Zheng H, Mei T. Look closer to see better: recurrent attention convolutional neural network for fine-grained image recognition[C]. IEEE Conference on Computer Vision and Pattern Recognition, 2017.
[12] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[C]. ICLR, 2015.