Recognition of dead pine trees using YOLOv5 by super-resolution reconstruction

WANG Wenjin1, YOU Ziyi2, SHAO Lijiang1, LI Xiaolin1, WU Songqing2,3, ZHANG Zhuhe4, HUANG Shiguo1,3※, ZHANG Feiping2,3

(1. College of Computer and Information Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China; 2. College of Forestry, Fujian Agriculture and Forestry University, Fuzhou 350002, China; 3. Key Laboratory for the Prevention and Control of Major Pests in Ecological Public-Welfare Forests of Fujian Province Universities, Fuzhou 350002, China; 4. Fuzhou Forest Pest Control and Quarantine Station, Fuzhou 350002, China)
To address the difficulty of recognizing dead pine trees that appear small and texture-blurred in images, caused by large terrain undulation in mountainous areas and the resulting high UAV flight altitudes, this study proposes a YOLOv5-based recognition algorithm for dead pine trees that performs super-resolution reconstruction at the feature level. A selective kernel feature texture transfer module is added to the YOLOv5 network to generate high-resolution detection feature maps with detailed textures; a mechanism that adaptively adjusts the receptive field assigns the weights, concentrating attention on texture details and improving the recognition accuracy of small and blurred targets. A foreground-background balanced loss function is also used to suppress background noise, increase the gradient contribution of positive samples, and mitigate the imbalance between positive and negative samples. Experimental results show that the improved algorithm achieves a mean average precision of 92.7% at an intersection-over-union (IoU) threshold of 0.5 (mAP50), 62.1% for mAP50~95 (mAP averaged over IoU thresholds from 0.5 to 0.95 in steps of 0.05), and 53.2% for APsmall (average precision on small targets), improvements of 3.2, 8.3, and 15.8 percentage points over the original algorithm, respectively. Comparative experiments show that the method outperforms the deep learning algorithms Faster R-CNN, YOLOv4, YOLOX, MT-YOLOv6, QueryDet, and DDYOLOv5, with mAP50 higher by 16.7, 15.3, 2.5, 2.8, 12.3, and 1.2 percentage points, respectively. The improved algorithm recognizes dead pine trees with high accuracy, effectively alleviating the difficulty of recognizing small and texture-blurred targets, and provides technical support for the subsequent removal of infested trees.
UAV; image recognition; dead pine trees; small target detection; super-resolution reconstruction; feature fusion
The pine wood nematode is a quarantine pest that severely damages pine trees; 60 million hm2 of pine forest in China face the threat of a pine wilt disease epidemic [1]. Rapid monitoring and recognition of dead pine trees is key to controlling pine wilt disease. Traditional visual inspection and ground surveys are time-consuming, labor-intensive, and difficult to implement. In recent years, collecting stand imagery rapidly and efficiently with UAVs has shown potential for large-scale monitoring of areas affected by pine wilt disease [2]. Rapid and accurate recognition of dead pine trees from such imagery has therefore become a research focus.
Traditional recognition algorithms for dead pine trees mainly rely on feature descriptors hand-crafted from color, shape, and texture for recognition or segmentation [3-5], and struggle to extract representative semantic information [6]. In recent years, benefiting from advances in digital imaging and artificial intelligence, deep learning algorithms have shown strong recognition ability in agriculture and forestry. Some researchers have tuned or improved common detection algorithms such as GoogLeNet [7], Faster R-CNN [8-10], YOLOv3 [11-13], SSD [14], and YOLOv4 [10,15-17]. LI et al. [18] applied YOLOv5 to dead pine tree recognition in UAV and satellite imagery, and HU et al. [19] improved YOLOv5 with efficient channel attention (ECA) and hybrid dilated convolution for the same task. In forestry practice, large terrain undulation forces UAVs to fly at high altitudes (e.g., above 800 m) to meet the requirements of large-scale monitoring. In imagery captured under these conditions, targets occupy few pixels and have low resolution, making recognition difficult [20]. Moreover, in stands with high canopy closure, targets occluded by foliage, overexposed, or backlit have blurred boundaries and textures; texture features are easily lost during feature extraction, causing severe false detections [15]. Most existing studies, however, use imagery captured below 500 m, often at around 100 m, which hardly meets the requirements of practical operations. Recognition techniques for dead pine trees that satisfy forestry production needs are therefore urgently required.
Super-resolution (SR) reconstruction aims to recover a high-resolution image from a low-quality, blurred one via specific algorithms, including interpolation-based, reconstruction-based, and learning-based methods [21]; deep-learning-based methods are the current research focus. DONG et al. [22] proposed the first convolutional-neural-network-based SR network, which upscales an image by bicubic interpolation and then restores it with a convolutional network, outperforming earlier classical algorithms. More neural-network variants were subsequently introduced to SR reconstruction, such as GAN-based [23] and Transformer-based [24] SR networks, further improving reconstruction quality. In recent years, many deep-learning SR networks have been proposed to support higher-level vision tasks [25-27]. Studies have shown that increasing image resolution, and thus information content, can effectively restore the details of small and texture-blurred targets and improve detection performance [28-30], but feeding large images into the network significantly increases computation [31-32]. Some researchers therefore apply super-resolution directly at the feature level, improving the expressiveness of shallow features while preserving speed. NOH et al. [33] trained a super-resolution network with a generative-adversarial approach, using high-resolution features as direct supervision after matching their receptive fields to the low-resolution features; however, the reconstruction ignores the contextual relationship between shallow and deep features, producing unstable features. DENG et al. [34] proposed a feature-level SR method based on feature texture transfer, using knowledge distillation with higher-resolution information from an additionally extended feature pyramid network (FPN) as supervision to generate high-resolution feature maps for detection; however, the features are fused with fixed weights by default, limiting the expressiveness of the network and wasting computation.
In summary, existing detection methods that incorporate super-resolution reconstruction either fuse features with fixed weights or ignore contextual relationships, making it hard to accurately recognize small and texture-blurred targets. This paper therefore builds on YOLOv5 [35], one of the best-performing detectors in the YOLO family, and proposes a YOLOv5-based dead pine tree recognition model incorporating super-resolution reconstruction.
1.1.1 Overview of the study area
The UAV imagery used in this study spans 25°20′N to 26°18′N and 118°29′E to 119°31′E, covering 15 400 hm2 in Minhou County (Fuzhou) and Xianyou County (Putian), Fujian Province: about 7 467 hm2 in Baisha Town, 5 333 hm2 in Hongwei Township, 1 133 hm2 in Zhuqi Township, and 933 hm2 in Ganzhe Subdistrict of Minhou County, and about 533 hm2 in Xitianwei Town of Xianyou County. Minhou County has a middle-subtropical monsoon climate, with an annual mean temperature of 19.5 °C and annual mean precipitation of about 1 673.9 mm; Xianyou County has a southern-subtropical oceanic monsoon climate, with an annual mean temperature of 20.6 °C and annual precipitation of about 1 300-2 300 mm, warm and humid year-round with a long frost-free period. The terrain is mostly mountainous and hilly with large undulation; elevations range from 400 to 1 200 m, mostly above 800 m. Forest soils are mainly red soil, with some mountain yellow soil; the stands are mostly natural Masson pine forest, followed by mixed Masson pine and broadleaf forest.
1.1.2 Image acquisition and preprocessing
To meet the requirements of coverage area and flight altitude, a CW-007 fixed-wing UAV carrying a 42-megapixel CA-102 camera was used to acquire imagery. Based on the period when large numbers of late-stage discolored dead pine trees appear, data were collected in November 2020 and October 2021. Constrained by terrain, flight altitude varied from 800 to 1 200 m. The captured 7 952×5 304 pixel images were mosaicked with Pix4Dmapper into 39 TIF images ranging from 31 984×26 045 to 64 033×50 719 pixels.
The mosaicked images were first cropped into 600×600 pixel tiles, yielding 8 978 sub-images. Image information entropy was used to measure imaging quality, and blank and edge tiles with entropy below 5 were removed. Sample data are shown in Fig. 1; the acquisition scenes include sunny, cloudy, and twilight conditions, and some overexposed and backlit images were retained to increase sample diversity and improve model robustness. The remaining 7 923 images were annotated, with 13 581 uncertain samples verified in the field, yielding 29 250 dead pine tree samples in total. The annotated dataset was stored in COCO format and split into training, validation, and test sets at a ratio of 8:1:1. As Fig. 2a shows, target contours in dense areas are blurred, and small dead pine trees have low resolution and blurred texture; the label-box size distribution in Fig. 2b clusters toward the lower-left corner, indicating that many dead pine trees in the data are small targets.
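The entropy screen described above can be sketched as follows. This is an illustrative sketch, not the authors' code: Shannon entropy is computed from the 8-bit grayscale histogram, the tile size (600×600) and threshold (5) are taken from the text, and the function names are ours.

```python
import numpy as np

def image_entropy(gray: np.ndarray) -> float:
    """Shannon entropy (bits) of an 8-bit grayscale image histogram."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    p = p[p > 0]                      # drop empty bins to avoid log(0)
    return float(-(p * np.log2(p)).sum())

def keep_tile(gray: np.ndarray, threshold: float = 5.0) -> bool:
    """Discard near-blank tiles whose entropy falls below the threshold."""
    return image_entropy(gray) >= threshold

# A flat (blank) tile has zero entropy; a textured tile scores much higher.
blank = np.full((600, 600), 255, dtype=np.uint8)
rng = np.random.default_rng(0)
textured = rng.integers(0, 256, size=(600, 600), dtype=np.uint8)
```

A uniformly random 8-bit tile approaches 8 bits of entropy, so the threshold of 5 cleanly separates it from blank or near-blank tiles.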
Fig. 1 Examples of acquired images: a. sunny; b. cloudy; c. twilight; d. overexposure; e. backlight
Fig. 2 Training set examples and label-box size distribution
To address the low recognition accuracy caused by small and texture-blurred targets in UAV stand imagery, this study takes YOLOv5x of YOLOv5 v6.0 as the base model and proposes a YOLOv5 recognition algorithm for dead pine trees incorporating super-resolution reconstruction (Fig. 3). Its core idea is to use a super-resolution module to fuse multi-level features into high-resolution feature maps that improve detection accuracy. The specific improvements to YOLOv5 are: 1) a selective kernel feature texture transfer (SKFTT) module for feature-level super-resolution reconstruction, which uses a self-attention mechanism to selectively fuse features of each scale into high-resolution feature information, making the network more adaptive; 2) a modified neck, in which SKFTT replaces the original UpSample and Concat operations so that shallow feature maps enriched with semantics are used to recognize small and texture-blurred targets, with one additional level extended downward for even smaller targets; 3) the foreground-background balanced loss function (FB Loss), which suppresses background noise, increases attention to positive samples, and strengthens learning of target regions.
Fig. 3 YOLOv5 network for dead pine tree recognition incorporating super-resolution reconstruction
1.2.1 Selective kernel feature texture transfer module
Noise in shallow features propagates downward through the successive convolutions of the backbone, drowning out the key information of target regions, while feature-map resolution keeps decreasing. The SKFTT module proposed here aims to extract and strengthen the key information of target regions and thereby improve detection accuracy.
The SKFTT module consists of a content extractor, a texture extractor, and a selective kernel feature fusion unit (Fig. 4). The main input L passes through the content extractor to obtain semantic content, which is then enhanced by sub-pixel convolution into a feature map L′ at twice the resolution. To obtain richer context, the required texture information Lt′ is extracted from the reference feature layer Lt. Finally, L′ and Lt′ are fed into the selective kernel feature fusion unit, where a self-attention mechanism assigns more effective weight to the shallow high-resolution feature map, strengthening the propagation of fine texture information through the network.
內(nèi)容提取器與紋理提取器均采用殘差連接的方式提取特征。首先使用卷積層、批量歸一化(batch normalization, BN)和線性整流激活層(rectified linear unit, ReLU)的堆疊分別提取強語義與高分辨率信息,然后將輸入與輸出層相連,避免網(wǎng)絡(luò)過深造成梯度消失,實現(xiàn)信息的完整傳遞。最后,紋理提取器經(jīng)過卷積層強化特征,內(nèi)容提取器使用亞像素卷積提升分辨率,生成高分辨率特征。
Inspired by the mechanism by which human visual neurons adapt their receptive fields to stimuli, the selective kernel feature fusion unit dynamically adjusts the information it accepts through a fuse operator and a select operator. In the fuse step, the feature maps rich in content and texture are first aggregated; global average pooling then yields the global information s, which after convolution and activation becomes the down-scaled compact feature z, and two parallel convolutions expand the channels to produce two feature descriptors v1 and v2. In the select step, more weight is assigned to the shallow high-resolution feature map: a softmax over v1 and v2 yields the selection weight matrices s1 and s2, which adaptively weight the multi-scale feature maps, capturing more effective features and enhancing the details of small and texture-blurred targets.
Note: L is the main input layer; Lt is the reference feature layer; L′ is the feature layer after content extraction; Lt′ is the feature layer after texture extraction; L″ is the reconstructed feature layer; s is the global information after global average pooling; z is the down-scaled compact feature; v1 and v2 are the feature descriptors; s1 and s2 are the selection weight matrices.
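The fuse/select mechanism can be condensed into a NumPy sketch. This is a simplification, not the paper's implementation: the convolution/activation that produces the compact feature z is collapsed into plain matrix projections (w1, w2 stand in for the two parallel convolutions), but the key property is kept, namely that the softmax across the two branches makes the per-channel weights s1 and s2 sum to 1.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(v: np.ndarray, axis: int = 0) -> np.ndarray:
    e = np.exp(v - v.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def selective_fuse(content: np.ndarray, texture: np.ndarray,
                   w1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """Fuse two (C, H, W) feature maps with channel-wise selection weights."""
    u = content + texture                 # fuse operator: element-wise sum
    s = u.mean(axis=(1, 2))               # global average pooling -> s, shape (C,)
    v = np.stack([w1 @ s, w2 @ s])        # feature descriptors v1, v2
    weights = softmax(v, axis=0)          # selection weights s1, s2 (sum to 1)
    s1, s2 = weights[0][:, None, None], weights[1][:, None, None]
    return s1 * content + s2 * texture    # adaptive weighted fusion

content = rng.normal(size=(8, 4, 4))      # upsampled content features (L')
texture = rng.normal(size=(8, 4, 4))      # texture features (Lt')
w1 = rng.normal(size=(8, 8))
w2 = rng.normal(size=(8, 8))
fused = selective_fuse(content, texture, w1, w2)
```

Because s1 + s2 = 1 per channel, the fused map is a convex combination of the two inputs: when one branch carries more useful information for a channel, its weight grows at the expense of the other.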
1.2.2 Foreground-background balanced loss function
In object detection, the number and ratio of positive and negative samples significantly affect accuracy, yet both anchor-based and anchor-free frameworks follow a dense-prediction paradigm that generates large numbers of background samples during training [36]. A common global loss over-represents the background, under-representing small targets that occupy few pixels and causing an imbalance between positive and negative samples (Fig. 5).
The FB Loss consists of a global reconstruction loss and a local positive-sample loss; by enlarging the gradient contribution of positive samples during training, it balances the feature representation of foreground and background.
Fig. 5 Imbalance between positive and negative samples
The global reconstruction loss $L_{glob}$ guides the overall reconstruction of the high-resolution feature map generated by the SKFTT module; the L1 norm is used so that the features after super-resolution stay similar to the background features:

$L_{glob} = \| L'' - L_{tar} \|_1$

where $L''$ is the reconstructed feature map and $L_{tar}$ is the target feature map; this paper uses the feature map at twice the size from the backbone as the target to supervise the generation of the high-resolution feature map.
The local positive-sample loss $L_{pos}$ strengthens attention to the high-frequency information of positive samples:

$L_{pos} = \frac{1}{N_P} \sum_{(x,y) \in P} \| L''(x,y) - L_{tar}(x,y) \|_1$

where $P$ denotes the ground-truth boxes, $N_P$ is the number of pixels in the target regions, and $(x, y)$ are pixel coordinates on the feature map.
The FB Loss $L_{FB}$ is a weighted sum of the global reconstruction loss and the local positive-sample loss:

$L_{FB} = L_{glob} + \lambda L_{pos}$

where $\lambda$ is the positive-sample balance factor, set empirically to 1 in this paper.
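Under the definitions above, FB Loss can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions: mean-style L1 normalization and a boolean mask over feature-map positions are our choices, not spelled out in the paper.

```python
import numpy as np

def fb_loss(pred: np.ndarray, target: np.ndarray,
            pos_mask: np.ndarray, lam: float = 1.0) -> float:
    """Foreground-background balanced loss on (C, H, W) feature maps.

    pred     : reconstructed high-resolution feature map (L'')
    target   : supervising 2x feature map from the backbone
    pos_mask : (H, W) boolean mask marking ground-truth (positive) regions
    lam      : positive-sample balance factor (set to 1 in the paper)
    """
    diff = np.abs(pred - target)
    l_glob = diff.mean()                 # global L1 reconstruction term
    n_pos = int(pos_mask.sum())
    if n_pos == 0:                       # no positives: only the global term
        return float(l_glob)
    l_pos = diff[:, pos_mask].mean()     # L1 restricted to target regions
    return float(l_glob + lam * l_pos)
```

Positions inside ground-truth boxes are counted twice (globally and locally), which is exactly how the loss enlarges the gradient contribution of the foreground.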
The total loss $L$ of the improved YOLOv5 comprises the confidence loss $L_{conf}$, the bounding-box loss $L_{box}$, and the foreground-background balanced loss $L_{FB}$:

$L = \alpha L_{conf} + \beta L_{box} + \gamma L_{FB}$

where $L_{conf}$ is computed with the binary cross entropy loss (BCE Loss), $L_{box}$ with the generalized intersection over union loss (GIoU Loss), and the balance coefficients $\alpha$, $\beta$, $\gamma$ are set to 1.00, 0.05, and 0.01, respectively.
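For reference, the GIoU term used in the bounding-box loss can be computed as below. This is the standard GIoU formulation for axis-aligned boxes, not code from the paper.

```python
def giou(box_a, box_b):
    """Generalized IoU of two [x1, y1, x2, y2] boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union
    # Smallest enclosing box C penalizes non-overlapping predictions.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c_area = cw * ch
    return iou - (c_area - union) / c_area

def giou_loss(box_a, box_b):
    return 1.0 - giou(box_a, box_b)
```

Unlike plain IoU, GIoU still produces a useful gradient when the predicted and ground-truth boxes do not overlap, since the enclosing-box penalty grows with their separation.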
Experiments were run on an Intel Xeon E5-2678 v3 @ 2.50 GHz CPU and an NVIDIA GeForce RTX 3090 GPU under Ubuntu 18.04, with PyTorch 1.9.0 and CUDA 11.1. Training consisted of pretraining on the public COCO dataset [37] followed by fine-tuning on the dead pine tree dataset. Fine-tuning parameters were: batch size 8, 200 training epochs, initial learning rate 0.01, and the SGD optimizer with momentum 0.937 and weight decay 0.000 5.
The COCO evaluation protocol [38] was adopted, using mAP50, mAP75, mAP50~95, APsmall, APmid, and APlarge to comprehensively evaluate model performance, with all AP values computed by 101-point interpolation. mAP50 and mAP75 are the mean average precision at intersection-over-union (IoU) thresholds of 0.5 and 0.75; accuracy at higher thresholds is closer to the ground-truth annotation. mAP50~95 is the mAP averaged over IoU thresholds from 0.5 to 0.95 in steps of 0.05. APsmall, APmid, and APlarge are the average precision of small, medium, and large targets; following the COCO definition, small targets are smaller than 32×32 pixels, medium targets are between 32×32 and 96×96 pixels, and large targets are larger than 96×96 pixels.
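The 101-point interpolation used for all AP values can be sketched as follows, given a precision-recall curve from already-matched detections; the detection-to-ground-truth matching step itself is omitted, and the function name is ours.

```python
import numpy as np

def ap_101(recall, precision):
    """COCO-style AP: precision envelope sampled at 101 recall points."""
    r = np.concatenate(([0.0], recall, [1.0]))
    p = np.concatenate(([0.0], precision, [0.0]))
    # Make precision non-increasing in recall (the precision envelope).
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    samples = np.linspace(0.0, 1.0, 101)       # recall points 0.00, 0.01, ..., 1.00
    idx = np.searchsorted(r, samples, side="left")
    return float(p[idx].mean())
```

A perfect detector (precision 1 at recall 1) scores AP = 1; a detector that holds precision 1 only up to recall 0.5 scores 51/101 ≈ 0.505, since 51 of the 101 sampled recall points lie at or below 0.5.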
To verify the contribution of each improvement, ablation experiments were conducted on the test set (Table 1). Adding the SKFTT module raised mAP50 by 3.0 percentage points and APsmall by 11.9 percentage points; adding FB Loss on top further improved small-target accuracy, raising mAP50 by another 0.2 and APsmall by another 3.9 percentage points relative to the second row. Overall, the proposed algorithm reaches an mAP50 of 92.7%, improving on the original YOLOv5 by 3.2 percentage points in mAP50 and 8.3 in mAP50~95, with APsmall, APmid, and APlarge up by 15.8, 8.0, and 5.7 percentage points, respectively. Adding the SKFTT module increases the parameter count and computation and lowers FPS somewhat, but the speed still meets the requirements of pine wilt disease control.
Table 1 Ablation results of different improvements to YOLOv5
Note: SKFTT is the selective kernel feature texture transfer module; FB Loss is the foreground-background balanced loss function; mAP50 and mAP75 are the mAP of the model at IoU thresholds of 0.5 and 0.75; mAP50~95 is the mAP averaged over IoU thresholds from 0.5 to 0.95 in steps of 0.05; APsmall, APmid, and APlarge are the AP for small, medium, and large targets, respectively; "√" indicates the module is included, and "-" that it is not. The same below.
The UAV imagery used in this study covers real forestry production scenes with varied illumination, viewing angles, and scales, including many small targets and samples whose boundaries and textures are blurred by foliage occlusion, overexposure, or backlighting. Fig. 6 compares the model before and after improvement: with SKFTT and FB Loss added, prediction-box scores increase, missed detections of small targets decrease, boundary blur caused by light occlusion is mitigated, and recognition of blurred targets under overexposure and backlighting also improves. Some irregularly shaped samples are still missed, because such samples are scarce and insufficiently trained. Likewise, a few targets are missed when backlighting leaves the dead pine texture too blurred and the image carries too little information.
Fig. 6 Comparison of recognition results for normal, occluded, overexposed, and backlit targets: a. manual annotation; b. YOLOv5; c. YOLOv5+SKFTT; d. YOLOv5+SKFTT+FB Loss
To verify the advantages of the improved algorithm, comparison experiments were conducted on the test set against other state-of-the-art detectors: the two-stage Faster R-CNN [39], the one-stage anchor-based YOLOv4 [40], the one-stage anchor-free YOLOX [41] and MT-YOLOv6 [42], the small-object detector QueryDet [43], and the dead pine tree detector DDYOLOv5 [19]. The results are shown in Table 2.
Table 2 shows that the proposed improved YOLOv5 achieves the best accuracy on the dead pine tree test set: its mAP50 exceeds that of Faster R-CNN, YOLOv4, YOLOX, MT-YOLOv6, QueryDet, and DDYOLOv5 by 16.7, 15.3, 2.5, 2.8, 12.3, and 1.2 percentage points, and its APsmall by 46.0, 21.1, 0.5, 1.3, 5.2, and 2.7 percentage points, respectively. The improved algorithm thus learns image information effectively and markedly improves small-target accuracy.
As the network deepens, features become increasingly abstract; visualizing them helps to understand how the deep network recognizes dead pine trees. In Fig. 7, the improved algorithm with SKFTT is compared visually with the super-resolution algorithms BSRGAN [44], Real-ESRGAN [45], SwinIR [46], and FTT [34].
The 2× feature map closely resembles the target in the original image, confirming that using it to supervise the SKFTT reconstruction is feasible. Compared with first reconstructing a high-resolution image with a super-resolution algorithm and then extracting features from it, the feature layer reconstructed by SKFTT retains richer regional detail. Compared with FTT and the other super-resolution algorithms, SKFTT yields clearer target boundaries and richer texture, which benefits feature extraction for small and texture-blurred targets.
Table 2 Recognition results of different algorithms on the test set
Note: the feature maps of the original model and of the super-resolution algorithms BSRGAN, Real-ESRGAN, and SwinIR are taken from the YOLOv5 model extended with one small-target detection layer; the FTT feature map is obtained by substituting FTT for SKFTT in the improved model.
In production practice, large terrain undulation in mountainous areas and the resulting high UAV flight altitudes leave some dead pine trees small and texture-blurred in the imagery and difficult to recognize. This paper therefore proposed a YOLOv5 algorithm combining super-resolution reconstruction for dead pine tree recognition.
The algorithm adds the super-resolution module SKFTT and the foreground-background balanced loss FB Loss to the feature-fusion network to mine key information at different scales and optimize the reconstructed features. The improved algorithm reaches an mAP50 of 92.7% and an APsmall of 53.2% for dead pine trees, improvements of 3.2 and 15.8 percentage points over the original algorithm, at 37 FPS. Compared with common detectors, its accuracy surpasses that of the deep learning algorithms Faster R-CNN, YOLOv4, YOLOX, MT-YOLOv6, QueryDet, and DDYOLOv5.
Visualization of the feature maps reconstructed by the SKFTT module and by other super-resolution algorithms shows that SKFTT makes full use of high-resolution feature information and restores the detailed texture of targets, benefiting feature extraction for small and texture-blurred targets.
[1] YE Jianren. Epidemic status of pine wilt disease in China and its prevention and control techniques and counter measures[J]. Scientia Silvae Sinicae, 2019, 55(9): 1-10. (in Chinese with English abstract)
[2] ZHANG Xiaodong, YANG Haobo, CAI Peihua, et al. Research progress on remote sensing monitoring of pine wilt disease[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2022, 38(18): 184-194. (in Chinese with English abstract)
[3] TAO Huan, LI Cunjun, XIE Chunchun, et al. Recognition of red-attack pine trees from UAV imagery based on the HSV threshold method[J]. Journal of Nanjing Forestry University (Natural Sciences Edition), 2019, 43(3): 99-106. (in Chinese with English abstract)
[4] LIU Xialing, CHENG Duoxiang, LI Tao, et al. Preliminary study on automatic monitoring trees infected by pine wood nematode with high resolution images from unmanned aerial vehicle[J]. Forest Pest and Disease, 2018, 37(5): 16-21. (in Chinese with English abstract)
[5] LIU Jincang, WANG Chengbo, CHANG Yuanfei. Monitoring method of Bursaphelenchus xylophilus based on multi-feature CRF by UAV image[J]. Bulletin of Surveying and Mapping, 2019(7): 78-82. (in Chinese with English abstract)
[6] WU X, SAHOO D, HOI S C H. Recent advances in deep learning for object detection[J]. Neurocomputing, 2020, 396: 39-64.
[7] LI Jiaqi, WU Kaihua, ZHANG Yao, et al. Establishing a monitoring model for pine wood nematode damage based on UAV spectral remote sensing and AI technology[J]. Electronic Technology & Software Engineering, 2021(8): 91-94. (in Chinese)
[8] XU Xinluo, TAO Huan, LI Cunjun, et al. Detection and location of pine wilt disease induced dead pine trees based on Faster R-CNN[J]. Transactions of the Chinese Society for Agricultural Machinery, 2020, 51(7): 228-236. (in Chinese with English abstract)
[9] DENG X, TONG Z, LAN Y, et al. Detection and location of dead trees with pine wilt disease based on deep learning and UAV remote sensing[J]. AgriEngineering, 2020, 2(2): 294-307.
[10] MAO Rui, ZHANG Yuchen, WANG Zexi, et al. Recognizing stripe rust and yellow dwarf of wheat using improved Faster-RCNN[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2022, 38(17): 176-185. (in Chinese with English abstract)
[11] WU B, LIANG A, ZHANG H, et al. Application of conventional UAV-based high-throughput object detection to the early diagnosis of pine wilt disease by deep learning[J]. Forest Ecology and Management, 2021, 486: 118986.
[12] LIM W, CHOI K, CHO W, et al. Efficient dead pine tree detecting method in the forest damaged by pine wood nematode (Bursaphelenchus xylophilus) through utilizing unmanned aerial vehicles and deep learning-based object detection techniques[J]. Forest Science and Technology, 2022, 18(1): 36-43.
[13] CHEN Fengjun, ZHU Xueyan, ZHOU Wenjing, et al. Spruce counting method based on improved YOLOv3 model in UAV images[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2020, 36(22): 22-30. (in Chinese with English abstract)
[14] SUN Yu, ZHOU Yan, YUAN Mingshuai, et al. UAV real-time monitoring for forest pest based on deep learning[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2018, 34(21): 74-81. (in Chinese with English abstract)
[15] HUANG Liming, WANG Yixiang, XU Qi, et al. Recognition of abnormally discolored trees caused by pine wilt disease using YOLO algorithm and UAV images[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(14): 197-203. (in Chinese with English abstract)
[16] SUN Z, IBRAYIM M, HAMDULLA A. Detection of pine wilt nematode from drone images using UAV[J]. Sensors, 2022, 22(13): 4704.
[17] LI F, LIU Z, SHEN W, et al. A remote sensing and airborne edge-computing based detection system for pine wilt disease[J]. IEEE Access, 2021, 9: 66346-66360.
[18] LI X, LIU Y, HUANG P, et al. Integrating multi-scale remote-sensing data to monitor severe forest infestation in response to pine wilt disease[J]. Remote Sensing, 2022, 14(20): 5164.
[19] HU G, YAO P, WAN M, et al. Detection and classification of diseased pine trees with different levels of severity from UAV remote sensing images[J]. Ecological Informatics, 2022, 72: 101844.
[20] YOU J, ZHANG R, LEE J. A deep learning-based generalized system for detecting pine wilt disease using RGB-based UAV images[J]. Remote Sensing, 2021, 14(1): 150.
[21] HA V K, REN J C, XU X Y, et al. Deep learning based single image super-resolution: A survey[J]. International Journal of Automation and Computing, 2019, 16(4): 413-426.
[22] DONG C, LOY C C, HE K, et al. Learning a deep convolutional network for image super-resolution[C]// Proceedings of the European Conference on Computer Vision. Zurich, Switzerland: Springer, 2014: 184-199.
[23] HAN Qiaoling, ZHOU Xibo, SONG Runze, et al. Super-resolution reconstruction of soil CT images using sequence information[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(17): 90-96. (in Chinese with English abstract)
[24] CHEN H, WANG Y, GUO T, et al. Pre-trained image processing transformer[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA: IEEE, 2021: 12299-12310.
[25] WANG B, YAN B, JEON G, et al. Lightweight dual mutual-feedback network for artificial intelligence in medical image super-resolution[J]. Applied Sciences, 2022, 12(24): 12794.
[26] ALWAKID G, GOUDA W, HUMAYUN M, et al. Melanoma detection using deep learning-based classifications[J]. Healthcare, 2022, 10(12): 2481.
[27] CHE Yingpu, WANG Qing, LI Shilin, et al. Monitoring of maize phenotypic traits using super-resolution reconstruction and multimodal data fusion[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(20): 169-178. (in Chinese with English abstract)
[28] SHERMEYER J, VAN ETTEN A. The effects of super-resolution on object detection performance in satellite imagery[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Long Beach, CA, USA: IEEE, 2019: 1432-1441.
[29] ZHAO Junyan, JIANG Ailian, QIANG Yan. Robust face detection using YOLOv3 fusion super resolution reconstruction[J]. Computer Engineering and Applications, 2022, 58(19): 250-256. (in Chinese with English abstract)
[30] FENG Zhiqiang, XIE Zhijun, BAO Zhengwei, et al. Real-time dense small target detection algorithm for unmanned aerial vehicle based on improved YOLOv5[J/OL]. Acta Aeronautica et Astronautica Sinica, (2022-05-11) [2022-08-13]. http://kns.cnki.net/kcms/detail/11.1929.V.20220509.2316.010.html. (in Chinese with English abstract)
[31] BOSCH M, GIFFORD C M, RODRIGUEZ P A. Super-resolution for overhead imagery using densenets and adversarial learning[C]//Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). Lake Tahoe, NV, USA: IEEE, 2018: 1414-1422.
[32] BAI Y, ZHANG Y, DING M, et al. Sod-mtgan: Small object detection via multi-task generative adversarial network[C]// Proceedings of the European Conference on Computer Vision (ECCV). Munich, Germany: Springer, 2018: 206-221.
[33] NOH J, BAE W, LEE W, et al. Better to follow, follow to be better: Towards precise supervision of feature super-resolution for small object detection[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE, 2019: 9725-9734.
[34] DENG C, WANG M, LIU L, et al. Extended feature pyramid network for small object detection[J]. IEEE Transactions on Multimedia, 2021, 24: 1968-1979.
[35] JOCHER G, STOKEN A, BOROVEC J, et al. ultralytics/yolov5: v6.0-YOLOv5n 'Nano' models, Roboflow integration, TensorFlow export, OpenCV DNN support[EB/OL]. (2021-10-12)[2022-10-19]. https://github.com/ultralytics/yolov5.
[36] CHEN J, WU Q, LIU D, et al. Foreground-background imbalance problem in deep object detectors: A review[C]//Proceedings of the 2020 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR). Shenzhen, China: IEEE, 2020: 285-290.
[37] LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft coco: Common objects in context[C]//Proceedings of the European Conference on Computer Vision. Zurich, Switzerland: Springer, 2014: 740-755.
[38] CHEN X, FANG H, LIN T Y, et al. Microsoft coco captions: Data collection and evaluation server[EB/OL]. (2015-04-01)[2022-10-19]. https://arxiv.org/abs/1504.00325.
[39] REN S, HE K, GIRSHICK R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2017, 39(6): 1137-1149.
[40] BOCHKOVSKIY A, WANG C Y, LIAO H Y M. Yolov4: Optimal speed and accuracy of object detection[EB/OL]. (2020-04-23)[2022-11-06]. https://arxiv.org/abs/2004.10934.
[41] GE Z, LIU S, WANG F, et al. Yolox: Exceeding yolo series in 2021[EB/OL]. (2021-08-06)[2022-11-06]. https://arxiv.org/abs/2107.08430.
[42] LI C, LI L, JIANG H, et al. YOLOv6: A single-stage object detection framework for industrial applications[EB/OL]. (2022-09-07)[2022-11-06]. https://arxiv.org/abs/2209.02976.
[43] YANG C, HUANG Z, WANG N. QueryDet: Cascaded sparse query for accelerating high-resolution small object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, LA, USA: IEEE, 2022: 13668-13677.
[44] ZHANG K, LIANG J, VAN Gool L, et al. Designing a practical degradation model for deep blind image super-resolution[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal, QC, Canada: IEEE, 2021: 4791-4800.
[45] WANG X, XIE L, DONG C, et al. Real-esrgan: Training real-world blind super-resolution with pure synthetic data[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal, QC, Canada: IEEE, 2021: 1905-1914.
[46] LIANG J, CAO J, SUN G, et al. Swinir: Image restoration using swin transformer[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal, QC, Canada: IEEE, 2021: 1833-1844.
Recognition of dead pine trees using YOLOv5 by super-resolution reconstruction
WANG Wenjin1, YOU Ziyi2, SHAO Lijiang1, LI Xiaolin1, WU Songqing2,3, ZHANG Zhuhe4, HUANG Shiguo1,3※, ZHANG Feiping2,3
(1. College of Computer and Information Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China; 2. College of Forestry, Fujian Agriculture and Forestry University, Fuzhou 350002, China; 3. Key Laboratory for the Prevention and Control of Major Pests in Ecological Public-Welfare Forests of Fujian Province Universities, Fuzhou 350002, China; 4. Fuzhou Forest Pest Control and Quarantine Station, Fuzhou 350002, China)
Pine wilt disease poses a significant threat to forest ecosystems because of its highly contagious and destructive nature. The critical step in its prevention and control is eliminating disease sources, which requires the accurate recognition and removal of dead pine trees. In practical applications, however, many captured targets are small or blurred, such as overexposed, backlit, or foliage-occluded samples, because the terrain of hilly and mountainous areas forces UAVs to fly at high altitudes. In this study, a novel You Only Look Once v5 (YOLOv5) algorithm was proposed for the recognition of dead pine trees, performing super-resolution reconstruction at the feature level to overcome the challenge of recognizing such targets. The YOLOv5 structure was redesigned in two respects. First, a Selective Kernel Feature Texture Transfer (SKFTT) module was adopted to create high-resolution detection feature maps with detailed textures, improving detection accuracy for small and blurred targets. Specifically, feature maps rich in texture were selected from the backbone network, while feature maps rich in semantics were selected from the feature-fusion network; these were sent to the texture extractor and content extractor, and a selective feature fusion module fused the critical information at different scales using learned weights. Second, a Foreground-Background balanced Loss function (FB Loss) was introduced to attenuate useless features, enhance the gradient contribution of positive samples, and balance the distribution of positive and negative samples, in order to supervise the reconstruction of high-resolution feature maps. Furthermore, a dataset was built to validate the improved model from approximately 15 400 hectares of forest land in Minhou County (Fuzhou) and Xianyou County (Putian), Fujian Province, China.
The UAV images were subsequently cropped and screened to obtain about 29 250 labelled samples for the experiments. A series of ablation tests and visualizations was conducted on the test set to verify effectiveness. Experimental results showed that the mean average precision (mAP50) of the improved model was 92.7%, mAP50~95 was 62.1%, and APsmall was 53.2%. Compared with the baseline model, the improved model achieved increases of 3.2, 8.3, and 15.8 percentage points in mAP50, mAP50~95, and APsmall, respectively. The mAP50 of the improved model was 16.7, 15.3, 2.5, 2.8, 12.3, and 1.2 percentage points higher than that of the Faster R-CNN, YOLOv4, YOLOX, MT-YOLOv6, QueryDet, and DDYOLOv5 networks, respectively. In addition, the improved model ran at 37 frames per second (FPS), fully meeting the detection requirements for dead pine trees. Visualization results showed that the improved model handles occluded, overexposed, and backlit targets well. The feature maps of the small-target detection layer were also visualized alongside those of different super-resolution algorithms; the comparison revealed improved texture and clearly sharper boundaries. In conclusion, the challenge of detecting small and blurred targets can be effectively alleviated by the improved, highly accurate dead pine tree detection algorithm, which is conducive to the efficient removal of diseased trees and to comprehensive prevention and control, and can contribute to accelerating the "digital forest prevention" process in precision agriculture.
UAV; image recognition; dead pine trees; small target detection; super-resolution reconstruction; feature fusion
10.11975/j.issn.1002-6819.202211141
TP391.4
A
1002-6819(2023)-05-0137-09
WANG Wenjin, YOU Ziyi, SHAO Lijiang, et al. Recognition of dead pine trees using YOLOv5 by super-resolution reconstruction[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2023, 39(5): 137-145. (in Chinese with English abstract) doi: 10.11975/j.issn.1002-6819.202211141 http://www.tcsae.org
Received: 2022-11-15; Revised: 2023-02-13
Funding: Fujian Forestry Science and Technology Project (Min Lin Wen [2021] No. 35); Major Emergency Science and Technology Project of the National Forestry and Grassland Administration (ZD202001); Science and Technology Innovation Special Fund of Fujian Agriculture and Forestry University (KFb22097XA)
WANG Wenjin, research interest: computer vision. Email: 1201193020@fafu.edu.cn
※Corresponding author: HUANG Shiguo, Ph.D., professor, research interests: computer applications in agriculture and forestry, computer vision. Email: sghuang@fafu.edu.cn