Method for detecting pine forest discoloured epidemic wood based on semi-supervised learning
Zhao Hao1, Liu Wenping1※, Zhou Yan1, Luo Youqing2, Zong Shixiang2, Ren Lili2
(1. School of Information Science and Technology, Beijing Forestry University, Beijing 100083, China; 2. School of Forestry, Beijing Forestry University, Beijing 100083, China)
To address the high cost of data annotation in model training, a semi-supervised object detection method for discolored epidemic wood based on UAV image analysis is proposed. The method introduces the Cascade Noise-Resistant Semi-supervised object detection model (CNRS), which uses anti-noise learning to improve the quality of pseudo-label learning; a cascade network to address the distribution of positive and negative samples during training; ResNet50 with a Feature Pyramid Network to strengthen recognition of multi-scale and small-target epidemic wood; FocalLoss in the supervised stage to improve learning of difficult samples such as edge targets and early-stage epidemic wood, with SmoothL1Loss to keep gradients relatively stable; and Soft-NMS in the RCNN stage to soften the detection-box rejection process. CNRS was trained with only half of the training set annotated. Experimental results show that the optimal model reaches an Average Precision (AP) of 87.7% on the test set; compared with Faster RCNN trained on fully labeled data, annotation is reduced by 50% while AP rises by 2.3 percentage points, and AP is 1.6 percentage points higher than that of Combating Noise, the most advanced contemporaneous semi-supervised model. The method accurately detects epidemic trees of various forms while substantially reducing annotation cost, providing reliable data support for agricultural and forestry pest control.
unmanned aerial vehicle; image recognition; pine forest epidemic wood detection; semi-supervised learning; object detection
Forests play an important role in safeguarding species diversity and regulating climate. Forestry pests and diseases cause China economic losses exceeding 200 billion yuan every year. With accelerating economic globalization, sudden invasions of alien species have become more common, and the damage is greater for artificial pine forests with low biodiversity [1]. Traditional monitoring and control rely mostly on manual labor and are inefficient and ineffective. Remote-sensing-based monitoring, such as commercial satellites and UAV aerial photography, enables rough statistics of pine forest damage [2]. In recent years, with the popularization of civilian UAVs, which are more portable, easier to use, and more accurate than commercial satellites, UAVs have become better suited to information collection and monitoring in forest areas [3]. However, many inconveniences remain in compiling statistics, which still depend on extensive manual visual interpretation.
With the continuing development of computing, researchers have applied image processing and artificial intelligence to pine epidemic wood detection. The main approaches are: 1) traditional image analysis, such as linear spectral clustering for identifying dead trees [4]; 2) BP neural networks, which build prediction models from the correlation between epidemic wood spectra and other ecological factors [5]; 3) deep learning, which uses convolutional neural networks to extract deep features from labeled images and perform detection and classification [6]. Traditional image analysis and BP neural networks are relatively complex and difficult to apply in practice, so deep learning for pine epidemic wood detection [7-10] has become mainstream.
The pipeline for pine epidemic wood detection based on UAV aerial images mainly comprises UAV data collection, data annotation, data preprocessing, model improvement and training, and detection with result output. Research applying deep learning to single-tree epidemic wood detection remains at the fully supervised stage; fully supervised detectors fall into single-stage and two-stage models [11]. Two-stage models add region proposals and achieve higher accuracy at lower speed; representative models include RCNN [12] and Faster RCNN [13]. Single-stage models extract features only once and drop the region proposal network, so they are faster but less accurate; representative models include the YOLO series [14-17], SSD [18], and CornerNet [19]. Because the annotation cost of fully supervised models is high, datasets are generally small, which weakens generalization. Reducing annotation cost and enlarging datasets have become urgent needs for improving overall model performance.
Weakly supervised and semi-supervised learning are two research directions for mitigating the annotation problem, but weakly supervised object detection depends on an extremely large number of labels, even exceeding fully supervised models, and its performance is relatively limited [20-23]. Semi-supervised object detection models divide into consistency learning [24-27] and pseudo labeling [28-30], but both are essentially pseudo-label-based learning. Such models need only a small amount of annotation to train jointly with large amounts of unlabeled data; their detection accuracy can surpass fully supervised models, with stronger generalization. Pseudo labels in semi-supervised learning are produced by a teacher model, a process that mixes in many noisy labels, and as the model fits these noisy labels, detection accuracy suffers considerably. Among state-of-the-art semi-supervised object detection models, few studies filter noise from pseudo labels. Combating Noise is a relatively advanced semi-supervised object detection model that evaluates noise and, based on a quantitative measure, extracts high-quality soft targets from the pseudo labels for semi-supervised training, reaching a mean Average Precision (mAP) of 43.2% on the COCO dataset, exceeding the accuracy of most fully supervised models [31].
This paper takes Korean pine and Chinese pine forests infested by the pine wood nematode (Bursaphelenchus xylophilus) and the red turpentine beetle (Dendroctonus valens) as the research objects, collects data by UAV aerial photography, and further improves the Combating Noise semi-supervised object detection model, proposing the Cascade Noise-Resistant Semi-supervised object detection model (CNRS).
The experimental dataset covers Fushun City, Jianping County, and Fuxin City in Liaoning Province; Heilihe Town in the Inner Mongolia Autonomous Region; and Shixing County in Guangdong Province. Affected areas were preliminarily identified by local forest farms, and the raw data were then reviewed by forestry experts and forest rangers. Fushun and Shixing are areas affected by pine wilt disease, while Jianping, Fuxin, and Heilihe are affected by the red turpentine beetle. The epidemic wood forms in the dataset mainly include those shown in Fig.1.
Fig.1 Samples of the dataset
Korean pine and Chinese pine infested by the pine wood nematode and the red turpentine beetle were taken as the research objects, with data collected at three time points: July 2019, August 2020, and May 2021. A DJI Phantom 4 and a DJI Mavic Air 2 (Table 1) were flown at 80-150 m altitude and 5 m/s, shooting vertical orthographic images. The overlap of adjacent images was set to 50%; at a flight altitude of 100 m, a single image covers about 7 200 m2.
The quality of a dataset largely determines the quality of a model. The dataset in this paper spans three provincial-level regions, five areas, and six epidemic wood forms, annotated in VOC format with boxes fully covering each epidemic tree: 8 589 boxes in total, averaging 35.79 per image, with a single class, 'infect'. The annotated data were augmented offline by random cropping, horizontal flipping, and random rotation, yielding 2 160 training images and 240 test images. The smallest box is 22×23 pixels and the largest 500×632 pixels. Dividing objects by pixel area [32], small objects account for 1.9% (0-2 500 pixels), medium objects for 95.9% (2 500-90 000 pixels), and large objects for 2.2% (above 90 000 pixels). The box distribution is similar to that of the VOC dataset, so the data are well suited to improving models built on the VOC standard dataset.
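As a worked example of this size bucketing, the short script below parses VOC-format annotations and counts the three classes; the annotation directory name is hypothetical, since the dataset is not public.

```python
import glob
import xml.etree.ElementTree as ET

# Hypothetical annotation directory with VOC-format XML labels
boxes = []
for xml_path in glob.glob("annotations/*.xml"):
    root = ET.parse(xml_path).getroot()
    for obj in root.iter("object"):
        bb = obj.find("bndbox")
        w = int(bb.find("xmax").text) - int(bb.find("xmin").text)
        h = int(bb.find("ymax").text) - int(bb.find("ymin").text)
        boxes.append(w * h)

# Size buckets by pixel area, following [32]
small = sum(a <= 2_500 for a in boxes)
large = sum(a > 90_000 for a in boxes)
medium = len(boxes) - small - large
for name, n in [("small", small), ("medium", medium), ("large", large)]:
    print(f"{name}: {n} ({100 * n / len(boxes):.1f}%)")
```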
Table 1 Main parameters of the UAVs and cameras
The semi-supervised detection method for discolored pine epidemic wood consists of three steps: data collection, semi-supervised model training, and detection, as shown in Fig.2. A UAV scans the affected pine forest along flight lines to build the dataset, and half of the images, selected at random, are labeled for epidemic wood with the annotation tool LabelImg. The labeled data are fed to the supervised network for learning; the resulting supervised model generates pseudo labels for semi-supervised learning, and the pseudo labels are fed to the network together with the true labels for semi-supervised training. The optimal model is selected by combining average precision and F-Score.
Fig.2 Pseudo-label-based semi-supervised object detection pipeline
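The pseudo-labelling step of this pipeline can be illustrated as follows, assuming a trained teacher detector with a torchvision-style interface; the function name and score threshold are illustrative, not from the authors' code.

```python
import torch

@torch.no_grad()
def generate_pseudo_labels(teacher, unlabeled_loader, score_thr=0.5):
    """Run the trained supervised (teacher) model over unlabeled images."""
    teacher.eval()
    pseudo = []
    for images in unlabeled_loader:      # each batch: a list of image tensors
        for det in teacher(images):      # torchvision-style detector output
            keep = det["scores"] >= score_thr       # drop low-confidence boxes
            pseudo.append({"boxes": det["boxes"][keep],
                           "labels": det["labels"][keep],
                           "scores": det["scores"][keep]})  # kept as soft targets
    return pseudo
```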
To handle flight-line overlap during detection, image stitching software can be used to mosaic the scanned area, after which the mosaic is cropped horizontally. Finally, the epidemic wood detection software processes the cropped images and outputs the results. The software runs on a local portable server with a graphics processing unit to accelerate detection. The workflow requires no network connection, making it well suited to forest farms, mountainous areas, and other places with poor signal.
2.2.1 Framework of the improved semi-supervised detection model CNRS
The detector in Combating Noise defines positive and negative samples by a threshold: detection is better when the Intersection over Union (IoU) of a region proposal is close to the set threshold, but this threshold is hard to choose. A threshold of 0.5 produces many false detections, while raising it to 0.7 reduces the positive samples and easily causes over-fitting. The proposed CNRS semi-supervised model therefore replaces the detector's RCNN network with a cascade network [33], as shown in Fig.3. Compared with the original network, the cascade adds two RCNN stages; the output of each stage is the input of the next, and the IoU thresholds from H1 to H3 rise in turn, set to {0.5, 0.6, 0.7}, progressively filtering out false boxes and balancing accuracy against over-fitting.
After the replacement with the cascade network, the CNRS detector framework remains consistent with Combating Noise and improves on the original Faster RCNN as follows (a configuration sketch is given after the figure note below): 1) the base feature extractor is changed from VGG16 [34] to ResNet50 [35], whose residual structure extracts more abstract and richer semantic information and which leads in both accuracy and speed; 2) a Feature Pyramid Network (FPN) [36] is added: some epidemic wood targets in the dataset are small and much spatial information is lost after convolution, and the FPN fuses shallow spatial information with deep high-level semantics to improve detection of multi-scale and small targets; 3) RoI Pooling is replaced with RoI Align [37]: RoI Pooling performs two quantization rounding operations that cost accuracy, whereas RoI Align removes quantization and uses bilinear interpolation to make feature aggregation continuous, overcoming the region mismatch caused by RoI Pooling.
Note: "C2-C5" are base features; "P2-P6" are feature maps; "H1-H3" are network heads; "Cl1-Cl3" are classification outputs; "B1-B3" are regression boxes; "FPN" is the Feature Pyramid Network; "RPN" is the Region Proposal Network; "set as background" means that the negative proposals and false-positive proposals among the region proposals are set as background.
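The detector described above can be expressed as a configuration sketch in MMDetection 2.x style (the framework used in the experiments); the values follow the paper, while the field layout is the library's convention rather than the authors' released code.

```python
# Sketch of the CNRS detector: ResNet50 backbone, FPN neck, RoIAlign,
# and a three-stage cascade with IoU thresholds 0.5/0.6/0.7.
model = dict(
    type='CascadeRCNN',
    backbone=dict(type='ResNet', depth=50, num_stages=4,
                  out_indices=(0, 1, 2, 3), frozen_stages=1),
    neck=dict(type='FPN', in_channels=[256, 512, 1024, 2048],
              out_channels=256, num_outs=5),          # P2-P6
    roi_head=dict(
        type='CascadeRoIHead', num_stages=3,
        bbox_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
            out_channels=256, featmap_strides=[4, 8, 16, 32])),
    # H1-H3 assign positives at rising IoU thresholds to filter false boxes
    train_cfg=dict(rcnn=[
        dict(assigner=dict(type='MaxIoUAssigner',
                           pos_iou_thr=thr, neg_iou_thr=thr, min_pos_iou=thr))
        for thr in (0.5, 0.6, 0.7)]),
)
```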
2.2.2 Anti-noise learning in CNRS
Anti-noise learning mainly modifies the Region of Interest head (RoI head) of the RCNN stages. After the change of network structure, CNRS builds soft targets in the RoI head of every RCNN stage and uses SoftFocalLoss together with an L1Loss augmented by an uncertainty indicator to raise the quality of pseudo-label learning.
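A minimal sketch of a focal loss over soft classification targets, in the spirit of Combating Noise [31], is given below; the exact weighting in CNRS is given by the paper's definitions and may differ from this illustration.

```python
import torch

def soft_focal_loss(logits, soft_targets, gamma=2.0):
    """logits: (N, C) raw scores; soft_targets: (N, C) soft class vectors."""
    p = logits.softmax(dim=-1)
    # (1 - p)^gamma focuses training on hard examples, while the soft target
    # down-weights classes the teacher was uncertain about
    loss = -soft_targets * (1 - p).pow(gamma) * p.clamp_min(1e-8).log()
    return loss.sum(dim=-1).mean()
```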
2.2.3 Optimized loss functions
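In the supervised stage, CNRS combines FocalLoss, which concentrates learning on difficult samples such as edge targets and early-stage epidemic wood, with SmoothL1Loss, which keeps gradients relatively stable when prediction errors are large. For reference, the standard forms of the two losses are:

$$FL(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log(p_t)$$

$$\mathrm{smooth}_{L1}(x) = \begin{cases} 0.5x^2, & |x| < 1 \\ |x| - 0.5, & \text{otherwise} \end{cases}$$

where $p_t$ is the predicted probability of the true class, $\alpha_t$ and $\gamma$ are the balancing and focusing parameters, and $x$ is the difference between the predicted and true box coordinates.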
2.2.4 Soft Non-Maximum Suppression
Epidemic trees in the dataset are densely packed, so annotation boxes overlap heavily. Traditional Non-Maximum Suppression (NMS, Eq. (8)) zeroes the score of every detection box whose IoU with the current highest-scoring box exceeds the threshold, which can filter out a nearby target with a smaller IoU and lower the model's accuracy. CNRS therefore replaces NMS in the RCNN stage with Soft Non-Maximum Suppression (Soft-NMS, Eq. (9)), which lowers confidence by linear weighting to soften the box-rejection process:

$$s_i = \begin{cases} s_i, & \mathrm{IoU}(M, b_i) < N_t \\ 0, & \mathrm{IoU}(M, b_i) \ge N_t \end{cases} \qquad (8)$$

$$s_i = \begin{cases} s_i, & \mathrm{IoU}(M, b_i) < N_t \\ s_i \left(1 - \mathrm{IoU}(M, b_i)\right), & \mathrm{IoU}(M, b_i) \ge N_t \end{cases} \qquad (9)$$

where $s_i$ is the confidence of detection box $b_i$, $M$ is the current highest-scoring box, and $N_t$ is the IoU threshold.
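A plain-NumPy illustration of the linear Soft-NMS in Eq. (9) follows; it is a readable sketch rather than the implementation shipped with the detection framework.

```python
import numpy as np

def soft_nms_linear(boxes, scores, iou_thr=0.45, score_thr=0.001):
    """boxes: (N, 4) array of [x1, y1, x2, y2]; scores: (N,)."""
    keep_boxes, keep_scores = [], []
    boxes, scores = boxes.copy(), scores.copy()
    while scores.size:
        i = scores.argmax()                     # current highest-scoring box M
        keep_boxes.append(boxes[i]); keep_scores.append(scores[i])
        boxes = np.delete(boxes, i, axis=0)
        scores = np.delete(scores, i)
        if not scores.size:
            break
        ious = pairwise_iou(keep_boxes[-1], boxes)
        # linear re-weighting instead of hard removal (Eq. (9))
        scores = np.where(ious >= iou_thr, scores * (1.0 - ious), scores)
        mask = scores > score_thr               # prune boxes that decayed away
        boxes, scores = boxes[mask], scores[mask]
    return np.array(keep_boxes), np.array(keep_scores)

def pairwise_iou(box, others):
    x1 = np.maximum(box[0], others[:, 0]); y1 = np.maximum(box[1], others[:, 1])
    x2 = np.minimum(box[2], others[:, 2]); y2 = np.minimum(box[3], others[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    a1 = (box[2] - box[0]) * (box[3] - box[1])
    a2 = (others[:, 2] - others[:, 0]) * (others[:, 3] - others[:, 1])
    return inter / (a1 + a2 - inter + 1e-9)
```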
Average Precision (AP) and F-Score are used as the evaluation metrics. AP is the area under the Precision-Recall curve [38], defined as:

$$P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}$$

$$AP = \int_0^1 P(R)\,\mathrm{d}R, \qquad F\text{-}Score = \frac{2PR}{P + R}$$

where TP, FP, and FN are the numbers of true positives, false positives, and false negatives, P is Precision, and R is Recall.
Model training used a Colab cloud server running Ubuntu 18.04, with the PyTorch-based open-source framework MMDetection v2.18. The graphics processor was an NVIDIA Tesla P100 with 16 GB of memory, the CPU a 4-core Intel Xeon, and the CUDA version 10.2.
Both supervised and semi-supervised learning used an SGD optimizer with 0.9 momentum and 0.000 1 weight decay. The maximum learning rate was 0.01 and the maximum number of epochs 150, with the learning-rate decay start and end epochs set to {8, 98}. The Soft-NMS threshold in the RCNN test stage was 0.45. The loss weight of FocalLoss in supervised learning was 1.0 and that of SoftFocalLoss in semi-supervised learning 15.0. The semi-supervised learning rate followed a linear warmup strategy: starting from 0.001, it reached the maximum after 600 warmup iterations and was then held at 0.01 until epoch 8, after which the dynamic exponent in the anti-noise loss was raised from 0.2 to 0.3. Both stages trained with 8 GPU threads and a batch of 8 images.
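These settings correspond to the following schedule sketch in MMDetection 2.x config style; the values follow the paper, while the field names are the framework's defaults rather than released code.

```python
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=1e-4)
lr_config = dict(
    policy='step',
    warmup='linear',            # linear warmup (semi-supervised stage)
    warmup_iters=600,
    warmup_ratio=0.001 / 0.01,  # start at lr = 0.001
    step=[8, 98])               # decay milestones (epochs)
runner = dict(type='EpochBasedRunner', max_epochs=150)
data = dict(samples_per_gpu=8, workers_per_gpu=8)
```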
Online data augmentation combined random flipping with random cropping. The supervised stage used dataset 1, whose training set contains 1 080 images, so one epoch comprises 135 iterations. After random cropping, images measure 769×1 368 pixels and are rescaled to 1 120×640 pixels before being fed to the model. The semi-supervised stage trained on dataset 1 together with the pseudo labels of dataset 2, for 2 160 training images and 270 iterations per epoch.
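Under the same assumption, the online augmentation and resizing can be written as a train-pipeline sketch; the flip ratio and the transform order are illustrative.

```python
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='RandomCrop', crop_size=(769, 1368)),      # random cropping
    dict(type='Resize', img_scale=(1120, 640), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),             # random flipping
    dict(type='Normalize', mean=[123.675, 116.28, 103.53],
         std=[58.395, 57.12, 57.375], to_rgb=True),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
```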
As Table 2 shows, the single-stage detector SSD300 reaches an average precision of 64.2% on this dataset and is unsuitable as the basis for the semi-supervised improvement, whereas Faster RCNN reaches 85.4%, 21.2 percentage points higher than SSD300, making it better suited as the detector of the semi-supervised model. The optimal CNRS model attains an AP of 87.7% and an F-Score of 0.669 on the test set: 2.3 percentage points and 0.03 higher than Faster RCNN, and 1.6 percentage points and 0.071 higher than Combating Noise. Taking Table 2 and Fig.4 together, CNRS achieves the highest accuracy among all compared models. Judging from detection speed and model size, CNRS meets practical needs, but future work will attempt to further compress the model and increase detection speed.
To verify the effectiveness of the cascade network, anti-noise learning, the optimized loss functions, and Soft-NMS in CNRS, Table 2 lists results for two configurations of the cascade noise-resistant model, Model1 and Model2. Adding the cascade network, Model1 raises AP by 0.3 percentage points while still using RoI Pooling. Because RoI Pooling affects classification little but its quantization bias harms localization, Model2, which switches to RoI Align on top of Model1, raises AP by a further 0.3 percentage points. Since the images contain many occluded targets and difficult edge samples, optimizing the loss functions in the supervised stage and using Soft-NMS raises AP by another 1 percentage point. Comparing Model1, Model2, and CNRS with their supervised-stage counterparts shows that anti-noise learning effectively extracts the information in pseudo labels.
Table 2 Average precision and F-Score of different models
Note: Models ending in "sup" denote the supervised learning stage of the corresponding semi-supervised model.
Fig.4 PR curves of different models
The optimal CNRS model was used to detect 24 test images, with results shown in Fig.5. CNRS accurately detects small-target, edge, dead, early-stage, fire-damaged, and purple epidemic trees. The results indicate that a complex and diverse dataset strengthens model generalization, enabling detection of epidemic wood of different forms across various terrains.
False and missed detections of four models were also compared; the test images contain 841 real epidemic trees, and the statistics are given in Table 3. With only 50% of the data labeled, the semi-supervised models Combating Noise and CNRS still achieve steadily lower false detection and missed detection rates. Compared with Combating Noise, the most advanced contemporaneous semi-supervised object detection model, the proposed CNRS lowers the false detection rate by 1.6 percentage points and the missed detection rate by 1.9 percentage points, giving the best detection performance among the models in the table.
Fig.5 Examples of CNRS detection results
Table 3 Comparison of detection results of different models
False and missed detections remain in the CNRS results, so an error analysis was performed, as shown in Fig.6. Typical errors include missed detections caused by image distortion, missed detection of some early-stage epidemic trees, missed detections caused by occlusion, and false detections caused by strong light. The main causes are: 1) side-view epidemic trees appear in regions far from the image center; 2) epidemic wood takes many forms that were not finely classified; 3) trees in the early infection stage have inconspicuous features; 4) tree shapes are distorted on slopes; 5) illumination was strong during data collection.
Fig.6 Examples of detection errors
Aiming at the high data annotation cost of deep-learning-based pine epidemic wood detection, this paper proposes CNRS, a cascade noise-resistant semi-supervised object detection model that uses anti-noise loss functions to improve pseudo-label learning and reduce the amount of annotation required. On this basis, the model framework was optimized and the best improved model selected through comparative experiments, further raising detection accuracy.
1) The optimal CNRS model reaches an average precision of 87.7% and an F-Score of 0.669. Compared with Faster RCNN and SSD300, the data annotation cost drops by 50%; compared with Combating Noise, the most advanced contemporaneous semi-supervised object detection model, average precision rises by 1.6 percentage points. The proposed method thus effectively reduces annotation cost and leads in semi-supervised detection accuracy for discolored pine epidemic wood.
2) Detection results show that CNRS accurately detects all six epidemic wood forms described above, confirming that diverse data strengthen model generalization.
3) CNRS has a false detection rate of 12.3% and a missed detection rate of 6.8%, the best among the four models, but its results still contain a fair number of missed and false detections. Moreover, existing semi-supervised object detection models are mostly built on two-stage detectors and cannot meet real-time detection needs.
In the future, the dataset will be enlarged and subdivided by epidemic wood form; while improving model accuracy, single-stage detectors will be tried to further increase detection speed.
[1] Li Dong. Application of pollution-free control technology in forestry pest control[J]. South China Agriculture, 2021, 15(35): 12-14. (in Chinese)
[2] Zhang Xiaodong, Yang Haobo, Cai Peihua, et al. Research progress on remote sensing monitoring of pine wilt disease[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2022, 38(18): 184-194. (in Chinese with English abstract)
[3] Zhang Junguo, Feng Wenzhao, Hu Chunhe, et al. Image segmentation method for forestry unmanned aerial vehicle pest monitoring based on composite gradient watershed algorithm[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(14): 93-99. (in Chinese with English abstract)
[4] Song Yining, Liu Wenping, Luo Youqing, et al. Monitoring of dead trees in forest images based on linear spectral clustering[J]. Scientia Silvae Sinicae, 2019, 55(4): 187-195. (in Chinese with English abstract)
[5] Zhang Tian, Zhang Xiaoli, Liu Hongwei, et al. Application of remote sensing technology in monitoring forest diseases and pests[J]. Journal of Anhui Agricultural Sciences, 2010, 38(21): 11604-11607. (in Chinese with English abstract)
[6] Wen Changji, Wang Qirui, Chen Hongrui, et al. Model for the recognition of large-scale multi-class diseases and pests[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2022, 38(8): 169-177. (in Chinese with English abstract)
[7] Sun Yu, Zhou Yan, Yuan Mingshuai, et al. UAV real-time monitoring for forest pest based on deep learning[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2018, 34(21): 74-81. (in Chinese with English abstract)
[8] Zhou Yan, Liu Wenping, Luo Youqing, et al. Small object detection for infected trees based on the deep learning method[J]. Scientia Silvae Sinicae, 2021, 57(3): 98-107. (in Chinese with English abstract)
[9] Jia Shaopeng, Gao Hongju, Hang Xiao. Research progress on image recognition technology of crop pests and diseases based on deep learning[J]. Transactions of the Chinese Society for Agricultural Machinery, 2019, 50(S1): 313-317. (in Chinese with English abstract)
[10] Huang Liming, Wang Yixiang, Xu Qi, et al. Recognition of abnormally discolored trees caused by pine wilt disease using YOLO algorithm and UAV images[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(14): 197-203. (in Chinese with English abstract)
[11] Zaidi S S A, Ansari M S, Aslam A, et al. A survey of modern deep learning based object detection models[J]. Digital Signal Processing, 2022, 126: 103514.
[12] Girshick R, Donahue J, Darrell T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]. Columbus: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2014: 580-587.
[13] Ren S, He K, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[14] Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection[C]. Las Vegas: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition(CVPR). 2016: 779-788.
[15] Redmon J, Farhadi A. YOLO9000: Better, faster, stronger[C]. Honolulu: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition(CVPR). 2017: 7263-7271.
[16] Redmon J, Farhadi A. YOLOv3: An incremental improvement[EB/OL] (2018-04-08) [2022-07-26]. https://arxiv.org/abs/1804.02767.
[17] Bochkovskiy A, Wang C Y, Liao H Y M. YOLOv4: Optimal speed and accuracy of object detection[EB/OL] (2020-04-23) [2022-07-26]. https://arxiv.org/abs/2004.10934.
[18] Liu W, Anguelov D, Erhan D, et al. SSD: Single shot multibox detector[C]. Amsterdam: European Conference on Computer Vision(ECCV). 2016: 21-37.
[19] Law H, Deng J. Cornernet: Detecting objects as paired keypoints[C]. Munich: Proceedings of the European Conference on Computer Vision(ECCV). 2018: 734-750.
[20] Shao F, Chen L, Shao J, et al. Deep Learning for weakly-supervised object detection and localization: A survey[J]. Neurocomputing, 2022: 192-207.
[21] Li X, Kan M, Shan S, et al. Weakly supervised object detection with segmentation collaboration[C]. Seoul: Proceedings of the IEEE/CVF International Conference on Computer Vision(ICCV). 2019: 9735-9744.
[22] Zhang D, Han J, Cheng G, et al. Weakly supervised object localization and detection: A survey[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 44(9): 5866-5885.
[23] Zhou B, Khosla A, Lapedriza A, et al. Learning deep features for discriminative localization[C]. Las Vegas: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition(CVPR). 2016: 2921-2929.
[24] Zhou Mingfei, Wang Xili. Object detection models of remote sensing images using deep neural networks with weakly supervised training method[J]. Scientia Sinica Informationis, 2018, 48(8): 1022-1034. (in Chinese with English abstract)
[25] Liu Y C, Ma C Y, He Z, et al. Unbiased teacher for semi-supervised object detection[EB/OL] (2021-02-18) [2022-07-26]. https://arxiv.org/abs/2102.09480.
[26] Zhang Y, Yao X, Liu C, et al. S4OD: Semi-supervised learning for single-stage object detection[EB/OL] (2022-04-09) [2022-07-26]. https://arxiv.org/abs/2204.04492.
[27] Xu M, Zhang Z, Hu H, et al. End-to-end semi-supervised object detection with soft teacher[C]. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). 2021: 3060-3069.
[28] Chen B, Li P, Chen X, et al. Dense learning based semi-supervised object detection[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2022: 4815-4824.
[29] Meethal A, Pedersoli M, Zhu Z, et al. Semi-weakly supervised object detection by sampling pseudo ground-truth boxes[EB/OL] (2022-06-16) [2022-07-26]. https://arxiv.org/abs/2204.00147.
[30] Yang Q, Wei X, Wang B, et al. Interactive self-training with mean teachers for semi-supervised object detection[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2021: 5941-5950.
[31] Wang Z, Li Y L, Guo Y, et al. Combating Noise: Semi-supervised Learning by Region Uncertainty Quantification[J]. Advances in Neural Information Processing Systems, 2021, 34: 9534-9545.
[32] Yang S, Luo P, Loy C C, et al. Wider face: A face detection benchmark[C]. Las Vegas: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition(CVPR). 2016: 5525-5533.
[33] Cai Z, Vasconcelos N. Cascade R-CNN: Delving into high quality object detection[C]. Salt Lake City: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2018: 6154-6162.
[34] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[EB/OL] (2015-04-10) [2022-07-26]. https://arxiv.org/abs/1409.1556.
[35] He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]. Las Vegas: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition(CVPR). 2016: 770-778.
[36] Lin T Y, Dollár P, Girshick R, et al. Feature pyramid networks for object detection[C]. Honolulu: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition(CVPR). 2017: 2117-2125.
[37] He K, Gkioxari G, Dollár P, et al. Mask R-CNN[C]. Venice: Proceedings of the IEEE International Conference on Computer Vision (ICCV). 2017: 2961-2969.
[38] Everingham M, Eslami S M A, Gool L V, et al. The pascal visual object classes challenge: A retrospective[J]. International Journal of Computer Vision, 2015, 111(1): 98-136.
Method for detecting pine forest discoloured epidemic wood based on semi-supervised learning
Zhao Hao1, Liu Wenping1※, Zhou Yan1, Luo Youqing2, Zong Shixiang2, Ren Lili2
(1. School of Information Science and Technology, Beijing Forestry University, Beijing 100083, China; 2. School of Forestry, Beijing Forestry University, Beijing 100083, China)
Deep learning has been a promising technology for epidemic tree detection in recent years. However, expensive data annotation poses a great challenge to detecting discolored epidemic wood in pine forests, and difficulties remain in dataset expansion, model generalization, and the detection of occluded or small objects. In this study, an object detection method for discolored epidemic wood was proposed using semi-supervised learning and Unmanned Aerial Vehicle (UAV) image analysis. The procedure comprised building a pine forest epidemic wood dataset, semi-supervised model training, and detection. The dataset was collected in three provinces of China and covers a total of six epidemic wood forms. It was randomly and equally divided into two subsets for the supervised and semi-supervised learning stages; after data augmentation, 2 160 training and 240 testing images were available. An anti-noise loss (SoftFocalLoss and an L1Loss weighted by an uncertainty indicator) was used to effectively improve the quality of pseudo labeling, following Combating Noise, the most advanced semi-supervised object detection model of the same period. Compared with Combating Noise, the proposed Cascade Noise-Resistant Semi-supervised (CNRS) object detection model makes the following improvements. 1) A cascade network balances the distribution of positive and negative samples during training, equalizing accuracy and over-fitting. 2) FocalLoss mines difficult samples, such as edge objects and early-stage epidemic wood, in the supervised learning phase. 3) SmoothL1Loss keeps gradients relatively stable when the difference between the true and predicted values is large. 4) Soft Non-Maximum Suppression (Soft-NMS) softens the rejection of detection boxes in the RCNN stage so that nearby targets are not filtered out. Experiments were conducted on the Ubuntu 18.04 operating system with an NVIDIA Tesla P100 graphics processor. The Average Precision (AP) was 64.2% for the single-stage detector SSD300 and 85.4% for the two-stage detector Faster RCNN using fully labeled data. The semi-supervised models used only 50% of the labeled data, yet both exceeded the fully supervised models, indicating that anti-noise learning effectively extracts semantic information from pseudo labels. In the ablation study, Model1, which adds the cascade network while retaining RoI Pooling, improved AP by 0.3 percentage points, and Model2 improved AP by a further 0.3 percentage points after replacing RoI Pooling with RoI Align. The AP of the optimal model on the test set was 87.7% with an F-Score of 0.669. Detection by the four models was also compared on 24 test images. Relative to Faster RCNN trained with fully labeled data, CNRS reduced labeling by 50% while raising AP by 2.3 percentage points, which was 1.6 percentage points higher than that of the semi-supervised Combating Noise. The method accurately detects epidemic trees of many different forms, significantly reduces the data labeling cost, and can provide reliable data support for pest control in agriculture and forestry.
unmanned aerial vehicle; image recognition; pine forest epidemic wood detection; semi-supervised learning; object detection
10.11975/j.issn.1002-6819.2022.20.019
TP391.41
A
1002-6819(2022)-20-0164-07
Zhao Hao, Liu Wenping, Zhou Yan, et al. Method for detecting pine forest discoloured epidemic wood based on semi-supervised learning[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2022, 38(20): 164-170. (in Chinese with English abstract) doi:10.11975/j.issn.1002-6819.2022.20.019 http://www.tcsae.org
Received: 2022-07-26; Revised: 2022-09-23
Supported by the Major Emergency Science and Technology Project of the National Forestry and Grassland Administration, "Research and Demonstration of Key Technologies for Pine Wilt Disease Prevention and Control" (ZD202001-05), and the National Key R&D Program of China, "Research on the Outbreak Mechanism and Sustainable Prevention and Control Technology of Pine Wilt Disease" (2021YFD1400901)
Zhao Hao, research interests: image processing and image recognition. Email: zhlistudy@163.com
Liu Wenping, professor and doctoral supervisor, research interests: computer image and video analysis and processing, pattern recognition, and artificial intelligence. Email: wendyl@vip.163.com