王雪菲 丁維龍
Abstract: Traditional short-term traffic flow prediction methods for highways suit only small data sets, predict inefficiently across the whole network, and neglect the spatio-temporal relationships in the data. To address these problems, a short-term traffic flow prediction method for highway big data that incorporates the K-Nearest Neighbors (KNN) model was proposed. Firstly, the K value and distance metric of the model were tuned, with cross validation used to compare model parameters. Secondly, considering the inherent business-level spatio-temporal associations of the data, feature vectors based on spatio-temporal characteristics were constructed. Finally, a regression prediction model was built in a big data environment, and prediction was performed with the optimally parameterized model. The experimental results show that, compared with the traditional time series model, the proposed method predicts the flow of all toll stations in a single run, runs fast per execution, and improves efficiency by 77%. Both the Mean Absolute Percentage Error (MAPE) and the Median Absolute Percentage Error (MDAPE) are significantly reduced, and the method has good horizontal scalability.
Key words: traffic flow; short-term forecasting; K-Nearest Neighbors (KNN); spatio-temporal data; big data
CLC number: TP319
Document code: A
0 Introduction
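In recent years, with the steady growth of China's economy and the gradual completion of the highway network, traffic volume on the highway network has kept increasing and travel demand has risen with it, which challenges the capacity of the network. Traffic congestion in the network causes social problems: it raises the cost of handling congestion and also brings economic losses to the country, so reducing congestion is one of the major social problems China needs to solve. Traffic control and traffic flow guidance are the core research problems of the Intelligent Transport System (ITS), and the key to both is short-term traffic flow prediction, which allows traffic control to disperse traffic flow across the network, relieve congestion, and provide travelers with guidance information. By prediction horizon, traffic flow prediction falls into two categories: short-term forecasting and mid-long-term forecasting [1]. Short-term traffic flow prediction uses time intervals of 5 to 30 min and takes current traffic flow data to predict the traffic flow in the next 5 to 30 min. This paper takes a 5 min interval as the example when discussing short-term traffic flow prediction.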
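To make the 5 min setting concrete, the following is a minimal sketch of how timestamped passage records could be counted per 5 min interval, and how the flow of the current interval could be paired with the flow of the next interval as a prediction target. The record layout `(station_id, timestamp)`, the function names, and the sample records are hypothetical and serve only as an illustration; they are not the data schema or the implementation used in this paper.

```python
from collections import Counter
from datetime import datetime, timedelta

INTERVAL = timedelta(minutes=5)  # the 5 min interval discussed in the text


def bucket_start(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its 5 min interval."""
    return ts - timedelta(
        minutes=ts.minute % 5, seconds=ts.second, microseconds=ts.microsecond
    )


def flow_per_interval(records):
    """Count vehicle passages per (station, interval start).

    `records` is an iterable of (station_id, timestamp) pairs; this layout
    is an assumption made for the sketch.
    """
    return Counter((station, bucket_start(ts)) for station, ts in records)


def supervised_pair(flows, station, interval_start):
    """Return (current flow, next-interval flow) for one station.

    The first value is known at prediction time; the second is the quantity
    a short-term predictor would try to estimate. Missing intervals count as 0.
    """
    current = flows[(station, interval_start)]
    target = flows[(station, interval_start + INTERVAL)]
    return current, target


# Made-up records, for illustration only.
records = [
    ("S1", datetime(2019, 1, 1, 8, 1, 10)),
    ("S1", datetime(2019, 1, 1, 8, 4, 59)),
    ("S1", datetime(2019, 1, 1, 8, 5, 0)),
    ("S2", datetime(2019, 1, 1, 8, 2, 30)),
]
flows = flow_per_interval(records)
current, target = supervised_pair(flows, "S1", datetime(2019, 1, 1, 8, 0))
```

In this sketch a passage at exactly 08:05:00 belongs to the interval starting at 08:05, so intervals are treated as half-open; other boundary conventions are equally possible.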
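Highway big data consists of static data and dynamic data. Static data includes toll station locations, road section information, sub-center information, and so on; dynamic data includes toll station vehicle passage records, accident information, road maintenance information, weather information, and so on. These data have the following characteristics: 1) Complexity. Highway traffic conditions are increasingly complex, and the data collected are of many kinds. 2) Massive volume. Road network data grows continuously, forming massive highway big data. 3) Real-time nature. Highway vehicle passage data is received in real time and updated at the second level. 4) Dynamics. Highway data contains not only static information but also dynamic information. Highway data has not only spatial properties but also temporal properties [2]. Most existing highway traffic flow prediction uses data measured by microwave vehicle detectors to predict flow at a single station; few studies predict traffic flow from the vehicle passage data recorded at toll station entrances and exits. Toll station passage data is currently used mainly for toll statistics and traffic flow statistics, and its value has not been fully exploited.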
Short-term prediction of toll station traffic flow using highway big data faces the following three difficulties: