Error Analysis on Regularized Learning


Journal of Hangzhou Normal University (Natural Science Edition), 2011, No. 1

ZHANG Ji-xiong1, WANG Jian-li2, SHENG Bao-huai2*

(1. College of Science, Hangzhou Normal University, Hangzhou 310036, China; 2. Department of Mathematics, Shaoxing University, Shaoxing 312000, China)

This paper presents an error analysis for a class of regularized learning algorithms. The sample error is bounded by means of the law of large numbers, and the approximation error is estimated with a K-functional.

learning theory; regularization framework; function reconstruction; reproducing kernel Hilbert spaces

1 Introduction

It is well known that the Shannon sampling theorem lies at the core of function reconstruction, which is the theoretical foundation of signal processing and many related fields (see e.g. [1-4]). To deal with noise in the sampled data, S. Smale and D. X. Zhou considered, from the viewpoint of learning theory and regression analysis, a regularized learning algorithm for random sampling and provided an error analysis in probability (see [4]). In the present paper we investigate the error estimates for this algorithm further. To this end, we first state the learning framework.

Such maps are called Mercer kernels.
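The definition that precedes this sentence does not survive in this copy. In the usual convention of [1] and [4], a Mercer kernel is a continuous, symmetric map K: X×X→R whose Gram matrices (K(x_i, x_j)) are positive semi-definite for every finite set of points. The sketch below checks these two matrix properties numerically for a Gaussian kernel; the choice of kernel, the bandwidth `sigma` and the sample points are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel K(x, y) = exp(-|x - y|^2 / (2 sigma^2)) on R^d (illustrative choice)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))

def gram_matrix(points, kernel):
    """Gram matrix (K(x_i, x_j)) for a finite list of points."""
    m = len(points)
    return np.array([[kernel(points[i], points[j]) for j in range(m)]
                     for i in range(m)])

# Hypothetical sample: 20 points drawn uniformly from [-1, 1]^2.
rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(20, 2))

G = gram_matrix(points, gaussian_kernel)
is_symmetric = bool(np.allclose(G, G.T))
min_eig = float(np.linalg.eigvalsh(G).min())  # >= 0 up to rounding for a Mercer kernel

print("symmetric:", is_symmetric)
print("smallest eigenvalue:", min_eig)
```

A check on one finite sample cannot prove the Mercer property; it only illustrates what the positive semi-definiteness condition requires of each Gram matrix.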

Let P be a non-negative Borel measure on X×R, and let P_X be the marginal measure induced by P on X, i.e., the measure on X defined by P_X(S) = P(π^{-1}(S)), where π: X×R→X is the projection. Notice that P(x, y), P(y|x) and P_X(x), x∈X, satisfy the following relation (see e.g. [1], [5])
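The displayed relation is missing from this copy. In the cited reference [1] the corresponding identity is the following disintegration formula, stated here for an integrable function φ: X×R→R (the symbol φ is introduced only for this illustration and may differ from the notation of the paper):

```latex
\[
  \int_{X\times\mathbb{R}} \varphi(x,y)\,dP
  \;=\;
  \int_{X}\left(\int_{\mathbb{R}} \varphi(x,y)\,dP(y\mid x)\right)dP_{X}(x).
\]
```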

This "breaking" of P into the measures P(y|x) and P_X(x) corresponds to looking at X×R as the product of X and R.
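The regularization schemes (6) and (7) referred to later are not reproduced in this copy. As an illustration only, the sketch below implements the classical regularized least-squares scheme of [1] and [4], which minimizes (1/m) Σ (f(x_i) - y_i)^2 + γ‖f‖_K^2 over the reproducing kernel Hilbert space of K. Its minimizer has the form f = Σ α_i K(x_i, ·) with (K + γ m I) α = y. The Gaussian kernel, the target function sin(πx), the noise level and all function names are assumptions made for this example, and the scheme analysed in the paper may differ.

```python
import numpy as np

def gaussian_gram(xs, ys, sigma=0.5):
    """Matrix of Gaussian kernel values K(x_i, y_j) for one-dimensional inputs."""
    d = xs[:, None] - ys[None, :]
    return np.exp(-d ** 2 / (2.0 * sigma ** 2))

def fit_regularized(xs, ys, gamma, sigma=0.5):
    """Solve (K + gamma * m * I) alpha = y for the coefficient vector alpha."""
    m = len(xs)
    K = gaussian_gram(xs, xs, sigma)
    return np.linalg.solve(K + gamma * m * np.eye(m), ys)

def predict(alpha, xs_train, xs_new, sigma=0.5):
    """Evaluate f(x) = sum_i alpha_i K(x_i, x) at the new points."""
    return gaussian_gram(xs_new, xs_train, sigma) @ alpha

# Hypothetical noisy sample z = {(x_i, y_i)} of size m.
rng = np.random.default_rng(1)
m = 200
gamma = 1e-3
x_train = rng.uniform(-1.0, 1.0, size=m)
y_train = np.sin(np.pi * x_train) + 0.1 * rng.standard_normal(m)

alpha = fit_regularized(x_train, y_train, gamma)

# Compare the reconstruction with the noise-free target on a test grid.
x_test = np.linspace(-0.9, 0.9, 50)
errors = predict(alpha, x_train, x_test) - np.sin(np.pi * x_test)
rmse = float(np.sqrt(np.mean(errors ** 2)))
print("RMSE against the noise-free target:", rmse)
```

Scaling the penalty by m inside the linear system keeps γ comparable across sample sizes, which is the normalization used when the empirical error is averaged over the sample.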

2 The Robustness

Before proving Theorem 1, we establish the robustness of the solutions.

Proposition 1 Let P, Q be distributions on Z with |P|_2 < +∞ and |Q|_2 < +∞, and let K(x, y) be a Mercer kernel on X×X satisfying (2). Let α(P) and α(Q) be the solutions of scheme (7) for the distributions P and Q respectively, (x) = . Then there holds

Lemma 1 Let P be a distribution on Z = X×R such that |P|_2 < +∞, and let K(x, y) be a kernel on X×X satisfying (2). Then,

i) There exists a unique minimizer α(P) of problem (7) and

ii) There holds the following relation

3 The Sample Error

The definitions of α_{z,γ}, α(P) and f* yield

Proposition 2 Let K(x, y) be a Mercer kernel on X×X satisfying (2), let P be a distribution on Z with |P|_2 < +∞, and let α(P) and α_{z,γ} be defined as in (7) and (6) respectively. Then, for any 0 < δ < 1, with confidence 1 - δ, there holds

4 Proof of Theorem 1

By (16) and (25) we have

[1] Cucker F, Smale S. On the mathematical foundations of learning theory[J]. Bull Amer Math Soc, 2001, 39(1): 1-49.

[2] Cucker F, Zhou Dingxuan. Learning theory: an approximation theory viewpoint[M]. New York: Cambridge University Press, 2007.

[3] Han Deguang, Zayed A I. Sampling expansions for functions having values in a Banach space[J]. Proc Amer Math Soc, 2005, 133(12): 3597-3607.

[4] Smale S, Zhou Dingxuan. Shannon sampling and function reconstruction from point values[J]. Bull (New Series) Amer Math Soc, 2004, 41(3): 279-305.

[5] Cucker F, Smale S. Best choices for regularization parameters in learning theory: on the bias-variance problem[J]. Found Comput Math, 2002, 2(3): 413-428.

[6] Carmeli C, De Vito E, Toigo A. Vector valued reproducing kernel Hilbert spaces of integrable functions and Mercer theorem[J]. Anal Appl, 2006, 4(4): 377-408.

[7] Sun Hongwei. Mercer theorem for RKHS on noncompact sets[J]. J Complexity, 2005, 21(2): 337-349.

[8] Devroye L, Györfi L, Lugosi G. A probabilistic theory of pattern recognition[M]. New York: Springer-Verlag, 1997.

[9] Micchelli C A, Xu Yuesheng, Zhang Haizhang. Universal kernels[J]. J Mach Learn Res, 2006, 7: 2651-2667.

[10] Smale S, Zhou Dingxuan. Estimating the approximation error in learning theory[J]. Anal Appl, 2003, 1(1): 17-41.

[11] Sheng Baohuai, Xiang Daohong. The convergence rate for a K-functional in learning theory[J]. J Inequality and Appl, 2010, DOI: 10.1155/2010/249507.

[12] Sheng Baohuai. Estimates of the norm of the Mercer kernel matrices with discrete orthogonal transforms[J]. Acta Math Hungar, 2009, 122(4): 339-355.

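Error Analysis of Regularized Learning Algorithms

ZHANG Jixiong1, WANG Jianli2, SHENG Baohuai2
(1. College of Science, Hangzhou Normal University, Hangzhou, Zhejiang 310036; 2. Department of Mathematics, Shaoxing University, Shaoxing, Zhejiang 312000)

An error analysis is given for a class of regularized sample-based learning algorithms. The sample error is obtained by means of the law of large numbers, and the approximation error is estimated with a K-functional.

learning theory; regularization model; function reconstruction; reproducing kernel Hilbert space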

CLC number: O174.41  MSC2010: 68Q32; 68T05  Article character: A

Article ID: 1674-232X(2011)01-0027-07

DOI: 10.3969/j.issn.1674-232X.2011.01.006

date: 2010-09-10

Supported by the National NSF (10871226) of PRC.

Biography: ZHANG Ji-xiong (1985-), male, born in Jiujiang, Jiangxi Province, master, engaged in learning theory.

*Corresponding author: SHENG Bao-huai (1962-), male, born in Baoji, Shaanxi Province, Ph.D., professor, engaged in learning theory. E-mail: bhsheng@usx.edu.cn