
    Research on Micro-blog New Word Recognition Based on SVM

    2017-10-10 11:31:05 Chaoting Xiao, Jianhou Gan, Bin Wen, Wei Zhang, and Xiaochun Cao
    Keywords: 伐木 (logging), 葛優 (Ge You), new words


    Chaoting Xiao, Jianhou Gan*, Bin Wen, Wei Zhang, and Xiaochun Cao

    New word discovery plays a significant role in Natural Language Processing (NLP). Because traditional mutual information performs poorly on multi-character strings, we improve the traditional mutual information and adjacency entropy methods and propose enhanced mutual information and relative adjacency entropy. Because multi-feature statistics over massive data are slow to compute, we use the MapReduce parallel computing model to extract features such as enhanced mutual information, relative adjacency entropy and background document frequency. The eight extracted features form the feature vectors of the candidate words, and an SVM model is trained on the labelled corpus. The experiments show that the proposed method accelerates the computation and shortens the time required by the whole recognition process. In addition, compared with existing methods, the F value reaches 86.98%.

    new word recognition; Natural Language Processing (NLP); enhanced mutual information; relative adjacency entropy; MapReduce; SVM

    1 Introduction

    New word recognition is one of the hard problems in Chinese word segmentation. Sproat and others pointed out that 60% of the errors in Chinese word segmentation are caused by new words[1]. Many new words now spread via micro-blog, for example ‘伐木累’, ‘葛優癱’ and ‘北京癱’. Micro-blog text contains a considerable proportion of new words, and linguists have concluded from statistics that more than 800 new words are produced per year on average[2]. In the field of new word recognition there is no agreed definition of ‘new word’. Based on existing research, new words are expected to have the following properties. From the perspective of the word itself, it should be an independent word. From the perspective of frequency, it should be widely adopted: in a corpus, it appears frequently, in many documents, and is used by many people. From the perspective of time, the word has appeared only within a certain recent period, or it has acquired a new meaning (‘the new use of an old word’)[3].

    At present, new word recognition methods fall into three groups: rule-based, statistical, and combinations of the two. Rule-based methods build a rule base and match candidates against templates. Their precision is high, but rules are difficult and costly to write by hand and are strongly tied to a particular field. Statistical methods are flexible, adaptable and portable, but they need a large corpus to compute the statistics and are therefore time-consuming. For example, Sui and others[4] extracted closely related words by computing the static union rate among words after segmenting the corpus, then used grammar rules and field features to obtain field terms with high confidence. Because the rules capture only field-specific features, the method does not transfer to other corpora. Sornlertlamvanich and others used a decision-tree model to train a new word recognition model and reported a precision of 85%[5], but the method is not suitable for large-scale corpora. Peng and others[6] treated segmentation and new word discovery in a unified statistical framework, using a CRF model that combines lexical features and field knowledge to extract new words, and added the discovered new words to the dictionary to improve recognition. The method improves segmentation accuracy but takes a long time. Liu and others[7] applied left and right information entropy and the log-likelihood ratio (LLR) to determine word boundaries and extract candidate new words. The method uses few features and its precision is not high. Lin and others[8] extracted new words based on a word-internal model combined with mutual information, IWP and position-word probability. Their mutual information is not suitable for multi-character strings, which limits the method. Li and others[9] used word frequency, word probability, etc., to train an SVM model, treating new word recognition as a classification problem. The method cannot recognize low-frequency new words and produces many garbage strings. Zhao and others[10] iteratively used mutual information, left (right) entropy, left (right) adjacency right (left) average entropy, etc., to obtain a candidate list of new words, and then filtered the list with a Chinese collocation library. Its limitation is that a multi-character string is divided into two substrings when the mutual information is computed, which affects the recognition results. Wang and others[11] mined new words from the internet using time series information, combining a new dynamic feature with commonly used statistical methods. The method compares the trend curves of the parts of each candidate word over a period of time, on the assumption that every part of a new word should follow the same trend. However, its accuracy is not very high. Shuai and others[12] proposed a filtering method based on a redefined stop word list and an iterative context entropy algorithm to recognize new words, and introduced lexical features. The method depends strongly on rules. Su and others[13] replaced adjacency entropy with a weighted adjacency entropy and achieved good performance. Li and others[14] used internal word probability, mutual information, word frequency and word probability rules as features to train an SVM model, but the precision was only 61.78%.

    To address the slow processing of large-scale corpora by the above methods and the low precision of statistical methods in recognizing new words, this paper analyzes the micro-blog corpus. We first reduce the noise in the micro-blog corpus. Then we use the N-gram statistical method to extract new word candidates based on word segmentation. We propose a new filtering algorithm and combine it with the stop word list released by Harbin Institute of Technology to filter the candidates. Next, an SVM classifier is trained on multiple feature values, including enhanced mutual information, relative adjacency entropy and background document frequency. Finally, the trained SVM model is used to recognize the new words in the micro-blog test set. The method speeds up the training of the new word recognition model, which is otherwise slowed by computing multi-feature statistics over a massive corpus, and also improves the precision of new word recognition. The detailed process is shown in Fig.1.

    2 New Word Discovery Method Based on Micro-blog Content

    2.1 Preprocessing of the corpus

    Micro-blog text is highly irregular in word use and grammar, which produces a large amount of noisy data. This noise affects feature extraction and increases the model training time. According to the features of the micro-blog corpus, this paper reduces the negative influence of noisy data on training. Through statistics and analysis, we find that much of the content is accompanied by topic labels, expression labels, etc. The specific labels are shown in Table 1.

    Fig.1 Overall flow chart of micro-blog new words extraction.

    Label          Description
    #words#        Hot topic; ‘words’ means the keywords of the topic
    @name          Remind a user; ‘name’ means the username to be reminded
    【sentence】     Micro-blog theme; ‘sentence’ is the summary or title of the main body of the micro-blog
    [word]         Expression word; ‘word’ means the mind and emotion, etc., of the author or paper

    Our analysis of the micro-blog corpus shows that ‘@’ labels, [expression] labels and URL links appear in most micro-blog content. ‘@’ is usually followed by a user name, and many user names are arbitrary strings, so we assume they contain no new words. The same holds for [expression] labels and URL links. These noisy data strongly affect the generation of candidate words, so they must be eliminated. This paper removes these three kinds of labels from the micro-blog data with regular expressions.
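    The label removal described above could be sketched as follows. This is a minimal illustration, not the expressions used in the paper: the three patterns and the function name `clean_microblog` are assumptions, and the mention pattern assumes that a user name ends at whitespace or punctuation.

```python
import re

# Hypothetical patterns; the paper does not list its regular expressions.
URL_RE = re.compile(r"https?://[A-Za-z0-9./?=&%_#-]+")
MENTION_RE = re.compile(r"@[^\s@:：,，]+")   # '@' followed by a user name
EXPRESSION_RE = re.compile(r"\[[^\[\]]+\]")  # [expression] labels


def clean_microblog(text: str) -> str:
    """Remove URL links, @name mentions and [expression] labels."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = EXPRESSION_RE.sub(" ", text)
    # Collapse the whitespace left behind by the removed labels.
    return re.sub(r"\s+", " ", text).strip()
```

    Topic labels (#words#) and theme labels (【sentence】) are left untouched here, because only the three label types named above are removed.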

    2.2 Filtering algorithm

    We apply the N-gram algorithm to the preprocessed micro-blog corpus. The basic idea is to slide a window of size N over the text. Based on the word segmentation of the micro-blog data, we use N-grams to generate the candidate words. If the corpus is ‘中文新詞識別’, the result after word segmentation will be ‘中文新/新詞/詞識別’. When N is 2, the candidate word will be ‘中文新/新詞/詞識別’. When N is 3, the candidate word will be ‘中文新詞/新詞識別’. We extract all the candidate new words from the micro-blog corpus for N = 2, 3 and 4. These candidate new words contain many garbage strings, so they need to be filtered. The news corpus provided by Fudan University consists of news published before 2002, whose language is formal, free of colloquial usage and rarely contains new words from the internet. Therefore, this paper proposes a filtering algorithm that combines the news corpus and stop words. The pseudocode is shown in Table 2, where N is the news corpus, W stands for the candidate new word set of micro-blog, T is the stop word list and NL means the candidate new word set after filtering.
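    Candidate generation and filtering might look like the sketch below. The filtering rule is an assumption, since the pseudocode of Table 2 is not reproduced in this text: a candidate is dropped if it already occurs in the news corpus or contains a stop word. The function names are hypothetical.

```python
from typing import Iterable, List, Set


def ngram_candidates(tokens: List[str], sizes: Iterable[int] = (2, 3, 4)) -> List[str]:
    """Join every run of n adjacent segmented words into one candidate string."""
    candidates = []
    for n in sizes:
        for start in range(len(tokens) - n + 1):
            candidates.append("".join(tokens[start:start + n]))
    return candidates


def filter_candidates(candidates: Iterable[str], news_text: str,
                      stop_words: Set[str]) -> List[str]:
    """Assumed rule: drop strings found in the news corpus or containing a stop word."""
    kept = []
    for cand in candidates:
        if cand in news_text:
            continue  # already used in formal news text, so unlikely to be new
        if any(sw in cand for sw in stop_words):
            continue  # contains a stop word
        if cand not in kept:
            kept.append(cand)  # keep first occurrence only
    return kept
```

    Here `news_text` plays the role of N, `candidates` of W, `stop_words` of T, and the returned list of NL.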

    3 Feature Selection of Candidate New Words

    After the filtering algorithm, the candidate words still contain some noise, such as ‘富美喜’, ‘女票不’ and ‘逼格真’. Therefore, we use statistical methods to quantify the features of these candidate new words. Mutual information[10] measures internal coagulation, while information entropy[10] and background document frequency measure the external freedom degree.

    Table 2 Filtering algorithm.

    3.1 Internal coagulation

    Mutual information[10] measures the correlation between two events. The traditional mutual information formula[10] is defined only for a pair of characters, so it can only be applied to two-character new words. For a multi-character candidate new word S = {s1 s2 … sn}, the common method nowadays is to take the two longest substrings Sleft = {s1 s2 … sn-1} and Sright = {s2 s3 … sn}. A high correlation between the two longest substrings of a candidate new word S shows that the two are combined more closely, and S is more likely to be a word. Conversely, a lower correlation indicates that they are less dependent on each other. However, the mutual information value obtained in this way is not very accurate. This paper therefore improves the traditional mutual information formula[10] and proposes enhanced mutual information, which is suitable for multi-character strings. The definition is as follows.

    (1)

    where W is the total number of words in the micro-blog corpus and P is the frequency of a string in the corpus. The greater the value of the enhanced mutual information, the more likely the current string is a new word.
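    For comparison, the conventional two-substring mutual information described at the start of this subsection can be sketched as below. This is the baseline, not the enhanced formula of Equation (1). The base-2 logarithm, the function name and the `counts` mapping are assumptions made for illustration.

```python
import math
from typing import Mapping


def two_substring_mi(s: str, counts: Mapping[str, int], total: int) -> float:
    """Pointwise mutual information between the two longest substrings of s."""
    if len(s) < 2:
        raise ValueError("need at least two characters")
    left, right = s[:-1], s[1:]        # Sleft = s1..sn-1, Sright = s2..sn
    p_s = counts[s] / total            # relative frequency of the whole string
    p_left = counts[left] / total
    p_right = counts[right] / total
    return math.log2(p_s / (p_left * p_right))
```

    For a two-character string the two substrings are single characters, so the function reduces to the classic character-pair form.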

    3.2 External freedom degree

    Information entropy reflects the average information content of an event's outcomes. Intuitively, a meaningful new word is not only repeated in the text but also appears in different contexts, which reflects the string's independence and its freedom of usage. However, the more often a string appears in the corpus, the larger its adjacency entropy tends to be, so plain adjacency entropy works against low-frequency strings. This paper therefore proposes a relative adjacency entropy. In our view, a string with a higher word probability than its substrings should be regarded as a new word. For a string W = {w1 w2 … wn}, we take its longest substrings Wleft = {w1 w2 … wn-1} and Wright = {w2 w3 … wn}. We subtract the weighted adjacency entropy of the substring from the weighted adjacency entropy of the string, and take the minimum of the two resulting relative adjacency entropies.

    We define α = {α1, α2, …, αn} and β = {β1, β2, …, βn} to be the left and right context sets, respectively, of the candidate repeated string ω in the corpus X. The left and right context entropies of ω in the corpus X are defined as follows.

    (2)

    (3)

    The weighted adjacency entropies are defined as follows, where a prime marks the weighted value and λ is the weight.

    $C'_r(\omega) = \lambda C_r(\omega) - (1-\lambda) C_r(\omega_{left})$    (4)

    $C'_L(\omega) = \lambda C_L(\omega) - (1-\lambda) C_L(\omega_{right})$    (5)

    The relative adjacency entropy is the minimum of the two:

    $C(\omega) = \min\{C'_L(\omega), C'_r(\omega)\}$    (6)

    From the above definitions we can conclude that for the character strings that are only used in the fixed context, the relative adjacency entropy is small. On the contrary, the relative adjacency entropy is big for character strings that are used in many different contexts.
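    Equations (4) to (6) can be illustrated with the following sketch. It assumes that the left and right entropies of Equations (2) and (3) are the usual Shannon entropies of the neighbouring characters (natural logarithm), and it treats the corpus as a single string. The helper names are hypothetical, and the default weight 0.62 is the value reported in the experiments.

```python
import math
from collections import Counter
from typing import Iterable, List


def entropy(chars: Iterable[str]) -> float:
    """Shannon entropy (natural log) of the observed neighbouring characters."""
    counts = Counter(chars)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def neighbours(corpus: str, w: str, side: str) -> List[str]:
    """Characters directly to the left or right of every occurrence of w."""
    found = []
    start = corpus.find(w)
    while start != -1:
        pos = start - 1 if side == "left" else start + len(w)
        if 0 <= pos < len(corpus):
            found.append(corpus[pos])
        start = corpus.find(w, start + 1)
    return found


def relative_adjacency_entropy(corpus: str, w: str, lam: float = 0.62) -> float:
    """Minimum of the weighted left and right entropies, as in Eqs. (4)-(6)."""
    w_left, w_right = w[:-1], w[1:]
    c_r = (lam * entropy(neighbours(corpus, w, "right"))
           - (1 - lam) * entropy(neighbours(corpus, w_left, "right")))
    c_l = (lam * entropy(neighbours(corpus, w, "left"))
           - (1 - lam) * entropy(neighbours(corpus, w_right, "left")))
    return min(c_l, c_r)
```

    The function is meant for candidates of at least two characters, so that both substrings are non-empty.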

    3.3 Background document frequency

    From the perspective of human memory, a new word is one that has never appeared in previous memories. We use a large-scale background corpus to simulate human memory and compare the frequency of a string in the background corpus with its frequency in the corpus from which it was extracted (the foreground corpus). If the frequency of the string in the foreground corpus is much larger than that in the large-scale background corpus, the string is likely to be a new word. This method also helps filter out garbage strings formed by the adhesion of high-frequency words, such as ‘也不’ and ‘了一’, because the frequencies of these strings in the foreground corpus and the background corpus are similar. The frequency ratio of a string ω in the two corpora X and Y is defined as follows.

    (7)

    where f(ω, X) and f(ω, Y) are the frequencies of ω in corpora X and Y respectively, X is the foreground corpus and Y is the background corpus.
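    A simple way to compute such a ratio is sketched below. Equation (7) is not reproduced in this text, so the exact form is an assumption: a plain ratio of relative frequencies, with additive smoothing added here only so that strings absent from the background corpus do not cause a division by zero.

```python
def frequency_ratio(fg_count: int, fg_size: int, bg_count: int, bg_size: int,
                    smoothing: float = 1.0) -> float:
    """Ratio f(w, X) / f(w, Y) with additive smoothing (an assumption)."""
    f_x = (fg_count + smoothing) / (fg_size + smoothing)  # foreground corpus X
    f_y = (bg_count + smoothing) / (bg_size + smoothing)  # background corpus Y
    return f_x / f_y
```

    A string that is frequent in the foreground but unseen in the background gets a large ratio, while an adhesion string such as ‘也不’, with similar frequencies in both corpora, gets a ratio close to 1.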

    3.4 Dice

    The Dice of candidate w is estimated by Equation 8. In Equation 8, xi denotes the characters in candidate w, i.e., w = x1, x2, …, xn.

    (8)

    3.5 SCP

    The SCP of candidate w is estimated by Equation 9. In Equation 9, xi denotes the characters in candidate w.

    (9)

    4 Parallel Implementation of New Word Feature Quantization Algorithm

    4.1 MapReduce parallel computing model

    MapReduce is a programming model divided into two stages, Map and Reduce. The input and output of each stage are key-value pairs. In the Map stage, the Map function reads each line of input as a key-value pair (K1, V1) and outputs a list of new key-value pairs List(K2, V2). In the Reduce stage, all output of the Map stage is grouped by key into (K2, List(V2)). This grouping process is called shuffle. Each group (K2, List(V2)) is the input of the Reduce function, which outputs the final results (K3, V3). Because the micro-blog corpus is large and the new word feature quantization algorithms are time-consuming, they limit the overall speed of new word recognition. We therefore parallelize the computation of the background document frequency, the relative adjacency entropy and the enhanced mutual information with the MapReduce model.
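    The three stages can be mimicked in a single process to make the data flow concrete. The sketch below is only an in-memory illustration of the Map, shuffle and Reduce steps, not a distributed implementation; `run_mapreduce` and its parameters are hypothetical names.

```python
from collections import defaultdict


def run_mapreduce(records, map_fn, reduce_fn):
    """Run map, shuffle and reduce over (K1, V1) records in one process."""
    # Map: each (K1, V1) record yields zero or more (K2, V2) pairs.
    mapped = []
    for k1, v1 in records:
        mapped.extend(map_fn(k1, v1))
    # Shuffle: group the values by key into (K2, List(V2)).
    groups = defaultdict(list)
    for k2, v2 in mapped:
        groups[k2].append(v2)
    # Reduce: each group yields the final (K3, V3) pairs.
    results = []
    for k2 in sorted(groups):
        results.extend(reduce_fn(k2, groups[k2]))
    return results
```

    With a map function that emits (word, 1) for every word in a line and a reduce function that sums the values, this reproduces the familiar word count.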

    4.2 Parallel implementation of background document frequency algorithm

    Background document frequency refers to the ratio between the frequency of the candidate word w in the foreground corpus X and its frequency in the background corpus Y. To make the multi-feature coupling more efficient, we calculate the word frequencies in the X and Y corpora separately and compute the frequency ratio in the Reduce stage of the multi-feature coupling. The background document frequency of a candidate word in the X corpus is calculated as follows. First, we segment the X corpus via Split and pass each segment to the Map function. Then we pass the candidate word set W to the Map function via the Configuration method. In the Map function, we calculate the frequency k of each candidate word in the corpus fragment and output the candidate word w as the key and the frequency k as the value. The pseudocode is shown in Table 3. In the Reduce function, the values with the same key are accumulated to obtain the number of occurrences of the candidate word w in the X corpus. This count is then divided by the size of the X corpus to obtain the frequency of the candidate word, which is the output. The specific algorithm is shown in Table 4. The overall system diagram is shown in Fig.2.

    Table 3 Map operation on parallelization of background document frequency.

    In which, the implementation process of step (3) is:
    1. Input W[], 〈k1, v1〉
    2. For i from 0 to length of W
    3. While (index != -1)
    4. Index

    Table 4 Reduce operation on parallelization of background document frequency.

    The implementation process of step (4) is:
    1. Input 〈W[i], List〈k〉〉
    2. For x from 0 to length of List〈k〉
    3. sum

    Fig.2 Parallelization of string frequency
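    Since the pseudocode in Tables 3 and 4 is incomplete in this text, the following sketch follows the prose description of Section 4.2 instead. It runs in a single process, and measuring the corpus size in characters is an assumption. All function names are hypothetical.

```python
from collections import defaultdict


def map_count(candidates, line):
    """Map: emit (w, k), where k is the number of occurrences of w in one line."""
    for w in candidates:
        k = 0
        index = line.find(w)
        while index != -1:
            k += 1
            index = line.find(w, index + 1)
        if k:
            yield w, k


def reduce_frequency(w, counts, corpus_size):
    """Reduce: accumulate the partial counts and divide by the corpus size."""
    return w, sum(counts) / corpus_size


def candidate_frequencies(lines, candidates):
    """Drive map, shuffle and reduce over a list of corpus lines."""
    groups = defaultdict(list)  # shuffle: w -> list of partial counts
    for line in lines:
        for w, k in map_count(candidates, line):
            groups[w].append(k)
    corpus_size = sum(len(line) for line in lines)  # in characters: an assumption
    return dict(reduce_frequency(w, ks, corpus_size) for w, ks in groups.items())
```

    Running the same routine on the foreground and background corpora gives the two frequencies whose ratio is taken later, in the coupling step.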

    4.3 Parallelization of relative adjacency entropy

    To obtain the relative adjacency entropy, the context entropy must be computed first. The context entropy consists of the left and right entropy, whose algorithms are similar, so we describe the left entropy as an example. After the X corpus is segmented via Split, each segment is passed to Map. We use the Configuration method to pass the candidate word set W to the Map function. In Map, we find the adjacent left character a of each candidate word w and output w as the key and a as the value. The specific algorithm is shown in Table 5. In Reduce, we count the occurrences of each distinct left character of a candidate word, then compute and output the left entropy of each candidate word. The specific algorithm is shown in Table 6. The overall process is shown in Fig.3.

    Table 5 Map operation on parallelization of relative adjacency entropy.

    In which, the process of step (3) is: 1. Input W[], 1 then 6. a
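    The left-entropy Map and Reduce steps described in this subsection could be sketched as below. The sketch follows the prose rather than the truncated tables, runs in a single process, and assumes the Shannon entropy with the natural logarithm. The function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def map_left_chars(candidates, line):
    """Map: emit (w, a) for every left-adjacent character a of candidate w."""
    for w in candidates:
        index = line.find(w)
        while index != -1:
            if index > 0:
                yield w, line[index - 1]
            index = line.find(w, index + 1)


def reduce_left_entropy(w, left_chars):
    """Reduce: count identical left characters and compute the left entropy."""
    counts = Counter(left_chars)
    total = sum(counts.values())
    h = -sum((c / total) * math.log(c / total) for c in counts.values())
    return w, h


def left_entropies(lines, candidates):
    """Drive map, shuffle and reduce over a list of corpus lines."""
    groups = defaultdict(list)  # shuffle: w -> list of left characters
    for line in lines:
        for w, a in map_left_chars(candidates, line):
            groups[w].append(a)
    return dict(reduce_left_entropy(w, chars) for w, chars in groups.items())
```

    The right entropy follows the same pattern with the character at `index + len(w)` in place of `index - 1`.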

    4.4 Multi-feature data coupling and SVM

    After obtaining the feature data of the candidate new words (such as the enhanced mutual information, the relative adjacency entropy and the background document frequency), we need to couple the multiple feature data to form feature vectors. The data come from different files, one each for the relative adjacency entropy, the enhanced mutual information and the background document frequency. Each mapper knows the name of the file whose data stream it processes, so each record is keyed by the candidate word w and tagged with the file name. After the map function has packaged each input record, MapReduce performs the partition, shuffle and sort operations. The Reduce function receives the input data and forms the complete cross product of the values. It generates all combined results of these values, with each tag appearing at most once in each combination. The overall process is shown in Fig.4.
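    The coupling step resembles a reduce-side join, which can be sketched as follows. The sketch assumes one value per candidate word in each feature file, so the cross product collapses to a single feature vector. The source tags in `FEATURES` and the function names are hypothetical.

```python
from collections import defaultdict

FEATURES = ("mi", "entropy", "bdf")  # hypothetical source-file tags


def map_tag(source, records):
    """Map: key each value by candidate word and tag it with its source file."""
    for word, value in records:
        yield word, (source, value)


def reduce_join(word, tagged_values):
    """Reduce: merge tagged values into one feature vector, one slot per source."""
    by_source = dict(tagged_values)
    if all(name in by_source for name in FEATURES):
        return word, [by_source[name] for name in FEATURES]
    return None  # candidates missing a feature are skipped in this sketch


def couple_features(files):
    """Join the per-file feature values of every candidate word."""
    groups = defaultdict(list)  # shuffle: word -> list of (source, value)
    for source, records in files.items():
        for word, tagged in map_tag(source, records):
            groups[word].append(tagged)
    joined = (reduce_join(w, vals) for w, vals in groups.items())
    return dict(item for item in joined if item is not None)
```

    The resulting vectors can then be labelled and passed to the classifier.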

    Table 6 Reduce operation on parallelization of relative adjacency entropy.

    The process of step (4) is:
    1. Input 〈W[i], List〈a〉〉
    2. For x from 0 to length of List〈a〉
    3. If List〈a〉.x not in A〈k, v〉 then // If there is no List〈a〉.x in A〈k, v〉
    4. A〈k, v〉

    SVM is a popular statistical machine learning method that often gives better classification results than other classifiers. It maps the sample points from a low-dimensional space to a high-dimensional feature space and finds a hyperplane such that the distance between each class of data and the hyperplane is maximal (namely, the optimal hyperplane). The final results differ with the choice of kernel function. The feature vectors composed of the multi-feature data of the candidate new words obtained above are split into a training set (70%) and a testing set (30%). The data are labelled manually, and we train the SVM micro-blog new word recognition model and carry out 10-fold cross-validation.

    Fig.3 Parallelization of Relative Adjacency Entropy

    Fig.4 Overall parallelization process

    5 Experimental Results and Analysis

    In the experiment, we employ a 5.2 million micro-blog corpus, which includes 591 micro-blogs from 2009, 60795 from 2010, 763027 from 2011, 1699484 from 2012, 17882335 from 2013, 681449 from 2014 and 198925 from 2015. We use the micro-blogs from 2009 to 2013 as the background corpus to simulate human memory, and those from 2014 and 2015 as the foreground corpus. The news corpus used by the filtering algorithm is from Fudan University and includes 9804 articles. The filtering algorithm, which combines the news corpus with the stop word list, yields a preliminary total of 13273 candidate new words. In the relative adjacency entropy algorithm, we set the weight to 0.62. The experiment shows that reducing the weight of the substring prevents new words from being deleted accidentally. The hardware for the parallel implementation is a single Lenovo computer with a Celeron dual-core T3000 1.80 GHz CPU and 16 GB of memory. The distributed environment is simulated by running VirtualBox virtual machines on this computer: six virtual machines with 512 MB of memory each, all running the CentOS operating system. The programs are developed in Java on the Eclipse platform. The SVM uses the RBF kernel function.

    5.1 Methods and standards

    We use the precision rate (P), recall rate (R) and F-measure to evaluate the experimental results. The measurements are defined as follows.

    (10)

    (11)

    (12)

    In formula (10), R means the recall rate of the recognized new words. In formula (11), P means the precision rate of the recognized new words. In formula (12), β = 1, so the F value is the harmonic mean of the precision rate and the recall rate. Together, these measurements reflect the overall performance of the new word recognition.
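    The three measurements can be computed as in the sketch below. Formulas (10) to (12) are not reproduced in this text, so the standard definitions are assumed: precision is the share of recognized words that are correct, recall is the share of true new words that are recognized, and F combines the two with weight β.

```python
def precision_recall_f(predicted: set, gold: set, beta: float = 1.0):
    """Return (P, R, F) for a set of recognized words against the labelled set."""
    correct = len(predicted & gold)
    p = correct / len(predicted) if predicted else 0.0
    r = correct / len(gold) if gold else 0.0
    if p == 0.0 and r == 0.0:
        return p, r, 0.0
    f = (1 + beta ** 2) * p * r / (beta ** 2 * p + r)
    return p, r, f
```

    With β = 1 the last expression reduces to 2PR / (P + R), the harmonic mean mentioned above.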

    5.2 Experimental results and analysis

    Table 7 compares the proposed method with previous methods. In Table 7, BF represents the background document frequency, MI* represents the enhanced mutual information, E* means the relative adjacency entropy and F represents the frequency. Comparing (MI+F+Dice+SCP+LCE+RCE) with (MI*+F+Dice+SCP+LCE+RCE), we can conclude that the enhanced mutual information improves both the precision rate and the recall rate. Similarly, both the precision rate and recall rate of (MI*+F+Dice+SCP+LCE+RCE) are higher than that of (MI*+F+Dice+SCP+LCE+RCE+E*). Comparing (MI*+F+Dice+SCP+LCE+RCE+E*) with (MI*+F+Dice+SCP+LCE+RCE+E*+BF), we observe that the new word recognition accuracy without BF is lower. According to the results, the enhanced mutual information and the relative adjacency entropy have positive effects. Compared with the traditional method (MI+F+Dice+SCP+LCE+RCE), the combination of features improves the precision rate, recall rate and F value of micro-blog new word recognition. The final F value is 86.98%. Experimental results of new word recognition are shown in Table 8 and experimental results of non-new word recognition are shown in Table 9.

    Table 7 Experimental results of micro-blog new word recognition. %

    Table 8 Experimental results of new word recognition.

    Table 9 Experimental results of non-new word recognition.

    We run the parallel algorithm on one, three and six nodes and, for micro-blog corpora of different sizes, record the time the system spends recognizing new words. The results are shown in Fig.5. The experimental results indicate that the recognition speed of micro-blog new words increases as the number of nodes increases.

    Fig.5 Operation speed chart on micro-blog new word recognition.

    6 Conclusion

    We improve the traditional mutual information and adjacency entropy methods and propose enhanced mutual information and relative adjacency entropy. Experiments verify that the parallelization shortens the overall time of new word recognition. An SVM classification model trained on the features generated by the above methods improves the precision rate and recall rate of micro-blog new word recognition, reaching an F value of 86.98%. In the future, we will explore further effective new word detection features and adopt them into the model to improve the performance of the micro-blog new word recognition task.

    Acknowledgment

    The research is supported by National Key Research and Development Plan (No. 2016YFB0800603), National Natural Science Foundation of China (No. 61562093, 61422213, 61650202), Key Project of Applied Basic Research Program of Yunnan Province (No. 2016FA024), and Key Program of the Chinese Academy of Sciences (No. QYZDB-SSW-JSC003).

    [1] R. Sproat and T. Emerson, The first international Chinese word segmentation bake off, in Proceedings of the Second SIGHAN Workshop on Chinese Language Processing, Sapporo, Japan, vol. 17, pp. 133-143, 2003.

    [2] Zhang Dexin, ‘There will be no fish if the water is too clear’ - My Normative View on New Words, Journal of Peking University: Philosophy and Social Science, vol. 37, no. 5, pp. 106-119, 2000.

    [3] X. Huang and R. F. Li, Discovery Method of New Words in Blog Contents, Modern Electronics Technique, vol. 36, no. 2, pp. 144-146, 2013.

    [4] Z. F. Sui, Y. R. Chen, and Y. R. Wu, et al., The Research on the Automatic Term Extraction in the Domain of Information Science and Technology. http://icl.pku.edu.cn/icl_tr/ papers_2000-2003 /2002/E026-szf-The Research on the Automatic Term Extraction in the Domain of Information Science and Technology.pdf

    [5] V. Sornlertlamvanich, T. Potipiti, and T. Charoenporn, Automatic Corpus-Based Thai Word Extraction with the C4.5 Learning Algorithm, in Proceedings of International Conference on Computational Linguistics, Germany, vol. 2, pp. 802-807, 2000.

    [6] F. Peng, F. Feng, and A. McCallum, Chinese segmentation and new word detection using conditional random fields, in Proceedings of the 20th International Conference on Computational Linguistics, Switzerland, pp. 562, 2004.

    [7] T. Liu, B. Q. Liu, Z. M. Xu, and X. L. Wang, Automatic domain-specific term extraction and its application in text classification, Acta Electronica Sinica, vol. 35, no. 2, pp. 328, 2007.

    [8] Z. F. Lin and X. F. Jiang, New Word Recognition Based on Internal Model of Word, Computer and Modernization, no. 11, pp. 56-58, 2010.

    [9] H. Li, C. N. Huang, J. Gao, and X. Fan, The Use of SVM for Chinese New Word Identification, in Proceedings of First International Joint Conference on Natural Language Processing, Sanya, China, pp. 723-732, 2004.

    [10] X. B. Zhao and H. P. Zhang, New Word Recognition Based on Iterative Algorithm, Computer Engineering, vol. 40, no. 7, pp. 154-158, 2014.

    [11] M. Wang, L. Lin, and F. Wang, New word identification in social network text based on time series information, in Proceedings of IEEE International Conference on Computer Supported Cooperative Working Design, IEEE Press, pp. 552-557, 2014.

    [12] C. Xiao, J. Gan, and B. Wen, et al., New Word Recognition Based on Micro-blog Contents, Pattern Recognition and Artificial Intelligence, vol. 27, no. 2, pp. 141-145, 2014.

    [13] Q. L. Su and B. Q. Liu, Chinese new word extraction from MicroBlog data, in Proceedings of International Conference on Machine Learning and Cybernetics, Tianjin, China: IEEE Press, pp. 1874-1879, 2013.

    [14] C. Li and Y. Xu, Based on Support Vector and Word Features New Word Discovery Research, Trustworthy Computing and Services. Berlin, Germany: Springer Press, pp. 698-701, 2012.

    Jianhou Gan received the Ph.D. degree in Metallurgical Physical Chemistry from Kunming University of Science and Technology, China, in 2016. In 1998, he was a faculty member at Yunnan Normal University, China. Currently, he is a professor at Yunnan Normal University, China. He has published over 40 refereed journal and conference papers. His research interests cover education informatization for nationalities, semantic Web, databases and intelligent information processing.

    Bin Wen received the Ph.D. degree in computer application technology from China University of Mining & Technology, Beijing, China, in 2013. In 2005, he was a faculty member at Yunnan Normal University, China. Currently, he is an associate professor at Yunnan Normal University, China. His research interests cover intelligent information processing and emergency management.

    Wei Zhang is now an Assistant Professor in the Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China. He received his Ph.D. degree from the Department of Computer Science at City University of Hong Kong, Hong Kong, China, in 2015. Before joining the Chinese Academy of Sciences, he was a visiting scholar in the DVMM group of Columbia University, New York, NY, USA, in 2014. His research interests include large-scale visual instance search and mining, multimedia and digital forensic analysis. He won second place in the TRECVID Instance Search task in 2012 and the Best Demo Award at ACM-HK openday 2013.

    Xiaochun Cao received the B.E. and M.E. degrees in computer science from Beihang University, Beijing, China, and the Ph.D. degree in computer science from the University of Central Florida, Orlando, FL, USA. He has been a Professor with the Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China, since 2012. He spent about three years with ObjectVideo Inc. as a Research Scientist. From 2008 to 2012, he was a Professor with Tianjin University, Tianjin, China. He has authored and co-authored over 120 journal and conference papers. He is a fellow of the IET. He is on the Editorial Board of the IEEE Transactions on Image Processing. His dissertation was nominated for the University of Central Florida's university-level Outstanding Dissertation Award. In 2004 and 2010, he was a recipient of the Piero Zamperoni Best Student Paper Award at the International Conference on Pattern Recognition.

    2016-12-20; accepted:2017-01-20

    M.S. degree in computer application technology from Yunnan Normal University, China, in 2017. His research interests cover pattern recognition and natural language processing.

    • Chaoting Xiao and Bin Wen are with School of Information Science and Technology, Yunnan Normal University, Kunming, Yunnan, China. E-mail: xiaochaoting@gmail.com; wenbin@ynnu.edu.cn.

    • Jianhou Gan is with Key Laboratory of Educational Informatization for Nationalities (Yunnan Normal University), Ministry of Education, Kunming, Yunnan, China. E-mail: ganjh@ynnu.edu.cn.

    • Chaoting Xiao, Wei Zhang and Xiaochun Cao are with Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China. E-mail: wzhang.cu@gmail.com, caoxiaochun@iie.ac.cn.

    *To whom correspondence should be addressed. Manuscript
