A Study of an Attention-Based Bidirectional LSTM Model for Sentiment Classification of Chinese Product Reviews

2018-01-05 08:06:09 Cheng Lu

Software Engineering, 2017, Issue 11

Cheng Lu

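Abstract: The rapid growth of domestic e-commerce websites has produced a large volume of Chinese product reviews. Classifying the sentiment of these reviews helps extract the useful information they contain and has significant practical value. At present, sentiment classification research relies mainly on sentiment lexicons or traditional machine learning. These methods usually require manual feature selection, which is time-consuming and laborious, and their classification performance is poor. To address these shortcomings, this paper proposes an attention-based bidirectional LSTM model for sentiment classification of Chinese product reviews. Experimental results show that the model achieves good precision, recall, and F1 scores on both the binary and the three-class classification tasks.

Keywords: Chinese product reviews; sentiment classification; attention mechanism; bidirectional LSTM

CLC number: TP391  Document code: A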

Abstract: With the rapid development of domestic e-commerce websites, there are large numbers of Chinese product reviews. Sentiment classification of these reviews helps to obtain useful information and is of great practical significance. Currently, most sentiment classification studies are based on sentiment dictionaries or traditional machine learning methods. These methods usually need manual feature selection, which is laborious, and their classification performance is poor. In view of these deficiencies, the paper proposes an attention mechanism-based bidirectional LSTM model for the sentiment classification of Chinese product reviews. The experimental results show that the proposed model achieves good precision, recall, and F1 score in both binary and three-class classification tasks on Chinese product reviews.

Keywords: Chinese product reviews; sentiment classification; attention mechanism; bidirectional LSTM

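1 Introduction

With the rapid growth of domestic e-commerce websites, more and more people shop online, generating a large volume of Chinese product reviews. Sentiment classification of these reviews can reveal how much users like a product and give prospective buyers purchasing advice; it also helps merchants improve their products and services promptly, thereby increasing commercial value. Sentiment classification of Chinese product reviews has therefore become a necessity.

Traditional sentiment classification research follows two main approaches: (1) methods based on sentiment lexicons; (2) methods based on traditional machine learning [1]. The former requires sentiment lexicons to be built by hand, which is time-consuming and laborious. The latter typically classifies with naive Bayes (NB), maximum entropy (ME), support vector machines (SVM), and similar models; these methods tend to lose the syntactic and semantic information in the text and struggle to capture the sentiment it expresses.

As deep neural networks came to be applied in natural language processing, Bengio et al. [2] in 2003 trained word vectors with a neural network to represent text. Word vectors not only capture semantic information effectively [3] but also avoid the data sparsity problem. Representing text with word vectors and applying deep learning models such as recursive neural networks [4, 5], convolutional neural networks (CNN) [6, 7], and recurrent neural networks (RNN) [8] to sentiment classification yields better results than traditional machine learning methods.

When classifying the sentiment of product reviews, the text depends strongly on its context, and standard neural network models do not handle this dependence well. This paper therefore uses a bidirectional long short-term memory network (Bidirectional Long Short-Term Memory, Bi-LSTM) for sentiment classification. In addition, because different words contribute differently to a text, an attention mechanism is introduced. On this basis, the paper proposes an attention-based Bi-LSTM model for sentiment classification of Chinese product reviews. To validate the model, experiments are run on a dataset of mobile phone reviews from an e-commerce website. The results show that the model performs well.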

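2 Bi-LSTM model based on attention mechanism

The attention-based Bi-LSTM model is shown in Figure 1. It consists of four main parts:

(1) representing the text with word vectors;

(2) extracting text features with the Bi-LSTM model;

(3) introducing an attention mechanism to express the importance of the different features;

(4) finally, classifying sentiment with a classifier.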
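Step (3) can be illustrated with a small sketch. The text here does not give the attention formulas, so the sketch assumes a common additive form: each hidden state h_t is projected to u_t = tanh(W h_t + b), scored against a context vector u_w, normalized with a softmax, and used to weight the hidden states. The names `attention_pool`, `W`, `b`, and `u_w` are hypothetical, and this is not necessarily the exact formulation used in the paper.

```python
import math

def attention_pool(hidden_states, W, b, u_w):
    """Collapse a sequence of hidden states into one vector (assumed additive attention).

    hidden_states: list of T vectors of length d (e.g. concatenated forward/backward states)
    W: a-by-d projection matrix, b: length-a bias, u_w: length-a context vector.
    In a real model these three would be learned; here they are plain lists.
    """
    scores = []
    for h in hidden_states:
        # u_t = tanh(W h_t + b)
        u = [math.tanh(sum(w_ij * h_j for w_ij, h_j in zip(row, h)) + b_i)
             for row, b_i in zip(W, b)]
        # Unnormalized importance of time step t: dot product of u_t and u_w
        scores.append(sum(u_i * c_i for u_i, c_i in zip(u, u_w)))
    # Softmax over time steps (subtract the max for numerical stability)
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # Attention-weighted sum of the hidden states
    d = len(hidden_states[0])
    pooled = [sum(a * h[j] for a, h in zip(alphas, hidden_states)) for j in range(d)]
    return pooled, alphas

# Toy usage with made-up numbers: three time steps, two-dimensional states
states = [[0.1, 0.2], [0.4, -0.3], [0.0, 0.5]]
vector, weights = attention_pool(states, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], [1.0, 1.0])
```

The returned weights sum to 1 across time steps, so the pooled vector stays on the same scale as the individual hidden states; the classifier in step (4) would take this pooled vector as input.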

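3 Experiment

3.1 Dataset

To validate the model, mobile phone reviews from an e-commerce website are used as the dataset. The dataset contains 15,649 reviews, divided by star rating into 4,373 positive reviews (4 or 5 stars), 4,629 neutral reviews (3 stars), and 6,647 negative reviews (1 or 2 stars). Sample entries are shown in Table 1.

Positive and negative reviews are used for binary classification; positive, neutral, and negative reviews are used for three-class classification. In all classification tasks, the dataset is randomly split into training, validation, and test sets in proportions of 80%, 10%, and 10%; see Table 2.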
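The random 80%/10%/10% split can be sketched as follows. The function name `split_dataset`, the fixed seed, and the choice to give any rounding remainder to the test set are assumptions for illustration; the source does not say how the split was implemented.

```python
import random

def split_dataset(samples, train_pct=80, val_pct=10, seed=0):
    """Shuffle the samples and split them into train / validation / test lists.

    Percentages are integers so the sizes are computed with exact integer
    arithmetic; whatever is left after train and validation goes to test.
    """
    rng = random.Random(seed)          # fixed seed only to make the sketch reproducible
    shuffled = list(samples)           # copy so the caller's list is not reordered
    rng.shuffle(shuffled)
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# Usage on placeholder sample IDs standing in for labelled reviews
train, val, test = split_dataset(range(1000))
```

Every sample lands in exactly one of the three subsets, which is the property that matters when the validation set is used for tuning and the test set for final reporting.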

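3.2 Data preprocessing

The review text is segmented with the jieba word segmentation tool, and stop words and punctuation are removed from the resulting token lists. After this processing, the maximum text length is 281. To build the feature vectors, the word vector dimension is set to 100, and the word vectors are initialized in two ways:

(1) Random initialization: all words are randomly initialized, and the word vectors are updated dynamically during training.

(2) word2vec: word vectors are trained with word2vec, the open-source tool released by Google in 2013; words not covered are randomly initialized, and the word vectors are updated dynamically during training.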
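The filtering step and the two initialization schemes can be sketched in plain Python. The sketch assumes the text has already been segmented (for example with jieba's `jieba.lcut`, which is not called here to keep the example dependency-free). The helper names, the stop-word set, and the uniform range of ±0.25 for random vectors are assumptions; the source does not state them.

```python
import random

EMBED_DIM = 100  # word-vector dimension stated in the text

def clean_tokens(tokens, stopwords):
    """Drop stop words and tokens made up only of punctuation or symbols."""
    return [t for t in tokens
            if t not in stopwords and any(ch.isalnum() for ch in t)]

def build_embeddings(vocab, pretrained=None, dim=EMBED_DIM, seed=0):
    """Return {word: vector}.

    With pretrained=None every word is randomly initialized (scheme 1).
    With a dict of pretrained vectors (e.g. exported from word2vec), known
    words copy their vector and the rest are randomly initialized (scheme 2).
    """
    rng = random.Random(seed)
    pretrained = pretrained or {}
    table = {}
    for word in vocab:
        if word in pretrained:
            table[word] = list(pretrained[word])
        else:
            # Assumed range; the source only says "random"
            table[word] = [rng.uniform(-0.25, 0.25) for _ in range(dim)]
    return table

# Usage with a made-up segmented review and a one-word stop list
tokens = clean_tokens(["手机", "，", "的", "很", "好", "！"], {"的"})
embeddings = build_embeddings(tokens)
```

In both schemes the resulting table is only a starting point, since the text states that the vectors are updated during training.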

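3.3 Experimental parameter settings

Parameter settings are critical to training a good model. The main parameters are: learning rate 0.01, batch size 50, 200 hidden units in the Bi-LSTM, dropout value 0.75, and L2 regularization coefficient 0.0001.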
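For reference, the stated settings can be gathered into one configuration object. This is only a sketch: the class and field names are hypothetical, the two helper functions are illustrative, and the comments flag what the source leaves unspecified.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainConfig:
    learning_rate: float = 0.01
    batch_size: int = 50
    hidden_units: int = 200   # source does not say whether this is per direction or in total
    dropout: float = 0.75     # source does not say whether this is a keep or a drop probability
    l2_lambda: float = 0.0001
    embed_dim: int = 100      # from the preprocessing section
    max_len: int = 281        # from the preprocessing section

def num_batches(n_samples, batch_size):
    """Number of mini-batches per epoch, counting a final partial batch."""
    return (n_samples + batch_size - 1) // batch_size

def l2_penalty(weights, lam):
    """One common form of the L2 term, lam * sum(w^2); the exact form used is an assumption."""
    return lam * sum(w * w for w in weights)

cfg = TrainConfig()
```

Keeping the values in a frozen dataclass makes it harder to change a setting by accident between the compared models.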

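3.4 Experimental results and analysis

To validate the model, the proposed rand-Attention-Bi-LSTM and word2vec-Attention-Bi-LSTM models are compared with LSTM and Bi-LSTM. The evaluation metrics are precision, recall, and F-score. The results are shown in Table 3.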
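The three metrics can be computed from true and predicted labels as in the sketch below. The source does not say how per-class scores are combined for the three-class task, so the macro average used here is an assumption, as are the function name and the label strings.

```python
def precision_recall_f1(y_true, y_pred, labels):
    """Per-class precision, recall and F1, plus their unweighted (macro) averages."""
    per_class = {}
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        per_class[c] = (prec, rec, f1)
    # Macro average: every class counts equally, regardless of its size
    n = len(labels)
    macro = tuple(sum(scores[i] for scores in per_class.values()) / n for i in range(3))
    return per_class, macro

# Usage on four made-up predictions
per_class, macro = precision_recall_f1(
    ["pos", "pos", "neg", "neg"],
    ["pos", "neg", "neg", "neg"],
    labels=["pos", "neg"],
)
```

Because the three classes in the dataset are of unequal size, a macro average and a sample-weighted average would give different numbers, which is why the averaging choice is worth stating alongside any reported score.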

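Table 3 shows that:

(1) Compared with LSTM, Bi-LSTM improves precision, recall, and F-score. Because Bi-LSTM takes both the preceding and the following context into account, it classifies better than LSTM.

(2) With the attention mechanism, the proposed models achieve higher precision, recall, and F-score than both LSTM and Bi-LSTM, indicating that the attention mechanism captures the importance of individual words in the text well.

(3) Comparing word2vec-Attention-Bi-LSTM with rand-Attention-Bi-LSTM shows that initializing the word vectors with word2vec is more effective and helps improve sentiment classification accuracy.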

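4 Conclusion

This paper proposes an attention-based Bi-LSTM model for sentiment classification of Chinese product reviews. Reviews are represented with word vectors, Bi-LSTM captures the contextual relationships in the text, and an attention mechanism expresses the importance of the different features and further improves the model. Finally, the model is applied to sentiment classification on a set of mobile phone reviews from an e-commerce website, and the experimental results confirm its feasibility and effectiveness.

Because Chinese product reviews contain evaluations of multiple product attributes, future work will look for better deep learning models to study sentiment orientation toward the different attributes in a review.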

References

[1] Du Changshun, Huang Lei. Application of piecewise convolutional neural networks in text sentiment analysis [J]. Computer Engineering and Science, 2017, 39(01): 173-179.

[2] Yoshua Bengio, Réjean Ducharme, Pascal Vincent, et al. A Neural Probabilistic Language Model [J]. Journal of Machine Learning Research, 2003, 3: 1137-1155.

[3] Mikolov Tomas, Yih Wen-tau, Zweig Geoffrey. Linguistic regularities in continuous space word representations [C]. The Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT), 2013: 746-751.

[4] Richard Socher, Brody Huval, Christopher D. Manning, et al. Semantic compositionality through recursive matrix vector spaces [C]. Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), 2012: 1201-1211.

[5] Richard Socher, Alex Perelygin, Jean Wu, et al. Recursive deep models for semantic compositionality over a sentiment Treebank [C]. Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2013: 1631-1642.

[6] Yoon Kim. Convolutional neural networks for sentence classification [C]. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014: 1746-1751.

[7] Nal Kalchbrenner, Edward Grefenstette, Phil Blunsom. A convolutional neural network for modelling sentences [C]. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL), 2014: 655-665.

[8] Siwei Lai, Liheng Xu, Kang Liu, et al. Recurrent convolutional neural networks for text classification [C]. Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015: 2267-2273.

[9] Yequan Wang, Minlie Huang, Xiaoyan Zhu, et al. Attention-based LSTM for Aspect-level Sentiment Classification [C]. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2016: 606-615.

[10] Sepp Hochreiter, Jürgen Schmidhuber. Long short-term memory [J]. Neural Computation, 1997, 9(8): 1735-1780.

[11] Kelvin Xu, Jimmy Ba, Ryan Kiros, et al. Show, attend and tell: Neural image caption generation with visual attention [C]. Proceedings of the 32nd International Conference on Machine Learning (ICML), 2015: 2048-2057.

[12] Volodymyr Mnih, Nicolas Heess, Alex Graves, et al. Recurrent models of visual attention [C]. Advances in Neural Information Processing Systems 27 (NIPS), 2014: 2204-2212.

[13] Zichao Yang, Diyi Yang, Chris Dyer, et al. Hierarchical Attention Networks for Document Classification [C]. Proceedings of Human Language Technologies: The Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT), 2016: 1480-1489.

[14] Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio. Neural machine translation by jointly learning to align and translate [C]. International Conference on Learning Representations (ICLR), 2015.

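About the author:

Cheng Lu (b. 1988), female, master's degree, teaching assistant. Research areas: artificial intelligence, natural language processing, wireless sensor networks.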