Date of Submission
Spring 5-7-2021
Degree Type
Thesis
Degree Name
Master of Science in Computer Science (MSCS)
Department
Computer Science
Committee Chair/First Advisor
Dr. Jing (Selena) He and Dr. Meng Han
Track
Others
Machine Learning
Chair
Dr. Jing (Selena) He
Committee Member
Dr. Meng Han
Committee Member
Dr. Yan Huang
Abstract
Natural Language Processing (NLP) is one of the most attractive technologies in many applications in real-life. Sentiment analysis, which has devoted to know others' think or feel about an experience or an item and hence take an action, is one of the most developed area in both academia and industry. Among sentiment analysis, fine-grained aspect sentiment analysis attempts to analyze emotional attitude categorized into different aspects or features of an(a) experience/service/product. Although aspect level sentiment analysis could provide more useful information, the proposed models' performance were relative poor compared with document-level or sentence-level sentiment analysis due to the lack of well-labeled aspect-level dataset. This thesis propose a semi-supervised approach using pre-trained BERT model to conduct the fine-grained aspect sentiment analysis, and tests it on the benchmark dataset SemEval2014. Our proposed Sentiment Mask Enhanced BERT pre-training and Multi-Token classification (SME-BERT-MT) model consists of two parts, a masked sentiment pre-trained model and a multi-label classification main model. In the first step, We first leverage the sentiment dictionary SentiWordNet to identify the sentiment words in the reviews of the dataset and mask them, then use the masked sentiment pre-trained model to predict the masked words. In the second step, we use the multi-label classification main model to detect aspect topics and aspect-level sentiment. This multi-label classification main model incorporates our pre-trained model and fine-tuned with addition of classification layers. The testing result on SME-BERT-MT shows an accuracy of 96.6 and 87.6 for aspect detection and aspect-level sentiment classification sub-tasks, respectively, on SemEval2014 dataset. 
Our model outperforms the baseline BERT model by 28.1% and 7.4%, and the accuracy is also more than 2.9% and 1.9% higher than previous works for the aspect detection and sentiment classification, respectively.
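The sentiment-word masking step described in the abstract can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it substitutes a tiny hand-made sentiment lexicon for the actual SentiWordNet scores, and uses simple whitespace tokens rather than BERT's WordPiece tokenizer. The lexicon contents, function name, and `[MASK]` handling here are assumptions for illustration only.

```python
# Sketch of the sentiment-word masking step (toy lexicon stands in for
# SentiWordNet; the thesis looks up sentiment scores in SentiWordNet instead).
SENTIMENT_LEXICON = {"great", "terrible", "delicious", "bland", "friendly", "slow"}
MASK_TOKEN = "[MASK]"  # BERT's mask token

def mask_sentiment_words(tokens):
    """Replace each token found in the sentiment lexicon with [MASK];
    return the masked sequence plus the (position, word) prediction targets."""
    masked, targets = [], []
    for i, tok in enumerate(tokens):
        if tok.lower() in SENTIMENT_LEXICON:
            masked.append(MASK_TOKEN)
            targets.append((i, tok))
        else:
            masked.append(tok)
    return masked, targets

tokens = "The food was delicious but the service was slow".split()
masked, targets = mask_sentiment_words(tokens)
print(masked)   # ['The', 'food', 'was', '[MASK]', 'but', 'the', 'service', 'was', '[MASK]']
print(targets)  # [(3, 'delicious'), (8, 'slow')]
```

The masked sequences and their targets would then drive a masked-language-model pre-training pass (e.g. with a BERT-style model), so the model learns to recover sentiment words from context before fine-tuning on the aspect tasks.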