Date of Submission

Spring 5-7-2021

Degree Type

Thesis

Degree Name

Master of Science in Computer Science (MSCS)

Department

Computer Science

Committee Chair/First Advisor

Dr. Jing (Selena) He and Dr. Meng Han

Track

Others

Machine Learning

Chair

Dr. Jing (Selena) He

Committee Member

Dr. Meng Han

Committee Member

Dr. Yan Huang

Abstract

Natural Language Processing (NLP) is one of the most attractive technologies in many real-life applications. Sentiment analysis, which is devoted to understanding what others think or feel about an experience or an item so that an action can be taken, is one of the most developed areas in both academia and industry. Within sentiment analysis, fine-grained aspect sentiment analysis attempts to analyze emotional attitudes toward different aspects or features of an experience, service, or product. Although aspect-level sentiment analysis can provide more useful information, the performance of proposed models has been relatively poor compared with document-level or sentence-level sentiment analysis due to the lack of well-labeled aspect-level datasets. This thesis proposes a semi-supervised approach using a pre-trained BERT model to conduct fine-grained aspect sentiment analysis and tests it on the benchmark dataset SemEval2014. Our proposed Sentiment Mask Enhanced BERT pre-training and Multi-Token classification (SME-BERT-MT) model consists of two parts: a masked sentiment pre-trained model and a multi-label classification main model. In the first step, we leverage the sentiment dictionary SentiWordNet to identify the sentiment words in the reviews of the dataset and mask them, then use the masked sentiment pre-trained model to predict the masked words. In the second step, we use the multi-label classification main model to detect aspect topics and aspect-level sentiment. This main model incorporates our pre-trained model and is fine-tuned with additional classification layers. Testing of SME-BERT-MT shows an accuracy of 96.6% and 87.6% for the aspect detection and aspect-level sentiment classification sub-tasks, respectively, on the SemEval2014 dataset. Our model outperforms the baseline BERT model by 28.1% and 7.4%, and its accuracy is also more than 2.9% and 1.9% higher than previous works for aspect detection and sentiment classification, respectively.
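To illustrate the first step described in the abstract, the sketch below shows one way the sentiment-word masking could be set up, assuming NLTK's SentiWordNet corpus and a Hugging Face BERT tokenizer; the function names, score threshold, and whitespace tokenization are illustrative assumptions, not details taken from the thesis.

```python
# Hypothetical sketch of the sentiment-word masking step used before
# masked-LM pre-training. Assumes nltk and transformers are installed.
import nltk
from nltk.corpus import sentiwordnet as swn
from transformers import BertTokenizer

nltk.download("sentiwordnet", quiet=True)
nltk.download("wordnet", quiet=True)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

def is_sentiment_word(word, threshold=0.5):
    """Treat a word as sentiment-bearing if any SentiWordNet synset
    assigns it a positive or negative score above the threshold."""
    for synset in swn.senti_synsets(word):
        if synset.pos_score() >= threshold or synset.neg_score() >= threshold:
            return True
    return False

def mask_sentiment_words(review):
    """Replace sentiment-bearing tokens with the BERT mask token so the
    masked-LM objective focuses on predicting sentiment words."""
    tokens = review.split()
    masked = [tokenizer.mask_token if is_sentiment_word(t.lower()) else t
              for t in tokens]
    return " ".join(masked)

print(mask_sentiment_words("The food was great but the service was terrible"))
# e.g. "The food was [MASK] but the service was [MASK]"
```

The masked reviews would then serve as inputs to a masked-language-model pre-training pass, after which the encoder is reused under additional classification layers for the multi-label aspect and sentiment tasks, as described above.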

Available for download on Wednesday, May 06, 2026
