Date of Submission

Summer 7-20-2021

Degree Type

Thesis

Degree Name

Master of Science in Computer Science (MSCS)

Department

Computer Science

Committee Chair/First Advisor

Dr. Dan Lo

Track

Big Data

Committee Member

Dr. Reza Meimandi Parizi

Committee Member

Dr. Yong Shi

Abstract

Big data analytics is gaining popularity among enterprises, from retailers and supply chains to online shopping stores, as a way to optimize their business processes. In practice, however, raw data are rarely usable for this purpose as collected. A good data pre-processing approach is therefore required and is a key step to success. We investigate the effectiveness of data pre-processing and the business process using a real-world database. Our methodology involves natural language processing. Our key goal is to identify appropriate big data analysis methods that can handle the errors, ambiguity, and repeated descriptions that arise from human language. In this study, we performed a simple language similarity check to understand the state of the database. We also applied a logical representation system to the database as a proof of concept.
