Date of Submission

Summer 7-20-2021

Degree Type


Degree Name

Master of Science in Computer Science (MSCS)


Computer Science

Committee Chair/First Advisor

Dr. Dan Lo


Big Data

Committee Member

Dr. Reza Meimandi Parizi

Committee Member

Dr. Yong Shi


Big data analytics is gaining popularity among enterprises, from retailers and supply chains to online shopping stores, as a way to optimize their business processes. In practice, however, raw data is rarely usable for this purpose as collected. A good data pre-processing approach is therefore required and is a key step to success. We propose to study the effectiveness of data pre-processing and the business process, based on a real-world database. Our methodology involves natural language processing. Our key goal is to identify big data analysis methods that can handle the errors, ambiguity, and repeated descriptions that arise from human language. In this study, we performed a simple language similarity check to assess the state of the database. We also applied a logical representation system to the database as a proof of concept.
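
As an illustration of the kind of simple language similarity check mentioned in the abstract, the following is a minimal sketch only. It assumes pairwise character-level comparison with Python's standard difflib module. The abstract does not specify the actual method, so the function names, the sample records, and the 0.8 threshold are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def similar_pairs(descriptions, threshold=0.8):
    """Return (i, j, ratio) for each pair whose similarity ratio meets the threshold.

    The threshold of 0.8 is an illustrative assumption, not a value from the thesis.
    """
    cleaned = [normalize(d) for d in descriptions]
    pairs = []
    for i, j in combinations(range(len(cleaned)), 2):
        ratio = SequenceMatcher(None, cleaned[i], cleaned[j]).ratio()
        if ratio >= threshold:
            pairs.append((i, j, ratio))
    return pairs


# Hypothetical product descriptions, invented for illustration only.
records = [
    "Blue cotton T-shirt, size M",
    "blue  cotton t-shirt size M",
    "Stainless steel water bottle",
]

for i, j, ratio in similar_pairs(records):
    print(f"records {i} and {j} look like repeats (ratio {ratio:.2f})")
```

Comparing every pair is quadratic in the number of records, so on a large database a sketch like this would need a blocking or indexing step before it could be applied at scale.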