Data Analysis and Visualization in COVID-19 Worldwide Variants Study

Disciplines

Data Science | International Public Health

Abstract

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has affected human society worldwide in many ways. SARS-CoV-2 infects humans through the interaction of its spike protein with the angiotensin-converting enzyme 2 (ACE2) receptor on host cells. As the virus mutates, new variants emerge, some of which are more infectious and lethal than the original 2019 strain. Using data on COVID-19 variants collected from the GISAID database, we examined seven major variants: Alpha (B.1.1.7), Beta (B.1.351), Gamma (P.1), Delta (B.1.617.2), Epsilon (B.1.617.1), Zeta (B.1.525), and Theta (B.1.617.B). We narrowed the data to six countries: Brazil, India, South Africa, South Korea, the UK, and the USA. China and Russia were excluded because their data reports were missing for a period. After cleaning the data, we present the prevalence of each variant using bar graphs, pie charts, and line graphs. We also compared the protein structures of the variants and observed the amino acid mutations in the receptor-binding domains (RBDs). Our analysis will identify the most prevalent variant and will seek to explain spikes in each country by examining policies that were implemented or lifted while that variant's infection rate was changing. Our research will provide a reference for future research on COVID-19 and similar diseases.
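
As an illustration of the prevalence step described in the abstract, the following is a minimal sketch of how per-country variant shares might be computed before plotting. It is not the authors' code: the record layout, the function names (variant_prevalence, most_prevalent), and the counts in SAMPLE_COUNTS are all hypothetical and are not taken from GISAID or from the study.

```python
from collections import defaultdict

# Hypothetical, made-up sequence counts for illustration only.
# These are NOT values from GISAID or from the study.
SAMPLE_COUNTS = [
    ("Brazil", "Gamma", 60),
    ("Brazil", "Delta", 40),
    ("UK", "Alpha", 30),
    ("UK", "Delta", 70),
]


def variant_prevalence(records):
    """Return {country: {variant: share}}; shares sum to 1 within a country."""
    totals = defaultdict(int)
    counts = defaultdict(lambda: defaultdict(int))
    for country, variant, n in records:
        counts[country][variant] += n
        totals[country] += n
    return {
        country: {v: n / totals[country] for v, n in variants.items()}
        for country, variants in counts.items()
        if totals[country] > 0  # skip countries with no sequences at all
    }


def most_prevalent(prevalence):
    """Return {country: variant with the largest share in that country}."""
    return {c: max(shares, key=shares.get) for c, shares in prevalence.items()}


prevalence = variant_prevalence(SAMPLE_COUNTS)
dominant = most_prevalent(prevalence)
```

The resulting shares could then be passed to any plotting library to draw the bar, pie, or line charts; computing shares rather than raw counts keeps countries with very different sequencing volumes comparable.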

Academic department under which the project should be listed

CCSE - Information Technology

Primary Investigator (PI) Name

Chloe Yixin Xie

This document is currently not available here.
