Disciplines

Health Information Technology

Abstract

The coronavirus disease 2019 (COVID-19) pandemic has affected 213 nations worldwide. Policymakers around the world are imposing measures to slow the rapid growth of infections. Meanwhile, healthcare systems face significant challenges from the massive number of confirmed or suspected COVID-19 patients seeking treatment. Estimating the number of confirmed cases is therefore necessary to provide insight into the growth of the outbreak and to support policy-making. In this study, we apply ARIMA models as well as LSTM-based recurrent neural networks to forecast the daily cumulative confirmed cases. The LSTM architecture generates more precise forecasts by leveraging both short- and long-term temporal dependencies in the pandemic time series data. However, because of the stochastic nature of optimization and the random initialization of weights in neural networks, an LSTM-based model produces less reproducible outcomes. In this paper, we propose a reproducible-LSTM (r-LSTM) framework that produces reproducible and robust results by leveraging the z-score outlier detection method. We performed five rounds of nested cross-validation to show consistency in evaluating model performance. The experimental results demonstrate that r-LSTM outperformed the ARIMA model, achieving the lowest MAPE, RMSE, and MAE.
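The abstract names three error metrics and a z-score outlier step without spelling either out, so the following is only a minimal sketch and not the authors' implementation. It assumes the z-score test is applied to the validation errors of repeated training runs, and that the surviving runs are averaged; the abstract does not confirm either choice. The function names, the `threshold` value of 2.0, and the averaging step are all hypothetical.

```python
import math
from statistics import mean, pstdev


def mae(actual, predicted):
    """Mean absolute error."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(mean((a - p) ** 2 for a, p in zip(actual, predicted)))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero (true for cumulative case counts
    once the first case has been recorded).
    """
    return 100.0 * mean(abs((a - p) / a) for a, p in zip(actual, predicted))


def filter_runs_by_zscore(run_errors, threshold=2.0):
    """Return indices of runs whose error lies within `threshold`
    population standard deviations of the mean error.

    ASSUMPTION: outlier screening is done per training run; the
    abstract does not say where the z-score test is applied.
    """
    mu = mean(run_errors)
    sigma = pstdev(run_errors)
    if sigma == 0:
        # All runs scored identically, so nothing can be an outlier.
        return list(range(len(run_errors)))
    return [i for i, e in enumerate(run_errors)
            if abs((e - mu) / sigma) <= threshold]


def average_forecast(run_forecasts, keep):
    """Average the forecasts of the kept runs, one value per time step.

    ASSUMPTION: averaging is a hypothetical way to combine runs.
    """
    kept = [run_forecasts[i] for i in keep]
    return [mean(step) for step in zip(*kept)]
```

With six runs whose validation errors are `[5.0, 5.2, 4.9, 5.1, 5.0, 20.0]`, the last run is discarded and the first five are kept. Note that with a population standard deviation, the largest possible z-score among n values is the square root of n - 1, so a threshold of 2.0 can only reject a run when there are more than five runs.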

Academic department under which the project should be listed

CCSE - Information Technology

Primary Investigator (PI) Name

Hossain Shahriar


r-LSTM: Time Series Forecasting for COVID-19 Confirmed Cases with LSTM-based Framework
