COVID-19 Outbreak Prediction and Analysis using Self Reported Symptoms

Rohan Sukumaran$^{*1}$, Parth Patwa$^{*1}$, Sethuraman T V$^{*1}$, Sheshank Shankar$^{1}$, Rishank Kanaparti$^{1}$, Joseph Bae$^{1,2}$, Yash Mathur$^{1}$, Abhishek Singh$^{4}$, Ayush Chopra$^{4}$, Myungsun Kang$^{1}$, Priya Ramaswamy$^{1,3}$, and Ramesh Raskar$^{1,4}$

$^1$PathCheck Foundation
rohan.sukumaran@pathcheck.org, parth.patwa@pathcheck.org, sethu.ramantv@pathcheck.org
$^2$Stony Brook Medicine
$^3$University of California San Francisco
$^4$MIT Media Lab

* Equal contribution

Abstract: It is crucial for policymakers to understand the community prevalence of COVID-19 so that resources to combat it can be allocated and prioritized effectively during the pandemic. Traditionally, community prevalence has been assessed through diagnostic and antibody testing data. However, despite the increasing availability of COVID-19 testing, the required level of testing has not been met in parts of the globe, creating a need for an alternative method for communities to determine disease prevalence. This is further complicated by the observation that COVID-19 prevalence and spread vary across different spatial, temporal, and demographic verticals. In this study, we examine trends in the spread of COVID-19 by using the results of self-reported COVID-19 symptom surveys as a complement to COVID-19 testing reports. This allows us to assess community disease prevalence even in areas with low COVID-19 testing capacity. Using individually reported symptom data from various populations, our method predicts the likely percentage of the population that tested positive for COVID-19. We achieved a mean absolute error (MAE) of 1.14 and a mean relative error (MRE) of 60.40%, with a 95% confidence interval of [60.12, 60.67]. This implies that our model's predictions deviate from the actual count by about +/- 1140 cases in a population of 1 million. In addition, we forecast the location-wise percentage of the population testing positive for the next 30 days using self-reported symptom data from previous days. The MAE for this method is as low as 0.15 (MRE of 11.28%, with a 95% confidence interval of [10.9, 11.6]) for New York. We present an analysis of these results, exposing various clinical attributes of interest across different demographics. Lastly, we qualitatively analyze how various policy enactments (testing, curfew) affect the prevalence of COVID-19 in a community.
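
For readers unfamiliar with the two error metrics named in the abstract, the following is a minimal sketch of how MAE and MRE are commonly computed. It assumes MRE is the mean of the absolute error divided by the actual value, expressed as a percentage; the abstract does not state the exact definition used, and the function names and sample values below are hypothetical illustrations, not data from the study.

```python
from statistics import mean

def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))

def mean_relative_error(actual, predicted):
    """Average absolute error relative to the actual value, in percent.

    Assumed definition; actual values must be non-zero.
    """
    return 100 * mean(abs(a - p) / a for a, p in zip(actual, predicted))

# Hypothetical percentages of a population testing positive (not study data).
actual = [2.0, 4.0, 5.0]
predicted = [1.0, 5.0, 5.5]

mae = mean_absolute_error(actual, predicted)
mre = mean_relative_error(actual, predicted)
print(f"MAE = {mae:.2f}, MRE = {mre:.2f}%")
```

Under this definition, a small MAE can coexist with a large MRE when the actual values are themselves small, since each error is divided by the value it is measured against.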

Keywords: Machine Learning • COVID-19 • Outbreak Prediction • Time Series

DOI: https://doi.org/10.35566/jbds/v1n1/p8

PDF: v1n1p8.pdf

Citation: (APA style) Sukumaran, R., Patwa, P., V, S. T., Shankar, S., Kanaparti, R., Bae, J., Mathur, Y., Singh, A., Chopra, A., Kang, M., Ramaswamy, P., & Raskar, R. (2021). COVID-19 Outbreak Prediction and Analysis using Self Reported Symptoms. Journal of Behavioral Data Science, 1(1), 154–169. https://doi.org/10.35566/jbds/v1n1/p8

BibTeX format:

@Article{Sukumaran2021,
  author  = {Rohan Sukumaran and Parth Patwa and Sethuraman {T V} and Sheshank Shankar and Rishank Kanaparti and Joseph Bae and Yash Mathur and Abhishek Singh and Ayush Chopra and Myungsun Kang and Priya Ramaswamy and Ramesh Raskar},
  journal = {Journal of Behavioral Data Science},
  title   = {COVID-19 Outbreak Prediction and Analysis using Self Reported Symptoms},
  year    = {2021},
  number  = {1},
  pages   = {154--169},
  volume  = {1},
  doi     = {10.35566/jbds/v1n1/p8},
}