View interactive dashboard

In-depth look at COVID-19 cases/deaths in counties across the US.

Find similar counties

Find and compare similar counties based on different attributes.

The Yu Group at UC Berkeley Statistics / EECS / CCB is working to help forecast the severity of the epidemic for individual counties and hospitals in the US. We develop interpretable models (updated daily) and curate data to predict the trajectory of COVID-19-related deaths. This website provides access to those predictions, in the form of interactive visualizations. We are collaborating with Response4Life to blunt the effect of COVID-19 through the production and appropriate distribution of PPE, medical equipment, and medical personnel to healthcare facilities across the United States.

For hospital-level predictions, please visit our hospitalization prediction page, where you can upload data for a specific hospital and download the prediction results for that hospital. The uploaded data are used only temporarily for prediction and are not collected.

Visualizations

Google Sheets with our daily updated predictions:

County-level gsheet
Hospital-level gsheet
Get predictions for your own data

This link lets hospitals upload their hospitalization data and get 14-day prediction results. The uploaded data are used only temporarily for prediction and are not stored in any form.

View interactive map in fullscreen

COVID pandemic severity index (CPSI): this index is designed to aid the distribution of medical resources to hospitals. It takes one of three values (3: High, 2: Medium, 1: Low), indicating the severity of the COVID-19 outbreak for a hospital on a given day. It is calculated in three steps (more details here):

1. County-level predictions of the number of deaths are modeled.
2. County-level predictions are allocated to hospitals within each county in proportion to their total number of employees.
3. The final value is decided by thresholding the mean of two numbers: (i) the percentile of cumulative deaths so far and (ii) the percentile of predicted new deaths in the next few days.
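
The allocation and thresholding steps can be sketched in a few lines of Python. This is a minimal illustration, not our production code: the function names, the percentile definition (share of hospitals at or below a value), and the cut-offs of 50 and 80 are assumptions made for the example, since the actual thresholds are not stated here. The county-level prediction from step 1 is taken as an input.

```python
def allocate_to_hospitals(county_deaths, employees):
    """Step 2: split a county-level number across the county's hospitals
    in proportion to each hospital's total number of employees."""
    total = sum(employees.values())
    return {name: county_deaths * n / total for name, n in employees.items()}


def percentile_rank(values):
    """Percentile (0-100) of each value: share of values less than or equal to it.
    This particular definition is an assumption for the sketch."""
    n = len(values)
    return [100.0 * sum(v <= x for v in values) / n for x in values]


def severity_index(cum_deaths, pred_new_deaths, low_cut=50.0, high_cut=80.0):
    """Step 3: threshold the mean of the two percentiles into 1 (Low),
    2 (Medium), or 3 (High). low_cut and high_cut are hypothetical values."""
    p_cum = percentile_rank(cum_deaths)
    p_new = percentile_rank(pred_new_deaths)
    levels = []
    for a, b in zip(p_cum, p_new):
        score = (a + b) / 2
        if score >= high_cut:
            levels.append(3)
        elif score >= low_cut:
            levels.append(2)
        else:
            levels.append(1)
    return levels


# Example: a county with 10 predicted deaths and two hospitals.
shares = allocate_to_hospitals(10, {"A": 300, "B": 100})
levels = severity_index([0, 2, 10, 40], [0.1, 1, 3, 12])
```

In this sketch, a hospital scores High only when it ranks near the top on both cumulative and predicted new deaths, or very high on one and moderately high on the other.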

Data

View data on Github

We have compiled and cleaned a large corpus of county-level and hospital-level data from a variety of public sources to aid data science efforts to combat COVID-19. At the county level, our data include COVID-19 cases/deaths from USA Facts and NYT, automatically updated every day, along with demographic information, health resource availability, COVID-19 health risk factors, and social mobility information. At the hospital level, our data include the location of the hospital, the number of ICU beds, the total number of employees, and the hospital type.

Feature correlations: This heatmap shows correlations between some of the features we have collected at the county level.

Models

View modeling on Github

Combined Linear and Exponential Predictors (CLEP)

CLEP calculates a weighted average of the individual predictors' forecasts, giving higher weight to the models with better historical performance.
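
As a rough sketch of the idea, the Python below combines two forecasts with weights that shrink as a model's recent error grows. The exponential weighting, the use of mean absolute error, the parameter `c`, and the function name are assumptions chosen for illustration; see our paper for the weighting scheme actually used.

```python
import math


def combine_predictions(preds, past_errors, c=1.0):
    """Weighted average of forecasts.

    preds:       {model_name: forecast for the target day}
    past_errors: {model_name: list of recent absolute errors}
    c:           hypothetical sensitivity parameter; larger c penalizes
                 past errors more strongly.
    """
    # Lower mean past error gives a larger (unnormalized) score.
    scores = {
        name: math.exp(-c * sum(errs) / len(errs))
        for name, errs in past_errors.items()
    }
    total = sum(scores.values())
    weights = {name: s / total for name, s in scores.items()}
    combined = sum(weights[name] * preds[name] for name in preds)
    return combined, weights


# Example: the linear model was more accurate recently, so it dominates.
combined, weights = combine_predictions(
    {"linear": 100.0, "exponential": 120.0},
    {"linear": [1.0, 1.0], "exponential": [3.0, 3.0]},
)
```

Because the weights are normalized to sum to one, the combined forecast always lies between the smallest and largest individual forecast.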

We develop simple, interpretable models for predicting the trajectory of COVID-19-related deaths at the county level in the United States (updated daily). Our models show that most counties are experiencing exponential growth that can be accurately modeled several days into the future. However, we also find that some counties are starting to experience sub-exponential growth, possibly due to the “flattening-the-curve” impact of interventions such as social distancing and shelter-in-place orders. Details are in our paper.

7-day forecasts for selected counties: Prediction intervals are based on the historical performance of our predictors (narrower for counties where the forecasts were accurate). If we denote err as the largest normalized absolute error for a given county in the past five days, then our prediction interval has the form [prediction * (1 - err), prediction * (1 + err)].
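
The interval formula can be written out as a short Python sketch. The function names are hypothetical, and normalizing each absolute error by the predicted value is an assumption made for this example; with that choice, the interval built from a past prediction would have contained every observation in the window.

```python
def max_normalized_abs_error(past_preds, past_actuals, window=5):
    """Largest normalized absolute error over the last `window` days.
    Normalizing by the prediction is an assumption of this sketch."""
    pairs = list(zip(past_preds, past_actuals))[-window:]
    return max(abs(actual - pred) / max(abs(pred), 1e-9) for pred, actual in pairs)


def prediction_interval(prediction, err):
    """Interval of the form [prediction * (1 - err), prediction * (1 + err)]."""
    return prediction * (1 - err), prediction * (1 + err)


# Example: five past days of forecasts and observed deaths for one county.
err = max_normalized_abs_error([10, 20, 40, 50, 100], [12, 18, 40, 55, 90])
lower, upper = prediction_interval(200, err)
```

A county whose recent forecasts were within 5% of the observed counts therefore gets a much narrower interval than one where a forecast missed by 20%.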

Our team

Thanks to support from AWS and Google.
