

{"id":125687,"date":"2020-04-23T16:14:21","date_gmt":"2020-04-23T10:44:21","guid":{"rendered":"https:\/\/analyticstraining.com\/?p=16093"},"modified":"2022-11-22T16:47:17","modified_gmt":"2022-11-22T11:17:17","slug":"covid-19-an-attempt-to-predict-confirmed-cases-in-india","status":"publish","type":"post","link":"https:\/\/www.jigsawacademy.com\/covid-19-an-attempt-to-predict-confirmed-cases-in-india\/","title":{"rendered":"COVID-19: An attempt to predict Confirmed Cases in India"},"content":{"rendered":"<p><span style=\"font-weight: 400;\">The COVID-19 pandemic continues to ravage the world. Even as global infections crossed <\/span><span style=\"font-weight: 400;\">2.6 million<\/span><span style=\"font-weight: 400;\">, India\u2019s number at <\/span><span style=\"font-weight: 400;\">around<\/span><span style=\"font-weight: 400;\"> 21,370 seems modest, given we are home to one-sixth of the world\u2019s population. Based on data from Johns Hopkins, in per capita terms, only <\/span><span style=\"font-weight: 400;\">16<\/span><span style=\"font-weight: 400;\"> in a million people in India are infected by COVID-19, vs 338 in a million people globally (as of 22nd April 2020). Things in India are not as bad&#8230; but what does the future look like?<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Given my interest in numbers and trends, I have been trying to figure out if we could forecast the trends for COVID-19. 
I requested the data behind popular forecasts such as <\/span><a href=\"https:\/\/twitter.com\/JohnsHopkins\/status\/1243273073756901383\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">Johns Hopkins-CDDEP<\/span><\/a><span style=\"font-weight: 400;\">,<\/span><a href=\"https:\/\/indianexpress.com\/article\/india\/bcg-refutes-reports-saying-india-may-not-lift-lockdown-restrictions-before-september-coronavirus-6346133\/\" target=\"_blank\" rel=\"noopener\"> <span style=\"font-weight: 400;\">BCG<\/span><\/a><span style=\"font-weight: 400;\"> and others, but these were allegedly either disputed or not meant for public dissemination, and I did not get a response. In general, I noticed that most of the forecasts did not provide day-wise numbers. With charts on a log scale and no supporting numbers, it was difficult to decipher what the forecasters wanted to say in their presentations and reports. I would not have been able to read even my own forecast chart without the accompanying numbers.<\/span><\/p>\n<p><span style=\"font-weight: 400;\"><a href=\"https:\/\/analyticstraining.com\/wp-content\/uploads\/2020\/04\/imageLikeEmbed.png\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" class=\"aligncenter wp-image-16095 size-full\" src=\"https:\/\/analyticstraining.com\/wp-content\/uploads\/2020\/04\/imageLikeEmbed.png\" alt=\"\" width=\"2048\" height=\"726\" title=\"\"><\/a>Expert forecasts from the US compiled by<\/span><a href=\"https:\/\/fivethirtyeight.com\/features\/experts-think-were-flattening-the-coronavirus-curve-but-hospitalizations-havent-peaked-yet\/\" target=\"_blank\" rel=\"noopener\"> <span style=\"font-weight: 400;\">fivethirtyeight<\/span><\/a><span style=\"font-weight: 400;\"> showed huge variation. 
My colleague<\/span><a href=\"https:\/\/www.linkedin.com\/in\/gunnvant-singh-saini-18199a36\/\" target=\"_blank\" rel=\"noopener\"> <span style=\"font-weight: 400;\">Gunnvant<\/span><\/a> <span style=\"font-weight: 400;\">has created a<\/span><a href=\"https:\/\/github.com\/Gunnvant\/covid_scrapper\" target=\"_blank\" rel=\"noopener\"> <span style=\"font-weight: 400;\">data scraper and visualization tool<\/span><\/a><span style=\"font-weight: 400;\"> for COVID-19. However, I did not find any good forecasts for India.&nbsp;<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Some bad ones are out there. \u201cA five-member Central team has projected that the number of COVID-19 cases in Mumbai will touch an estimated 42,604 by April 30 and spiral to 6,56,407 by May 15.<\/span><a href=\"https:\/\/www.thehindu.com\/news\/cities\/mumbai\/coronavirus-huge-spike-in-cases-likely-in-mumbai-says-central-panel\/article31400889.ece\" target=\"_blank\" rel=\"noopener\"> <span style=\"font-weight: 400;\">Based on mathematical modelling for Mumbai by the Union Ministry of Health on April 16<\/span><\/a><span style=\"font-weight: 400;\">\u201d<\/span><\/p>\n<p><span style=\"font-weight: 400;\"><a href=\"https:\/\/analyticstraining.com\/wp-content\/uploads\/2020\/04\/pasted-image-0-3.png\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" class=\"aligncenter wp-image-16098 size-full\" src=\"https:\/\/analyticstraining.com\/wp-content\/uploads\/2020\/04\/pasted-image-0-3.png\" alt=\"\" width=\"452\" height=\"611\" title=\"\"><\/a>Source:<\/span><a href=\"https:\/\/www.thehindu.com\/news\/cities\/mumbai\/coronavirus-huge-spike-in-cases-likely-in-mumbai-says-central-panel\/article31400889.ece\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\"> The Hindu<\/span><\/a><\/p>\n<p><span style=\"font-weight: 400;\">The assumptions are too simplistic. 3.8 doubling maintained throughout the forecast period. 
Such high numbers are great for scaremongering, grabbing eyeballs and making headlines. The state government is disputing these numbers. They should. Such \u201cmathematical modelling\u201d has been done by team members who had no understanding of either mathematics or modelling. These forecasts add negligible value. May I direct these ill-trained forecasters to<\/span><a href=\"https:\/\/www.jigsawacademy.com\/online-analytics-training\/?query=data%20science\"> <span style=\"font-weight: 400;\">some courses<\/span><\/a><span style=\"font-weight: 400;\"> at<\/span><a href=\"https:\/\/www.linkedin.com\/company\/jigsaw-academy\/\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\"> Jigsaw Academy &#8230;<\/span><\/a><\/p>\n<p><span style=\"font-weight: 400;\">Given my absolute lack of knowledge of diseases, I was initially hesitant to attempt a forecast. I take solace in the words of Mark Weir of Ohio State&#8217;s ecology, epidemiology, and population health program:<\/span><\/p>\n<p><span style=\"font-weight: 400;\"><a href=\"https:\/\/analyticstraining.com\/wp-content\/uploads\/2020\/04\/0.jpg\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" class=\"aligncenter wp-image-16094 size-full\" src=\"https:\/\/analyticstraining.com\/wp-content\/uploads\/2020\/04\/0.jpg\" alt=\"\" width=\"569\" height=\"295\" title=\"\"><\/a>Source:<\/span><a href=\"https:\/\/fivethirtyeight.com\/features\/a-comic-strip-tour-of-the-wild-world-of-pandemic-modeling\/\" target=\"_blank\" rel=\"noopener\"> <span style=\"font-weight: 400;\">fivethirtyeight.com<\/span><\/a><\/p>\n<p><span style=\"font-weight: 400;\">I looked at this as a data forecasting problem and decided to build a simple time series model. Having spent over a decade forecasting revenues, profits and the unknowable stock prices of my coverage universe, I was used to being wrong and to forecasting things I had no idea about! 
Here is the result, a link to my predictions of confirmed COVID-19 infections in India:<\/span><a href=\"https:\/\/docs.google.com\/spreadsheets\/d\/1dc9hwCSz7hoqkgymPghar0AnN80weDgRICQ2qXrmxB0\/edit?usp=sharing\" target=\"_blank\" rel=\"noopener\"> <span style=\"font-weight: 400;\">https:\/\/docs.google.com\/spreadsheets\/d\/1dc9hwCSz7hoqkgymPghar0AnN80weDgRICQ2qXrmxB0\/edit?usp=sharing<\/span><\/a><\/p>\n<p><span style=\"font-weight: 400;\">When I built the model, these are the things I wanted to have:<\/span><\/p>\n<ol>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">A narrow gap between the upper and lower bounds of the estimates. The range is therefore not a 95% likelihood range.<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">A mean estimate that, hopefully, is within 5% of the actuals.<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Steady\/sticky estimates that update as new information comes in but are not too sensitive to minor changes.<\/span><\/li>\n<\/ol>\n<p><span style=\"font-weight: 400;\">You may view the details in the Google sheet. However, you will not be able to edit or change anything. You may copy it to your own Google Drive if you would like to make any changes. All changes to the forecast are recorded and, ideally, the forecasts will be updated once a day.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The data is sourced from Johns Hopkins (details in the Google spreadsheet). As some of the data is country-wise and some is state-wise (for countries like the US, China and Australia), we aggregate it with groupby in Python and download the result as an Excel file. We use a simple time series forecasting model to predict the number of confirmed COVID-19 infections over the next seven days. We also highlight the upper and lower bounds of the estimates. We check the difference between our mean estimate and the actual numbers. 
The data for my daily forecasts is available from 11th April, and since then the actual number has been within 5% of the forecast. Here are my forecasts for the next seven days.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Source:<\/span><a href=\"https:\/\/docs.google.com\/spreadsheets\/d\/1dc9hwCSz7hoqkgymPghar0AnN80weDgRICQ2qXrmxB0\/edit?usp=sharing\" target=\"_blank\" rel=\"noopener\"> <span style=\"font-weight: 400;\">https:\/\/docs.google.com\/spreadsheets\/d\/1dc9hwCSz7hoqkgymPghar0AnN80weDgRICQ2qXrmxB0\/edit?usp=sharing<\/span><\/a><\/p>\n<p><span style=\"font-weight: 400;\">The model is a work in progress and I am considering some fine-tuning. The lower bound is easier to predict as it can\u2019t be less than the actuals. The upper bound needs to be tested, especially once we are out of lockdown, which may increase the rate of spread. I look forward to extending the duration of the forecast and to seeing if we can predict the peak of the infection in India. I hope to share the model soon.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Given these limitations, honestly, I am surprised the simple model has reasonably good predictive power. I decided to post it on a public forum to (i) make myself update it daily and (ii) see if the model continues to predict the numbers as well, especially under public scrutiny!<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Note that my predictions keep changing each day as fresh data comes in. My prediction for today\u2019s (23rd April) confirmed cases has increased by 4% over the last seven days. I am looking for the peak and hoping to see the numbers fall. Hopefully, my numbers will prove excessive and we will see them reduce&#8230; Unfortunately, the forecasts seem to be edging up. All models are right until they go wrong! 
Hopefully, this one errs by predicting too much, and the numbers end up being lower than forecast\u2026<\/span><\/p>\n<p><span style=\"font-weight: 400;\"><a href=\"https:\/\/analyticstraining.com\/wp-content\/uploads\/2020\/04\/pasted-image-0-1-1.png\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" class=\"aligncenter wp-image-16096 size-full\" src=\"https:\/\/analyticstraining.com\/wp-content\/uploads\/2020\/04\/pasted-image-0-1-1.png\" alt=\"\" width=\"1135\" height=\"192\" title=\"\"><\/a>Let&#8217;s have a more sensible discussion on numbers and expectations. I estimated that India would be at around 11,000 confirmed infections on 14th April and that there would be a push to keep the lockdown intact. With cases around 20,000 currently, going to around 35,000 by 30th April and expected to cross 40,000 by 3rd May, are we looking at a lockdown that continues at least partially? We will know soon enough&#8230;<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Ok, we all agree that Mumbai hitting 6.5 lakh cases by 15th May is baloney. However, while the experts in the <\/span><a href=\"https:\/\/www.thehindu.com\/news\/cities\/mumbai\/coronavirus-huge-spike-in-cases-likely-in-mumbai-says-central-panel\/article31400889.ece\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">Union Ministry of Health<\/span><\/a><span style=\"font-weight: 400;\"> expect over 42,000 confirmed cases in Mumbai by 30th April, do I have the audacity to suggest that the whole of India will have fewer than 42,000 cases by 30th April?<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Yes, I do. Game on! 
And because I back myself, may the better forecaster win!<\/span><\/p>\n<p><span style=\"font-weight: 400;\"><a href=\"https:\/\/analyticstraining.com\/wp-content\/uploads\/2020\/04\/pasted-image-0-2.png\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" class=\"aligncenter wp-image-16097 size-full\" src=\"https:\/\/analyticstraining.com\/wp-content\/uploads\/2020\/04\/pasted-image-0-2.png\" alt=\"\" width=\"377\" height=\"600\" title=\"\"><\/a>Disclaimer:<\/span><\/p>\n<p><span style=\"font-weight: 400;\">I offer my views with the knowledge that diseases, medicine and healthcare are not my area of expertise. This is an attempt at predictive time series analysis. There are a lot of bad models out there, and I am confident this will be better than most.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Also, given that many discussions on the topic have been polarized by political leanings and viewpoints, I would like to stress that these views are not meant to promote any ideology or offer judgment on government policy decisions.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">My only wish is that the governments, both state and central, focus on improving healthcare infrastructure and facilities in India, while they leave the forecasting to those who can!<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>The COVID-19 pandemic continues to ravage the world. Even as global infections crossed 2.6 million, India\u2019s number at around 21,370 seems modest, given we are home to one-sixth of the world\u2019s population. 
Based on data from Johns Hopkins, in per capita terms, only 16 in a million people in India are infected by COVID-19, vs [&hellip;]<\/p>\n","protected":false},"author":168,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[1262],"tags":[1355,1381,1382,1369],"form":[1499],"acf":[],"_links":{"self":[{"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/posts\/125687"}],"collection":[{"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/users\/168"}],"replies":[{"embeddable":true,"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/comments?post=125687"}],"version-history":[{"count":1,"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/posts\/125687\/revisions"}],"predecessor-version":[{"id":260048,"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/posts\/125687\/revisions\/260048"}],"wp:attachment":[{"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/media?parent=125687"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/categories?post=125687"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/tags?post=125687"},{"taxonomy":"form","embeddable":true,"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/form?post=125687"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}