

{"id":125574,"date":"2020-02-02T16:56:06","date_gmt":"2020-02-02T11:26:06","guid":{"rendered":"https:\/\/analyticstraining.com\/?p=15700"},"modified":"2022-11-22T16:49:25","modified_gmt":"2022-11-22T11:19:25","slug":"who-is-as-cheery-as-santa-claus-indias-finance-ministers","status":"publish","type":"post","link":"https:\/\/www.jigsawacademy.com\/who-is-as-cheery-as-santa-claus-indias-finance-ministers\/","title":{"rendered":"Who is as cheery as Santa Claus? India\u2019s Finance Minister!"},"content":{"rendered":"<p><span style=\"font-weight: 400;\">Given my background in finance, I celebrate the new year on 1<\/span><span style=\"font-weight: 400;\">st<\/span><span style=\"font-weight: 400;\"> April<\/span><span style=\"font-weight: 400;\"> and the Union Budget is as important an event as Christmas!<\/span><\/p>\n<p><span style=\"font-weight: 400;\">As an NLP exercise, I decided to use the budget speeches from the last decade. What is NLP, you ask? Natural Language Processing (NLP) is an interdisciplinary branch of artificial intelligence, computer science, and linguistics that helps program computers to understand, interpret, and generate natural human language. Do read our earlier blog post, <\/span><a href=\"https:\/\/analyticstraining.com\/a-quick-introduction-to-natural-language-processing\/\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">A Quick Introduction To Natural Language Processing.<\/span><\/a><\/p>\n<p><span style=\"font-weight: 400;\">Alexa, Siri, and Google Assistant are all examples of NLP in practice. 
NLP has numerous applications such as part-of-speech tagging, Named Entity Recognition (NER), question-answering, speech recognition, text-to-speech and speech-to-text, topic modeling, sentiment classification, language modeling, and translation.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In this article, we will focus on the sentiment analysis of the budget speeches by Indian Finance Ministers. We have had 12 budget presentations (including 2 interim ones) in the past ten years. I downloaded the data from the <\/span><a href=\"https:\/\/www.indiabudget.gov.in\/bspeech.php\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">Government of India\u2019s site<\/span><\/a><span style=\"font-weight: 400;\">. However, the Sarkar does not pay much attention to the details and some years have wrong\/dead links!&nbsp; I sourced the missing speeches from national newspapers.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">I used a loop to read the data and store it in a pandas DataFrame. I used regular expressions to clean the data. Using sklearn\u2019s CountVectorizer, I created a document-term matrix excluding common English stop words. 
We did our analysis on the cleaned data.<\/span><\/p>\n<p><strong>Here are the top 15 words used in a few of the budget speeches:<\/strong><\/p>\n<p><b>Feb 2010 &#8211; Pranab Mukherjee<\/b><\/p>\n<p><span style=\"font-weight: 400;\">cent, propose, crore, year, duty, government, tax, sector, development, growth, budget, provide, fiscal, central<\/span><\/p>\n<p><b>Feb 2013 &#8211; P Chidambaram<\/b><\/p>\n<p><span style=\"font-weight: 400;\">propose, crore, percent, tax, provide, government, year, sector, investment, development, funds, fund, rate, plan<\/span><\/p>\n<p><b>Feb 2015 &#8211; Arun Jaitley<\/b><\/p>\n<p><span style=\"font-weight: 400;\">tax, crore, proposed, India, act, service, government, excise, duty, year, investment, madam, provide, credit<\/span><\/p>\n<p><b>Jul 2019 &#8211; Nirmala Sitharaman&nbsp;<\/b><\/p>\n<p><span style=\"font-weight: 400;\">tax, government, proposed, India, provide, shall, lakh, section, scheme, crore, income, act, years, year<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Looking at these lists, I added some more words that are not relevant to the analysis as extra stop words: add_stop_words = ['crore', 'year', 'propose', 'provide', 'sector', 'lakh', 'years', 'proposed', 'new', 'cent', 'percent', 'shall']<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Then we built word clouds for the budget speeches from the last decade.<\/span><\/p>\n<p><span style=\"font-weight: 400;\"><img decoding=\"async\" class=\"aligncenter wp-image-15705 size-full\" src=\"https:\/\/analyticstraining.com\/wp-content\/uploads\/2020\/01\/image5.png\" alt=\"\" width=\"992\" height=\"388\" title=\"\">Do you notice any trends and patterns in the word clouds? Looking at the word clouds, what other words would you remove by adding them to the add_stop_words list? 
Which words do you think would be among the most commonly used words in the 2020 Union Budget?<\/span><\/p>\n<p><span style=\"font-weight: 400;\"><img decoding=\"async\" class=\"aligncenter wp-image-15701 size-full\" src=\"https:\/\/analyticstraining.com\/wp-content\/uploads\/2020\/01\/image1.png\" alt=\"\" width=\"991\" height=\"383\" title=\"\">I did a short analysis of the vocabulary of Finance Ministers. It would have been interesting to see how <\/span><span style=\"font-weight: 400;\">Shashi Tharoor would have measured up if he were the Finance Minister, don\u2019t you think?<\/span><\/p>\n<p><span style=\"font-weight: 400;\">We also did a sentiment analysis using the textblob library.<\/span><\/p>\n<p><span style=\"font-weight: 400;\"><img decoding=\"async\" class=\"aligncenter wp-image-15702 size-full\" src=\"https:\/\/analyticstraining.com\/wp-content\/uploads\/2020\/01\/image2.png\" alt=\"\" width=\"663\" height=\"522\" title=\"\">As we can see, our finance ministers are a positive lot. As cheery as Santa Claus! Ho Ho!!<\/span><\/p>\n<p><span style=\"font-weight: 400;\">As noted in 2013 and 2018, the finance ministers tend to be more opinionated during the final full budget before the national elections.&nbsp;<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Finally, we analyzed the polarity for the budget speeches.<\/span><\/p>\n<p><span style=\"font-weight: 400;\"><img decoding=\"async\" class=\"aligncenter wp-image-15703 size-full\" src=\"https:\/\/analyticstraining.com\/wp-content\/uploads\/2020\/01\/image3.png\" alt=\"\" width=\"708\" height=\"538\" title=\"\">Are you wondering what polarity is? In brief, polarity measures how positive or negative the emotion expressed in a sentence is; TextBlob scores it from -1 (most negative) to +1 (most positive). The strength of sentiments or opinions is linked to the intensity of emotions, such as happiness and anger. 
It does appear that the mood dips towards the end of the budget speeches.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Interested in reading more about how TextBlob calculates sentiments and polarity? You can read more <\/span><a href=\"https:\/\/github.com\/sloria\/TextBlob\/tree\/dev\/textblob\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">here<\/span><\/a><span style=\"font-weight: 400;\">.&nbsp;<\/span><\/p>\n<p><span style=\"font-weight: 400;\">What else would you do? Use a bag of words\/n-grams? Use stemming and lemmatization? Or do you side with Peter Skomoroch, the Principal Data Scientist at LinkedIn?<\/span><\/p>\n<p><span style=\"font-weight: 400;\"><img decoding=\"async\" class=\"aligncenter wp-image-15704 size-full\" src=\"https:\/\/analyticstraining.com\/wp-content\/uploads\/2020\/01\/image4.png\" alt=\"\" width=\"775\" height=\"622\" title=\"\">Interested in learning more about NLP? Join the <\/span><a href=\"https:\/\/www.jigsawacademy.com\/pgpdm\/\"><span style=\"font-weight: 400;\">Postgraduate Program in Data Science and Machine Learning (PGPDM)<\/span><\/a><span style=\"font-weight: 400;\"> course, offered by Jigsaw Academy in collaboration with the University of Chicago, which has a new module on AI and DL. NLP is covered in detail.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">We also cover text analysis in IIM Indore\u2019s <\/span><a href=\"https:\/\/www.jigsawacademy.com\/integrated-program-in-business-analytics\/\"><span style=\"font-weight: 400;\">Integrated program in Business Analytics (IPBA)<\/span><\/a><span style=\"font-weight: 400;\"> course.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Given my background in finance, I celebrate the new year on 1st April and the Union Budget is as important an event as Christmas! As an NLP exercise, I decided to use the budget speeches from the last decade. What is NLP, you ask? 
Natural Language Processing (NLP) is an interdisciplinary branch of artificial [&hellip;]<\/p>\n","protected":false},"author":168,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[1262],"tags":[702,86,95,1263,1264,1265,914,915,110,539,1266],"form":[1499],"acf":[],"_links":{"self":[{"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/posts\/125574"}],"collection":[{"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/users\/168"}],"replies":[{"embeddable":true,"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/comments?post=125574"}],"version-history":[{"count":1,"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/posts\/125574\/revisions"}],"predecessor-version":[{"id":260052,"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/posts\/125574\/revisions\/260052"}],"wp:attachment":[{"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/media?parent=125574"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/categories?post=125574"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/tags?post=125574"},{"taxonomy":"form","embeddable":true,"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/form?post=125574"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}