

{"id":122545,"date":"2015-02-18T19:03:29","date_gmt":"2015-02-18T13:33:29","guid":{"rendered":"http:\/\/analyticstraining.com\/?p=5996"},"modified":"2020-06-01T16:37:44","modified_gmt":"2020-06-01T11:07:44","slug":"using-pipes-r","status":"publish","type":"post","link":"https:\/\/www.jigsawacademy.com\/using-pipes-r\/","title":{"rendered":"Using Pipes in R"},"content":{"rendered":"<p>By: <strong>Gunnvant Singh<\/strong>, Faculty at Jigsaw<\/p>\n<p>&nbsp;<\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">Let&#8217;s be honest, once you start doing serious data wrangling in R, the code starts appearing ugly. One of the reasons for this is the functional nature of R. In order to accomplish a decent data manipulation task one has to write nested functions. Now, writing nested functions is not that difficult; what&#8217;s difficult is reading them! Consider the code below:<\/span><\/span><\/p>\n<p align=\"justify\"><!--more--><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">library(babynames)<\/span><\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">library(dplyr)<\/span><\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">data(babynames)<\/span><\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">head(babynames)<\/span><\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">sum(select(filter(babynames,sex==&quot;F&quot;,name==&quot;Mary&quot;),n))<\/span><\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">Now, if you pay attention to the last line, it takes a while to figure out what is happening in that piece 
of code. Let me break it down for you:<\/span><\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">1. First, I subset the data based on sex and name. The call filter(babynames,sex==&quot;F&quot;,name==&quot;Mary&quot;) keeps all the observations in the data where the sex is female and the name is Mary.<\/span><\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">2. Next, the function &#8216;select&#8217; picks the column named &#8216;n&#8217; from the subsetted data.<\/span><\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">3. Finally, the function sum() adds up the numbers in the selected column &#8216;n&#8217;.<\/span><\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">Now, the key to understanding this line of code is to read it from the inside out. 
You will all agree that this is not very straightforward, and one has to think it through to understand the code.<\/span><\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">What we just did above by nesting the functions &#8216;filter&#8217;, &#8216;select&#8217; and &#8216;sum&#8217; is called <i>function composition.<\/i> Essentially, if I have functions &#8216;f&#8217;, &#8216;g&#8217; and &#8216;z&#8217;, this is what I am trying to achieve through nesting:<\/span><\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">z(g(f(x))) ~ sum(select(filter(data))).<\/span><\/span><\/p>\n<div class=\"_form_3\"><\/div>\n<p><script src=\"https:\/\/jigsawacademy67103.activehosted.com\/f\/embed.php?id=3\" type=\"text\/javascript\" charset=\"utf-8\"><\/script><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">Can function composition be achieved in R without using nested functions? Well, until about a year ago, the answer to this question would have been &#8216;no&#8217;. 
But then two important things happened:<\/span><\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">(1) Stefan Milton Bache came out with a package called &#8216;magrittr&#8217; in January 2014, implementing the %&gt;% (pipe) operator.<\/span><\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">(2) Hadley Wickham adopted Stefan&#8217;s pipe operator in his dplyr package.<\/span><\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">Since dplyr is a very powerful data manipulation package in R, its adoption of the %&gt;% operator has contributed to the popularity of pipes in R.<\/span><\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">Before I describe the %&gt;% (pipe) operator in the context of R, let&#8217;s take a look at how piping works.<\/span><\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">When we use pipes, instead of providing the arguments directly inside the function, we provide the function&#8217;s arguments &#8216;near&#8217; the function:<\/span><\/span><\/p>\n<p align=\"justify\"><a href=\"https:\/\/www.jigsawacademy.com\/wp-content\/uploads\/2015\/02\/18-Feb.jpg\"><img decoding=\"async\" class=\"aligncenter size-full wp-image-5997\" src=\"https:\/\/www.jigsawacademy.com\/wp-content\/uploads\/2015\/02\/18-Feb.jpg\" alt=\"18 Feb\" width=\"583\" height=\"82\" title=\"\"><\/a><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">Let us see how the nested function call written above can be simplified using the %&gt;% operator. 
Consider the code below:<\/span><\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">babynames%&gt;%filter(sex==&quot;F&quot;,name==&quot;Mary&quot;)%&gt;%select(n)%&gt;%sum<\/span><\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">Three functions are used in this code:<\/span><\/span><\/p>\n<p align=\"justify\"><a href=\"https:\/\/www.jigsawacademy.com\/wp-content\/uploads\/2015\/02\/18-Febe.jpg\"><img decoding=\"async\" class=\"aligncenter size-full wp-image-5998\" src=\"https:\/\/www.jigsawacademy.com\/wp-content\/uploads\/2015\/02\/18-Febe.jpg\" alt=\"18 Febe\" width=\"587\" height=\"109\" title=\"\"><\/a><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">Let us look at how %&gt;% works:<\/span><\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\"><span style=\"color: #ff0000;\">babynames<\/span>%&gt;%filter(<span style=\"color: #ff0000;\">sex==&quot;F&quot;,name==&quot;Mary&quot;)<\/span>%&gt;%select(n)%&gt;%sum<\/span><\/span><\/p>\n<p class=\"western\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">babynames is piped to the filter function. 
Note that babynames is a data frame.<\/span><\/span><\/p>\n<p class=\"western\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\"><span style=\"color: #0000ff;\">babynames<\/span>%&gt;%<span style=\"color: #0000ff;\">filter(sex==&quot;F&quot;,name==&quot;Mary&quot;)<\/span>%&gt;%<span style=\"color: #009900;\">select(n)<\/span>%&gt;%sum<\/span><\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">The result of babynames%&gt;%filter(sex==&quot;F&quot;,name==&quot;Mary&quot;) is piped into the select function.<\/span><\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\"><span style=\"color: #ff3300;\">babynames<\/span>%&gt;%<span style=\"color: #ff3300;\">filter(sex==&quot;F&quot;,name==&quot;Mary&quot;)<\/span>%&gt;%<span style=\"color: #ff3300;\">select(n)<\/span>%&gt;%<span style=\"color: #0000ff;\">sum<\/span><\/span><\/span><\/p>\n<p align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">Finally, the result of select(n) is piped to the sum() function.<\/span><\/span><\/p>\n<p style=\"text-align: left;\" align=\"justify\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">Now, if we look at the code as a whole, babynames%&gt;%filter(sex==&quot;F&quot;,name==&quot;Mary&quot;)%&gt;%select(n)%&gt;%sum, we can read it as: take the data babynames, then subset it according to sex and name, then select column &#8216;n&#8217; from this subsetted data, and then find the sum of all the values in this column. 
Compare this with our original code: sum(select(filter(babynames,sex==&quot;F&quot;,name==&quot;Mary&quot;),n)).<\/span><\/span><\/p>\n<p class=\"western\"><span style=\"font-family: Calibri, sans-serif;\"><span style=\"font-size: medium;\">Clearly, piping enhances the readability of R code and makes complex data manipulation easier to write and follow.<\/span><\/span><\/p>\n<p>Interested in a career in Big Data? Check out\u00a0<a href=\"http:\/\/www.jigsawacademy.com\/big-data-specialist-wiley\">Jigsaw Academy&#8217;s Big Data courses<\/a>\u00a0and see how you can get trained to become a Big Data specialist.<\/p>\n<p class=\"western\">\u00a0Related Articles:<\/p>\n<p class=\"western\"><a href=\"http:\/\/analyticstraining.com\/2014\/the-power-of-r-and-why-its-an-essential-skill-for-data-analysts\/\" target=\"_blank\" rel=\"noopener noreferrer\">The Power of R \u2013 And Why it\u2019s an Essential Skill for Data Analysts<\/a><br \/>\n<a href=\"http:\/\/analyticstraining.com\/2014\/stringi-package-r\/\" target=\"_blank\" rel=\"noopener noreferrer\">Stringi Package in R<\/a><br \/>\n<a href=\"http:\/\/analyticstraining.com\/2014\/how-to-create-a-word-cloud-in-r\/\" target=\"_blank\" rel=\"noopener noreferrer\">How to Create a Word Cloud in R<\/a><\/p>\n<p class=\"western\">\n","protected":false},"excerpt":{"rendered":"<p>By: Gunnvant Singh, Faculty at Jigsaw &nbsp; Let&#8217;s be honest, once you start doing serious data wrangling in R, the code starts appearing ugly. One of the reasons for this is the functional nature of R. In order to accomplish a decent data manipulation task one has to write nested functions. 
Now, writing nested functions [&hellip;]<\/p>\n","protected":false},"author":105,"featured_media":122318,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[83,571,545,553,640,48],"form":[1499],"acf":[],"_links":{"self":[{"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/posts\/122545"}],"collection":[{"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/users\/105"}],"replies":[{"embeddable":true,"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/comments?post=122545"}],"version-history":[{"count":0,"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/posts\/122545\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/media\/122318"}],"wp:attachment":[{"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/media?parent=122545"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/categories?post=122545"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/tags?post=122545"},{"taxonomy":"form","embeddable":true,"href":"https:\/\/www.jigsawacademy.com\/wp-json\/wp\/v2\/form?post=122545"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}