<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Survey on R Views</title>
    <link>https://rviews.rstudio.com/tags/survey/</link>
    <description>Recent content in Survey on R Views</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en-us</language>
    <lastBuildDate>Tue, 07 Nov 2017 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://rviews.rstudio.com/tags/survey/" rel="self" type="application/rss+xml" />
    
    
    
    
    <item>
      <title>Automating Summary of Surveys with RMarkdown</title>
      <link>https://rviews.rstudio.com/2017/11/07/automating-summary-of-surveys-with-rmarkdown/</link>
      <pubDate>Tue, 07 Nov 2017 00:00:00 +0000</pubDate>
      
      <guid>https://rviews.rstudio.com/2017/11/07/automating-summary-of-surveys-with-rmarkdown/</guid>
      <description>
        


&lt;p&gt;This guide shows how to automate the summary of surveys with R and R Markdown using RStudio. This is great for portions of the document that don’t change (e.g., “the survey shows substantial partisan polarization”). The motivation is twofold: &lt;em&gt;efficiency&lt;/em&gt; (maximize the reusability of code, minimize copying and pasting errors) and &lt;em&gt;reproducibility&lt;/em&gt; (maximize the number of people and computers that can reproduce findings).&lt;/p&gt;
&lt;p&gt;The basic setup is to write an &lt;code&gt;Rmd&lt;/code&gt; file that will serve as a template, and then a short R script that loops over each data file (using &lt;code&gt;library(rmarkdown)&lt;/code&gt;). The &lt;code&gt;render&lt;/code&gt; function from that package then turns the &lt;code&gt;Rmd&lt;/code&gt; into documents or slides (typically in &lt;code&gt;PDF&lt;/code&gt;, &lt;code&gt;HTML&lt;/code&gt;, or &lt;code&gt;docx&lt;/code&gt;) by taking file metadata as a &lt;a href=&#34;http://rmarkdown.rstudio.com/developer_parameterized_reports.html&#34;&gt;parameter&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;There are countless ways to summarize a survey in R. This guide shows a few basics with &lt;code&gt;ggplot&lt;/code&gt; and &lt;code&gt;questionr&lt;/code&gt;, but focuses on the overall workflow (file management, etc.). Following the instructions here, you should be able to reproduce all four reports (and in principle, many more) despite only writing code to clean one survey. Most of the code is displayed in this document, but all code is found in either &lt;code&gt;pewpoliticaltemplate.Rmd&lt;/code&gt; or &lt;code&gt;pew_report_generator.R&lt;/code&gt;. All code, as well as the outputted documents, can be found &lt;a href=&#34;https://github.com/rdrr1990/datascience101/tree/master/automating&#34;&gt;here&lt;/a&gt;, and details on obtaining the data are found below.&lt;/p&gt;
&lt;div id=&#34;software&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Software&lt;/h1&gt;
&lt;p&gt;RStudio’s interface with &lt;code&gt;library(rmarkdown)&lt;/code&gt; is evolving rapidly. Installing the current version of RStudio is highly recommended, particularly for the previews of the R Markdown code (this doc was created with RStudio 1.1.83). (Here is my &lt;a href=&#34;https://web.stanford.edu/class/stats101/&#34;&gt;install guide&lt;/a&gt;, which includes links to tutorials and cheat sheets. For somewhat more advanced survey data cleaning, click &lt;a href=&#34;https://web.stanford.edu/class/stats101/R_skill_dRill.html&#34;&gt;here&lt;/a&gt;.)&lt;/p&gt;
&lt;p&gt;Even if you’ve knit Rmd before, your libraries may not be new enough to create parameterized reports. I recommend installing &lt;code&gt;pacman&lt;/code&gt;, which has a convenience function &lt;code&gt;p_load&lt;/code&gt; that smooths package installation, loading, and maintenance. I’d recommend &lt;code&gt;p_load&lt;/code&gt; particularly if you are collaborating, say, on Dropbox.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;install.packages(&amp;quot;pacman&amp;quot;)
library(pacman)  # attach pacman so that p_load() is available
p_load(rmarkdown, knitr, foreign, scales, questionr, tidyverse, update = TRUE)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Remember &lt;code&gt;PDF&lt;/code&gt; requires &lt;code&gt;LaTeX&lt;/code&gt; &lt;a href=&#34;https://web.stanford.edu/class/stats101/&#34;&gt;(install links)&lt;/a&gt;. By contrast, knitting to &lt;code&gt;docx&lt;/code&gt; or &lt;code&gt;HTML&lt;/code&gt; does not require &lt;code&gt;LaTeX&lt;/code&gt;. Creating &lt;code&gt;pptx&lt;/code&gt; is possible in R with &lt;code&gt;library(ReporteRs)&lt;/code&gt;, but is not discussed here.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;the-data&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;The Data&lt;/h1&gt;
&lt;p&gt;Download the four “political surveys” from Pew Research available &lt;a href=&#34;http://www.people-press.org/datasets/2016/&#34;&gt;here&lt;/a&gt; (i.e., January, March, August, and October 2016). You may recall, some politics happened in 2016. (The data is free, provided you take a moment to make an account.)&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;If need be, decompress each &lt;code&gt;zip&lt;/code&gt; folder.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Three of my folders have intuitive names (&lt;code&gt;Jan16&lt;/code&gt;, &lt;code&gt;March16&lt;/code&gt;, and &lt;code&gt;Oct16&lt;/code&gt;), but one of my folders picked up a lengthy name, &lt;code&gt;http___www.people-press.org_files_datasets_Aug16&lt;/code&gt;. Don’t worry about that.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Create a new folder, call it, say, &lt;code&gt;automating&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Move all four data folders into &lt;code&gt;automating&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Please note that I have no affiliation (past or present) with Pew Research. I simply think that they do great work and they make it relatively hassle-free to get started with meaningful data sets.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;the-r-notebook-r-markdown-template&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;The R Notebook (R Markdown) Template&lt;/h1&gt;
&lt;p&gt;(R Markdown ninjas can skip this section.)&lt;/p&gt;
&lt;p&gt;In RStudio, create a new R Notebook and save it as &lt;code&gt;pewpoliticaltemplate.Rmd&lt;/code&gt; in the &lt;code&gt;automating&lt;/code&gt; folder you just created. This document will likely knit to &lt;code&gt;HTML&lt;/code&gt; by default; hold down the Knit button to change it to &lt;code&gt;PDF&lt;/code&gt;. Add fields to the header as desired. The sample header below automatically puts today’s date on the document by parsing the expression next to &lt;code&gt;Date:&lt;/code&gt; as R code. &lt;code&gt;classoption: landscape&lt;/code&gt; may help with wide tables. You can also specify the file that contains your bibliography in several formats, such as &lt;code&gt;BibTeX&lt;/code&gt; and &lt;code&gt;EndNote&lt;/code&gt; &lt;a href=&#34;http://rmarkdown.rstudio.com/authoring_bibliographies_and_citations.html&#34;&gt;(citation details)&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Next add an R code chunk to &lt;code&gt;pewpoliticaltemplate.Rmd&lt;/code&gt; to take care of background stuff like formatting. Though setting a working directory would not be needed just to knit the &lt;code&gt;Rmd&lt;/code&gt;, the directory must be set by &lt;code&gt;knitr::opts_knit$set(root.dir = &#39;...&#39;)&lt;/code&gt; to automate document prep. (&lt;code&gt;setwd&lt;/code&gt; isn’t needed in the &lt;code&gt;Rmd&lt;/code&gt;, but setting the working directory separately in &lt;code&gt;Console&lt;/code&gt; is recommended if you’re still editing.)&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;
&lt;img src=&#34;/post/2017-11-01-Mohanty-Surveys_files/images/config.png&#34; /&gt;

&lt;/div&gt;
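&lt;p&gt;For readers who cannot see the screenshot, a setup chunk along these lines would do the job. This is a minimal sketch rather than the exact chunk in the template: the particular defaults and the &lt;code&gt;project_dir&lt;/code&gt; variable are assumptions made for illustration.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(knitr)

# Hypothetical project folder; replace with the path to your own `automating` folder.
project_dir &amp;lt;- getwd()

# Evaluate every chunk relative to the project folder, which matters
# once the Rmd is rendered from another script.
opts_knit$set(root.dir = project_dir)

# Defaults applied to every later chunk: hide code, warnings, and messages.
opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Individual chunks can still override any of these defaults in their own chunk header.&lt;/p&gt;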
&lt;p&gt;The Play button at the top right gives a preview of the code’s output, which is handy. If some part of the analysis is very lengthy, it only needs to be run once, freeing you to tinker with graphics and the like.&lt;/p&gt;
&lt;p&gt;– Now the default settings have been set and you don’t need to worry about suppressing warnings and so on with each code chunk. You can, of course, change them case-by-case as you like.&lt;/p&gt;
&lt;p&gt;– Unlike in R, when setting the format options for individual code chunks (as shown above to suppress warnings before the defaults kick in), you do need to type out the words &lt;code&gt;TRUE&lt;/code&gt; and &lt;code&gt;FALSE&lt;/code&gt; in full.&lt;/p&gt;
&lt;p&gt;– Unlike the template, in this doc, I’ve set the defaults to &lt;code&gt;echo = TRUE&lt;/code&gt; and &lt;code&gt;tidy = TRUE&lt;/code&gt; to display the R code more pleasingly.&lt;/p&gt;
&lt;p&gt;– The setting &lt;code&gt;asis = TRUE&lt;/code&gt; (knitr documents the corresponding chunk option as &lt;code&gt;results = &#39;asis&#39;&lt;/code&gt;) is very useful for professionally formatted tables (shown below), but is not recommended for raw R output of matrices and tables. To make raw data frames display with &lt;code&gt;kable&lt;/code&gt; by default, see &lt;a href=&#34;http://rmarkdown.rstudio.com/html_document_format.html&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;div id=&#34;the-template&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;The Template&lt;/h3&gt;
&lt;p&gt;I find it easiest to write a fully working example and then make little changes as needed so that &lt;code&gt;rmarkdown::render()&lt;/code&gt; can loop over the data sets. First things first.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;survey &amp;lt;- read.spss(&amp;quot;Jan16/Jan16 public.sav&amp;quot;, to.data.frame = TRUE)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Summary stats can easily be inserted into the text like so:&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;
&lt;img src=&#34;/post/2017-11-01-Mohanty-Surveys_files/images/intext.png&#34; /&gt;

&lt;/div&gt;
&lt;p&gt;The template contains additional examples with survey weights (lengthier calculations should be done in blocks of code and then their result referenced with that inline style).&lt;/p&gt;
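&lt;p&gt;As a sketch of that pattern, the chunk below computes a weighted share once and stores it as a formatted string that the text can then reference inline. The toy data frame and the names &lt;code&gt;toy&lt;/code&gt;, &lt;code&gt;approve_share&lt;/code&gt;, and &lt;code&gt;approve_pct&lt;/code&gt; are made up for illustration; with the Pew data, the same calculation would presumably use &lt;code&gt;survey$q1&lt;/code&gt; and &lt;code&gt;survey$weight&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Toy stand-in for the survey: four respondents with survey weights.
toy &amp;lt;- data.frame(q1 = c(&amp;quot;Approve&amp;quot;, &amp;quot;Disapprove&amp;quot;, &amp;quot;Approve&amp;quot;, &amp;quot;Disapprove&amp;quot;),
                  weight = c(1, 2, 3, 2))

# Weighted share of respondents who approve: (1 + 3) / 8.
approve_share &amp;lt;- weighted.mean(toy$q1 == &amp;quot;Approve&amp;quot;, toy$weight)

# Format once in a chunk; the text then references the stored string inline.
approve_pct &amp;lt;- sprintf(&amp;quot;%.1f%%&amp;quot;, 100 * approve_share)
approve_pct&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;[1] &amp;quot;50.0%&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In the running text, the stored value would then be referenced as &lt;code&gt;`r approve_pct`&lt;/code&gt;.&lt;/p&gt;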
&lt;p&gt;Here is a basic plot we might want, which reflects the survey weights. &lt;code&gt;facet_grid()&lt;/code&gt; is used to create analogous plots for each party identification. The plot uses the slightly wonky syntax &lt;code&gt;y = (..count..)/sum(..count..)&lt;/code&gt; to display the results as percentages rather than counts. Note that some code that cleans the data (mostly shortening labels) is omitted for brevity, but can be found &lt;a href=&#34;https://github.com/rdrr1990/datascience101/blob/master/automating/pewpoliticaltemplate.Rmd&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;PA &amp;lt;- ggplot(survey) + theme_minimal()
PA &amp;lt;- PA + geom_bar(aes(q1, y = (..count..)/sum(..count..), weight = weight, 
    fill = q1))
PA &amp;lt;- PA + facet_grid(party.clean ~ .) + theme(strip.text.y = element_text(angle = 45))
PA &amp;lt;- PA + xlab(&amp;quot;&amp;quot;) + ylab(&amp;quot;Percent of Country&amp;quot;)
PA &amp;lt;- PA + ggtitle(&amp;quot;Presidential Approval: January 2016&amp;quot;)
PA &amp;lt;- PA + scale_y_continuous(labels = scales::percent)
PA&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;figure&#34;&gt;
&lt;img src=&#34;/post/2017-11-01-Mohanty-Surveys_files/images/ggPresApproval-1.png&#34; /&gt;

&lt;/div&gt;
&lt;p&gt;Here is an example of a weighted crosstab. &lt;code&gt;knitr::kable&lt;/code&gt; will create a table that’s professional in appearance (when knit as &lt;code&gt;PDF&lt;/code&gt;; &lt;code&gt;kable&lt;/code&gt; takes the style of an academic journal).&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;kable(wtd.table(survey$ideo, survey$sex, survey$weight)/nrow(survey), digits = 2)&lt;/code&gt;&lt;/pre&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;Male&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;Female&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;Very conservative&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.04&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.03&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;Conservative&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.14&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.13&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.20&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.20&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;Liberal&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.08&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.09&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;Very liberal&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.03&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.03&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;DK*&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.02&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.03&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Suppose we want to display Presidential Approval, where the first column provides overall approval and subsequent columns are crosstabs for various factors of interest (using the cell weights). I’ve written a convenience function called &lt;a href=&#34;https://github.com/rdrr1990/datascience101/blob/master/automating/Xtabs.R&#34;&gt;Xtabs&lt;/a&gt; that creates this format, which is common in the survey world.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;source(&amp;quot;https://raw.githubusercontent.com/rdrr1990/datascience101/master/automating/Xtabs.R&amp;quot;)
kable(Xtabs(survey, &amp;quot;q1&amp;quot;, c(&amp;quot;sex&amp;quot;, &amp;quot;race&amp;quot;), weight = &amp;quot;cellweight&amp;quot;))&lt;/code&gt;&lt;/pre&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;Overall&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;Male&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;Female&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;White (nH)&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;Black (nH)&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;Hispanic&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;Other&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;DK*&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;Approve&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;45.6%&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;21.3%&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;24.2%&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;20.4%&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;9.48%&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;10.2%&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;4.97%&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;0.443%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;Disapprove&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;48.5%&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;25.5%&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;23%&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;39.6%&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;1.33%&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;3.95%&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;2.53%&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;1.12%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;Don’t Know (VOL)&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;5.94%&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;2.67%&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;3.27%&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;3.39%&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;0.646%&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;1.14%&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;0.489%&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;0.269%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Suppose we want to do many crosstabs. The syntax &lt;code&gt;survey$ideo&lt;/code&gt; is widely used for convenience, but &lt;code&gt;survey[[&amp;quot;ideo&amp;quot;]]&lt;/code&gt; will serve us better since it allows us to work with vectors of variable names (&lt;a href=&#34;http://www.win-vector.com/blog/2017/06/non-standard-evaluation-and-function-composition-in-r/&#34;&gt;details from win-vector&lt;/a&gt;). Below, the first two comparisons return &lt;code&gt;TRUE&lt;/code&gt;, but the final one returns &lt;code&gt;FALSE&lt;/code&gt; because there is no variable “x” in the data frame &lt;code&gt;survey&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;identical(survey$ideo, survey[[&amp;quot;ideo&amp;quot;]])&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;[1] TRUE&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;x &amp;lt;- &amp;quot;ideo&amp;quot;
identical(survey[[x]], survey[[&amp;quot;ideo&amp;quot;]])&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;[1] TRUE&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;identical(survey[[x]], survey$x)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;[1] FALSE&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So say we want weighted crosstabs for ideology and party ID crossed with each of questions 20, 21, 22, and so on through 29. Here is some code that will do that.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;x &amp;lt;- names(survey)[grep(&amp;quot;q2[[:digit:]]&amp;quot;, names(survey))]
x&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt; [1] &amp;quot;q20&amp;quot;  &amp;quot;q21&amp;quot;  &amp;quot;q22a&amp;quot; &amp;quot;q22b&amp;quot; &amp;quot;q22c&amp;quot; &amp;quot;q22d&amp;quot; &amp;quot;q22e&amp;quot; &amp;quot;q22f&amp;quot; &amp;quot;q22g&amp;quot; &amp;quot;q22h&amp;quot;
[11] &amp;quot;q22i&amp;quot; &amp;quot;q25&amp;quot;  &amp;quot;q26&amp;quot;  &amp;quot;q27&amp;quot;  &amp;quot;q28&amp;quot; &lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;y &amp;lt;- c(&amp;quot;ideo&amp;quot;, &amp;quot;party&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;for (i in x) {
    for (j in y) {
        cat(&amp;quot;\nWeighted proportions for&amp;quot;, i, &amp;quot;broken down by&amp;quot;, j, &amp;quot;\n&amp;quot;)
        print(kable(wtd.table(survey[[i]], survey[[j]], survey$weight)/nrow(survey), 
            digits = 2))
        cat(&amp;quot;\n&amp;quot;)  # break out of table formatting
    }
    cat(&amp;quot;\\newpage&amp;quot;)
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A few notes:&lt;/p&gt;
&lt;p&gt;– This code will only work with the &lt;code&gt;asis&lt;/code&gt; setting (shown above) that lets &lt;code&gt;knitr&lt;/code&gt; interpret the output of &lt;code&gt;print(kable())&lt;/code&gt; as something to render (rather than just Markdown code to display for use elsewhere).&lt;/p&gt;
&lt;p&gt;– Ideally one would have a &lt;code&gt;csv&lt;/code&gt; or &lt;code&gt;data.frame&lt;/code&gt; of the questions, and display them as loop-switched questions. In this case, the questionnaire is in a &lt;code&gt;docx&lt;/code&gt; and so &lt;code&gt;library(docxtractr)&lt;/code&gt; may help.&lt;/p&gt;
&lt;p&gt;– Rather than a nested loop, one would likely prefer to pick a question, loop over the demographic and ideological categories for the crosstabs, and then insert commentary and overview.&lt;/p&gt;
&lt;p&gt;– The outer loop starts a new page after each pass of the inner loop with &lt;code&gt;cat(&amp;quot;\\newpage&amp;quot;)&lt;/code&gt;, which is specific to rendering as &lt;code&gt;PDF&lt;/code&gt;. Extra line breaks &lt;code&gt;\n&lt;/code&gt; are needed to break out of the table formatting and keep code and text separate. A different approach to page breaks is needed &lt;a href=&#34;https://stackoverflow.com/questions/24672111/how-to-add-a-page-break-in-word-document-generated-by-rstudio-markdown&#34;&gt;for docx&lt;/a&gt;.&lt;/p&gt;
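&lt;p&gt;The third note can be sketched as a small helper that takes one question and loops over the grouping variables. This is an illustration under assumptions, not code from the template: &lt;code&gt;crosstabs_for&lt;/code&gt; is a hypothetical name, the toy data stand in for the survey, and base R’s &lt;code&gt;xtabs&lt;/code&gt; is swapped in for &lt;code&gt;wtd.table&lt;/code&gt; so that the sketch runs without extra packages. Unlike the loop above, it divides by the sum of the weights rather than by the number of rows.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Weighted crosstabs of one question against several grouping variables.
# Returns a named list with one table of weighted proportions per grouping variable.
crosstabs_for &amp;lt;- function(data, question, by, weight = &amp;quot;weight&amp;quot;) {
    out &amp;lt;- lapply(by, function(g) {
        # Build weight ~ question + g, so xtabs sums the weights in each cell.
        counts &amp;lt;- xtabs(reformulate(c(question, g), response = weight), data = data)
        counts/sum(counts)
    })
    names(out) &amp;lt;- by
    out
}

# Toy stand-in for the survey (hypothetical values).
toy &amp;lt;- data.frame(q1 = c(&amp;quot;Approve&amp;quot;, &amp;quot;Disapprove&amp;quot;, &amp;quot;Approve&amp;quot;, &amp;quot;Disapprove&amp;quot;),
                  sex = c(&amp;quot;Male&amp;quot;, &amp;quot;Male&amp;quot;, &amp;quot;Female&amp;quot;, &amp;quot;Female&amp;quot;),
                  party = c(&amp;quot;Dem&amp;quot;, &amp;quot;Rep&amp;quot;, &amp;quot;Dem&amp;quot;, &amp;quot;Rep&amp;quot;),
                  weight = c(1, 2, 3, 2))

tabs &amp;lt;- crosstabs_for(toy, &amp;quot;q1&amp;quot;, c(&amp;quot;sex&amp;quot;, &amp;quot;party&amp;quot;))
round(tabs$sex, 2)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Each element of &lt;code&gt;tabs&lt;/code&gt; could then be passed to &lt;code&gt;kable&lt;/code&gt; in turn, with commentary written between the tables.&lt;/p&gt;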
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;adapting-the-template-with-parameters&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Adapting the Template with Parameters&lt;/h1&gt;
&lt;p&gt;The next step is to add a &lt;a href=&#34;http://rmarkdown.rstudio.com/developer_parameterized_reports.html&#34;&gt;parameter&lt;/a&gt; with any variables you need. The parameters will be controlled by the R script discussed below. There is, of course, quite a bit of choice as to what is controlled by which file, but often only a handful of parameters are necessary. Add the following to the end of the header of &lt;code&gt;pewpoliticaltemplate.Rmd&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;params:
  spssfile: !r  1
  surveywave: !r 2016&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;figure&#34;&gt;
&lt;img src=&#34;/post/2017-11-01-Mohanty-Surveys_files/images/newheader.png&#34; /&gt;

&lt;/div&gt;
&lt;p&gt;That creates variables &lt;code&gt;params$spssfile&lt;/code&gt; and &lt;code&gt;params$surveywave&lt;/code&gt; that can be controlled externally from other R sessions, and gives them default values of &lt;code&gt;1&lt;/code&gt; and &lt;code&gt;2016&lt;/code&gt; respectively. Setting default values smooths debugging by allowing you to continue knitting the &lt;code&gt;Rmd&lt;/code&gt; on its own, rather than only from the R script we will create in a moment. (You can also click on Knit and choose &lt;code&gt;Knit with Parameters&lt;/code&gt; to specify particular values.)&lt;/p&gt;
&lt;p&gt;Now make any changes to the &lt;code&gt;Rmd&lt;/code&gt; template. For example, in the &lt;code&gt;ggplot&lt;/code&gt; code…&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;PA &amp;lt;- PA + ggtitle(paste(&amp;quot;Presidential Approval:&amp;quot;, params$surveywave))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Notice we can get a list of all the &lt;code&gt;spss&lt;/code&gt; files like so:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;dir(pattern = &amp;quot;sav&amp;quot;, recursive = TRUE)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;[1] &amp;quot;http___www.people-press.org_files_datasets_Aug16/Aug16 public.sav&amp;quot;
[2] &amp;quot;Jan16/Jan16 public.sav&amp;quot;                                           
[3] &amp;quot;March16/March16 public.sav&amp;quot;                                       
[4] &amp;quot;Oct16/Oct16 public.sav&amp;quot;                                           &lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;or in this case&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;dir(pattern = &amp;quot;16 public.sav&amp;quot;, recursive = TRUE)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;[1] &amp;quot;http___www.people-press.org_files_datasets_Aug16/Aug16 public.sav&amp;quot;
[2] &amp;quot;Jan16/Jan16 public.sav&amp;quot;                                           
[3] &amp;quot;March16/March16 public.sav&amp;quot;                                       
[4] &amp;quot;Oct16/Oct16 public.sav&amp;quot;                                           &lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I recommend making the pattern as specific as possible in case you or your collaborators add other &lt;code&gt;spss&lt;/code&gt; files with similar names. To use regular expressions to specify more complicated patterns, see &lt;a href=&#34;https://rstudio-pubs-static.s3.amazonaws.com/74603_76cd14d5983f47408fdf0b323550b846.html&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;
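&lt;p&gt;As one illustration of a stricter pattern, the sketch below escapes the dot and anchors the match at the end of the file name. The vector of paths copies the listing above and adds one made-up file, &lt;code&gt;Oct16 public.sav.bak&lt;/code&gt;, to show what the looser pattern would let through.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;files &amp;lt;- c(&amp;quot;http___www.people-press.org_files_datasets_Aug16/Aug16 public.sav&amp;quot;,
           &amp;quot;Jan16/Jan16 public.sav&amp;quot;,
           &amp;quot;March16/March16 public.sav&amp;quot;,
           &amp;quot;Oct16/Oct16 public.sav&amp;quot;,
           &amp;quot;Oct16/Oct16 public.sav.bak&amp;quot;)  # hypothetical stray file

loose &amp;lt;- &amp;quot;16 public.sav&amp;quot;       # the dot matches any character; no end anchor
strict &amp;lt;- &amp;quot;16 public\\.sav$&amp;quot;   # literal dot, and the name must end in .sav

files[grepl(loose, files)]   # all five paths, including the stray file
files[grepl(strict, files)]  # only the four data files&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The same &lt;code&gt;strict&lt;/code&gt; string can be passed as the &lt;code&gt;pattern&lt;/code&gt; argument of &lt;code&gt;dir&lt;/code&gt;.&lt;/p&gt;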
&lt;p&gt;Now back to editing &lt;code&gt;pewpoliticaltemplate.Rmd&lt;/code&gt;…&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;
&lt;img src=&#34;/post/2017-11-01-Mohanty-Surveys_files/images/newreadingdata.png&#34; /&gt;

&lt;/div&gt;
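&lt;p&gt;For readers without the screenshot, the data-reading chunk might look roughly like the sketch below. It assumes, as the loop in the next section suggests, that &lt;code&gt;params$spssfile&lt;/code&gt; is an index into the listing returned by &lt;code&gt;dir&lt;/code&gt;. The stand-in &lt;code&gt;params&lt;/code&gt; list and the &lt;code&gt;savfiles&lt;/code&gt; name are only there so the snippet runs outside of knitting; inside the template, &lt;code&gt;params&lt;/code&gt; already exists.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(foreign)

# Stand-in for the list that rmarkdown builds from the header (same defaults as above).
params &amp;lt;- list(spssfile = 1, surveywave = 2016)

# Same listing as above; spssfile is taken to be a position in this vector.
savfiles &amp;lt;- dir(pattern = &amp;quot;16 public.sav&amp;quot;, recursive = TRUE)

if (length(savfiles) &amp;gt;= params$spssfile) {
    survey &amp;lt;- read.spss(savfiles[params$spssfile], to.data.frame = TRUE)
} else {
    message(&amp;quot;No matching .sav file found; check the working directory.&amp;quot;)
}&lt;/code&gt;&lt;/pre&gt;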
&lt;p&gt;Knit the file to see how it looks with these default settings; that’s it for this portion.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;automating-with-knitr&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Automating with knitr&lt;/h1&gt;
&lt;p&gt;Now create a new R script; mine’s called &lt;code&gt;pew_report_generator.R&lt;/code&gt;. It’s just a simple loop that tells the &lt;code&gt;Rmd&lt;/code&gt; which data set to grab, as well as the label to use. Note that the labels appear in alphabetical rather than chronological order because &lt;code&gt;dir&lt;/code&gt; returns the files sorted alphabetically, and that is the order in which the &lt;code&gt;Rmd&lt;/code&gt; finds them.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(pacman)
p_load(knitr, rmarkdown, sessioninfo)

setwd(&amp;quot;/users/mohanty/Desktop/pewpolitical/&amp;quot;)

waves &amp;lt;- c(&amp;quot;August 2016&amp;quot;, &amp;quot;January 2016&amp;quot;, &amp;quot;March 2016&amp;quot;, &amp;quot;October 2016&amp;quot;)

for (i in seq_along(waves)) {
    render(&amp;quot;pewpoliticaltemplate.Rmd&amp;quot;, params = list(spssfile = i, surveywave = waves[i]), 
        output_file = paste0(&amp;quot;Survey Analysis &amp;quot;, waves[i], &amp;quot;.pdf&amp;quot;))
}

session &amp;lt;- session_info()
save(session, file = paste0(&amp;quot;session&amp;quot;, format(Sys.time(), &amp;quot;%m%d%Y&amp;quot;), &amp;quot;.Rdata&amp;quot;))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That’s it. Of course, in practice you might write some code on the first survey that doesn’t work for all of them. Pew, for example, seems to have formatted the survey date differently in the last two surveys, which led me to make a few changes. But if the data are formatted fairly consistently, a one-time investment can save massive amounts of time lost to error-prone copying and pasting.&lt;/p&gt;
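&lt;p&gt;If chronological order matters, or if hard-coding the labels feels fragile, the labels can be derived from the file names instead. The following is a sketch under assumptions: it relies on the naming convention seen in the listing above (a month name or abbreviation followed by a two-digit year), and &lt;code&gt;wave_labels&lt;/code&gt; and &lt;code&gt;wave_table&lt;/code&gt; are hypothetical names, not part of &lt;code&gt;pew_report_generator.R&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Build &amp;quot;Month Year&amp;quot; labels from paths like &amp;quot;Jan16/Jan16 public.sav&amp;quot;.
wave_labels &amp;lt;- function(paths) {
    stem &amp;lt;- sub(&amp;quot; public\\.sav$&amp;quot;, &amp;quot;&amp;quot;, basename(paths))   # e.g. &amp;quot;March16&amp;quot;
    month &amp;lt;- match(substr(stem, 1, 3), month.abb)        # month number, 1 to 12
    year &amp;lt;- as.integer(paste0(&amp;quot;20&amp;quot;, sub(&amp;quot;^[A-Za-z]+&amp;quot;, &amp;quot;&amp;quot;, stem)))
    data.frame(spssfile = seq_along(paths),              # position in the file listing
               surveywave = paste(month.name[month], year),
               month = month, year = year, stringsAsFactors = FALSE)
}

# Paths copied from the listing above.
paths &amp;lt;- c(&amp;quot;http___www.people-press.org_files_datasets_Aug16/Aug16 public.sav&amp;quot;,
           &amp;quot;Jan16/Jan16 public.sav&amp;quot;,
           &amp;quot;March16/March16 public.sav&amp;quot;,
           &amp;quot;Oct16/Oct16 public.sav&amp;quot;)

wave_table &amp;lt;- wave_labels(paths)
wave_table &amp;lt;- wave_table[order(wave_table$year, wave_table$month), ]  # chronological
wave_table$surveywave&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The labels now run January, March, August, October 2016. A loop over the rows of &lt;code&gt;wave_table&lt;/code&gt; could then pass &lt;code&gt;wave_table$spssfile[i]&lt;/code&gt; and &lt;code&gt;wave_table$surveywave[i]&lt;/code&gt; to &lt;code&gt;render&lt;/code&gt;.&lt;/p&gt;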
&lt;div id=&#34;a-little-version-control&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;A Little Version Control&lt;/h3&gt;
&lt;p&gt;The last bit of code is not necessary, but is a convenient way to store which versions of which libraries were actually used on which version of R. If something works now but not in the future, &lt;code&gt;install_version&lt;/code&gt; (found in &lt;code&gt;library(devtools)&lt;/code&gt;) can be used to install the older version of particular packages.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;s &amp;lt;- session_info()
s$platform&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt; setting  value                       
 version  R version 3.4.2 (2017-09-28)
 os       macOS Sierra 10.12.6        
 system   x86_64, darwin15.6.0        
 ui       X11                         
 language (EN)                        
 collate  en_US.UTF-8                 
 tz       America/Los_Angeles         
 date     2017-11-06                  &lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;s$packages&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt; package     * version date       source                          
 assertthat    0.2.0   2017-04-11 CRAN (R 3.4.0)                  
 backports     1.1.1   2017-09-25 CRAN (R 3.4.2)                  
 bindr         0.1     2016-11-13 CRAN (R 3.4.0)                  
 bindrcpp      0.2     2017-06-17 CRAN (R 3.4.0)                  
 broom         0.4.2   2017-02-13 CRAN (R 3.4.0)                  
 cellranger    1.1.0   2016-07-27 CRAN (R 3.4.0)                  
 clisymbols    1.2.0   2017-05-21 cran (@1.2.0)                   
 colorspace    1.3-2   2016-12-14 CRAN (R 3.4.0)                  
 digest        0.6.12  2017-01-27 CRAN (R 3.4.0)                  
 dplyr       * 0.7.4   2017-09-28 cran (@0.7.4)                   
 evaluate      0.10.1  2017-06-24 CRAN (R 3.4.1)                  
 forcats       0.2.0   2017-01-23 CRAN (R 3.4.0)                  
 foreign     * 0.8-69  2017-06-22 CRAN (R 3.4.2)                  
 formatR       1.5     2017-04-25 CRAN (R 3.4.0)                  
 ggplot2     * 2.2.1   2016-12-30 CRAN (R 3.4.0)                  
 glue          1.2.0   2017-10-29 CRAN (R 3.4.2)                  
 gtable        0.2.0   2016-02-26 CRAN (R 3.4.0)                  
 haven         1.1.0   2017-07-09 CRAN (R 3.4.1)                  
 highr         0.6     2016-05-09 CRAN (R 3.4.0)                  
 hms           0.3     2016-11-22 CRAN (R 3.4.0)                  
 htmltools     0.3.6   2017-04-28 CRAN (R 3.4.0)                  
 httpuv        1.3.5   2017-07-04 CRAN (R 3.4.1)                  
 httr          1.3.1   2017-08-20 cran (@1.3.1)                   
 jsonlite      1.5     2017-06-01 CRAN (R 3.4.0)                  
 knitr       * 1.17    2017-08-10 CRAN (R 3.4.1)                  
 labeling      0.3     2014-08-23 CRAN (R 3.4.0)                  
 lattice       0.20-35 2017-03-25 CRAN (R 3.4.2)                  
 lazyeval      0.2.1   2017-10-29 CRAN (R 3.4.2)                  
 lubridate     1.7.0   2017-10-29 CRAN (R 3.4.2)                  
 magrittr      1.5     2014-11-22 CRAN (R 3.4.0)                  
 mime          0.5     2016-07-07 CRAN (R 3.4.0)                  
 miniUI        0.1.1   2016-01-15 CRAN (R 3.4.0)                  
 mnormt        1.5-5   2016-10-15 CRAN (R 3.4.0)                  
 modelr        0.1.1   2017-07-24 CRAN (R 3.4.1)                  
 munsell       0.4.3   2016-02-13 CRAN (R 3.4.0)                  
 nlme          3.1-131 2017-02-06 CRAN (R 3.4.2)                  
 pacman      * 0.4.6   2017-05-14 CRAN (R 3.4.0)                  
 pkgconfig     2.0.1   2017-03-21 CRAN (R 3.4.0)                  
 plyr          1.8.4   2016-06-08 CRAN (R 3.4.0)                  
 psych         1.7.8   2017-09-09 CRAN (R 3.4.1)                  
 purrr       * 0.2.4   2017-10-18 CRAN (R 3.4.2)                  
 questionr   * 0.6.3   2017-11-06 local                           
 R6            2.2.2   2017-06-17 CRAN (R 3.4.0)                  
 Rcpp          0.12.13 2017-09-28 cran (@0.12.13)                 
 readr       * 1.1.1   2017-05-16 CRAN (R 3.4.0)                  
 readxl        1.0.0   2017-04-18 CRAN (R 3.4.0)                  
 reshape2      1.4.2   2016-10-22 CRAN (R 3.4.0)                  
 rlang         0.1.2   2017-08-09 CRAN (R 3.4.1)                  
 rmarkdown     1.6     2017-06-15 CRAN (R 3.4.0)                  
 rprojroot     1.2     2017-01-16 CRAN (R 3.4.0)                  
 rstudioapi    0.7     2017-09-07 cran (@0.7)                     
 rvest         0.3.2   2016-06-17 CRAN (R 3.4.0)                  
 scales        0.5.0   2017-08-24 cran (@0.5.0)                   
 sessioninfo * 1.0.0   2017-06-21 CRAN (R 3.4.1)                  
 shiny         1.0.5   2017-08-23 cran (@1.0.5)                   
 stringi       1.1.5   2017-04-07 CRAN (R 3.4.0)                  
 stringr       1.2.0   2017-02-18 CRAN (R 3.4.0)                  
 tibble      * 1.3.4   2017-08-22 cran (@1.3.4)                   
 tidyr       * 0.7.2   2017-10-16 CRAN (R 3.4.2)                  
 tidyverse   * 1.1.1   2017-01-27 CRAN (R 3.4.0)                  
 withr         2.0.0   2017-10-25 Github (jimhester/withr@a43df66)
 xml2          1.1.1   2017-01-24 CRAN (R 3.4.0)                  
 xtable        1.8-2   2016-02-05 CRAN (R 3.4.0)                  
 yaml          2.1.14  2016-11-12 CRAN (R 3.4.0)&lt;/code&gt;&lt;/pre&gt;
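&lt;p&gt;To put the saved session to use later, one option is to compare the recorded versions against what is installed now. The sketch below is an illustration under assumptions: &lt;code&gt;version_drift&lt;/code&gt; is a hypothetical helper, and the &lt;code&gt;recorded&lt;/code&gt; vector is typed in by hand from the listing above rather than read from the saved &lt;code&gt;.Rdata&lt;/code&gt; file.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Compare recorded package versions with the versions installed now.
# `recorded` is a named character vector: names are packages, values are versions.
version_drift &amp;lt;- function(recorded) {
    installed &amp;lt;- vapply(names(recorded), function(pkg) {
        # system.file() returns &amp;quot;&amp;quot; when the package is not installed.
        if (nzchar(system.file(package = pkg))) {
            as.character(packageVersion(pkg))
        } else {
            NA_character_
        }
    }, character(1))
    changed &amp;lt;- is.na(installed) | installed != recorded
    report &amp;lt;- data.frame(package = names(recorded), recorded = unname(recorded),
                         installed = unname(installed), stringsAsFactors = FALSE)
    report[changed, ]  # keep only packages that are missing or differ
}

recorded &amp;lt;- c(knitr = &amp;quot;1.17&amp;quot;, rmarkdown = &amp;quot;1.6&amp;quot;, questionr = &amp;quot;0.6.3&amp;quot;)
version_drift(recorded)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Any package that shows up in the result could then be rolled back with &lt;code&gt;install_version&lt;/code&gt;, for example &lt;code&gt;install_version(&amp;quot;knitr&amp;quot;, version = &amp;quot;1.17&amp;quot;)&lt;/code&gt;.&lt;/p&gt;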
&lt;/div&gt;
&lt;/div&gt;

      </description>
    </item>
    
    <item>
      <title>The R Survey</title>
      <link>https://rviews.rstudio.com/2017/07/13/the-r-survey/</link>
      <pubDate>Thu, 13 Jul 2017 00:00:00 +0000</pubDate>
      
      <guid>https://rviews.rstudio.com/2017/07/13/the-r-survey/</guid>
      <description>
        

&lt;p&gt;The &lt;a href=&#34;https://www.r-consortium.org/&#34;&gt;R Consortium&lt;/a&gt; is undertaking a multi-year effort to survey the whole R world. In a rather low-key &lt;a href=&#34;https://www.r-consortium.org/blog/2017/06/29/take-the-r-consortiums-survey-on-r&#34;&gt;blog post&lt;/a&gt; at the end of last month, the R Consortium’s technical committee, the Infrastructure Steering Committee &lt;a href=&#34;https://www.r-consortium.org/wp-content/uploads/sites/13/2017/06/R-Consortium-ISC-Charter.pdf&#34;&gt;(ISC)&lt;/a&gt;, launched its prototype &lt;a href=&#34;http://bit.ly/2tuO4NF&#34;&gt;survey&lt;/a&gt; of R users. &lt;img src=&#34;/post/2017-07-12-The-R-Survey_files/RC_Survey.png&#34; /&gt;&lt;/p&gt;
&lt;p&gt;The idea is to use the information gleaned from the exploratory questions in this first survey to craft a more refined version that can be sent out annually. So far, we have received approximately twelve hundred responses. Our goal for this first attempt is to collect ten thousand completed surveys.&lt;/p&gt;
&lt;p&gt;The map below indicates that our coverage is pretty good so far, but we would like to see more responses from Asia, Africa and South America. Additional responses from other areas of the world are most welcome, as well.&lt;/p&gt;
&lt;div class=&#34;figure&#34;&gt;
&lt;img src=&#34;/post/2017-07-12-The-R-Survey_files/Survey_map.png&#34; /&gt;

&lt;/div&gt;
&lt;p&gt;Please take the &lt;a href=&#34;http://bit.ly/2tuO4NF&#34;&gt;survey&lt;/a&gt; yourself and help us spread the word on social media, by word of mouth, and any other way you can think of. The &lt;a href=&#34;http://bit.ly/2tuO4NF&#34;&gt;survey&lt;/a&gt; will be live until September 15th.&lt;/p&gt;

      </description>
    </item>
    
  </channel>
</rss>
