Greg Wilson is a data scientist and professional educator at RStudio.
As a newcomer to R who prefers to read paper rather than pixels, I’ve been working my way through a more-or-less random selection of relevant books over the past few months. Some have discussed topics that I’m already familiar with in the context of R, while others have introduced me to entirely new subjects. This post describes four of them in brief; I hope to follow up with a second post in a few months as I work through the backlog on my desk.
First up is Sharon Machlis’ Practical R for Mass Communcation and Journalism, which is based on the author’s workshops for journalists. This book dives straight into doing the kinds of things a busy reporter or news analyst needs to do to meet a 5:00 pm deadline: data cleaning, presentation-quality graphics, and maps take precedence over control flow or the niceties of variable scope. I particularly enjoyed the way each chapter starts with a realistic project and works through what’s needed to build it. People who’ve never programmed before will be a little intimidated by how many packages they need to download if they try to work through the material on their own, but the instructions are clear, and the author’s enthusiasm for her material shines through in every example. (If anyone is working on a similar tutorial for sports data, please let me know - I have more than a few friends it would make very happy.)
req(), I’m much less confused than I was.
Functional programming has been the next big thing in computing since I was a graduate student in the 1980s. It does finally seem to be getting some traction outside the craft-beer-and-Emacs community, and Functional Programming in R by Thomas Mailund looks at how these ideas can be used in R. Mailund writes clearly, and readers who don’t have a background in computer science may find this a gentle way into a complex subject. However, despite the subtitle “Advanced Statistical Programming for Data Science, Analysis and Finance”, there’s nothing particularly statistical or financial about the book’s content. Some parts felt rushed, such as the lightning coverage of point-free programming (which should have had either a detailed exposition or no mention at all), but my biggest complaint about the book is its price: I think $34 for 100 pages is more than most people will want to pay.
Finally, we have Stefano Allesina and Madlen Wilmes’ Computing Skills for Biologists. As the subtitle says, this book presents a toolbox that includes Python, Git, LaTeX, and SQL as well as R, and is aimed at graduate students in biology who have just realized that a few hundred megabytes of messy data are standing between them and their thesis. The authors present the basics of each subject clearly and concisely using real-world data analysis examples at every turn. They freely admit in the introduction that coverage will be broad and shallow, but that’s exactly what books like this should aim for, and they hit a bulls eye. The book’s only weakness - unfortunately, a significant one - is an almost complete lack of diagrams. There are only six figures in its 400 pages, and none in the material on visualization. I realize that readers who are coding along with the examples will be able to view some plots and charts as they go, but I would urge the authors to include these in a second edition.
R is growing by leaps and bounds, and so is the literature about it. If you have written or read a book on R recently that you think others would be interested in, please let us know - we’d enjoy checking it out.
Stefano Allesina and Madlen Wilmes: Computing Skills for Biologists: A Toolbox. Princeton University Press, 978-0691182759.
Chris Beeley and Shitalkumar Sukhdeve: Web Application Development with R Using Shiny (3rd ed.). Packt, 2018, 978-1788993128.
Sharon Machlis: Practical R for Mass Communcation and Journalism. Chapman & Hall/CRC, 2018, 978-1138726918.
Thomas Mailund: Functional Programming in R: Advanced Statistical Programming for Data Science, Analysis and Finance. Apress, 2017, 978-1484227459.
You may leave a comment below or discuss the post in the forum community.rstudio.com.