Analytics Administration for R

Analytic administrator is a role that data scientists assume when they onboard new tools, deploy solutions, support existing standards, or train other data scientists. It is a role that works closely with IT to maintain, upgrade, and scale analytic environments. Analytic admins have a multiplier effect: as they go about their work, they help others in the organization become more effective. If you are a data scientist using R, you might consider filling the role of analytic admin for your organization.

Some R User Group News

This week, members of the Bay Area useR Group (BARUG) celebrated the group’s one hundred and first meetup with beer, pizza and three outstanding presentations at the cancer diagnostics company GRAIL. Pete Mohanty began the evening with the talk Did “Communities in Crisis” Elect Trump?: An Analysis with Kernel Regularized Least Squares. Not only is the Political Science compelling, but the underlying Data Science is top shelf. The bigKRLS package that Pete and his collaborator Robert Shaffer wrote to support their research uses parallel processing and external memory techniques to make the computationally intensive Regularized Least Squares algorithm suitable for fairly large data sets.

Mapping Quandl Data with Shiny

Today, we are going to wrap our previously built Quandl/world map Notebook into an interactive Shiny app that lets users choose both a country and a data set for display. As usual, we did a lot of the heavy lifting in the Notebook to make our work more reproducible and our app more performant. The final app is available here. Devotees of this Reproducible Finance blog series will note similarities to this Shiny app, but today’s app will have different and richer functionality.

What is the tidyverse?

Last week, I had the opportunity to talk to a group of Master’s level Statistics and Business Analytics students at Cal State East Bay about R and Data Science. Many in my audience were adult students coming back to school with job experience writing code in Java, Python and SAS. It was a pretty sophisticated crowd, but not surprisingly, their R skills were stitched together in a way that left some big gaps.

A Shiny App for Exploring Commodities Prices and Economic Indicators, via Quandl

In a previous post, we created an R Notebook to explore the relationship between the copper/gold price ratio and 10-year Treasury yields (if you’re curious why we might care about this relationship, have a quick look at that previous post), relying on data from Quandl. Today, we’ll create a Shiny app that lets users choose which commodities ratios and economic indicators to investigate. Perhaps users don’t care about Dr.

April New Package Picks

Here are my picks for the “Top 40” new packages submitted to CRAN in April 2017. These selections, which were culled from 208 submissions, are organized into four categories: Data, Finance, Statistics and Utilities. The number of entries in the Data and Utilities categories reflects the initiatives of R developers to connect to external resources.

Data

comtradr v0.0.1: Provides functions to extract country-level shipping data for a variety of commodities from the United Nations Comtrade API.

Civic Data Wrangling: in R and on data.world

One of the most valuable things I have learned working on Data for Democracy’s Medicare drug spending project has been the importance of collaborative tools. It has been my first in-depth experience using GitHub collaboratively, for one, but it has also introduced me to data.world. data.world is an intuitive way to store, organize, explore, and visualize individual data files, making them more visible to and usable for anyone who’s interested. Using the data.

Growth of DataFest over the years

In a previous post, I introduced DataFest and how one can streamline the organization of this event using Google Forms and tools from the tidyverse. In this post, I’ll walk through building a Shiny app that demonstrates the growth of DataFest over the years, in terms of host locations and participating institutions as well as the number of students who participated in each event.

Review of Efficient R Programming

In the crowded market for data science and R language books, Gillespie and Lovelace’s Efficient R Programming (2016) stands out. Over the course of ten comprehensive chapters, the authors address the primary tenets of developing efficient R programs. Unless you happen to be a member of the R core development team, you will find this book useful whether you are a novice R programmer or an established data scientist and engineer.

Databases using R

As R developers, our first instinct may be to approach databases the same way we do regular files. We start by reading all of the data into memory and then proceed to data exploration. But what if there is a better way…
