Chapman University DataFest Highlights

Editor’s Note: The 2017 Chapman University DataFest was held during the weekend of April 21-23. The 2018 DataFest will be held during the weekend of April 27-29. DataFest was founded by Rob Gould in 2011 at UCLA with 40 students. In just seven years, it has grown to 31 sites in three countries. Have a look at Mine Çetinkaya-Rundel’s post “Growth of DataFest over the years” for the details. In recent years, it has been difficult for UCLA to keep up with the growing interest and demand from Southern California universities.

Visualizations with R and Databases

The Challenge: Visualizations are one of R’s strengths. There are many functions and packages that create complex plots, often with one simple command. These plotting functions do two things: first, they take the raw data and run the calculations needed for a given visualization, and second, they draw the plot. If the source of the data resides within a database, the usual approach is to import all of the data and then create the plot.
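
As a rough illustration of the alternative, the calculation step can be pushed to the database so that only the summarized results are brought into R for drawing. The sketch below is not taken from the post. It assumes the DBI and RSQLite packages are installed, and it uses an in-memory SQLite table built from the built-in mtcars data as a stand-in for a real database.

```r
# Assumption: DBI and RSQLite are installed; the in-memory SQLite database
# is a stand-in for a real remote database.
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "mtcars", mtcars)

# Step 1: run the calculation inside the database, so only one row per
# cylinder count comes back instead of the full table.
counts <- dbGetQuery(
  con,
  "SELECT cyl, COUNT(*) AS n FROM mtcars GROUP BY cyl ORDER BY cyl"
)

# Step 2: draw the plot in R from the small summarized result.
barplot(counts$n, names.arg = counts$cyl,
        xlab = "Cylinders", ylab = "Number of cars")

dbDisconnect(con)
```

The same split would apply to other plot types: for a histogram, for example, the bin counts could be computed in SQL and only the bins drawn in R.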

End-to-end visualization using ggplot2

ggplot2 is kind of a household word for R users. I’ve ended up using it for complex data munging and wrangling work, where I needed to get clarity on different aspects of the data, especially being able to get different views, slices and dices of it, but in a nice visualization. At some point along the line, I slowly stopped using more traditional plotting functions like plot(), matplot(), barplot(), etc.

Portfolio Volatility Shiny App

In our three previous posts, we walked through how to calculate portfolio volatility, then how to calculate rolling volatility, and then how to visualize rolling volatility. Today, we will wrap all of that work into a Shiny app that allows a user to construct his or her own five-asset portfolio, choose a benchmark and a time period, and visualize the rolling volatilities over time. Here is the final app: There will be a slight departure in form today because we will use a helpers.

R and Interactive Graphics

Judging from the number of JSM talks that incorporated interactive visualizations of some sort or another, it appears that interactive graphics have captured the attention of a good many statisticians. I found this a little surprising. Statisticians, on the whole, are not easily impressed by “eye candy”, and I believe that there are many, like me, who think that base R graphics remain a powerful tool for data exploration. The ability of R’s plot() function to quickly produce plots for all sorts of objects helps an R user attain that state of flow that makes R such a productive environment for data analysis.

A Postcard from JSM

Baltimore has the reputation of being a tough town: hot in the summer and gritty, but the convention center hosting the Joint Statistical Meetings is a pretty cool place to be. There are thousands of people here and so many sessions (over 600) that it’s just impossible to get an overview of all that’s going on. So, here are a couple of snapshots from an R-focused, statistical tourist. First Snapshot: What’s in a Vector?

Looking for R at JSM

I am very much looking forward to attending JSM, which begins this Sunday. And once again, I will be spending a good bit of my time hunting for new and interesting applications of R. In years gone by, this was a difficult game at JSM because R, R Package, Shiny, tidyverse and the like did not often turn up in a keyword search. This year, however, there is quite a bit of low-hanging fruit.

June 2017 New Package Picks

Two hundred and thirty-eight new packages were added to CRAN in June. Below are my picks for the “Top 40”, organized into six categories: Biostatistics, Data, Machine Learning, Miscellaneous, Statistics and Utilities. Some packages, including geofacet and secret, already seem to be gaining traction.

Biostatistics: BIGL v1.0.1 implements response surface methods for drug synergy analysis, including generalized and classical Loewe formulations and the Highest Single Agent methodology. There are vignettes on Methodology and Synergy Analysis.

Visualizing Portfolio Volatility

This is the third post in our series on portfolio volatility, variance and standard deviation. If you want to start at the beginning with calculating portfolio volatility, have a look at the first post here: Intro to Volatility. The second post on calculating rolling standard deviations is here: Intro to Rolling Volatility.

Some Ideas for your Internal R Package

At RStudio, I have the pleasure of interacting with data science teams around the world. Many of these teams are led by R users stepping into the role of analytic admins. These users are responsible for supporting and growing the R user base in their organization and often lead internal R user groups. One of the most successful strategies to support a corporate R user group is the creation of an internal R package.
