The reticulate package solves the hardest problem in data science: people

Andrew Mangano is the Director of eCommerce Analytics at Albertsons Companies.

Part I - Modelling

The reticulate package integrates Python within R and, when used with RStudio 1.2, brings the two languages together like never before. Much more important than the technical details of how it all works is the impact that it has on both individuals and teams by enabling data scientists who speak different languages to collaborate seamlessly on a project.


Parsnipping Fama French

Today, we will continue our exploration of developments in the world of tidy models, and we will stick with our usual Fama French modeling flow to do so. For new readers who want to get familiar with Fama French before diving into this post, see here where we covered importing and wrangling the data, here where we covered rolling models and visualization, and here where we covered managing many models.


Paid in Books: An Interview with Christian Westergaard

R is greatly benefiting from new users coming from disciplines that traditionally did not provoke much serious computation. Journalists and humanist scholars, for example, are embracing R. But, does the avenue from the Humanities go both ways? In a recent conversation with Christian Westergaard, proprietor of Sophia Rare Books in Copenhagen, I was delighted to learn that it does.

JBR: I was very pleased to learn when I spoke with you recently at the California Antiquarian Book Fair that you were an S and S+ user in graduate school.


Graph analysis using the tidyverse

A walk-through of how to use the tidyverse, along with tidygraph and ggraph, to easily analyze graph data.


Some R Packages for ROC Curves

In a recent post, I presented some of the theory underlying ROC curves, and outlined the history leading up to their present popularity for characterizing the performance of machine learning models. In this post, I describe how to search CRAN for packages to plot ROC curves, and highlight six useful packages. Although I began with a few ideas about packages that I wanted to talk about, like ROCR and pROC, which I have found useful in the past, I decided to use Gábor Csárdi’s relatively new package pkgsearch to search through CRAN and see what’s out there.


January 2019: “Top 40” New CRAN Packages

One hundred and fifty-three new packages made it to CRAN in January. Here are my “Top 40” picks in eight categories: Computational Methods, Data, Machine Learning, Medicine, Science, Statistics, Utilities, and Visualization.

Computational Methods

cPCG v1.0: Provides a function to solve systems of linear equations using a (preconditioned) conjugate gradient algorithm. The vignette shows how to use the package.

RcppDynProg v0.1.1: Implements dynamic programming using Rcpp. Look here for examples.


A Few New R Books

Greg Wilson is a data scientist and professional educator at RStudio. As a newcomer to R who prefers to read paper rather than pixels, I’ve been working my way through a more-or-less random selection of relevant books over the past few months. Some have discussed topics that I’m already familiar with in the context of R, while others have introduced me to entirely new subjects. This post describes four of them in brief; I hope to follow up with a second post in a few months as I work through the backlog on my desk.


A Look Back on 2018: Part 2

Welcome to the second installment of Reproducible Finance 2019! In the previous post, we looked back on the daily returns for several market sectors in 2018. Today, we’ll continue that theme and look at some summary statistics for 2018, and then extend out to previous years and different ways of visualizing our data.


R for Quantitative Health Sciences: An Interview with Jarrod Dalton

This interview came about through researching R-based medical applications in preparation for the upcoming R/Medicine conference. When we discovered the impressive number of Shiny-based risk calculators developed by the Cleveland Clinic and implemented in public-facing sites, we wanted to learn more about the influence of the R language on the development of statistical science at this prominent institution. We were fortunate to have Jarrod Dalton of the Quantitative Health Sciences Department grant this interview.


December 2018: “Top 40” New CRAN Packages

By my count, 157 new packages stuck to CRAN in December. Below are my “Top 40” picks in ten categories: Computational Methods, Data, Finance, Machine Learning, Medicine, Science, Statistics, Time Series, Utilities and Visualization. This is the first time I have used the Medicine category. I am pleased that a few packages that appear to have clinical use made the cut. Also noteworthy in this month’s selection are the inclusion of four packages from the Microsoft Azure team (stuffing 41 packages into the “Top 40”), and some eclectic, but fascinating packages in the Science section.
