Graph analysis using the tidyverse

A walk-through of how to use the tidyverse, along with tidygraph and ggraph, to easily analyze graph data.


Some R Packages for ROC Curves

In a recent post, I presented some of the theory underlying ROC curves, and outlined the history leading up to their present popularity for characterizing the performance of machine learning models. In this post, I describe how to search CRAN for packages to plot ROC curves, and highlight six useful packages. Although I began with a few ideas about packages that I wanted to talk about, like ROCR and pROC, which I have found useful in the past, I decided to use Gábor Csárdi’s relatively new package pkgsearch to search through CRAN and see what’s out there.


January 2019: “Top 40” New CRAN Packages

One hundred and fifty-three new packages made it to CRAN in January. Here are my “Top 40” picks in eight categories: Computational Methods, Data, Machine Learning, Medicine, Science, Statistics, Utilities, and Visualization.

Computational Methods

cPCG v1.0: Provides a function to solve systems of linear equations using a (preconditioned) conjugate gradient algorithm. The vignette shows how to use the package.

RcppDynProg v0.1.1: Implements dynamic programming using Rcpp. Look here for examples.


A Few New R Books

Greg Wilson is a data scientist and professional educator at RStudio.

As a newcomer to R who prefers to read paper rather than pixels, I’ve been working my way through a more-or-less random selection of relevant books over the past few months. Some have discussed topics that I’m already familiar with in the context of R, while others have introduced me to entirely new subjects. This post describes four of them in brief; I hope to follow up with a second post in a few months as I work through the backlog on my desk.


A Look Back on 2018: Part 2

Welcome to the second installment of Reproducible Finance 2019! In the previous post, we looked back on the daily returns for several market sectors in 2018. Today, we’ll continue that theme and look at some summary statistics for 2018, and then extend out to previous years and different ways of visualizing our data.


R for Quantitative Health Sciences: An Interview with Jarrod Dalton

This interview came about while we were researching R-based medical applications in preparation for the upcoming R/Medicine conference. When we discovered the impressive number of Shiny-based risk calculators developed by the Cleveland Clinic and deployed on public-facing sites, we wanted to learn more about the influence of the R language on the development of statistical science at this prominent institution. We were fortunate to have Jarrod Dalton of the Quantitative Health Sciences Department grant this interview.


December 2018: “Top 40” New CRAN Packages

By my count, 157 new packages stuck to CRAN in December. Below are my “Top 40” picks in ten categories: Computational Methods, Data, Finance, Machine Learning, Medicine, Science, Statistics, Time Series, Utilities, and Visualization. This is the first time I have used the Medicine category. I am pleased that a few packages that appear to have clinical use made the cut. Also noteworthy in this month’s selection are four packages from the Microsoft Azure team (stuffing 41 packages into the “Top 40”), and some eclectic but fascinating packages in the Science section.


Onboard and Offboard Data Manipulation in Flexdashboard

Harrison Schramm is a Professional Statistician and Non-Resident Senior Fellow at the Center for Strategic and Budgetary Assessments.

The Shiny set of tools and, by extension, Flexdashboard give professional analysts a way to rapidly put interactive versions of their work in the hands of clients. Frequently, an end user will interact with data either by uploading or downloading a new set in its entirety (typically from a .csv or other similarly structured source), or ‘on the fly’, interactively, using tools like RHandsonTable.


ROC Curves

I have been thinking about writing a short post on R resources for working with receiver operating characteristic (ROC) curves, but first I thought it would be nice to review the basics. In contrast to the usual (usual for data scientists anyway) machine learning point of view, I’ll frame the topic closer to its historical origins as a portrait of practical decision theory. ROC curves were invented during WWII to help radar operators decide whether the signal they were getting indicated the presence of an enemy aircraft or was just noise.


A Look Back on 2018: Part 1

Welcome to Reproducible Finance 2019! It’s a new year, a new beginning, the Earth has completed one more trip around the sun, and that means it’s time to look back on the previous January to December cycle.
