Accelerate your plots with ggforce

In this post, I will walk you through some examples that show off the major features of the ggforce package. The main goal is to share a few ideas about customizing visualizations that you may find useful in your everyday work. The ggforce package is an extension to ggplot2 developed by Thomas Pedersen. Thanks to ggforce, you can enhance almost any ggplot by highlighting data groupings and focusing attention on interesting features of the plot.

R/Medicine 2019 Workshops

R/Medicine 2019 kicked off on Thursday with two outstanding workshops. It was difficult to choose between the two, but fortunately both presenters developed rich sets of materials that are available online. Alison Hill delivered R Markdown for Medicine with an elegant HTML exposition masterfully created to cultivate beginners while still engaging experienced R Markdown users. In four sections, (1) R Markdown Anatomy, (2) Outputs and Tables, (3) Graphics for Communication, and (4) Data and Workflows, she developed aspects of R Markdown aimed at statisticians and clinicians writing medical documents, which should also delight a wide audience of R Markdown users.

How to Send Custom E-mails with R

A common business-oriented data science task is to programmatically craft and send custom e-mails. In this post, I will show how to accomplish this with R on the RStudio Connect platform (a paid product built for the enterprise) using the blastula package. blastula provides a set of functions for composing high-quality HTML e-mails that render across various e-mail clients, such as Gmail and Outlook, and also includes tooling for sending out those e-mails via SMTP, the standard protocol for electronic mail transmission between different e-mail providers.

July 2019 "Top 40" R Packages

One hundred seventy-six new packages made it to CRAN in July. Here are my “Top 40” picks organized into twelve categories: Data, Data Science, Finance, Genomics, Machine Learning, Mathematics, Medicine, Statistics, Time Series, Topological Data Analysis, Utilities and Visualization.

Data:
eia v0.3.2: Provides API access to data from the US Energy Information Administration (EIA). Use of the API requires a free API key. See the Package Overview.
litteR v0.4.1: Implements a user interface to analyze litter data: beach litter, riverain litter, floating litter, seafloor litter, etc.

Calculating Always-Valid p-values in R

In this post, we will develop a framework for always-valid inference based on the paper Always Valid Inference: Continuous Monitoring of A/B Tests (Johari, Pekelis, and Walsh, 2019). Using an always-valid p-value allows us to continuously monitor A/B tests, and potentially stop the test early in a valid way. In section 5 of the paper, the authors propose their method for calculating always-valid p-values: the mixture sequential probability ratio test (mSPRT), first introduced by Robbins (1970).
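
For orientation, the mSPRT-based p-value can be sketched roughly as follows. This is a summary of the general construction, not a quotation from the post, and the notation is assumed here: $H$ is a mixing distribution over the parameter $\theta$, $f_\theta$ is the density of a single observation, and $\theta_0$ is the null value.

```latex
\Lambda_n = \int_\Theta \prod_{i=1}^{n} \frac{f_\theta(X_i)}{f_{\theta_0}(X_i)} \, dH(\theta),
\qquad
p_0 = 1, \quad p_n = \min\left\{ p_{n-1},\; \frac{1}{\Lambda_n} \right\}
```

Because $p_n$ never increases, an analyst can check it after every observation and stop as soon as $p_n \le \alpha$.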

Tech Dividends, Part 2

In a previous post, we explored the dividend history of stocks included in the S&P 500, and we followed that with exploring the dividend history of some NASDAQ tickers. Today’s post is a short continuation of that tech dividend theme, with the aim of demonstrating how we can take our previous work and use it to quickly visualize research from the real world. In this case, the inspiration is the July 27th edition of Barron’s, which has an article called 8 Tech Stocks That Yield Steady Payouts.

Plumber Logging

The plumber R package is used to expose R functions as API endpoints. Due to plumber’s incredible flexibility, most major API design decisions are left up to the developer. One important consideration when developing APIs is how to log information about API requests and responses. This information can be used to determine how plumber APIs are performing and how they are being utilized. An example of logging API requests in plumber is included in the package documentation.

Tech Dividends, Part 1

In a previous post, we explored the dividend history of stocks included in the S&P 500. Today, we’ll extend that analysis to cover the Nasdaq because, well, because in the previous post I said I would do that. We’ll also explore a different source for dividend data, do some string cleaning and check out ways to customize a tooltip in plotly. Bonus feature: we’ll get into some animation too.

Validating Type I and II Errors in A/B Tests in R

In this post, we seek to develop an intuitive sense of what type I (false-positive) and type II (false-negative) errors represent when comparing metrics in A/B tests, in order to gain an appreciation for “peeking”, one of the major problems plaguing the analysis of A/B tests today. To better understand what “peeking” is, it helps to first understand how to properly run a test. We will focus on the case of testing whether there is a difference between the conversion rates cr_a and cr_b for groups A and B.
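
As a rough illustration of a type I error rate, the sketch below simulates many tests in which groups A and B share the same true conversion rate, so every rejection is a false positive. It is not taken from the post: the helper name `simulate_p_value`, the sample size, the 10% baseline rate, and the number of simulations are all assumptions chosen for illustration, and base R's `prop.test` stands in for whatever test the post uses.

```r
# Simulate tests where cr_a == cr_b, so any "significant" result
# is a false positive (type I error).
set.seed(42)

# Hypothetical helper: run one test with n users per group and
# return the p-value of a two-sample test of proportions.
simulate_p_value <- function(n, cr_a, cr_b) {
  conv_a <- rbinom(1, n, cr_a)  # conversions observed in group A
  conv_b <- rbinom(1, n, cr_b)  # conversions observed in group B
  prop.test(c(conv_a, conv_b), c(n, n))$p.value
}

alpha <- 0.05
p_values <- replicate(
  2000,
  simulate_p_value(n = 5000, cr_a = 0.10, cr_b = 0.10)
)

# Share of simulated tests that reject the null at level alpha
type_1_rate <- mean(p_values < alpha)
type_1_rate
```

When each test is evaluated only once, at its planned sample size, the observed rate should sit near or below `alpha`. Setting `cr_b` to a different value than `cr_a` turns the same simulation into an estimate of power, and one minus that is the type II error rate.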

June 2019 "Top 40" R Packages

Approximately 136 new packages stuck to CRAN in June. (This number is difficult to nail down with certainty because packages may be removed from CRAN after sitting there for a few days.) Here are my picks for the June “Top 40” in ten categories: Computational Methods, Data, Finance, Genomics, Machine Learning, Science and Medicine, Statistics, Time Series, Utilities, and Visualization.

Computational Methods:
cppRouting v1.1: Provides functions to calculate distances, shortest paths and isochrones on weighted graphs using several variants of Dijkstra's algorithm.
