May 2020: "Top 40" New CRAN Packages

by Joseph Rickert

One hundred eighty-four new packages stuck to CRAN in May. The following are my “Top 40” picks in eleven categories: Data, Finance, Genomics, Marketing, Machine Learning, Medicine, Science, Statistics, Time Series, Utilities, and Visualization.


Data

covid19nytimes v0.1.3: Provides access to the NY Times Covid-19 county-level data for the US that is also available here. There is a vignette.

geodata v0.1.0: Contains small spatial datasets used to teach basic spatial analysis concepts. Datasets are based on the GeoDa software workbook and data site.

GermaParl v1.4.2: Provides access to the GermaParl corpus of parliamentary debates of the German Bundestag maintained by the PolMine Project. The vignette introduces the corpus and package.

nhlapi v0.1.2: Retrieves and processes the data exposed by the open NHL API, including information on players, teams, games, tournaments, drafts, standings, schedules and other endpoints. There are vignettes on a Low-level API, Retrieving Player Data, and Retrieving Team Data.

polAr v0.1.3: Implements a toolbox for the analysis of political and electoral data from Argentina. There are vignettes on Computing, Data Access, and Displaying Results.

rKolada v0.1.3: Provides methods for downloading and processing data and metadata from Kolada, the official Swedish regions and municipalities database. There is an Introduction and a Quick Start Guide.


Finance

strand v0.1.3: Provides a framework for performing discrete (share-level) simulations of investment strategies. Simulated portfolios optimize exposure to an input signal subject to constraints such as position size and factor exposure. The vignette on Backtesting with strand is nicely done.

TwitterAutomatedTrading v0.1.0: Provides access to the MetaTrader 5 platform, enabling users to carry out automated trading using sentiment indexes computed from Twitter and/or StockTwits. See Godeiro (2018) for background, and the vignette for how to use the package.


Genomics

immunarch v0.6.5: Provides a framework for exploratory bioinformatics analysis of bulk and single-cell T-cell receptor and antibody repertoires, including data loading, analysis, and visualization of AIRR (Adaptive Immune Receptor Repertoire) data. There is an Introduction and a vignette on Working with Data.

SubtypeDrug v0.1.0: Implements a tool to prioritize cancer subtype-specific drugs by integrating genetic perturbation, drug action, biological pathway, and cancer subtype. See Han et al. (2019) for background and the vignette for details on the package.

TransPhylo v1.4.4: Provides functions to reconstruct infectious disease transmission using genomic data. See Didelot et al. (2014) and Didelot et al. (2017) for background. See the Introduction and the vignettes: Inference of transmission tree from a dated phylogeny, Simultaneous Inference of Multiple Transmission Trees, and Simulation of outbreak data.


Marketing

CLVTools v0.5.0: Implements various probabilistic latent customer attrition models for non-contractual settings (e.g., retail business) with and without time-invariant and time-varying covariates. See Schmittlein et al. (1987) and Fader et al. (2005) for background and the vignette to get started.

grizbayr v1.2.2: Provides functions to implement Bayesian A / B and Bandit marketing tests. See Stucchio (2015) for background and the vignette to get started.
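To make the idea of a Bayesian A/B test concrete, here is a minimal sketch in Python using only the standard library. It does not use grizbayr or mirror its API. The function name `prob_b_beats_a`, the flat Beta(1, 1) priors, and the conversion counts are all assumptions chosen for illustration.

```python
import random

def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=20000, seed=1):
    """Monte Carlo estimate of P(rate_B > rate_A) under independent Beta(1, 1) priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # With a Beta(1, 1) prior, each arm's posterior is
        # Beta(1 + conversions, 1 + non-conversions).
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += rate_b > rate_a
    return wins / draws

# Hypothetical data: 100 of 1000 visitors convert on A, 130 of 1000 on B.
p = prob_b_beats_a(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(f"Estimated P(B beats A) = {p:.3f}")
```

The output is a posterior probability that variant B has the higher conversion rate, which is the kind of quantity a Bayesian test reports in place of a p-value.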

Machine Learning

applicable v0.0.1.1: Provides functions that measure how far new samples extrapolate from the training set, based on the concept of applicability domains. See Netzeva et al. (2005). There are vignettes for binary and continuous data.

piRF v0.1.0: Implements multiple state-of-the-art prediction interval methodologies for random forests including quantile regression intervals, out-of-bag intervals, bag-of-observations intervals, one-step boosted random forest intervals, bias-corrected intervals, high-density intervals, and split-conformal intervals. Look here for an example.
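Of the interval methods listed for piRF, split-conformal intervals are the simplest to sketch. The Python example below is not piRF's implementation: it swaps the random forest for an ordinary least-squares line so that it runs with the standard library alone, and the helper names `fit_line` and `split_conformal` and the simulated data are assumptions for illustration.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (stand-in for a random forest)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def split_conformal(xs, ys, alpha=0.1):
    """Return a function x -> (lower, upper) targeting 1 - alpha coverage."""
    half = len(xs) // 2
    # Fit the model on the first half, calibrate on the second half.
    a, b = fit_line(xs[:half], ys[:half])
    scores = sorted(abs(y - (a + b * x)) for x, y in zip(xs[half:], ys[half:]))
    n = len(scores)
    # Conformal quantile: the ceil((n + 1)(1 - alpha))-th smallest residual.
    # Clamped to n here; strictly, the interval is unbounded in that case.
    k = min(n, math.ceil((n + 1) * (1 - alpha)))
    q = scores[k - 1]
    return lambda x: (a + b * x - q, a + b * x + q)

# Simulated data (assumed): y = 2 + 3x plus standard normal noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(400)]
ys = [2 + 3 * x + rng.gauss(0, 1) for x in xs]

interval = split_conformal(xs, ys, alpha=0.1)
lo, hi = interval(5.0)
print(f"90% prediction interval at x = 5: ({lo:.2f}, {hi:.2f})")
```

The half-width comes entirely from held-out residuals, so the same recipe applies to any fitted model, including a random forest.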

rules v0.0.2: Provides bindings that allow prediction rule ensembles, C5.0 rules, and Cubist to be used with the parsnip package.


Medicine

AdhereRViz v0.1.0: Implements a Shiny-based GUI to the AdhereR package to allow users to access different data sources, explore patterns of medication, and compute various measures of adherence. See the vignette for details.

MrSGUIDE v0.1.1: Provides functions to facilitate subgroup analysis for single and multiple responses in both randomized trials and observational studies based on the GUIDE algorithm. See the vignette.


Science

ldsr v0.0.2: Provides functions to reconstruct streamflow and climate information using linear dynamical systems. See Nguyen and Galelli (2018) for background and the vignette for examples.

rties v5.0.0: Provides tools for investigating temporal processes in bivariate (e.g., dyadic) systems. The theoretical background can be found in Butler (2011) and Butler & Barnard (2019). There is an Overview, and vignettes on Inertia Coordination, Data Preparation, and System Variables.


Statistics

Compack v0.1.0: Implements regression methodologies with compositional covariates, including sparse log-contrast regression with compositional covariates proposed by Lin et al. (2014), and sparse log-contrast regression with functional compositional predictors proposed by Sun et al. (2020). There is a vignette.

ghypernet v1.0.0: Provides functions for model fitting and selection of generalized hypergeometric ensembles of random graphs (gHypEG). The package is based on the research by Casiraghi and collaborators. For example, see Casiraghi et al. (2016), Casiraghi (2017) and Casiraghi and Nanumyan (2018). There is an Introduction, a short Tutorial, and a vignette on Finding Significant Links.

motifcluster v0.1.0: Provides tools for spectral clustering of weighted directed networks using motif adjacency matrices. These methods, which perform well on large and sparse networks, are based on the methodology described in Underwood et al. (2020). See the vignette.

regmedint v0.1.0: Implements the regression-based causal mediation analysis with a treatment-mediator interaction term, as originally implemented in the SAS macro described in Valeri and VanderWeele (2013) and Valeri and VanderWeele (2015). There is an Introduction, and vignettes on Implementing Formulas, Bootstrapping, and Multiple Imputation.

Time Series

DeCAFS v3.1.5: Provides functions to detect abrupt changes in time series in which local fluctuations are modeled as a random walk process and autocorrelated noise as an AR(1) process. See Romano et al. (2020) for the theory.

Rdrw v1.0.1: Provides functions to fit and simulate a univariate or multivariate damped random walk process (also known as an Ornstein-Uhlenbeck process or a continuous-time autoregressive model of the first order) which is suitable for analyzing time series data with irregularly-spaced observation times and heteroscedastic measurement errors. See Hu and Tak (2020) for background.
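A damped random walk is straightforward to simulate at irregularly spaced times because its transition distribution is known exactly. The Python sketch below is not Rdrw's interface. The function name `simulate_drw`, the parameterization dX = -(X - mu)/tau dt + sigma dW, and the parameter values are assumptions for illustration, and measurement error is left out.

```python
import math
import random

def simulate_drw(times, mu=0.0, sigma=1.0, tau=5.0, seed=0):
    """Simulate an Ornstein-Uhlenbeck path at the given increasing times."""
    rng = random.Random(seed)
    # Stationary variance of dX = -(X - mu) / tau dt + sigma dW.
    stat_var = sigma ** 2 * tau / 2
    x = rng.gauss(mu, math.sqrt(stat_var))  # start from the stationary law
    path = [x]
    for t_prev, t_next in zip(times, times[1:]):
        # Exact transition over a gap of any length: the mean decays toward mu
        # and the variance grows toward the stationary variance.
        decay = math.exp(-(t_next - t_prev) / tau)
        mean = mu + (x - mu) * decay
        var = stat_var * (1 - decay ** 2)
        x = rng.gauss(mean, math.sqrt(var))
        path.append(x)
    return path

# Irregular observation times built from exponential gaps (assumed).
rng = random.Random(42)
times, t = [], 0.0
for _ in range(200):
    t += rng.expovariate(1.0)
    times.append(t)

path = simulate_drw(times)
print(len(path), [round(v, 2) for v in path[:3]])
```

Because each step uses the exact conditional distribution, uneven gaps between observations need no time discretization.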

statespacer v0.1.0: Provides functions for estimating time series using the state space method. For background see JSS Vol 41. The package has an Introduction, a Dictionary for the model object, and vignettes on Fitting an ARIMA Model, an Example, and Specifying a new model component.


Utilities

almanac v0.1.1: Provides tools for implementing recurrence rules, i.e. functions for defining recurring events. There is an Introduction and vignettes on Adjusting and Shifting Dates, iCalendar Specification, and Quarterly Rules.

gdiff v0.2-1: Provides functions for performing graphical difference testing. Look here for more information.

i2dash v0.2.1: Provides functions for creating web-based dashboards. See the vignette.

pkgndep v1.0.0: Provides functions to check and visualize the “heaviness” of R packages. See the vignette.

presser v1.0.0: Implements the web service and functions to test web clients without using the internet.

stringfish v0.12.1: Implements a framework for performing string and sequence operations using the alt-rep system to speed up the computation of common string operations. See the vignette.

worcs: Implements the Workflow for Open Reproducible Code in Science, WORCS. There is an Introduction, and vignettes on citing, git_cloud, and setup.


Visualization

ggpacman v0.1.0: Reproduces the game Pac-Man using ggplot2 and gganimate. Look here for more information.

iNZightTS v1.5.2: Provides tools for working with time series data, including functions for drawing, decomposing, and forecasting, comparing multiple series, and fitting both additive and multiplicative models. Look here for more information.

prismadiagramR v1.0.0: Provides functions to create PRISMA diagrams used to track the identification, screening, eligibility, and inclusion of studies in a systematic review. See the vignette.

sketcher v0.1.3: Implements image processing effects that convert a photo into a line drawing image. See Tsuda (2020) for background and look here for examples.

upsetjs v1.3.1: Provides an htmlwidget wrapper for the JavaScript UpSet.js library. There is an Introduction, and vignettes on Coloring, Combination Modes, and Venn and Euler Diagrams.

xaringanthemer v0.3.0: Provides functions to create custom CSS themes. There is an Overview, and vignettes on ggplot2 Themes and Template Variables.

