Sept 2020: "Top 40" New CRAN Packages

by Joseph Rickert

Two hundred thirty-six new packages made it to CRAN in September. Here are my “Top 40” picks in eleven categories: Computational Methods, Data, Finance, Genomics, Machine Learning, Mathematics, Medicine, Statistics, Time Series, Utilities and Visualization. The large number of packages and, in my opinion, the high percentage of high-quality work made choosing only forty more difficult than in most months.

Computational Methods

pmwg v0.1.9: Provides an R implementation of the Particle Metropolis within Gibbs sampler for model parameter, covariance matrix, and random effect estimation, as described in Gunawan et al. (2020). There is a Tutorial.

sanic v0.0.1: Provides access to Eigen C++ library routines for solving large systems of linear equations. Direct and iterative solvers available include Cholesky, LU, QR, and Krylov subspace methods.
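
To make the idea of a direct solver concrete, here is a minimal pure-Python sketch of solving a symmetric positive definite system with a Cholesky factorization followed by two triangular solves. It is an illustration of the technique only, not sanic's interface or Eigen's implementation; the function names `cholesky` and `solve_spd` and the small example system are assumptions made for this sketch.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L * L^T == a.

    Assumes a is a symmetric positive definite matrix given as a list of rows.
    """
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def solve_spd(a, b):
    """Solve a x = b by factoring a = L L^T, then solving L y = b and L^T x = y."""
    low = cholesky(a)
    n = len(b)
    # Forward substitution: L y = b
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    # Back substitution: L^T x = y
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


# Hypothetical 2x2 system: 4x + 2y = 10, 2x + 3y = 8
A = [[4.0, 2.0], [2.0, 3.0]]
b = [10.0, 8.0]
x = solve_spd(A, b)
```

For large sparse systems, a library such as Eigen exploits sparsity and pivoting strategies that this dense sketch ignores.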

Data

cmsafops v1.0.0: Provides functions for the analysis and manipulation of CM SAF climate monitoring data. Detailed information and test data are available here.

friends v0.1.0: Provides complete scripts from the American sitcom Friends in tibble format. Use this package to practice data wrangling, text analysis and network analysis. See README for examples.

nflfastR v3.0.0: Provides functions to access National Football League play-by-play data. Look here for examples.

od v0.0.1: Provides tools and example datasets for working with origin-destination (‘OD’) datasets of the type used to describe aggregate urban mobility patterns (see Carey et al. (1981)), and supports the sf class system of Pebesma (2018). See the vignette for a brief introduction to OD data.

Finance

GARCHIto v0.1.0: Provides functions to estimate model parameters and forecast future volatilities using the Unified GARCH-Ito (Kim and Wang (2016)) and Realized GARCH-Ito (Song et al. (2020)) models. See the vignette for an introduction.

LifeInsuranceContracts v0.0.2: Provides a framework for modeling traditional life insurance contracts such as annuities, whole life insurances or endowments and includes modeling profit participation schemes, dynamic increases or more general contract layers, as well as contract changes. See the vignette for details.

Genomics

dPCP v1.0.3: Implements automated clustering and quantification of digital PCR data based on a combination of the DBSCAN (Hahsler et al. (2019)) and c-means (Bezdek et al. (1981)) algorithms. See the vignette for examples.

MAPITR v1.1.2: Implements the algorithms described in Turchin et al. (2020) for identifying marginal epistasis between pathways and the rest of the genome. See the vignette for an example with simulated data.

Machine Learning

FuncNN v1.0: Allows the user to build models of the form: f(z, g(x) | θ) where f() is a neural network, z is a vector of scalar covariates, and g(x) is a vector of functional covariates. The package is built on top of the Keras/Tensorflow architecture. See Thind et al. (2020) for information on the methodology, and README for an example.

shapr v0.1.3: Implements the method for computing Shapley Values which accounts for feature dependence as described in Aas et al. (2019) to help interpret machine learning models. See the vignette for details.
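
For readers new to Shapley values, the sketch below computes them exactly from the textbook definition: each feature's value is its marginal contribution averaged over all orderings of the features. This is not the estimator of Aas et al. (2019) and not shapr's interface; the function `shapley_values` and the toy value function `toy_value` are hypothetical, chosen only to make the definition tangible.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings.

    `value` maps a frozenset of players to a number. The cost grows
    factorially with the number of players, so this is for tiny examples only.
    """
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}


def toy_value(coalition):
    """Hypothetical 'model output' when only the features in `coalition` are known."""
    v = 0.0
    if "x1" in coalition:
        v += 2.0
    if "x2" in coalition:
        v += 1.0
    if "x1" in coalition and "x2" in coalition:
        v += 1.0  # interaction term, split evenly between x1 and x2
    return v  # x3 never contributes


phi = shapley_values(["x1", "x2", "x3"], toy_value)
```

The contributions sum to the value of the full coalition, which is the efficiency property that makes Shapley values attractive for explaining individual predictions.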

rMIDAS v0.1.0: Implements the method for multiple imputation using denoising autoencoders described in Lall & Robinson (2020) that has advantages for large data sets.

Mathematics

Riemann v0.1.0: Provides algorithms for manifold-valued data, including Fréchet summaries, hypothesis testing, clustering, visualization, and other learning tasks. Look here for the math.

simplextree v1.0.1: Provides an interface to a Simplex Tree data structure which enables efficient manipulation of simplicial complexes of any dimension. See Boissonnat & Maria (2014) for background and look here for a quickstart.

topsa v0.1.0: Provides functions to estimate geometric sensitivity indices reconstructing the embedding manifold of a data set. Detailed information can be found in Hernandes et al.

Medicine

card v0.1.0: Provides tools to help assess the autonomic regulation of cardiovascular physiology with respect to electrocardiography, circadian rhythms, and the clinical risk of autonomic dysfunction on cardiovascular health through the perspective of epidemiology and causality. For background on the analysis of circadian rhythms through cosinor analysis see Cornelissen (2014) and Refinetti et al. (2014). There are two vignettes: circadian and cosinor.

EpiNow2 v1.2.1: Provides functions to estimate the time-varying reproduction number, rate of spread, and doubling time using a range of open-source tools. See Abbott et al. (2020) for background, Gostic et al. (2020) for current best practices, and README for examples.
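
The relationship between rate of spread and doubling time can be sketched in a few lines. Assuming constant exponential growth at rate r per day, cases double every ln(2)/r days. This is only that arithmetic identity, not the estimation machinery in EpiNow2; the function names and the example counts are hypothetical.

```python
import math


def growth_rate_from_counts(count_start, count_end, days):
    """Exponential growth rate per day implied by two counts `days` apart."""
    return math.log(count_end / count_start) / days


def doubling_time(growth_rate):
    """Days needed for cases to double at a constant exponential growth rate."""
    if growth_rate <= 0:
        raise ValueError("doubling time is only defined for positive growth")
    return math.log(2) / growth_rate


# Hypothetical example: cases go from 100 to 200 over 7 days
r = growth_rate_from_counts(100, 200, 7)
t_double = doubling_time(r)
```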

psrwe v1.2: Provides tools to incorporate real-world evidence (RWE) into regulatory and health care decision making and includes functions which implement the PS-integrated RWE analysis methods proposed in Wang et al. (2019), Wang et al. (2020), and Chen et al. (2020). There is a vignette on propensity score integration.

Tplyr v0.1.3: Implements a tool to simplify table creation and the data manipulation necessary to create clinical reports. There is a Getting Started Guide, and vignettes on Layers, Options, and Tables.

Statistics

bkmrhat v1.0.0: Extends the Bayesian kernel machine regression package bkmr to allow multiple-chain inference and diagnostics by leveraging functions from the future, rstan, and coda packages. See Bobb et al. (2018) for background and the vignette for examples.

densEstBayes v1.0-1: Provides functions for density estimation via Bayesian inference engines including Hamiltonian Monte Carlo, the no U-turn sampler, semiparametric mean field variational Bayes and slice sampling. The methodology is described in Wand and Yu (2020). The vignette has several examples.

EquiSurv v0.1.0: Provides both a non-parametric and a parametric approach to investigating the equivalence (or non-inferiority) of two survival curves obtained from two given datasets. Tests are based on the creation of confidence intervals at pre-specified time points. See Möllenhoff & Tresch (2020) for all of the details.

gmGeostats v0.10-7: Provides functions to support the geostatistical analysis of multivariate data, in particular data with restrictions. See Tolosana-Delgado et al. (2018) for background and the vignette for the basics.

hermiter v1.0.0: Provides functions for estimating the full probability density function, cumulative distribution function and quantile function using Hermite series based estimators which are particularly useful in the sequential setting (both stationary and non-stationary) and one-pass batch estimation setting for large data sets. See Stephanou et al. (2017) and Stephanou et al. (2020) for background and the vignette for examples.

ivreg v0.5-0: Implements instrumental variable estimation for linear models by two-stage least-squares (2SLS) regression. Several methods are provided for fitted ivreg model objects, including extensive functionality for computing and graphing regression diagnostics in addition to other standard model tools. There is an overview and a vignette on diagnostics.
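
As a rough illustration of two-stage least squares with one endogenous regressor and one instrument, the sketch below first regresses the regressor on the instrument, then regresses the outcome on the fitted values. It is not ivreg's interface; the helper names and the small data set (built so that the confounder `u` is exactly uncorrelated with the instrument `z`) are assumptions for this example.

```python
def ols(x, y):
    """Return (intercept, slope) of the least-squares fit y ~ x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(z, x, y):
    """2SLS for y ~ x with a single instrument z."""
    # Stage 1: project the endogenous regressor onto the instrument
    a0, a1 = ols(z, x)
    x_hat = [a0 + a1 * zi for zi in z]
    # Stage 2: regress the outcome on the projected regressor
    return ols(x_hat, y)


# Hypothetical data: u confounds x and y but is uncorrelated with z
z = [0.0, 1.0, 2.0, 3.0]
u = [1.0, -1.0, -1.0, 1.0]
x = [zi + ui for zi, ui in zip(z, u)]
y = [1.0 + 2.0 * xi + 3.0 * ui for xi, ui in zip(x, u)]  # true slope is 2

naive_intercept, naive_slope = ols(x, y)           # biased by the confounder
iv_intercept, iv_slope = two_stage_least_squares(z, x, y)
```

In this constructed example the naive regression overstates the slope, while the instrumented estimate recovers the coefficient used to generate the data.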

mcmcsae v0.5.0: Provides functions to fit multi-level models with possibly correlated random effects using Markov Chain Monte Carlo simulation. There are vignettes on Area-level models, Linear Regression, and Unit-level models.

rater v1.0.0: Provides functions to fit models based on Dawid & Skene (1979) to repeated categorical data. The vignette describes the modeling workflow.

testtwice v1.0.3: Implements the method of Rosenbaum (2012) to test one hypothesis with several test statistics while correcting for multiple testing.

txshift v0.3.4: Provides functions to estimate the population-level causal effects of stochastic interventions on a continuous-valued exposure. The causal parameter and estimation methodology are described in Díaz & van der Laan (2013). There is an Introduction to Targeted Learning and an additional vignette with a more advanced example.

vacuum v0.1.0: Implements Tukey’s FUNOP (FUll NOrmal Plot), FUNOR-FUNOM (FUll NOrmal Rejection-FUll NOrmal Modification), and vacuum cleaner procedures to identify, treat, and analyze outliers in contingency tables. See Tukey (1962). There is a vignette on the vacuum.

Time Series

localFDA v1.0.0: Implements a theoretically supported alternative to k-nearest neighbors for functional data to solve problems of estimating unobserved segments of a partially observed functional data sample, functional classification and outlier detection. The methodology and details are in Elías et al. (2020). Look here for some examples.

onlineforecast v0.9.3: Implements a framework for fitting adaptive forecasting models that provides a way to use forecasts as input to models, e.g. weather forecasts for energy related forecasting. There are vignettes on Forecast Evaluation, Model Setup, and Data Setup.

Utilities

cmdfun v1.0.2: Provides a framework for building function calls to interface with shell commands by allowing lazy evaluation of command line arguments. It is intended to enable package builders to wrap command line software, and to help analysts stay inside the R environment. Full documentation is on the package website.

duckdb v0.2.1-2: The DuckDB project is an embedded analytical data management system with support for the Structured Query Language (SQL). This package includes all of DuckDB and an R Database Interface (DBI) connector.

path.chain v0.2.0: Provides the path_chain class and functions which facilitate loading and saving directory structures in YAML configuration files via the config package. There is a vignette on Path Validation and another on Config Files.

procmaps v0.0.3: Provides functions to determine which library or other region is mapped to a specific address of a process. It is the equivalent of /proc/self/maps as a data frame, and is designed to work on all major platforms.
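
To show what "the equivalent of /proc/self/maps as a data frame" involves, here is a small sketch that parses lines in the Linux /proc/[pid]/maps format into records and looks up which region contains a given address. It is not procmaps code; the `Mapping` type, the function names, and the sample text are made up for illustration, and the sketch parses a string rather than reading the live file so that it runs on any platform.

```python
from typing import NamedTuple, Optional


class Mapping(NamedTuple):
    start: int           # first address of the region
    end: int             # one past the last address (exclusive)
    perms: str           # e.g. "r-xp"
    offset: int
    dev: str
    inode: int
    path: Optional[str]  # backing file or pseudo-path, if any


def parse_maps_line(line):
    """Parse one line: 'start-end perms offset dev inode [path]'."""
    fields = line.split(None, 5)
    addr, perms, offset, dev, inode = fields[:5]
    path = fields[5].strip() if len(fields) > 5 else None
    start, end = (int(part, 16) for part in addr.split("-"))
    return Mapping(start, end, perms, int(offset, 16), dev, int(inode), path or None)


def find_mapping(mappings, address):
    """Return the mapping whose [start, end) range contains address, else None."""
    for m in mappings:
        if m.start <= address < m.end:
            return m
    return None


# Hypothetical sample in the /proc/self/maps layout
SAMPLE = """\
00400000-0040c000 r-xp 00000000 08:01 131104 /usr/bin/cat
7ffd1c000000-7ffd1c021000 rw-p 00000000 00:00 0 [stack]
7f0000000000-7f0000001000 rw-p 00000000 00:00 0
"""

mappings = [parse_maps_line(line) for line in SAMPLE.splitlines()]
hit = find_mapping(mappings, 0x00400ABC)
```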

robservable v0.2.0: Enables loading and displaying online JavaScript Observable notebooks. Have a look at the Gallery, the Introduction, and the vignette on Shiny Applications.

Visualization

catmaply v0.9.0: Implements methods and plotting functions for displaying categorical data on an interactive heatmap using plotly. In addition to the viewer pane, resulting plots can be saved as a standalone HTML file, embedded in R Markdown documents or in a Shiny app. The vignette offers examples.

diffviewer v0.1.0: Implements an HTML widget that shows differences between files (text, images, and data frames).

ggip v0.2.0: Extends ggplot2 to enable the visualization of IP (Internet Protocol) addresses and networks using space-filling curves that map the address space onto Cartesian coordinates. It offers full support for both IPv4 and IPv6 address spaces. There is an Introduction and a vignette on Visualizing IP Data.
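
The mapping from a one-dimensional address space to a plane can be sketched with a Hilbert curve, one common space-filling curve. The code below uses the standard index-to-coordinate conversion and places an IPv4 address on a 16 x 16 grid by its leading 8 bits. It is an illustration of the general idea, not ggip's implementation or interface; the function names and the choice of grid size are assumptions.

```python
import ipaddress


def hilbert_d2xy(order, d):
    """Map index d in [0, 4**order) to (x, y) on a 2**order by 2**order grid."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Flip the quadrant built so far
                x, y = s - 1 - x, s - 1 - y
            # Rotate by swapping axes
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def ipv4_to_cell(address, order=4):
    """Grid cell for an IPv4 address, using its leading 2*order bits as the index."""
    value = int(ipaddress.IPv4Address(address))
    index = value >> (32 - 2 * order)
    return hilbert_d2xy(order, index)


# With order=4 every address in the same /8 network lands in the same cell
cell = ipv4_to_cell("10.0.0.1")
```

Consecutive indices always land in neighboring cells, so numerically adjacent networks stay visually close, which is why such curves suit plots of address space.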
