February 2020: "Top 40" New R Packages

2020-03-26

by Joseph Rickert

One hundred sixty-four new packages made it to CRAN in February. Here are my “Top 40” picks in eleven categories: Computational Methods, Data, Genomics, Machine Learning, Mathematics, Medicine, Science, Statistics, Time Series, Utilities, and Visualizations.

Computational Methods

delayed v0.3.0: Implements mechanisms to parallelize dependent tasks in a manner that optimizes the use of computational resources. Functions produce “delayed computations” which may be parallelized using futures. See the vignette for details.

tergmLite v2.1.7: Provides functions to efficiently simulate dynamic networks estimated with the framework for temporal exponential random graph models implemented in the tergm package.

Data

crsmeta v0.2.0: Provides functions to obtain coordinate system metadata from various data formats, including CRS (Coordinate Reference System), EPSG (European Petroleum Survey Group), PROJ4, and WKT2 (Well-Known Text 2).

danstat v0.1.0: Implements an interface to the Statistics Denmark Databank API. The vignette provides an Introduction.

osfr v0.2.8: Implements an interface for interacting with OSF which enables users to access open research materials and data, or to create and manage private or public projects. There is a Getting Started Guide and a vignette on Authentication.

Genomics

selectSNPs v1.0.1: Provides a method using unified local functions to select low-density SNPs. See the Vignette for a tutorial.

varitas v0.0.1: Implements a multi-caller variant analysis pipeline for targeted analysis sequencing data. There is an Introduction and a vignette on Errors.

Machine Learning

autokeras v1.0.1: Implements an interface to AutoKeras, an open source software library for automated machine learning. See README for an example.

MTPS v0.1.9: Implements functions to predict simultaneous multiple outcomes based on revised stacking algorithms as described in Xing et al. (2019). See the vignette to get started.

quanteda.textmodels v0.9.1: Implements methods for scaling models and classifiers based on sparse matrix objects representing textual data. It includes implementations of the Laver et al. (2003) wordscores model, the Perry & Benoit (2017) class affinity scaling model, and the Slapin & Proksch (2008) wordfish model. See the vignette to get started.

SeqDetect v1.0.7: Implements the automaton model found in Krleža, Vrdoljak & Brčić (2019) to detect and process sequences. See the vignette for examples and theory.

studyStrap v1.0.0: Implements multi-study learning algorithms such as Merging, Study-Specific Ensembling (Trained-on-Observed-Studies Ensemble), the Study Strap, and the Covariate-Matched Study Strap, and offers over 20 similarity measures. See Kishida et al. (2019) for background and the vignette for how to use the package.

Mathematics

PlaneGeometry v1.1.0: Provides R6 classes representing triangles, circles, circular arcs, ellipses, elliptical arcs and lines, plot methods, transformations and more. The vignette offers multiple examples.

Medicine

beats v0.1.1: Provides functions to import data from UFI devices and process electrocardiogram (ECG) data. It also includes a Shiny app for finding and exporting heart beats. See README to get started.

NMADiagT v0.1.2: Implements the hierarchical summary receiver operating characteristic model developed by Ma et al. (2018) and the hierarchical model developed by Lian et al. (2019) for performing meta-analysis. It is able to simultaneously compare one to five diagnostic tests within a missing data framework.

SAMBA v0.9.0: Implements several methods, as proposed in Beesley & Mukherjee (2020), for obtaining bias-corrected point estimates along with valid standard errors using electronic health records data with misclassified EHR-derived disease status. See the vignette for details.

Science

baRUlho v1.0.1: Provides functions to facilitate acoustic analysis of (animal) sound transmission experiments including functions for data preparation, analysis and visualization. See Dabelsteen et al. (1993) for background and the vignette for an introduction.

CBSr v1.0.3: Uses monotonically constrained Cubic Bezier Splines to approximate latent utility functions in intertemporal choice and risky choice data. See Lee et al. (2019) for the details.

Statistics

blockCV v2.1.1: Provides functions for creating spatially or environmentally separated folds for cross-validation in spatially structured environments, and methods for visualizing the effective range of spatial autocorrelation to separate training and testing datasets, as described in Valavi et al. (2019). See the vignette for examples.

BGGM v1.0.0: Implements the methods for fitting Bayesian Gaussian graphical models recently introduced in Williams (2019), Williams & Mulder (2019) and Williams et al. (2019). There are vignettes on Credible Intervals, Plotting Network Structure, Comparing GGMs with the Posterior Predictive Distributions, and Predictability.

metagam v0.1.0: Provides a method to perform the meta-analysis of generalized additive models and generalized additive mixed models, including functionality for removing individual participant data from models computed using the mgcv and gamm4 packages. A typical use case is a situation where data cannot be shared across locations, and an overall meta-analytic fit is sought. For the details see Sorensen et al. (2020), Zanobetti (2000), and Crippa et al. (2018). There is an Introduction and vignettes on Dominance, Heterogeneity Plots, and Multivariate Smooth Terms.

MKpower v0.4: Provides functions for power analysis and sample size calculations for Welch and Hsu t-tests, Wilcoxon rank sum tests and diagnostic tests. See Flahault et al. (2005) and Dobbin & Simon (2007) for background, and the vignette for examples.

mvrsquared v0.0.3: Implements a method to compute the coefficient of determination for outcomes in n-dimensions. See Jones (2019) for the theory and the vignette to get started.
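To make the idea of an n-dimensional coefficient of determination concrete, here is a minimal sketch in Python. It assumes the geometric definition, one minus the ratio of two sums of squared Euclidean distances, and the function name `r_squared` is hypothetical. It is an illustration of the concept, not the mvrsquared API.

```python
def r_squared(y, yhat):
    """Coefficient of determination when each outcome is a point in n dimensions.

    y, yhat: equal-length lists of rows; each row is one observation
    (or its prediction) with the same number of coordinates.
    Assumes the observations are not all identical (otherwise sst is zero).
    """
    n = len(y)
    dims = len(y[0])
    # Mean vector of the observed outcomes.
    ybar = [sum(row[j] for row in y) / n for j in range(dims)]
    # Sum of squared Euclidean distances between observations and predictions.
    sse = sum((a - b) ** 2 for row, pred in zip(y, yhat) for a, b in zip(row, pred))
    # Sum of squared Euclidean distances between observations and the mean vector.
    sst = sum((a - m) ** 2 for row in y for a, m in zip(row, ybar))
    return 1.0 - sse / sst


if __name__ == "__main__":
    observed = [[1, 2], [3, 4], [5, 6]]
    print(r_squared(observed, observed))       # perfect predictions
    print(r_squared(observed, [[3, 4]] * 3))   # always predicting the mean vector
```

With one-dimensional outcomes this reduces to the familiar 1 - SSE/SST.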

pdynmc v0.8.0: Provides functions to model linear dynamic panel data based on linear and nonlinear moment conditions as proposed by Holtz-Eakin et al. (1988), Ahn & Schmidt (1995), and Arellano & Bover (1995). See the vignette for the underlying theory and a sample session.

Superpower v0.0.3: Provides functions to simulate ANOVA designs of up to three factors, and to calculate the observed power and average observed effect size for all main effects and interactions. See Lakens & Caldwell (2019) for background, and the vignette for an introduction.

tune v0.0.1: Provides functions and classes for use in conjunction with other tidymodels packages for finding reasonable values of hyper-parameters in models, pre-processing methods, and post-processing steps. Look here for an example.

xrnet v0.1.7: Provides functions to fit hierarchical regularized regression models incorporating potentially informative external data as in Weaver & Lewinger (2019). See README for examples.

Time Series

seer v1.4.1: Implements a framework for selecting time series forecast models based on features calculated from the time series. For details see Talagala et al. (2018).

testcorr v0.1.2: Provides functions for computing test statistics for the significance of autocorrelation in univariate time series, cross-correlation in bivariate time series, Pearson correlations in multivariate series and test statistics for i.i.d. property of univariate series as described in Dalla et al. (2019). See the vignette for the math and examples.

Utilities

bioC.logs v1.1: Fetches download statistics from BioConductor.org. See the vignette.

matricks v0.8.2: Provides functions to help with the creation of complex matrices, along with a plotting function. See the vignette for examples.

rco v1.0.1: Provides functions to automatically apply different strategies to optimize R code. These functions take R code as input and return R code as output. There are vignettes on: Contributing an optimizer, Docker files, Common Subexpression Elimination, Constant Folding, Constant Propagation, Dead Code Elimination, Dead Expression Elimination, Dead Store Elimination, and Loop-invariant Code Motion.
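For readers unfamiliar with source-to-source optimizers, the sketch below shows one of the strategies named above, constant folding, as a code-in, code-out transformation. It is an analogy only: it works on Python source with the standard `ast` module (Python 3.9+ for `ast.unparse`), handles just `+`, `-`, and `*` on numeric literals, and the names `ConstantFolder` and `fold_constants` are hypothetical. It says nothing about how rco itself is implemented.

```python
import ast
import operator

# Operators this sketch knows how to fold (deliberately a small subset).
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class ConstantFolder(ast.NodeTransformer):
    """Replace binary operations on two numeric literals with their value."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first (bottom-up)
        op = _OPS.get(type(node.op))
        if (
            op is not None
            and isinstance(node.left, ast.Constant)
            and isinstance(node.right, ast.Constant)
            and isinstance(node.left.value, (int, float))
            and isinstance(node.right.value, (int, float))
        ):
            folded = ast.Constant(op(node.left.value, node.right.value))
            return ast.copy_location(folded, node)
        return node


def fold_constants(source):
    """Take source code as a string and return optimized source code."""
    tree = ConstantFolder().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


if __name__ == "__main__":
    print(fold_constants("x = 2 * 3 + y"))
```

The design point is the same one the package description makes: the optimizer never runs the program, it only rewrites the text of the program into an equivalent, cheaper form.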

slider v0.1.2: Provides type-stable rolling window functions over any R data type and supports both cumulative and expanding windows. See the vignette for examples.
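As a rough illustration of what a rolling window function does, and of how an expanding (cumulative) window is just a window with an unbounded left edge, here is a minimal Python sketch. The function `slide` and its parameters are names chosen for this example and are not the slider package's interface.

```python
def slide(xs, f, before=0, after=0, complete=False):
    """Apply f to a window around each element of xs.

    before / after: how many neighbours to include on each side of the
        current element; None means unbounded (an expanding window).
    complete: if True, positions whose window would run past either
        end of xs yield None instead of a partial-window result.
    """
    n = len(xs)
    out = []
    for i in range(n):
        lo = 0 if before is None else i - before
        hi = n - 1 if after is None else i + after
        if complete and (lo < 0 or hi > n - 1):
            out.append(None)
            continue
        # Clip the window to the valid index range.
        window = xs[max(lo, 0): min(hi, n - 1) + 1]
        out.append(f(window))
    return out


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    print(slide(data, sum, before=1))                 # rolling sum of width 2
    print(slide(data, sum, before=None))              # expanding (cumulative) sum
    print(slide(data, sum, before=1, complete=True))  # only complete windows
```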

taxadb v0.1.0: Provides fast, consistent access to taxonomic data, and supports common tasks such as resolving taxonomic names to identifiers and looking up higher classification ranks of given species. There is an Introduction and a Schema.

tidyfst v0.8.8: Provides a toolkit of tidy data manipulation verbs with data.table as the backend, combining the merits of syntax elegance from dplyr and computing performance from data.table. There is a vignette written in Chinese, an English Language Introduction and vignettes on join, reshape, nest, fst and dt.

tidytable v0.3.2: Provides an rlang compatible interface to data.table. See README for examples.

Visualizations

iNzightTools v1.8.3: Provides wrapper functions for common variable and dataset manipulation workflows primarily used by iNZight, a graphical user interface providing easy exploration and visualization of data for students. Many functions return the tidyverse code used to obtain the result in an effort to bridge the gap between GUI and coding.

IPV v0.1.1: Provides functions to generate item pool visualizations which are used to display the conceptual structure of a set of items. See Dantlgraber et al. (2019) for background and the vignette for examples.

spacey v0.1.1: Provides utilities to download USGS and ESRI geospatial data and produce high quality rayshader maps for locations in the United States. There is an Introduction.

Tendril v2.0.4: Provides functions to compute and display tendril plots. See the vignette for an introduction.

tidyHeatmap v0.99.9: Provides an implementation of the Bioconductor ComplexHeatmap package based on tidy data frames. See the vignette.