November 2021: "Top 40" New CRAN Packages

by Joseph Rickert

Two hundred eleven new packages made it to CRAN in November. Here are my “Top 40” picks in thirteen categories: Computational Methods, Data, Ecology, Finance, Genomics, Humanities, Machine Learning, Medicine, Networks, Statistics, Time Series, Utilities, and Visualization. It was gratifying to see multiple packages developed for applications in the Computational Humanities. R is helping to extend the reach of data literacy.

Computational Methods

bigQF v1.6: Implements a computationally efficient leading-eigenvalue approximation to tail probabilities and quantiles of large quadratic forms (e.g. in the sequence kernel association test) and also provides stochastic singular value decomposition for dense or sparse matrices. See Lumley et al. (2018) for background and the vignettes: Checking pQF vs SKAT, Matrix-free computations, SKAT, weights, and projections.

LMMsolver v1.0.0: Implements an efficient system to solve sparse, mixed model equations. See the vignette.

Heat map of precipitation anomalies

nimbleAPT v1.0.4: Provides functions for adaptive parallel tempering with nimble models. See Lacki & Miasojedow (2016) and the vignette to get started.

Plot of jumps and samples

Data

binancer v1.2.0: Implements an R client to the Binance Public Rest API for data collection on cryptocurrencies, portfolio management and trading. See README for an example.

japanstat v0.1.0: Provides tools for using the API of e-Stat, a portal site for Japanese government statistics, including functions for automatic query generation, data collection, and formatting. See README for examples.

openeo v1.1.0: Provides access to data and processing on openEO-compliant back-ends. See the vignette to get started.

Ecology

mFD v1.0.1: Provides functions to compute functional traits-based distances between pairs of species for species gathered in assemblages. See Chao et al. (2018) and Mouillot et al. (2013) along with the other references listed for background. There are five vignettes including General Workflow and Compute and Interpret Quality of Functional Spaces.

Plots of trait differences

popbayes v1.0: Implements a Bayesian framework to infer the trends of animal populations over time from series of counts by accounting for count precision, smoothing the population rate of increase over time, and accounting for the maximum demographic potential of species. There is a Get started vignette and another on Getting species information.

Plot of population size over time

Finance

MultiATSM v0.0.1: Provides functions to model the affine term structure of interest rates based on the single-country unspanned macroeconomic risk framework of Joslin et al. (2014) and the multi-country extensions of Jotikasthira et al. (2015) and Candelon and Moura (2021). See the vignette for an example.

Plot of level, slope and curvature by maturity years

Genomics

gwasrapidd v0.99.12: Provides easy access to the NHGRI-EBI GWAS Catalog via the REST API. There is a Getting Started Guide, a vignette on Variants associated with BMI, and an FAQ.

Examples of search criteria

phylosamp v0.1.6: Implements tools to estimate the probability of true transmission between two cases given phylogenetic linkage and the expected number of true transmission links in a sample. The methods are described in Wohl et al. (2021). There are vignettes on Estimating FDR from sample size, Sample size from FDR, Sensitivity and specificity, and Examples.

Histogram of Genetic Distance

Rtropical v1.2.1: Provides functions to process phylogenetic trees with tropical support vector machines and principal component analysis defined with tropical geometry. See Tang et al. (2020) for details about tropical SVMs, Page et al. (2020) and Yoshida et al. (2019) for tropical PCA, and Ardila & Develin (2007) for some background on tropical mathematics. The vignette will get you started.

Humanities

litRiddle v0.4.1: Provides a dataset and functions for exploring the quality of literary novels, including the data from a reader survey about fiction in Dutch, a description of the novels the readers rated, and the results of stylistic measurements of the novels. There is a vignette.

Histograms of Literariness by Gender

kairos v1.0.0: Provides a toolkit for absolute dating and analysis of chronological patterns, including functions for chronological modeling and dating of archaeological assemblages from count data. There is a Manual and a Bibliography.

Machine Learning

cuda.ml v0.3.1: Implements an R interface for RAPIDS cuML, a suite of GPU-accelerated machine learning libraries powered by CUDA. Look here for an example.

Projection map of manifold embedding

innsight v0.1.1: Implements methods to analyze the behavior and individual predictions of modern neural networks including Connection Weights as described by Olden et al. (2004), Layer-wise Relevance Propagation as described by Bach et al. (2015), Deep Learning Important Features as described by Shrikumar et al. (2017), and gradient-based methods like SmoothGrad described by Smilkov et al. (2017), and Gradient x Input described by Baehrens et al. (2009). There is an Introduction and a vignette on Custom Model Definition.

Heatmap showing feature relevance

topicmodels.etm v0.1.0: Provides functions to find topics in texts which are semantically embedded using techniques like word2vec or GloVe. See Dieng et al. (2019) for the details, and look here for an example.

Plot of ETM clusters

Medicine

clustra v0.1.5: Provides functions to cluster medical trajectories of unequally spaced and unequal length time series aligned by an intervention time. See the vignette for examples.

Plot of trajectories clustered by group

eSIR v0.4.2: Implements the extended state-space SIR models developed by Song Lab at the UM School of Public Health, which include capabilities to model time-varying transmission, time-dependent quarantine, and time-dependent antibody-immunization. See Wang et al. (2020) for background, and README for examples.

Plot of Probability of infection over time
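
To make the idea of time-varying transmission concrete, here is a minimal sketch of a deterministic SIR model in which the transmission rate is scaled by a modifier function of time. This is not the eSIR package's code or API: eSIR fits a Bayesian state-space model, while this sketch only integrates the underlying compartment equations with a simple Euler step. The function name `simulate_sir`, its parameters, and the example values are all hypothetical.

```python
def simulate_sir(beta, gamma, i0, days, modifier=None, steps_per_day=10):
    """Euler integration of an SIR model expressed in population proportions.

    beta     : baseline transmission rate per day (illustrative value)
    gamma    : recovery rate per day (illustrative value)
    i0       : initial infected proportion
    modifier : hypothetical function pi(t) in [0, 1] that scales transmission,
               e.g. to mimic a quarantine that starts on a given day
    Returns a list of (S, I, R) tuples, one per day, starting at day 0.
    """
    if modifier is None:
        modifier = lambda t: 1.0  # no intervention
    dt = 1.0 / steps_per_day
    s, i, r = 1.0 - i0, i0, 0.0
    path = [(s, i, r)]
    for day in range(days):
        for k in range(steps_per_day):
            t = day + k * dt
            new_infections = beta * modifier(t) * s * i * dt
            new_recoveries = gamma * i * dt
            s = s - new_infections
            i = i + new_infections - new_recoveries
            r = r + new_recoveries
        path.append((s, i, r))
    return path


# Baseline epidemic versus one where transmission is halved from day 20 on.
baseline = simulate_sir(beta=0.3, gamma=0.1, i0=0.001, days=200)
quarantine = simulate_sir(beta=0.3, gamma=0.1, i0=0.001, days=200,
                          modifier=lambda t: 1.0 if t < 20 else 0.5)
peak_baseline = max(i for _, i, _ in baseline)
peak_quarantine = max(i for _, i, _ in quarantine)
```

Comparing `peak_baseline` with `peak_quarantine` shows how a step change in the modifier flattens the infection curve. The package additionally quantifies uncertainty around such trajectories.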

QHScrnomo v2.2.0: Provides functions for fitting and predicting competing risk models, creating nomograms, k-fold cross validation, calculating the discrimination metric, and drawing calibration curves. See the vignette for a short tutorial.

Example of a nomogram

rnmamod v0.1.0: Implements a comprehensive suite of functions to perform and visualize pairwise and network meta-analyses. The package covers core Bayesian one-stage models implemented in a systematic review with multiple interventions, including fixed-effect and random-effects network meta-analysis. There are vignettes on Performing a network meta-analysis and Describing the network.

Map of network of interventions evaluated

whomds v1.0.1: Provides functions for calculating and presenting the results from the WHO Model Disability Survey (MDS). See Andrich (2014) for background on Rasch measurement. There are English and Spanish versions of six vignettes including Background on disability measurement and Best practices with Rasch Analysis.

Illustration of disability scores

Networks

diffudist v1.0.0: Enables the evaluation of diffusion distances for complex single-layer networks by providing functions to evaluate the Laplacians, stochastic matrices, and the corresponding diffusion distance matrices. See De Domenico (2017) and Bertagnolli & De Domenico (2021) for the details and the vignette for some theory and examples.

Hierarchical cluster plots of distances

NetFACS v0.3.0: Provides functions to analyze and visualize facial communication data, based on network theory and resampling methods and primarily targeted at datasets of facial expressions coded with the Facial Action Coding System. See Farine (2017) and Carsey & Harden (2014) for background, and the vignette for an introduction.

Network of facial expressions

Statistics

adass v1.0.0: Implements the adaptive smoothing spline estimator for the function-on-function linear regression model described in Centofanti et al. (2020). Look here for an example.

Heatmap of adaptive spline estimator

OneSampleMR v0.1.0: Provides functions for one sample Mendelian randomization and instrumental variable analyses including implementations of the Sanderson & Windmeijer (2016) conditional F-statistic, the multiplicative structural mean model of Hernán and Robins (2006), and the two-stage predictor substitution and two-stage residual inclusion estimators explained by Terza et al. (2008). There are short vignettes on the Multiplicative structural mean model and the Conditional F-Statistic.

PhaseTypeR v1.0.1: Implements functions to model continuous and discrete phase-type distributions, both univariate and multivariate. See Navarro (2019) for background. There is an Introduction and vignettes on Population Genetics and The Site Frequency Spectrum.

Histogram of Phase Type Distribution

stan4bart v0.0-2: Fits semiparametric linear and multilevel models using non-parametric Bayesian additive regression trees (BART) and Stan. Multilevel models can be expressed using lme4 syntax. Look here to get started.

zoid v1.0.0: Fits Dirichlet regression and zero-and-one inflated Dirichlet regression (also called trinomial mixture models) with Bayesian methods implemented in Stan. There are vignettes on Fitting Models, Simulating data, Priors, and Prior sensitivity for overdispersion.

Time Series

surveil v0.1.0: Fits time series models for routine disease surveillance tasks and returns probability distributions for a variety of quantities of interest, including measures of health inequality, period and cumulative percent change, and age-standardized rates. See the vignette for an introduction.

Plot time series by race

VisitorCounts v1.0.0: Provides functions for modeling and forecasting of park visitor counts using social media data and (partial) on-site visitor counts. See Wood et al. (2013) for background and the vignette for examples.

Plot of model fits and differences

Utilities

dateutils v0.1.5: Provides utilities for mixed frequency data, in particular, to aggregate and normalize tabular mixed frequency data, index dates to end of period, and seasonally adjust tabular data. There is an Introduction.

datefixR v0.1.2: Provides functions to fix messy dates such as those entered via text boxes. Standardizes “/” or “-”, whitespace separation, month abbreviations, and more. See the vignette.
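
As a rough illustration of what standardizing messy dates involves, the following sketch normalizes separators and month abbreviations before building a date. It is not datefixR's implementation or interface. The function `standardize_date`, the `day_first` flag, and the parsing rules are assumptions made for this example, and it handles only three-field dates with a four-digit year and English month names.

```python
import re
from datetime import date

# Hypothetical lookup: first three letters of English month names.
MONTHS = {name: number + 1 for number, name in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"])}


def month_number(token):
    """Convert '11', 'Nov', or 'November' to the month number 11."""
    if token.isdigit():
        return int(token)
    key = token[:3].lower()
    if key not in MONTHS:
        raise ValueError(f"unrecognized month: {token!r}")
    return MONTHS[key]


def standardize_date(text, day_first=True):
    """Parse a messy date string into a datetime.date.

    Fields may be separated by "/", "-", or whitespace. The year must have
    four digits and sit in the first or last position.
    """
    parts = [p for p in re.split(r"[\s/\-]+", text.strip()) if p]
    if len(parts) != 3:
        raise ValueError(f"expected three date fields, got {parts!r}")

    if len(parts[0]) == 4 and parts[0].isdigit():
        # Year-first input such as "2021-11-03".
        return date(int(parts[0]), month_number(parts[1]), int(parts[2]))

    year = int(parts[2])
    if not parts[0].isdigit():
        # Month name first, e.g. "Nov 3 2021".
        month, day = month_number(parts[0]), int(parts[1])
    elif day_first or not parts[1].isdigit():
        # Day first, e.g. "03/11/2021" or "3 Nov 2021".
        day, month = int(parts[0]), month_number(parts[1])
    else:
        # Numeric month first, e.g. "11/03/2021" with day_first=False.
        month, day = int(parts[0]), int(parts[1])
    return date(year, month, day)


examples = ["03/11/2021", "3 Nov 2021", "2021-11-03", " 3-november-2021 "]
cleaned = [standardize_date(x).isoformat() for x in examples]
```

The ambiguity between day-first and month-first numeric dates cannot be resolved from the string alone, which is why the sketch exposes it as an explicit argument.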

lambdar v1.1.0: Provides functions for serving containers that can execute R code on the AWS Lambda serverless compute service. There are five vignettes including api-gateway-invocations and lambda-runtime-in-container.

Diagram of package structure

mtb v0.1.1: Provides functions to summarize time related data, generate axis transformation from data, and assist Markdown and Shiny file editing. There are vignettes on working with axes, color, Markdown, and time related data.

Diagram showing arrows associated with times

rocker v0.2.1: Provides an R6 class interface for handling database connections using the DBI package as backend which allows handling of connections to PostgreSQL, MariaDB, SQLite and other databases. There are seven short vignettes including DBI package objects and functions in R6 rocker class and Database transactions with R6 rocker class.

shinySelect v1.0.0: Provides a customizable select control widget for Shiny that supports HTML in the items and KaTeX for typesetting mathematics. Look here for examples.

Mathematics in Shiny select bar

Visualization

gggrid v0.1-1: Extends ggplot2 to make it easy to add raw grid output, such as customised annotations, to a plot. See the vignette for examples.

latitude vs. longitude plot

NHSRplotthedots v0.1.0: Provides tools for drawing statistical process control charts with functions to draw XmR charts, use change points, and apply rules with summary indicators for when rules are breached. There is an Introduction and vignettes on Deviations from Excel defaults and Number of points required.

SPC Charts

shinymodels v0.1.0: Allows users to launch a shiny application for tidymodels results. For classification or regression models, the app can be used to determine if there is lack of fit or poorly predicted points. Look here to get started.

Shiny generated scatter plots
