February 2022: "Top 40" New CRAN Packages

by Joseph Rickert

February was a good month for new R packages on CRAN. Here are my “Top 40” selections from the two hundred packages that arrived, in thirteen categories: Computational Methods, “Data”, Genomics, Linguistics, Machine Learning, Mathematics, Medicine, Networks, Science, Sports, Statistics, Utilities, and Visualization.

Computational Methods

fastadi v0.1.0: Implements the adaptive-impute matrix completion algorithm described in Cho, Kim & Rohe (2016). See README for details.

invertiforms v0.1.0: Provides composable invertible transforms for sparse matrices. See README for examples.

Visualization of regularized degree normalized graph Laplacian

mirai v0.1.1: Provides a simple and lightweight method for concurrent and parallel code execution, built on NNG, Nanomsg Next Gen technology. See README for examples.

Data

amerifluxr v1.0.0: Provides a programmatic interface to the AmeriFlux database and includes query, download, and data summary tools. There are vignettes on Data Import and Site Selection.

finnishgrid v0.1.0: Implements an API client for Fingrid Open Data on the electricity market and the power system. See the vignette for an introduction.

Time series plot of electricity production

fixtuRes v0.1.3: Provides functions to generate mock data in R using YAML configurations. See the vignette.

jpstat v0.2.0: Provides tools for using the e-Stat API, a portal site for Japanese government statistics, and includes functions for automatic query generation, data collection and formatting. See README for examples.

Genomics

fcfdr v1.0.0: Provides functions to implement the Flexible cFDR (Hutchinson et al. (2021)) and Binary cFDR (Hutchinson et al. (2021)) methodologies to leverage auxiliary data from arbitrary distributions, for example functional genomic data, with GWAS p-values to generate re-weighted p-values. There is an Introduction and there are vignettes on LDAK and TID.

simphony v1.0.0: Provides a tool for simulating rhythmic data: transcriptome data using Gaussian or negative binomial distributions, and behavioral activity data using Bernoulli or Poisson distributions. See Singer et al. (2019) for details, and the vignettes on simphony’s options and evaluating rhythm detection for examples.

Plots illustrating rhythmic and non-rhythmic features

Linguistics

glottospace v0.0.111: Provides streamlined workflows for geolinguistic analysis, including: accessing global linguistic and cultural databases, data import, data entry, data cleaning, data exploration, mapping, visualization and export. See README to get started.

Language families on world map

Machine Learning

aum v2022.2.7: Uses a standard template library sort to implement an efficient algorithm for computing AUM, Area Under Min(FP, FN), and directional derivatives. See Hillman & Hocking (2021) for details and the vignettes Accuracy comparison and Speed comparison.

AUC curves vs. iteration for several models
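To make the AUM idea concrete, here is a minimal Python sketch of a sort-based computation for binary classification. It is not the aum package's API, and it assumes one common formulation: an example is predicted positive when its predicted value plus a constant c is greater than zero, and AUM is the area under min(FP(c), FN(c)) as c varies. The function name and the example data are hypothetical.

```python
def aum(predictions, labels):
    """Area under min(FP, FN) as a constant c is added to every prediction.

    Assumes labels are 0/1 and an example is predicted positive
    when prediction + c > 0.
    """
    # Each example contributes a breakpoint at c = -prediction: below it the
    # example is predicted negative, above it positive.
    breakpoints = sorted((-f, y) for f, y in zip(predictions, labels))
    fp = 0            # negatives currently predicted positive
    fn = sum(labels)  # positives currently predicted negative
    area = 0.0
    for (c, y), (c_next, _) in zip(breakpoints, breakpoints[1:]):
        # Crossing breakpoint c flips one example to "predicted positive".
        if y == 1:
            fn -= 1
        else:
            fp += 1
        # min(FP, FN) is constant on the interval (c, c_next).
        area += min(fp, fn) * (c_next - c)
    return area


# Hypothetical scores: a perfect ranking and a fully reversed one.
scores = [2.0, 1.0, -1.0, -2.0]
perfect = aum(scores, [1, 1, 0, 0])
reversed_ranking = aum(scores, [0, 0, 1, 1])
```

A single sort followed by one linear pass is what keeps this kind of computation cheap. The sketch leaves out the directional derivatives that the package also computes.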

familiar v1.0.0: Provides a unified interface for end-to-end automated machine learning and model evaluation. There is an Introduction, and five additional vignettes including Evaluation and explanation and Feature selection methods.

Model calibration plot

OptHoldoutSize v0.1.0.0: Provides tools to estimate the size of a holdout set and the associated errors when updating predictive scores. See Haidar-Wehbe et al. (2022) for details and the vignettes Comparison of algorithms, ASPRE Example, and Simulated Example.

Plots showing loss vs. holdout set size for various scenarios

soundClass v0.0.9.1: Implements a sound classification workflow with functions to automatically classify sound events using convolutional neural networks. See Gibb et al. (2019), Mac Aodha et al. (2018), and Stowell et al. (2019) for background and the vignette for an example.

Mathematics

gmpoly v1.1.0: Provides symbolic calculation, addition, multiplication, and evaluation of multivariate polynomials with rational coefficients. See README for examples.
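For readers who have not seen this kind of computation, the following Python sketch shows one way multivariate polynomials with exact rational coefficients can be represented and combined. It is an illustration only and not gmpoly's interface: the representation (a dict from exponent tuples to `Fraction` coefficients) and all function names are assumptions made for the example.

```python
from collections import defaultdict
from fractions import Fraction

# A polynomial is a dict mapping exponent tuples to Fraction coefficients,
# e.g. {(2, 1): Fraction(1, 2)} represents (1/2) * x^2 * y.


def normalize(poly):
    """Drop terms whose coefficient is zero."""
    return {exps: coef for exps, coef in poly.items() if coef != 0}


def add(p, q):
    """Add two polynomials term by term."""
    out = defaultdict(Fraction)
    for poly in (p, q):
        for exps, coef in poly.items():
            out[exps] += coef
    return normalize(out)


def multiply(p, q):
    """Multiply two polynomials: add exponents, multiply coefficients."""
    out = defaultdict(Fraction)
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            exps = tuple(a + b for a, b in zip(e1, e2))
            out[exps] += c1 * c2
    return normalize(out)


def evaluate(poly, point):
    """Evaluate the polynomial exactly at a point with rational coordinates."""
    total = Fraction(0)
    for exps, coef in poly.items():
        term = coef
        for x, k in zip(point, exps):
            term *= Fraction(x) ** k
        total += term
    return total


# p = x + (1/2) y and q = x - (1/2) y in the variables (x, y)
p = {(1, 0): Fraction(1), (0, 1): Fraction(1, 2)}
q = {(1, 0): Fraction(1), (0, 1): Fraction(-1, 2)}
product = multiply(p, q)  # x^2 - (1/4) y^2; the cross terms cancel exactly
```

Because the coefficients are exact fractions, the cross terms cancel to exactly zero rather than to a small floating-point residue.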

gyro v0.2.0: Implements functions for three-dimensional hyperbolic geometry based on the theory found in Ungar (2005). The short vignette points to resources to get you started.

Hyperbolic Icosahedron

sumR v0.4.6: Implements functions based on theoretical results which ensure that the summation of an infinite discrete series is within an arbitrary margin of error of its true value. See Braden (1992) for background.
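As a rough illustration of what a guaranteed error margin means for an infinite sum, the Python sketch below uses a textbook ratio bound. It is not taken from sumR and may differ from the results the package relies on. The assumption is that the terms are positive and that, from index n onward, consecutive terms shrink by at least a factor r < 1, so the tail after term n is at most a_n · r / (1 − r). The function name and arguments are hypothetical.

```python
import math


def sum_with_tail_bound(term, ratio_bound, eps, max_terms=10_000):
    """Sum term(0) + term(1) + ... until the remaining tail is provably below eps.

    Assumes positive terms and that ratio_bound(n) is an upper bound on
    term(k + 1) / term(k) for every k >= n.
    """
    total = 0.0
    for n in range(max_terms):
        a_n = term(n)
        total += a_n
        r = ratio_bound(n)
        # Geometric bound on everything after term n: a_n * (r + r^2 + ...).
        if r < 1 and a_n * r / (1 - r) < eps:
            return total, n + 1
    raise RuntimeError("tolerance not reached within max_terms")


# Example: e = sum of 1/k!, where term(k + 1) / term(k) = 1 / (k + 1).
approx_e, n_terms = sum_with_tail_bound(
    term=lambda k: 1 / math.factorial(k),
    ratio_bound=lambda n: 1 / (n + 1),
    eps=1e-10,
)
```

The stopping rule depends on a proven bound for the tail rather than on the size of the last term added, which is what separates a guaranteed margin from a heuristic cutoff.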

Medicine

admiral v0.6.3: Implements a toolbox for programming CDISC-compliant Analysis Data Model (ADaM) datasets in R in accordance with the Analysis Data Model Implementation Guide. There are seven vignettes including Creating ADSL and Creating a BDS Exposure ADaM.

baker v1.0.0: Provides functions to specify, fit and visualize nested partially-latent class models for inference of population disease etiology and individual diagnosis. See Wu et al. (2015), Wu et al. (201), and Wu & Chen (2020) for background and the vignette for examples.

Matrix of odds ratios

MicroMoB v0.0.12: Implements a framework based on S3 dispatch for constructing models of mosquito-borne pathogen transmission which are constructed from submodels of various components. Consistent mathematical expressions for the distribution of bites on hosts enable stochastic and deterministic models to be coherently incorporated and updated over a discrete time step. There are nine vignettes including the Ross-Macdonald mosquito model, transmission model, and Blood feeding.

Wiring diagram for blood feeding computation

musclesyneRgies v1.1.3: Provides a framework to factorize electromyography data including tools for raw data pre-processing, non-negative matrix factorization, classification of factorised data and plotting of obtained outcomes. There are vignettes on analysis, plots, pro tips, and workflow.

Motor control time series and synergy plots for four trials

ravetools v0.0.3: Implements signal processing tools for analyzing electrophysiology data including a fast, memory-efficient Notch-filter, Welch-periodogram, and a discrete wavelet transform algorithm for hours of high-resolution signals. See the RAVE Project and Magnotti et al. (2020) for background and README for examples.

Welch periodogram for notch filters

wildmeta v0.1.0: Implements single coefficient tests and multiple-contrast hypothesis tests of meta-regression models using cluster wild bootstrapping, based on the methods examined in Joshi et al. (2021). The vignette provides examples.

Density plot of bootstrapped Naive F Statistic

Networks

leidenbase v0.1.9: Implements an R to C/C++ interface that runs the Leiden community detection algorithm to find a basic partition. It includes the required source code files from the official leidenalg distribution and functions from the R igraph package. See the vignette for an example.

netropy v0.1.0: Provides functions to conduct a statistical entropy analysis of network data as introduced by Frank & Shafie (2016). There are vignettes on Joint Entropies, Prediction Power, Uni, Bi & Trivariate Entropies, and Data Editing.
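The entropy quantities named in these vignette titles can be illustrated with a few lines of Python. This is a generic sketch of empirical entropies for discrete variables, not netropy's functions. The helper name and the toy vertex attributes are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Joint entropy (in bits) of one or more equally long discrete columns."""
    rows = list(zip(*columns))
    n = len(rows)
    counts = Counter(rows)
    return -sum((c / n) * log2(c / n) for c in counts.values())


# Hypothetical attributes observed on four vertices of a network.
role = ["a", "a", "b", "b"]
group = ["x", "y", "x", "y"]

h_role = entropy(role)           # univariate entropy
h_joint = entropy(role, group)   # bivariate (joint) entropy
# Mutual information: how much knowing one variable tells us about the other.
mutual_info = entropy(role) + entropy(group) - h_joint
# Expected conditional entropy: uncertainty left in group once role is known.
h_group_given_role = h_joint - h_role
```

In this toy data the two attributes are independent, so the joint entropy is the sum of the univariate entropies and the mutual information is zero.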

Science

IceSat2R v1.0.1: Implements an interface to the OpenAltimetry ICESat data through the API allowing users to download and process Global Geolocated Photon Data, Land Ice Height, Sea Ice Height, Land and Vegetation Height and more. There are vignettes on IceSat-2 Atlases, Mission Orbits, and Virtual File System Orbits.

Maps showing an area of interest in Himalayas

panstarrs v0.1.0: Implements an interface to the API for Pan-STARRS1, a data archive of the PS1 wide-field astronomical survey which allows access to the PS1 catalog and to the PS1 images. See the vignettes Cat and Images.

Image of the Antennae Galaxy

ThermalSampleR v0.1.0: Implements a range of simulations to aid researchers in determining appropriate sample sizes when performing critical thermal limits studies. A number of wrapper functions are provided for plotting and summarizing outputs from these simulations. There is both a vignette and a Shiny App.

Plots showing width of CI vs. sample size

Sports

footBayes v0.1.0: Provides functions to estimate, visualize, and predict the most well-known football models: double Poisson, bivariate Poisson, Skellam, student_t. The package allows Hamiltonian Monte Carlo (HMC) estimation through the underlying Stan environment and Maximum Likelihood estimation (MLE, for ‘static’ models only). See Dixon & Coles (1997), Karlis & Ntzoufras (2003), and Pauli & Torelli (2018) for background and the vignette for an introduction to the package.

Statistics

fido v1.0.0: Provides methods for fitting and inspecting Bayesian Multinomial Logistic Normal Models using MAP estimation and Laplace approximation as developed in Silverman et al. (2022). There is an Introduction and vignettes on PCR Bias, Non-linear models, Joint Modeling, and Picking priors.

Plot showing multiple densities

geostan v0.2.1: Provides Stan-based tools for Bayesian inference with spatial data, including exploratory analysis tools, multiple spatial model specifications, spatial model diagnostics, and special methods for inference with small area survey data. See Donegan et al. (2020) for background and the vignettes on Spatial autocorrelation and Survey data for examples.

Plots of observed data and residuals

Landmarking v1.0.0: Provides functions to perform Landmark survival analyses which allow survival predictions to be updated dynamically as new measurements from an individual are recorded. There is an Introduction and a vignette on how to use the package.

Plot showing systolic blood pressure vs. age at a landmark age with repeated measures

PUMP v1.0.0: Provides functions to estimate power, minimum detectable effect size and sample size requirements in the context of multilevel randomized experiments with multiple outcomes. See Hunter et al. (2021) for the details. There is a Package Demo vignette and vignettes on the Sampling method, and Simulating multi-level data.

Plots showing sample size against power.

safestats v0.8.6: Provides functions to design and apply tests which are anytime valid, can be used to design hypothesis tests in the prospective/randomized control trial setting or in the observational/retrospective setting, and remain valid under both optional stopping and optional continuation. For details on the theory of safe tests, see Grunwald et al. (2019). There is a vignette on Safestats and another on Contingency tables.

Plot of stopping times vs. divergence

Utilities

audubon v0.1.1: Provides a collection of Japanese text processing tools for filling Japanese iteration marks, Japanese character type conversions, segmentation by phrase, and text normalization which is based on rules for the Sudachi morphological analyzer and the NEologd (Neologism dictionary for MeCab). See README to get started.

rconfig v0.1.1: Allows users to manage R configuration files and override configuration statements from the command line. Look here for details.

Visualization

ggchangepoint v0.1.0: Provides tools for changepoint analysis and uses ggplot2 to visualize changepoints. See the vignette.

Time Series with Change Points

ggdensity v0.0.1: Provides functions for visualizing contours of 2-d kernel density estimates and implements several additional density estimators as well as more interpretable visualizations based on highest density regions instead of the traditional height of the estimated density surface. Look here for examples.

Contour density plot

gghdr v0.1.0: Provides a framework for visualizing Highest Density Regions in ggplot2. See the vignette.

Scatter plot showing high density region and data

ggpattern v0.4.2: Provides geoms filled with various patterns including patterned versions of every ggplot2 geom that has a region that can be filled along with a suite of aesthetics and scales for controlling pattern appearances. There are four vignettes: Developing Patterns and three more that cover gradients, polygons, and crosshatching.

Bar charts filled with various patterns

