February 2023: "Top 40" New CRAN Packages

by Joseph Rickert

One hundred seventy-three new packages made it to CRAN in February. Here are my “Top 40” selections in thirteen categories: Computational Methods, Data, Ecology, Economics, Machine Learning, Mathematics, Medicine, Pharma, Science, Statistics, Time Series, Utilities, and Visualization.

Computational Methods

dcTensor v1.0.1: Implements semi-binary and semi-ternary matrix methods based on non-negative matrix factorization (NMF) and singular value decomposition (SVD). For the details see the reference section of GitHub README.md. There are seven vignettes including Discretized Singular Value Decomposition, Discretized Partial Least Squares, and Discretized Non-negative Tucker Decomposition.

Plot showing whether the original data is recovered by the dSVD

Data


dhis2r v0.1.1: Implements a connection to DHIS2, a global open-source project coordinated by the HISP Centre at the University of Oslo. See the vignette to get started.

ispdata v1.1: Provides access to data from the Rio de Janeiro Public Security Institute including criminal statistics, data on gun seizures and femicide. See README to get started.

ohsome v0.2.1: Implements a client for the Heidelberg Institute for Geoinformation Technology’s OpenStreetMap API and provides functions to analyze the rich data source of OpenStreetMap (OSM) history. See the vignette for examples.

Map showing breweries per sq km in Bavaria

OlympicRshiny v1.0.0: Implements a Shiny App to visualize Olympic Data from 1896 to 2016 residing in a Kaggle Dataset. Look here to get started.

whitewater v0.1.2: Provides methods for retrieving United States Geological Survey (USGS) water data using sequential and parallel processing. See Bengtsson (2022) for background on the parallel methods and README for examples.

Map and histogram of peak flow

Ecology


birdscanR v0.1.2: Provides functions to extract bird and insect data from Birdscan MR1 SQL vertical-looking radar databases, then filter and process the data into Migration Traffic Rates (e.g., number of objects per hour and per km). See Haest et al. (2021) and Schmid et al. (2019) for background and the vignette for examples.

Economics


echoice2 v0.2.3: Implements choice models based on economic theory, including MCMC-based estimation and prediction, Hierarchical Multinomial Logit and Multiple Discrete-Continuous (Volumetric) models. See Allenby, Hardt and Rossi (2019), Kim, Hardt, Kim and Allenby (2022), and Hardt and Kurz (2020) for the underlying theory, and the vignettes on Importing lists of lists and Volumetric Demand and Conjunctive Screening.

Density of pizza purchases

Machine Learning

ODRF v0.0.3: Implements oblique decision random forests as an ensemble of oblique decision trees, which use linear combinations of predictors for partitioning. See the vignette.

Example of an oblique classification tree
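
To make the idea of partitioning on a linear combination of predictors concrete, here is a minimal sketch written in Python purely for illustration. It is not the ODRF API: the function names, weights, thresholds, and sample points are all hypothetical. It only contrasts a classic single-predictor split with an oblique split at one tree node.

```python
from typing import Sequence


def axis_aligned_split(x: Sequence[float], feature: int, threshold: float) -> str:
    """Route a sample using a single predictor, as in a classic decision tree."""
    return "left" if x[feature] <= threshold else "right"


def oblique_split(x: Sequence[float], weights: Sequence[float], threshold: float) -> str:
    """Route a sample using a linear combination of predictors (an oblique split)."""
    score = sum(w * xi for w, xi in zip(weights, x))
    return "left" if score <= threshold else "right"


# Hypothetical samples with two predictors (x1, x2).
samples = [(1.0, 4.0), (3.0, 1.0), (1.0, 0.5)]

# The oblique rule x1 - x2 <= 0 cuts the plane along the diagonal x1 = x2.
oblique_routes = [oblique_split(s, weights=(1.0, -1.0), threshold=0.0) for s in samples]

# The axis-aligned rule x1 <= 2 can only cut parallel to an axis.
axis_routes = [axis_aligned_split(s, feature=0, threshold=2.0) for s in samples]

print(oblique_routes)
print(axis_routes)
```

The third sample falls on different sides under the two rules, which is the extra flexibility an oblique tree gains from combining predictors.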

spinner v1.1.0: Provides a torch implementation of Graph Net architecture allowing different options for message passing and feature embedding. Look here for examples.

Plot showing performance of model on training and test sets

sMTL v0.1.0: Implements L0-constrained Multi-Task Learning and domain generalization algorithms which are coded in Julia allowing for fast coordinate descent and local combinatorial search algorithms. See Loewinger et al. (2022) for details and look here for an introduction.

tfevents v0.0.2: Provides a convenient way to log scalars, images, audio, and histograms in the tfevent record file format. Logged data can be visualized on the fly using TensorBoard, a web-based tool that focuses on visualizing the training progress of machine learning models. Look here for an example.

tfevents dashboard

tidyAML v0.0.1: Implements a simple interface for automatic machine learning that uses the tidymodels framework to fit models. Look here for examples.

Mathematics


deFit v0.1.2: Provides functions that use numerical optimization to fit first and second order ordinary differential equations to time series data in order to examine the dynamic relationships between variables or the characteristics of a dynamical system. Look here for examples.

Plot of second order differential equation fit

fitlandr v0.1.0: Provides a toolbox for estimating vector fields from intensive longitudinal data and constructing potential landscapes. The vector fields can be estimated with two nonparametric methods: the Multivariate Vector Field Kernel Estimator by Bandi & Moloche (2018) and the Sparse Vector Field Consensus algorithm by Ma et al. (2013). The potential landscapes are simulated with the simlandr package of Cui et al. (2021) or with the method of Bhattacharya et al. (2011). See README for examples.

Plot of two dimensional vector field

Medicine


CodelistGenerator v1.0.0: Provides functions to generate a candidate code list for the Observational Medical Outcomes Partnership common data model based on string matching. For a given search strategy, a candidate code list will be returned. See the Introduction and the vignettes on Candidate codes, Options, and Codelists for medications.

simaerep v0.4.3: Implements bootstrap-based simulation methods to detect clinical trials that may be under-reporting adverse events. See Koneswarakantha (2021) for background and README for an example.

Plots showing adverse event reporting

Pharma


metalite v0.1.1: Provides a metadata structure for clinical data analysis and reporting based on Analysis Data Model (ADaM) datasets, which simplifies clinical analysis and reporting tool development by defining standardized inputs, outputs, and workflow. See Zhang et al. (2022), the package Introduction and the vignettes AE Listing, AE Specification, AE Summary, and Miettinen and Nurminen Test.

Diagram of metalite framework

Science


gcplyr v1.1.0: Implements tools to import, manipulate and analyze bacterial growth curve data as commonly output by plate readers, including functions for reshaping common plate reader outputs into tidy formats. See README for documentation. There are several vignettes including an Introduction, Analyzing data, and Dealing with noise.

Growth curves

locaR v0.1.2: Provides functions to conduct acoustic source localization, as well as organize and check localization data and results. Cobos et al. (2010) gives details of the algorithms. Vignettes include an Introduction, Detecting sound sources, and localize Multiple.

Animation of localization method

PoolDilutionR v1.0.0: Pool dilution is an isotope tracer technique wherein a biogeochemical pool is artificially enriched with its heavy isotopologue and the gross productive and consumptive fluxes of that pool are quantified. This package calculates gross production and consumption rates from closed-system isotopic pool dilution time series data. See Fischer and Hedin (2002) for background and the vignette for examples.

Predictions and observations over time

Statistics


counterfactuals v0.1.1: Implements a modular R6 interface for counterfactual explanation methods including Brughmans et al. (2022), Dandl et al. (2020), and Wexler et al. (2019). See the Introduction and the vignettes How to extend the package and Other types of models.

Plot of counterfactual surface

semfindr v0.1.4: Provides functions for sensitivity analysis in structural equation modeling using influence measures and diagnostic plots. It supports the leave-one-out casewise sensitivity analysis of Pek and MacCallum (2011). There is an introduction and three additional vignettes: Approximate Case Influence, Selecting Cases in Rerun, and Use Case IDs.

Plot of fit measure against generalized Cook's distance

stxplore v0.1.0: Implements statistical tools for spatio-temporal data exploration, including simple plotting functions, covariance calculations and computations similar to principal component analysis for spatio-temporal data. Look here for examples and see the vignettes Exploration using dataframes and Using stars objects.

Plot of group means

SurrogateRsq v0.2.0: Implements the surrogate R-squared measure for categorical data analysis proposed in Liu et al. (2022). See the vignette.

Diagram of workflow for modeling categorical data

Time Series

setartree v0.1.0: Implements the forecasting-specific tree-based model proposed by Godahewa et al. (2022) that is particularly suitable for global time series forecasting. Look here to get started.

StructuralDecompose v0.1.1: Provides functions that explain the behavior of a time series by decomposing it into trend, seasonality and residuals, and that perform well in the presence of significant level shifts. See the short vignettes Decomposition and Example Walkthrough.
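
For readers new to the idea of splitting a series into trend, seasonality and residuals, here is a minimal sketch in Python, used purely for illustration. It uses a classical additive decomposition with a centered moving average, which is a swapped-in textbook technique and not the algorithm in StructuralDecompose. In particular, it makes no attempt to handle level shifts. The function name, the odd-period restriction, and the synthetic series are assumptions.

```python
from typing import List, Optional, Tuple


def decompose(
    series: List[float], period: int
) -> Tuple[List[Optional[float]], List[float], List[Optional[float]]]:
    """Classical additive decomposition (odd period only, for simplicity)."""
    if period % 2 == 0:
        raise ValueError("this sketch only handles odd periods")
    half = period // 2
    n = len(series)

    # Trend: centered moving average over one full period; undefined at the edges.
    trend: List[Optional[float]] = [None] * n
    for i in range(half, n - half):
        trend[i] = sum(series[i - half:i + half + 1]) / period

    # Seasonality: mean detrended value at each position in the cycle,
    # shifted so the seasonal effects sum to zero over one period.
    buckets: List[List[float]] = [[] for _ in range(period)]
    for i, t in enumerate(trend):
        if t is not None:
            buckets[i % period].append(series[i] - t)
    means = [sum(b) / len(b) for b in buckets]
    offset = sum(means) / period
    seasonal = [means[i % period] - offset for i in range(n)]

    # Residual: whatever trend and seasonality do not explain.
    residual = [
        None if t is None else series[i] - t - seasonal[i]
        for i, t in enumerate(trend)
    ]
    return trend, seasonal, residual


# Synthetic series: linear trend plus a repeating three-step pattern.
pattern = [2.0, -1.0, -1.0]
series = [0.5 * i + pattern[i % 3] for i in range(12)]
trend, seasonal, residual = decompose(series, period=3)
```

On this noise-free synthetic series the residuals are zero up to floating-point error. A sudden level shift would distort the moving-average trend around the break, which is the situation the package description says it is designed to handle.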

sufficientForecasting v0.1.0: Implements a sufficient forecasting (SF) method for a single time series using many predictors and a possibly nonlinear forecasting function. Assuming that the predictors are driven by some latent factors, the SF method first conducts factor analysis and then performs sufficient dimension reduction. See Fan et al. (2017), Luo et al. (2022), and Yu et al. (2022) for background and the vignette to get started.

tseriesTARMA v0.3-2: Implements routines for nonlinear time series analysis based on Threshold Autoregressive Moving Average models and provides methods for model fitting and forecasting, tests for threshold effects and unit-root tests. See README for examples.

Utilities


currr v0.1.2: Implements the family of map() functions with frequent saving of the intermediate results. This enables stopping the evaluation and then restarting it from where you left off by reading the already evaluated work from cache. See README to get started.

Animation showing workflow
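
The save-and-resume pattern described for currr can be sketched in a few lines. The example below is in Python purely for illustration and does not reproduce currr's API or cache format: the function name `checkpointed_map`, the JSON cache file, and the simulated interruption are all hypothetical, and results are assumed to be JSON-serializable.

```python
import json
import os
import tempfile
from typing import Callable, List


def checkpointed_map(func: Callable, items: List, cache_path: str) -> List:
    """Apply func to each item, writing all results to disk after every step."""
    results = []
    if os.path.exists(cache_path):
        # Resume: load whatever was finished before the last stop.
        with open(cache_path) as fh:
            results = json.load(fh)
    for item in items[len(results):]:
        results.append(func(item))
        with open(cache_path, "w") as fh:
            json.dump(results, fh)
    return results


calls = []


def slow_square(x):
    """Stand-in for an expensive step; fails once on x == 3 to mimic a stop."""
    calls.append(x)
    if x == 3 and calls.count(3) == 1:
        raise RuntimeError("simulated interruption")
    return x * x


cache_file = os.path.join(tempfile.mkdtemp(), "squares.json")
try:
    checkpointed_map(slow_square, [1, 2, 3, 4], cache_file)
except RuntimeError:
    pass  # the results for 1 and 2 are already on disk

# The second run skips the cached items and continues from the third one.
resumed = checkpointed_map(slow_square, [1, 2, 3, 4], cache_file)
print(resumed)
```

The second call evaluates only the items that were not finished before the interruption, so the work done for the first two inputs is never repeated.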

dataMojo v1.0.0: Implements a grammar of data manipulation with data.table by providing a consistent series of utility functions that help to solve the most common data manipulation challenges. See the vignette.

hexfont v0.3.1: Contains all the hex font files from the GNU Unifont Project compressed by ‘xz’. GNU Unifont is a duospaced bitmap font that attempts to cover all the official Unicode glyphs plus several of the artificial scripts in the Under-ConScript Unicode Registry. See the vignette.

Example of hexfont

parabar v0.10.1: Provides a simple interface in the form of R6 classes for executing tasks in parallel, tracking their progress, and displaying accurate progress bars. See README for examples.

rang v0.2.0: Provides tools to resolve the dependency graph of R packages at a specific time point based on the information from various R-hub web services. The dependency graph can then be used to reconstruct the R computational environment with Rocker. See README and the FAQ.

tablexlsx v0.1.0: Provides functions to export data frames to Excel workbooks. See the vignette.

xmpdf v0.1.3: Provides functions to edit XMP metadata in a variety of media file formats as well as edit bookmarks and documentation in PDF files. Functions can detect and use a variety of command-line tools to perform these operations including exiftool, ghostscript, and pdftk. See the Introduction and the FAQ vignette.

Visualization


animate v0.3.9.4: Implements a web-based graphics device to extend base R graphics functions to support frame-by-frame animation and keyframe animation. Target use cases include real-time animated visualizations, agent-based models, dynamical systems, and animated diagrams. See the Introduction and the Q&A vignette.

Durga v1.0.0: Implements a system for plotting grouped data effect sizes that is compatible with base R methods for combining plots. See Khan & McLean (2023) and the vignette for examples.

Plot showing effects of insulin on blood sugar

ggrain v0.0.3: Extends ggplot2 to create raincloud plots. See the vignette.

Example of raincloud plot

PieGlyph v0.1.0: Extends ggplot2 to replace points in a scatter plot with pie-chart glyphs showing the relative proportions of different categories. There are several vignettes including PieGlyph, Multinomial Classification, and Time series example.

Time series with covariate information
