March: "Top 40" New CRAN Packages

by Joseph Rickert

Two hundred and six new packages stuck to CRAN in March. Here are my “Top 40” selections in thirteen categories: Computational Methods, Data, Finance, Game Theory, Genomics, Machine Learning, Medicine, Networks, Science, Statistics, Time Series, Utilities, and Visualization.

Computational Methods

RCDT v1.1.0: Provides functions to perform 2D Delaunay triangulation, constrained or unconstrained, with the help of the CDT C++ library. See the vignette.

Plot of a sun curve
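
For readers new to the term, the defining property of a Delaunay triangulation is that no input point lies strictly inside the circumcircle of any triangle. The following is a minimal Python sketch of that test only. It is not the RCDT or CDT API, and the function names `orientation` and `in_circumcircle` are hypothetical.

```python
def orientation(a, b, c):
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_circumcircle(a, b, c, p):
    """Return True if p lies strictly inside the circumcircle of triangle abc."""
    # Translate the triangle so that p sits at the origin.
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cx, cy = c[0] - p[0], c[1] - p[1]
    # Standard 3x3 in-circle determinant, expanded along the last column.
    det = (
        (ax * ax + ay * ay) * (bx * cy - cx * by)
        - (bx * bx + by * by) * (ax * cy - cx * ay)
        + (cx * cx + cy * cy) * (ax * by - bx * ay)
    )
    # The sign of det flips with the vertex order, so multiply by the
    # orientation to accept clockwise and counter-clockwise triangles alike.
    return det * orientation(a, b, c) > 0


if __name__ == "__main__":
    triangle = ((0, 0), (1, 0), (0, 1))
    print(in_circumcircle(*triangle, (0.25, 0.25)))
    print(in_circumcircle(*triangle, (2, 2)))
```

A triangulation is Delaunay when this test is false for every triangle and every other input point. A constrained triangulation relaxes the condition where required edges force it.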

rlemon v0.2.0: Provides access to the LEMON C++ graph library. Look here for a list of algorithms.

Data

ag5Tools v0.0.1: Offers tools for downloading and extracting data from the Copernicus Agrometeorological indicators from 1979 to present derived from reanalysis (AgERA5) dataset. See ag5Tools to get started and the vignette on extracting data.

AirMonitor v0.2.2: Provides utilities for working with hourly air quality monitoring data with a focus on small particulates (PM2.5) along with algorithms to calculate NowCast and the associated Air Quality Index (AQI) as defined by the US Environmental Protection Agency AirNow program. There is an Introduction, a Developers Style Guide, and a Data Model.

Map of Western US with bubble chart of air quality

BISdata v0.1-1: Provides access to data from the Bank for International Settlements in Basel. Look here for an example.

Finance

AFR v0.1.0: Provides tools for regression, prediction, and forecast analysis of macroeconomic and credit data, adapted for the banking sector of Kazakhstan and intended for bank analysts and non-statisticians. There are vignettes on Data transformation, Diagnostic Tests, and Regression.

fixedincome v0.0.1: Implements objects that abstract interest rates, compounding factors, day count rules, forward rates and term structure of interest rates to assist with calculations of interest rates and fixed income. Look here for examples.

Plot of spot rate curve

Game Theory

socialranking v0.1.1: Offers a set of solutions to rank players based on a transitive ranking between coalitions, including the CP-Majority, ordinal Banzhaf, and lexicographic excellence solutions summarized in Allouche et al. (2020). See the vignette.

Genomics

AquaBPsim v0.0.1: Provides tools to simulate breeding programs including functions to simulate production and reproduction systems encountered in aquaculture. See the vignette for details.

coda4microbiome v0.1.1: Provides tools for microbiome data analysis that take into account its compositional nature, including functions for variable selection for both cross-sectional and longitudinal studies, and for binary and continuous outcomes. See the vignette.

Illustration of package function

MitoHEAR v0.1.0: Provides functions for the estimation and downstream statistical analysis of the mitochondrial DNA Heteroplasmy calculated from single-cell datasets. See the vignette.

Heat map of the ground truth and the new partition obtained with unsupervised cluster analysis

ZetaSuite v1.0.0: Provides functions to score hits from two-dimensional RNAi screens and to analyze single cell transcriptomics to differentiate rare cells from damaged ones. See Vento-Tormo et al. (2018) for background and the vignette for examples.

Plot showing evaluation of quality for individual readouts

Machine Learning

RGAN v0.1.1: Implements the Generative Adversarial Nets algorithm described in Goodfellow et al. (2014). See README for examples.

Plots comparing real and synthetic data

sentiment.ai v0.1.1: Implements Sentiment Analysis via tensorflow deep learning and gradient boosting models and also allows users to create embedding vectors for text which can be used in other analyses. See the vignette for an example.

sentopics v0.6.2: Offers a framework that joins topic modeling and sentiment analysis of textual data, and implements a fast Gibbs sampling estimation of Latent Dirichlet Allocation. See Griffiths & Steyvers (2004) and the Joint Sentiment/Topic Model of Lin, Everson & Ruger (2012). There is a vignette on Basic Usage and another on Topical time series.

Time series of sentiment breakdown

transforEmotion v0.1.0: Provides access to sentiment analysis using the Python-based huggingface transformer zero-shot classification model pipelines. The default pipeline is Cross-Encoder’s DistilRoBERTa trained on the Stanford Natural Language Inference and Multi-Genre Natural Language Inference datasets. There is a vignette on setting up Python.

Medicine

adaptr v1.0.0: Simulates adaptive clinical trials using adaptive stopping, adaptive arm dropping, and adaptive randomization. There is an Overview and vignettes on Basic and Advanced examples.

Plot of summary metrics by trial arm

EVI v0.1.1-4: Implements the epidemic volatility index, a novel early warning tool for identifying new waves in an epidemic as described in Kostoulas et al. (2021). See the vignette.

Time series of positive and negative predictive values for New York

rts2 v0.3: Provides functions to support modelling case data for real-time surveillance of infectious diseases, including functions to generate a computational grid over an area of interest and to fit an approximate log-Gaussian Cox Process model. See Diggle et al. (2013) and Solin and Särkkä (2020) for background and look here for examples.

Color coded grid showing proportion of population over 65 imposed on map

Networks

BCDAG v1.0.0: Provides functions for structure learning of causal networks and estimation of joint causal effects from observational Gaussian data. See Castelletti & Mascaro (2021) and Castelletti & Mascaro (2022) for background, and the vignettes Random Data Generation, Output of learn_DAG(), and MCMC scheme for posterior inference for details.

SEset v1.0.1: Implements tools to compute and analyze the set of statistically-equivalent Gaussian, linear path models which generate the input precision or (partial) correlation matrix. See README for examples.

Directed and undirected plots of a network

Science

bdc v1.1.0: Provides functions for biodiversity data cleaning organized into five themes: Merging datasets, Pre-filtering, Taxonomy, Space (flagging low precision coordinates), and Time (flagging inconsistent data collection dates). There are vignettes on Standardization, Pre-filter, Space, Taxonomy, and Time.

CSHShydRology v1.2.1: Offers a collection of user submitted functions to aid in the analysis of hydrological data. See the vignette.

Plot showing precipitation and flow over time

Statistics

cheem v0.2.0: Provides functions to explore local explanations of non-linear models by first calculating the tree SHapley Additive exPlanation (SHAP) for every observation, then calculating a projection basis, and then changing the basis with a radial tour. See Lundberg et al. (2019), Spyrison & Cook (2020) and Cook and Buja (2012) for background and the vignette to get started.

gif illustrating radial tour

CondCopulas v0.1.2: Provides functions for the estimation of conditional copula models and various estimators of the conditional Kendall’s tau statistic as proposed in Derumigny and Fermanian (2019a, 2019b, 2020). See the vignette for examples.

Plot of conditional quantiles

multilevelmod v0.1.0: Implements bindings for hierarchical regression models for use with the parsnip package. Models include longitudinal generalized linear models as described in Liang and Zeger (1986) and mixed-effect models as described in Pinheiro and Bates (2000). See the vignette for examples.

rbmi v1.1.3: Implements reference based multiple imputation allowing for the imputation of longitudinal datasets using predefined strategies. These include conventional MI methods, conditional mean imputation methods, and bootstrapped MI methods. See the Scope section of the Statistical Specifications vignette for more information on MI methods. There is a Quick Start Guide and an additional vignette on Advanced Functionality.

remiod v1.0.0: Implements reference-based multiple imputation of ordinal and binary responses under a Bayesian framework, as described in Wang and Liu (2022). See the vignette.

Missing values plot

rlcv v1.0.0: Provides functions to estimate likelihood cross-validation bandwidth for uni- and multi-variate kernel densities which are robust with respect to fat-tailed distributions and outliers. See Wu (2019) for the theory and the vignette for an example.

Contour density plot

sftime v0.2-0: Provides classes and methods for spatial objects that have a registered time column, in particular for irregular spatiotemporal data. See the vignette.

Time series plots by ID and value

workboots v0.1.1: Provides functions for generating bootstrap prediction intervals from a tidymodels workflow. There is a Getting Started Guide and a vignette on estimating linear prediction intervals.

Plot showing predictions with prediction intervals
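
As a rough illustration of the general idea behind bootstrap prediction intervals, the following Python sketch resamples the training data, refits a simple linear model on each resample, and adds a randomly drawn residual to each prediction before taking percentiles. It is a generic sketch under those assumptions, not the workboots or tidymodels API, and the names `fit_line` and `bootstrap_prediction_interval` are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def bootstrap_prediction_interval(xs, ys, x_new, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap prediction interval for a new observation at x_new."""
    rng = random.Random(seed)
    n = len(xs)
    draws = []
    while len(draws) < n_boot:
        # Resample (x, y) pairs with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # skip degenerate resamples that cannot define a line
        intercept, slope = fit_line(bx, by)
        residuals = [y - (intercept + slope * x) for x, y in zip(bx, by)]
        # Model uncertainty comes from refitting; observation noise comes
        # from adding one residual drawn at random from this fit.
        draws.append(intercept + slope * x_new + rng.choice(residuals))
    draws.sort()
    alpha = (1 - level) / 2
    lower = draws[int(alpha * (n_boot - 1))]
    upper = draws[int((1 - alpha) * (n_boot - 1))]
    return lower, upper


if __name__ == "__main__":
    noise = random.Random(42)
    xs = list(range(20))
    ys = [2 * x + 1 + noise.gauss(0, 1) for x in xs]  # toy data, not from the post
    print(bootstrap_prediction_interval(xs, ys, x_new=10))
```

Adding the resampled residual is what separates a prediction interval from a confidence interval for the fitted mean: without it, the interval would reflect only uncertainty in the fitted line.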

Time Series

dtts v0.1.0: Provides high-frequency time-series support via nanotime and data.table. See README for examples.

ngboostForecast v0.0.2: Implements probabilistic time series forecasting via natural gradient boosting for probabilistic prediction. See README for an example.

Plot of time series with forecast

svines v0.1.4: Provides functions to fit and simulate from stationary vine copula models for time series. See Nagler et al. (2022) for the theory and look here for examples.

Plots of copula contours

Utilities

dyn.log v0.4.0: Implements dynamic, configuration driven logging. There are vignettes on Configuration, Formats, Layouts, and Levels.

formatters v0.2.0: Provides a framework for rendering complex tables to ASCII, and a set of formatters for transforming values or sets of values into ASCII-ready display strings. See the vignette for examples.

git4r v0.1.2: Implements an interactive git user interface from the R command line that includes tools to make commits, branches, remotes, and diffs an integrated part of R coding. See the vignette.

Visualization

ggmice v0.0.1: Provides functions to enhance a mice imputation workflow with visualizations for incomplete and imputed data including functions to inspect missing data, develop imputation models, evaluate algorithmic convergence, and compare observed versus imputed data. See the Getting Started Guide and the vignette Old Friends.

Missing data pattern plot

langevitour v0.2: Implements an HTML widget that uses Langevin dynamics to show random walks through 2D projections of numerical data. It can be used from within R, or included in a self-contained Rmarkdown document. See the vignette for examples.

2D projection of RNA sequence

picker v0.2.6: Provides functions to zoom, pan, and pick points from a deck.gl scatterplot and includes tooltips, labels, a grid overlay, legends, and coupled interactions across multiple plots. See README for examples.

gif illustrating point selection
