January 2022: "Top 40" New CRAN Packages

by Joseph Rickert

Two hundred and two new packages made it to CRAN in January. Several were in application areas that don’t show up very often, and a few of these made it into my January “Top 40” picks. My selections are listed below in thirteen categories: Agriculture, Computational Methods, Data, Engineering, Finance, Genomics, Machine Learning, Mathematics, Medicine, Science, Statistics, Utilities, and Visualization. Slightly expanding the number of categories helps to emphasize the various application areas, but makes it more difficult to classify packages that could fit in multiple categories. I hope this does not inconvenience readers.

Agriculture

ALUES v0.2.0: Provides functions for fuzzy modeling to evaluate land suitability for the production of different crops according to the methodology established by the Food and Agriculture Organization and the International Rice Research Institute. There is a Getting Started Guide along with seven vignettes including: Methodology and Visualizing with maps.

Maps showing suitability of Marinduque for banana cultivation

Computational Methods

CGNM v0.1.1: Implements the cluster Gauss-Newton method to find multiple solutions to nonlinear least squares problems. See Aoki et al. (2020) for background and the vignette to get started.

Plots showing initial and final parameter distributions

simpr v0.2.2: Implements a tidyverse-friendly framework for simulation studies, design analysis, and power analysis. It enables users to generate data, fit models, and tidy up model results in a single pipeline. There is an Introduction and there are short vignettes on Optimization, Reproducing simulations, and Managing simulation errors.

Data

chessR v1.5.0: Provides functions that enable users to extract chess game data from popular chess sites, including Lichess and Chess.com, and then perform analyses on the game data. See the vignette.

Plots showing distributions of chess results

dictionaRy v0.1.1: Implements an interface to the Free Dictionary API which allows users to retrieve dictionary definitions for English words, as well as additional information including phonetics, part of speech, origins, audio pronunciation, example usage, synonyms and antonyms which are returned in tidy format for ease of use.

flightsbr v0.1.0: Provides functions to download flight and airport data from Brazil’s Civil Aviation Agency (ANAC), including detailed information on all aircraft, aerodromes, airports, and airport movements registered in ANAC, and on every international flight to and from Brazil, as well as domestic flights within the country. There is an Introduction and there are vignettes on Airport data and Flights data.

Plot showing daily number of flights in Brazil for 2019 and 2020

rGhanaCensus v0.1.0: Contains literacy and education data sets scraped from the 2021 Ghana Population and Housing Census. See the vignette.

Map of Ghana showing percentage of school dropouts by region and gender

Engineering

TesiproV v0.9.1: Provides functions to calculate the failure probability of civil engineering problems with Level I up to Level III Methods. See Au & Beck (2001) for background and the vignette for examples.

Finance

PDtoolkit v0.1.0: Provides functions for developing probability of default rating models including functions for imputations, binning of numeric and categorical risk factors, weights of evidence, information value calculations, and risk factor clustering as well as validation functions for testing homogeneity, heterogeneity, discriminatory and predictive power of the model. See README for examples.
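To make the weights of evidence and information value terms concrete, here is a minimal sketch in Python. It is not PDtoolkit's API: the function name `woe_iv` and the input layout are hypothetical, and it assumes the common convention WoE = ln(share of goods / share of bads) with nonzero counts in every bin.

```python
import math

def woe_iv(bins):
    """Compute weight of evidence per bin and the total information value.

    bins: list of (n_good, n_bad) counts, one pair per bin of a risk factor.
    Assumes every count is positive (no smoothing for empty cells).
    """
    total_good = sum(good for good, _ in bins)
    total_bad = sum(bad for _, bad in bins)
    woe = []
    iv = 0.0
    for good, bad in bins:
        share_good = good / total_good   # distribution of goods across bins
        share_bad = bad / total_bad      # distribution of bads across bins
        w = math.log(share_good / share_bad)
        woe.append(w)
        iv += (share_good - share_bad) * w
    return woe, iv

# Two bins of a hypothetical risk factor: (goods, bads)
weights, info_value = woe_iv([(80, 20), (20, 80)])
```

A larger information value suggests that the binned factor separates good and bad cases more strongly; some references define WoE with the ratio inverted, which flips the sign of each weight but leaves the information value unchanged.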

ufRisk v1.0.2: Provides functions to calculate Value at Risk (VaR) and Expected Shortfall (ES) by means of various parametric and semiparametric GARCH-type models. The approaches implemented in this package are described in Feng et al. (2020) and Letmathe et al. (2021). See README to get started.
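For readers new to these two risk measures, the sketch below computes empirical (historical) VaR and ES from a sample of losses in Python. This is only an illustration of the definitions, not the GARCH-type modeling that ufRisk implements; the function name `historical_var_es` and the quantile convention are assumptions made for the example.

```python
import math

def historical_var_es(losses, alpha=0.975):
    """Empirical Value at Risk and Expected Shortfall at level alpha.

    losses: iterable of losses (positive numbers are losses).
    VaR is taken as the empirical alpha-quantile using a simple
    'ceiling' index rule; ES is the mean of losses at or beyond VaR.
    """
    ordered = sorted(losses)
    k = max(math.ceil(alpha * len(ordered)) - 1, 0)  # index of the quantile
    var = ordered[k]
    tail = ordered[k:]                               # losses >= VaR
    es = sum(tail) / len(tail)
    return var, es

# Eight hypothetical daily losses, 75% level
var_75, es_75 = historical_var_es([1, 2, 3, 4, 5, 6, 7, 8], alpha=0.75)
```

ES is never smaller than VaR at the same level because it averages the losses in the tail that VaR only marks the start of.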

Plot showing Out of sample losses and VaR over time

Genomics

aphylo v0.2-1: Implements a parsimonious evolutionary model to analyze and predict gene-functional annotations in phylogenetic trees as described in Vega Yon et al. (2021). See the vignette for an example.

Circular plot showing prediction accuracy for annotated phylogenetic tree

edlibR v1.0.0: Implements an interface to the Edlib C/C++ library for exact sequence alignment using Levenshtein distance. See the vignette to get started.
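As a reminder of what Levenshtein distance measures, here is the textbook dynamic program in Python. It is a plain illustration of the metric, not Edlib's optimized implementation or the edlibR interface; the function name `levenshtein` is chosen for the example.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn sequence a into sequence b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j]
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(prev[j] + 1,          # delete char_a
                            curr[j - 1] + 1,      # insert char_b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]

distance = levenshtein("kitten", "sitting")  # 3
```

This version runs in time proportional to the product of the two lengths, which is why dedicated libraries matter for long genomic sequences.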

freqpcr v0.4.0: Implements functions for the interval estimation of the population allele frequency from qPCR analysis based on the restriction enzyme digestion (RED)-DeltaDeltaCq method as described in Osakabe et al. (2017). The vignette is in English and Japanese.

Probability density plots for allele content

Machine Learning

longmixr v1.0.0: Adapts the consensus clustering approach from ConsensusClusterPlus for longitudinal data using flexible mixture models from flexmix. See the vignette.

Plot showing distributions across clusters

reclin2 v0.1.1: Provides functions to assist in performing probabilistic record linkage and deduplication: generating pairs, comparing records, and using the EM algorithm to estimate m- and u-probabilities. See Fellegi & Sunter (1969) for background. There is an Introduction and there are vignettes on Deduplication, Record linkage, and Using a cluster for parallel computing.
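The role of the m- and u-probabilities can be sketched in a few lines of Python. This follows the usual Fellegi-Sunter scoring rule and is not reclin2's API: the function name `match_weight` and the probabilities below are made up for illustration, and the values are taken as given rather than estimated by EM.

```python
import math

def match_weight(agreements, m, u):
    """Fellegi-Sunter style log2 weight for one candidate record pair.

    agreements: dict field -> True if the two records agree on that field.
    m: dict field -> P(agree | records are a true match).
    u: dict field -> P(agree | records are not a match).
    """
    total = 0.0
    for field, agrees in agreements.items():
        if agrees:
            total += math.log2(m[field] / u[field])
        else:
            total += math.log2((1 - m[field]) / (1 - u[field]))
    return total

# Hypothetical probabilities for two comparison fields
m_prob = {"name": 0.9, "year": 0.8}
u_prob = {"name": 0.1, "year": 0.2}
score = match_weight({"name": True, "year": True}, m_prob, u_prob)
```

Pairs with a high total weight are more likely to refer to the same entity; a threshold on the weight then decides which pairs to link.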

rego v1.3.4: Implements a machine learning algorithm for predicting and imputing time series along with a Bayesian stochastic search methodology for model selection. Written in C++, the authors claim that the package is suitable for problems with hundreds or thousands of dependent variables and problems in which the number of dependent variables is greater than the number of observations. Look here for documentation.

Mathematics

tesselation v1.0.0: Computes Delaunay and Voronoï tessellations and provides functions to plot the 2- and 3-dimensional tessellations. Delaunay tessellations are computed in C with the help of the Qhull library. There is a vignette on GitHub.

Rotating, multicolored dodecahedron

weyl v0.0-1: Provides functions for working with Weyl Algebras. See the vignette.

Medicine

biodosetools v3.6.0: Implements a Shiny application to perform the various statistical tests and calculations needed by Biological Dosimetry Laboratories. There are vignettes on Dicentrics dose estimation and dose-effect fitting and on Translocations dose estimation and dose-effect fitting.

Screen capture of shiny app for showing coefficients plot for model fit

rccola v1.0.2: Provides secure convenience functions for entering and handling API keys and pulling data directly into memory. By default, it will load from REDCap instances, but other sources are injectable via inversion of control. See README for documentation.

Science

datelife v0.6.1: Implements the functions underlying the DateLife web service to allow researchers and the general audience to obtain open scientific data on ages for their organisms of interest. Age data are extracted from dated phylogenetic trees (chronograms) that have been published and peer-reviewed in association with a scientific article in an indexed journal. There is a Getting Started Guide and a Case Study.

dynamAedes v2.0.2: Implements a model to study the population dynamics of invasive Aedes mosquitoes. See Da Re et al. (2021) for the model rationale, and Da Re et al. (2021) for the model framework. The vignette provides an example.

Plots showing interquartile range of albopictus abundance over time

Statistics

autoReg v0.1.0: Provides functions to create summary tables for descriptive statistics and select explanatory variables automatically for various regression models including linear models, generalized linear models, and Cox proportional hazards models. There is a Getting Started Guide and there are vignettes on Automatic Regression Modeling, Bootstrap Simulation, Statistical tests, and Survival Analysis.

Hazard Ratio table and plot

conformalInference.multi v1.0.0: Provides functions to compute full conformal, split conformal and multi split conformal prediction regions when the response variable is multivariate (i.e. dimension is greater than one). For background see Lei et al. (2016), Diquigiovanni et al. (2021), and Solari & Djordjilovic (2021). Look here for examples.

Plot of Conformal prediction region

gamselBayes v1.0-2: Provides functions to fit and select generalized additive models via approximate Bayesian inference according to the methodology described in He & Wand (2021). There is a vignette.

interpretCI v0.1.1: Provides functions for estimating confidence intervals for various statistics and plotting the results. There is an Introduction along with ten additional vignettes for the various statistics including CI for difference between proportions and Hypotheses test for difference between paired means.

Plots illustrating difference between paired means test

lmls v0.1.0: Implements functions for the Gaussian location-scale regression model (a multi-predictor model with explanatory variables for the mean (location) and the standard deviation (scale) of a response variable) using the algorithms described in Girolami & Calderhead (2011) and Nesterov (2009). See the vignette.

sandwichr v1.0.1: Implements the Spatial Stratified Heterogeneity (SSH) spatial interpolation algorithms described in Wang et al. (2013). See the vignette.

Diagram showing information flow in SSH spatial interpolation model

Utilities

gittargets v0.0.3: Provides functions to preserve historical output in targets workflows by capturing version-controlled snapshots of the data store. Each snapshot links to the underlying commit of the source code enabling users to recover contemporaneous data when rolling back to a previous commit. See the Tutorial.

Diagram of gittargets workflow model

httptest2 v0.1.0: For packages using httr2 this package enables testing all of the logic on the R side of the API without requiring access to the remote service. It also allows recording real API responses to use as test fixtures. There is an Introduction and there are vignettes on Modifying Recorded Requests, Writing Vignettes with API’s, and FAQs.

maybe v0.2.0: Implements a maybe type which represents the possibility of some value or nothing. This may be used instead of throwing an error or returning NULL. A maybe type has the advantages of being composable and requiring the developer to explicitly acknowledge the potential absence of a value. Look here to get started.
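The idea of a maybe type is easiest to see in a small example. The Python sketch below is a generic illustration of the pattern, not the maybe package's interface; the class and method names (`Just`, `Nothing`, `map`, `and_then`, `with_default`) and the helper `safe_sqrt` are assumptions for the example.

```python
from dataclasses import dataclass
from typing import Any

@dataclass(frozen=True)
class Just:
    """Wraps a value that is present."""
    value: Any

    def map(self, f):
        return Just(f(self.value))   # apply f and stay wrapped

    def and_then(self, f):
        return f(self.value)         # f itself returns Just or Nothing

    def with_default(self, default):
        return self.value            # unwrap, the default is not needed

@dataclass(frozen=True)
class Nothing:
    """Represents the absence of a value."""

    def map(self, f):
        return self                  # nothing to transform

    def and_then(self, f):
        return self                  # short-circuit the chain

    def with_default(self, default):
        return default               # caller must say what absence means

def safe_sqrt(x):
    """Return Just(sqrt(x)) for non-negative x, otherwise Nothing()."""
    return Just(x ** 0.5) if x >= 0 else Nothing()

present = safe_sqrt(16).map(lambda v: v + 1).with_default(0)
absent = safe_sqrt(-1).map(lambda v: v + 1).with_default(0)
```

Because every step returns a wrapped value, calls can be chained without intermediate checks, and the only way to get a plain value back out is to state a default for the missing case.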

nanonext v0.2.0: Implements an R binding for NNG (Nanomsg Next Gen), a socket library for implementing a high-performance cross-platform protocol standard for messaging and communications. It serves as a concurrency framework that can be used for building distributed applications. See README to get started.

powerjoin v0.0.1: Provides extensions of dplyr and fuzzyjoin join functions to preprocess data, apply various data checks, and deal with conflicting columns. See README for examples.

quickcheck v0.1.0: Builds on the framework provided by hedgehog to implement property-based testing in R. It was inspired by QuickCheck and has been designed to seamlessly integrate with testthat. See README for an introduction.
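Property-based testing checks that a statement holds for many randomly generated inputs rather than a few hand-picked ones. The Python sketch below shows the bare idea using only the standard library; it is not the quickcheck or hedgehog API, all names are hypothetical, and unlike real frameworks it does not shrink counterexamples.

```python
import random

def check_property(prop, gen, trials=200, seed=0):
    """Run prop on randomly generated cases.

    Returns the first counterexample found, or None if all trials pass.
    """
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    for _ in range(trials):
        case = gen(rng)
        if not prop(case):
            return case
    return None

def gen_int_list(rng):
    """Generate a list of 0 to 20 integers between -100 and 100."""
    return [rng.randint(-100, 100) for _ in range(rng.randint(0, 20))]

def reverse_twice_is_identity(xs):
    return list(reversed(list(reversed(xs)))) == xs

def sorting_changes_nothing(xs):
    return sorted(xs) == xs    # deliberately false for unsorted input

passed = check_property(reverse_twice_is_identity, gen_int_list)
failed = check_property(sorting_changes_nothing, gen_int_list)
```

Here `passed` is `None` because the property always holds, while `failed` holds an unsorted list that refutes the second property.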

Visualization

fisheye v0.1.0: Provides functions to transform base maps to focus on a specific location using an azimuthal logarithmic distance transformation. See README to get started.

Gif showing transformation of map of Paris

forestplotter v0.1.1: Provides functions to create multi-column forest plots with confidence intervals that may be grouped. See the vignette.

Forest plots for two groups with table of covariates

geomtextpath v0.1.0: Implements a ggplot2 extension which allows text to follow curved paths. Curved text makes it easier to directly label paths or neatly annotate in polar co-ordinates. There is an Introduction and there are vignettes on Curved Text in Polar Plots and Aesthetics.

Plot with spiraling text

ggESDA v0.1.0: Implements an extension of ggplot2 to visualize symbolic data and also provides a function to transform classical data to symbolic data using either a clustering algorithm or customized methods.

Scatterplot where rectangles represent intervals between two variables

toastui v0.2.1: Implements an interface to the TOAST UI libraries for creating interactive tables and plots that can be integrated into Shiny applications. There is a vignette and a webpage with interactive examples.

Figure showing calendar page

tornado v0.1.1: Implements linear models, generalized linear models, survival regression models, and machine learning models using the caret package framework and draws tornado plots to visualize the range of outputs expected from a variety of inputs, or alternatively, the sensitivity of the output to the range of inputs. See the vignette.

Example of a tornado plot

