April 2021: "Top 40" New CRAN Packages

by Joseph Rickert

One hundred seventy-nine new packages made it to CRAN in April. Here are my “Top 40” picks in twelve categories: Computational Methods, Data, Genomics, Machine Learning, Mathematics, Medicine, Networks, Operations Research, Statistics, Time Series, Utilities, and Visualization.

Computational Methods

abess v0.1.0: Provides a toolkit for solving the best subset selection problem in linear regression, logistic regression, Poisson regression, the Cox proportional hazards model, multiple-response Gaussian regression, and multinomial regression. It implements and generalizes algorithms described in Zhu et al. (2020) that exploit a novel sequencing-and-splicing technique to guarantee exact support recovery and a globally optimal solution in polynomial time. There is an Introduction.

eat v0.1.0: Provides functions to determine production frontiers and technical efficiency measures through non-parametric techniques based upon regression trees. See Esteve et al. (2020) for details. There is an Introduction.

Data

childdevdata v1.1.0: Bundles publicly available data sets with individual milestone data for children aged 0-5 years, with the aim of supporting the construction, evaluation, validation and interpretation of methodologies that aggregate milestone data into informative measures of child development. See README.

datagovindia v0.0.3: Allows users to search the open data platform of the government of India and communicate with the more than 80,000 available APIs. See the vignette.

lehdr v0.2.4: Provides functions to query the LODES FTP server to obtain Longitudinal Employer-Household Dynamics (LEHD) data and optionally aggregate Census block-level data. See the vignette.

rbioapi v0.7.0: Provides a consistent R interface to the Biologic Web Services API and fully supports miEAA, PANTHER, Reactome, STRING, and UniProt. See this vignette to get started.

tidywikidatar v0.2.0: Provides functions to query Wikidata, get tidy data frames in response, and cache data in a local SQLite database. See README.

Genomics

protti v0.1.1: Provides functions and workflows for proteomics quality control and data analysis of both limited proteolysis-coupled mass spectrometry and regular bottom-up proteomics experiments. See Feng et al. (2014) for background. There are vignettes for various workflows: Dose Response, Single Treatment Dose Response, Input Preparation, and Quality Control.

Rediscover v0.1.0: Implements an optimized method for identifying mutually exclusive genomic events based on the Poisson-Binomial distribution that takes into account that some samples are more mutated than others. See Canisius et al. (2016). The vignette provides an introduction.

Machine Learning

geocmeans v0.1.1: Provides functions to apply spatial fuzzy unsupervised classification and to visualize and interpret the results, along with indices for estimating spatial consistency and classification quality. See Cai et al. (2007), Zhao et al. (2013), and Gelb & Apparicio (2021) for background. There is an Introduction and an additional vignette.

Rforestry v0.9.0.4: Provides fast implementations of Honest Random Forests, Gradient Boosting, and Linear Random Forests, with an emphasis on inference and interpretability. See Kunzel et al. (2019). See README to get started.

Mathematics

elasdics v0.1.2: Provides functions to align curves and to compute mean curves based on the elastic distance defined in the square-root-velocity framework. For information on the framework, see Srivastava and Klassen (2016). For more theoretical details, see Steyer et al. (2021).

jordan v1.0-1: Provides functions to manipulate Jordan algebras, commutative but non-associative algebraic structures that satisfy the Jordan identity: (xy)x² = x(yx²). See McCrimmon (2004).
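To make the identity concrete, here is a minimal Python sketch that does not use the jordan package or its API. It assumes the standard textbook example of a Jordan algebra: square matrices under the symmetrized product x∘y = (xy + yx)/2. The helper names `matmul` and `jordan` and the test matrices are made up for illustration.

```python
from fractions import Fraction

def matmul(a, b):
    """Ordinary (associative) product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def jordan(a, b):
    """Symmetrized product a∘b = (ab + ba)/2, computed exactly with Fractions."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[Fraction(ab[i][j] + ba[i][j], 2) for j in range(n)] for i in range(n)]

# Arbitrary illustrative matrices (hypothetical data).
x = [[1, 2], [3, 4]]
y = [[0, 1], [5, -2]]

x2 = jordan(x, x)

# Jordan identity: (x∘y)∘x² == x∘(y∘x²)
jordan_identity_holds = jordan(jordan(x, y), x2) == jordan(x, jordan(y, x2))

# The product is commutative ...
commutative = jordan(x, y) == jordan(y, x)

# ... but not associative in general: compare (s∘s)∘e with s∘(s∘e).
s = [[0, 1], [1, 0]]
e = [[1, 0], [0, 0]]
associative_here = jordan(jordan(s, s), e) == jordan(s, jordan(s, e))

print(jordan_identity_holds, commutative, associative_here)
```

Exact rational arithmetic is used so that the equality checks are not affected by floating-point rounding.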

Medicine

ccoptimalmatch v0.1.0: Uses sub-sampling to create pseudo-observations of controls to optimally match cases with controls. See Mamouris (2021) for the theory and the vignette for examples.

nCov2019 v0.4.4: Implements an interface to the disease.sh Open Disease Data API to access real-time and historical data on COVID-19 cases, vaccines, and therapeutics. There is a vignette.

hlaR v0.1.0: Implements a tool for the eplet analysis of donor and recipient HLA (human leukocyte antigen) mismatches. There are vignettes on Imputation and Eplet Mismatch and a Shiny App as well.

ReviewR v2.3.6: Implements a portable Shiny tool to explore patient-level electronic health record data and perform chart review in a single integrated framework. This tool supports the OMOP common data model as well as the MIMIC-III data model, and chart review through a REDCap API. See the ReviewR website for more information. There are several vignettes, including Local, Docker, BigQuery, and Shiny Server deployment, and one on performing a Chart Review.

Networks

greed v0.5.1: Provides an ensemble of algorithms to enable clustering of networks and data matrices with different types of generative models. Model selection and clustering are performed jointly by optimizing the Integrated Classification Likelihood. The optimization is performed with a combination of greedy local search and a genetic algorithm. See Côme et al. (2021) for background and the vignettes on Gaussian Mixture Models and Clustering.

Operations Research

critpath v0.1.2: Provides functions to compute critical paths, schedules, PERT charts and Gantt charts. There is a vignette on CPM and PERT and another on the LESS Method.
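For readers new to the critical path method (CPM), the following Python sketch shows the forward and backward passes that underlie it. It is not the critpath API: the function name `critical_path`, its arguments, and the four-activity project are hypothetical, and the sketch assumes the precedence graph has no cycles.

```python
from collections import defaultdict

def critical_path(durations, predecessors):
    """Return (project duration, list of critical activities).

    durations:    dict mapping activity -> duration
    predecessors: dict mapping activity -> list of activities that must finish first
    Assumes the precedence graph is acyclic (no cycle detection is done).
    """
    # Topological order via depth-first search: predecessors come first.
    order, seen = [], set()

    def visit(a):
        if a in seen:
            return
        seen.add(a)
        for p in predecessors.get(a, []):
            visit(p)
        order.append(a)

    for a in durations:
        visit(a)

    # Forward pass: earliest start (es) and earliest finish (ef).
    es, ef = {}, {}
    for a in order:
        es[a] = max((ef[p] for p in predecessors.get(a, [])), default=0)
        ef[a] = es[a] + durations[a]
    total = max(ef.values())

    # Backward pass: latest finish (lf) and latest start (ls).
    successors = defaultdict(list)
    for a, preds in predecessors.items():
        for p in preds:
            successors[p].append(a)
    lf, ls = {}, {}
    for a in reversed(order):
        lf[a] = min((ls[s] for s in successors[a]), default=total)
        ls[a] = lf[a] - durations[a]

    # Critical activities have zero slack: latest start equals earliest start.
    critical = [a for a in order if ls[a] == es[a]]
    return total, critical

# Hypothetical project: B and C follow A, and D follows both B and C.
durations = {"A": 3, "B": 2, "C": 4, "D": 1}
predecessors = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}
total, critical = critical_path(durations, predecessors)
print(total, critical)  # 8 ['A', 'C', 'D']
```

Activity B can slip by two time units without delaying the project, so it is not on the critical path.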

himach v0.1.2: Provides functions to compute the best routes between airports for supersonic aircraft flying subsonic over land. There is an Introduction to Supersonic Routing and a vignette on Advanced Supersonic Routing.

Statistics

convdistr v1.5.3: Provides functions to compute convolutions of probability distributions by creating a new random number generator function that applies a mathematical operation to individual random samples drawn from each distribution's random generator function. There is an Introduction and a vignette on Sample Size.
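The sampling idea can be sketched in a few lines of Python. This is only an illustration of the general technique, not the convdistr API: the function `convolve`, its arguments, and the choice of two uniform distributions are assumptions made for the example.

```python
import operator
import random
import statistics

def convolve(op, *generators):
    """Build a new random generator function from existing ones.

    Each element of `generators` is a zero-argument function returning one draw.
    The returned function draws one sample from every generator and combines
    the draws with `op`, repeated n times.
    """
    def rng(n):
        return [op(*(g() for g in generators)) for _ in range(n)]
    return rng

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical example: the sum of two independent Uniform(0, 1) variables.
u1 = lambda: random.uniform(0, 1)
u2 = lambda: random.uniform(0, 1)
sum_rng = convolve(operator.add, u1, u2)

samples = sum_rng(20000)
sample_mean = statistics.mean(samples)
print(round(sample_mean, 2))
```

The sample mean should land close to 1, the theoretical mean of the sum, and the same pattern works for other operations such as subtraction or multiplication.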

gamlss.lasso v1.0-0: Provides an interface for extra high-dimensional smooth functions for Generalized Additive Models for Location Scale and Shape (GAMLSS) including lasso, ridge, elastic net and least angle regression. The gamlss website provides considerable information.

GGMnonreg v1.0.0: Provides functions to estimate non-regularized Gaussian graphical models, Ising models, and mixed graphical models. See Williams et al. (2019), Williams & Rast (2019), and Williams (2020) for details. README contains examples.

relevance v1.1: Implements the concepts of relevance and significance measures introduced in Stahel (2021) to augment inference with p-values. See the vignette for examples.

sasfunclust v1.0.0: Implements the sparse and smooth functional clustering method described in Centofanti et al. (2021) that aims to classify a sample of curves into homogeneous groups while jointly detecting the most informative portions of domain. See README to get started.

survMS v0.0.1: Provides functions to simulate data from the Accelerated Hazard, Accelerated Failure Time, and Cox survival models. See Bender et al. (2004) for the methods used to implement the Cox model, and the vignette and GitHub for an introduction and examples.

TestGardener v0.1.4: Provides functions to develop, evaluate, and score multiple choice examinations, psychological scales, questionnaires, and similar types of data involving sequences of choices among one or more sets of answers. See Ramsay et al. (2020) and Ramsay et al. (2019) for the methodology and the vignettes Symptom Distress Analysis and SweSAT Quantitative Analysis.

wpa v1.5.0: Provides opinionated functions to enable easier and faster analysis of Workplace Analytics data. See the vignette for an introduction.

Time Series

garchmodels v0.1.1: Implements a framework for using GARCH models with the tidymodels ecosystem. It includes both univariate and multivariate methods from the rugarch and rmgarch packages. There is a Getting Started Guide and a vignette on tuning univariate GARCH models.

tensorTS v0.1.1: Provides functions for estimating, simulating and predicting factor and autoregressive models for matrix and tensor valued time series. See Chen et al. (2020), Chen et al. (2020), and Han et al. (2020) for the math.

Utilities

diffmatchpatch v0.1.0: Implements a wrapper for Google’s diff-match-patch library. It provides basic tools for computing diffs, finding fuzzy matches, and constructing / applying patches to strings. See README for examples.

erify v0.2.0: Provides several validator functions to check whether arguments passed by users have valid types, lengths, etc., and if not, to generate informative and well-formatted error messages in a consistent style. See the vignette to get started.

juicr v0.1: Provides a GUI for automating data extraction from multiple images containing scatter and bar plots, semi-automated tools to tinker with extraction attempts, and a fully-loaded point-and-click manual extractor with image zoom, calibrator, and classifier. See the vignette for examples, and the YouTube channel for a course on meta-analysis.

mailmerge v0.2.1: Allows users to mail merge using markdown documents and Gmail, parse markdown documents as the body of email, use the yaml header to specify the subject line of the email, preview the email in the RStudio viewer pane, and send (draft) email using gmailr. See the vignette for examples.

m61r v0.0.2: Provides dplyr- and tidyr-like data manipulation functions using only base R and no dependencies. See the vignette for examples.

Visualization

flametree v0.1.2: Implements a generative art system for producing tree-like images using an L-system to create the structures. See README to get started.

leafdown v1.0.0: Provides drill down functionality for leaflet choropleths in shiny apps. There is an Introduction and a Showcase example.

mapping v1.2: Provides coordinates, linking and mapping functions for mapping workflows of different geographical statistical units. Geographical coordinates automatically link with the input data to generate maps. See the vignette to get started.

materialmodifier v1.0.0: Provides functions to apply image processing effects to modify perceived material properties such as gloss, smoothness, and blemishes. Look here for documentation and practical tips on using the package.

svplots v0.1.0: Implements two versions of sample variance plots illustrating the squared deviations from sample variance as described in Wijesuriya (2020). See the vignette.

vivid v0.1.0: Provides a suite of plots for displaying variable importance and two-way variable interaction. Plots include partial dependence plots laid out in “pairs plot” or zenplots style. There is an Introduction and a Quick Start Guide.

