March 2023: "Top 40" New CRAN Packages

by Joseph Rickert

Accounting

debkeepr v0.1.1: Provides tools to analyze historical, non-decimal currencies and value systems that use tripartite or tetrapartite systems such as pounds and shillings in the context of double-entry bookkeeping. See the Getting Started guide and the vignettes Analysis of Richard Dafforne’s Journal and Ledger and Transactions in Richard Dafforne’s Journal.
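To make the tripartite idea concrete, here is a minimal sketch of pounds, shillings, and pence arithmetic. It is written in Python for illustration and is not debkeepr's API. The function names `normalize_lsd` and `add_lsd`, and the default bases of 20 shillings to the pound and 12 pence to the shilling, are assumptions chosen for the example.

```python
def normalize_lsd(pounds, shillings, pence, bases=(20, 12)):
    """Carry pence into shillings and shillings into pounds.

    bases = (shillings per pound, pence per shilling). The defaults are an
    assumption matching the familiar English system; other systems differ.
    """
    s_per_l, d_per_s = bases
    carry_s, pence = divmod(pence, d_per_s)
    carry_l, shillings = divmod(shillings + carry_s, s_per_l)
    return pounds + carry_l, shillings, pence


def add_lsd(a, b, bases=(20, 12)):
    """Add two (pounds, shillings, pence) tuples and normalize the sum."""
    return normalize_lsd(a[0] + b[0], a[1] + b[1], a[2] + b[2], bases)


total = add_lsd((5, 15, 8), (3, 9, 11))
print(total)  # (9, 5, 7)
```

Passing a different `bases` tuple covers other tripartite systems; a tetrapartite system would need one more unit and one more carry step.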

Plot showing whether the original data is recovered by the dSVD

Computational Methods

ABM v0.3: Implements a high-performance, flexible, and extensible framework to develop continuous-time agent-based models capable of simulating millions of agents in which state transitions may be either spontaneous or caused by agent interactions. See README for multiple examples including Simulate an agent based SEIR model and Simulate contact tracing on an SIR model.
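The distinction between spontaneous and interaction-driven transitions can be sketched with a generic Gillespie-style simulation. The Python example below is not the ABM package's implementation and makes no attempt at its performance; it uses an SIR model rather than SEIR for brevity, and the function name `simulate_sir` and all parameter names and values are illustrative assumptions.

```python
import random


def simulate_sir(n_agents=200, n_infected=5, beta=0.3, gamma=0.1,
                 t_max=100.0, seed=1):
    """Continuous-time SIR on individual agents (Gillespie-style).

    Infection (S -> I) is caused by contact with an infectious agent;
    recovery (I -> R) is spontaneous. Returns a list of (time, S, I, R).
    """
    rng = random.Random(seed)
    state = ["I"] * n_infected + ["S"] * (n_agents - n_infected)
    t = 0.0
    history = [(t, n_agents - n_infected, n_infected, 0)]
    while True:
        susceptible = [i for i, s in enumerate(state) if s == "S"]
        infectious = [i for i, s in enumerate(state) if s == "I"]
        if not infectious:
            break  # no further events are possible
        rate_infect = beta * len(susceptible) * len(infectious) / n_agents
        rate_recover = gamma * len(infectious)
        total = rate_infect + rate_recover
        t += rng.expovariate(total)  # waiting time to the next event
        if t > t_max:
            break
        # Pick the event type in proportion to its rate, then pick an agent.
        if rng.random() * total < rate_infect:
            state[rng.choice(susceptible)] = "I"
        else:
            state[rng.choice(infectious)] = "R"
        history.append((t, state.count("S"), state.count("I"), state.count("R")))
    return history


history = simulate_sir()
print(len(history), history[-1])
```

Each loop iteration rescans all agents, which is fine for a few hundred agents but would not scale to the millions the package targets.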

rvMF v0.0.7: Provides functions to generate pseudo-random vectors that follow an arbitrary von Mises-Fisher distribution on a sphere, including functions to generate random variates and to compute the density of the inner product between a von Mises-Fisher random vector and its mean direction. Look here for an example.
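For intuition, the special case of the ordinary sphere in three dimensions has a closed form: the inner product w between the random vector and its mean direction has density proportional to exp(kappa * w) on [-1, 1], so it can be drawn by inverting its CDF. The Python sketch below covers only that case and is not the rvMF API; the function names are hypothetical, and it assumes kappa > 0 and a unit-length mean direction.

```python
import math
import random


def sample_cosine(kappa, rng):
    """Draw w = <x, mu> by inverting its CDF (sphere in R^3 only)."""
    u = rng.random()
    return 1.0 + math.log(u + (1.0 - u) * math.exp(-2.0 * kappa)) / kappa


def cosine_density(w, kappa):
    """Density of w on [-1, 1] in R^3: kappa * exp(kappa * w) / (2 * sinh(kappa))."""
    return kappa * math.exp(kappa * w) / (2.0 * math.sinh(kappa))


def sample_vmf3(mu, kappa, rng):
    """Draw one unit vector in R^3 from vMF(mu, kappa); mu must be a unit vector."""
    w = sample_cosine(kappa, rng)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(max(0.0, 1.0 - w * w))
    x = [r * math.cos(phi), r * math.sin(phi), w]  # sample around the north pole
    # Householder reflection that maps the north pole (0, 0, 1) onto mu.
    v = [-mu[0], -mu[1], 1.0 - mu[2]]
    vv = sum(c * c for c in v)
    if vv < 1e-24:  # mu is already the north pole
        return x
    scale = 2.0 * sum(a * b for a, b in zip(v, x)) / vv
    return [a - scale * b for a, b in zip(x, v)]


rng = random.Random(42)
mu = [0.0, 0.6, 0.8]
for _ in range(3):
    print(sample_vmf3(mu, 5.0, rng))
```

Because the tangential component is uniform around the pole, reflecting instead of rotating does not change the distribution. Higher dimensions need a different sampler for w.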

Data

FertNet v0.1.1: Provides tools to process data from The Social Networks and Fertility Survey including functions for correcting respondent errors and for transforming network data into network objects to facilitate analyses and visualization. See README to get started.

Visualization of a network for one of the respondents

oldbailey v1.0.0: Provides functions to fetch trial data from the Old Bailey Online API. Data includes the names of the first person speakers, defendants, victims, their recorded genders, verdicts, punishments, crime locations, and dates. Look here for an example.

webtrackR v0.0.1: Implements data structures and methods to work with web tracking data, including data preprocessing steps, methods to construct audience networks as described in Mangold & Scharkow (2020) and metrics of news audience polarization described in Mangold & Scharkow (2022). Look here to get started.

Ecology

GIFT v1.0.0: Provides functions to retrieve regional plant checklists, species traits and distributions, and environmental data from the Global Inventory of Floras and Traits database and to visualize the map of available flora. There is an introductory Tutorial, an Advanced Tutorial, and a vignette on Queries.

Projection map of angiosperms

rTLsDeep v0.0.5: Uses terrestrial laser scanning and deep learning to classify post-hurricane damage severity at the individual tree level. See Klauberg et al. (2023) for details, and look here for an example.

3D Tree Scan

Finance

HSRFA v0.1.1: Implements two algorithms to do robust factor analysis by considering the Huber loss: one is based on minimizing the Huber loss of the idiosyncratic error’s L2 norm, the other is based on minimizing the element-wise Huber loss. See He et al. (2023) for background, Bai (2003) for PCA code, and He et al. (2022) and Chen et al. (2021) for the Quantile Factor Analysis method.
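The difference between the two objectives is easiest to see on a single error vector. The Python sketch below uses the standard Huber loss with an assumed threshold `delta`; the function names are illustrative, and the package's exact objective and tuning may differ.

```python
import math


def huber(r, delta=1.0):
    """Standard Huber loss: quadratic near zero, linear in the tails."""
    a = abs(r)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


def huber_of_norm(errors, delta=1.0):
    """Huber loss applied once to the L2 norm of the whole error vector."""
    return huber(math.sqrt(sum(e * e for e in errors)), delta)


def elementwise_huber(errors, delta=1.0):
    """Huber loss applied to each entry of the error vector, then summed."""
    return sum(huber(e, delta) for e in errors)


errors = [3.0, 4.0]
print(huber_of_norm(errors))      # 4.5
print(elementwise_huber(errors))  # 6.0
```

The norm-based version treats a whole observation as an outlier or not, while the element-wise version down-weights individual large entries.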

PCRA v1.0: Provides a collection of functions and several real-world data sets that support teaching a quantitative finance MS level course on Portfolio Construction and Risk Analysis. See the vignette: Introduction to CRSP Stocks and SPGMI Factors in PCRA.

Genomics

GESciLiVis v1.1.0: Provides tools to visualize publication activity per gene based on a gene list and a user-defined set of keywords to perform an NCBI database search as in PubMed. See the vignette.

Bar plot of results from search of human gene set

ggpicrust2 v1.6.0: Provides tools to analyze and visualize PICRUSt2 output with pre-defined plots and functions, including a one-click option for creating publication-level plots. For more details, see Yang et al. (2023). Look here for examples.

gsdensity v0.1.2: Implements a computational tool for pathway-centric analysis of single-cell data, including scRNA-seq data and spatial genomics data. Given a gene set and a cell-by-gene matrix, it asks whether the gene set is enriched in a subpopulation of the cells. See README for examples.

Seurat annotations on UMAP vs UMAP plot

metaGE v1.0.0: Provides tools for conducting genome-wide association studies of Genotype x Environment interactions, including functions to collect the results of GWAS data from different files, infer the inter-environment correlation matrix, perform a global test procedure for quantitative trait loci detection, and perform tests of contrast or meta-regression. See De Walsche et al. (2023) for the details.

Machine Learning

FACT v0.1.0: Implements an algorithm-agnostic framework for feature attribution while preserving the integrity of the data and facilitating the understanding of the mapping procedure of an algorithm that assigns instances to clusters. See README to get started.

Density plots for three clusters

lpda v1.0.1: Implements the linear programming classification method described by Nueda et al. (2022), which is advantageous when variable distributions are unknown or when the number of variables is much greater than the number of observations. See the vignette.

Plot showing the separating hyperplane

UBayFS v1.0: Implements the user-guided Bayesian framework proposed by Jenul et al. (2022) for ensemble feature selection. See the Introduction and the vignette on Block feature selection.

Plot showing features and constraints

Mathematics

qfratio v1.0.1: Provides functions to evaluate moments of ratios and products of quadratic forms in normal variables using recursive algorithms developed by Bao and Kan (2013) and Hillier et al. (2014). See README for examples.

Medicine

gsDesign2 v1.0.7: Provides tools to enable fixed or group sequential design under non-proportional hazards assumptions that support flexible enrollment, time-to-event and time-to-dropout assumptions. Design methods include average hazard ratio, the weighted logrank tests in Yung and Liu (2019), and MaxCombo tests. See the vignette to get started.

NCC v1.0: Provides functions to simulate and analyze platform trials with non-concurrent controls. See Bofill Roig et al. (2022), Saville et al. (2022), and Schmidli et al. (2014) for background. There is a brief Introduction and there are vignettes on simulating binary data, continuous data, and How to run a simulation study.

For treatments that enter the trial later, the control group is divided into concurrent (CC) and non-concurrent controls (NCC)

Pharma

DrugExposureDiagnostics v0.4.1: Provides ingredient specific diagnostics for drug exposure records in the Observational Medical Outcomes Partnership (OMOP) common data model. See the Introduction and the Summary of checks vignette.

rlistings v0.1.1: Provides functions to create and display listings for clinical trials. See the Getting Started Guide.

Science

LCMSQA v1.0.0: Provides functions to check the quality of liquid chromatograph/mass spectrometry (LC/MS) experiments using an interactive shiny application. Tests include total ion current chromatogram, base peak chromatogram, mass spectrum, and extracted ion chromatogram. See the Introduction.

Feature detection screen

Statistics

lmw v0.0.1: Provides functions to compute the implied weights of linear regression models for estimating average causal effects and provides diagnostics based on these weights. See Chattopadhyay and Zubizarreta (2022) where several regression estimators are represented as weighting estimators, in connection with inverse probability weighting. Look here for examples.

Plot of sample influence curve

ptable v1.0.0: Implements the cell-key statistical disclosure control perturbation technique to protect confidential information. See Giessing (2016) for the technical details and the vignette for examples.

Plot of Distribution of the Perturbation Values vs Noise

snha v0.1.3: Implements the St. Nicolas House Analysis to explore interacting variables and create correlation networks. See the vignette.

Plots contrasting PCA and SNHA approaches to variable interactions

satdad v1.1: Implements theoretical and non-parametric tools to analyze tail dependence in sample based or theoretical models. A goal is to generate multivariate extreme value models in any dimension. See the extensive vignette.

Plots

sr v0.1.0: Implements the Gamma test based smooth regression method for measuring smoothness in multivariate relationships, finding causal connections in precision data, finding lags and embeddings in time series, and training neural networks. See Evans & Jones (2002) and Jones (2004) for details and the vignette for examples.

Plot of Henon Model using Gamma

wqspt v1.0.1: Implements a permutation test method for the weighted quantile sum (WQS) regression used to evaluate the effect of complex exposure mixtures on an outcome. See Carrico et al. (2015) and Day et al. (2022) for the theory and the vignette for examples.

Table of weights from permutation test

Time Series

coconots v1.1.1: Provides tools for fitting, validating, and forecasting practical convolution-closed time series models for low counts. The models are described in Jung and Tremayne (2011), and the model assessment tools are presented in Czado et al. (2009), Gneiting and Raftery (2007), and Tsay (1992). See README for examples.

Diagram showing functionality

sparseDFM v1.0: Implements various estimation methods for dynamic factor models (DFMs), including PCA (see Stock and Watson (2002)), EM (see Banbura and Modugno (2014)), and sparse DFMs (see Mosley et al. (2023)). There are vignettes on Nowcasting UK Trade in Goods and Inflation.

Plot of factor loadings

Utilities

askgpt v0.0.2: Implements a connection to the OpenAI API to answer questions about R. See the vignette.

Example of ChatGPT answer

cellKey v1.0.1: Implements a method to protect statistical data by computing cell keys for individual cells in statistical tables. The theory behind the method is described in Thompson, Broadfoot and Elazar (2013) and Giessing and Tent (2019).

occupationMeasurement v0.2.0: Implements an interface for performing interactive occupation coding during interviews as described in Peycheva et al. (2021) and Schierholz et al. (2018). There are several vignettes including a Getting Started guide, Using the API, and Custom Questionnaires.

pracpac v0.1.0: Provides functions to streamline the creation of Docker images with R packages and dependencies embedded. See Nagraj and Turner (2023) for details and the vignettes Basic usage and Use cases.

RmdConcord v0.1.6: Supports concordances in R Markdown documents to easily find the source in the .Rmd file of errors detected by HTML tidy. See README for details and note that the vignette serves as a practice file.

symbol.equation.gpt v1.1.1: Provides an interface for adding symbols, smileys, arrows, and building mathematical equations using LaTeX or r2symbols for Markdown and Shiny development. See the vignette.

Shiny interface

tinysnapshot v0.0.3: Provides snapshots for unit tests using the tinytest framework and includes expectations to test base R and ggplot2 plots as well as console output from print(). See README for usage.

Test snapshots

Visualization

PlotBivInvGaus v0.1.0: Provides functions to create bivariate inverse Gaussian distribution contour plots for non-negative random variables. See the vignette.

Density contour plot

textBoxplacement v1.0: Provides functions to compute a non-overlapping layout of text boxes to label multiple overlaying curves. See the vignette.

Multiple curves with text boxes
