By my count, just over 200 new packages made it to CRAN and stuck during March. The trend for specialized, and sometimes downright esoteric, science packages continues. I counted 40 new packages in this class. Most, but not all of these, are focused on bio-science applications. For example, the foreSIGHT package profiled below focuses on climate science. I was also pleased to see two new packages (not from RStudio) in the Data Science category, h2o4gpu and onnx, built on the reticulate package for interfacing with Python. I hope this also becomes a trend.
The following are my “Top 40” picks for March in nine categories: Computational Methods, Data, Data Science, Political Science, Science, Statistics, Time Series, Utilities, and Visualization.
Computational Methods
dynprog v0.1.0: Implements a domain-specific language for translating recursions into dynamic-programming algorithms.
fmlogcondens v1.0.2: Implements a fast solver for the maximum likelihood estimator of multivariate log-concave probability densities, a family that includes many well-known parametric densities, such as the normal, uniform, and exponential distributions. For details, see Rathke et al. (2015). The vignette shows how to use the package.
knor v0.0-5: Provides access to knor, a NUMA-optimized, in-memory, distributed library for computing k-means.
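Returning to the dynprog entry above, the general idea of turning a recursion into a dynamic-programming algorithm can be sketched in a few lines. The following is plain Python, not dynprog's own DSL or its R syntax. The function name `edit_distance` and the choice of the edit-distance recursion are illustrative assumptions rather than anything taken from the package.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via a bottom-up table (illustrative sketch).

    Recursion being translated:
        d(i, 0) = i
        d(0, j) = j
        d(i, j) = min(d(i-1, j) + 1,                      # deletion
                      d(i, j-1) + 1,                      # insertion
                      d(i-1, j-1) + (a[i-1] != b[j-1]))   # substitution
    """
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0] * cols for _ in range(rows)]

    # Base cases of the recursion fill the first column and first row.
    for i in range(rows):
        table[i][0] = i
    for j in range(cols):
        table[0][j] = j

    # Fill row by row, so every cell only reads cells computed earlier.
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = 0 if a[i - 1] == b[j - 1] else 1
            table[i][j] = min(
                table[i - 1][j] + 1,
                table[i][j - 1] + 1,
                table[i - 1][j - 1] + substitution,
            )
    return table[-1][-1]
```

The point of the sketch is the evaluation order: once the table is filled so that each cell depends only on earlier cells, the recursion never has to be re-evaluated.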
Data
daymetr v1.3.1: Provides a programmatic interface to the Daymet climate data. The vignette shows how to use it.
NOAAWeather v0.1.0: Provides functions to retrieve real-time weather data from all NOAA stations, and to plot time series, boxplots, calendar heatmaps, and geospatial maps to analyze trends. The vignette shows how to use the package.
ppitables v0.1.2: Contains country-specific lookup tables used as a reference to determine the poverty likelihood of a household based on its Poverty Probability Index (PPI) score, with documentation from Innovations for Poverty Action.
usfertilizer v0.1.5: Provides county-level estimates of fertilizer, nitrogen and phosphorus, from 1945 to 2012 in the United States of America. There is an Introduction and a vignette on Data Sources and Processes.
Data Science
greybox v0.2.0: Implements tools for model selection and combinations via information criteria based on the values of partial correlations. The vignette provides details.
h2o4gpu v0.2.0: Implements an interface to H2O4GPU, a collection of GPU solvers for machine learning algorithms. There is a vignette.
iml v0.3.0: Provides interpretability methods to analyze the behavior and predictions of any machine learning model, including feature importance, partial dependence plots, individual conditional expectation (ICE) plots, local models, the Shapley Value, and tree surrogate models.
iTOP v1.0.1: Provides functions to infer a topology of relationships between different datasets, such as multi-omics and phenotypic data recorded on the same samples. The methodology is based on an extension of the RV coefficient, a measure of matrix correlation, to partial matrix correlations and binary data. See Aben et al. (2018) for details and the vignette for an introduction to the package.
onnx v0.0.1: Implements an interface to ONNX, the Open Neural Network Exchange, which provides an open-source format for machine-learning models.
rcqp v0.5: Implements Corpus Query Protocol functions based on the CWB software, a collection of open-source tools for managing and querying large text corpora. The vignette provides a roadmap.
Political Science
coalitions v0.6.2: Implements an MCMC method to calculate probabilities for a coalition majority based on survey results. See Bender and Bauer (2018). There are vignettes on Workflows, Pooling, and Diagnostics.
Science
diagmeta v0.2-0: Implements methods by Steinhauser et al. (2016) for meta-analysis of diagnostic accuracy studies with several cutpoints.
NetworkExtinction v0.1.0: Provides functions to simulate the extinction of species in the food web, and analyze the cascading effects as described in Dunne et al. (2002). There is a vignette.
foreSIGHT v0.9.2: Provides a tool to create hydroclimate scenarios, stress test systems, and visualize system performance in scenario-neutral climate-change impact assessments. Functions generate perturbed time series using a range of approaches, including simple scaling of observed time series (Culley et al. (2016)) and stochastic simulation of perturbed time series (Guo et al. (2018)). The vignette offers a tutorial.
PINSPlus v1.0.0: Implements PINS: Perturbation clustering for data INtegration and disease Subtyping (Nguyen et al. (2017)), a novel approach for integration of data and classification of diseases into various subtypes. There is a vignette.
Statistics
chandwich v1.0.0: Provides functions to adjust user-supplied independence loglikelihood functions using a robust sandwich estimator of the parameter covariance matrix, based on the methodology in Chandler and Bate (2007). The vignette shows how it works.
ciuupi v1.0.0: Provides functions to compute a confidence interval for a specified linear combination of regression parameters in a linear regression model with iid normal errors and known variance, when there is uncertain prior information that a distinct specified linear combination of the regression parameters takes a given value. See Kabaila and Mainzer (2017) and the vignette for details.
CoxPhLb v1.0.0: Provides functions to analyze right-censored, length-biased data using the Cox model, including model fitting and checking, and a test of the stationarity assumption. The model fitting and checking methods are described in Qin and Shen (2010) and Lee, Ning, and Shen (2018).
cutpointr v0.7.3: Provides functions to estimate cutpoints that optimize a specified metric in binary classification tasks and validate performance using bootstrapping. The vignette shows how to use the functions.
fcr v1.0: Provides a function for dynamic prediction in functional concurrent regression that extends the pffr() function from the refund package to handle the scenario where the functional response and concurrently measured functional predictor are irregularly measured. See Leroux et al. (2017) and the vignette.
ggdag v0.1.0: Builds on the DAGitty web tool to provide functions to tidy, analyze, and plot directed acyclic graphs (DAGs). There is an Introduction to DAGs, an Introduction to ggdag, and a vignette on Common Structures of Bias.
hdme v0.1.1: Provides a function for penalized regression in generalized linear models with measurement error, including a lasso (L1-penalization) that corrects for measurement error (Sorensen et al. (2015)) and an implementation of the Generalized Matrix Uncertainty Selector (Sorensen et al. (2018)). The vignette gives the details.
joineRmeta v0.1.1: Extends the joint models proposed by Henderson et al. (2000) to include multi-study, meta-analytic cases. See the vignette for details.
rare v0.1.0: Implements the alternating direction method of multipliers algorithm of Yan and Bien (2018) for fitting linear models with tree-based lasso regularization. The vignette shows how to use the package.
Time Series
rMEA v1.0.0: Provides tools to read, visualize, and export bivariate motion energy time-series. Lagged synchrony between subjects can be analyzed through windowed cross-correlation. See Ramseyer & Tschacher (2011) for an application, and the README for how to use the package.
tsfknn v0.1.0: Provides a function to forecast time series using nearest neighbors regression. See Martinez et al. (2017) and the vignette for details.
spGARCH v0.1.4: Provides functions to analyze spatial and spatiotemporal autoregressive conditional heteroscedasticity (Otto, Schmid, and Garthoff (2017)), including simulation of spatial ARCH-type processes, quasi-maximum-likelihood estimation of the parameters of spARCH models, spatial autoregressive models with spARCH disturbances, diagnostic checks, and visualizations.
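As a rough illustration of the nearest-neighbors forecasting idea behind tsfknn above, here is a minimal Python sketch. It is not the package's R interface. The function name `knn_forecast`, the parameters `lags` and `k`, the Euclidean distance, and the plain averaging of the neighbors' targets are all assumptions made for the example.

```python
import math


def knn_forecast(series, lags=3, k=2):
    """One-step-ahead forecast by k-nearest-neighbors regression (sketch).

    Each training example pairs a window of `lags` consecutive values with
    the value that followed it. The forecast is the mean of the values that
    followed the k windows closest to the most recent window.
    """
    if len(series) <= lags:
        raise ValueError("series must be longer than the number of lags")

    # Build (window, next value) pairs from the history.
    examples = [
        (series[i:i + lags], series[i + lags])
        for i in range(len(series) - lags)
    ]

    # The query is the most recent window, whose successor is unknown.
    query = series[-lags:]

    # Keep the k windows closest to the query in Euclidean distance.
    nearest = sorted(examples, key=lambda ex: math.dist(ex[0], query))[:k]

    # Average the targets of the selected neighbors.
    return sum(target for _, target in nearest) / len(nearest)
```

For a perfectly periodic series such as `[1, 2, 3, 1, 2, 3, 1, 2, 3]`, the two nearest windows match the query exactly and both were followed by `1`, so `knn_forecast` returns `1.0`.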
Utilities
base2grob v0.0.2: Provides a function to convert a base plot function call (using expression or formula) to grob objects that are compatible with the grid ecosystem so that cowplot can be used to align base plots with ggplot objects. The vignette shows how things work.
cranly v0.1: Provides functions to clean, organize, summarize, and visualize CRAN package database information, and also for building package directives networks (depends, imports, suggests, enhances) and collaboration networks. The vignette shows how to use the package.
osrmr v0.1.28: Implements a wrapper around the Open Source Routing Machine (OSRM) API. See the vignette for details.
fasterize v1.0.0: Provides a fast, drop-in replacement for rasterize() from the raster package that takes sf-type objects and uses the scan line algorithm attributed to Wylie et al. (1967) (doi:10.1145/1465611.1465619). There is a vignette.
jsr223 v0.3.1: Provides a high-level integration that makes Java objects easy to use from within R, and a unified interface for integrating R with several programming languages, including Groovy, JavaScript, JRuby (Ruby), Jython (Python), and Kotlin. See the manual for details.
Visualization
clustree v0.1.2: Provides functions to produce clustering tree visualizations for interrogating clusterings as resolution increases. See the vignette for details.
datamaps v0.0.2: Enables users to create interactive choropleth maps with bubbles and arcs by coordinates or region name that can be used directly from the console, from RStudio, in Shiny apps, and in R Markdown documents. The vignette will help you get started.
funnelR v0.1.0: Provides functions for creating funnel plots for proportion data, and supports user-defined benchmarks, confidence limits, and estimation methods (e.g., exact or approximate) based on Spiegelhalter (2005). See the Introduction to get started.
nVennR v0.2.0: Provides an interface for the nVenn algorithm of Perez-Silva et al. (2018). See the vignette for an introduction to the package, and the R package UpSetR for help interpreting the results.
smovie v1.0.1: Uses the rpanel package to create interactive movies to help students understand statistical concepts. There are movies to visualize probability distributions (including user-supplied ones), to illustrate the sampling distributions of the sample mean (central limit theorem) and the sample maximum (extremal types theorem), and more. See the vignette for an overview.