June 2017 New Package Picks

by Joseph Rickert

Two hundred and thirty-eight new packages were added to CRAN in June. Below are my picks for the “Top 40”, organized into six categories: Biostatistics, Data, Machine Learning, Miscellaneous, Statistics and Utilities. Some packages, including geofacet and secret, already seem to be gaining traction.

Biostatistics

BIGL v1.0.1: Implements response surface methods for drug synergy analysis, including generalized and classical Loewe formulations and the Highest Single Agent methodology. There are vignettes on Methodology and Synergy Analysis.

colorpatch v0.1.2: Provides functions to show color patches for encoding fold changes (e.g., log ratios) and confidence values within a diagram; especially useful for rendering gene expression data and other types of differential experiments. See the vignette.

eesim v0.1.0: Provides functions to create simulated time series of environmental exposures (e.g., temperature, air pollution) and health outcomes for use in power analysis and simulation studies in environmental epidemiology. The vignette gives an overview of the package.

personalized v0.0.2: Provides functions for fitting and validating subgroup identification and personalized medicine models under the general subgroup identification framework of Chen et al. The vignette provides a brief tutorial.

tidygenomics v0.1.0: Provides methods for working with genomic intervals the “tidy way”. The vignette explains how they work.

Data

alfred v0.1.1: Provides direct access to the ALFRED and FRED databases. The vignette gives a brief example.

CityWaterBalance v0.1.0: Provides functions to retrieve data and estimate unmeasured flows of water through an urban network. Data for US cities can be gathered via web services using this package and dependencies. See the vignette for an introduction to the package.

censusapi v0.2.0: Provides a wrapper for the U.S. Census Bureau APIs that returns data frames of census data and metadata. Available data sets include the Decennial Census, American Community Survey, Small Area Health Insurance Estimates, Small Area Income and Poverty Estimates, and Population Estimates and Projections. There is a brief vignette.

dataverse v0.2.0: Provides access to Dataverse version 4 APIs, enabling data search, retrieval, and deposit. There are four vignettes: Introduction, Search and Discovery, Retrieval and Data Archiving.

data.world v1.1.1: Provides high-level tools for working with data.world data sets. There is a Quickstart Guide and a vignette for writing Queries.

SimMultiCorrData v0.1.0: Provides functions to generate continuous, binary, ordinal, and count variables with a specified correlation matrix that can be used to simulate data sets that mimic real-world situations (e.g., clinical data sets, plasmodes). There are several vignettes including an Overall Workflow for Data Simulation and a Comparison to Other Packages.

tidycensus v0.1.2: Provides an integrated R interface to the decennial US Census and American Community Survey APIs, and the US Census Bureau’s geographic boundary files.

ukbtools v0.9.0: Provides tools to work with UK Biobank datasets. The vignette shows how to get started.

wpp2017 v1.0-1: Provides an interface to data sets from the United Nations’ World Population Prospects 2017.

Machine Learning

cld3 v1.0: Provides an interface to Google’s experimental Compact Language Detector 3 algorithm, a neural network model for language identification that is the successor of cld2.

datafsm v0.2.0: Implements a method that automatically generates models of dynamic decision-making that both have strong predictive power and are interpretable in human terms. The vignette provides an example.

diceR v0.1.0: Provides functions for cluster analysis using an ensemble clustering framework. The vignette shows some examples.
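
To make the ensemble idea concrete, here is a minimal Python sketch (not diceR's API) of a consensus matrix: entry (i, j) is the fraction of clustering runs that place items i and j in the same cluster. The function name and the toy labelings are hypothetical.

```python
def consensus_matrix(labelings):
    """Fraction of clusterings in which each pair of items shares a cluster."""
    n_items = len(labelings[0])
    counts = [[0] * n_items for _ in range(n_items)]
    for labels in labelings:
        for i in range(n_items):
            for j in range(n_items):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(labelings) for c in row] for row in counts]

# Three hypothetical clustering runs over four items.
runs = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],  # same partition as the first run, different label names
    [0, 0, 0, 1],
]
cm = consensus_matrix(runs)
```

Because only co-membership is counted, the arbitrary label names used by each run do not matter.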

glmertree v0.1-1: Implements recursive partitioning based on (generalized) linear mixed models (GLMMs) combining lmer() and glmer() from lme4 and lmtree() and glmtree() from partykit. The vignette shows an example.

greta v0.2.0: Lets users write statistical models in R and fit them by MCMC on CPUs and GPUs, using Google TensorFlow. There is a website, a Getting Started Guide, and vignettes providing Examples and Technical Details.

penaltyLearning v2017.07.11: Implements algorithms from Learning Sparse Penalties for Change-point Detection using Max Margin Interval Regression. There is a vignette.

SentimentAnalysis v1.2-0: Implements functions to perform sentiment analysis of textual data using various existing dictionaries, such as Harvard IV, or finance-specific dictionaries, and create customized dictionaries. The vignette provides an introduction.
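
Dictionary-based scoring of this kind can be sketched in a few lines of Python. This is an illustration of the general technique, not the package's implementation; the two word lists are tiny hypothetical stand-ins for a real dictionary, and the scoring rule (positive minus negative hits over total words) is one common choice among several.

```python
import re

# Hypothetical toy dictionaries; real ones such as Harvard IV are far larger.
POSITIVE = {"good", "gain", "strong", "improve"}
NEGATIVE = {"bad", "loss", "weak", "decline"}

def sentiment_score(text):
    """Return (positive - negative) / total word count, or 0.0 for empty text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)
```

Swapping in a finance-specific word list changes only the two sets, which is why customized dictionaries are a natural extension.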

Miscellaneous

convexjlr v0.5.1: Provides a high-level wrapper for Julia package Convex.jl, which makes it easy to describe and solve convex optimization problems. There is a very nice vignette that shows how to optimize the parameters for several machine learning models.

interp v1.0-29: Implements bivariate data interpolation on both regular and irregular grids using either linear methods or splines.
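
For the simplest case mentioned here, linear interpolation on a regular grid, the computation inside a single grid cell is bilinear interpolation. The Python sketch below is only an illustration of that one case (the function and argument names are hypothetical); it does not cover irregular grids or splines.

```python
def bilinear(x, y, x0, x1, y0, y1, f00, f10, f01, f11):
    """Interpolate inside the rectangle [x0, x1] x [y0, y1].

    fij is the known value at the corner (xi, yj).
    """
    tx = (x - x0) / (x1 - x0)
    ty = (y - y0) / (y1 - y0)
    bottom = f00 * (1 - tx) + f10 * tx  # interpolate along the edge y = y0
    top = f01 * (1 - tx) + f11 * tx     # interpolate along the edge y = y1
    return bottom * (1 - ty) + top * ty  # blend the two edges in y
```

At each corner the result reproduces the known value exactly, and at the cell center it is the average of the four corners.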

pkggraph v0.2.0: Allows users to interactively explore and plot package dependencies for CRAN.

parallelDist v0.1.1: Provides a parallelized alternative to R’s native dist function to calculate distance matrices for continuous, binary, and multi-dimensional input matrices with support for a broad variety of distance functions from the stats, proxy and dtw R packages. The vignette offers some results on performance.
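
The reason distance matrices parallelize well is that each row can be computed independently. The Python sketch below shows that decomposition with one task per row; it is not the package's implementation, and all names are hypothetical. It uses threads for simplicity, which in CPython illustrates the structure but will not give a real speedup for pure-Python arithmetic.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def _row(args):
    # Distances from point i to every point; independent of all other rows.
    i, points = args
    return [euclidean(points[i], q) for q in points]

def parallel_dist(points, workers=4):
    """Full distance matrix, computed one row per task."""
    tasks = [(i, points) for i in range(len(points))]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(_row, tasks))

d = parallel_dist([(0, 0), (3, 4), (6, 8)])
```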

Statistics

anchoredDistR v1.0.3: Supplements the MAD# software that implements the Method of Anchored Distributions for inferring geostatistical parameters. There is a vignette.

bssm v0.1.1-1: Provides efficient methods for Bayesian inference of state space models via particle Markov chain Monte Carlo and importance-sampling-type corrected Markov chain Monte Carlo. There is a vignette on Bayesian Inference of State Space Models and an example of a Logistic Growth Model.

factorMerger v0.3.1: Provides a set of tools to support the results of post-hoc testing and to extract the hierarchical structure of factors. There is an Introduction and vignettes on Cox Regression Factor Merging and Multidimensional Gaussian Merging.

MittagLeffleR v0.1.0: Provides density, distribution, and quantile functions as well as random variate generation for the Mittag-Leffler distribution based on the algorithm by Garrappa. There are short vignettes for the math, distribution functions and random variate generation.
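
For readers unfamiliar with the distribution, the Python sketch below evaluates the Mittag-Leffler function by its truncated power series and uses it for the distribution function of the first-type distribution with unit scale, F(t) = 1 - E_alpha(-t^alpha). This is deliberately not Garrappa's algorithm: the plain series is only reliable for small arguments, and the function names and the default number of terms are assumptions made for illustration.

```python
import math

def mittag_leffler(z, alpha, terms=100):
    """Truncated series E_alpha(z) = sum_k z**k / Gamma(alpha*k + 1).

    Assumes 0 < alpha <= 1 and small |z|; not suitable for large arguments.
    """
    return sum(z ** k / math.gamma(alpha * k + 1) for k in range(terms))

def ml_cdf(t, alpha):
    """CDF of the unit-scale Mittag-Leffler distribution for t >= 0."""
    return 1.0 - mittag_leffler(-(t ** alpha), alpha)
```

With alpha = 1 the series reduces to the exponential function, so `ml_cdf(t, 1.0)` agrees with the exponential distribution's `1 - exp(-t)`.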

walker v0.2.0: Provides functions for building dynamic Bayesian regression models where the regression coefficients can vary over time as random walks. The vignette shows some examples.
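
The model class can be made concrete by simulating from it. The Python sketch below generates data from a one-predictor regression whose coefficient follows a random walk; it only simulates the model and does not perform the Bayesian fitting that the package provides. The function name, parameter names, and default values are hypothetical.

```python
import random

def simulate_random_walk_regression(x, beta0=1.0, sd_beta=0.1, sd_noise=0.5, seed=1):
    """Simulate y_t = x_t * beta_t + eps_t with beta_t = beta_{t-1} + eta_t."""
    rng = random.Random(seed)
    beta, betas, ys = beta0, [], []
    for x_t in x:
        beta += rng.gauss(0.0, sd_beta)                   # coefficient takes a random-walk step
        betas.append(beta)
        ys.append(x_t * beta + rng.gauss(0.0, sd_noise))  # noisy observation
    return betas, ys
```

Setting `sd_beta` to zero recovers an ordinary regression with a fixed coefficient, which shows how the time-varying model generalizes the static one.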

Utilities

charlatan v0.1.0: Provides functions to make fake data, including addresses, person names, dates, times, colors, coordinates, currencies, DOIs, jobs, phone numbers, ‘DNA’ sequences, doubles and integers from distributions and within a range. The Introduction will get you started.

colordistance v0.8.0: Provides functions to load and display images, selectively mask specified background colors, bin pixels by color, quantitatively measure color similarity among images, and cluster images by object color similarity. There is an Introduction and vignettes on Pixel Binning Methods and Color Distance Metrics.

dbplyr v1.1.0: Implements a dplyr back end for databases that allows working with remote database tables as if they are in-memory data frames. There is an Introduction, a vignette for Adding a new DBI backend and one for SQL translation.
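
The core idea, recording operations on a remote table and translating them to SQL only when results are requested, can be sketched in Python with the standard library's sqlite3 module. Everything here (the `LazyTable` class, its method names, the toy `cars` table) is hypothetical and far simpler than a real back end; in particular, it pastes strings into SQL and so must not be used with untrusted input.

```python
import sqlite3

class LazyTable:
    """Hypothetical lazy table: records operations, runs SQL only in collect()."""

    def __init__(self, con, name, columns="*", where=None):
        self.con, self.name, self.columns, self.where = con, name, columns, where

    def select(self, *cols):
        return LazyTable(self.con, self.name, ", ".join(cols), self.where)

    def filter(self, condition):
        # For brevity a new condition replaces any earlier one.
        return LazyTable(self.con, self.name, self.columns, condition)

    def show_query(self):
        sql = f"SELECT {self.columns} FROM {self.name}"
        return sql + (f" WHERE {self.where}" if self.where else "")

    def collect(self):
        # Only here does any work happen in the database.
        return self.con.execute(self.show_query()).fetchall()

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE cars (name TEXT, mpg REAL)")
con.executemany("INSERT INTO cars VALUES (?, ?)",
                [("a", 21.0), ("b", 33.9), ("c", 15.2)])
query = LazyTable(con, "cars").filter("mpg > 20").select("name")
```

Nothing is sent to the database until `query.collect()` is called; `query.show_query()` returns the generated SQL for inspection.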

geofacet v0.1.5: Provides geofaceting functionality (the ability to arrange a sequence of plots for different geographical entities into a grid that preserves some geographical orientation) for ggplot2. There is a Package Reference vignette and an Introduction. The package is already getting some traction.

ggformula v0.4.0: Provides a formula interface to ggplot2. There is a vignette explaining how it works.

gqlr v0.0.1: Provides an implementation of the GraphQL query language created by Facebook for describing data requirements on complex application data models. gqlr should be useful for integrating R computations into production applications that use GraphQL.

later v0.3: Allows users to execute arbitrary R or C functions some time after the current time, after the R execution stack has emptied. The vignette shows how to use later from C++.

secret v1.0.0: Allows sharing sensitive information like passwords, API keys, etc., in R packages, using public key cryptography. There is a vignette.

sessioninfo v1.0.0: Provides functions to query and print information about the current R session. It is similar to utils::sessionInfo(), but includes more information.

webglobe v1.0.2: Provides functions to display geospatial data on an interactive 3D globe. There is a vignette.
