R Packages on R Views
https://rviews.rstudio.com/categories/r-packages/index.xml
Recent content in R Packages on R Views
RStudio, Inc. All Rights Reserved.
December 2018: “Top 40” New CRAN Packages
https://rviews.rstudio.com/2019/01/30/december-2108-top-40-new-cran-packages/
Wed, 30 Jan 2019 00:00:00 +0000
<p>By my count, 157 new packages stuck to CRAN in December. Below are my “Top 40” picks in ten categories: Computational Methods, Data, Finance, Machine Learning, Medicine, Science, Statistics, Time Series, Utilities and Visualization. This is the first time I have used the Medicine category, and I am pleased that a few packages that appear to have clinical use made the cut. Also noteworthy in this month’s selection is the inclusion of four packages from the Microsoft Azure team (stuffing 41 packages into the “Top 40”), along with some eclectic but fascinating packages in the Science section.</p>
<h3 id="computational-methods">Computational Methods</h3>
<p><a href="https://cran.r-project.org/package=ar.matrix">ar.matrix</a> v0.1.0: Provides functions that use precision matrices and Cholesky factorization to simulate autoregressive data. The <a href="https://cran.r-project.org/web/packages/ar.matrix/readme/README.html">README</a> offers examples.</p>
<p><img src="/post/2019-01-24-Dec2018-NewPkgs_files/ar.png" height = "400" width="600"></p>
<p><a href="https://CRAN.R-project.org/package=mvp">mvp</a> v1.0-2: Provides functions for the fast symbolic manipulation of polynomials. See the <a href="https://cran.r-project.org/web/packages/mvp/vignettes/mvp.html">vignette</a> and this R Journal <a href="https://journal.r-project.org/archive/2013-1/kahle.pdf">paper</a> for details on how to create this image of the Rosenbrock function.</p>
<p><img src="/post/2019-01-24-Dec2018-NewPkgs_files/mvp.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=pomdp">pomdp</a> v0.9.1: Provides an interface to <a href="http://www.pomdp.org/code/index.html"><code>pomdp-solve</code></a>, a solver for Partially Observable Markov Decision Processes (POMDP). See the <a href="https://cran.r-project.org/web/packages/pomdp/vignettes/POMDP.pdf">vignette</a> for examples.</p>
<p><img src="/post/2019-01-24-Dec2018-NewPkgs_files/pomdp.png" height = "400" width="600"></p>
<h3 id="data">Data</h3>
<p><a href="https://cran.r-project.org/package=dbparser">dbparser</a> v1.0.0: Provides a tool for parsing the <a href="http://drugbank.ca">DrugBank</a> XML database. The <a href="https://cran.r-project.org/web/packages/dbparser/vignettes/dbparser.html">vignette</a> shows how to get started.</p>
<p><a href="https://cran.r-project.org/package=rdhs">rdhs</a> v0.6.1: Implements a client querying the <a href="https://api.dhsprogram.com/#/index.html">DHS API</a> to download and manipulate survey datasets and metadata. There are introductions to using <a href="https://cran.r-project.org/web/packages/rdhs/vignettes/introduction.html">rdhs</a> and the <a href="https://cran.r-project.org/web/packages/rdhs/vignettes/client.html">rdhs client</a>, an extended example about <a href="https://cran.r-project.org/web/packages/rdhs/vignettes/anemia.html">Anemia prevalence</a>, and vignettes on <a href="https://cran.r-project.org/web/packages/rdhs/vignettes/country_codes.html">Country Codes</a>, <a href="https://cran.r-project.org/web/packages/rdhs/vignettes/geojson.html">Interacting with the geojson API results</a>, and <a href="https://cran.r-project.org/web/packages/rdhs/vignettes/testing.html">Testing</a>.</p>
<h3 id="finance">Finance</h3>
<p><a href="https://cran.r-project.org/package=optionstrat">optionstrat</a> v1.0.0: Implements the Black-Scholes-Merton option pricing model to calculate key option analytics and provides graphical analysis of various option strategies. See the <a href="https://cran.r-project.org/web/packages/optionstrat/vignettes/optionstrat_vignette.html">vignette</a>.</p>
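<p>For readers unfamiliar with the model, the following is a minimal Python sketch of the textbook Black-Scholes-Merton price for a European option on a non-dividend-paying asset. It is not taken from <code>optionstrat</code>; the function names <code>norm_cdf</code> and <code>bsm_price</code> and the example inputs are assumptions made for illustration.</p>

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bsm_price(spot: float, strike: float, rate: float, vol: float,
              years: float, kind: str = "call") -> float:
    """Textbook Black-Scholes-Merton price of a European option.

    Assumes no dividends, a constant risk-free rate and constant volatility.
    (Hypothetical helper for illustration, not the optionstrat API.)
    """
    sqrt_t = math.sqrt(years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * years) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    discount = math.exp(-rate * years)
    if kind == "call":
        return spot * norm_cdf(d1) - strike * discount * norm_cdf(d2)
    if kind == "put":
        return strike * discount * norm_cdf(-d2) - spot * norm_cdf(-d1)
    raise ValueError("kind must be 'call' or 'put'")


# Illustrative inputs: at-the-money option, 5% rate, 20% volatility, one year.
call = bsm_price(100.0, 100.0, 0.05, 0.20, 1.0, "call")
put = bsm_price(100.0, 100.0, 0.05, 0.20, 1.0, "put")
print(round(call, 4), round(put, 4))
```

<p>A quick sanity check on any such implementation is put-call parity: the call price minus the put price should equal the spot price minus the discounted strike.</p>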
<p><a href="https://cran.r-project.org/package=riskParityPortfolio">riskParityPortfolio</a> v0.1.1: Provides functions to design risk parity portfolios for financial investment. In addition to the vanilla formulation, where the risk contributions are perfectly equalized, many other formulations are considered that allow for box constraints and short selling. The package is based on the papers: <a href="doi:10.1109/TSP.2015.2452219">Feng and Palomar (2015)</a>, <a href="doi:10.2139/ssrn.2297383">Spinu (2013)</a>, and <a href="arXiv:1311.4057">Griveau-Billion et al. (2013)</a>. See the <a href="https://cran.r-project.org/web/packages/riskParityPortfolio/vignettes/RiskParityPortfolio.html">vignette</a> for an example.</p>
<p><img src="/post/2019-01-24-Dec2018-NewPkgs_files/riskParityPortfolio.png" height = "400" width="600"></p>
<h3 id="machine-learning">Machine Learning</h3>
<p><a href="https://cran.r-project.org/package=BTM">BTM</a> v0.2: Provides functions to find <a href="https://github.com/xiaohuiyan/xiaohuiyan.github.io/blob/master/paper/BTM-WWW13.pdf"><code>Biterm</code></a> topics in collections of short texts. In contrast to topic models, which analyze word-document co-occurrence, biterms consist of two words co-occurring in the same short text window.</p>
<p><a href="https://cran.r-project.org/package=ParBayesianOptimization">ParBayesianOptimization</a> v0.0.1: Provides a framework for Bayesian optimization of model hyperparameters according to the methods described in <a href="https://arxiv.org/abs/1206.2944">Snoek et al. (2012)</a>. There are vignettes on <a href="https://cran.r-project.org/web/packages/ParBayesianOptimization/vignettes/standardFeatures.html">standard</a> and <a href="https://cran.r-project.org/web/packages/ParBayesianOptimization/vignettes/advancedFeatures.html">advanced</a> features.</p>
<p><img src="/post/2019-01-24-Dec2018-NewPkgs_files/ParB.png" height = "400" width="600"></p>
<h3 id="medicine">Medicine</h3>
<p><a href="https://cran.r-project.org/package=LUCIDus">LUCIDus</a> v0.9.0: Implements the <code>LUCID</code> method to jointly estimate latent unknown clusters/subgroups with integrated data. See the <a href="https://cran.r-project.org/web/packages/LUCIDus/vignettes/LUCIDus-vignette.html">vignette</a> for details.</p>
<p><img src="/post/2019-01-24-Dec2018-NewPkgs_files/LUCID.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=metaRMST">metaRMST</a> v1.0.0: Provides functions that use individual patient-level data to produce a multivariate meta-analysis of randomized controlled trials with the difference in restricted mean survival times (<a href="https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/1471-2288-13-152">RMSTD</a>).</p>
<p><a href="https://cran.r-project.org/package=webddx">webddx</a> v0.1.0: Implements a differential-diagnosis generating tool. Given a list of symptoms, the function <code>query_fz</code> queries the <a href="http://www.findzebra.com/">FindZebra</a> website and returns a differential-diagnosis list.</p>
<h3 id="science">Science</h3>
<p><a href="https://cran.r-project.org/package=bioRad">bioRad</a> v0.4.0: Provides functions to extract, visualize, and summarize aerial movements of birds and insects from weather radar data. There is an <a href="https://cran.r-project.org/web/packages/bioRad/vignettes/bioRad.html">Introduction</a> and a vignette on <a href="https://cran.r-project.org/web/packages/bioRad/vignettes/rad_aero_18.html">Exercises</a>.</p>
<p><img src="/post/2019-01-24-Dec2018-NewPkgs_files/bioRad.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=pmd">pmd</a> v0.1.1: Implements the paired mass distance analysis proposed in <a href="doi:10.1016/j.aca.2018.10.062">Yu, Olkowicz and Pawliszyn (2018)</a> for gas/liquid chromatography–mass spectrometry. See the <a href="https://cran.r-project.org/web/packages/pmd/vignettes/globalstd.html">vignette</a> for an introduction.</p>
<p><a href="https://cran.r-project.org/package=tabula">tabula</a> v1.0.0: Provides functions to examine archaeological count data and includes several measures of diversity. There are vignettes on <a href="https://cran.r-project.org/web/packages/tabula/vignettes/diversity.html">Diversity Measures</a>, <a href="https://cran.r-project.org/web/packages/tabula/vignettes/matrix.html">Matrix Classes</a>, and <a href="https://cran.r-project.org/web/packages/tabula/vignettes/seriation.html">Matrix Seriation</a>. This last vignette includes an example reproducing the results of <a href="https://doi.org/10.1016/j.jas.2012.04.040">Peeples and Schachner (2012)</a>.</p>
<p><img src="/post/2019-01-24-Dec2018-NewPkgs_files/tabula.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=traitdataform">traitdataform</a> v0.5.2: Provides functions to assist with handling ecological trait data and applying the Ecological Trait-Data Standard terminology described in <a href="doi:10.1101/328302">Schneider et al. (2018)</a>.</p>
<p><a href="https://cran.r-project.org/package=waterquality">waterquality</a> v0.2.2: Implements over 45 algorithms to develop water quality indices from satellite reflectance imagery. The <a href="https://cran.r-project.org/web/packages/waterquality/vignettes/waterquality_vignette.html">vignette</a> introduces the package.</p>
<p><img src="/post/2019-01-24-Dec2018-NewPkgs_files/waterquality.png" height = "400" width="600"></p>
<h3 id="statistics">Statistics</h3>
<p><a href="https://cran.r-project.org/package=areal">areal</a> v0.1.2: Implements areal weighted interpolation with support for multiple variables in a workflow that is compatible with the <code>tidyverse</code> and <code>sf</code> frameworks. There are vignettes on <a href="https://cran.r-project.org/web/packages/areal/vignettes/areal.html">Areal Interpolation</a>, <a href="https://cran.r-project.org/web/packages/areal/vignettes/areal-weighted-interpolation.html">Weighted Areal Interpolation</a>, and <a href="https://cran.r-project.org/web/packages/areal/vignettes/data-preparation.html">Preparing Data for Interpolation</a>.</p>
<p><img src="/post/2019-01-24-Dec2018-NewPkgs_files/areal.png" height = "400" width="600"></p>
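<p>The idea behind areal weighted interpolation can be sketched in a few lines of Python. The example below is not the <code>areal</code> API: it assumes axis-aligned rectangles instead of real polygons, an extensive variable (a count that is split in proportion to area), and hypothetical names such as <code>Rect</code> and <code>interpolate_extensive</code>.</p>

```python
from typing import List, NamedTuple, Tuple


class Rect(NamedTuple):
    """Axis-aligned rectangle standing in for a polygon (illustrative only)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def area(r: Rect) -> float:
    return max(0.0, r.xmax - r.xmin) * max(0.0, r.ymax - r.ymin)


def overlap_area(a: Rect, b: Rect) -> float:
    """Area of the intersection of two rectangles (0 if they do not overlap)."""
    width = min(a.xmax, b.xmax) - max(a.xmin, b.xmin)
    height = min(a.ymax, b.ymax) - max(a.ymin, b.ymin)
    return max(0.0, width) * max(0.0, height)


def interpolate_extensive(sources: List[Tuple[Rect, float]],
                          targets: List[Rect]) -> List[float]:
    """Give each target zone the share of every source value that falls inside it.

    The weight for a source/target pair is overlap area / source area, which
    assumes the variable is spread uniformly within each source zone.
    """
    estimates = []
    for target in targets:
        total = 0.0
        for source, value in sources:
            total += value * overlap_area(source, target) / area(source)
        estimates.append(total)
    return estimates


# Two source zones with known counts, re-aggregated onto two different target zones.
sources = [(Rect(0, 0, 2, 2), 100.0), (Rect(2, 0, 4, 2), 60.0)]
targets = [Rect(0, 0, 1, 2), Rect(1, 0, 4, 2)]
estimates = interpolate_extensive(sources, targets)
print(estimates)  # [50.0, 110.0]
```

<p>Because the targets fully cover the sources here, the estimates sum to the original total of 160, which is a useful check for extensive variables.</p>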
<p><a href="https://cran.r-project.org/package=FLAME">FLAME</a> v1.0.0: Implements the Fast Large-scale Almost Matching Exactly algorithm of <a href="arXiv:1707.06315">Roy et al. (2017)</a> for causal inference. Look at the <a href="https://cran.r-project.org/web/packages/FLAME/readme/README.html">README</a> to get started.</p>
<p><a href="https://cran.r-project.org/package=mistr">mistr</a> v0.0.1: Offers a computational framework for mixture distributions with a focus on composite models. There is an <a href="https://cran.r-project.org/web/packages/mistr/vignettes/mistr-introduction.pdf">Introduction</a> and a vignette on <a href="https://cran.r-project.org/web/packages/mistr/vignettes/mistr-extensions.pdf">Extensions</a>.</p>
<p><img src="/post/2019-01-24-Dec2018-NewPkgs_files/mistr.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=mlergm">mlergm</a> v0.1: Provides functions to estimate exponential-family random graph models for multilevel network data, assuming the multilevel structure is observed. There is a <a href="https://cran.r-project.org/web/packages/mlergm/vignettes/mlergm_tutorial.html">Tutorial</a>.</p>
<p><img src="/post/2019-01-24-Dec2018-NewPkgs_files/mlergm.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=MTLR">MTLR</a> v0.1.0: Implements the Multi-Task Logistic Regression (MTLR) proposed by <a href="https://papers.nips.cc/paper/4210-learning-patient-specific-cancer-survival-distributions-as-a-sequence-of-dependent-regressors">Yu et al. (2011)</a>. See the <a href="https://cran.r-project.org/web/packages/MTLR/vignettes/workflow.html">vignette</a>.</p>
<p><img src="/post/2019-01-24-Dec2018-NewPkgs_files/MTLR.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=multiRDPG">multiRDPG</a> v1.0.1: Provides functions to fit the Multiple Random Dot Product Graph Model and perform a test for whether two networks come from the same distribution. See <a href="arXiv:1811.12172">Nielsen and Witten (2018)</a> for details.</p>
<p><img src="/post/2019-01-24-Dec2018-NewPkgs_files/multiRDPG.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=ocp">ocp</a> v0.1.0: Implements the Bayesian online changepoint detection method of <a href="arXiv:0710.3742">Adams and MacKay (2007)</a> for univariate or multivariate data. Gaussian and Poisson probability models are implemented. The <a href="https://cran.r-project.org/web/packages/ocp/vignettes/introduction.html">vignette</a> provides an introduction.</p>
<p><img src="/post/2019-01-24-Dec2018-NewPkgs_files/ocp.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=probably">probably</a> v0.0.1: Provides tools for post-processing class probability estimates. See the vignettes <a href="https://cran.r-project.org/web/packages/probably/vignettes/where-to-use.html">Where does probability fit in?</a> and <a href="https://cran.r-project.org/web/packages/probably/vignettes/equivocal-zones.html">Equivocal Zones</a>.</p>
<p><img src="/post/2019-01-24-Dec2018-NewPkgs_files/probably.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=smurf">smurf</a> v1.0.0: Implements the SMuRF algorithm of <a href="arXiv:1810.03136">Devriendt et al. (2018)</a> to fit generalized linear models (GLMs) with multiple types of predictors via regularized maximum likelihood. See the package <a href="https://cran.r-project.org/web/packages/smurf/vignettes/smurf.html">Introduction</a>.</p>
<p><img src="/post/2019-01-24-Dec2018-NewPkgs_files/smurf.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=subtee">subtee</a> v0.3-4: Provides functions for naive and adjusted treatment effect estimation for subgroups. Proposes model averaging (<a href="doi:10.1002/pst.1796">Bornkamp et al. (2016)</a>) and bagging (<a href="doi:10.1002/bimj.201500147">Rosenkranz (2016)</a>) to address the problem of selection bias in treatment effect estimation for subgroups. There is an <a href="https://cran.r-project.org/web/packages/subtee/vignettes/subtee_package.html">Introduction</a> and vignettes for the <a href="https://cran.r-project.org/web/packages/subtee/vignettes/plotting_functions.html">plot</a> and <a href="https://cran.r-project.org/web/packages/subtee/vignettes/subbuild_function.html">subbuild</a> functions.</p>
<p><img src="/post/2019-01-24-Dec2018-NewPkgs_files/subtee.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=xspliner">xspliner</a> v0.0.2: Provides functions to assist model building by using black-box models as surrogates to train interpretable, spline-based additive models. There are vignettes on <a href="https://cran.r-project.org/web/packages/xspliner/vignettes/xspliner.html">Basic Theory and Usage</a>, <a href="https://cran.r-project.org/web/packages/xspliner/vignettes/automation.html">Automation</a>, <a href="https://cran.r-project.org/web/packages/xspliner/vignettes/discrete.html">Classification</a>, <a href="https://cran.r-project.org/web/packages/xspliner/vignettes/cases.html">Use Cases</a>, <a href="https://cran.r-project.org/web/packages/xspliner/vignettes/graphics.html">Graphics</a>, <a href="https://cran.r-project.org/web/packages/xspliner/vignettes/extras.html">Extra Information</a>, and the <a href="https://cran.r-project.org/web/packages/xspliner/vignettes/methods.html">xspliner Environment</a>.</p>
<p><img src="/post/2019-01-24-Dec2018-NewPkgs_files/xspliner.png" height = "400" width="600"></p>
<h3 id="time-series">Time Series</h3>
<p><a href="https://cran.r-project.org/package=mfbvar">mfbvar</a> v0.4.0: Provides functions for estimating mixed-frequency Bayesian vector autoregressive (VAR) models with Minnesota or steady-state priors, such as those used by <a href="doi:10.1080/07350015.2014.954707">Schorfheide and Song (2015)</a> or by <a href="http://uu.diva-portal.org/smash/get/diva2:1260262/FULLTEXT01.pdf">Ankargren, Unosson and Yang (2018)</a>. Look at the <a href="https://github.com/ankargren/mfbvar">GitHub page</a> for an example.</p>
<p><img src="/post/2019-01-24-Dec2018-NewPkgs_files/mfbvar.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=NTS">NTS</a> v1.0.0: Provides functions to simulate, estimate, predict, and identify models for nonlinear time series.</p>
<h3 id="utilities">Utilities</h3>
<p><a href="https://cran.r-project.org/package=AzureContainers">AzureContainers</a> v1.0.0: Implements an interface to container functionality in Microsoft’s <a href="https://azure.microsoft.com/en-us/overview/containers/"><code>Azure</code></a> cloud that enables users to manage the <code>Azure Container Instance</code>, <code>Azure Container Registry</code>, and <code>Azure Kubernetes Service</code>. There are vignettes on <a href="https://cran.r-project.org/web/packages/AzureContainers/vignettes/vig01_plumber_deploy.html">Plumber model deployment</a> and <a href="https://cran.r-project.org/web/packages/AzureContainers/vignettes/vig02_mmls_deploy.html">Machine Learning server model deployment</a>.</p>
<p><a href="https://cran.r-project.org/package=AzureRMR">AzureRMR</a> v1.0.0: Implements a lightweight interface to the <a href="https://docs.microsoft.com/en-us/rest/api/resources/">Azure Resource Manager</a> REST API. The package exposes classes and methods for <a href="https://searchmicroservices.techtarget.com/definition/OAuth"><code>OAuth</code> authentication</a> and for working with subscriptions and resource groups. There is an <a href="https://cran.r-project.org/web/packages/AzureRMR/vignettes/intro.html">Introduction</a> and a vignette on <a href="https://cran.r-project.org/web/packages/AzureRMR/vignettes/extend.html">Extending AzureRMR</a>.</p>
<p><a href="https://cran.r-project.org/package=AzureStor">AzureStor</a> v1.0.0: Provides tools to manage storage in Microsoft’s <a href="https://azure.microsoft.com/services/storage"><code>Azure</code></a> cloud. See the <a href="https://cran.r-project.org/web/packages/AzureStor/vignettes/intro.html">Introduction</a>.</p>
<p><a href="https://cran.r-project.org/package=AzureVM">AzureVM</a> v1.0.0: Implements tools for working with virtual machines and clusters of virtual machines in Microsoft’s <a href="https://azure.microsoft.com/en-us/services/virtual-machines/"><code>Azure</code></a> cloud. See the <a href="https://cran.r-project.org/web/packages/AzureVM/vignettes/intro.html">Introduction</a>.</p>
<p><a href="https://cran.r-project.org/package=cliapp">cliapp</a> v0.1.0: Provides functions that facilitate creating rich command line applications with colors, headings, lists, alerts, progress bars, and custom CSS-based themes. See the <a href="https://cran.r-project.org/web/packages/cliapp/readme/README.html">README</a> for examples.</p>
<p><a href="https://cran.r-project.org/package=projects">projects</a> v0.1.0: Provides a project infrastructure with a focus on manuscript creation. See the <a href="https://cran.r-project.org/web/packages/projects/readme/README.html">README</a> for the conceptual framework and an introduction to the package.</p>
<p><img src="/post/2019-01-24-Dec2018-NewPkgs_files/projects.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=remedy">remedy</a> v0.1.0: Implements an RStudio Addin offering shortcuts for writing in <code>Markdown</code>.</p>
<p><a href="https://cran.r-project.org/package=solartime">solartime</a> v0.0.1: Provides functions for computing sun position and times of sunrise and sunset. The <a href="https://cran.r-project.org/web/packages/solartime/vignettes/overview.html">vignette</a> offers an overview.</p>
<h3 id="visualization">Visualization</h3>
<p><a href="https://CRAN.R-project.org/package=easyalluvial">easyalluvial</a> v0.1.8: Provides functions to simplify Alluvial plots for visualizing categorical data over multiple dimensions as flows. See <a href="doi:10.1371/journal.pone.0008694">Rosvall and Bergstrom (2010)</a> for background and the <a href="https://cran.r-project.org/web/packages/easyalluvial/readme/README.html">README</a> for details.</p>
<p><img src="/post/2019-01-24-Dec2018-NewPkgs_files/easyalluvial.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=spatialwidget">spatialwidget</a> v0.2: Provides functions for converting R objects, such as simple features, into structures suitable for use in <a href="https://cran.r-project.org/package=htmlwidgets"><code>htmlwidgets</code></a> mapping libraries. See the <a href="https://cran.r-project.org/web/packages/spatialwidget/vignettes/spatialwidget.html">vignette</a> for details.</p>
<p><a href="https://cran.r-project.org/package=transformr">transformr</a> v0.1.1: Provides an extensive framework for manipulating the shapes of polygons and paths and can be seen as the spatial brother to the <a href="https://CRAN.R-project.org/package=tweenr">tweenr</a> package. See the <a href="https://cran.r-project.org/web/packages/transformr/readme/README.html">README</a> for details.</p>
<p><img src="https://cran.r-project.org/web/packages/transformr/readme/man/figures/README-unnamed-chunk-5.gif" height = "400" width="600"></p>
November 2018: “Top 40” New Packages
https://rviews.rstudio.com/2018/12/21/november-2018-top-40-new-packages/
Fri, 21 Dec 2018 00:00:00 +0000
<p>Having absorbed an average of 181 new packages each month over the last 28 months, CRAN is still growing at a pretty amazing rate. The following plot shows the number of new packages since I started keeping track in August 2016.</p>
<p><img src="/post/2018-12-14-NovTop40_files/new_pkgs.png" height = "400" width="600"></p>
<p>This November, 171 new packages stuck to CRAN. Here is my selection for the “Top 40” organized into the categories: Computational Methods, Data, Finance, Machine Learning, Marketing Analytics, Science, Statistics, Utilities and Visualization.</p>
<h3 id="computational-methods">Computational Methods</h3>
<p><a href="https://cran.r-project.org/package=mixsqp">mixsqp</a> v0.1-79: Provides optimization algorithms (<a href="arXiv:1806.01412">Kim et al. (2018)</a>) based on sequential quadratic programming (SQP) for maximum likelihood estimation of the mixture proportions in a finite mixture model where the component densities are known. The <a href="https://cran.r-project.org/web/packages/mixsqp/vignettes/mixsqp-intro.html">vignette</a> shows how to use the package.</p>
<p><a href="https://cran.r-project.org/package=polylabelr">polylabelr</a> v0.1.0: Implements a wrapper around the C++ library <a href="https://github.com/mapbox/polylabel">polylabel</a> from <code>Mapbox</code>, providing an efficient routine for finding the approximate pole of inaccessibility of a polygon. See <a href="https://cran.r-project.org/web/packages/polylabelr/readme/README.html">README</a>.</p>
<p><img src="/post/2018-12-14-NovTop40_files/polylabel.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=RiemBase">RiemBase</a> v0.2.1: Implements a number of algorithms to estimate fundamental statistics including Fréchet mean and geometric median for manifold-valued data. See <a href="doi:10.1017/CBO9781139094764">Bhattacharya and Bhattacharya (2012)</a> if you are interested in statistics on manifolds, and <a href="https://www.abebooks.com/servlet/BookDetailsPL?bi=30175283776&searchurl=isbn%3D978-0-691-13298-3%26sortby%3D17&cm_sp=snippet-_-srp1-_-title1">Absil et al. (2007)</a> for information on the computational aspects of optimization on matrix manifolds.</p>
<p><a href="https://cran.r-project.org/package=SolveRationalMatrixEquation">SolveRationalMatrixEquation</a> v0.1.0: Provides functions to find the symmetric positive definite solution X such that X = Q + L X<sup>-1</sup> L<sup>T</sup>, given a symmetric positive definite matrix Q and a non-singular matrix L. See <a href="doi:10.1155/2007/21850">Benner et al. (2007)</a> for details and the <a href="https://cran.r-project.org/web/packages/SolveRationalMatrixEquation/vignettes/SolveRationalMatrixEquation.html">vignette</a> for an example.</p>
<h3 id="data">Data</h3>
<p><a href="https://cran.r-project.org/package=metsyn">metsyn</a> v0.1.2: Provides an interface to the <a href="https://donneespubliques.meteofrance.fr/?fond=produit&id_produit=90&id_rubrique=32">Meteo France Synop data</a> <a href="https://donneespubliques.meteofrance.fr/?fond=produit&id_produit=90&id_rubrique=32">API</a>. This is meteorological data recorded every 3 hours at 62 French meteorological stations.</p>
<p><img src="/post/2018-12-14-NovTop40_files/metsyn.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=neonUtilities">neonUtilities</a> v1.0.1: Provides an interface to the <a href="http://data.neonscience.org">National Ecological Observatory Network</a> <a href="http://data.neonscience.org/data-api">NEON API</a>. For more information, see the <a href="https://github.com/NEONScience/NEON-utilities">README file</a>.</p>
<p><a href="https://cran.r-project.org/package=phenocamapi">phenocamapi</a> v0.1.2: Allows users to obtain phenological time-series and site metadata from the <a href="https://phenocam.sr.unh.edu/webcam/">PhenoCam network</a>. There is a <a href="https://cran.r-project.org/web/packages/phenocamapi/vignettes/getting_started_phenocam_api.html">Getting Started Guide</a> and a <a href="https://cran.r-project.org/web/packages/phenocamapi/vignettes/phenocam_data_fusion.html">vignette</a> with examples.</p>
<p><img src="/post/2018-12-14-NovTop40_files/phenocamapi.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=rdbnomics">rdbnomics</a> v0.4.3: Provides access to hundreds of millions of data series from the <a href="https://db.nomics.world/">DBnomics API</a>. See the <a href="https://cran.r-project.org/web/packages/rdbnomics/vignettes/rdbnomics-tutorial.html">vignette</a> for examples.</p>
<p><img src="/post/2018-12-14-NovTop40_files/rdbnomics.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=restez">restez</a> v1.0.0: Allows users to download large sections of <a href="https://www.ncbi.nlm.nih.gov/genbank/">GenBank</a> and generate a local SQL-based database. A user can then query this database using <code>restez</code> functions.</p>
<h3 id="finance">Finance</h3>
<p><a href="https://cran.r-project.org/package=">crseEventStudy</a> v1.0: Implements the <a href="doi:10.1016/j.jempfin.2018.02.004">Dutta et al. (2018)</a> standardized test for abnormal returns in long-horizon event studies to improve the power and robustness of the tests described in <a href="doi:10.1016/B978-0-444-53265-7.50015-9">Kothari and Warner (2007)</a>.</p>
<p><a href="https://cran.r-project.org/package=psymonitor">psymonitor</a> v0.0.1: Provides functions to apply the real-time monitoring strategy proposed by <a href="doi:10.1111/iere.12132">Phillips, Shi and Yu (2015)</a> (and <a href="doi:10.1111/iere.12131">here</a>) to test for “bubbles”. There is a vignette on <a href="https://cran.r-project.org/web/packages/psymonitor/vignettes/illustrationBONDS.html">detecting crises</a> and another on <a href="https://cran.r-project.org/web/packages/psymonitor/vignettes/illustrationSNP.html">monitoring bubbles</a>.</p>
<p><img src="/post/2018-12-14-NovTop40_files/psymonitor.png" height = "400" width="600"></p>
<h3 id="machine-learning">Machine Learning</h3>
<p><a href="https://cran.r-project.org/package=pivmet">pivmet</a> v0.1.0: Provides a collection of pivotal algorithms for relabeling the MCMC chains in order to cope with the label switching problem in Bayesian mixture models. Functions also initialize the centers of the classical k-means algorithm in order to obtain a better clustering solution. There is a vignette on <a href="https://cran.r-project.org/web/packages/pivmet/vignettes/K-means_clustering_using_the_MUS_algorithm.html">K-means clustering</a> and another on <a href="https://cran.r-project.org/web/packages/pivmet/vignettes/Relabelling_in_Bayesian_mixtures_by_pivotal_units.html">Label Switching</a>.</p>
<p><img src="/post/2018-12-14-NovTop40_files/pivmet.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=RDFTensor">RDFTensor</a> v1.0: Implements tensor factorization techniques suitable for sparse, binary and three-mode <code>RDF</code> tensors. See <a href="doi:10.1145/2187836.2187874">Nickel et al. (2012)</a>, <a href="doi:10.1038/44565">Lee and Seung</a>, <a href="doi:10.1007/978-3-642-33460-3_39">Papalexakis et al.</a> and <a href="doi:10.1137/110859063">Chi and Kolda (2012)</a> for details.</p>
<p><a href="https://cran.r-project.org/package=rfviz">rfviz</a> v1.0.0: Provides an interactive data visualization and exploration toolkit that implements Breiman and Cutler’s original Java-based random forest visualization tools. It includes both supervised and unsupervised classification and regression algorithms. The <a href="https://www.stat.berkeley.edu/~breiman/RandomForests/cc_graphics.htm">Berkeley website</a> describes the original implementation.</p>
<p><a href="https://cran.r-project.org/package=rJST">rJST</a> v1.0: Provides functions to estimate the Joint Sentiment Topic model as described by <a href="doi:10.1145/1645953.1646003">Lin and He (2009)</a> and <a href="doi:10.1109/TKDE.2011.48">Lin et al. (2012)</a>. See the <a href="https://cran.r-project.org/web/packages/rJST/vignettes/rJST.html">Introduction</a> for details.</p>
<h3 id="marketing-analytics">Marketing Analytics</h3>
<p><a href="https://cran.r-project.org/package=MarketMatching">MarketMatching</a> v1.1.1: Enables users to find the best control markets using time series matching and analyze the impact of an intervention. Uses the <code>dtw</code> package to do the matching and the <code>CausalImpact</code> package to analyze the causal impact. See the <a href="https://cran.r-project.org/web/packages/MarketMatching/vignettes/MarketMatching-Vignette.html">vignette</a> for an example.</p>
<h3 id="science">Science</h3>
<p><a href="https://cran.r-project.org/package=EpiSignalDetection">EpiSignalDetection</a> v0.1.1: Provides functions to detect possible outbreaks using infectious disease surveillance data at the European Union / European Economic Area or country level. See <a href="doi:10.18637/jss.v070.i10">Salmon et al. (2016)</a> for a description of the automatic detection tools and the <a href="https://cran.r-project.org/web/packages/EpiSignalDetection/vignettes/EpiSignalDetection_Vignette.html">vignette</a> for an overview of the package.</p>
<p><a href="https://cran.r-project.org/package=memnet">memnet</a> v0.1.0: Implements network science tools to facilitate research into human (semantic) memory including several methods to infer networks from verbal fluency data, various network growth models, diverse random walk processes, and tools to analyze and visualize networks. See <a href="doi:10.31234/osf.io/s73dp">Wulff et al. (2018)</a> and the <a href="https://cran.r-project.org/web/packages/memnet/vignettes/memnet.html">vignette</a> for an introduction.</p>
<p><img src="/post/2018-12-14-NovTop40_files/memnet.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=phylocomr">phylocomr</a> v0.1.2: Implements an interface to <a href="http://phylodiversity.net/phylocom/">Phylocom</a>, a library for analysis of phylogenetic community structure and character evolution. See the <a href="https://cran.r-project.org/web/packages/phylocomr/vignettes/phylocomr_vignette.html">vignette</a> for an introduction to the package.</p>
<p><a href="https://cran.r-project.org/package=plinkQC">plinkQC</a> v0.2.0: Facilitates genotype quality control for genetic association studies as described by <a href="doi:10.1038/nprot.2010.116">Anderson et al. (2010)</a>. There are vignettes on <a href="https://cran.r-project.org/web/packages/plinkQC/vignettes/AncestryCheck.pdf">Ancestry Estimation</a>, <a href="https://cran.r-project.org/web/packages/plinkQC/vignettes/Genomes1000.pdf">Processing 1000 Genomes</a>, <a href="https://cran.r-project.org/web/packages/plinkQC/vignettes/HapMap.pdf">HapMap III Data</a> and <a href="https://cran.r-project.org/web/packages/plinkQC/vignettes/plinkQC.pdf">Genotype Quality Control</a>.</p>
<p><img src="/post/2018-12-14-NovTop40_files/plinkQC.png" height = "400" width="600"></p>
<h3 id="statistics">Statistics</h3>
<p><a href="https://cran.r-project.org/package=BivRec">BivRec</a> v1.0.0: Implements a collection of non-parametric and semiparametric methods to analyze alternating recurrent event data. See <a href="https://cran.r-project.org/web/packages/BivRec/readme/README.html">README</a> for examples.</p>
<p><img src="/post/2018-12-14-NovTop40_files/BivRec.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=cusum">cusum</a> v0.1.0: Provides functions for constructing and evaluating <a href="https://en.wikipedia.org/wiki/CUSUM">CUSUM charts</a> and RA-CUSUM charts with focus on false signal probability. The <a href="https://cran.r-project.org/web/packages/cusum/vignettes/cusum.html">vignette</a> offers an example.</p>
<p><a href="https://cran.r-project.org/package=dabestr">dabestr</a> v0.1.0: Offers an alternative to significance testing using bootstrap methods and estimation plots. See <a href="doi:10.1101/377978">Ho et al. (2018)</a>. There is a vignette on <a href="https://cran.r-project.org/web/packages/dabestr/vignettes/bootstrap-confidence-intervals.html">Bootstrap Confidence Intervals</a>, another on <a href="https://cran.r-project.org/web/packages/dabestr/vignettes/robust-statistical-visualization.html">Statistical Visualizations</a>, and a third on creating <a href="https://cran.r-project.org/web/packages/dabestr/vignettes/using-dabestr.html">Estimation Plots</a>.</p>
<p><img src="/post/2018-12-14-NovTop40_files/dbestr.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=deckgl">deckgl</a> v0.1.8: Implements an interface to <a href="https://deck.gl/">deck.gl</a>, a WebGL-powered open-source JavaScript framework for visual exploratory data analysis of large data sets, and supports basemaps from <a href="https://www.mapbox.com/">mapbox</a>. There are fourteen brief vignettes, each devoted to a different plot layer, but look <a href="https://crazycapivara.github.io/deckgl/">here</a> for a brief overview.</p>
<p><img src="https://user-images.githubusercontent.com/18344164/48512983-cca32e80-e8ae-11e8-9107-c380925cf861.gif" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=LindleyPowerSeries">LindleyPowerSeries</a> v0.1.0: Provides functions to compute the probability density function, the cumulative distribution function, the hazard rate function, the quantile function and random generation for Lindley Power Series distributions. See <a href="doi:10.1007/s13171-018-0150-x">Nadarajah and Si (2018)</a> for details.</p>
<p><a href="https://cran.r-project.org/package=modi">modi</a> v0.1.0: Implements algorithms that take sample designs into account to detect multivariate outliers. See <a href="doi:10.17713/ajs.v45i1.86">Bill and Hulliger (2016)</a> for details and the <a href="https://cran.r-project.org/web/packages/modi/vignettes/modi_vignette.html">vignette</a> for an introduction.</p>
<p><a href="https://cran.r-project.org/package=MPTmultiverse">MPTmultiverse</a> v0.1: Provides a function to examine the multiverse of possible modeling choices. See the paper by <a href="doi:10.1177/1745691616658637">Steegen et al. (2016)</a> and the <a href="https://cran.r-project.org/web/packages/MPTmultiverse/vignettes/introduction-bayen_kuhlmann_2011.html">vignette</a> for an overview of the package.</p>
<p><img src="/post/2018-12-14-NovTop40_files/MPTmultiverse.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=pterrace">pterrace</a> v1.0: Provides functions to plot the persistence terrace, a summary graphic for topological data analysis that helps to determine the number of significant topological features. See <a href="doi:10.1080/10618600.2017.1422432">Moon et al. (2018)</a>.</p>
<p><img src="/post/2018-12-14-NovTop40_files/pterrace.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=randcorr">randcorr</a> v1.0: Implements the algorithm by <a href="doi:10.1016/j.spl.2015.06.015">Pourahmadi and Wang (2015)</a> for generating a random p x p correlation matrix by representing the correlation matrix using Cholesky factorization and hyperspherical coordinates. See <a href="arXiv:1809.05212">Makalic and Schmidt (2018)</a> for the sampling process used.</p>
<p><a href="https://cran.r-project.org/package=SMFilter">SMFilter</a> v1.0.3: Provides filtering algorithms for the state space models on the <a href="https://en.wikipedia.org/wiki/Stiefel_manifold">Stiefel manifold</a> as well as the corresponding sampling algorithms for uniform, vector Langevin-Bingham and <a href="https://www.sciencedirect.com/science/article/pii/S0047259X02000659">matrix Langevin-Bingham distributions</a> on the Stiefel manifold. See the <a href="https://cran.r-project.org/web/packages/SMFilter/vignettes/readme.html">vignette</a>.</p>
<h3 id="utilities">Utilities</h3>
<p><a href="https://cran.r-project.org/package=IRkernel">IRkernel</a> v0.8.14: Implements a native R kernel for <a href="https://jupyter.org/">Jupyter Notebook</a>. See <a href="https://cran.r-project.org/web/packages/IRkernel/readme/README.html">README</a> for information on how to use the package.</p>
<p><a href="https://cran.r-project.org/package=lobstr">lobstr</a> v1.0.0: Provides a set of tools for inspecting and understanding R data structures inspired by <code>str()</code>. See <a href="https://cran.r-project.org/web/packages/lobstr/readme/README.html">README</a> for information on the included functions.</p>
<p><a href="https://cran.r-project.org/package=parsnip">parsnip</a> v0.0.1: Implements a common interface allowing users to specify a model without having to remember the different argument names across different functions or computational engines. The <a href="https://cran.r-project.org/web/packages/parsnip/vignettes/parsnip_Intro.html">vignette</a> goes over the basics.</p>
<p><a href="https://cran.r-project.org/package=pkgsearch">pkgsearch</a> v2.0.1: Allows users to search CRAN R packages using the <a href="https://www.r-pkg.org/">METACRAN</a> search server.</p>
<p><a href="https://cran.r-project.org/package=stevedore">stevedore</a> v0.9.0: Implements an interface to the <a href="https://docs.docker.com/develop/sdk/">Docker API</a>. There is an <a href="https://cran.r-project.org/web/packages/stevedore/vignettes/stevedore.html">Introduction</a> and a vignette with <a href="https://cran.r-project.org/web/packages/stevedore/vignettes/examples.html">Examples</a>.</p>
<p><a href="https://cran.r-project.org/package=vctrs">vctrs</a> v0.1.0: Defines new notions of prototype and size that are used to provide tools for consistent and well-founded type-coercion and size-recycling. There are vignettes on <a href="https://cran.r-project.org/web/packages/vctrs/vignettes/s3-vector.html">S3 vectors</a>, <a href="https://cran.r-project.org/web/packages/vctrs/vignettes/stability.html">Type and Size Stability</a> and <a href="https://cran.r-project.org/web/packages/vctrs/vignettes/type-size.html">Prototypes</a>.</p>
<p><a href="https://cran.r-project.org/package=vtree">vtree</a> v0.1.4: Provides a function for drawing <code>variable trees</code>, plots that display information about hierarchical subsets of a data frame defined by values of categorical variables. The <a href="https://cran.r-project.org/web/packages/vtree/vignettes/vtree.html">vignette</a> offers an introduction.</p>
<p><img src="/post/2018-12-14-NovTop40_files/vtree.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=zipR">zipR</a> v0.1.0: Implements the Python <code>zip()</code> function in R. See the <a href="https://cran.r-project.org/web/packages/zipR/vignettes/my-vignette.html">vignette</a>.</p>
<h3 id="visualization">Visualization</h3>
<p><a href="https://cran.r-project.org/package=countcolors">countcolors</a> v0.9.0: Contains functions to count colors within color range(s) in images, and provides a masked version of the image with targeted pixels changed to a different color. Output includes the locations of the pixels in the images, and the proportion of the image within the target color range with optional background masking. There is an <a href="https://cran.r-project.org/web/packages/countcolors/vignettes/Introduction.html">Introduction</a> and an <a href="https://cran.r-project.org/web/packages/countcolors/vignettes/bat_WNS.html">Example</a>.</p>
<p><img src="/post/2018-12-14-NovTop40_files/countcolors.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=coveffectsplot">coveffectsplot</a> v0.0.1: Provides forest plots to visualize covariate effects from either the command line or an interactive <code>Shiny</code> application. There is an <a href="https://cran.r-project.org/web/packages/coveffectsplot/vignettes/introduction_to_coveffectsplot.html">Introduction</a>.</p>
<p><img src="/post/2018-12-14-NovTop40_files/coveffectsplot.png" height = "400" width="600"></p>
<script>window.location.href='https://rviews.rstudio.com/2018/12/21/november-2018-top-40-new-packages/';</script>
Statistics in Glaucoma: Part III
https://rviews.rstudio.com/2018/12/18/statistics-in-glaucoma-part-iii/
Tue, 18 Dec 2018 00:00:00 +0000https://rviews.rstudio.com/2018/12/18/statistics-in-glaucoma-part-iii/
<p><em>Samuel Berchuck is a Postdoctoral Associate in Duke University’s Department of Statistical Science and Forge-Duke’s Center for Actionable Health Data Science.</em></p>
<p><em>Joshua L. Warren is an Assistant Professor of Biostatistics at Yale University.</em></p>
<div id="looking-forward-in-glaucoma-progression-research" class="section level2">
<h2>Looking Forward in Glaucoma Progression Research</h2>
<p>The contribution of the <code>womblR</code> package and corresponding statistical methodology is a technique for correctly accounting for the complex spatial structure of the visual field. The purpose of this method is to properly model visual field data, so that an effective diagnostic is derived that discriminates progression status. This is one of many important clinical questions that need to be addressed in glaucoma research. Others include: quantifying visual field variability to create simulations of healthy and progressing patients, combining multiple data modalities to obtain a composite diagnostic, and predicting the timing and spatial location of future vision loss. There is opportunity within the glaucoma literature for the development of quantitative methods that answer important clinical questions, are easy to understand, and are simple to use. To this end, in closing this three-part series, we present a final example of a new method that uses change points to assess future vision loss.</p>
</div>
<div id="modeling-changes-on-the-visual-field-using-spatial-change-points" class="section level2">
<h2>Modeling Changes on the Visual Field Using Spatial Change Points</h2>
<p>To motivate the use of change points, we note that patients diagnosed with glaucoma are often monitored for years with slow changes in visual functionality. It is not until disease progression that notable vision loss occurs, and the deterioration is often swift. This disease course inspires a modeling framework that can identify a point of functional change in the course of follow-up; thus, change points are employed. This is an appealing modeling framework, because the concept of disease progression becomes intrinsically parameterized into the model, with the change point representing the point of functional change. In this model, the time of the change point triggers a simultaneous change in both the mean and variance processes. Furthermore, to account for the typical trend of a long period of monitoring with little change followed by abrupt vision loss, we force the mean and variance to be constant before the change point. For the mean process, and assuming data from a patient with nine visual field tests, this results in <span class="math display">\[\mu_t\left(\mathbf{s}_i\right)=\left\{ \begin{array}{ll}
{\beta}_0\left(\mathbf{s}_i\right) & x_t \leq \theta\left(\mathbf{s}_i\right),\\
{\beta}_0\left(\mathbf{s}_i\right) + {\beta}_1\left(\mathbf{s}_i\right)\left\{x_t-\theta\left(\mathbf{s}_i\right)\right\} & \theta\left(\mathbf{s}_i\right) \leq x_t.\end{array} \right. \quad t = 1,\ldots,9 \quad i = 1,\ldots,52 \]</span> Here, the change point at location <span class="math inline">\(\mathbf{s}_i\)</span> is given by <span class="math inline">\(\theta(\mathbf{s}_i)\)</span>, and <span class="math inline">\(x_t\)</span> is the days from baseline visit for follow-up visit <span class="math inline">\(t\)</span>. A final important detail is that the change points <span class="math inline">\(\theta(\mathbf{s}_i)\)</span> are truncated in the observed follow-up range, <span class="math inline">\((x_1, x_9)\)</span>. In practice, the true change point can occur outside of this region, so we define a latent process, <span class="math inline">\(\eta(\mathbf{s}_i)\)</span>, that defines the true change point, <span class="math inline">\(\theta(\mathbf{s}_i) = \max\{\min\{\eta(\mathbf{s}_i), x_9\}, x_1\}\)</span>. Finally, all of the location-specific effects are modeled using a novel multivariate conditional autoregressive (MCAR) prior that incorporates the anatomy detailed in the <code>womblR</code> method. More details can be found in Berchuck et al. 2018.</p>
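<p>As a rough illustration of the mean process and the truncation rule at a single location, here is a minimal base R sketch. The helper names <code>truncate_cp</code> and <code>mean_process</code>, the visit times (in years rather than days), and the parameter values are all made up for this example; none of them come from <code>spCP</code>.</p>
<pre class="r"><code>###Hypothetical helpers for illustration only (not part of spCP)
# theta = max{min{eta, x_9}, x_1}: truncate the latent change point to the follow-up range
truncate_cp <- function(eta, x) pmax(pmin(eta, max(x)), min(x))
# mu_t = beta0 before the change point, beta0 + beta1 * (x_t - theta) after it
mean_process <- function(x, beta0, beta1, theta) beta0 + beta1 * pmax(x - theta, 0)
###Made-up inputs: nine visit times and one location's parameters
x <- seq(0, 4, by = 0.5) # nine visits, in years from baseline
theta <- truncate_cp(eta = 2, x = x) # latent change point inside follow-up
mu <- mean_process(x, beta0 = 30, beta1 = -4, theta = theta)
mu # flat at beta0 up to theta, then linear with slope beta1</code></pre>
<p>A latent value outside of follow-up, such as <code>truncate_cp(10, x)</code>, is pulled back to the last visit, so the mean stays constant over the whole observed series.</p>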
<p>We once again rely on MCMC methods for inference, and a package similar to <code>womblR</code> was developed that implements the spatially varying change point model, <code>spCP</code>. This package has much of the same functionality as <code>womblR</code>, and we demonstrate its functionality below.</p>
<p>We begin by loading <code>spCP</code>. All of the visual field data (<code>VFSeries</code>), adjacencies (<code>HFAII_Queen</code>), and anatomical angles (<code>GarwayHeath</code>) are included in the <code>spCP</code> package.</p>
<pre class="r"><code>###Load package
library(spCP)
###Format data
blind_spot <- c(26, 35) # define blind spot
VFSeries <- VFSeries[order(VFSeries$Location), ] # sort by location
VFSeries <- VFSeries[order(VFSeries$Visit), ] # sort by visit
VFSeries <- VFSeries[!VFSeries$Location %in% blind_spot, ] # remove blind spot locations
Y <- VFSeries$DLS # define observed outcome data
Time <- unique(VFSeries$Time) / 365 # years since baseline visit
MaxTime <- max(Time)
###Neighborhood objects
W <- HFAII_Queen[-blind_spot, -blind_spot] # visual field adjacency matrix
M <- dim(W)[1] # number of locations
DM <- GarwayHeath[-blind_spot] # Garway-Heath angles
Nu <- length(Time) # number of visits
###Obtain bounds for spatial parameter (details are in Berchuck et al. 2018)
pdist <- function(x, y) pmin(abs(x - y), (360 - pmax(x, y) + pmin(x, y))) #Dissimilarity metric distance function (i.e., circumference)
DM_Matrix <- matrix(nrow = M, ncol = M)
for (i in 1:M) {
for (j in 1:M) {
DM_Matrix[i, j] <- pdist(DM[i], DM[j])
}
}
BAlpha <- -log(0.5) / min(DM_Matrix[DM_Matrix > 0])
AAlpha <- 0
###Hyperparameters
Hypers <- list(Alpha = list(AAlpha = AAlpha, BAlpha = BAlpha),
Sigma = list(Xi = 6, Psi = diag(5)),
Delta = list(Kappa2 = 1000))
###Starting values
Starting <- list(Sigma = 0.01 * diag(5),
Alpha = mean(c(AAlpha, BAlpha)),
Delta = c(0, 0, 0, 0, 0))
###Metropolis tuning variances
Tuning <- list(Lambda0Vec = rep(1, M),
Lambda1Vec = rep(1, M),
EtaVec = rep(1, M),
Alpha = 1)
###MCMC inputs
MCMC <- list(NBurn = 10000, NSims = 250000, NThin = 25, NPilot = 20)</code></pre>
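<p>The dissimilarity metric above is a distance on a circle: two angles are never more than 180 degrees apart. Here is a small sketch with made-up angles (they are not Garway-Heath values). It also shows that, because <code>pdist</code> is vectorized, <code>outer()</code> is one possible alternative to the double loop.</p>
<pre class="r"><code>###Circular distance between angles in degrees (same definition as above)
pdist <- function(x, y) pmin(abs(x - y), (360 - pmax(x, y) + pmin(x, y)))
###Made-up angles for illustration
angles <- c(10, 90, 350)
D <- outer(angles, angles, pdist) # all pairwise distances without a loop
D[1, 3] # 10 and 350 degrees are 20 degrees apart, not 340
###Upper bound computed the same way as BAlpha above
-log(0.5) / min(D[D > 0])</code></pre>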
<p>Once the inputs have been properly formatted, the program can be run.</p>
<pre class="r"><code>###Run MCMC sampler
reg.spCP <- spCP(Y = Y, DM = DM, W = W, Time = Time,
Starting = Starting, Hypers = Hypers, Tuning = Tuning, MCMC = MCMC,
Family = "tobit",
Weights = "continuous",
Distance = "circumference",
Rho = 0.99,
ScaleY = 10,
ScaleDM = 100,
Seed = 54)
## Burn-in progress: |*************************************************|
## Sampler progress: 0%.. 10%.. 20%.. 30%.. 40%.. 50%.. 60%.. 70%.. 80%.. 90%.. 100%.. </code></pre>
<p>To visualize the estimated change points, we can use the <code>PlotCP</code> function from <code>spCP</code>. The function requires the model fit object and the original data set, plus the variable names of the raw DLS, time (in years), and spatial locations.</p>
<pre class="r"><code>VFSeries$TimeYears <- VFSeries$Time / 365
PlotCP(reg.spCP,
VFSeries,
dls = "DLS",
time = "TimeYears",
location = "Location",
cp.line = TRUE,
cp.ci = TRUE)</code></pre>
<p><img src="/post/2018-12-12-statistics-in-glaucoma-part-iii_files/figure-html/unnamed-chunk-4-1.png" width="689.28" style="display: block; margin: auto;" /></p>
<p>Using the <code>PlotCP</code> function, we present the posterior means of the change points using a blue vertical line, with dashed 95% credible intervals. Furthermore, the mean process and credible interval are plotted using red lines, and the raw DLS values are given by black points. For this example patient, the majority of the change points are at the edges of follow-up. When the DLS is constant over time, the estimated change points are at the end of follow-up, while trends that are already present at the start of follow-up correspond to change points at the beginning. This information provides clinicians with visual and quantitative confirmation of functional changes across the visual field.</p>
</div>
<div id="change-points-as-a-proxy-for-progression" class="section level2">
<h2>Change Points as a Proxy for Progression</h2>
<p>To formalize the importance of the change points, we look to convert their presence or absence into a clinical decision. We calculate the probability that a change point has been observed at each location across the visual field. To provide a tool that is useful for clinicians, we create a gif that presents the probability of a change point throughout a patient’s follow-up and extends the prediction one and a half years into the future. In Berchuck et al. 2018, it is shown that these change points are highly predictive of progression.</p>
<p>We begin by extracting and calculating the change point probabilities.</p>
<pre class="r"><code>###Extract change point posterior samples
eta <- reg.spCP$eta
###Convert change points to probabilities of occurring before time t
NFrames <- 50 # number of frames in GIF
GIF_Times <- seq(0, MaxTime + 1.5, length.out = NFrames) # obtain GIF 1.5 years after the end of follow-up
GIF_Days <- round(GIF_Times * 365) # convert to days for use later
CP_Probs <- matrix(nrow = M, ncol = NFrames)
###Obtain probabilities at each time point
for (t in 1:NFrames) {
CP_Probs[, t] <- apply(eta, 2, function(x) mean(x < GIF_Times[t]))
}
colnames(CP_Probs) <- GIF_Times</code></pre>
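<p>To see what this calculation does without running the sampler, here is a toy sketch. It assumes posterior draws are stored with one column per location, as in the <code>apply(eta, 2, ...)</code> call above. The matrix <code>eta_toy</code> is simulated and is not output from <code>spCP</code>.</p>
<pre class="r"><code>###Simulated "posterior draws" for two locations (illustration only)
set.seed(1)
eta_toy <- cbind(rnorm(1000, mean = 1), rnorm(1000, mean = 3))
times <- c(0, 2, 4) # years from baseline
###Share of draws falling before each time, per location
probs <- sapply(times, function(t) colMeans(eta_toy < t))
dimnames(probs) <- list(paste0("loc", 1:2), times)
probs # rows are locations, columns are times</code></pre>
<p>Each row is non-decreasing in time, because a draw that falls before an earlier time also falls before every later one.</p>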
<p>Now, to create a gif of the probabilities, we use the <code>magick</code> package, and in particular, the functions <code>image_graph</code> and <code>image_animate</code>. Furthermore, we use the <code>PlotSensitivity</code> function from <code>womblR</code> to plot the predicted probabilities on the visual field.</p>
<pre class="r"><code>###Load packages
library(magick) # package for creating GIFs
library(womblR) # loaded for PlotSensitivity
###Create GIF
Img <- image_graph(600, 600, res = 96)
for (f in 1:NFrames) {
p <- womblR::PlotSensitivity(CP_Probs[, f],
legend.lab = expression(paste("Pr[", eta, "(s) < ", t, "]")),
zlim = c(0, 1),
bins = 250,
legend.round = 2,
border = FALSE,
main = bquote("Days from baseline: t = " ~ .(GIF_Days[f]) ~ " (" ~ t[max] ~ " = " ~ .(Time[Nu] * 365) ~ ")"))
}
dev.off()</code></pre>
<pre><code>## quartz_off_screen
## 2</code></pre>
<p>Now, we animate and print the created gif using the <code>image_animate</code> function, specifying 10 frames per second using the <code>fps</code> option.</p>
<pre class="r"><code>Animation <- image_animate(Img, fps = 10)
Animation</code></pre>
<p><img src="/post/2018-12-12-statistics-in-glaucoma-part-iii_files/figure-html/unnamed-chunk-7-1.gif" style="display: block; margin: auto;" /></p>
<p>This gif has many properties that make it clinically useful. The space-time nature of the image allows for clinicians to understand not only the current state of the disease, but also the progression pattern throughout all of follow-up. Furthermore, the gif shows the pattern and future risk of progression over the next one and a half years, presenting clinicians a tool for planning for future risk.</p>
</div>
<div id="conclusions-and-future-directions" class="section level2">
<h2>Conclusions and Future Directions</h2>
<p>The hope in developing these <code>R</code> packages is for them to be used clinically, and to inspire other quantitative scientists to do the same. When statistical methods are developed for medical research, the methodologies are typically published without any corresponding software package. This means that no matter how impactful the method may be, it is unlikely to make a clinical impact for many years, due to the complexity of implementing the inferential methods. Clinicians are dependent on quantitative methods for analyzing the massive amounts of data that exist in today’s world, and they are typically reliant on the proprietary software that is built into the imaging machines themselves. This software is useful, but because the methods are often not published, it can be difficult to interpret the results. More open-source software being developed for medical research will lead to greater collaboration and visibility of the important problems being addressed by health researchers. The <code>R</code> environment, including CRAN and RStudio, makes it particularly easy to create and share <code>R</code> packages, and the development of <code>Rcpp</code> and its relatives allows for the packages to be computationally fast. Our hope is that the <code>womblR</code> and <code>spCP</code> packages illustrate this concept and excite people to get involved in glaucoma research, or one of many other important health areas.</p>
</div>
<div id="reference" class="section level2">
<h2>Reference</h2>
<ol style="list-style-type: decimal">
<li>Berchuck, S.I., Mwanza, J.C., & Warren, J.L. (2018). <a href="https://arxiv.org/pdf/1811.11038.pdf">“A Spatially Varying Change Points Model for Monitoring Glaucoma Progression Using Visual Field Data”</a>.</li>
</ol>
</div>
<script>window.location.href='https://rviews.rstudio.com/2018/12/18/statistics-in-glaucoma-part-iii/';</script>
Rsampling Fama French
https://rviews.rstudio.com/2018/12/13/rsampling-fama-french/
Thu, 13 Dec 2018 00:00:00 +0000https://rviews.rstudio.com/2018/12/13/rsampling-fama-french/
<p>Today we will continue our work on Fama French factor models, but more as a vehicle to explore some of the awesome stuff happening in the world of <a href="https://www.tidyverse.org/articles/2018/11/tidymodels-update-nov-18/">tidy models</a>. For new readers who want to get familiar with Fama French before diving into this post, see <a href="https://rviews.rstudio.com/2018/04/11/introduction-to-fama-french/">here</a> where we covered importing and wrangling the data, <a href="https://rviews.rstudio.com/2018/05/10/rolling-fama-french/">here</a> where we covered rolling models and visualization, my most recent previous post <a href="https://rviews.rstudio.com/2018/11/19/many-factor-models/">here</a> where we covered managing many models, and if you’re into Shiny, <a href="http://www.reproduciblefinance.com/shiny/fama-french-three-factor/">this flexdashboard</a>.</p>
<p>Our goal today is to explore k-fold cross-validation via the <code>rsample</code> package, and a bit of model evaluation via the <code>yardstick</code> package. We started the model evaluation theme last time when we used <code>tidy()</code>, <code>glance()</code> and <code>augment()</code> from the <code>broom</code> package. In this post, we will use the <code>rmse()</code> function from <code>yardstick</code>, but our main focus will be on the <code>vfold_cv()</code> function from <code>rsample</code>. We are going to explore these tools in the context of linear regression and Fama French, which might seem weird since these tools would typically be employed in the realms of machine learning, classification, and the like. We’ll stay in the world of explanatory models via linear regression for a few reasons.</p>
<p>First, and this is a personal preference, when getting to know a new package or methodology, I prefer to do so in a context that’s already familiar. I don’t want to learn about <code>rsample</code> whilst also getting to know a new data set and learning the complexities of some crazy machine learning model. Since Fama French is familiar from our previous work, we can focus on the new tools in <code>rsample</code> and <code>yardstick</code>. Second, factor models are important in finance, despite relying on good old linear regression. We won’t regret time spent on factor models, and we might even find creative new ways to deploy or visualize them.</p>
<p>The plan for today is to take the same models that we ran in the last post, only this time use k-fold cross-validation and bootstrapping to try to assess the quality of those models.</p>
<p>For that reason, we’ll be working with the same data as we did previously. I won’t go through the logic again, but in short, we’ll import data for daily prices of five ETFs, convert them to returns (have a look <a href="http://www.reproduciblefinance.com/2017/09/25/asset-prices-to-log-returns/">here</a> for a refresher on that code flow), then import the five Fama French factor data and join it to our five ETF returns data. Here’s the code to make that happen:</p>
<pre class="r"><code>library(tidyverse)
library(broom)
library(tidyquant)
library(timetk)
symbols <- c("SPY", "EFA", "IJS", "EEM", "AGG")
# The prices object will hold our daily price data.
prices <-
getSymbols(symbols, src = 'yahoo',
from = "2012-12-31",
to = "2017-12-31",
auto.assign = TRUE,
warnings = FALSE) %>%
map(~Ad(get(.))) %>%
reduce(merge) %>%
`colnames<-`(symbols)
asset_returns_long <-
prices %>%
tk_tbl(preserve_index = TRUE, rename_index = "date") %>%
gather(asset, returns, -date) %>%
group_by(asset) %>%
mutate(returns = (log(returns) - log(lag(returns)))) %>%
na.omit()
factors_data_address <-
"http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/Global_5_Factors_Daily_CSV.zip"
factors_csv_name <- "Global_5_Factors_Daily.csv"
temp <- tempfile()
download.file(
# location of file to be downloaded
factors_data_address,
# where we want R to store that file
temp,
quiet = TRUE)
Global_5_Factors <-
read_csv(unz(temp, factors_csv_name), skip = 6 ) %>%
rename(date = X1, MKT = `Mkt-RF`) %>%
mutate(date = ymd(parse_date_time(date, "%Y%m%d")))%>%
mutate_if(is.numeric, funs(. / 100)) %>%
select(-RF)
data_joined_tidy <-
asset_returns_long %>%
left_join(Global_5_Factors, by = "date") %>%
na.omit()</code></pre>
<p>After running that code, we have an object called <code>data_joined_tidy</code>. It holds daily returns for 5 ETFs and the Fama French factors. Here’s a look at the first row for each ETF.</p>
<pre class="r"><code>data_joined_tidy %>%
slice(1)</code></pre>
<pre><code># A tibble: 5 x 8
# Groups: asset [5]
date asset returns MKT SMB HML RMW CMA
<date> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2013-01-02 AGG -0.00117 0.0199 -0.0043 0.0028 -0.0028 -0.0023
2 2013-01-02 EEM 0.0194 0.0199 -0.0043 0.0028 -0.0028 -0.0023
3 2013-01-02 EFA 0.0154 0.0199 -0.0043 0.0028 -0.0028 -0.0023
4 2013-01-02 IJS 0.0271 0.0199 -0.0043 0.0028 -0.0028 -0.0023
5 2013-01-02 SPY 0.0253 0.0199 -0.0043 0.0028 -0.0028 -0.0023</code></pre>
<p>Let’s work with just one ETF for today and use <code>filter(asset == "AGG")</code> to shrink our data down to just that ETF.</p>
<pre class="r"><code>agg_ff_data <-
data_joined_tidy %>%
filter(asset == "AGG")</code></pre>
<p>Okay, we’re going to regress the daily returns of AGG on one factor, then three factors, then five factors, and we want to evaluate how well each model explains AGG’s returns. That means we need a way to test the model. Last time, we looked at the adjusted r-squared values when the model was run on the entirety of AGG returns. Today, we’ll evaluate the model using k-fold cross validation. That’s a pretty jargon-heavy phrase that isn’t part of the typical finance lexicon. Let’s start with the second part, <code>cross-validation</code>. Instead of running our model on the entire data set - all the daily returns of AGG - we’ll run it on just part of the data set, then test the results on the part that we did not use. Those different subsets of our original data are often called the training and the testing sets, though <code>rsample</code> calls them the <code>analysis</code> and <code>assessment</code> sets. We validate the model results by applying them to the <code>assessment</code> data and seeing how the model performed.</p>
<p>The <code>k-fold</code> bit refers to the fact that we’re not just dividing our data into training and testing subsets, we’re actually going to divide it into a bunch of groups, a <code>k</code> number of groups, or a <code>k</code> number of <code>folds</code>. One of those folds will be used as the validation set; the model will be fit on the other <code>k - 1</code> sets, and then tested on the validation set. We’re doing this with a linear model to see how well it explains the data; it’s typically used in machine learning to see how well a model predicts data (we’ll get there in 2019).<a href="#fn1" class="footnoteRef" id="fnref1"><sup>1</sup></a></p>
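<p>Before handing the work to <code>rsample</code>, it can help to see the idea written out by hand. The sketch below is base R only and uses the built-in <code>mtcars</code> data rather than our ETF returns. The names <code>fold_id</code> and <code>rmse_by_fold</code> are made up for this example, and this is not how <code>rsample</code> is implemented internally.</p>
<pre class="r"><code>###k-fold cross-validation by hand (illustration on mtcars)
set.seed(123)
n <- nrow(mtcars)
k <- 5
# randomly assign each row to one of k folds of near-equal size
fold_id <- sample(rep(seq_len(k), length.out = n))
rmse_by_fold <- sapply(seq_len(k), function(fold) {
  train <- mtcars[fold_id != fold, ] # fit on the other k - 1 folds
  test <- mtcars[fold_id == fold, ] # hold this fold out
  fit <- lm(mpg ~ wt, data = train)
  sqrt(mean((test$mpg - predict(fit, newdata = test))^2))
})
rmse_by_fold # one held-out error per fold
mean(rmse_by_fold) # averaged over the k folds</code></pre>
<p>Every row is held out exactly once, so each of the k error estimates comes from data the model did not see when it was fit.</p>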
<p>If you’re like me, it will take a bit of tinkering to really grasp k-fold cross validation, but <code>rsample</code> has a great function for dividing our data into k-folds. If we wish to use five folds (the state of the art seems to be either five or ten folds), we call the <code>vfold_cv()</code> function, pass it our data object <code>agg_ff_data</code>, and set <code>v = 5</code>.</p>
<pre class="r"><code>library(rsample)
library(yardstick)
set.seed(752)
cved_ff<-
vfold_cv(agg_ff_data, v = 5)
cved_ff</code></pre>
<pre><code># 5-fold cross-validation
# A tibble: 5 x 2
splits id
<list> <chr>
1 <split [1K/252]> Fold1
2 <split [1K/252]> Fold2
3 <split [1K/252]> Fold3
4 <split [1K/252]> Fold4
5 <split [1K/251]> Fold5</code></pre>
<p>We have an object called <code>cved_ff</code>, with a column called <code>splits</code> and a column called <code>id</code>. Let’s peek at the first split.</p>
<pre class="r"><code>cved_ff$splits[[1]]</code></pre>
<pre><code><1007/252/1259></code></pre>
<p>Three numbers. The first, 1007, is telling us how many observations are in the <code>analysis</code> set. Since we have five folds, we should have 80% (or 4/5) of our data in the <code>analysis</code> set. The second number, 252, is telling us how many observations are in the <code>assessment</code> set, which is 20% of our original data. The third number, 1259, is the total number of observations in our original data.</p>
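<p>The arithmetic behind those numbers can be checked in a couple of lines. This is a sketch that assumes the folds are made as equal in size as possible, which is consistent with the printed output above; it is not a description of how <code>rsample</code> assigns rows internally.</p>
<pre class="r"><code>###Fold sizes for 1259 observations and 5 folds
n <- 1259
v <- 5
# n %/% v rows per fold, with the remainder spread over the first folds
assessment_sizes <- rep(n %/% v, v) + (seq_len(v) <= n %% v)
analysis_sizes <- n - assessment_sizes
rbind(analysis = analysis_sizes, assessment = assessment_sizes)</code></pre>
<p>Since 1259 is not divisible by five, four folds hold out 252 rows and one holds out 251, matching the <code>[1K/252]</code> and <code>[1K/251]</code> entries printed for <code>cved_ff</code>.</p>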
<p>Next, we want to apply a model to the <code>analysis</code> set of this k-folded data and test the results on the <code>assessment</code> set. Let’s start with one factor and run a simple linear model, <code>lm(returns ~ MKT)</code>.</p>
<p>We want to run it on <code>analysis(cved_ff$splits[[1]])</code> - the analysis set of our first split.</p>
<pre class="r"><code>ff_model_test <- lm(returns ~ MKT, data = analysis(cved_ff$splits[[1]]))
ff_model_test</code></pre>
<pre><code>
Call:
lm(formula = returns ~ MKT, data = analysis(cved_ff$splits[[1]]))
Coefficients:
(Intercept) MKT
0.0001025 -0.0265516 </code></pre>
<p>Nothing too crazy so far. Now we want to test on our assessment data. The first step is to add the model’s fitted values to that data. We’ll use <code>augment()</code> for that task, and pass it <code>assessment(cved_ff$splits[[1]])</code>.</p>
<pre class="r"><code>ff_model_test %>%
augment(newdata = assessment(cved_ff$splits[[1]])) %>%
head() %>%
select(returns, .fitted)</code></pre>
<pre><code> returns .fitted
1 0.0009021065 1.183819e-04
2 0.0011726989 4.934779e-05
3 0.0010815505 1.157267e-04
4 -0.0024385815 -7.544460e-05
5 -0.0021715702 -8.341007e-05
6 0.0028159467 3.865527e-04</code></pre>
<p>We just added our fitted values to the <code>assessment</code> data, the subset of the data on which the model was not fit. How well did our model do when we compare the fitted values to the data in the held out set?</p>
<p>We can use the <code>rmse()</code> function from <code>yardstick</code> to measure our model. RMSE stands for root mean-squared error. It’s the square root of the mean of the squared differences between our fitted values and the actual values in the <code>assessment</code> data. A lower RMSE is better!</p>
<pre class="r"><code>ff_model_test %>%
augment(newdata = assessment(cved_ff$splits[[1]])) %>%
rmse(returns, .fitted)</code></pre>
<pre><code># A tibble: 1 x 3
.metric .estimator .estimate
<chr> <chr> <dbl>
1 rmse standard 0.00208</code></pre>
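<p>If the definition feels abstract, here is RMSE computed by hand in base R on four made-up numbers (the vectors <code>actual</code> and <code>fitted</code> are toy values, not our returns data).</p>
<pre class="r"><code># Toy values for illustration only.
actual <- c(1, 2, 3, 4)
fitted <- c(1.5, 2, 2.5, 5)
# Square the errors, average them, then take the square root.
squared_errors <- (actual - fitted)^2
rmse_by_hand <- sqrt(mean(squared_errors))
rmse_by_hand</code></pre>
<p>Because the errors are squared before averaging, one large miss hurts more than several small ones, and the final square root puts the result back in the units of the data.</p>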
<p>Now that we’ve done that piece by piece, let’s wrap the whole operation into one function. This function takes one argument, a <code>split</code>, and we’re going to use <code>pull()</code> so we can extract the raw number, instead of the entire <code>tibble</code> result.</p>
<pre class="r"><code>model_one <- function(split) {
split_for_model <- analysis(split)
ff_model <- lm(returns ~ MKT, data = split_for_model)
holdout <- assessment(split)
rmse <- ff_model %>%
augment(newdata = holdout) %>%
rmse(returns, .fitted) %>%
pull(.estimate)
}</code></pre>
<p>Now we pass it our first split.</p>
<pre class="r"><code>model_one(cved_ff$splits[[1]]) %>%
head()</code></pre>
<pre><code>[1] 0.002080324</code></pre>
<p>Now we want to apply that function to each of our five folds that are stored in <code>cved_ff</code>. We do that with a combination of <code>mutate()</code> and <code>map_dbl()</code>. We use <code>map_dbl()</code> instead of <code>map()</code> because we are returning a number here and there’s not a good reason to store that number in a list column.</p>
<pre class="r"><code> cved_ff %>%
mutate(rmse = map_dbl(cved_ff$splits, model_one))</code></pre>
<pre><code># 5-fold cross-validation
# A tibble: 5 x 3
splits id rmse
* <list> <chr> <dbl>
1 <split [1K/252]> Fold1 0.00208
2 <split [1K/252]> Fold2 0.00189
3 <split [1K/252]> Fold3 0.00201
4 <split [1K/252]> Fold4 0.00224
5 <split [1K/251]> Fold5 0.00190</code></pre>
<p>OK, we have five RMSEs since we ran the model on five separate <code>analysis</code> fold sets and tested on five separate <code>assessment</code> fold sets. Let’s find the average RMSE by taking the <code>mean()</code> of the <code>rmse</code> column. That can help reduce noisiness that resulted from our random creation of those five folds.</p>
<pre class="r"><code>cved_ff %>%
mutate(rmse = map_dbl(cved_ff$splits, model_one)) %>%
summarise(mean_rse = mean(rmse)) </code></pre>
<pre><code># 5-fold cross-validation
# A tibble: 1 x 1
mean_rse
<dbl>
1 0.00202</code></pre>
<p>We now have the mean RMSE after running our model, <code>lm(returns ~ MKT)</code>, on all five of our folds.</p>
<p>That process for finding the mean RMSE can be applied to other models, as well. Let’s suppose we wish to find the mean RMSE for two other models: <code>lm(returns ~ MKT + SMB + HML)</code>, the Fama French three-factor model, and <code>lm(returns ~ MKT + SMB + HML + RMW + CMA)</code>, the Fama French five-factor model. By comparing the mean RMSEs, we can evaluate which model explained the returns of AGG better. Since we’re just adding more and more factors, the models can be expected to get more and more accurate but again, we are exploring the <code>rsample</code> machinery and creating a template where we can pop in whatever models we wish to compare.</p>
<p>First, let’s create two new functions that follow the exact same code pattern as above but house the three-factor and five-factor models.</p>
<pre class="r"><code>model_two <- function(split) {
split_for_model <- analysis(split)
ff_model <- lm(returns ~ MKT + SMB + HML, data = split_for_model)
holdout <- assessment(split)
rmse <-
ff_model %>%
augment(newdata = holdout) %>%
rmse(returns, .fitted) %>%
pull(.estimate)
}
model_three <- function(split) {
split_for_model <- analysis(split)
ff_model <- lm(returns ~ MKT + SMB + HML + RMW + CMA, data = split_for_model)
holdout <- assessment(split)
rmse <-
ff_model %>%
augment(newdata = holdout) %>%
rmse(returns, .fitted) %>%
pull(.estimate)
}</code></pre>
<p>Now we pass those three models to the same <code>mutate()</code> with <code>map_dbl()</code> flow that we used with just one model. The result will be three new columns of RMSEs, one for each of our three models applied to our five folds.</p>
<pre class="r"><code>cved_ff %>%
mutate(
rmse_model_1 = map_dbl(
splits,
model_one),
rmse_model_2 = map_dbl(
splits,
model_two),
rmse_model_3 = map_dbl(
splits,
model_three))</code></pre>
<pre><code># 5-fold cross-validation
# A tibble: 5 x 5
splits id rmse_model_1 rmse_model_2 rmse_model_3
* <list> <chr> <dbl> <dbl> <dbl>
1 <split [1K/252]> Fold1 0.00208 0.00211 0.00201
2 <split [1K/252]> Fold2 0.00189 0.00184 0.00178
3 <split [1K/252]> Fold3 0.00201 0.00195 0.00194
4 <split [1K/252]> Fold4 0.00224 0.00221 0.00213
5 <split [1K/251]> Fold5 0.00190 0.00183 0.00177</code></pre>
<p>We can also find the mean RMSE for each model.</p>
<pre class="r"><code>cved_ff %>%
mutate(
rmse_model_1 = map_dbl(
splits,
model_one),
rmse_model_2 = map_dbl(
splits,
model_two),
rmse_model_3 = map_dbl(
splits,
model_three)) %>%
summarise(mean_rmse_model_1 = mean(rmse_model_1),
mean_rmse_model_2 = mean(rmse_model_2),
mean_rmse_model_3 = mean(rmse_model_3))</code></pre>
<pre><code># 5-fold cross-validation
# A tibble: 1 x 3
mean_rmse_model_1 mean_rmse_model_2 mean_rmse_model_3
<dbl> <dbl> <dbl>
1 0.00202 0.00199 0.00192</code></pre>
<p>That code flow worked just fine, but we had to repeat ourselves when creating the functions for each model. Let’s toggle to a flow where we define three models - the ones that we wish to test via cross-validation and RMSE - then pass those models to one function.</p>
<p>First, we use <code>as.formula()</code> to define our three models.</p>
<pre class="r"><code>mod_form_1 <- as.formula(returns ~ MKT)
mod_form_2 <- as.formula(returns ~ MKT + SMB + HML)
mod_form_3 <- as.formula(returns ~ MKT + SMB + HML + RMW + CMA)</code></pre>
<p>Now we write one function that takes <code>split</code> as an argument, same as above, but also takes <code>formula</code> as an argument, so we can pass it different models. This gives us the flexibility to more easily define new models and pass them to <code>map</code>, so I’ll append <code>_flex</code> to the name of this function.</p>
<pre class="r"><code>ff_rmse_models_flex <- function(split, formula) {
split_for_data <- analysis(split)
ff_model <- lm(formula, data = split_for_data)
holdout <- assessment(split)
rmse <-
ff_model %>%
augment(newdata = holdout) %>%
rmse(returns, .fitted) %>%
pull(.estimate)
}</code></pre>
<p>Now we use the same code flow as before, except we call <code>map_dbl()</code>, pass it our <code>cved_ff$splits</code> object, our new <code>flex</code> function called <code>ff_rmse_models_flex()</code>, and the model we wish to pass as the <code>formula</code> argument. First we pass it <code>mod_form_1</code>.</p>
<pre class="r"><code>cved_ff %>%
mutate(rmse_model_1 = map_dbl(cved_ff$splits,
ff_rmse_models_flex,
formula = mod_form_1))</code></pre>
<pre><code># 5-fold cross-validation
# A tibble: 5 x 3
splits id rmse_model_1
* <list> <chr> <dbl>
1 <split [1K/252]> Fold1 0.00208
2 <split [1K/252]> Fold2 0.00189
3 <split [1K/252]> Fold3 0.00201
4 <split [1K/252]> Fold4 0.00224
5 <split [1K/251]> Fold5 0.00190</code></pre>
<p>Now let’s pass it all three models and find the mean RMSE.</p>
<pre class="r"><code>cved_ff %>%
mutate(rmse_model_1 = map_dbl(cved_ff$splits,
ff_rmse_models_flex,
formula = mod_form_1),
rmse_model_2 = map_dbl(cved_ff$splits,
ff_rmse_models_flex,
formula = mod_form_2),
rmse_model_3 = map_dbl(cved_ff$splits,
ff_rmse_models_flex,
formula = mod_form_3)) %>%
summarise(mean_rmse_model_1 = mean(rmse_model_1),
mean_rmse_model_2 = mean(rmse_model_2),
mean_rmse_model_3 = mean(rmse_model_3))</code></pre>
<pre><code># 5-fold cross-validation
# A tibble: 1 x 3
mean_rmse_model_1 mean_rmse_model_2 mean_rmse_model_3
<dbl> <dbl> <dbl>
1 0.00202 0.00199 0.00192</code></pre>
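<p>One further step in the same direction, sketched in base R so it runs on its own: keep the formulas in a named list and loop over them, so adding a model means adding one list entry. Everything here is a stand-in. The data frame <code>toy</code> is simulated, <code>fold_id</code> is a hand-rolled fold label rather than an <code>rsample</code> object, and <code>cv_rmse()</code> is a hypothetical helper, not part of any package.</p>
<pre class="r"><code># Simulated stand-in data: y depends on x1 only; x2 is pure noise.
set.seed(1)
toy <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
toy$y <- 0.5 * toy$x1 + rnorm(100, sd = 0.1)
# Hand-rolled five-fold labels (an assumption, not rsample's internals).
fold_id <- sample(rep(1:5, length.out = nrow(toy)))

# Hypothetical helper: mean held-out RMSE of one formula across all folds.
cv_rmse <- function(formula, data, fold_id) {
  response <- all.vars(formula)[1]
  per_fold <- sapply(sort(unique(fold_id)), function(f) {
    fit <- lm(formula, data = data[fold_id != f, ])
    held_out <- data[fold_id == f, ]
    sqrt(mean((held_out[[response]] - predict(fit, newdata = held_out))^2))
  })
  mean(per_fold)
}

# A named list of models; add an entry to compare another model.
formulas <- list(one_factor = y ~ x1, two_factor = y ~ x1 + x2)
mean_rmse <- sapply(formulas, cv_rmse, data = toy, fold_id = fold_id)
mean_rmse</code></pre>
<p>The result is a named vector with one mean RMSE per formula, which plays the same role as the one-row tibble above.</p>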
<p>Alright, that code flow seems a bit more flexible than our original method of writing a function to assess each model. We didn’t do much hard thinking about functional form here, but hopefully this provides a template where you could assess more nuanced models. We’ll get into bootstrapping and time series work next week, then head to Shiny to ring in the New Year!</p>
<p>And, finally, a couple of public service announcements.</p>
<p>First, thanks to everyone who has checked out my new book! The price just got lowered for the holidays. See on <a href="https://www.amazon.com/Reproducible-Finance-Portfolio-Analysis-Chapman/dp/1138484032">Amazon</a> or on the <a href="https://www.crcpress.com/Reproducible-Finance-with-R-Code-Flows-and-Shiny-Apps-for-Portfolio-Analysis/Jr/p/book/9781138484030">CRC homepage</a> (okay, that was more of an announcement about my book).</p>
<p>Second, applications are open for the <a href="https://www.battlefin.com/">Battlefin</a> alternative data contest, and RStudio is one of the tools you can use to analyze the data. Check it out <a href="https://www.battlefin.com/adc">here</a>. In January, they’ll announce 25 finalists who will get to compete for a cash prize and connect with some quant hedge funds. Go get ’em!</p>
<p>Thanks for reading and see you next time.</p>
<div class="footnotes">
<hr />
<ol>
<li id="fn1"><p>For more on cross-validation, see “An Introduction to Statistical Learning”, chapter 5. Available online here: <a href="http://www-bcf.usc.edu/~gareth/ISL/" class="uri">http://www-bcf.usc.edu/~gareth/ISL/</a>.<a href="#fnref1">↩</a></p></li>
</ol>
</div>
Statistics in Glaucoma: Part II
https://rviews.rstudio.com/2018/12/07/statistics-in-glaucoma-part-ii/
Fri, 07 Dec 2018 00:00:00 +0000https://rviews.rstudio.com/2018/12/07/statistics-in-glaucoma-part-ii/
<p><em>Samuel Berchuck is a Postdoctoral Associate in Duke University’s Department of Statistical Science and Forge-Duke’s Center for Actionable Health Data Science.</em></p>
<p><em>Joshua L. Warren is an Assistant Professor of Biostatistics at Yale University.</em></p>
<div id="analyzing-visual-field-data" class="section level2">
<h2>Analyzing Visual Field Data</h2>
<p>In Part I of this series on statistics in glaucoma, we detailed the use of visual fields for understanding functional vision loss in glaucoma patients. Before discussing a new method for modeling visual field data that accounts for the anatomy of the eye, we discuss how visual field data is typically analyzed by introducing a common diagnostic metric, point-wise linear regression (PLR). PLR is a trend-based diagnostic that uses slope p-values from the location-specific linear regressions to discriminate progression status. The motivation for PLR is straightforward: it assumes that large negative slopes at numerous visual field locations are indicative of progression. This is characteristic of a large class of methods for analyzing visual field data that attempt to discriminate progression based on changes in the differential light sensitivity (DLS) across time. This technique is simple, intuitive, and effective; however, it is often limited due to the naivete of modeling assumptions, including the independence of visual field locations.</p>
</div>
<div id="ocular-anatomy-in-the-neighborhood-structure-of-the-visual-field" class="section level2">
<h2>Ocular Anatomy in the Neighborhood Structure of the Visual Field</h2>
<p>To properly account for the spatial dependencies on the visual field, Berchuck et al. 2018 introduce a neighborhood model that incorporates anatomical information through a dissimilarity metric. Details of the method can be found in Berchuck et al. 2018, but we provide a quick introduction. The key development is the specification of the neighborhood structure through a new definition of adjacency weights. Typically in areal data, the adjacency for two locations <span class="math inline">\(i\)</span> and <span class="math inline">\(j\)</span> is defined as <span class="math inline">\(w_{ij} = 1(i \sim j)\)</span>, where <span class="math inline">\(i \sim j\)</span> is the event that locations <span class="math inline">\(i\)</span> and <span class="math inline">\(j\)</span> are neighbors. As discussed in Part I, this assumption is not sufficient due to the complex anatomy of the eye. To account for this additional structure, a more general adjacency is introduced that is a function of a dissimilarity metric, <span class="math inline">\(w_{ij}(\alpha_t) = 1(i \sim j)\exp\{-z_{ij}\alpha_t\}\)</span>. Here, <span class="math inline">\(z_{ij}\)</span> is a dissimilarity metric that represents the absolute difference between the Garway-Heath angles of locations <span class="math inline">\(i\)</span> and <span class="math inline">\(j\)</span>.</p>
<p>The parameter <span class="math inline">\(\alpha_t\)</span> dictates the importance of the dissimilarity metric at each visual field exam <span class="math inline">\(t\)</span>. When <span class="math inline">\(\alpha_t\)</span> becomes large, the model reduces to an independent process, and as <span class="math inline">\(\alpha_t\)</span> goes to zero, the process becomes the standard spatial model for areal data. Based on the specification of the adjacency weights, <span class="math inline">\(\alpha_t\)</span> has a useful interpretation with respect to deterioration of visual ability. In particular, <span class="math inline">\(\alpha_t\)</span> changing over exams indicates that the neighborhood structure on the visual field is changing, which in turn implies damage to the underlying retinal ganglion cell structure. This observation motivates a diagnostic of progression that quantifies variability in <span class="math inline">\(\alpha_t\)</span> across time. We choose the coefficient of variation (CV) and demonstrate that it is a highly significant predictor of progression, and furthermore, independent of trend-based methods such as PLR.</p>
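<p>The behavior of the adjacency weights at the two extremes of <span class="math inline">\(\alpha_t\)</span> can be checked numerically. The following is a minimal sketch on a toy chain of three locations. The matrix <code>W_toy</code>, the vector <code>angles_toy</code>, and the function <code>adj_weights()</code> are hypothetical illustrations and do not reproduce the scaling or distance options used inside <code>womblR</code>.</p>
<pre class="r"><code># Toy chain of three locations: 1 ~ 2 and 2 ~ 3, but 1 and 3 are not neighbors.
W_toy <- matrix(c(0, 1, 0,
                  1, 0, 1,
                  0, 1, 0), nrow = 3, byrow = TRUE)
# Made-up angles (degrees) standing in for the Garway-Heath angles.
angles_toy <- c(10, 40, 100)
# Dissimilarity z_ij: absolute difference between the angles of i and j.
Z_toy <- abs(outer(angles_toy, angles_toy, "-"))
# w_ij(alpha) = 1(i ~ j) * exp(-z_ij * alpha)
adj_weights <- function(alpha) W_toy * exp(-Z_toy * alpha)
adj_weights(0)    # alpha = 0 recovers the binary adjacency matrix
adj_weights(0.05) # neighbors with more dissimilar angles are down-weighted
adj_weights(100)  # large alpha drives every weight toward zero</code></pre>
<p>Non-neighbors keep a weight of exactly zero for every value of <span class="math inline">\(\alpha_t\)</span>, so the dissimilarity metric only modulates existing adjacencies.</p>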
</div>
<div id="navigating-the-womblr-package" class="section level2">
<h2>Navigating the <code>womblR</code> Package</h2>
<p>To make the method available to clinicians, the R package <code>womblR</code> was developed. The package provides a suite of functions that walk a user through the full process of analyzing a series of visual fields from beginning to end. The user interface was modeled after other impactful R packages for Bayesian spatial analysis, including <code>spBayes</code> and <code>CARBayes</code>. The package name combines Hadley’s naming convention for R packages (i.e., ending a package with the letter R) with the name of the author of the seminal paper on boundary detection, originally referred to as areal wombling (Womble 1951).</p>
<p>We will now walk through the process of analyzing visual field data, estimating the <span class="math inline">\(\alpha_t\)</span> parameters, and assessing progression status. The main function in <code>womblR</code> is the Spatiotemporal Boundary Detection with Dissimilarity Metric model function (<code>STBDwDM</code>). Inference for the method is obtained through Markov chain Monte Carlo (MCMC), which is a computationally intensive method that iterates between updating individual model parameters until enough posterior samples have been collected post-convergence for making accurate posterior inference. Because of the iterative nature of MCMC, the majority of computation is performed within a <code>for</code> loop, so the package is built on C++ through the packages <code>Rcpp</code> and <code>RcppArmadillo</code>. Because of the increased complexity of writing in C++, the pre- and post-processing of the model are done in <code>R</code> with the <code>for</code> loop implemented in C++. The MCMC method employed in <code>womblR</code> is a Metropolis-Hastings within Gibbs algorithm.</p>
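<p>For readers unfamiliar with the term, the structure of a Metropolis-Hastings within Gibbs sampler can be sketched on a much simpler problem. The example below is written in plain R for a toy normal model with simulated data. It is not the sampler implemented in <code>womblR</code>, and the priors, step size, and variable names are all assumptions chosen for illustration. The mean is updated from its full conditional (a Gibbs step), while the log standard deviation is updated with a random-walk Metropolis step.</p>
<pre class="r"><code># Toy model: y_i ~ N(mu, sigma^2), with assumed priors
# mu ~ N(0, 100^2) and log(sigma) ~ N(0, 10^2).
set.seed(54)
y <- rnorm(200, mean = 2, sd = 1.5) # simulated data
n_iter <- 5000
mu_draws <- numeric(n_iter)
log_sigma_draws <- numeric(n_iter)
mu_cur <- 0
ls_cur <- 0
prior_sd_mu <- 100
prior_sd_ls <- 10
step_sd <- 0.1 # Metropolis proposal standard deviation

# Log posterior of log(sigma) given mu, up to an additive constant.
log_post_ls <- function(ls, mu) {
  sum(dnorm(y, mean = mu, sd = exp(ls), log = TRUE)) +
    dnorm(ls, mean = 0, sd = prior_sd_ls, log = TRUE)
}

for (s in seq_len(n_iter)) {
  # Gibbs step: the full conditional of mu is normal (conjugate update).
  post_var <- 1 / (length(y) / exp(2 * ls_cur) + 1 / prior_sd_mu^2)
  post_mean <- post_var * sum(y) / exp(2 * ls_cur)
  mu_cur <- rnorm(1, mean = post_mean, sd = sqrt(post_var))
  # Metropolis step: symmetric random-walk proposal for log(sigma).
  ls_prop <- rnorm(1, mean = ls_cur, sd = step_sd)
  log_ratio <- log_post_ls(ls_prop, mu_cur) - log_post_ls(ls_cur, mu_cur)
  if (log(runif(1)) < log_ratio) ls_cur <- ls_prop
  mu_draws[s] <- mu_cur
  log_sigma_draws[s] <- ls_cur
}

keep <- 1001:n_iter # discard the first 1,000 scans as burn-in
c(mean_mu = mean(mu_draws[keep]), mean_sigma = mean(exp(log_sigma_draws[keep])))</code></pre>
<p>Each pass through the loop is one scan. Parameters with a tractable full conditional get an exact draw, and the remaining parameters get an accept-or-reject update, which is where tuning variances come into play.</p>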
<p>Just as a quick aside, with the more recent advent of probabilistic programming, this model could have been implemented using the Hamiltonian Monte Carlo methods used in software like Stan or PyMC3. These programs do not require the derivation of full conditionals, and push the MCMC algorithm to the background. There is undoubtedly a huge market for this type of software, and it is clearly playing a significant role in the popularization of Bayesian modeling. At the same time, implementing MCMC samplers using <code>Rcpp</code> with traditional MCMC algorithms can be instructive, and for those with experience, nearly as quick of a coding experience.</p>
<p>We now begin by formatting the visual field data for analysis. According to the manual, the observed data <code>Y</code> must first be ordered spatially and then temporally. Furthermore, we will remove all locations that correspond to the natural blind spot (which, in the Humphrey Field Analyzer-II, correspond to locations 26 and 35).</p>
<pre class="r"><code>###Load package
library(womblR)
###Format data
blind_spot <- c(26, 35) # define blind spot
VFSeries <- VFSeries[order(VFSeries$Location), ] # sort by location
VFSeries <- VFSeries[order(VFSeries$Visit), ] # sort by visit
VFSeries <- VFSeries[!VFSeries$Location %in% blind_spot, ] # remove blind spot locations
Y <- VFSeries$DLS # define observed outcome data</code></pre>
<p>Now that we have assigned the observed outcomes to <code>Y</code>, we move on to the temporal variable <code>Time</code>. For visual field data, we define this to be the time from the baseline visit. We obtain the unique days from the baseline visit and scale them to be on the year scale.</p>
<pre class="r"><code>Time <- unique(VFSeries$Time) / 365 # years since baseline visit
print(Time)</code></pre>
<pre><code>## [1] 0.0000000 0.3452055 0.6520548 1.1123288 1.3808219 1.6109589 2.0712329
## [8] 2.3780822 2.5698630</code></pre>
<p>Next, we assign the adjacency matrix and dissimilarity metric (both discussed in Part I).</p>
<pre class="r"><code>W <- HFAII_Queen[-blind_spot, -blind_spot] # visual field adjacency matrix
DM <- GarwayHeath[-blind_spot] # Garway-Heath angles</code></pre>
<p>Now that we have specified the data objects <code>Y</code>, <code>DM</code>, <code>W</code>, and <code>Time</code>, we will customize the objects that characterize Bayesian MCMC methods, in particular, hyperparameters, starting values, Metropolis tuning values, and MCMC inputs. These objects have been detailed previously in the <code>womblR</code> package <a href="https://cran.r-project.org/web/packages/womblR/vignettes/womblR-example.html">vignette</a>, so we will not spend time going over their definitions. We will only note that they are each <code>list</code> objects, similar to those used by the <code>spBayes</code> package. We begin by specifying the hyperparameters.</p>
<pre class="r"><code>###Bounds for temporal tuning parameter phi
TimeDist <- abs(outer(Time, Time, "-"))
TimeDistVec <- TimeDist[lower.tri(TimeDist)]
minDiff <- min(TimeDistVec)
maxDiff <- max(TimeDistVec)
PhiUpper <- -log(0.01) / minDiff # shortest diff goes down to 1%
PhiLower <- -log(0.95) / maxDiff # longest diff goes up to 95%
###Hyperparameter object
Hypers <- list(Delta = list(MuDelta = c(3, 0, 0), OmegaDelta = diag(c(1000, 1000, 1))),
T = list(Xi = 4, Psi = diag(3)),
Phi = list(APhi = PhiLower, BPhi = PhiUpper))</code></pre>
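<p>The bounds on the temporal parameter are easier to read once the implied correlations are written out. The sketch below assumes an exponential temporal correlation of the form <span class="math inline">\(\exp\{-\phi |t - t'|\}\)</span>, which is our reading of the comments in the code above rather than a statement of the package internals. It reuses the printed values of <code>Time</code>, and the lower-case variable names are ours.</p>
<pre class="r"><code># Visit times in years, copied from the printed Time vector above.
time_years <- c(0.0000000, 0.3452055, 0.6520548, 1.1123288, 1.3808219,
                1.6109589, 2.0712329, 2.3780822, 2.5698630)
time_diffs <- as.vector(dist(time_years)) # all pairwise time differences
phi_upper <- -log(0.01) / min(time_diffs)
phi_lower <- -log(0.95) / max(time_diffs)
# At the upper bound, the two closest visits have correlation 0.01.
exp(-phi_upper * min(time_diffs))
# At the lower bound, the two most distant visits have correlation 0.95.
exp(-phi_lower * max(time_diffs))</code></pre>
<p>In other words, the bounds let the data choose anything between nearly independent visits and nearly perfectly correlated visits.</p>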
<p>Then we specify the starting values for the parameters, Metropolis tuning variances, and MCMC details.</p>
<pre class="r"><code>###Starting values
Starting <- list(Delta = c(3, 0, 0), T = diag(3), Phi = 0.5)
###Metropolis tuning variances
Nu <- length(Time) # calculate number of visits
Tuning <- list(Theta2 = rep(1, Nu), Theta3 = rep(1, Nu), Phi = 1)
###MCMC inputs
MCMC <- list(NBurn = 10000, NSims = 250000, NThin = 25, NPilot = 20)</code></pre>
<p>We specify that our model will run for a burn-in period of 10,000 scans, followed by 250,000 scans post burn-in. In the burn-in period there will be 20 iterations of pilot adaptation evenly spaced out over the period. The final number of samples to be used for inference will be thinned down to 10,000 based on the thinning number of 25. We can now run the MCMC sampler. Details of the various options available in the sampler can be found in the documentation, <code>help(STBDwDM)</code>.</p>
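<p>The bookkeeping implied by these settings can be verified directly. This is simple arithmetic on the values in the <code>MCMC</code> list. Which exact scans are retained after thinning is an assumption here (every 25th scan, starting at scan 25), and the variable names are ours.</p>
<pre class="r"><code>n_burn <- 10000
n_sims <- 250000
n_thin <- 25
n_pilot <- 20
# Number of posterior samples retained after thinning.
n_kept <- n_sims / n_thin
# Assumed indices of the retained scans: every 25th scan after burn-in.
kept_scans <- seq(n_thin, n_sims, by = n_thin)
# Evenly spaced pilot adaptation: scans between adaptations during burn-in.
pilot_spacing <- n_burn / n_pilot
c(n_kept = n_kept, pilot_spacing = pilot_spacing)</code></pre>
<p>Thinning trades some information for less autocorrelation between stored draws and a smaller object to keep in memory.</p>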
<pre class="r"><code>reg.STBDwDM <- STBDwDM(Y = Y, DM = DM, W = W, Time = Time,
Starting = Starting, Hypers = Hypers, Tuning = Tuning, MCMC = MCMC,
Family = "tobit",
TemporalStructure = "exponential",
Distance = "circumference",
Weights = "continuous",
Rho = 0.99,
ScaleY = 10,
ScaleDM = 100,
Seed = 54)
## Burn-in progress: |*************************************************|
## Sampler progress: 0%.. 10%.. 20%.. 30%.. 40%.. 50%.. 60%.. 70%.. 80%.. 90%.. 100%.. </code></pre>
<p>We quickly assess convergence by checking the traceplots of <span class="math inline">\(\alpha_t\)</span> (note that further MCMC convergence diagnostics should be used in practice).</p>
<pre class="r"><code>###Load coda package
library(coda)
###Convert alpha to an MCMC object
Alpha <- as.mcmc(reg.STBDwDM$alpha)
###Create traceplot
par(mfrow = c(3, 3))
for (t in 1:Nu) traceplot(Alpha[, t], ylab = bquote(alpha[.(t)]), main = bquote(paste("Posterior of " ~ alpha[.(t)])))</code></pre>
<p><img src="/post/2018-12-03-statistics-in-glaucoma-part-ii_files/figure-html/unnamed-chunk-8-1.png" width="689.28" /></p>
</div>
<div id="converting-mcmc-samples-into-clinical-statements" class="section level2">
<h2>Converting MCMC Samples into Clinical Statements</h2>
<p>Now we calculate the posterior distribution of the CV of <span class="math inline">\(\alpha_t\)</span> and print its moments.</p>
<pre class="r"><code>CVAlpha <- apply(Alpha, 1, function(x) sd(x) / mean(x))
plot(density(CVAlpha, adjust = 2), main = expression("Posterior of CV"~(alpha[t])), xlab = expression("CV"~(alpha[t])))</code></pre>
<p><img src="/post/2018-12-03-statistics-in-glaucoma-part-ii_files/figure-html/unnamed-chunk-9-1.png" width="50%" style="display: block; margin: auto;" /></p>
<pre class="r"><code>STCV <- c(mean(CVAlpha), sd(CVAlpha), quantile(CVAlpha, probs = c(0.025, 0.975)))
names(STCV)[1:2] <- c("Mean", "SD")
print(STCV)</code></pre>
<pre><code>## Mean SD 2.5% 97.5%
## 0.19121622 0.10205826 0.04636219 0.42744656</code></pre>
<p>For this information to be useful clinically, we convert it into a probability of progression based on a model trained on a large cohort of glaucoma patients (Berchuck et al. 2019). Because the information from <span class="math inline">\(\alpha_t\)</span> is independent of trend-based methods, we show that the optimal use of <span class="math inline">\(\alpha_t\)</span> is combining it with a basic global metric that includes the slope and p-value (and their interaction) of the overall mean at each visual field exam. The trained model coefficients are publicly available and are used below. Furthermore, the posterior mean and standard deviation of the CV of <span class="math inline">\(\alpha_t\)</span>, along with their interaction, are included. The probability of progression can be calculated as follows.</p>
<pre class="r"><code>###Calculate the global metric slope and p-value
MeanSens <- apply(t(matrix(VFSeries$DLS, ncol = Nu)) / 10, 1, mean) # scaled mean DLS
reg.global <- lm(MeanSens ~ Time) # global regression
GlobalS <- summary(reg.global)$coef[2, 1] # global slope
GlobalP <- summary(reg.global)$coef[2, 4] # global p-value
###Obtain probability of progression using estimated parameters from Berchuck et al. 2019
input <- c(1, GlobalP, GlobalS, STCV[1], STCV[2], GlobalS * GlobalP, STCV[1] * STCV[2])
coef <- c(-1.7471655, -0.2502131, -13.7317622, 7.4746348, -8.9152523, 18.6964153, -13.3706058)
fit <- input %*% coef
exp(fit) / (1 + exp(fit))</code></pre>
<pre><code>## [,1]
## [1,] 0.4355997</code></pre>
<p>The probability of progression is calculated to be 0.44, which can be compared to the threshold cutoff for the trained model of 0.325. This cutoff for the probability of progression was determined using operating characteristics, so that the specificity was forced to be in the clinically meaningful range of 85%. Based on this derived threshold, the probability of progression is high enough to indicate that this patient’s disease shows evidence of visual field progression (which is reassuring, because we know this patient has progression as determined by clinicians).</p>
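<p>The last two steps, the inverse logit transform and the comparison against the cutoff, can be isolated in a short sketch. Because the global slope and p-value are not printed above, we back out the linear predictor from the printed probability with <code>qlogis()</code>. The names <code>inv_logit</code>, <code>linear_predictor</code>, and <code>is_progressing</code> are ours.</p>
<pre class="r"><code># Inverse logit, written out as in the code above.
inv_logit <- function(x) exp(x) / (1 + exp(x))
# Linear predictor implied by the printed probability of progression.
linear_predictor <- qlogis(0.4355997)
# Base R's plogis() is the same transform.
prob_progression <- plogis(linear_predictor)
all.equal(inv_logit(linear_predictor), prob_progression)
# Compare against the trained model's cutoff.
threshold <- 0.325
is_progressing <- prob_progression > threshold
is_progressing</code></pre>
<p>Using <code>plogis()</code> in place of the explicit ratio is also a little safer numerically for very large linear predictors, where <code>exp()</code> can overflow.</p>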
<p><code>Looking ahead:</code> The third installment will wrap up the discussion on the <code>womblR</code> package and ponder future directions for the role of statistics in glaucoma research. Furthermore, the role of open-source software in medicine will be discussed.</p>
</div>
<div id="references" class="section level2">
<h2>References</h2>
<ol style="list-style-type: decimal">
<li>Berchuck, S.I., Mwanza, J.C., & Warren, J.L. (2018). <a href="https://arxiv.org/abs/1805.11636"><em>Diagnosing Glaucoma Progression with Visual Field Data Using a Spatiotemporal Boundary Detection Method</em></a>, In press at <em>Journal of the American Statistical Association</em>.</li>
<li>Womble, W. H. (1951). <a href="http://science.sciencemag.org/content/114/2961/315"><em>Differential Systematics</em></a>. <em>Science</em>, 114(2961), 315-322.</li>
<li>Berchuck, S.I., Mwanza, J.C., Tanna, A.P., Budenz, D.L., Warren, J.L. (2019). <em>Improved Detection of Visual Field Progression Using a Spatiotemporal Boundary Detection Method</em>. In press at <em>Scientific Reports</em> (Available upon request).</li>
</ol>
</div>
Statistics in Glaucoma: Part I
https://rviews.rstudio.com/2018/12/03/statistics-in-glaucoma-part-i/
Mon, 03 Dec 2018 00:00:00 +0000https://rviews.rstudio.com/2018/12/03/statistics-in-glaucoma-part-i/
<p><em>Samuel Berchuck is a Postdoctoral Associate in Duke University’s Department of Statistical Science and Forge-Duke’s Center for Actionable Health Data Science.</em></p>
<p><em>Joshua L. Warren is an Assistant Professor of Biostatistics at Yale University.</em></p>
<div id="introduction" class="section level2">
<h2>Introduction</h2>
<p>Glaucoma is a leading cause of blindness worldwide, with a prevalence of 4% in the population aged 40-80. The disease is characterized by retinal ganglion cell death and corresponding damage to the optic nerve head. Since visual impairment caused by glaucoma is irreversible and efficient treatments exist, early detection of the disease is essential. Determining if the disease is progressing remains one of the most challenging aspects of glaucoma management, since it is difficult to distinguish true progression from variability due to natural degradation or noise. In practice, clinicians monitor progression using a multifactorial approach that relies on various measurements of the disease. In this series of blog posts, we focus on the use of visual fields. Visual field examinations obtain levels of a patient’s actual vision, and the practice is thus referred to as a functional measurement. As such, visual fields are a proxy for a patient’s quality of life, and therefore are typically prioritized in practice.</p>
</div>
<div id="visual-field-data" class="section level2">
<h2>Visual Field Data</h2>
<p>Visual fields are complex spatiotemporal data generated from an intricate anatomical system, which is important to understand for modeling purposes. To illustrate visual field data, we load an example data set from the <code>womblR</code> package on CRAN. The package <code>womblR</code> was developed specifically for analyzing visual field data, and uses a Bayesian hierarchical model that accounts for the complex nature of the data (more details will be provided in Part II). The specific data set comes from the Vein Pulsation Study Trial in Glaucoma and the Lions Eye Institute trial registry, Perth, Western Australia. We begin by loading the package.</p>
<pre class="r"><code>library(womblR)</code></pre>
<p>The data set of interest is loaded lazily and can be accessed as follows; we also view the first six rows for illustration.</p>
<pre class="r"><code>data(VFSeries)
head(VFSeries)</code></pre>
<pre><code>## Visit DLS Time Location
## 1 1 25 0 1
## 2 2 23 126 1
## 3 3 23 238 1
## 4 4 23 406 1
## 5 5 24 504 1
## 6 6 21 588 1</code></pre>
<p>The data object <code>VFSeries</code> contains a longitudinal series of visual fields for a glaucoma patient that we will use throughout the three blog posts to exemplify the study of visual fields. This patient has been determined to be progressing, based on the expertise of two clinicians. <code>VFSeries</code> has four variables: <code>Visit</code>, <code>DLS</code>, <code>Time</code>, and <code>Location</code>. The variable <code>Visit</code> represents the visual field test visit number, <code>DLS</code> the observed measure, <code>Time</code> the time of the visual field test (in days from baseline visit), and <code>Location</code> the spatial location on the visual field where the observation occurred. There are 9 visual field exams contained in this data set, and on average 117.25 days between visits.</p>
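<p>The visit summary quoted above can be reproduced from the visit times alone. The first six values below are the <code>Time</code> entries printed by <code>head(VFSeries)</code>. The last three are recovered from the visit times in years reported in Part II (multiplied by 365), so treat them as derived rather than read directly from the data set.</p>
<pre class="r"><code># Days from baseline for the nine visits.
visit_days <- c(0, 126, 238, 406, 504, 588, 756, 868, 938)
n_visits <- length(visit_days)
# Gaps between consecutive visits, and their average.
gaps <- diff(visit_days)
mean_gap <- mean(gaps)
c(n_visits = n_visits, mean_gap = mean_gap)</code></pre>
<p>With the full data set loaded, the same quantities should be available from <code>unique(VFSeries$Time)</code>.</p>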
<p>To help visualize the dataframe, we can use the <code>PlotVfTimeSeries</code> function. <code>PlotVfTimeSeries</code> is a function that plots a patient’s observed visual acuity over time at each location on the visual field.</p>
<pre class="r"><code>PlotVfTimeSeries(Y = VFSeries$DLS,
Location = VFSeries$Location,
Time = VFSeries$Time,
main = "Visual field sensitivity time series \n at each location",
xlab = "Days from baseline visit",
ylab = "Differential light sensitivity (dB)",
line.reg = FALSE)</code></pre>
<p><img src="/post/2018-11-19-statistics-in-glaucoma-part-i_files/figure-html/unnamed-chunk-3-1.png" width="528" style="display: block; margin: auto;" /></p>
<p>The above figure demonstrates the visual field from a Humphrey Field Analyzer-II (HFA-II) testing machine, which generates 54 spatial locations (only 52 informative locations; note the 2 blank spots corresponding to the blind spot). The visual field map is constructed by assessing a patient’s response to varying levels of light. Patients are instructed to focus on a central fixation point as light is introduced randomly in a preceding manner over a grid on the visual field. As light is observed, the patient presses a button and the current light intensity is recorded. The process is repeated until the entire visual field is tested. The light intensity is measured in differential light sensitivity (DLS), which quantifies the difference in the HFA-II background and observed light intensity. Smaller values indicate worsening vision.</p>
</div>
<div id="spatial-anatomy-on-the-visual-field" class="section level2">
<h2>Spatial Anatomy on the Visual Field</h2>
<p>The spatial surface of the visual field is observed on a lattice (i.e., uniform areal data); however, it is a complex projection of the underlying optic nerve head and exhibits anatomically induced spatial dependencies. In particular, localized damage to the optic disc can result in clinically deterministic deterioration across the visual field. Incorporating this non-standard spatial dependence structure into our methodology is a priority for properly analyzing these data, although it is commonly ignored. Translating this into math lingo, this means that a naive modeling of the spatial surface of the visual field would be inappropriate (i.e., neighbors defined through adjacent locations). Instead, the definition of a neighbor when considering vision loss on the visual field must depend on the underlying anatomical proximities.</p>
<p>To illustrate this concept, we begin by displaying the visual field neighborhood structure. The adjacency matrix for the HFA-II is available in the <code>womblR</code> package. In this analysis, we use a queen specification, meaning that an adjacency is defined as any location that shares an edge or corner on the lattice. We now load this adjacency matrix and remove the two locations that correspond to the blind spot.</p>
<pre class="r"><code>blind_spot <- c(26, 35) # define blind spot
W <- HFAII_Queen[-blind_spot, -blind_spot] # HFA-II visual field adjacency matrix</code></pre>
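<p>To make the queen specification concrete, here is a minimal base R sketch that builds a queen adjacency matrix for a small rectangular lattice. The helper <code>queen_adjacency()</code> and the 3 x 3 grid are hypothetical illustrations; this is not how <code>womblR</code> constructs <code>HFAII_Queen</code>, and the toy grid is not the HFA-II layout.</p>
<pre class="r"><code># Build a queen-contiguity adjacency matrix for an nr x nc lattice.
# Cells are numbered row by row. Hypothetical helper, for illustration only.
queen_adjacency <- function(nr, nc) {
  rows <- rep(seq_len(nr), each = nc)   # row index of each cell
  cols <- rep(seq_len(nc), times = nr)  # column index of each cell
  # Two distinct cells are queen neighbors when they are at most one
  # step apart in both the row and the column direction.
  dr <- abs(outer(rows, rows, "-"))
  dc <- abs(outer(cols, cols, "-"))
  W <- (dr <= 1 & dc <= 1) * 1
  diag(W) <- 0                          # a cell is not its own neighbor
  W
}

W_toy <- queen_adjacency(3, 3)
rowSums(W_toy)  # number of neighbors for each cell</code></pre>
<p>On the toy grid, corner cells have three neighbors, edge cells have five, and the center cell has eight, which is the pattern a queen specification should produce.</p>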
<p>This adjacency structure can be displayed using the <code>graph.adjacency</code> function in the <code>igraph</code> package.</p>
<pre class="r"><code>library(igraph)
adj.graph <- graph.adjacency(W, mode = "undirected")
plot(adj.graph)</code></pre>
<p><img src="/post/2018-11-19-statistics-in-glaucoma-part-i_files/figure-html/unnamed-chunk-5-1.png" width="528" style="display: block; margin: auto;" /></p>
<p>As mentioned above, naively assuming that all of these adjacencies are equal ignores the important underlying anatomy that enforces these dependencies. This anatomical relationship of the visual field test points and the underlying optic nerve head was studied by Garway-Heath et al. (2000), in which they estimated the angle that each test location’s underlying retinal ganglion cells enter the optic disc, measured in degrees. These angles are the missing link that will allow the visual field adjacency structure to be dictated by the underlying anatomy. These angles can be visualized using the function <code>PlotAdjacency</code> from <code>womblR</code>, which displays neighborhood structures across the visual field. Before using this function, we need to load the angles measured in Garway-Heath et al. (2000). These are available from <code>womblR</code>; again, we remove the blind spot before using.</p>
<pre class="r"><code>Angles <- GarwayHeath[-blind_spot] # Garway-Heath angles
summary(Angles)</code></pre>
<pre><code>## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 11.00 80.75 192.50 177.35 275.75 329.00</code></pre>
<p>We are now ready to visualize the neighborhood structure of the visual field using the <code>PlotAdjacency</code> function.</p>
<pre class="r"><code>###Plot the angles on the visual field
PlotAdjacency(W = W,
DM = Angles,
zlim = c(0, 180),
Visit = NA,
edgewidth = 3.75,
cornerwidth = 0.33,
lwd.border = 3.75,
main = "Garway-Heath angles\n across the visual field")</code></pre>
<p><img src="/post/2018-11-19-statistics-in-glaucoma-part-i_files/figure-html/unnamed-chunk-7-1.png" width="528" style="display: block; margin: auto;" /></p>
<p>The angles measured by Garway-Heath et al. are presented at each location on the visual field. More interestingly, the distances between these angles are presented for each of the neighbor pairs. This figure is equivalent to the adjacency plot displayed above, but allows the adjacencies to vary as a function of the anatomy. In particular, if two visual field locations are anatomically similar, the dependency is strengthened (i.e., more white), and if the locations are close to anatomically independent, the dependency is weaker (i.e., more black). Here the edge adjacencies are represented by lines, while the diagonal adjacencies are represented as two triangles. This view of the visual field details the anatomical importance in modeling visual field data, as neighboring locations can have underlying retinal ganglion cells that enter the optic nerve head with a large degree of separation. In particular, locations on either side of the equator, although adjacent, are close to anatomically independent.</p>
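<p>The post does not spell out how the distance between two angles is computed. One plausible reading, consistent with the <code>zlim = c(0, 180)</code> range used in the plot, is the circular distance in degrees. The sketch below works under that assumption; <code>angle_dist()</code> and the toy angles are hypothetical and are not part of <code>womblR</code>.</p>
<pre class="r"><code># Circular distance (in degrees) between two angles: the shorter way
# around the circle, so the result always lies in [0, 180].
# Assumed definition, for illustration only.
angle_dist <- function(a, b) {
  d <- abs(a - b) %% 360
  pmin(d, 360 - d)
}

# Made-up angles for three neighboring test locations
# (not taken from the Garway-Heath data).
toy_angles <- c(80, 95, 275)
angle_dist(toy_angles[1], toy_angles[2])  # small: anatomically similar
angle_dist(toy_angles[1], toy_angles[3])  # large: closer to independent
angle_dist(350, 10)                       # wraps around the 0/360 boundary</code></pre>
<p>Under this definition, two adjacent locations whose angles differ by nearly 180 degrees would be treated as close to anatomically independent, even though they touch on the lattice.</p>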
</div>
<div id="how-to-model-visual-field-data" class="section level2">
<h2>How to Model Visual Field Data?</h2>
<p>If you have gotten this far in the post, hopefully you have the sense that the study of visual field data is statistically interesting and clinically important for properly assessing a glaucoma patient’s risk of vision loss. In the next two blog posts, we will explore how visual field data are currently analyzed and new methods that account for the anatomical structure detailed above. To accomplish this, we will break down the algorithm and software used to build the <code>womblR</code> package, and will attempt to illustrate the importance of R packages for open-source clinical research.</p>
</div>
<div id="reference" class="section level2">
<h2>Reference</h2>
<ol style="list-style-type: decimal">
<li>Garway-Heath, David F., Darmalingum Poinoosawmy, Frederick W. Fitzke, and Roger A. Hitchings. “Mapping the visual field to the optic disc in normal tension glaucoma eyes” <em>Ophthalmology</em> 107, no. 10 (2000): 1809-1815.</li>
</ol>
</div>
<script>window.location.href='https://rviews.rstudio.com/2018/12/03/statistics-in-glaucoma-part-i/';</script>
October 2018: “Top 40” New Packages
https://rviews.rstudio.com/2018/11/29/october-2018-top-40-new-packages/
Thu, 29 Nov 2018 00:00:00 +0000https://rviews.rstudio.com/2018/11/29/october-2018-top-40-new-packages/
<p>One hundred eighty-five new packages made it to CRAN in October. Here are my picks for the “Top 40” in eight categories: Computational Methods, Data, Machine Learning, Medicine, Science, Statistics, Utilities, and Visualization.</p>
<h3 id="computational-methods">Computational Methods</h3>
<p><a href="https://cran.r-project.org/package=compboost">compboost</a> v0.1.0: Provides a C++ implementation of component-wise boosting written to obtain high run-time performance and full memory control. The <a href="https://cran.r-project.org/web/packages/compboost/vignettes/compboost.html">vignette</a> shows how to use the package.</p>
<p><a href="https://cran.r-project.org/package=RcppEnsmallen">RcppEnsmallen</a> v0.1.10.0.1: Implements an interface to the C++ based <a href="http://ensmallen.org/">Ensmallen</a> mathematical optimization library that provides a simple set of abstractions for writing an objective function to optimize. Optimizers include full-batch gradient descent techniques, small-batch techniques, gradient-free optimizers, and constrained optimization.</p>
<p><a href="https://cran.r-project.org/package=SAMCpack">SAMCpack</a> v0.1.1: Implements Stochastic Approximation Monte Carlo (SAMC) samplers capable of sampling from multimodal or doubly intractable distributions. See <a href="doi:10.1002/9780470669723">Liang et al (2010)</a> for a complete introduction to the method, and the <a href="https://cran.r-project.org/package=SAMCpack">vignette</a> for an introduction to the package.</p>
<h3 id="data">Data</h3>
<p><a href="https://cran.r-project.org/package=crimedata">crimedata</a> v0.1.0: Provides access to publicly available, police-recorded open crime data from large cities in the United States that are included in the <a href="https://osf.io/zyaqn/">Crime Open Database</a>.</p>
<p><a href="https://cran.r-project.org/web/packages/nasapower/index.html">nasapower</a> v1.02: Implements an interface to <a href="https://power.larc.nasa.gov/"><code>POWER</code> (Prediction Of Worldwide Energy Resource)</a>, NASA’s global meteorology, surface solar energy, and climatology data API. Look <a href="https://ropensci.github.io/nasapower/">here</a> for a quick start.</p>
<p><a href="https://cran.r-project.org/package=wikisourcer">wikisourcer</a> v0.1.1: Provides access to public domain works from <a href="https://wikisource.org/">Wikisource</a>, a free library from the Wikimedia Foundation project. See the <a href="https://cran.r-project.org/web/packages/wikisourcer/vignettes/wikisourcer.html">vignette</a> for a package tutorial.</p>
<p><img src="/post/2018-11-19-Rickert-OctTop40_files/wikisourcer.png" height = "400" width="600"></p>
<h3 id="machine-learning">Machine Learning</h3>
<p><a href="https://cran.r-project.org/package=gcForest">gcForest</a> v0.2.7: Provides an API interface to the <a href="https://github.com/pylablanche/gcForest">Python implementation</a> of Deep Forest, an alternative to Deep Learning. The algorithm is described in <a href="arXiv:1702.08835v2">Zhou and Feng (2017)</a>, and there is a brief package <a href="https://cran.r-project.org/web/packages/gcForest/vignettes/gcForest-docs.html">tutorial</a>.</p>
<p><a href="https://cran.r-project.org/package=galgo">galgo</a> v1.4: Allows users to build multivariate predictive models from large data sets having a far larger number of features than samples, such as in functional genomics data sets. See <a href="doi:10.1093/bioinformatics/btl074">Trevino and Falciani (2006)</a> for details.</p>
<p><a href="https://cran.r-project.org/package=MachineShop">MachineShop</a> v0.2.0: Provides a common interface for machine learning model fitting, prediction, performance assessment, and presentation of results. There is an <a href="https://cran.r-project.org/web/packages/MachineShop/vignettes/Introduction.html">Introduction</a> and a note on <a href="https://cran.r-project.org/web/packages/MachineShop/vignettes/MLModels.html">Implementation Conventions</a>.</p>
<p><img src="/post/2018-11-19-Rickert-OctTop40_files/MachineShop.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=mlflow">mlflow</a> v0.8.0: Provides an interface to <a href="https://mlflow.org/"><code>MLflow</code></a>, an open-source platform for the complete machine learning life cycle that supports installation, tracking experiments, running projects, and saving models.</p>
<p><a href="https://cran.r-project.org/package=sboost">sboost</a> v0.1.0: Provides a fast, C++-based implementation of Freund and Schapire’s Adaptive Boosting (AdaBoost) algorithm, and includes methods for classifier assessment, predictions, and cross-validation.</p>
<h3 id="medicine">Medicine</h3>
<p><a href="https://cran.r-project.org/package=CoRpower">CoRpower</a> v1.0.0: Provides functions to calculate power for assessment of intermediate biomarker responses as correlates of risk in the active treatment group in clinical efficacy trials, as described in <a href="https://www.ncbi.nlm.nih.gov/pubmed/27037797">Gilbert et al. (2016)</a>. The <a href="https://cran.r-project.org/web/packages/CoRpower/vignettes/CoRpower.html">vignette</a> demonstrates the math.</p>
<p><a href="https://cran.r-project.org/package=radtools">radtools</a> v1.0.0: Provides a collection of utilities for navigating medical image data in DICOM and NIfTI formats. An emphasis on metadata allows simple conversion of image metadata to familiar R data structures, such as lists and data frames. The <a href="https://cran.r-project.org/web/packages/radtools/vignettes/radtools_usage.html">vignette</a> shows how to use the package.</p>
<p><img src="/post/2018-11-19-Rickert-OctTop40_files/radtools.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=rpact">rpact</a> v1.0.0: Provides functions for designing and analyzing confirmatory adaptive clinical trials with continuous, binary, and survival endpoints according to the methods described in the monograph by <a href="doi:10.1007/978-3-319-32562-0">Wassmer and Brannath (2016)</a>. Look <a href="https://www.rpact.org/">here</a> for an overview.</p>
<h3 id="science">Science</h3>
<p><a href="https://cran.r-project.org/package=ClimProjDiags">ClimProjDiags</a> v0.0.1: Provides functions for computing metrics and indices for climate analysis, comparing models, and combining them into ensembles. There are vignettes on <a href="https://cran.r-project.org/web/packages/ClimProjDiags/vignettes/anomaly_agreement.html">anomaly agreement</a>, <a href="https://cran.r-project.org/web/packages/ClimProjDiags/vignettes/diurnaltemp.html">diurnal temperatures</a>, <a href="https://cran.r-project.org/web/packages/ClimProjDiags/vignettes/extreme_indices.html">extreme indices</a>, and <a href="https://cran.r-project.org/web/packages/ClimProjDiags/vignettes/heatcoldwaves.html">heat and cold wave duration</a>.</p>
<p><a href="https://cran.r-project.org/package=DEVis">DEVis</a> v1.0.0: Provides a comprehensive tool set for data aggregation, visual analytics, exploratory analysis, and project management that builds upon the Bioconductor <a href="http://bioconductor.org/packages/release/bioc/html/DESeq2.html">DESeq2</a> differential expression package. The <a href="https://cran.r-project.org/web/packages/DEVis/vignettes/DEVis_vignette.pdf">vignette</a> offers a comprehensive introduction.</p>
<p><img src="/post/2018-11-19-Rickert-OctTop40_files/DEVis.png" height = "300" width="400"></p>
<p><a href="https://cran.r-project.org/package=epimdr">epimdr</a> v0.6-1: Provides functions for studying epidemics, including the <a href="http://www.public.asu.edu/~hnesse/classes/seir.html">S(E)IR model</a>, time-series SIR and chain-binomial stochastic models, catalytic disease models, and coupled map lattice models. It is a companion to the book <a href="https://www.springer.com/gp/book/9783319974866">Epidemics: Models and Data in R</a> and the Coursera course <a href="https://www.coursera.org/learn/epidemics">Epidemics Massive Online Open Course</a>.</p>
<p><a href="https://cran.r-project.org/package=firebehavioR">firebehavioR</a> v0.1.1: Implements fire behavior prediction models, including those documented in <a href="doi:10.2737/RMRS-RP-29">Scott & Reinhardt (2001)</a> and <a href="doi:10.1016/j.foreco.2006.08.174">Alexander et al. (2006)</a>. The <a href="https://cran.r-project.org/web/packages/firebehavioR/vignettes/firebehavioR.html">vignette</a> is informative.</p>
<p><img src="/post/2018-11-19-Rickert-OctTop40_files/firebehavioR.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=lorentz">lorentz</a> v1.0.0: Provides the functionality to work with Lorentz transforms and the gyrogroup structure in <a href="https://en.wikipedia.org/wiki/Special_relativity">Special Relativity</a>.</p>
<p><img src="/post/2018-11-19-Rickert-OctTop40_files/lorentz.png" height = "300" width="400"></p>
<p><a href="https://cran.r-project.org/package=pubchunks">pubchunks</a> v0.1.0: Provides functions for extracting chunks of XML from scholarly articles without having to know how to work with XML. See <a href="https://cran.r-project.org/web/packages/pubchunks/readme/README.html">README</a> to get going.</p>
<h3 id="statistics">Statistics</h3>
<p><a href="https://cran.r-project.org/package=BayesMallows">BayesMallows</a> v0.1.1: Implements the Bayesian version of the Mallows rank model described in <a href="http://jmlr.org/papers/v18/15-481.html">Vitelli et al. (2018)</a>. The <a href="https://cran.r-project.org/web/packages/BayesMallows/vignettes/BayesMallowsPackage.html">vignette</a> provides the details.</p>
<p><img src="/post/2018-11-19-Rickert-OctTop40_files/BayesMallows.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=contextual">contextual</a> v0.9.1: Facilitates the simulation and evaluation of context-free and contextual multi-armed bandit policies or algorithms to ease the implementation, evaluation, and dissemination of both existing and new bandit algorithms and policies. See the <a href="https://cran.r-project.org/web/packages/contextual/vignettes/contextual.html">Getting Started Guide</a> and this <a href="https://cran.r-project.org/web/packages/contextual/vignettes/posts.html">list of posts</a> for more information.</p>
<p><img src="/post/2018-11-19-Rickert-OctTop40_files/contextual.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=coxrt">coxrt</a> v1.0.0: Implements Cox Proportional Hazards regression for right-truncated data. The <a href="https://cran.r-project.org/web/packages/coxrt/vignettes/coxrt-vignette.html">vignette</a> gives the details.</p>
<p><img src="/post/2018-11-19-Rickert-OctTop40_files/coxrt.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=crossrun">crossrun</a> v0.1.0: Estimates the joint distribution of number of crossings and the longest run in a series of independent Bernoulli trials. There is a <a href="https://cran.r-project.org/web/packages/crossrun/vignettes/vignettecrossrun.html">vignette</a>.</p>
<p><a href="https://cran.r-project.org/package=logisticRR">logisticRR</a> v0.2.0: Asserting that relative risk is often of interest in public health, this package provides functions that return adjusted relative risks from a logistic regression model in the presence of potential confounders. The <a href="https://cran.r-project.org/web/packages/logisticRR/vignettes/logisticRR.html">vignette</a> does the math.</p>
<p><a href="https://cran.r-project.org/package=lognorm">lognorm</a> v0.1.3: Estimates the distribution parameters and computes moments and other basic statistics of the lognormal distribution (<a href="doi:10.1641/0006-3568(2001)051[0341:lndats]2.0.co;2">Limpert et al. (2001)</a>), and also provides an approximation to the distribution of the sum of several correlated lognormally distributed variables (<a href="doi:10.12988/ams.2013.39511">Lo (2013)</a>). There is a vignette on <a href="https://cran.r-project.org/web/packages/lognorm/vignettes/aggregateCorrelated.html">Aggregating Correlated Random Variables</a> and another on <a href="https://cran.r-project.org/web/packages/lognorm/vignettes/lognormalSum.html">Approximating Sums</a>.</p>
<p><a href="https://cran.r-project.org/package=lolog">lolog</a> v1.1: Provides functions to estimate Latent Order Logistic (LOLOG) models for networks, along with visual diagnostics and goodness-of-fit metrics. See <a href="arXiv:1804.04583">Fellows (2018)</a> for a detailed description of the methods. One vignette works through an <a href="https://cran.r-project.org/web/packages/lolog/vignettes/lolog-ergm.pdf">example</a>, and another introduces <a href="https://cran.r-project.org/web/packages/lolog/vignettes/lolog-introduction.pdf">lolog models</a>.</p>
<p><img src="/post/2018-11-19-Rickert-OctTop40_files/lolog.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=matrixNormal">matrixNormal</a> v0.0.0: Provides functions to compute densities, probabilities, and random deviates of the Matrix Normal distribution. See <a href="doi:10.7508/ijmsi.2010.02.004">Iranmanesh et al. (2010)</a>.</p>
<p><a href="https://cran.r-project.org/package=outcomerate">outcomerate</a> v1.0.1: Implements standardized survey outcome rate functions, including the response rate, contact rate, cooperation rate, and refusal rate that allow researchers to measure the quality of survey data using standards published by the <a href="https://www.aapor.org/">American Association of Public Opinion Research</a>. For details, see <a href="https://www.aapor.org/Standards-Ethics/Standard-Definitions-(1).aspx">AAPOR (2016)</a>. The vignette provides an <a href="https://cran.r-project.org/web/packages/outcomerate/vignettes/intro-to-outcomerate.html">Introduction</a>.</p>
<p><img src="/post/2018-11-19-Rickert-OctTop40_files/outcomerate.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=parmsurvfit">parmsurvfit</a> v0.0.1: Fits right-censored data to a given parametric distribution, and produces summary statistics, hazard, cumulative hazard and probability plots, and the Anderson-Darling test statistic. There is a <a href="https://cran.r-project.org/web/packages/parmsurvfit/vignettes/parmsurvfit_vig.html">vignette</a>.</p>
<p><a href="https://CRAN.R-project.org/package=ppgmmga">ppgmmga</a> v1.0.1: Implements a Projection Pursuit algorithm for dimension reduction based on Gaussian Mixture Models. The <a href="https://cran.r-project.org/web/packages/ppgmmga/vignettes/ppgmmga.html">vignette</a> provides a quick tour of the package.</p>
<p><img src="/post/2018-11-19-Rickert-OctTop40_files/ppgmmga.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=RcppDist">RcppDist</a> v0.1.1: Provides additional statistical distributions that can be called from C++ when writing code using Rcpp or RcppArmadillo. See the <a href="https://cran.r-project.org/web/packages/RcppDist/vignettes/RcppDist.pdf">vignette</a> for a list of the distributions supported.</p>
<p><a href="https://cran.r-project.org/package=simstandard">simstandard</a> v0.2.0: Enables the creation of simulated data from structural equation models with standardized loadings. The <a href="https://cran.r-project.org/web/packages/simstandard/vignettes/simstandard_tutorial.html">vignette</a> shows how to use the package.</p>
<p><img src="/post/2018-11-19-Rickert-OctTop40_files/simstandard.png" height = "300" width="400"></p>
<h3 id="utilities">Utilities</h3>
<p><a href="https://cran.r-project.org/package=carrier">carrier</a> v0.1.0: Enables users to create functions that are isolated from their environment. These isolated functions, also called crates, print at the console with their total size and can be easily tested locally before being sent to a remote.</p>
<p><a href="https://cran.r-project.org/package=carbonate">carbonate</a> v0.1.0: Implements an interface to <a href="https://carbon.now.sh/about">carbon.js</a>, which allows developers to create images of source code. There is a vignette on <a href="https://cran.r-project.org/web/packages/carbonate/vignettes/tests_and_coverage.html">Tests and Coverage</a>.</p>
<p><a href="https://cran.r-project.org/package=generics">generics</a> v0.0.1: In order to reduce potential package dependencies and conflicts, <code>generics</code> provides a number of commonly used S3 generics.</p>
<p><a href="https://cran.r-project.org/package=REPLesentR">REPLesentR</a> v0.3.0: Allows users to create presentations and display them inside the R <code>REPL</code> (console). Supports <code>RMarkdown</code> and other text formats.</p>
<p><a href="https://cran.r-project.org/package=stationery">stationery</a> v0.98.5.5: Provides templates, guides, and scripts for writing documents in <code>LaTeX</code> and <code>R markdown</code> to produce guides, slides, and reports; and includes several vignettes to assist new users of literate programming. There is an <a href="https://cran.r-project.org/web/packages/stationery/vignettes/stationery.pdf">Overview</a>, vignettes on <a href="https://cran.r-project.org/web/packages/stationery/vignettes/Rmarkdown.pdf">R Markdown Basics</a> and <a href="https://cran.r-project.org/web/packages/stationery/vignettes/HTML_special_features.html">R Markdown HTML</a>, and a comparison between <a href="https://cran.r-project.org/web/packages/stationery/vignettes/code_chunks.pdf">Sweave and Knitr code chunks</a>.</p>
<h3 id="visualization">Visualization</h3>
<p><a href="https://cran.r-project.org/package=balance">balance</a> v0.1.6: Provides an alternative scheme for visualizing balances (used in <a href="https://en.wikipedia.org/wiki/Compositional_data">compositional data analysis</a>) as described in <a href="doi:10.12688/f1000research.15858.1">Quinn (2018)</a>, as well as a method for principal balance analysis. See the <a href="https://cran.r-project.org/web/packages/balance/vignettes/balance.html">vignette</a> for details.</p>
<p><img src="/post/2018-11-19-Rickert-OctTop40_files/balances.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=trelliscopejs">trelliscopejs</a> v0.1.14: Provides methods that make it easy to create a Trelliscope display specification for TrelliscopeJS, including high-level functions for creating displays from within <code>dplyr</code> or <code>ggplot2</code> workflows. There is a vignette on <a href="https://hafen.github.io/trelliscopejs/#trelliscope">trelliscope Documentation</a> and a <a href="https://cran.r-project.org/web/packages/trelliscopejs/vignettes/rd.html">trelliscope Package Reference</a>.</p>
<p><img src="/post/2018-11-19-Rickert-OctTop40_files/trelliscopejs.png" height = "400" width="600"></p>
<script>window.location.href='https://rviews.rstudio.com/2018/11/29/october-2018-top-40-new-packages/';</script>
CRAN’s New Missing Data Task View
https://rviews.rstudio.com/2018/10/26/cran-s-new-missing-values-task-view/
Fri, 26 Oct 2018 00:00:00 +0000https://rviews.rstudio.com/2018/10/26/cran-s-new-missing-values-task-view/
<p>It is a relatively rare event, and cause for celebration, when CRAN gets a new Task View. This week the <a href="https://github.com/R-miss-tastic">r-miss-tastic</a> team (Julie Josse, Nicholas Tierney, and Nathalie Vialaneix) launched the <a href="https://CRAN.R-project.org/view=MissingData">Missing Data Task View</a>. Even though I did some research on R packages for a <a href="https://rviews.rstudio.com/2016/11/30/missing-values-data-science-and-r/">post on missing values</a> a couple of years ago, I was dumbfounded by the number of packages included in the new Task View.</p>
<iframe src="https://cran.r-project.org/view=MissingData" width="90%" height="300"> </iframe>
<hr />
<p>This single page not only describes what R has to offer with respect to coping with missing data, it is probably the world’s most complete index of statistical knowledge on the subject. But the task view is not just devoted to esoterica. The <em>Exploration of missing data</em> section contains a number of tools that should be useful in any data wrangling effort, like this diagnostic plot from the <code>dlookr</code> package.</p>
<p><img src="/post/2018-10-25-Missing-Values_files/dlookr.png" height = "300" width=90%></p>
<p>The downside, which may curb one’s enthusiasm, is that mastering missing data techniques requires some serious study. Missing data problems are among the most vexing in statistics, and the newer techniques for tackling these problems are relatively sophisticated. So, unless you are a <code>na.omit()</code> kind of guy/gal (a data scientist?), coming to grips with <code>NA</code>s may involve a subtle inferential task embedded in the usual data wrangling effort.</p>
<p>It is also the case that it is not easy to find good introductory material on the subject. The notable exception is Stef van Buuren’s authoritative R-based textbook, <a href="https://stefvanbuuren.name/fimd/">Flexible Imputation of Missing Data</a>, which he has graciously made available online.</p>
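<p>As a small illustration of why the <code>na.omit()</code> reflex deserves a second look, the following base R sketch counts missing values per column in the built-in <code>airquality</code> data set and then checks how many rows listwise deletion would discard. The choice of data set is mine and is only meant as an example; it is not taken from the Task View.</p>
<pre class="r"><code># airquality ships with base R (datasets package) and contains NAs.
data(airquality)

# Count missing values per column before deciding how to handle them.
na_counts <- colSums(is.na(airquality))
na_counts

# Listwise deletion: na.omit() drops every row with at least one NA.
complete <- na.omit(airquality)
n_dropped <- nrow(airquality) - nrow(complete)
n_dropped</code></pre>
<p>Here the missing values sit in only two columns, yet dropping incomplete rows removes 42 of the 153 observations, which is the kind of quiet loss of information that the imputation methods indexed in the Task View are designed to avoid.</p>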
<p>I also found the review papers by <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3701793/">Dong and Peng</a> and by <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1839993/">Horton and Kleinman</a> to be helpful. And, if you read French a little better than I do, the review by <a href="http://journal-sfds.fr/article/view/681">Imbert and Vialaneix</a> looks like it covers all of the basic material.</p>
<p>Finally, I should mention that the Missing Data Task View is part of the R Consortium funded project <a href="https://www.r-consortium.org/projects/awarded-projects">A unified platform for missing values methods and workflows</a>. Many thanks to the ISC for making it possible for the expert r-miss-tastic team to do the work.</p>
<script>window.location.href='https://rviews.rstudio.com/2018/10/26/cran-s-new-missing-values-task-view/';</script>
Searching for R Packages
https://rviews.rstudio.com/2018/10/22/searching-for-r-packages/
Mon, 22 Oct 2018 00:00:00 +0000https://rviews.rstudio.com/2018/10/22/searching-for-r-packages/
<script src="/rmarkdown-libs/htmlwidgets/htmlwidgets.js"></script>
<link href="/rmarkdown-libs/vis/vis.css" rel="stylesheet" />
<script src="/rmarkdown-libs/vis/vis.min.js"></script>
<script src="/rmarkdown-libs/visNetwork-binding/visNetwork.js"></script>
<script src="/rmarkdown-libs/FileSaver/FileSaver.min.js"></script>
<script src="/rmarkdown-libs/Blob/Blob.js"></script>
<script src="/rmarkdown-libs/canvas-toBlob/canvas-toBlob.js"></script>
<script src="/rmarkdown-libs/html2canvas/html2canvas.js"></script>
<script src="/rmarkdown-libs/jspdf/jspdf.debug.js"></script>
<hr />
<p>Searching for R packages is a vexing problem for both new and experienced R users. With over 13,000 packages already on <a href="https://cran.r-project.org/web/packages/">CRAN</a>, and new packages arriving at a rate of almost 200 per month, it is impossible to keep up. Package names can be almost anything, and they are rarely informative, so <a href="https://cran.r-project.org/web/packages/available_packages_by_name.html">searching by name</a> is of little help. I make it a point to look at all of the new packages arriving on CRAN each month, but after a month or so, when asked about packages related to some particular topic, more often than not, I have little more to offer than a vague memory that I saw something that might be useful.</p>
<p>Fortunately, package developers have provided some very useful tools, if you know where to look. :) This post presents a search strategy based on some relatively new packages I have come across in my monthly review.</p>
<pre class="r"><code>library(tidyverse)</code></pre>
<pre><code>## ── Attaching packages ───────────────────────────────────────────── tidyverse 1.2.1 ──</code></pre>
<pre><code>## ✔ ggplot2 2.2.1 ✔ purrr 0.2.4
## ✔ tibble 1.4.2 ✔ dplyr 0.7.5
## ✔ tidyr 0.8.1 ✔ stringr 1.3.1
## ✔ readr 1.1.1 ✔ forcats 0.3.0</code></pre>
<pre><code>## ── Conflicts ──────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()</code></pre>
<pre class="r"><code>library(packagefinder)
library(dlstats)
library(cranly)</code></pre>
<p><a href="https://CRAN.R-project.org/package=packagefinder">packagefinder v0.0.7</a>, which appeared on CRAN this past July, goes right to the heart of the problem and shows great promise. The function <code>findPackage()</code> allows you to do a keyword search through the metadata of all CRAN packages. Since I am researching a possible post on <a href="https://thomasleeper.com/Rcourse/Tutorials/permutationtests.html">Permutation Tests</a>, I thought I would give <code>packagefinder::findPackage()</code> the most straightforward search text I could think of. (Note that the link for <code>Permutation Tests</code> above goes to an example by Thomas Leeper that references the <code>coin</code> package. This is a pretty strong hint that I expect to find <code>coin</code> prominently listed among the results.)</p>
<p>Also note that making the output a <code>tibble</code> is not just obsessive-compulsive tidy behavior. The default print method sends the output to the Viewer in the RStudio IDE.</p>
<pre class="r"><code>pt_pkg <- as_tibble(findPackage("permutation test"))</code></pre>
<pre><code>##
## 59 out of 13256 CRAN packages found in 6 seconds.</code></pre>
<pre class="r"><code>pt_pkg</code></pre>
<pre><code>## # A tibble: 59 x 5
## SCORE NAME DESC_SHORT DOWNL_TOTAL GO
## <dbl> <chr> <chr> <S3: format> <fct>
## 1 100 permutes Permutation Tests for Time Series … NA 8300
## 2 75 AUtests Approximate Unconditional and Perm… NA 502
## 3 75 jmuOutlier Permutation Tests for Nonparametri… NA 5564
## 4 75 lmPerm Permutation Tests for Linear Models NA 6083
## 5 75 NetRep Permutation Testing Network Module… NA 7453
## 6 75 perm Exact or Asymptotic permutation te… NA 8289
## 7 75 permDep Permutation Tests for General Depe… NA 8292
## 8 75 permuco "Permutation Tests for Regression,… NA 8297
## 9 75 RATest Randomization Tests NA 9287
## 10 75 treeperm Exact and Asymptotic K Sample Perm… NA 12442
## # ... with 49 more rows</code></pre>
<p>Unfortunately, the package is very new and not well-documented. It is not clear how <code>SCORE</code> is computed, and <code>DOWNL_TOTAL</code> is replete with <code>NA</code>s. Nevertheless, the function does seem to find packages. I can’t vouch for its completeness, but when I tried it out on some topics with which I am familiar, it did a pretty thorough job. Note that <code>findPackage()</code> allows a user to set a weights parameter that affects how the search weights “hits in the package’s title, short description and long description”. So far, I have not found this to be particularly useful, but I have not spent a lot of time with it, either.</p>
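<p>Since the scoring is undocumented, the following toy sketch only illustrates the general idea of weighting keyword hits by where they occur. The metadata, the function <code>score_pkgs()</code>, and the weights are all hypothetical; this is not <code>packagefinder</code>’s actual formula or API.</p>
<pre class="r"><code># Toy package metadata; names and text are made up for illustration.
pkgs <- data.frame(
  name  = c("pkgA", "pkgB", "pkgC"),
  title = c("Permutation Tests for Regression",
            "Tools for Time Series",
            "Exact Tests"),
  description = c("Implements permutation tests.",
                  "Includes a permutation test for trend.",
                  "Fisher-type exact tests."),
  stringsAsFactors = FALSE
)

# Hypothetical scoring rule: a hit in the title counts more than a
# hit in the description. NOT packagefinder's real algorithm.
score_pkgs <- function(db, keyword, w_title = 2, w_desc = 1) {
  hit_title <- grepl(keyword, db$title, ignore.case = TRUE)
  hit_desc  <- grepl(keyword, db$description, ignore.case = TRUE)
  score <- w_title * hit_title + w_desc * hit_desc
  out <- data.frame(name = db$name, score = score, stringsAsFactors = FALSE)
  out <- out[out$score > 0, ]          # keep only packages with a hit
  out[order(-out$score), ]             # best match first
}

score_pkgs(pkgs, "permutation")</code></pre>
<p>Raising <code>w_title</code> relative to <code>w_desc</code> pushes packages that mention the keyword in their title toward the top, which is presumably the kind of control the weights parameter is meant to offer.</p>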
<p>The next line of code just selects the columns we will be using.</p>
<pre class="r"><code>pt_pkg <- select(pt_pkg, NAME, DESC_SHORT)</code></pre>
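<p>To make the idea of field weights mentioned above more concrete, here is a minimal sketch of a weighted keyword search over toy metadata. It is <em>not</em> how <code>packagefinder</code> computes <code>SCORE</code> (that remains undocumented); the data frame <code>pkgs</code>, the function <code>score_packages()</code>, and the 2-to-1 weighting are all assumptions made up for illustration.</p>
<pre class="r"><code># Toy metadata standing in for CRAN package fields (hypothetical rows).
pkgs <- data.frame(
  name  = c("permA", "statsB", "plotC"),
  title = c("Permutation Tests for Things", "General Statistics", "Plotting Helpers"),
  desc  = c("Exact permutation test procedures.",
            "Includes a permutation test and more.",
            "Draws charts."),
  stringsAsFactors = FALSE
)

# Hypothetical scoring rule: a title hit counts twice as much as a description hit.
score_packages <- function(db, keyword, weights = c(title = 2, desc = 1)) {
  hit_title <- grepl(keyword, db$title, ignore.case = TRUE)
  hit_desc  <- grepl(keyword, db$desc, ignore.case = TRUE)
  raw <- weights[["title"]] * hit_title + weights[["desc"]] * hit_desc
  # Rescale so that a hit in every field scores 100.
  out <- data.frame(name = db$name, score = 100 * raw / sum(weights),
                    stringsAsFactors = FALSE)
  out <- out[out$score > 0, ]
  out[order(-out$score), ]
}

hits <- score_packages(pkgs, "permutation test")
hits</code></pre>
<p>Under this toy rule, a package that matches in both its title and its description outranks one that matches only in its description, and changing <code>weights</code> shifts that ranking.</p>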
<p>Now that we have a list of packages of interest, it would be nice to have an indication of the quality and usefulness of the packages selected. A natural measure of usefulness is the number of times the package has been downloaded. For this, we turn to the <code>cran_stats()</code> function from the <a href="https://cran.r-project.org/package=dlstats"><code>dlstats</code> package</a>. This function takes a vector of package names as input, queries the <a href="http://cran-logs.rstudio.com/">RStudio download logs</a>, and returns a data frame listing the number of downloads by month for each package.</p>
<pre class="r"><code>pt_downloads <- cran_stats(pt_pkg$NAME)
dim(pt_downloads)</code></pre>
<pre><code>## [1] 2784 4</code></pre>
<pre class="r"><code>head(pt_downloads)</code></pre>
<pre><code>##           start        end downloads  package
## 4485 2018-05-01 2018-05-31        52 permutes
## 4544 2018-06-01 2018-06-30        89 permutes
## 4603 2018-07-01 2018-07-31        92 permutes
## 4662 2018-08-01 2018-08-31        74 permutes
## 4721 2018-09-01 2018-09-30       227 permutes
## 4780 2018-10-01 2018-10-22       142 permutes</code></pre>
<p>Just a little wrangling yields a data frame that lists total downloads for each package over its lifespan.</p>
<pre class="r"><code>top_downloads <- pt_downloads %>%
  group_by(package) %>%
  summarize(downloads = sum(downloads)) %>%
  arrange(desc(downloads))
head(top_downloads, 10)</code></pre>
<pre><code>## # A tibble: 10 x 2
##    package        downloads
##    <fct>              <int>
##  1 coin             1103426
##  2 exactRankTests    137674
##  3 RVAideMemoire     108837
##  4 perm               97071
##  5 logcondens         83033
##  6 HardyWeinberg      55735
##  7 biotools           47694
##  8 smacof             45257
##  9 SNPassoc           38920
## 10 broman             30956</code></pre>
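<p>For readers who want to see the wrangling step in isolation, the same group-and-sum logic can be sketched in base R on a toy data frame shaped like the <code>cran_stats()</code> output. The package names and download counts below are invented for illustration only.</p>
<pre class="r"><code># Toy monthly download log (made-up numbers) with the same four columns
# as the cran_stats() output shown earlier.
toy_downloads <- data.frame(
  start     = as.Date(c("2018-08-01", "2018-09-01", "2018-08-01", "2018-09-01")),
  end       = as.Date(c("2018-08-31", "2018-09-30", "2018-08-31", "2018-09-30")),
  downloads = c(10L, 20L, 300L, 400L),
  package   = c("pkgA", "pkgA", "pkgB", "pkgB"),
  stringsAsFactors = FALSE
)

# Sum the monthly counts per package, then sort from most to least downloaded.
toy_totals <- aggregate(downloads ~ package, data = toy_downloads, FUN = sum)
toy_totals <- toy_totals[order(toy_totals$downloads, decreasing = TRUE), ]
toy_totals</code></pre>
<p>Each package collapses to a single row holding its lifetime total, which is what the <code>group_by()</code> and <code>summarize()</code> pipeline produces on the real data.</p>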
<p>As expected, <code>coin</code> has flipped to the head of the list. Plotting the downloads over time shows that the package has increased in popularity over the past five years, and it looks like people have been doing a crazy amount of permutation testing over the past year or so.</p>
<pre class="r"><code>top_pkgs <- pt_downloads %>% filter(package %in% top_downloads$package[1:3])
ggplot(top_pkgs, aes(end, downloads, group = package, color = package)) +
  geom_line() + geom_point(aes(shape = package))</code></pre>
<p><img src="/post/2018-10-17-searching-for-r-packages_files/figure-html/unnamed-chunk-7-1.png" width="672" /></p>
<p>One way to gauge the quality and reliability of a package is to see how many other packages depend on it. These would be the packages listed as “Reverse depends” and “Reverse imports” on the CRAN page for a package. Using the canonical link, <a href="https://cran.r-project.org/package=coin" class="uri">https://cran.r-project.org/package=coin</a>, we see that 24 packages are listed in these fields on the <code>coin</code> page.</p>
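<p>Reverse dependencies can also be counted programmatically rather than read off the CRAN page. The sketch below uses <code>tools::package_dependencies()</code> with <code>reverse = TRUE</code> on a small, invented package database so that it runs without a network connection; the matrix <code>toy_db</code> and its three packages are assumptions, not real CRAN entries.</p>
<pre class="r"><code># Toy stand-in for the matrix returned by available.packages();
# the three packages and their fields are invented for illustration.
toy_db <- cbind(
  Package = c("pkgA", "pkgB", "pkgC"),
  Depends = c(NA, "pkgA", NA),
  Imports = c(NA, NA, "pkgA, pkgB")
)

# reverse = TRUE asks which packages depend on or import the named package.
rev_deps <- tools::package_dependencies(
  "pkgA",
  db = toy_db,
  which = c("Depends", "Imports"),
  reverse = TRUE
)
n_rev <- length(rev_deps[["pkgA"]])
n_rev</code></pre>
<p>Against live data, one would pass <code>db = available.packages()</code> and a real package name such as <code>"coin"</code>; the count obtained that way depends on the state of CRAN at the time and need not match the figure quoted above.</p>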
<p>Likewise, knowing something of an author’s background, his or her experience writing other R packages, and the prominent R developers he or she may have collaborated with is also helpful in assessing whether a newly found package is worth a try. The same link above also shows the package’s authors. Checking the <a href="http://www.cran.r-project.org/contributors.html">Contributors page</a> for the R Project, we see that two authors are members of R Core and the lead author, Torsten Hothorn, is listed with the contributors who have provided “invaluable help”. The background and collaborators couldn’t be better.</p>
<p>In most cases, background checks aren’t so easy. However, with the help of the <code>build_network()</code> function from the <a href="https://cran.r-project.org/web/packages/cranly/index.html">cranly package</a>, it is simple to track down an author’s collaboration network. Here, we see that Torsten has an extensive network of collaborators.</p>
<pre class="r"><code>p_db <- tools::CRAN_package_db()
clean_p_db <- clean_CRAN_db(p_db)
author_net <- build_network(object = clean_p_db, perspective = "author")
plot(author_net, author = "Torsten Hothorn", exact = FALSE)</code></pre>
<div id="htmlwidget-1" style="width:672px;height:480px;" class="visNetwork html-widget"></div>
<script type="application/json" data-for="htmlwidget-1">{"x":{"nodes":{"colorlabel":["Tony Plate","Richard Heiberger","Gilles Kratzer","Henrik Singmann","Sandrine Pavoine","Daniel Chessel","Stephane Dray","Christian Kleiber","Achim Zeileis","Ben Bolker","John Fox","Kevin Wright","Martin Maechler","Jeremy VanDerWal","Michael Sumner","Kamil Erguler","Max Kuhn","Hadley Wickham","Dirk Eddelbuettel","Renaud Lancelot","Simon Blomberg","Jim Lemon","Karline Soetaert","Friedrich Leisch","Arni Magnusson","Christian Buchta","Kurt Hornik","Ken Aho","Nick Parsons","Rolf Turner","Jon Eugster","Andrea Farnham","Raphael Hartmann","Tea Isler","Ke Li","Silvia Panunzi","Sophie Schneider","Craig Wang","Torsten Hothorn","Terry Therneau","Alexandros Karatzoglou","Andrie de Vries","Jeff Enos","Bill Venables","Roman Hornung","Brian Ripley","Peter Buehlmann","Barry Eggleston","Christopher Jackson","Thomas Kneib","Andreas Mayr","Benjamin Hofner","Matthias Schmid","Romain Francois","Mikko Korpela","Fabian Scheipl","Greg Snow","Kevin Ushey","Bjoern Bornkamp","Duncan Murdoch","Dieter Menne","Uwe Ligges","Klaus Nordhausen","Zhu Wang","Michael Friendly","David Ruegamer","Thomas Petzoldt","Spencer Graves","Henric Nilsson","Derek Ogle","David Winsemius","Roger Bivand","Andreas Alfons","Michael Smithson","Gabor Grothendieck","Matthieu Stigler","Venkatraman E Seshan","Andreas Bender","Mark A van de Wiel","Henric Winell","Tyler Rinker","Alec Stephenson","Christian W Hoffmann","Tal Galili","Gregory R Warnes","Barry Rowlingson","Rob J Hyndman","Joshua Ulrich","Marc Schwartz","Andri Signorell","Nanina Anderegg","Tomas Aragon","Antti Arppe","Adrian Baddeley","Kamil Barton","Frederico Caeiro","Stephane Champely","Leanne Chhay","Clint Cummins","Michael Dewey","Harold C Doran","Charles Dupont","Claus Ekstrom","Martin Elff","Richard W Farebrother","Matthias Gamer","Joseph L Gastwirth","Yulia R Gel","Juergen Gross","Frank E Harrell Jr","Michael Hoehle","Markus Huerzeler","Wallace W Hui","Pete Hurd","Pablo J 
Villacorta Iglesias","Matthias Kohl","Detlew Labes","Friederich Leisch","Dong Li","Daniel Malter","George Marsaglia","John Marsaglia","Alina Matei","David Meyer","Weiwen Miao","Giovanni Millo","Yongyi Min","David Mitchell","Markus Naepflin","Daniel Navarro","Hong Ooi","Roland Rapold","William Revelle","Caroline Rodriguez","Nathan Russell","Nick Sabbe","Werner A Stahel","Mark Stevenson","Matthias Templ","Yves Tille","Adrian Trapletti","John Verzani","Stefan Wellek","Rand R Wilcox","Peter Wolf","Daniel Wollschlaeger","Thomas Yee","Detlef Steuer","Frank Bretz","Andy Bunn","Sarah Goslee","Sarah Brockhaus","R community","Peter Dalgaard","Kjetil Brinchmann Halvorsen","Ray Brownrigg","David L Reiner","Berton Gunter","Roger Koenker","Charles Berry","Peter Dunn","Roland Rau","Mark Leeds","Emmanuel Charpentier","Chris Evans","Paolo Sonego","Peter Ehlers","Liviu Andronic","Brian Diggs","Richard M Heiberger","Patrick Burns","R Michael Weylandt","Jon Olav Skoien","Francois Morneau","Antony Unwin","Joshua Wiley","Bryan Hanson","Eduard Szoecs","Gregor Passolt","John C Nash","Matthias Speidel","Anne-Laure Boulesteix","Hannah Frick","Christina Riedel","Martin Spindler","Ivan Kondofersky","Oliver S Kuehnle","Christian Lindenlaub","Georg Pfundstein","Ariane Straub","Florian Wickler","Katharina Zink","Manuel Eugster","Heidi Seibold","Brian S Everitt","Andrea Peters","Beth Atkinson","Fabian Sobotka","Alan Genz","Nikhil Garge","Georgiy Bobashev","Benjamin Carper","Kasey Jones","Carolin Strobl","Basil Abou El-Komboz","Abdelilah El Hadad","Laura Goeres","Max Hughes-Brandl","Peter Westfall","Andre Schuetzenmeister","Susan Scheibe","Tetsuhisa Miwa","Xuefei Mi"],"id":["Tony Plate","Richard Heiberger","Gilles Kratzer","Henrik Singmann","Sandrine Pavoine","Daniel Chessel","Stephane Dray","Christian Kleiber","Achim Zeileis","Ben Bolker","John Fox","Kevin Wright","Martin Maechler","Jeremy VanDerWal","Michael Sumner","Kamil Erguler","Max Kuhn","Hadley Wickham","Dirk Eddelbuettel","Renaud 
Lancelot","Simon Blomberg","Jim Lemon","Karline Soetaert","Friedrich Leisch","Arni Magnusson","Christian Buchta","Kurt Hornik","Ken Aho","Nick Parsons","Rolf Turner","Jon Eugster","Andrea Farnham","Raphael Hartmann","Tea Isler","Ke Li","Silvia Panunzi","Sophie Schneider","Craig Wang","Torsten Hothorn","Terry Therneau","Alexandros Karatzoglou","Andrie de Vries","Jeff Enos","Bill Venables","Roman Hornung","Brian Ripley","Peter Buehlmann","Barry Eggleston","Christopher Jackson","Thomas Kneib","Andreas Mayr","Benjamin Hofner","Matthias Schmid","Romain Francois","Mikko Korpela","Fabian Scheipl","Greg Snow","Kevin Ushey","Bjoern Bornkamp","Duncan Murdoch","Dieter Menne","Uwe Ligges","Klaus Nordhausen","Zhu Wang","Michael Friendly","David Ruegamer","Thomas Petzoldt","Spencer Graves","Henric Nilsson","Derek Ogle","David Winsemius","Roger Bivand","Andreas Alfons","Michael Smithson","Gabor Grothendieck","Matthieu Stigler","Venkatraman E Seshan","Andreas Bender","Mark A van de Wiel","Henric Winell","Tyler Rinker","Alec Stephenson","Christian W Hoffmann","Tal Galili","Gregory R Warnes","Barry Rowlingson","Rob J Hyndman","Joshua Ulrich","Marc Schwartz","Andri Signorell","Nanina Anderegg","Tomas Aragon","Antti Arppe","Adrian Baddeley","Kamil Barton","Frederico Caeiro","Stephane Champely","Leanne Chhay","Clint Cummins","Michael Dewey","Harold C Doran","Charles Dupont","Claus Ekstrom","Martin Elff","Richard W Farebrother","Matthias Gamer","Joseph L Gastwirth","Yulia R Gel","Juergen Gross","Frank E Harrell Jr","Michael Hoehle","Markus Huerzeler","Wallace W Hui","Pete Hurd","Pablo J Villacorta Iglesias","Matthias Kohl","Detlew Labes","Friederich Leisch","Dong Li","Daniel Malter","George Marsaglia","John Marsaglia","Alina Matei","David Meyer","Weiwen Miao","Giovanni Millo","Yongyi Min","David Mitchell","Markus Naepflin","Daniel Navarro","Hong Ooi","Roland Rapold","William Revelle","Caroline Rodriguez","Nathan Russell","Nick Sabbe","Werner A Stahel","Mark Stevenson","Matthias 
Templ","Yves Tille","Adrian Trapletti","John Verzani","Stefan Wellek","Rand R Wilcox","Peter Wolf","Daniel Wollschlaeger","Thomas Yee","Detlef Steuer","Frank Bretz","Andy Bunn","Sarah Goslee","Sarah Brockhaus","R community","Peter Dalgaard","Kjetil Brinchmann Halvorsen","Ray Brownrigg","David L Reiner","Berton Gunter","Roger Koenker","Charles Berry","Peter Dunn","Roland Rau","Mark Leeds","Emmanuel Charpentier","Chris Evans","Paolo Sonego","Peter Ehlers","Liviu Andronic","Brian Diggs","Richard M Heiberger","Patrick Burns","R Michael Weylandt","Jon Olav Skoien","Francois Morneau","Antony Unwin","Joshua Wiley","Bryan Hanson","Eduard Szoecs","Gregor Passolt","John C Nash","Matthias Speidel","Anne-Laure Boulesteix","Hannah Frick","Christina Riedel","Martin Spindler","Ivan Kondofersky","Oliver S Kuehnle","Christian Lindenlaub","Georg Pfundstein","Ariane Straub","Florian Wickler","Katharina Zink","Manuel Eugster","Heidi Seibold","Brian S Everitt","Andrea Peters","Beth Atkinson","Fabian Sobotka","Alan Genz","Nikhil Garge","Georgiy Bobashev","Benjamin Carper","Kasey Jones","Carolin Strobl","Basil Abou El-Komboz","Abdelilah El Hadad","Laura Goeres","Max Hughes-Brandl","Peter Westfall","Andre Schuetzenmeister","Susan Scheibe","Tetsuhisa Miwa","Xuefei Mi"],"title":["Author: Tony Plate<br>148 collaborators in 10 packages: <br>abind, DescTools, Holidays, JuniperKernel<br>RSVGTipsDevice, scriptests, sfsmisc, svglite<br>TimeWarp, track","Author: Richard Heiberger<br>149 collaborators in 4 packages: <br>abind, car, DescTools, Rcmdr","Author: Gilles Kratzer<br>13 collaborators in 3 packages: <br>abn, ATR, varrank","Author: Henrik Singmann<br>147 collaborators in 11 packages: <br>acss, acss.data, afex, bridgesampling<br>emmeans, fortunes, LaplacesDemon, lme4<br>MPTinR, plotrix, rtdists","Author: Sandrine Pavoine<br>131 collaborators in 4 packages: <br>ade4, adiv, DescTools, seewave","Author: Daniel Chessel<br>112 collaborators in 2 packages: <br>ade4, DescTools","Author: Stephane 
Dray<br>109 collaborators in 4 packages: <br>adehabitatLT, DescTools, Guerry, HistData","Author: Christian Kleiber<br>77 collaborators in 5 packages: <br>AER, fortunes, ineq, plm<br>strucchange","Author: Achim Zeileis<br>349 collaborators in 53 packages: <br>AER, bamlss, BayesXsrc, betareg<br>bfast, car, coin, colorspace<br>condvis, crch, ctv, DescTools<br>dichromat, dynlm, evtree, exams<br>Formula, fortunes, fxregime, glmertree<br>glmx, glogis, ineq, lagsarlmtree<br>lmSubsets, lmtest, mobForest, model4you<br>modeltools, mpath, mpt, palmtree<br>party, partykit, plm, pscl<br>psychomix, psychotools, psychotree, pwt<br>pwt8, pwt9, quantreg, R2BayesX<br>RWeka, sandwich, stablelearner, strucchange<br>truncreg, tth, vcd, vcdExtra<br>zoo","Author: Ben Bolker<br>402 collaborators in 29 packages: <br>afex, ape, bbmle, broom<br>broom.mixed, car, DescTools, dotwhisker<br>emdbook, foghorn, fortunes, gdata<br>ggalt, glmmTMB, gmodels, gplots<br>gtools, lme4, MEMSS, metaplus<br>minpack.lm, mlmRev, plotrix, R2admb<br>RLRsim, rncl, rstanarm, SASmixed<br>sfsmisc","Author: John Fox<br>205 collaborators in 19 packages: <br>afex, candisc, car, carData<br>DescTools, effects, english, heplots<br>matlib, phia, polycor, pubh<br>Rcmdr, RcmdrMisc, RcmdrPlugin.survival, RcmdrPlugin.TeachingDemos<br>relimp, sem, twoway","Author: Kevin Wright<br>87 collaborators in 12 packages: <br>agridat, corrgram, desplot, fortunes<br>gge, lucid, mountainplot, nipals<br>pagenum, pals, Rcmdr, rseedcalc","Author: Martin Maechler<br>448 collaborators in 58 packages: <br>akima, bastah, Bessel, Biodem<br>bitops, car, CLA, classGraph<br>cluster, cobs, copula, copulaData<br>DEoptimR, DescTools, DetR, diptest<br>expm, fortunes, fracdiff, GLDEX<br>glmmTMB, gmp, gnm, gplots<br>hdi, hexbin, lasso2, lme4<br>lokern, longmemo, lpridge, Matrix<br>MatrixModels, MEMSS, mlmRev, mvtnorm<br>pcalg, pixmap, plugdensity, polynom<br>RankingProject, Rcmdr, Rmpfr, robust<br>robustbase, robustX, rstanarm, SASmixed<br>scatterplot3d, 
sfsmisc, simsalapar, sptm<br>stabledist, supclust, timeDate, TMB<br>VLMC, xgobi","Author: Jeremy VanDerWal<br>119 collaborators in 4 packages: <br>ALA4R, DescTools, landscapemetrics, SDMTools","Author: Michael Sumner<br>175 collaborators in 19 packages: <br>ALA4R, bsam, decido, fasterize<br>fortunes, GeoLight, gibble, hddtools<br>mapview, mregions, ncmeta, raster<br>rgdal, sf, sp, stars<br>tidytransit, trread, vapour","Author: Kamil Erguler<br>101 collaborators in 3 packages: <br>albopictus, Barnard, DescTools","Author: Max Kuhn<br>148 collaborators in 17 packages: <br>AmesHousing, AppliedPredictiveModeling, C50, caret<br>contrast, Cubist, DescTools, desirability<br>dials, embed, QSARdata, recipes<br>rsample, sparseLDA, spectacles, tidyposterior<br>yardstick","Author: Hadley Wickham<br>784 collaborators in 131 packages: <br>analogsea, anyflights, assertthat, babynames<br>bigQueryR, bigrquery, blob, bnclassify<br>bookdown, broom, cellranger, classifly<br>cli, clusterfly, conflicted, curl<br>damr, DBI, dbplyr, DescribeDisplay<br>DescTools, devtools, dplyr, dtplyr<br>ellipsis, evaluate, fda, feather<br>forcats, fs, fueleconomy, gdtools<br>geozoo, GGally, ggmap, ggmosaic<br>ggplot2, ggplot2movies, ggstance, ggthemes<br>ggvis, gh, gtable, haven<br>hflights, hipread, HistData, httr<br>itertools, knitr, knitrProgressBar, labelled<br>lazyeval, leaflet, lemon, lubridate<br>lvplot, magrittr, meifly, memoise<br>modelr, namespace, nasaweather, nlmixr<br>nullabor, nycflights13, odbc, packagedocs<br>partools, pillar, pkgbuild, pkgdown<br>pkgload, plotrix, plumbr, plyr<br>prettydoc, productplots, profr, proto<br>pryr, purrr, purrrlyr, rappdirs<br>Rd2roxygen, readr, readxl, recipes<br>remotes, reprex, reshape, reshape2<br>rggobi, RInno, rlang, RMariaDB<br>rmarkdown, RMySQL, roxygen2, RPostgres<br>rsample, RSQLite, rstan, rstudioapi<br>rticles, rvest, RxODE, scales<br>sessioninfo, sf, skimr, spectacles<br>stringr, svglite, testthat, tibble<br>tidymodels, tidyr, tidyselect, 
tidyverse<br>tidyxl, tourr, tourrGui, tribe<br>unjoin, usethis, wesanderson, withr<br>xml2, yaml, yesno","Author: Dirk Eddelbuettel<br>336 collaborators in 74 packages: <br>anytime, AsioHeaders, BH, bigFastlm<br>binb, DescTools, digest, drat<br>fortunes, gaussfacts, gcbd, gettz<br>gtrendsR, gunsales, hurricaneexposure, inline<br>komaletter, lbfgs, linl, littler<br>mvabund, mvst, n1qn1, nanotime<br>nlmixr, nloptr, permGPU, pinp<br>pkgKitten, prrd, random, RApiDatetime<br>RApiSerialize, Rblpapi, Rcpp, RcppAnnoy<br>RcppAPT, RcppArmadillo, RcppBDT, RcppBlaze<br>RcppCCTZ, RcppClassic, RcppClassicExamples, RcppCNPy<br>RcppDE, RcppEigen, RcppExamples, RcppFaddeeva<br>RcppGetconf, RcppGSL, RcppMsgPack, RcppNLoptExample<br>RcppQuantuccia, RcppRedis, RcppSMC, RcppStreams<br>RcppTOML, RcppXts, RcppZiggurat, RDieHarder<br>reticulate, rfoaas, RInside, Rmalschains<br>rmsfact, RPostgreSQL, RProtoBuf, RPushbullet<br>RQuantLib, RVowpalWabbit, sanitizers, tensorflow<br>tint, x13binary","Author: Renaud Lancelot<br>87 collaborators in 4 packages: <br>aod, aods3, fortunes, Rcmdr","Author: Simon Blomberg<br>91 collaborators in 2 packages: <br>ape, fortunes","Author: Jim Lemon<br>236 collaborators in 11 packages: <br>ape, clinsig, crank, DescTools<br>eventInterval, fortunes, irr, logmult<br>plotrix, prettyR, sp","Author: Karline Soetaert<br>138 collaborators in 23 packages: <br>AquaEnv, BCE, bvpSolve, DescTools<br>deSolve, deTestSet, diagram, diffEq<br>ecolMod, FME, inline, LIM<br>limSolve, marelac, MSCMT, NetIndices<br>OceanView, plot3D, plot3Drgl, ReacTran<br>rootSolve, seacarb, shape","Author: Friedrich Leisch<br>87 collaborators in 20 packages: <br>archetypes, biclust, bindata, bootstrap<br>e1071, exams, flexclust, flexmix<br>genetics, mda, mlbench, modeltools<br>mvtnorm, ockc, pixmap, psychomix<br>signal, StatDataML, strucchange, tth","Author: Arni Magnusson<br>185 collaborators in 18 packages: <br>areaplot, coda, DescTools, gdata<br>glmmTMB, gmt, gplots, icesAdvice<br>icesDatras, 
icesSAG, icesTAF, icesVocab<br>ora, plotMCMC, r2d2, scape<br>TMB, xtable","Author: Christian Buchta<br>39 collaborators in 14 packages: <br>arules, arulesSequences, cba, DSL<br>ISOcodes, proxy, relations, Rglpk<br>RWeka, seriation, sets, slam<br>tau, textcat","Author: Kurt Hornik<br>299 collaborators in 71 packages: <br>arules, aucm, bibtex, bindata<br>cclust, chron, clue, cluster<br>coin, colorspace, cordillera, ctv<br>date, dendextend, digest, e1071<br>exactRankTests, fortunes, gap, ISOcodes<br>isotone, kernlab, MASS, mda<br>mobForest, movMF, mvord, NLP<br>NLPutils, OAIHarvester, openNLP, openNLPdata<br>oz, pandocfilters, party, polyclip<br>polynom, PolynomF, princurve, qrmdata<br>qrmtools, Rcplex, relations, Rglpk<br>RKEA, RKEAjars, ROI, ROI.plugin.msbinlp<br>Rpoppler, Rsymphony, RWeka, RWekajars<br>seriation, sets, signal, skmeans<br>slam, stablelearner, strucchange, tau<br>textcat, tm, tm.plugin.mail, topicmodels<br>tseries, TSP, Unicode, vcd<br>W3CMarkupValidator, wordnet, xgobi","Author: Ken Aho<br>101 collaborators in 2 packages: <br>asbio, DescTools","Author: Nick Parsons<br>101 collaborators in 3 packages: <br>asd, DescTools, repolr","Author: Rolf Turner<br>315 collaborators in 11 packages: <br>AssetPricing, deldir, fortunes, hmm.discnp<br>Iso, maptools, mixreg, plotrix<br>spatstat, spatstat.data, spatstat.utils","Author: Jon Eugster<br>9 collaborators in 1 packages: <br>ATR","Author: Andrea Farnham<br>9 collaborators in 1 packages: <br>ATR","Author: Raphael Hartmann<br>9 collaborators in 1 packages: <br>ATR","Author: Tea Isler<br>12 collaborators in 2 packages: <br>ATR, eggCounts","Author: Ke Li<br>9 collaborators in 1 packages: <br>ATR","Author: Silvia Panunzi<br>9 collaborators in 1 packages: <br>ATR","Author: Sophie Schneider<br>9 collaborators in 1 packages: <br>ATR","Author: Craig Wang<br>12 collaborators in 3 packages: <br>ATR, eggCounts, variosig","Author: Torsten Hothorn<br>264 collaborators in 39 packages: <br>ATR, basefun, bst, 
coin<br>DescTools, exactRankTests, FDboost, fortunes<br>globalboosttest, hgam, HSAUR, HSAUR2<br>HSAUR3, inum, ipred, libcoin<br>lmtest, maxstat, mboost, mlt<br>mlt.docreg, mobForest, model4you, modeltools<br>MUCflights, multcomp, MVA, mvtnorm<br>palmtree, party, partykit, RWeka<br>sfa, stabs, StatDataML, TH.data<br>tram, trtf, variables","Author: Terry Therneau<br>180 collaborators in 10 packages: <br>attribrisk, bdsmatrix, date, deming<br>DescTools, fortunes, glmBfp, ipred<br>noweb, rpart","Author: Alexandros Karatzoglou<br>19 collaborators in 4 packages: <br>aucm, kernlab, personalized, RWeka","Author: Andrie de Vries<br>93 collaborators in 10 packages: <br>AzureML, dendextend, fortunes, ggdendro<br>miniCRAN, rfordummies, rrd, secret<br>sss, surveydata","Author: Jeff Enos<br>113 collaborators in 4 packages: <br>backtest, DescTools, portfolio, portfolioSim","Author: Bill Venables<br>252 collaborators in 23 packages: <br>bannerCommenter, BART, codingMatrices, conf.design<br>demoKde, DescTools, english, fortunes<br>fractional, gnm, gplots, lasso2<br>lazyData, LSD, MASS, minimax<br>oz, polynom, PolynomF, raster<br>sfsmisc, SOAR, sudokuAlt","Author: Roman Hornung<br>13 collaborators in 4 packages: <br>bapred, MUCflights, ordinalForest, prioritylasso","Author: Brian Ripley<br>541 collaborators in 41 packages: <br>BART, boot, car, class<br>crs, DescTools, fastICA, fortunes<br>FREGAT, gap, gee, ggdendro<br>gnm, ipred, itree, KernSmooth<br>LSD, MarginalMediation, MASS, mda<br>mpath, nnet, pbdMPI, pbdSLAP<br>pbdZMQ, polyclip, pspline, quantreg<br>rattle, Rcmdr, RODBC, RODBCext<br>rpart, rstanarm, sm, spatial<br>spatstat, tactile, tree, TSA<br>xgobi","Author: Peter Buehlmann<br>19 collaborators in 4 packages: <br>bastah, hdi, mboost, protiq","Author: Barry Eggleston<br>12 collaborators in 2 packages: <br>BayesCTDesign, mobForest","Author: Christopher Jackson<br>107 collaborators in 7 packages: <br>bayesDP, denstrip, DescTools, ecoreg<br>flexsurv, MetaAnalyser, msm","Author: 
Thomas Kneib<br>25 collaborators in 6 packages: <br>BayesX, BayesXsrc, cAIC4, mboost<br>R2BayesX, svcm","Author: Andreas Mayr<br>14 collaborators in 3 packages: <br>betaboost, gamboostLSS, mboost","Author: Benjamin Hofner<br>31 collaborators in 8 packages: <br>betaboost, Daim, gamboostLSS, kangar00<br>mboost, OpenML, papeR, stabs","Author: Matthias Schmid<br>20 collaborators in 7 packages: <br>betaboost, discSurv, DStree, gamboostLSS<br>kernDeepStackNet, mboost, survAUC","Author: Romain Francois<br>279 collaborators in 29 packages: <br>bibtex, bigFastlm, DescTools, highlight<br>hipread, inline, knitr, knitrProgressBar<br>mvst, operators, permGPU, Rcpp<br>Rcpp11, RcppArmadillo, RcppBDT, RcppBlaze<br>RcppClassic, RcppClassicExamples, RcppEigen, RcppExamples<br>RcppGSL, RcppParallel, readr, RInside<br>RProtoBuf, sos, svMisc, svTools<br>tibble","Author: Mikko Korpela<br>133 collaborators in 6 packages: <br>BINCOR, DescTools, dplR, RXKCD<br>sisal, skimr","Author: Fabian Scheipl<br>48 collaborators in 9 packages: <br>bioimagetools, dlnm, gamm4, lme4<br>mboost, mvtnorm, refund, RLRsim<br>spikeSlabGAM","Author: Greg Snow<br>237 collaborators in 11 packages: <br>blockrand, BrailleR, DescTools, fortunes<br>maptools, obsSens, qcc, sfsmisc<br>spData, sudoku, TeachingDemos","Author: Kevin Ushey<br>246 collaborators in 21 packages: <br>blogdown, bookdown, cloudml, cronR<br>DescTools, icd, packrat, Rcpp<br>Rcpp11, RcppParallel, RcppRoll, reticulate<br>rex, rmarkdown, rsnps, rstudioapi<br>sourcetools, sparklyr, tfdatasets, tfestimators<br>withr","Author: Bjoern Bornkamp<br>12 collaborators in 7 packages: <br>bnpmr, DoseFinding, iterLap, MCPMod<br>mvtnorm, SEL, txtplot","Author: Duncan Murdoch<br>265 collaborators in 22 packages: <br>BrailleR, car, digest, ellipse<br>fortunes, gpclib, gsl, inline<br>knitr, manipulateWidget, nlsr, orientlib<br>patchDVI, Rcmdr, Rdpack, rgl<br>rglwidget, sciplot, spatialkernel, tables<br>tkrgl, vcdExtra","Author: Dieter Menne<br>125 collaborators in 7 
packages: <br>breathtestcore, breathteststan, broom, broom.mixed<br>fortunes, gastempt, installr","Author: Uwe Ligges<br>206 collaborators in 16 packages: <br>BRugs, dendextend, fftw, fortunes<br>klaR, nortest, R2OpenBUGS, R2WinBUGS<br>Rcmdr, reliaR, RobPer, RWinEdt<br>scatterplot3d, signal, tuneR, xtable","Author: Klaus Nordhausen<br>153 collaborators in 17 packages: <br>BSSasymp, DescTools, fastM, fICA<br>ICS, ICSNP, ICSOutlier, ICSShiny<br>ICtest, JADE, LDRTools, MNM<br>OjaNP, REPPlab, SpatialNP, tensorBSS<br>tsBSS","Author: Zhu Wang<br>11 collaborators in 5 packages: <br>bst, bujar, cts, mpath<br>orsk","Author: Michael Friendly<br>386 collaborators in 25 packages: <br>ca, candisc, car, DescTools<br>effects, fortunes, genridge, Guerry<br>heplots, HistData, installr, knitr<br>Lahman, logmult, maptools, matlib<br>mvinfluence, sem, statquotes, tableplot<br>twoway, vcd, vcdExtra, vegan<br>WordPools","Author: David Ruegamer<br>5 collaborators in 2 packages: <br>cAIC4, FDboost","Author: Thomas Petzoldt<br>109 collaborators in 11 packages: <br>caper, cardidates, deSolve, FME<br>fortunes, growthrates, marelac, plotrix<br>proto, qualV, simecol","Author: Spencer Graves<br>97 collaborators in 7 packages: <br>car, Ecfun, fda, fortunes<br>maxLik, multcompView, sos","Author: Henric Nilsson<br>125 collaborators in 2 packages: <br>car, DescTools","Author: Derek Ogle<br>155 collaborators in 6 packages: <br>car, DescTools, FSA, FSAdata<br>plotrix, readbitmap","Author: David Winsemius<br>86 collaborators in 2 packages: <br>car, fortunes","Author: Roger Bivand<br>244 collaborators in 25 packages: <br>cartogram, classInt, DCluster, foreign<br>fortunes, INLABMA, interp, mapproj<br>maptools, MBA, PBSmapping, pixmap<br>raster, rgdal, rgeos, rgrass7<br>sf, sp, spatial, spData<br>spdep, spgrass6, spgwr, splancs<br>vapour","Author: Andreas Alfons<br>119 collaborators in 13 packages: <br>ccaPP, cvTools, DescTools, laeken<br>perry, robmed, robustHD, simFrame<br>simPop, sparseLTSEigen, 
sparsestep, VIM<br>VIMGUI","Author: Michael Smithson<br>103 collaborators in 3 packages: <br>cdfquantreg, DescTools, smdata","Author: Gabor Grothendieck<br>169 collaborators in 12 packages: <br>chron, DescTools, gdata, lme4<br>optimr, optimx, plotrix, proto<br>rockchalk, Ryacas, stinepack, zoo","Author: Matthieu Stigler<br>108 collaborators in 9 packages: <br>classInt, fortunes, rddapp, rddtools<br>rsdmx, tsDyn, urca, vars<br>xtable","Author: Venkatraman E Seshan<br>106 collaborators in 4 packages: <br>clinfun, DescTools, genepi, PSCBS","Author: Andreas Bender<br>12 collaborators in 2 packages: <br>coalitions, MUCflights","Author: Mark A van de Wiel<br>4 collaborators in 1 packages: <br>coin","Author: Henric Winell<br>4 collaborators in 1 packages: <br>coin","Author: Tyler Rinker<br>170 collaborators in 17 packages: <br>cowsay, DescTools, gofastr, lexicon<br>numform, pacman, qdap, qdapDictionaries<br>qdapRegex, qdapTools, reports, sentimentr<br>textclean, textreadr, textshape, textstem<br>wakefield","Author: Alec Stephenson<br>112 collaborators in 8 packages: <br>cubing, DescTools, evd, evdbayes<br>evir, PlayerRatings, texmex, TideHarmonics","Author: Christian W Hoffmann<br>101 collaborators in 2 packages: <br>cwhmisc, DescTools","Author: Tal Galili<br>218 collaborators in 9 packages: <br>d3heatmap, dendextend, DescTools, digitize<br>edfun, fortunes, heatmaply, installr<br>shinyHeatmaply","Author: Gregory R Warnes<br>168 collaborators in 13 packages: <br>daff, DescTools, gdata, gmodels<br>gplots, gtools, mcgibbsit, namespace<br>r2d2, SASxport, session, SII<br>yaml","Author: Barry Rowlingson<br>182 collaborators in 13 packages: <br>DClusterm, fortunes, geonames, gpclib<br>installr, plotrix, raster, rgdal<br>sp, spatialkernel, splancs, stplanr<br>stpp","Author: Rob J Hyndman<br>111 collaborators in 11 packages: <br>demography, DescTools, emma, expsmooth<br>fds, fma, fpp, ftsa<br>rainbow, smoothAPC, stR","Author: Joshua Ulrich<br>115 collaborators in 4 packages: 
<br>DEoptim, DescTools, PerformanceAnalytics, TTR","Author: Marc Schwartz<br>76 collaborators in 4 packages: <br>descr, fortunes, gplots, WriteXLS","Author: Andri Signorell<br>108 collaborators in 3 packages: <br>DescTools, DescToolsAddIns, kyotil","Author: Nanina Anderegg<br>101 collaborators in 1 packages: <br>DescTools","Author: Tomas Aragon<br>112 collaborators in 2 packages: <br>DescTools, pubh","Author: Antti Arppe<br>106 collaborators in 2 packages: <br>DescTools, ndl","Author: Adrian Baddeley<br>342 collaborators in 11 packages: <br>DescTools, globe, goftest, maptools<br>polyclip, scuba, spatstat, spatstat.data<br>spatstat.local, spatstat.utils, statip","Author: Kamil Barton<br>103 collaborators in 2 packages: <br>DescTools, svMisc","Author: Frederico Caeiro<br>105 collaborators in 3 packages: <br>DescTools, evt0, randtests","Author: Stephane Champely<br>109 collaborators in 4 packages: <br>DescTools, PairedData, pwr, RcmdrPlugin.pointG","Author: Leanne Chhay<br>116 collaborators in 2 packages: <br>DescTools, forecast","Author: Clint Cummins<br>106 collaborators in 2 packages: <br>DescTools, lmtest","Author: Michael Dewey<br>163 collaborators in 4 packages: <br>DescTools, fortunes, latdiag, metap","Author: Harold C Doran<br>101 collaborators in 1 packages: <br>DescTools","Author: Charles Dupont<br>107 collaborators in 4 packages: <br>DescTools, Hmisc, PResiduals, sensitivityPStrat","Author: Claus Ekstrom<br>115 collaborators in 5 packages: <br>DescTools, isdals, kulife, MethComp<br>pwr","Author: Martin Elff<br>101 collaborators in 4 packages: <br>DescTools, mclogit, memisc, munfold","Author: Richard W Farebrother<br>106 collaborators in 2 packages: <br>DescTools, lmtest","Author: Matthias Gamer<br>103 collaborators in 2 packages: <br>DescTools, irr","Author: Joseph L Gastwirth<br>106 collaborators in 2 packages: <br>DescTools, lawstat","Author: Yulia R Gel<br>120 collaborators in 5 packages: <br>DescTools, funtimes, lawstat, nparLD<br>snowboot","Author: 
Juergen Gross<br>102 collaborators in 2 packages: <br>DescTools, nortest","Author: Frank E Harrell Jr<br>187 collaborators in 5 packages: <br>DescTools, greport, Hmisc, knitr<br>rms","Author: Michael Hoehle<br>103 collaborators in 2 packages: <br>DescTools, polyCub","Author: Markus Huerzeler<br>101 collaborators in 1 packages: <br>DescTools","Author: Wallace W Hui<br>101 collaborators in 1 packages: <br>DescTools","Author: Pete Hurd<br>101 collaborators in 1 packages: <br>DescTools","Author: Pablo J Villacorta Iglesias<br>101 collaborators in 1 packages: <br>DescTools","Author: Matthias Kohl<br>148 collaborators in 21 packages: <br>DescTools, distr, distrDoc, distrEx<br>distrMod, distrSim, distrTeach, distrTEst<br>MKmisc, mpe, RandVar, RFLPtools<br>RobAStBase, RobAStRDA, RobExtremes, RobLox<br>RobLoxBioC, RobRex, ROptEst, ROptEstOld<br>ROptRegTS","Author: Detlew Labes<br>105 collaborators in 3 packages: <br>DescTools, Power2Stage, PowerTOST","Author: Friederich Leisch<br>101 collaborators in 1 packages: <br>DescTools","Author: Dong Li<br>101 collaborators in 1 packages: <br>DescTools","Author: Daniel Malter<br>101 collaborators in 1 packages: <br>DescTools","Author: George Marsaglia<br>104 collaborators in 2 packages: <br>DescTools, goftest","Author: John Marsaglia<br>104 collaborators in 2 packages: <br>DescTools, goftest","Author: Alina Matei<br>102 collaborators in 2 packages: <br>DescTools, sampling","Author: David Meyer<br>136 collaborators in 14 packages: <br>DescTools, e1071, kst, proxy<br>registry, relations, ROI, ROI.plugin.msbinlp<br>RWeka, sets, slam, StatDataML<br>tau, vcd","Author: Weiwen Miao<br>106 collaborators in 2 packages: <br>DescTools, lawstat","Author: Giovanni Millo<br>146 collaborators in 6 packages: <br>DescTools, lmtest, pder, plm<br>spdep, splm","Author: Yongyi Min<br>101 collaborators in 1 packages: <br>DescTools","Author: David Mitchell<br>134 collaborators in 3 packages: <br>DescTools, lmtest, xtable","Author: Markus Naepflin<br>101 
collaborators in 1 packages: <br>DescTools","Author: Daniel Navarro<br>101 collaborators in 2 packages: <br>DescTools, lsr","Author: Hong Ooi<br>102 collaborators in 2 packages: <br>DescTools, glmnetUtils","Author: Roland Rapold<br>101 collaborators in 1 packages: <br>DescTools","Author: William Revelle<br>101 collaborators in 2 packages: <br>DescTools, psych","Author: Caroline Rodriguez<br>101 collaborators in 1 packages: <br>DescTools","Author: Nathan Russell<br>108 collaborators in 3 packages: <br>DescTools, hashmap, Rcpp","Author: Nick Sabbe<br>104 collaborators in 2 packages: <br>DescTools, pim","Author: Werner A Stahel<br>103 collaborators in 2 packages: <br>DescTools, spate","Author: Mark Stevenson<br>128 collaborators in 3 packages: <br>DescTools, epiR, pubh","Author: Matthias Templ<br>131 collaborators in 9 packages: <br>DescTools, emdi, laeken, robCompositions<br>sdcMicro, simPop, sparkTable, VIM<br>VIMGUI","Author: Yves Tille<br>101 collaborators in 1 packages: <br>DescTools","Author: Adrian Trapletti<br>103 collaborators in 2 packages: <br>DescTools, tseries","Author: John Verzani<br>111 collaborators in 12 packages: <br>DescTools, Devore7, gWidgets, gWidgets2<br>gWidgets2RGtk2, gWidgets2tcltk, gWidgetsRGtk2, gWidgetstcltk<br>ProgGUIinR, RGtk2Extras, traitr, UsingR","Author: Stefan Wellek<br>103 collaborators in 3 packages: <br>DescTools, EQUIVNONINF, MIDN","Author: Rand R Wilcox<br>101 collaborators in 1 packages: <br>DescTools","Author: Peter Wolf<br>124 collaborators in 2 packages: <br>DescTools, Rcmdr","Author: Daniel Wollschlaeger<br>106 collaborators in 4 packages: <br>DescTools, DVHmetrics, epitools, shotGroups","Author: Thomas Yee<br>103 collaborators in 3 packages: <br>DescTools, VGAMdata, VGAMextra","Author: Detlef Steuer<br>73 collaborators in 5 packages: <br>desire, fortunes, loglognorm, mco<br>truncnorm","Author: Frank Bretz<br>17 collaborators in 4 packages: <br>DoseFinding, MCPMod, multcomp, mvtnorm","Author: Andy Bunn<br>74 collaborators 
in 2 packages: <br>dplR, fortunes","Author: Sarah Goslee<br>65 collaborators in 4 packages: <br>ecodist, fortunes, landsat, VFS","Author: Sarah Brockhaus<br>6 collaborators in 2 packages: <br>FDboost, mousetrap","Author: R community<br>62 collaborators in 1 packages: <br>fortunes","Author: Peter Dalgaard<br>70 collaborators in 3 packages: <br>fortunes, ISwR, pwr","Author: Kjetil Brinchmann Halvorsen<br>62 collaborators in 1 packages: <br>fortunes","Author: Ray Brownrigg<br>71 collaborators in 4 packages: <br>fortunes, mapdata, mapproj, maps","Author: David L Reiner<br>62 collaborators in 1 packages: <br>fortunes","Author: Berton Gunter<br>62 collaborators in 1 packages: <br>fortunes","Author: Roger Koenker<br>73 collaborators in 5 packages: <br>fortunes, glmx, quantreg, REBayes<br>SparseM","Author: Charles Berry<br>62 collaborators in 2 packages: <br>fortunes, sonicLength","Author: Peter Dunn<br>66 collaborators in 2 packages: <br>fortunes, statmod","Author: Roland Rau<br>69 collaborators in 6 packages: <br>fortunes, LogrankA, Miney, npst<br>ROMIplot, RVideoPoker","Author: Mark Leeds<br>62 collaborators in 1 packages: <br>fortunes","Author: Emmanuel Charpentier<br>74 collaborators in 3 packages: <br>fortunes, LaplacesDemon, patchSynctex","Author: Chris Evans<br>62 collaborators in 1 packages: <br>fortunes","Author: Paolo Sonego<br>63 collaborators in 3 packages: <br>fortunes, Rcolombos, RXKCD","Author: Peter Ehlers<br>62 collaborators in 1 packages: <br>fortunes","Author: Liviu Andronic<br>124 collaborators in 7 packages: <br>fortunes, plm, Rcmdr, RcmdrPlugin.Export<br>RcmdrPlugin.sos, RGtk2Extras, xtable","Author: Brian Diggs<br>147 collaborators in 2 packages: <br>fortunes, knitr","Author: Richard M Heiberger<br>72 collaborators in 6 packages: <br>fortunes, HH, microplot, multcomp<br>RcmdrPlugin.HH, twoway","Author: Patrick Burns<br>62 collaborators in 1 packages: <br>fortunes","Author: R Michael Weylandt<br>62 collaborators in 1 packages: <br>fortunes","Author: 
Jon Olav Skoien<br>71 collaborators in 4 packages: <br>fortunes, intamap, psgp, rtop","Author: Francois Morneau<br>62 collaborators in 1 packages: <br>fortunes","Author: Antony Unwin<br>62 collaborators in 3 packages: <br>fortunes, GDAdata, OutliersO3","Author: Joshua Wiley<br>64 collaborators in 2 packages: <br>fortunes, MplusAutomation","Author: Bryan Hanson<br>74 collaborators in 3 packages: <br>fortunes, hyperSpec, LindenmayeR","Author: Eduard Szoecs<br>89 collaborators in 3 packages: <br>fortunes, taxize, vegan","Author: Gregor Passolt<br>66 collaborators in 2 packages: <br>fortunes, vitality","Author: John C Nash<br>80 collaborators in 12 packages: <br>fortunes, lbfgsb3, lbfgsb3c, minqa<br>nlmrt, nlsr, optextras, optimr<br>optimx, Rcgmin, Rtnmin, Rvmmin","Author: Matthias Speidel<br>13 collaborators in 3 packages: <br>FourScores, hgam, hmi","Author: Anne-Laure Boulesteix<br>6 collaborators in 6 packages: <br>globalboosttest, ipflasso, MAclinical, plsgenomics<br>SNPmaxsel, WilcoxCV","Author: Hannah Frick<br>18 collaborators in 4 packages: <br>goodpractice, hgam, psychomix, trackeR","Author: Christina Riedel<br>11 collaborators in 2 packages: <br>GWG, MUCflights","Author: Martin Spindler<br>13 collaborators in 2 packages: <br>hdm, hgam","Author: Ivan Kondofersky<br>11 collaborators in 1 packages: <br>hgam","Author: Oliver S Kuehnle<br>11 collaborators in 1 packages: <br>hgam","Author: Christian Lindenlaub<br>22 collaborators in 2 packages: <br>hgam, MUCflights","Author: Georg Pfundstein<br>11 collaborators in 1 packages: <br>hgam","Author: Ariane Straub<br>23 collaborators in 3 packages: <br>hgam, MUCflights, sfa","Author: Florian Wickler<br>22 collaborators in 2 packages: <br>hgam, MUCflights","Author: Katharina Zink<br>11 collaborators in 1 packages: <br>hgam","Author: Manuel Eugster<br>25 collaborators in 3 packages: <br>hgam, MUCflights, roxygen2","Author: Heidi Seibold<br>11 collaborators in 5 packages: <br>highriskzone, model4you, palmtree, 
partykit<br>simex","Author: Brian S Everitt<br>4 collaborators in 4 packages: <br>HSAUR, HSAUR2, HSAUR3, MVA","Author: Andrea Peters<br>4 collaborators in 1 packages: <br>ipred","Author: Beth Atkinson<br>8 collaborators in 3 packages: <br>ipred, itree, rpart","Author: Fabian Sobotka<br>7 collaborators in 1 packages: <br>mboost","Author: Alan Genz<br>21 collaborators in 7 packages: <br>mnormpow, mnormt, mvord, mvtnorm<br>pbivnorm, PLordprob, SimplicialCubature","Author: Nikhil Garge<br>8 collaborators in 1 packages: <br>mobForest","Author: Georgiy Bobashev<br>8 collaborators in 1 packages: <br>mobForest","Author: Benjamin Carper<br>8 collaborators in 1 packages: <br>mobForest","Author: Kasey Jones<br>14 collaborators in 2 packages: <br>mobForest, rollmatch","Author: Carolin Strobl<br>27 collaborators in 6 packages: <br>mobForest, party, psychomix, psychotools<br>psychotree, stablelearner","Author: Basil Abou El-Komboz<br>11 collaborators in 1 packages: <br>MUCflights","Author: Abdelilah El Hadad<br>11 collaborators in 1 packages: <br>MUCflights","Author: Laura Goeres<br>11 collaborators in 1 packages: <br>MUCflights","Author: Max Hughes-Brandl<br>11 collaborators in 2 packages: <br>MUCflights, NightDay","Author: Peter Westfall<br>5 collaborators in 1 packages: <br>multcomp","Author: Andre Schuetzenmeister<br>8 collaborators in 4 packages: <br>multcomp, STB, VCA, VFP","Author: Susan Scheibe<br>5 collaborators in 1 packages: <br>multcomp","Author: Tetsuhisa Miwa<br>8 collaborators in 1 packages: <br>mvtnorm","Author: Xuefei Mi<br>11 collaborators in 2 packages: <br>mvtnorm, selectiongain"]},"edges":{"from":["Jon Eugster","Andrea Farnham","Raphael Hartmann","Tea Isler","Gilles Kratzer","Ke Li","Silvia Panunzi","Sophie Schneider","Craig Wang","Zhu Wang","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Andri Signorell","Ken Aho","Andreas Alfons","Nanina Anderegg","Tomas Aragon","Antti Arppe","Adrian Baddeley","Kamil Barton","Ben Bolker","Frederico 
Caeiro","Stephane Champely","Daniel Chessel","Leanne Chhay","Clint Cummins","Michael Dewey","Harold C Doran","Stephane Dray","Charles Dupont","Dirk Eddelbuettel","Jeff Enos","Claus Ekstrom","Martin Elff","Kamil Erguler","Richard W Farebrother","John Fox","Romain Francois","Michael Friendly","Tal Galili","Matthias Gamer","Joseph L Gastwirth","Yulia R Gel","Juergen Gross","Gabor Grothendieck","Frank E Harrell Jr","Richard Heiberger","Michael Hoehle","Christian W Hoffmann","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Sarah Brockhaus","David Ruegamer","Achim Zeileis","R community","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten 
Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Anne-Laure Boulesteix","Hannah Frick","Ivan Kondofersky","Oliver S Kuehnle","Christian Lindenlaub","Georg Pfundstein","Matthias Speidel","Martin Spindler","Ariane Straub","Florian Wickler","Katharina Zink","Manuel Eugster","Brian S Everitt","Brian S Everitt","Torsten Hothorn","Andrea Peters","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Nikhil Garge","Barry Eggleston","Georgiy Bobashev","Benjamin Carper","Kasey Jones","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Heidi Seibold","Achim Zeileis","Torsten Hothorn","Torsten Hothorn","Basil Abou El-Komboz","Andreas Bender","Abdelilah El Hadad","Laura Goeres","Roman Hornung","Max Hughes-Brandl","Christian Lindenlaub","Christina Riedel","Ariane Straub","Florian Wickler","Manuel Eugster","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Brian S Everitt","Alan Genz","Frank Bretz","Tetsuhisa Miwa","Xuefei Mi","Friedrich Leisch","Fabian Scheipl","Bjoern 
Bornkamp","Martin Maechler","Heidi Seibold","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Kurt Hornik","Christian Buchta","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Ariane Straub","Benjamin Hofner","David Meyer","Torsten Hothorn"],"to":["Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Kurt Hornik","Mark A van de Wiel","Henric Winell","Achim Zeileis","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Markus Huerzeler","Wallace W Hui","Pete Hurd","Rob J Hyndman","Pablo J Villacorta Iglesias","Christopher Jackson","Matthias Kohl","Mikko Korpela","Max Kuhn","Detlew Labes","Friederich Leisch","Jim Lemon","Dong Li","Martin Maechler","Arni Magnusson","Daniel Malter","George Marsaglia","John Marsaglia","Alina Matei","David Meyer","Weiwen Miao","Giovanni Millo","Yongyi Min","David Mitchell","Markus Naepflin","Daniel Navarro","Henric Nilsson","Klaus Nordhausen","Derek Ogle","Hong Ooi","Nick Parsons","Sandrine Pavoine","Tony Plate","Roland Rapold","William Revelle","Tyler Rinker","Brian Ripley","Caroline Rodriguez","Nathan Russell","Nick Sabbe","Venkatraman E Seshan","Greg Snow","Michael Smithson","Karline Soetaert","Werner A Stahel","Alec Stephenson","Mark 
Stevenson","Matthias Templ","Terry Therneau","Yves Tille","Adrian Trapletti","Joshua Ulrich","Kevin Ushey","Jeremy VanDerWal","Bill Venables","John Verzani","Gregory R Warnes","Stefan Wellek","Hadley Wickham","Rand R Wilcox","Peter Wolf","Daniel Wollschlaeger","Thomas Yee","Achim Zeileis","Kurt Hornik","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Peter Dalgaard","Uwe Ligges","Kevin Wright","Martin Maechler","Kjetil Brinchmann Halvorsen","Kurt Hornik","Duncan Murdoch","Andy Bunn","Ray Brownrigg","Roger Bivand","Spencer Graves","Jim Lemon","Christian Kleiber","David L Reiner","Berton Gunter","Roger Koenker","Charles Berry","Marc Schwartz","Michael Dewey","Ben Bolker","Peter Dunn","Sarah Goslee","Simon Blomberg","Bill Venables","Roland Rau","Thomas Petzoldt","Rolf Turner","Mark Leeds","Emmanuel Charpentier","Chris Evans","Paolo Sonego","Peter Ehlers","Detlef Steuer","Tal Galili","Greg Snow","Brian Ripley","Michael Sumner","David Winsemius","Liviu Andronic","Brian Diggs","Matthieu Stigler","Michael Friendly","Dirk Eddelbuettel","Richard M Heiberger","Patrick Burns","Dieter Menne","Andrie de Vries","Barry Rowlingson","Renaud Lancelot","R Michael Weylandt","Jon Olav Skoien","Francois Morneau","Antony Unwin","Joshua Wiley","Terry Therneau","Bryan Hanson","Henrik Singmann","Eduard Szoecs","Gregor Passolt","John C Nash","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Brian S Everitt","Torsten Hothorn","Brian Ripley","Terry Therneau","Beth Atkinson","Achim Zeileis","Richard W Farebrother","Clint Cummins","Giovanni Millo","David Mitchell","Peter Buehlmann","Thomas Kneib","Matthias Schmid","Benjamin Hofner","Fabian Sobotka","Fabian Scheipl","Andreas Mayr","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten 
Hothorn","Kurt Hornik","Carolin Strobl","Achim Zeileis","Torsten Hothorn","Torsten Hothorn","Friedrich Leisch","Achim Zeileis","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Frank Bretz","Peter Westfall","Richard M Heiberger","Andre Schuetzenmeister","Susan Scheibe","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Achim Zeileis","Kurt Hornik","Carolin Strobl","Achim Zeileis","Heidi Seibold","Achim Zeileis","Torsten Hothorn","Torsten Hothorn","Alexandros Karatzoglou","David Meyer","Achim Zeileis","Torsten Hothorn","Torsten Hothorn","Torsten Hothorn","Friedrich Leisch"],"color":["#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#
4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3","#4A6FE3"],"title":["collaborate in: ATR","collaborate in: ATR","collaborate in: ATR","collaborate in: ATR","collaborate in: ATR","collaborate in: ATR","collaborate in: ATR","collaborate in: ATR","collaborate in: ATR","collaborate in: bst","collaborate in: coin","collaborate in: coin","collaborate in: coin","collaborate in: coin","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: 
DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: 
DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: DescTools","collaborate in: exactRankTests","collaborate in: FDboost","collaborate in: FDboost","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: 
fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: fortunes","collaborate in: globalboosttest","collaborate in: hgam","collaborate in: hgam","collaborate in: hgam","collaborate in: hgam","collaborate in: hgam","collaborate in: hgam","collaborate in: hgam","collaborate in: hgam","collaborate in: hgam","collaborate in: hgam","collaborate in: hgam","collaborate in: HSAUR","collaborate in: HSAUR2","collaborate in: HSAUR3","collaborate in: ipred","collaborate in: ipred","collaborate in: ipred","collaborate in: ipred","collaborate in: lmtest","collaborate in: lmtest","collaborate in: lmtest","collaborate in: lmtest","collaborate in: lmtest","collaborate in: mboost","collaborate in: mboost","collaborate in: mboost","collaborate in: mboost","collaborate in: mboost","collaborate in: mboost","collaborate in: mboost","collaborate in: mobForest","collaborate in: mobForest","collaborate in: mobForest","collaborate in: mobForest","collaborate in: mobForest","collaborate in: mobForest","collaborate in: mobForest","collaborate in: mobForest","collaborate in: model4you","collaborate in: model4you","collaborate in: modeltools","collaborate in: modeltools","collaborate in: MUCflights","collaborate in: MUCflights","collaborate in: MUCflights","collaborate in: MUCflights","collaborate in: MUCflights","collaborate in: MUCflights","collaborate in: MUCflights","collaborate in: MUCflights","collaborate in: MUCflights","collaborate in: MUCflights","collaborate in: MUCflights","collaborate in: multcomp","collaborate in: multcomp","collaborate in: multcomp","collaborate in: multcomp","collaborate in: 
multcomp","collaborate in: MVA","collaborate in: mvtnorm","collaborate in: mvtnorm","collaborate in: mvtnorm","collaborate in: mvtnorm","collaborate in: mvtnorm","collaborate in: mvtnorm","collaborate in: mvtnorm","collaborate in: mvtnorm","collaborate in: palmtree","collaborate in: palmtree","collaborate in: party","collaborate in: party","collaborate in: party","collaborate in: partykit","collaborate in: partykit","collaborate in: RWeka","collaborate in: RWeka","collaborate in: RWeka","collaborate in: RWeka","collaborate in: RWeka","collaborate in: sfa","collaborate in: stabs","collaborate in: StatDataML","collaborate in: StatDataML"]},"nodesToDataframe":true,"edgesToDataframe":true,"options":{"width":"100%","height":"100%","nodes":{"shape":"dot"},"manipulation":{"enabled":false},"edges":{"physics":false},"interaction":{"dragNodes":true,"dragView":true,"zoomView":true}},"groups":null,"width":null,"height":null,"idselection":{"enabled":false,"style":"width: 150px; height: 26px","useLabels":true},"byselection":{"enabled":false,"style":"width: 150px; height: 26px","multiple":false,"hideColor":"rgba(200,200,200,0.5)"},"main":{"text":"cranly collaboration network<br> CRAN database version<br>Mon, 22 Oct 2018, 11:52 <br> Author names with<br> \"Torsten Hothorn\" <br> Package names with<br> \"Inf\"","style":"font-family:Georgia, Times New Roman, Times, serif;font-size:15px"},"submain":null,"footer":null,"background":"rgba(0, 0, 0, 0)","highlight":{"enabled":true,"hoverNearest":false,"degree":1,"algorithm":"all","hideColor":"rgba(200,200,200,0.5)","labelOnly":true},"collapse":{"enabled":false,"fit":false,"resetHighlight":true,"clusterOptions":null},"legend":{"width":0.2,"useGroups":false,"position":"left","ncol":1,"stepX":100,"stepY":100,"zoom":true,"nodes":{"label":["Authors matching query","Collaborators"],"color":["#4A6FE3","#ECEEFC"]},"nodesToDataframe":true},"tooltipStay":300,"tooltipStyle":"position: fixed;visibility:hidden;padding: 5px;white-space: 
nowrap;font-family: verdana;font-size:14px;font-color:#000000;background-color: #f5f4ed;-moz-border-radius: 3px;-webkit-border-radius: 3px;border-radius: 3px;border: 1px solid #808074;box-shadow: 3px 3px 10px rgba(0, 0, 0, 0.2);","export":{"type":"png","css":"float:right;","background":"#fff","name":"cranly_network-22-Oct-2018-Torsten Hothorn-Inf.png","label":"PNG snapshot"}},"evals":[],"jsHooks":[]}</script>
<p>It is also helpful to know who the most prolific CRAN package authors are. You can generally count on packages from this crew being top-shelf.</p>
<pre class="r"><code>author_summary <- summary(author_net)</code></pre>
<pre><code>## Warning in closeness(cranly_graph, normalized = FALSE): At centrality.c:
## 2784 :closeness centrality is not well-defined for disconnected graphs</code></pre>
<pre class="r"><code>plot(author_summary)</code></pre>
<p><img src="/post/2018-10-17-searching-for-r-packages_files/figure-html/unnamed-chunk-9-1.png" width="672" /></p>
<p>I am not claiming that the path I have taken here is the best, or even unique. I have by no means exhausted the possibilities with the packages I have highlighted. Previous posts explore <a href="https://rviews.rstudio.com/2018/05/31/exploring-r-packages/">cranly</a> and the <code>tools::CRAN_package_db()</code> function in a little more depth, but there is much more to explore.</p>
<p>Finally, it would be remiss of me not to mention that the first thing anyone, novice or expert, should do when looking for a package to solve some new problem, or even to get an indication of the quality of a package, is to examine the <a href="https://cran.r-project.org/web/views/">CRAN Task Views</a>. These are lists of packages curated by experts and organized into functional areas. With just a little searching, you will see that <code>coin</code> shows up in multiple task views.</p>
<script>window.location.href='https://rviews.rstudio.com/2018/10/22/searching-for-r-packages/';</script>
September 2018: Top 40 New Packages
https://rviews.rstudio.com/2018/10/08/september-2018-top-40-new-packages/
Mon, 08 Oct 2018 00:00:00 +0000https://rviews.rstudio.com/2018/10/08/september-2018-top-40-new-packages/
<p>September was another relatively slow month for new package activity on CRAN: “only” 126 new packages by my count. My Top 40 list is heavy on what I characterize as “utilities”: packages that either extend R in some fashion or make it easier to do things in R. This month, the packages I selected fall into eight categories: Data, Finance, Machine Learning, Science, Statistics, Time Series, Utilities and Visualization.</p>
<h3 id="data">Data</h3>
<p><a href="https://cran.r-project.org/package=trigpoints">trigpoints</a> v1.0.0: Contains a complete data set of historic GB trig points (fixed survey points that help mapmakers and hikers) in the <a href="https://en.wikipedia.org/wiki/Ordnance_Survey_National_Grid">British National Grid (OSGB36)</a> coordinate reference system.</p>
<p><a href="https://cran.r-project.org/package=UKgrid">UKgrid</a> v0.1.0: Provides a time series of demand on the UK national grid (the high-voltage electric power transmission network) since 2011. The <a href="https://cran.r-project.org/web/packages/UKgrid/vignettes/UKgrid_vignette.html">vignette</a> shows how to use the package.</p>
<p><img src="/post/2018-10-08-Sept-Top40_files/UKgrid.png" height = "400" width="600"></p>
<h3 id="finance">Finance</h3>
<p><a href="https://cran.r-project.org/package=jubilee">jubilee</a> v0.2-5: Implements a long-term forecast model called the <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3156574">Jubilee-Tectonic model</a> to forecast future returns of the U.S. stock market, Treasury yield, and gold price. The <a href="https://cran.r-project.org/web/packages/jubilee/vignettes/jubilee-tutorial.pdf">vignette</a> shows the math.</p>
<p><img src="/post/2018-10-08-Sept-Top40_files/jubilee.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=portsort">portsort</a> v0.1.0: Provides functions to sort assets into portfolios for up to three factors via a conditional or unconditional sorting procedure. There is an <a href="https://cran.r-project.org/web/packages/portsort/vignettes/portsort.html">Introduction</a>.</p>
<p><img src="/post/2018-10-08-Sept-Top40_files/portsort.png" height = "300" width="500"></p>
<h3 id="machine-learning">Machine Learning</h3>
<p><a href="https://cran.r-project.org/package=crfsuite">crfsuite</a> v0.1.1: Wraps the <a href="https://github.com/chokkan/crfsuite">CRFsuite library</a> allowing users to fit a conditional random field model. The focus is Natural Language Processing, and there are models for named entity recognition, text chunking, part of speech tagging, intent recognition, and classification. The <a href="https://cran.r-project.org/web/packages/crfsuite/vignettes/crfsuite-nlp.html">vignette</a> shows how to use the package.</p>
<p><a href="https://cran.r-project.org/package=ELMSO">ELMSO</a> v1.0.0: Implements the algorithm described in <a href="http://journals.ama.org/doi/10.1509/jmr.15.0307">Paulson, Luo, and James (2018)</a>; see <a href="http://www-bcf.usc.edu/~gareth/research/ELMSO.pdf">here</a> for a full-text version of the paper. The algorithm allocates budget across a set of online advertising opportunities.</p>
<p><a href="https://cran.r-project.org/package=embed">embed</a> v0.0.1: Provides functions to convert factor predictors to one or more numeric representations using simple generalized <a href="arXiv:1611.09477">linear models</a> or <a href="arXiv:1604.06737">nonlinear models</a>.</p>
<p><a href="https://cran.r-project.org/package=newsmap">newsmap</a> v0.6: Implements a semi-supervised model for geographical document classification (<a href="doi:10.1080/21670811.2017.1293487">Watanabe (2018)</a>) with seed dictionaries in English, German, Spanish, Japanese, and Russian. See the <a href="https://cran.r-project.org/web/packages/newsmap/readme/README.html">README</a> for an example.</p>
<p><a href="https://CRAN.R-project.org/package=splinetree">splinetree</a> v0.1.0: Provides functions to build regression trees and random forests for longitudinal or functional data using a spline projection method. Implements and extends the work of <a href="doi:10.1080/10618600.1999.10474847">Yu and Lambert (1999)</a>. There is an <a href="https://cran.r-project.org/web/packages/splinetree/vignettes/Long-Intro.html">Introduction</a> and vignettes on <a href="https://cran.r-project.org/web/packages/splinetree/vignettes/Tree-Intro.html">trees</a> and <a href="https://cran.r-project.org/web/packages/splinetree/vignettes/Forest-Intro.html">forests</a>.</p>
<p><img src="/post/2018-10-08-Sept-Top40_files/splinetree.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=stylest">stylest</a> v0.1.0: Provides functions to estimate the distinctiveness in speakers’ (authors’) style. Fits models that can be used for predicting speakers of new texts. See <a href="doi:10.2139/ssrn.3235506">Spirling et al. (2018)</a> for the details and the <a href="https://cran.r-project.org/web/packages/stylest/vignettes/stylest-vignette.html">vignette</a> for an example of how to use the package.</p>
<h3 id="science">Science</h3>
<p><a href="https://cran.r-project.org/package=conStruct">conStruct</a> v1.0.0: Provides a method for modeling genetic data as a combination of discrete layers, within each of which relatedness may decay continuously with geographic distance. There are vignettes for <a href="https://cran.r-project.org/web/packages/conStruct/vignettes/format-data.html">formatting data</a>, <a href="https://cran.r-project.org/web/packages/conStruct/vignettes/model-comparison.html">model comparison</a>, and on <a href="https://cran.r-project.org/web/packages/conStruct/vignettes/run-conStruct.html">running</a> and <a href="https://cran.r-project.org/web/packages/conStruct/vignettes/visualize-results.html">visualizing</a> <code>conStruct</code> analyses.</p>
<p><img src="/post/2018-10-08-Sept-Top40_files/conStruct.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=episcan">episcan</a> v0.0.1: Provides some efficient mechanisms to scan epistasis in genome-wide interaction studies (GWIS), and supports both case-control status (binary outcome) and quantitative phenotype (continuous outcome) studies. See <a href="doi:10.1038/ejhg.2010.196">Kam-Thong and Czamara et al. (2011)</a>, <a href="doi:10.1093/bioinformatics/btr218">Kam-Thong and Pütz et al. (2011)</a>, and the <a href="https://cran.r-project.org/web/packages/episcan/vignettes/episcan.html">vignette</a>.</p>
<h3 id="statistics">Statistics</h3>
<p><a href="https://cran.r-project.org/package=ahpsurvey">ahpsurvey</a> v0.2.2: Implements the Analytic Hierarchy Process, a versatile multi-criteria decision-making tool introduced by <a href="doi:10.1016/0270-0255(87)90473-8">Saaty (1987)</a> that allows decision-makers to weigh attributes and evaluate alternatives presented to them. The <a href="https://cran.r-project.org/web/packages/ahpsurvey/vignettes/my-vignette.html">vignette</a> provides examples.</p>
<p><img src="/post/2018-10-08-Sept-Top40_files/ahpsurvey.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=empirical">empirical</a> v0.1.0: Implements empirical univariate probability density functions (continuous functions) and empirical cumulative distribution functions (step functions or continuous). The <a href="https://cran.r-project.org/web/packages/empirical/vignettes/empirical.pdf">vignette</a> provides examples.</p>
<p><a href="https://cran.r-project.org/package=basicMCMCplots">basicMCMCplots</a> v0.1.0: Provides functions for examining posterior MCMC samples from single or multiple chains that interface with the NIMBLE software package. See <a href="doi:10.1080/10618600.2016.1172487">de Valpine et al. (2017)</a>.</p>
<p><a href="https://cran.r-project.org/package=MetaStan">MetaStan</a> v0.0.1: Provides functions to perform Bayesian meta-analysis using <code>Stan</code>. Includes binomial-normal hierarchical models and the option to use weakly informative priors for the heterogeneity parameter and the treatment effect parameter, which are described in <a href="arXiv:1809.04407">Guenhan, Roever, and Friede (2018)</a>. The <a href="https://cran.r-project.org/web/packages/MetaStan/vignettes/MetaStan_BNHM.html">vignette</a> contains an example.</p>
<p><img src="/post/2018-10-08-Sept-Top40_files/metastan.png" height = "300" width="500"></p>
<p><a href="https://cran.r-project.org/package=Opt5PL">Opt5PL</a> v0.1.1: Provides functions to obtain and evaluate various optimal designs for the 3-, 4-, and 5-parameter logistic models. The optimal designs are obtained based on the numerical algorithm in <a href="doi:10.18637/jss.v083.i05">Hyun, Wong, and Yang (2018)</a>.</p>
<p><a href="https://cran.r-project.org/package=rmetalog">rmetalog</a> v1.0.0: Implements the metalog distribution, a modern, highly flexible, data-driven distribution. See <a href="doi:10.1287/deca.2016.0338">Keelin (2016)</a>. The <a href="https://cran.r-project.org/web/packages/rmetalog/vignettes/rmetalog-vignette.html">vignette</a> provides an example.</p>
<p><img src="/post/2018-10-08-Sept-Top40_files/rmetalog.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=rwavelet">rwavelet</a> v0.1.0: Provides functions to perform wavelet analysis (orthogonal and translation invariant transforms) with applications to data compression or denoising. Most of the code is a port of the <a href="https://statweb.stanford.edu/~wavelab/"><code>MATLAB</code> Wavelab toolbox</a> written by Donoho, Maleki and Shahram. The <a href="https://cran.r-project.org/web/packages/rwavelet/vignettes/rwaveletvignette.html">vignette</a> provides examples.</p>
<p><img src="/post/2018-10-08-Sept-Top40_files/rwavelet.png" height = "300" width="400"></p>
<p><a href="https://cran.r-project.org/package=SamplingBigData">SamplingBigData</a> v1.0.0: Provides methods for sampling large data sets, including spatially balanced sampling in multi-dimensional spaces with any prescribed inclusion probabilities. Written in C, it uses efficient data structures such as k-d trees that scale to several million rows on a modern desktop computer.</p>
<p><a href="https://cran.r-project.org/package=survivalAnalysis">survivalAnalysis</a> v0.1.0: Implements a high-level interface to perform survival analysis, including Kaplan-Meier analysis, log-rank tests, and Cox regression. There are vignettes for <a href="https://cran.r-project.org/web/packages/survivalAnalysis/vignettes/univariate.html">univariate</a> and <a href="https://cran.r-project.org/web/packages/survivalAnalysis/vignettes/multivariate.html">multivariate</a> survival analyses.</p>
<p><img src="/post/2018-10-08-Sept-Top40_files/survivalAnalysis.png" height = "300" width="700"></p>
<p><a href="https://cran.r-project.org/package=ungroup">ungroup</a> v1.1.0: Provides functions to implement a penalized composite link model for efficient estimation of smooth distributions from coarsely binned data. For a detailed description of the method and applications, see <a href="doi:10.1093/aje/kwv020">Rizzi et al. (2015)</a>. The <a href="https://cran.r-project.org/web/packages/ungroup/vignettes/Intro.pdf">vignette</a> provides examples.</p>
<p><img src="/post/2018-10-08-Sept-Top40_files/ungroup.png" height = "400" width="600"></p>
<h3 id="time-series">Time Series</h3>
<p><a href="https://CRAN.R-project.org/package=bayesdfa">bayesdfa</a> v0.1.0: Implements Bayesian dynamic factor analysis, a dimension-reduction tool for multivariate time series, with <code>Stan</code>. The <a href="https://cran.r-project.org/web/packages/bayesdfa/vignettes/bayesdfa.html">vignette</a> shows how to identify extremes and latent regimes with <code>glmmfields</code>.</p>
<p><img src="/post/2018-10-08-Sept-Top40_files/bayesdfa.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=tbrf">tbrf</a> v0.1.0: Provides rolling statistical functions based on date and time windows instead of n-lagged observations. The <a href="https://cran.r-project.org/web/packages/tbrf/vignettes/intro_to_tbrf.html">vignette</a> offers examples.</p>
<p><img src="/post/2018-10-08-Sept-Top40_files/tbrf.png" height = "400" width="700"></p>
<h3 id="utilities">Utilities</h3>
<p><a href="https://cran.r-project.org/package=atable">atable</a> v0.1.0: Provides functions to create tables for reporting clinical trials, calculate descriptive statistics and hypothesis tests, and arrange the results in a table with <code>LaTeX</code> or <code>Word</code>. The <a href="https://cran.r-project.org/web/packages/atable/vignettes/atable_usage.pdf">vignette</a> provides examples.</p>
<p><img src="/post/2018-10-08-Sept-Top40_files/atable.png" height = "400" width="700"></p>
<p><a href="https://cran.r-project.org/package=av">av</a> v0.2: Implements bindings to the <a href="http://www.ffmpeg.org/">FFmpeg</a> AV library for working with audio and video in R.</p>
<p><a href="https://cran.r-project.org/package=binb">binb</a> v0.0.2: Provides a collection of <code>LaTeX</code> styles using <code>Beamer</code> customization for PDF-based presentation slides in <code>RMarkdown</code>. The <a href="https://cran.r-project.org/web/packages/binb/vignettes/metropolisDemo.pdf">vignette</a> provides an example.</p>
<p><a href="https://cran.r-project.org/package=broom.mixed">broom.mixed</a> v0.2.2: Converts fitted objects from various R mixed-model packages into tidy data frames along the lines of the <code>broom</code> package.</p>
<p><a href="https://cran.r-project.org/package=codified">codified</a> v0.2.0: Allows authors to augment clinical data with metadata to create output used in conventional publications and reports. See the <a href="https://cran.r-project.org/web/packages/codified/vignettes/nih-enrollment-html.html">vignette</a> for examples.</p>
<p><img src="/post/2018-10-08-Sept-Top40_files/codified.png" height = "400" width="700"></p>
<p><a href="https://cran.r-project.org/package=duawranglr">duawranglr</a> v0.6.3: Allows users to create shareable data sets from raw data files that contain protected elements. There are vignettes on the <a href="https://cran.r-project.org/web/packages/duawranglr/vignettes/duawranglr.html">motivation</a> for the package and on <a href="https://cran.r-project.org/web/packages/duawranglr/vignettes/securing_data.html">securing data</a>.</p>
<p><a href="https://cran.r-project.org/package=ipc">ipc</a> v0.1.0: Provides tools for passing messages between R processes, with <code>Shiny</code> examples showing how to perform useful tasks. The <a href="https://cran.r-project.org/web/packages/ipc/vignettes/shinymp.html">vignette</a> shows how to use the package.</p>
<p><a href="https://cran.r-project.org/package=piggyback">piggyback</a> v0.0.8: Works around the difficulty of committing large (over 50 MB) data files to git by allowing larger (up to 2 GB) data files to piggyback on a repository as assets attached to individual GitHub releases. There is a package <a href="https://cran.r-project.org/web/packages/piggyback/vignettes/intro.html">overview</a> and a vignette on <a href="https://cran.r-project.org/web/packages/piggyback/vignettes/alternatives.html">alternatives</a>.</p>
<p><a href="https://cran.r-project.org/package=pysd2r">pysd2r</a> v0.1.0: Uses <code>reticulate</code> to implement an interface to the <code>pysd</code> toolset, provides a number of <code>pysd</code> functions, and can read files in <code>Vensim</code> <code>mdl</code> and <code>xmile</code> formats. The vignette provides an <a href="https://cran.r-project.org/web/packages/pysd2r/vignettes/pysd2r.html">overview</a>.</p>
<p><a href="https://cran.r-project.org/package=radix">radix</a> v0.5: Provides functions to format scientific and technical articles for the web with Radix reader-friendly typography, flexible layout options for visualizations, and full support for footnotes and citations.</p>
<p><a href="https://cran.r-project.org/package=rbtc">rbtc</a> v0.1-5: Implements the <a href="https://en.bitcoin.it/wiki/API_reference_(JSON-RPC)">RPC-JSON API for Bitcoin</a> and provides utility functions for address creation and content analysis of the blockchain.</p>
<p><a href="https://cran.r-project.org/package=salty">salty</a> v0.1.0: Lets users take real or simulated data and salt it with errors commonly found in the wild, such as pseudo-OCR errors, Unicode problems, numeric fields with nonsensical punctuation, bad dates, etc. See the <a href="https://cran.r-project.org/web/packages/salty/readme/README.html">README</a> for examples.</p>
<h3 id="visualization">Visualization</h3>
<p><a href="https://cran.r-project.org/package=customLayout">customLayout</a> v0.2.0: Offers an extended version of the <code>graphics::layout()</code> function that also supports <code>grid</code> graphics, allowing users to create complicated drawing areas for multiple elements by combining much simpler layouts. There is a <a href="https://cran.r-project.org/web/packages/customLayout/vignettes/layouts-for-officer-power-point-document.html">vignette</a> on layouts for <code>PowerPoint</code> documents.</p>
<p><img src="/post/2018-10-08-Sept-Top40_files/customLayout.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=echarts4r">echarts4r</a> v0.1.1: Allows users to create interactive charts by leveraging the <a href="https://ecomfe.github.io/echarts-examples/public/index.html">Echarts</a> JavaScript library. It includes 33 chart types, themes, <code>Shiny</code> proxies, and animations. Look <a href="https://echarts4r.john-coene.com/">here</a> for an example.</p>
<p><img src="/post/2018-10-08-Sept-Top40_files/echarts4r.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=ggparliament">ggparliament</a> v2.0.0: Provides parliament plots to visualize election results as points in the architectural layout of the legislative chamber. There are vignettes for <a href="https://cran.r-project.org/web/packages/ggparliament/vignettes/arrange_parliament_8.html">arranging parliament</a>, <a href="https://cran.r-project.org/web/packages/ggparliament/vignettes/basic-parliament-plots_1.html">basic plots</a>, <a href="https://cran.r-project.org/web/packages/ggparliament/vignettes/draw-majority-threshold_3.html">drawing majorities</a>, <a href="https://cran.r-project.org/web/packages/ggparliament/vignettes/emphasize_parliamentarians_6.html">emphasizing parliamentarians</a>, <a href="https://cran.r-project.org/web/packages/ggparliament/vignettes/facet-parliament_5.html">faceting</a>,
<a href="https://cran.r-project.org/web/packages/ggparliament/vignettes/hanging_seats_7.html">hanging seats</a>, <a href="https://cran.r-project.org/web/packages/ggparliament/vignettes/highlight-government_4.html">highlighting government</a>, and <a href="https://cran.r-project.org/web/packages/ggparliament/vignettes/label-parties_2.html">labeling parties</a>.</p>
<p><img src="/post/2018-10-08-Sept-Top40_files/ggparliament.png" height = "400" width="600"></p>
<p><a href="https://cran.r-project.org/package=ggTimeSeries">ggTimeSeries</a> v1.0.1: Provides additional time series visualizations, such as calendar heat map, steamgraph, and marimekko. There is a <a href="https://cran.r-project.org/web/packages/ggTimeSeries/vignettes/ggTimeSeries.html">vignette</a>.</p>
<p><img src="/post/2018-10-08-Sept-Top40_files/ggTimeSeries.png" height = "400" width="600"></p>
August 2018: Top 40 New Packages
https://rviews.rstudio.com/2018/09/26/august-2018-top-40-new-packages/
Wed, 26 Sep 2018 00:00:00 +0000
<p>Package developers relaxed a bit in August; only 160 new packages went to CRAN that month. Here are my “Top 40” picks organized into seven categories: Data, Machine Learning, Science, Statistics, Time Series, Utilities, and Visualization.</p>
<h3 id="data">Data</h3>
<p><a href="https://cran.r-project.org/package=nsapi">nsapi</a> v0.1.1: Provides an interface to the <a href="https://www.ns.nl/en/travel-information/ns-api">Nederlandse Spoorwegen (Dutch Railways) API</a>, allowing users to download current departure times, disruptions and engineering work, the station list, and travel recommendations from station to station. There is a <a href="https://cran.r-project.org/web/packages/nsapi/vignettes/basic_use_nsapi_package.html">vignette</a>.</p>
<p><a href="https://cran.r-project.org/package=repec">repec</a> v0.1.0: Provides utilities for accessing <a href="http://repec.org/">RePEc</a> (Research Papers in Economics) through a RESTful API. You can request an access code and get detailed information <a href="https://ideas.repec.org/api.html">here</a>.</p>
<p><a href="https://cran.r-project.org/package=rfacebookstat">rfacebookstat</a> v1.8.3: Implements an interface to the <a href="https://developers.facebook.com/docs/marketing-apis/">Facebook Marketing API</a>, allowing users to load data by campaigns, ads, ad sets, and insights.</p>
<p><a href="https://CRAN.R-project.org/package=UCSCXenaTools">UCSCXenaTools</a> v0.2.4: Provides access to data sets from <a href="https://xena.ucsc.edu/public-hubs/">UCSC Xena data hubs</a>, which are a collection of UCSC-hosted public databases.</p>
<p><a href="https://cran.r-project.org/package=ZipRadius">ZipRadius</a> v1.0.1: Given a starting zip code and a radius in miles, generates a data frame of the US zip codes within that radius and their distances to the starting zip code. Also includes functions for use with <code>choroplethrZip</code>, which are detailed in the <a href="https://cran.r-project.org/web/packages/ZipRadius/vignettes/ZipRadius.html">vignette</a>.</p>
<p><img src="/post/2018-09-21-Aug-Top40_files/ZipRadius.png" height = "500" width="700"></p>
<h3 id="machine-learning">Machine Learning</h3>
<p><a href="https://cran.r-project.org/package=dials">dials</a> v0.0.1: Provides tools for creating model parameters that cannot be directly estimated from the data. There is a <a href="https://cran.r-project.org/web/packages/dials/vignettes/Basics.html">vignette</a>.</p>
<p><a href="https://cran.r-project.org/package=tosca">tosca</a> v0.1-2: Provides a framework for statistical analysis in content analysis. See the <a href="https://cran.r-project.org/web/packages/tosca/vignettes/Vignette.pdf">vignette</a> for details.</p>
<p><img src="/post/2018-09-21-Aug-Top40_files/tosca.png" height = "400" width="500"></p>
<p><a href="https://cran.r-project.org/package=tsmp">tsmp</a> v0.3.1: Implements the <a href="http://www.cs.ucr.edu/~eamonn/MatrixProfile.html">Matrix Profile concept</a> for classification.</p>
<p><img src="/post/2018-09-21-Aug-Top40_files/tsmap.png" height = "400" width="500"></p>
<h3 id="science">Science</h3>
<p><a href="https://cran.r-project.org/package=DSAIRM">DSAIRM</a> v0.4.0: Provides a collection of <code>Shiny</code> apps that implement dynamical systems simulations to explore within-host immune response scenarios. See the package <a href="https://cran.r-project.org/web/packages/DSAIRM/vignettes/DSAIRM.html">Tutorial</a>.</p>
<p><a href="https://cran.r-project.org/package=epiflows">epiflows</a> v0.2.0: Provides functions and classes designed to handle and visualize epidemiological flows between locations, as well as a statistical method for predicting disease spread from flow data initially described in <a href="doi:10.2807/1560-7917.ES.2017.22.28.30572">Dorigatti et al. (2017)</a>. For more information, see the <a href="http://www.repidemicsconsortium.org/">RECON toolkit</a> for outbreak analysis. There is an <a href="https://cran.r-project.org/web/packages/epiflows/vignettes/introduction.html">Overview</a> and a vignette on <a href="https://cran.r-project.org/web/packages/epiflows/vignettes/epiflows-class.html">Data Preparation</a>.</p>
<p><img src="/post/2018-09-21-Aug-Top40_files/epiflows.png" height = "400" width="500"></p>
<p><a href="https://cran.r-project.org/package=fieldRS">fieldRS</a> v0.1.1: Provides functions for remote-sensing field work using best practices suggested by <a href="doi:10.1016/j.rse.2014.02.015">Olofsson et al. (2014)</a>. See the <a href="https://cran.r-project.org/web/packages/fieldRS/vignettes/fieldRS.html">vignette</a> for details.</p>
<p><img src="/post/2018-09-21-Aug-Top40_files/fieldsRS.png" height = "500" width="700"></p>
<p><a href="https://cran.r-project.org/package=Rnmr1D">Rnmr1D</a> v1.2.1: Provides functions to perform the complete processing of proton nuclear magnetic resonance spectra from the free induction decay raw data. For details see <a href="doi:10.1007/s11306-017-1178-y">Jacob et al. (2017)</a> and the <a href="https://cran.r-project.org/web/packages/Rnmr1D/vignettes/Rnmr1D.html">vignette</a>.</p>
<p><img src="/post/2018-09-21-Aug-Top40_files/Rnmr1D.png" height = "500" width="700"></p>
<h3 id="statistics">Statistics</h3>
<p><a href="https://cran.r-project.org/package=bcaboot">bcaboot</a> v0.2-1: Provides functions to compute bootstrap confidence intervals in an almost automatic fashion. See the <a href="https://cran.r-project.org/web/packages/bcaboot/vignettes/bcaboot.html">vignette</a>.</p>
<p><img src="/post/2018-09-21-Aug-Top40_files/bcaboot.png" height = "500" width="700"></p>
<p><a href="https://cran.r-project.org/package=bivariate">bivariate</a> v0.2.2: Contains convenience functions for constructing and plotting bivariate probability distributions. See the <a href="https://cran.r-project.org/web/packages/bivariate/vignettes/bivariate.pdf">vignette</a> for details.</p>
<p><img src="/post/2018-09-21-Aug-Top40_files/bivariate.png" height = "400" width="500"></p>
<p><a href="https://CRAN.R-project.org/package=DesignLibrary">DesignLibrary</a> v0.1.1: Provides a simple interface to build designs and allows users to compare the performance of a given design across a range of combinations of parameters, such as effect size, sample size, and assignment probabilities. Look <a href="https://declaredesign.org/library/">here</a> for more information.</p>
<p><a href="https://CRAN.R-project.org/package=doremi">doremi</a> v0.1.0: Provides functions to fit the dynamics of a regulated system experiencing exogenous inputs using differential equations and linear mixed-effects regressions to estimate the characteristic parameters of the equation. See the <a href="https://cran.r-project.org/web/packages/doremi/vignettes/Introduction-to-doremi.html">vignette</a>.</p>
<p><img src="/post/2018-09-21-Aug-Top40_files/doremi.png" height = "400" width="500"></p>
<p><a href="https://cran.r-project.org/package=eikosograms">eikosograms</a> v0.1.1: An eikosogram (probability picture from the ancient Greek εἰκός - likely or probable) divides the unit square into rectangular regions whose areas, sides, and widths represent various probabilities associated with the values of one or more categorical variates. For a discussion on the eikosogram and its superiority to Venn diagrams in teaching probability, see <a href="https://math.uwaterloo.ca/~rwoldfor/papers/eikosograms/paper.pdf">Cherry and Oldford (2003)</a>, and for a discussion of its value in exploring conditional independence structure and relation to graphical and log-linear models, see <a href="https://math.uwaterloo.ca/~rwoldfor/papers/eikosograms/independence/paper.pdf">Oldford (2003)</a>. There is an <a href="https://cran.r-project.org/web/packages/eikosograms/vignettes/Introduction.html">Introduction</a> and vignettes on <a href="https://cran.r-project.org/web/packages/eikosograms/vignettes/DataAnalysis.html">Data Analysis</a> and <a href="https://cran.r-project.org/web/packages/eikosograms/vignettes/IndependenceExploration.html">Independence Relations</a>.</p>
<p><img src="/post/2018-09-21-Aug-Top40_files/eikosograms.png" height = "400" width="500"></p>
<p><a href="https://cran.r-project.org/package=localIV">localIV</a> v0.1.0: Provides functions to estimate marginal treatment effects using local instrumental variables. See <a href="doi:10.1162/rest.88.3.389">Heckman et al. (2006)</a> and <a href="https://scholar.harvard.edu/files/xzhou/files/zhou-xie_mte2.pdf">Zhou and Xie (2018)</a> for background.</p>
<p><a href="https://cran.r-project.org/package=merlin">merlin</a> v0.0.1: Provides functions to fit linear, non-linear, and user-defined mixed effects regression models following the framework developed by <a href="arXiv:1710.02223">Crowther (2017)</a>. See the <a href="https://cran.r-project.org/web/packages/merlin/vignettes/merlin.html">vignette</a> for details.</p>
<p><a href="https://cran.r-project.org/package=MRFcov">MRFcov</a> v1.0.35: Provides functions to approximate node interaction parameters of Markov Random Fields graphical networks. The general methods are described in <a href="doi:10.1002/ecy.2221">Clark et al. (2018)</a>. There are vignettes on <a href="https://cran.r-project.org/web/packages/MRFcov/vignettes/CRF_data_prep.html">Preparing Datasets</a>, <a href="https://cran.r-project.org/web/packages/MRFcov/vignettes/Gaussian_Poisson_CRFs.html">Gaussian and Poisson Fields</a>, and an example using <a href="https://cran.r-project.org/web/packages/MRFcov/vignettes/Bird_Parasite_CRF.html">Bird parasite data</a>.</p>
<p><a href="https://CRAN.R-project.org/package=SCPME">SCPME</a> v1.0: Provides functions to estimate a penalized precision matrix via an augmented ADMM algorithm as described in <a href="doi:10.1093/biomet/asy023">Molstad and Rothman (2018)</a>. There is a <a href="https://cran.r-project.org/web/packages/SCPME/vignettes/Tutorial.html">Tutorial</a> and a vignette describing <a href="https://cran.r-project.org/web/packages/SCPME/vignettes/Details.html">Algorithm Details</a>.</p>
<p><img src="/post/2018-09-21-Aug-Top40_files/SCPME.png" height = "400" width="500"></p>
<p><a href="https://cran.r-project.org/package=survxai">survxai</a> v0.2.0: Contains functions for creating a unified representation of survival models, which can be further processed by various survival explainers. There are vignettes on <a href="https://cran.r-project.org/web/packages/survxai/vignettes/Local_explanations.html">local explanations</a>, <a href="https://cran.r-project.org/web/packages/survxai/vignettes/Global_explanations.html">global explanations</a>, <a href="https://cran.r-project.org/web/packages/survxai/vignettes/How_to_compare_models_with_survxai.html">comparing models</a>, and on a <a href="https://cran.r-project.org/web/packages/survxai/vignettes/Custom_predict_for_survival_models.html">custom prediction function</a>.</p>
<p><img src="/post/2018-09-21-Aug-Top40_files/survxai.png" height = "400" width="500"></p>
<h3 id="time-series">Time Series</h3>
<p><a href="https://CRAN.R-project.org/package=hpiR">hpiR</a> v0.2.0: Provides functions to compute house price indexes and series, and evaluate index goodness based on accuracy, volatility and revision statistics. For the background on model construction, see <a href="doi:10.2307/2109686">Case and Quigley (1991)</a>, and for hedonic pricing models, see <a href="doi:10.1016/j.jhe.2006.03.001">Bourassa et al. (2006)</a>. There is an <a href="https://cran.r-project.org/web/packages/hpiR/vignettes/introduction.html">introduction</a> to the package and a vignette on <a href="https://cran.r-project.org/web/packages/hpiR/vignettes/classstructure.html">Classes</a>.</p>
<p><img src="/post/2018-09-21-Aug-Top40_files/hpiR.png" height = "500" width="700"></p>
<p><a href="https://cran.r-project.org/package=STMotif">STMotif</a> v0.1.1: Provides functions to identify motifs (previously unknown, recurring sub-sequences) in spatial-time series. There are vignettes on <a href="https://cran.r-project.org/web/packages/STMotif/vignettes/discovery-motifs.html">motif discovery</a>, <a href="https://cran.r-project.org/web/packages/STMotif/vignettes/examples.html">examples</a>, <a href="https://cran.r-project.org/web/packages/STMotif/vignettes/generation-of-candidates.html">candidate generation</a>, and <a href="https://cran.r-project.org/web/packages/STMotif/vignettes/validate-candidates.html">candidate validation</a>.</p>
<p><img src="/post/2018-09-21-Aug-Top40_files/STMotif.png" height = "400" width="500"></p>
<p><a href="https://cran.r-project.org/package=trawl">trawl</a> v0.2.1: Contains functions for simulating and estimating integer-valued trawl processes as described in <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3100076">Veraart (2018)</a>, and for simulating random vectors from the bivariate negative binomial and the bi- and trivariate logarithmic series distributions. There is a vignette on <a href="https://cran.r-project.org/web/packages/trawl/vignettes/my-vignette2.html">trawl processes</a>, and another on the <a href="https://cran.r-project.org/web/packages/trawl/vignettes/my-vignette.html">binomial distributions</a>.</p>
<h3 id="utilities">Utilities</h3>
<p><a href="https://cran.r-project.org/src/contrib/Archive/arkdb">arkdb</a> v0.0.3: Provides functions for exporting tables from relational database connections into compressed text files, and streaming those text files back into a database without requiring the whole table to fit in working memory. See the <a href="https://cran.r-project.org/web/packages/arkdb/vignettes/arkdb.html">vignette</a> for a tutorial.</p>
<p><a href="https://cran.r-project.org/package=aws.kms">aws.kms</a> v0.1.2: Implements an interface to <a href="https://aws.amazon.com/kms/">AWS Key Management Service</a>, a cloud service for managing encryption keys. See the <a href="https://cran.r-project.org/web/packages/aws.kms/readme/README.html">README</a> for details.</p>
<p><a href="https://cran.r-project.org/package=DataPackageR">DataPackageR</a> v0.15.3: Provides a framework to help construct R data packages in a reproducible manner. It maintains data provenance by turning the data-processing scripts into package vignettes, as well as enforcing documentation and version checking of included data objects. There is a <a href="https://cran.r-project.org/web/packages/DataPackageR/vignettes/usingDataPackageR.html">Guide</a> to using the package, and a vignette on <a href="https://cran.r-project.org/web/packages/DataPackageR/vignettes/YAML_CONFIG.html">YAML configuration</a>.</p>
<p><a href="https://cran.r-project.org/package=hedgehog">hedgehog</a> v0.1: Enables users to test properties of their programs against randomly generated input, which can exercise far more cases than hand-written unit tests. There is a general <a href="https://cran.r-project.org/web/packages/hedgehog/vignettes/hedgehog.html">tutorial</a> and a description of the <a href="https://cran.r-project.org/web/packages/hedgehog/vignettes/state-machines.html">Hedgehog state machine</a>.</p>
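<p>The idea behind property-based testing can be sketched in a few lines of base R. The sketch below does not use the <code>hedgehog</code> API: <code>check_property()</code> and its arguments are hypothetical names chosen for illustration, and it omits features that a dedicated library would normally add, such as shrinking a failing input to a smaller counterexample.</p>
<pre class="r"><code># Hypothetical helper (not part of hedgehog): run `property` on many
# randomly generated integer vectors and report the first failure.
check_property <- function(property, trials = 100, max_len = 20) {
  for (i in seq_len(trials)) {
    n <- sample.int(max_len + 1L, 1L) - 1L        # random length in 0..max_len
    xs <- sample.int(1000L, n, replace = TRUE)    # random integers in 1..1000
    if (!isTRUE(property(xs))) {
      return(list(passed = FALSE, counterexample = xs))
    }
  }
  list(passed = TRUE, counterexample = NULL)
}

set.seed(42)

# Property that should hold for every input: reversing twice is a no-op.
ok <- check_property(function(xs) identical(rev(rev(xs)), xs))
ok$passed

# Property that is false in general: sorting leaves a vector unchanged.
bad <- check_property(function(xs) identical(sort(xs), xs))
bad$counterexample
</code></pre>
<p>The second call illustrates the appeal of the approach: the random generator finds an unsorted vector that falsifies the property without anyone having to think of that case in advance.</p>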
<p><a href="https://cran.r-project.org/package=jsonstat">jsonstat</a> v0.0.2: Implements an interface to <a href="https://json-stat.org/">JSON-stat</a>, a simple, lightweight ‘JSON’ format for data dissemination. There is a short <a href="https://cran.r-project.org/web/packages/jsonstat/vignettes/quickstart.html">quickstart guide</a>.</p>
<p><a href="https://cran.r-project.org/package=nseval">nseval</a> v0.4: Provides an API for Lazy and Non-Standard Evaluation with facilities to capture, inspect, manipulate, and create lazy values (promises), “…” lists, and active calls. See <a href="https://cran.r-project.org/web/packages/nseval/readme/README.html">README</a>.</p>
<p><a href="https://cran.r-project.org/package=runner">runner</a> v0.1.0: Provides running functions (windowed, rolling, cumulative) for R vectors, with varying window sizes and options for handling missing values. See the <a href="https://cran.r-project.org/web/packages/runner/vignettes/runner.html">vignette</a> for details.</p>
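<p>As a rough illustration of what a running function with a varying window and missing-value handling does, here is a base R sketch. It is not the <code>runner</code> API: <code>running_mean()</code> and its arguments <code>k</code> and <code>na_rm</code> are hypothetical names, and the loop favors clarity over speed.</p>
<pre class="r"><code># Hypothetical helper (not part of runner): mean over a trailing window.
# `k` may be a single window size or one size per element of `x`.
running_mean <- function(x, k, na_rm = FALSE) {
  k <- rep_len(k, length(x))
  vapply(seq_along(x), function(i) {
    window <- x[max(1L, i - k[i] + 1L):i]   # at most k[i] most recent values
    mean(window, na.rm = na_rm)
  }, numeric(1))
}

x <- c(1, 2, NA, 4, 5)
running_mean(x, k = 2)                 # 1.0 1.5  NA  NA 4.5
running_mean(x, k = 2, na_rm = TRUE)   # 1.0 1.5 2.0 4.0 4.5
running_mean(1:5, k = c(1, 2, 3, 2, 1))
</code></pre>
<p>Windows near the start of the vector are simply truncated here; whether to truncate, pad, or return <code>NA</code> in that case is a design choice that a real package would typically expose as an option.</p>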
<p><a href="https://cran.r-project.org/package=RTest">RTest</a> v1.1.9.0: Provides an XML-based testing framework for automated component tests of R packages developed for a regulatory environment. There is a short <a href="https://cran.r-project.org/web/packages/RTest/vignettes/RTest.pdf">vignette</a>.</p>
<p><a href="https://cran.r-project.org/package=sparkbq">sparkbq</a> v0.1.0: Extends <code>sparklyr</code> by providing integration with Google <a href="https://cloud.google.com/bigquery/">BigQuery</a>. It supports direct import/export from/to <code>BigQuery</code>, as well as intermediate data extraction from <a href="https://cloud.google.com/storage/">Google Cloud Storage</a>. See <a href="https://cran.r-project.org/web/packages/sparkbq/readme/README.html">README</a>.</p>
<p><a href="https://cran.r-project.org/package=vapour">vapour</a> v0.1.0: Provides low-level access to <code>GDAL</code>, the <a href="http://gdal.org/">Geospatial Data Abstraction Library</a>. There is a <a href="https://cran.r-project.org/web/packages/vapour/vignettes/vapour.html">vignette</a>.</p>
<h3 id="visualization">Visualization</h3>
<p><a href="https://cran.r-project.org/package=mapdeck">mapdeck</a> v0.1.0: Provides a mechanism to plot interactive maps using <a href="https://www.mapbox.com/mapbox-gl-js/api/">Mapbox GL</a>, a JavaScript library for interactive maps, and <a href="http://deck.gl/#/">Deck.gl</a>, a JavaScript library which uses <code>WebGL</code> for visualizing large data sets. The <a href="https://cran.r-project.org/web/packages/mapdeck/vignettes/mapdeck.html">vignette</a> explains how to use the package.</p>
<p><img src="/post/2018-09-21-Aug-Top40_files/mapdeck.gif" height = "400" width="500"></p>
<p><a href="https://cran.r-project.org/package=rayshader">rayshader</a> v0.5.1: Provides functions that use a combination of raytracing, spherical texture mapping, lambertian reflectance, and ambient occlusion to produce hillshades of elevation matrices. Includes water-detection and layering functions, programmable color palette generation, built-in textures, 2D and 3D plotting options, and more. See <a href="https://cran.r-project.org/web/packages/rayshader/readme/README.html">README</a> for details and examples.</p>
<p><img src="/post/2018-09-21-Aug-Top40_files/rayshader.gif" height = "400" width="500"></p>
<p><a href="https://cran.r-project.org/package=sigmajs">sigmajs</a> v0.1.1: Provides an interface to the <a href="http://sigmajs.org/">sigma.js</a> graph-visualization library, including animations, plugins, and shiny proxies. There is a brief <a href="https://cran.r-project.org/web/packages/sigmajs/vignettes/get_started.html">Get Started Guide</a>, and vignettes on <a href="https://cran.r-project.org/web/packages/sigmajs/vignettes/animate.html">Animation</a>, <a href="https://cran.r-project.org/web/packages/sigmajs/vignettes/buttons.html">Buttons</a>, <a href="https://cran.r-project.org/web/packages/sigmajs/vignettes/cluster.html">Coloring by Cluster</a>, <a href="https://cran.r-project.org/web/packages/sigmajs/vignettes/dynamic.html">Dynamic graphs</a>, <a href="https://cran.r-project.org/web/packages/sigmajs/vignettes/formats.html">igraph & gexf</a>, <a href="https://cran.r-project.org/web/packages/sigmajs/vignettes/layout.html">Layout</a>, <a href="https://cran.r-project.org/web/packages/sigmajs/vignettes/plugins.html">Plugins</a>, <a href="https://cran.r-project.org/web/packages/sigmajs/vignettes/settings.html">Settings</a>, <a href="https://cran.r-project.org/web/packages/sigmajs/vignettes/shiny.html">Shiny</a>, and <a href="https://cran.r-project.org/web/packages/sigmajs/vignettes/talkcross.html">Crosstalk</a>.</p>
<p><img src="/post/2018-09-21-Aug-Top40_files/sigmajs.png" height = "400" width="500"></p>
<p><a href="https://cran.r-project.org/package=survsup">survsup</a> v0.0.1: Implements functions to plot survival curves. The <a href="https://cran.r-project.org/web/packages/survsup/vignettes/survsup_intro.html">vignette</a> provides examples.</p>
<p><img src="/post/2018-09-21-Aug-Top40_files/survsup.png" height = "400" width="500"></p>
<p><a href="https://cran.r-project.org/package=tidybayes">tidybayes</a> v1.0.1: Provides functions for composing data and extracting, manipulating, and visualizing posterior draws from Bayesian models (<code>JAGS</code>, <code>Stan</code>, <code>rstanarm</code>, <code>brms</code>, <code>MCMCglmm</code>, <code>coda</code>, …) in a tidy data format. There is a vignette on <a href="https://cran.r-project.org/web/packages/tidybayes/vignettes/tidybayes.html">Using tidy data with Bayesian Models</a>, and vignettes for <a href="https://cran.r-project.org/web/packages/tidybayes/vignettes/tidy-brms.html">brms</a> and <a href="https://cran.r-project.org/web/packages/tidybayes/vignettes/tidy-rstanarm.html">rstanarm</a> models.</p>
<p><img src="/post/2018-09-21-Aug-Top40_files/tidybayes.png" height = "400" width="500"></p>
<script>window.location.href='https://rviews.rstudio.com/2018/09/26/august-2018-top-40-new-packages/';</script>
July 2018: Top 40 New Packages
https://rviews.rstudio.com/2018/08/27/july-2018-top-40-new-packages/
Mon, 27 Aug 2018 00:00:00 +0000https://rviews.rstudio.com/2018/08/27/july-2018-top-40-new-packages/
<p>July was a big month for submitting new packages to CRAN; by my count, 251 unique and truly new packages were accepted. In addition to quantity, I was pleased to see quality and variety. For instance, <code>tropicalSparse</code>, a package for exploring some abstract mathematics, and <code>eChem</code>, a package for teaching analytical chemistry, exemplify R’s expansion into new fields.</p>
<p>Below are my “Top 40” picks organized into ten categories: Computational Methods, Data, Econometrics, Machine Learning, Mathematics, Science, Statistics, Time Series, Utilities, and Visualization.</p>
<h3 id="computational-methods">Computational Methods</h3>
<p><a href="https://cran.r-project.org/package=osqp">osqp</a> v0.4.0: Provides bindings to the <code>OSQP</code> solver, a numerical optimization package written in <code>C</code> for solving convex quadratic programs, based on the alternating direction method of multipliers. See <a href="https://arxiv.org/abs/1711.08013">Stellato et al. (2018)</a> for details.</p>
<p><a href="https://cran.r-project.org/package=sundialr">sundialr</a> v0.1.1: Provides a way to call the functions in the <a href="https://computation.llnl.gov/projects/sundials"><code>SUNDIALS</code></a> C library for solving ODEs. There is a <a href="https://cran.r-project.org/web/packages/sundialr/vignettes/my-vignette.html">vignette</a>.</p>
<h3 id="data">Data</h3>
<p><a href="https://cran.r-project.org/package=fredr">fredr</a> v1.0.0: Provides an R client for the <a href="https://api.stlouisfed.org">Federal Reserve Economic Data (FRED)</a> API. There are vignettes on FRED <a href="https://cran.r-project.org/web/packages/fredr/vignettes/fredr-categories.html">Categories</a>, <a href="https://cran.r-project.org/web/packages/fredr/vignettes/fredr-releases.html">Releases</a>, <a href="https://cran.r-project.org/web/packages/fredr/vignettes/fredr-series.html">Series</a>, <a href="https://cran.r-project.org/web/packages/fredr/vignettes/fredr-sources.html">Sources</a>, and <a href="https://cran.r-project.org/web/packages/fredr/vignettes/fredr-tags.html">Tags</a>, as well as a <a href="https://cran.r-project.org/web/packages/fredr/vignettes/fredr.html">Getting Started Guide</a>.</p>
<p><img src="/post/2018-08-21-July-Top40_files/FRED.png" alt="" /></p>
<p><a href="https://cran.r-project.org/package=jstor">jstor</a> v0.3.2: Provides functions to import metadata, ngrams, and full-texts delivered by Data for Research by JSTOR. There is an <a href="https://cran.r-project.org/web/packages/jstor/vignettes/introduction.html">Introduction</a>, and vignettes on <a href="https://cran.r-project.org/web/packages/jstor/vignettes/automating-file-import.html">Automating File Import</a> and <a href="https://cran.r-project.org/web/packages/jstor/vignettes/known-quirks.html">Known Quirks</a>.</p>
<p><a href="https://cran.r-project.org/package=rLandsat">rLandsat</a> v0.1.0: Provides functions to search and acquire <a href="https://landsat.usgs.gov">Landsat</a> data using an API built by <a href="https://api.developmentseed.org/satellites">Development Seed</a> and the <a href="https://espa.cr.usgs.gov/api">U.S. Geological Survey</a>. See <a href="https://cran.r-project.org/web/packages/rLandsat/readme/README.html">README</a> for how to use the package.</p>
<p><a href="https://cran.r-project.org/package=weathercan">weathercan</a> v0.2.7: Provides tools for downloading historical weather data from the <a href="http://climate.weather.gc.ca/historical_data/search_historic_data_e.html">Environment and Climate Change Canada</a> website. Data can be downloaded from multiple stations over large date ranges, and automatically processed into a single dataset. There is an <a href="https://cran.r-project.org/web/packages/weathercan/vignettes/weathercan.html">Introduction</a>, a <a href="https://cran.r-project.org/web/packages/weathercan/vignettes/glossary.html">Glossary</a>, and vignettes on <a href="https://cran.r-project.org/web/packages/weathercan/vignettes/flags.html">Flags</a> and <a href="https://cran.r-project.org/web/packages/weathercan/vignettes/interpolate_data.html">Interpolation</a>.</p>
<p><img src="/post/2018-08-21-July-Top40_files/weathercan.png" alt="" /></p>
<h3 id="econometrics">Econometrics</h3>
<p><a href="https://CRAN.R-project.org/package=beezdemand">beezdemand</a> v0.1.0: Provides tools to facilitate analyses performed in studies of behavioral economic demand, such as the data screening proposed by <a href="https://www.ncbi.nlm.nih.gov/pubmed/26147181">Stein et al. (2015)</a>, and model fitting, including linear <a href="https://link.springer.com/chapter/10.1007/978-94-009-2470-3_22">Hursh et al. (1989)</a>, exponential <a href="https://www.ncbi.nlm.nih.gov/pubmed/18211190">Hursh & Silberberg (2008)</a>, and modified exponential <a href="https://www.researchgate.net/publication/281143353_A_Modified_Exponential_Behavioral_Economic_Demand_Model_to_Better_Describe_Consumption_Data">Koffarnus et al. (2015)</a> models. The <a href="https://cran.r-project.org/web/packages/beezdemand/vignettes/beezdemand.html">vignette</a> provides examples.</p>
<p><img src="/post/2018-08-21-July-Top40_files/beezdemand.png" height = "500" width="700"></p>
<p><a href="https://cran.r-project.org/package=sgmodel">sgmodel</a> v0.1.0: Provides functions to compute the solutions of a generic stochastic growth model for a given set of user-supplied parameters. See <a href="https://www.sciencedirect.com/science/article/pii/002205317190038X">Merton (1971)</a> and <a href="https://www.bibsonomy.org/bibtex/0619634620978e8524622d3c0f60185c?postOwner=smicha&intraHash=490b9c3154a96743c291b6d185f7337f">Tauchen (1986)</a>. There is a <a href="https://cran.r-project.org/web/packages/sgmodel/vignettes/sgmodel_vignette.html">vignette</a>.</p>
<h3 id="machine-learning">Machine Learning</h3>
<p><a href="https://cran.r-project.org/package=bigdatadist">bigdatadist</a> v1.0: Provides functions to compute distances between probability measures, entropy measures for samples of curves, distances and depth measures for functional data, and the Generalized Mahalanobis Kernel distance for high dimensional data. For further details see <a href="https://doi.org/10.3233/IDA-140706">Martos et al (2014)</a> and <a href="https://doi.org/10.3390/e20010033">Martos et al (2018)</a>.</p>
<p><img src="/post/2018-08-21-July-Top40_files/bigdatadist.png" height = "500" width="700"></p>
<p><a href="https://cran.r-project.org/package=L0Learn">L0Learn</a> v1.0.2: Provides an optimized toolkit for approximately solving L0-regularized learning problems. The algorithms are based on coordinate descent and local combinatorial search. For more details see <a href="https://arxiv.org/abs/1803.01454">Hazimeh and Mazumder (2018)</a>. There is a <a href="https://cran.r-project.org/web/packages/L0Learn/vignettes/L0Learn-vignette.html">vignette</a>.</p>
<p><img src="/post/2018-08-21-July-Top40_files/L0L.png" height = "500" width="700"></p>
<h3 id="mathematics">Mathematics</h3>
<p><a href="https://cran.r-project.org/package=tropicalSparse">tropicalSparse</a> v0.1.0: Implements some basic tropical algebra functionality for sparse matrices by applying sparse matrix storage techniques. The functions cover addition and multiplication of vectors and matrices and dot products of vectors in tropical form, and also solve some general equations using tropical algebra. Look <a href="https://math.berkeley.edu/~bernd/mathmag.pdf">here</a> for the math.</p>
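<p>For readers new to tropical algebra, the following base R sketch may help. It assumes the min-plus convention, in which tropical “addition” is the minimum and tropical “multiplication” is ordinary addition. The functions <code>trop_add()</code> and <code>trop_mult()</code> are hypothetical names, they work on ordinary dense matrices, and they are not taken from <code>tropicalSparse</code>.</p>
<pre class="r"><code># Min-plus tropical semiring (assumed convention):
#   a "plus" b  = min(a, b), with Inf as the additive identity
#   a "times" b = a + b,     with 0 as the multiplicative identity
trop_add <- function(A, B) pmin(A, B)

trop_mult <- function(A, B) {
  stopifnot(ncol(A) == nrow(B))
  C <- matrix(Inf, nrow(A), ncol(B))
  for (i in seq_len(nrow(A))) {
    for (j in seq_len(ncol(B))) {
      C[i, j] <- min(A[i, ] + B[, j])   # tropical dot product of row i and column j
    }
  }
  C
}

# Edge weights of a small directed graph; Inf marks a missing edge.
W <- matrix(c(0,   4,   Inf,
              Inf, 0,   1,
              2,   Inf, 0), nrow = 3, byrow = TRUE)

# The tropical square gives shortest path lengths using at most two edges.
trop_mult(W, W)   # rows: (0, 4, 5), (3, 0, 1), (2, 6, 0)
</code></pre>
<p>The shortest-path reading of the matrix product is one reason tropical algebra turns up in optimization and graph problems.</p>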
<h3 id="science">Science</h3>
<p><a href="https://cran.r-project.org/package=eChem">eChem</a> v1.0.0: Provides tools for use in courses in analytical chemistry. Functions simulate cyclic voltammetry, linear-sweep voltammetry, single-pulse and double-pulse chronoamperometry, and chronocoulometry experiments using the implicit finite difference method outlined in <a href="https://pubs.acs.org/doi/10.1021/acs.jchemed.5b00225">Brown (2015)</a>. There is an <a href="https://cran.r-project.org/web/packages/eChem/vignettes/Overview.pdf">Overview</a> and vignettes on <a href="https://cran.r-project.org/web/packages/eChem/vignettes/Using_eChem.pdf">Using eChem</a>, <a href="https://cran.r-project.org/web/packages/eChem/vignettes/Computational_Details.pdf">Computational details</a>, and <a href="https://cran.r-project.org/web/packages/eChem/vignettes/Additional_Examples.pdf">Examples</a>.</p>
<p><img src="/post/2018-08-21-July-Top40_files/echem.png" height = "500" width="700"></p>
<p><a href="https://CRAN.R-project.org/package=RaceID">RaceID</a> v0.1.1: Enables inference of cell types and prediction of lineage trees using the StemID2 algorithm of <a href="https://www.nature.com/articles/nmeth.4662">Herman, Sagar, and Grün (2018)</a>. There is a <a href="https://cran.r-project.org/web/packages/RaceID/vignettes/RaceID.html">vignette</a>.</p>
<p><img src="/post/2018-08-21-July-Top40_files/RaceID.png" alt="" /></p>
<p><a href="https://cran.r-project.org/package=updog">updog</a> v1.0.1: Implements empirical Bayes approaches to genotype polyploids from next-generation sequencing data while accounting for allelic bias, overdispersion, and sequencing error. See <a href="https://www.biorxiv.org/content/early/2018/08/02/281550">Gerard et al. (2018)</a> for implementation details, along with vignettes on <a href="https://cran.r-project.org/web/packages/updog/vignettes/oracle_calculations.html">Oracle Calculations</a>, <a href="https://cran.r-project.org/web/packages/updog/vignettes/parallel_computing.html">Parallelization</a>, <a href="https://cran.r-project.org/web/packages/updog/vignettes/simulate_ngs.html">Simulating Sequencing Data</a>, and an <a href="https://cran.r-project.org/web/packages/updog/vignettes/smells_like_updog.html">Example</a>.</p>
<p><img src="/post/2018-08-21-July-Top40_files/updog.png" height = "500" width="700"></p>
<h3 id="statistics">Statistics</h3>
<p><a href="https://cran.r-project.org/package=adaptMT">adaptMT</a> v1.0.0: Implements adaptive p-value thresholding (AdaPT), including a framework that allows users to specify any algorithm to learn the local false-discovery rate, as well as a pool of convenient functions that implement specific algorithms. See <a href="https://arxiv.org/abs/1609.06035">Lei and Fithian (2016)</a>. The <a href="https://cran.r-project.org/web/packages/adaptMT/vignettes/adapt_demo.html">vignette</a> provides an introduction to the package.</p>
<p><img src="/post/2018-08-21-July-Top40_files/adaptMT.png" alt="" /></p>
<p><a href="https://cran.r-project.org/package=biglmm">biglmm</a> v0.9-1: Provides regression for data too large to fit in memory. This package functions exactly like the <a href="https://cran.r-project.org/package=biglm">biglm</a> package, but works with later versions of R.</p>
<p><a href="https://cran.r-project.org/package=circumplex">circumplex</a> v0.1.2: Provides tools for analyzing and visualizing circular data, including a generalization of the bootstrapped structural summary method from <a href="https://doi.org/10.1177/1073191115621795">Zimmermann & Wright (2017)</a>, and functions for creating publication-ready tables and figures from the results. There is an <a href="https://cran.r-project.org/web/packages/circumplex/vignettes/introduction-to-ssm-analysis.html">Introduction</a> and a vignette on <a href="https://cran.r-project.org/web/packages/circumplex/vignettes/introduction-to-ssm-analysis.html">Analysis</a>.</p>
<p><img src="/post/2018-08-21-July-Top40_files/circumplex.png" alt="" /></p>
<p><a href="https://cran.r-project.org/package=MultiFit">MultiFit</a> v0.1.2: Provides functions to test for independence of two random vectors and learn and report the dependency structure. For more information, see <a href="https://arxiv.org/abs/1806.06777">Gorsky and Ma (2018)</a> and the <a href="https://cran.r-project.org/web/packages/MultiFit/vignettes/multiFit.html">vignette</a>.</p>
<p><img src="/post/2018-08-21-July-Top40_files/MultiFit.png" alt="" /></p>
<p><a href="https://cran.r-project.org/package=PHEindicatormethods">PHEindicatormethods</a> v1.0.8: Provides functions to calculate commonly used public health statistics and their confidence intervals using methods approved for use in the production of Public Health England indicators, such as those presented via <a href="http://fingertips.phe.org.uk/">Fingertips</a>. The statistical methods are referenced in the following publications: <a href="https://doi.org/10.1002/sim.4780080614">Breslow and Day (1987)</a>, <a href="https://doi.org/10.1002/sim.4780100317">Dobson et al (1991)</a>, <a href="https://doi.org/10.1002/9780470773666">Armitage and Berry (2002)</a>, and <a href="https://doi.org/10.1080/01621459.1927.1050295">Wilson (1927)</a>. There is a <a href="https://cran.r-project.org/web/packages/PHEindicatormethods/vignettes/DSR-vignette.html">vignette</a>.</p>
<p><a href="https://cran.r-project.org/package=robmixglm">robmixglm</a> v1.0-2: Implements robust generalized linear models (GLM) using a mixture method, as described in <a href="https://doi.org/10.1080/02664763.2017.1414164">Beath (2018)</a>. See the <a href="https://cran.r-project.org/web/packages/robmixglm/vignettes/robmixglm-package.pdf">vignette</a> for details.</p>
<p><a href="https://cran.r-project.org/package=SingleCaseES">SingleCaseES</a> v0.4.0: Provides functions for calculating basic effect size indices for single-case designs, including several non-overlap measures and parametric effect size measures, and for estimating the gradual effects model developed by <a href="https://doi.org/10.1080/00273171.2018.1466681">Swan and Pustejovsky (2018)</a>. There is a vignette on <a href="https://cran.r-project.org/web/packages/SingleCaseES/vignettes/Effect-size-definitions.html">Definitions and Mathematical Details</a> and another on <a href="https://cran.r-project.org/web/packages/SingleCaseES/vignettes/Using-SingleCaseES.html">Calculations</a>.</p>
<p><a href="https://CRAN.R-project.org/package=spCP">spCP</a> v1.0: Implements a spatially varying change-point model with unique intercepts, slopes, variance intercepts and slopes, and change points at each location. Inference is within the Bayesian setting using Markov chain Monte Carlo (MCMC). See the <a href="https://cran.r-project.org/web/packages/spCP/vignettes/spCP-example.html">vignette</a> for an example.</p>
<p><img src="/post/2018-08-21-July-Top40_files/spCp.png" alt="" /></p>
<p><a href="https://cran.r-project.org/package=TDAstats">TDAstats</a> v0.3.0: Provides a tool set for topological data analysis, specifically via the calculation of persistent homology in a Vietoris-Rips complex. For a general background on computing persistent homology for topological data analysis, see <a href="https://doi.org/10.1140/epjds/s13688-017-0109-5">Otter et al. (2017)</a>. To learn more about how the permutation test is used for nonparametric statistical inference in topological data analysis, read <a href="https://doi.org/10.1007/s41468-017-0008-7">Robinson & Turner (2017)</a>. There is an <a href="https://cran.r-project.org/web/packages/TDAstats/vignettes/intro.html">Introduction</a> and a vignette on <a href="https://cran.r-project.org/web/packages/TDAstats/vignettes/inference.html">Hypothesis Testing with TDA</a>.</p>
<p><img src="/post/2018-08-21-July-Top40_files/TDAstats.png" height = "300" width="500"></p>
<p><a href="https://cran.r-project.org/package=trafo">trafo</a> v1.0.0: Provides functions to estimate, select, and compare several families of transformations, including Bickel-Doksum <a href="https://doi.org/10.2307/2287831">Bickel and Doksum (1981)</a>, Box-Cox, Dual <a href="https://doi.org/10.1016/j.econlet.2006.01.011">Yang (2006)</a>, Glog <a href="https://doi.org/10.1093/bioinformatics/18.suppl_1.S105">Durbin et al. (2002)</a>, Gpower1, Log, Log-shift opt <a href="https://doi.org/10.1002/sta4.104">Feng et al. (2016)</a>, Manly, Modulus <a href="https://doi.org/10.2307/2986305">John and Draper (1980)</a>, Neglog <a href="https://doi.org/10.1111/j.1467-9876.2005.00520.x">Whittaker et al. (2005)</a>, Reciprocal and Yeo-Johnson. See the <a href="https://cran.r-project.org/web/packages/trafo/vignettes/vignette_trafo.pdf">vignette</a> for the math.</p>
<p><a href="https://cran.r-project.org/package=uniformly">uniformly</a> v0.1.0: Provides functions to uniformly sample from various geometric shapes, such as spheres, ellipsoids, and simplices. See the <a href="https://cran.r-project.org/web/packages/uniformly/vignettes/convexhull.html">vignette</a>.</p>
<p><img src="/post/2018-08-21-July-Top40_files/uniformly.png" alt="" /></p>
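<p>Uniform sampling from a curved shape takes more care than calling <code>runif()</code> on each coordinate. As a small illustration for the surface of a unit sphere, the base R sketch below uses the standard trick of normalizing independent standard normal vectors. The function <code>sample_unit_sphere()</code> is a hypothetical name and is not part of the <code>uniformly</code> package.</p>
<pre class="r"><code># Hypothetical helper: n points uniformly distributed on the surface of the
# unit sphere in d dimensions. A standard normal vector has a rotationally
# symmetric distribution, so scaling it to length 1 gives a uniform direction.
sample_unit_sphere <- function(n, d = 3) {
  z <- matrix(rnorm(n * d), nrow = n, ncol = d)
  z / sqrt(rowSums(z^2))   # divide each row by its Euclidean norm
}

set.seed(1)
pts <- sample_unit_sphere(1000)
range(rowSums(pts^2))      # every squared norm is 1, up to rounding error
</code></pre>
<p>By contrast, drawing each coordinate uniformly and then normalizing would concentrate points toward the corners of the enclosing cube.</p>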
<h3 id="time-series">Time Series</h3>
<p><a href="https://cran.r-project.org/package=rollRegres">rollRegres</a> v0.1.0: Implements fast methods for rolling and expanding linear regression models. The methods use rank-one updates and downdates of the upper triangular matrix from a QR decomposition. See <a href="https://doi.org/10.1137/1.9781611971811">Dongarra et al. (1979)</a>. The <a href="https://cran.r-project.org/web/packages/rollRegres/vignettes/Comparisons.html">vignette</a> provides some details.</p>
<h3 id="utilities">Utilities</h3>
<p><a href="https://cran.r-project.org/package=anyLib">anyLib</a> v1.0.4: Provides functions to install and load a list of packages from CRAN, Bioconductor or GitHub. For GitHub packages, if you do not supply the full path including the maintainer name (e.g. “achateigner/topReviGO”), it can load the package but not install it. There is a brief <a href="https://cran.r-project.org/web/packages/anyLib/vignettes/help.html">vignette</a>.</p>
<p><a href="https://cran.r-project.org/package=dbx">dbx</a> v0.2.1: Provides select, insert, update, upsert, and delete database operations for <code>PostgreSQL</code>, <code>MySQL</code>, <code>SQLite</code>, and other databases. See the <a href="https://cran.r-project.org/web/packages/dbx/readme/README.html">README</a> for usage.</p>
<p><a href="https://cran.r-project.org/package=envnames">envnames</a> v0.3.0: Provides functions to keep track of user-defined environment names that cannot be retrieved with the base R function <code>environmentName()</code>. The main function in this package, <code>environment_name()</code>, returns the name of the environment given as parameter. The vignette offers an <a href="https://cran.r-project.org/web/packages/envnames/vignettes/envnames.pdf">overview</a> of the package.</p>
<p><a href="https://cran.r-project.org/package=librarian">librarian</a> v1.3.0: Provides functions to automatically install, update, and load CRAN and GitHub packages in a single function call. See <a href="https://cran.r-project.org/web/packages/librarian/readme/README.html">README</a> for usage.</p>
<p><a href="https://cran.r-project.org/package=makeParallel">makeParallel</a> v0.1.1: Provides functions to automate the transformation of serial R code into more efficient parallel versions. There is a <a href="https://cran.r-project.org/web/packages/makeParallel/vignettes/quickstart.html">Quickstart Guide</a> and a vignette on <a href="https://cran.r-project.org/web/packages/makeParallel/vignettes/concepts.html">Parallel Concepts</a>.</p>
<p><img src="/post/2018-08-21-July-Top40_files/makeParallel.png" alt="" /></p>
<p><a href="https://cran.r-project.org/package=metaDigitise">metaDigitise</a> v1.0.0: Provides functions to extract, summarize and digitize data from published figures in research papers. The <a href="https://cran.r-project.org/web/packages/metaDigitise/vignettes/metaDigitise.html">vignette</a> shows how to use the package.</p>
<p><a href="https://cran.r-project.org/package=RSuite">RSuite</a> v0.32-244: Provides a set of tools to be used with the <a href="http://rsuite.io/">R Suite</a> for developing data-science workflows.</p>
<h3 id="visualization">Visualization</h3>
<p><a href="https://cran.r-project.org/package=ceterisParibus">ceterisParibus</a> v0.3.0: Provides functions to create “What if?” plots of model responses around selected points in a feature space. The four vignettes offer several examples, including a <a href="https://cran.r-project.org/web/packages/ceterisParibus/vignettes/ceteris_paribus.html">Random Forests Example</a> and a <a href="https://cran.r-project.org/web/packages/ceterisParibus/vignettes/ceteris_paribus_HR.html">Classification Example</a>.</p>
<p><img src="/post/2018-08-21-July-Top40_files/ceterisParibus.png" alt="" /></p>
<p><a href="https://cran.r-project.org/package=cytofan">cytofan</a> v0.1.0: Implements fan plots for cytometry data in <code>ggplot2</code>. See <a href="https://www.bankofengland.co.uk/quarterly-bulletin/1998/q1/the-inflation-report-projections-understanding-the-fan-chart">Britton et al. (1998)</a> for information on fan plots, and <a href="https://cran.r-project.org/web/packages/cytofan/readme/README.htm">README</a> for package usage.</p>
<p><img src="/post/2018-08-21-July-Top40_files/cytofan.png" alt="" /></p>
<p><a href="https://cran.r-project.org/package=fingertipscharts">fingertipscharts</a> v0.0.1: Provides tools to recreate the visualizations that are displayed on the <a href="http://fingertips.phe.org.uk/">Fingertips</a> website of U.K. public health data. The <a href="https://cran.r-project.org/web/packages/fingertipscharts/vignettes/quick_charts.html">vignette</a> explains how to use the package.</p>
<p><img src="/post/2018-08-21-July-Top40_files/fingertipscharts.png" alt="" /></p>
<p><a href="https://cran.r-project.org/package=ggvoronoi">ggvoronoi</a> v0.8.0: Provides functions to create, manipulate and visualize Voronoi diagrams using the <a href="https://CRAN.R-project.org/package=deldir"><code>deldir</code></a> and <code>ggplot2</code> packages. The <a href="https://cran.r-project.org/web/packages/ggvoronoi/vignettes/ggvoronoi.html">vignette</a> shows how.</p>
<p><img src="/post/2018-08-21-July-Top40_files/ggvoronoi.png" height = "450" width="650"></p>
<script>window.location.href='https://rviews.rstudio.com/2018/08/27/july-2018-top-40-new-packages/';</script>
June 2018: Top 40 New Packages
https://rviews.rstudio.com/2018/07/29/june-2018-top-40-new-packages/
Sun, 29 Jul 2018 00:00:00 +0000https://rviews.rstudio.com/2018/07/29/june-2018-top-40-new-packages/
<p>Approximately 144 new packages stuck to CRAN in June. The fact that 31 of these are specialized to particular scientific disciplines or analyses provides some evidence for my hypothesis that working scientists are actively adopting R. Below are my Top 40 picks for June, organized into the categories of Computational Methods, Data, Data Science, Economics, Science, Statistics, Time Series, Utilities and Visualizations. The Data packages, especially <code>rtrek</code> and <code>opensensmapr</code>, look like they have some interesting new data to explore.</p>
<h3 id="computational-methods">Computational Methods</h3>
<p><a href="https://cran.r-project.org/package=nnTensor">nnTensor</a> v0.99.1: Provides methods for non-negative matrix factorization and tensor decomposition. See <a href="https://doi.org/10.1002/9780470747278">Cichocki et al. (2009)</a> for details.</p>
<p><a href="https://cran.r-project.org/package=RcppEigenAD">RcppEigenAD</a> v1.0.0: Provides functions to compile <code>C++</code> code using <code>Rcpp</code>, <code>Eigen</code>, and <code>CppAD</code> to produce first- and second-order partial derivatives, and also provides an implementation of Faà di Bruno’s formula to combine the partial derivatives of composed functions. See <a href="https://arxiv.org/abs/math/0601149v1">Hardy (2006)</a>.</p>
<p><a href="https://CRAN.R-project.org/package=rcane">rcane</a> v1.0: Provides optimization algorithms to estimate coefficients in models such as linear regression and neural networks. Includes batch gradient descent, stochastic gradient descent, minibatch gradient descent, and coordinate descent. See <a href="https://doi.org/10.1007/PL00011414">Kiwiel (2001)</a>, <a href="ISBN:1-4020-7553-7">Nesterov (2004)</a>, <a href="https://doi.org/10.1080/01621459.1982.10477894">Ferguson (1982)</a>, <a href="https://arxiv.org/abs/1212.5701">Zeiler (2012)</a>, and <a href="https://arxiv.org/abs/1502.04759">Wright (2015)</a>. The <a href="https://cran.r-project.org/web/packages/rcane/vignettes/rcane.html">vignette</a> introduces the package.</p>
<h3 id="data">Data</h3>
<p><a href="https://cran.r-project.org/package=bjscrapeR">bjscrapeR</a> v0.1.0: Scrapes crime data from the <a href="https://www.bjs.gov/developer/ncvs/methodology.cfm">National Crime Victimization Survey</a>, which tracks personal and household crime in the USA.</p>
<p><a href="https://cran.r-project.org/package=genesysr">genesysr</a> v0.9.1: Implements an API to access data on plant genetic resources from genebanks around the world published on <a href="https://www.genesys-pgr.org">Genesys</a>. The <a href="https://cran.r-project.org/web/packages/genesysr/vignettes/tutorial.html">vignette</a> offers a short tutorial.</p>
<p><a href="https://cran.r-project.org/package=opensensmapr">opensensmapr</a> v0.4.1: Allows users to download real-time environmental measurements and sensor station metadata from the <a href="https://opensensemap.org/">OpenSenseMap</a> API. There are vignettes for <a href="https://cran.r-project.org/web/packages/opensensmapr/vignettes/osem-history.html">Visualization</a>, <a href="https://cran.r-project.org/web/packages/opensensmapr/vignettes/osem-intro.html">Exploration</a>, and <a href="https://cran.r-project.org/web/packages/opensensmapr/vignettes/osem-serialization.html">Caching Data for Reproducibility</a>.</p>
<p><img src="/post/2018-07-21-June-Top40_files/opensense.png" alt="" /></p>
<p><a href="https://cran.r-project.org/package=readabs">readabs</a> v0.2.1: Provides functions to read <code>Excel</code> files from the Australian Bureau of Statistics into Tidy Data Sets. See the <a href="https://cran.r-project.org/web/packages/readabs/vignettes/my-vignette.html">vignette</a>.</p>
<p><a href="https://cran.r-project.org/package=rppo">rppo</a> v1.0: Implements an interface to the <a href="https://www.plantphenology.org/">Global Plant Phenology Data Portal</a>. See the <a href="https://cran.r-project.org/web/packages/rppo/vignettes/rppo-vignette.html">vignette</a>.</p>
<p><a href="https://cran.r-project.org/package=rtrek">rtrek</a> v0.1.0: Provides datasets related to the Star Trek fictional universe, functions for working with the data, and access to real-world datasets based on the televised series and other related licensed media productions. It interfaces with <a href="https://www.wikipedia.org/">Wikipedia</a>, the <a href="http://stapi.co/">Star Trek API (STAPI)</a>, <a href="http://memory-alpha.wikia.com/wiki/Portal:Main">Memory Alpha</a>, and <a href="http://memory-beta.wikia.com/wiki/Main_Page">Memory Beta</a> to retrieve data, metadata, and other information relating to Star Trek. See the <a href="https://cran.r-project.org/web/packages/rtrek/readme/README.html">README</a> for usage information.</p>
<p><img src="/post/2018-07-21-June-Top40_files/rtrek.png" height = "500" width="700"></p>
<p><a href="https://cran.r-project.org/package=skynet">skynet</a> v1.2.2: Implements methods for generating air transport statistics based on publicly available data from the <a href="https://www.transtats.bts.gov/databases.asp?Mode_ID=1&Mode_Desc=Aviation&Subject_ID2=0">U.S. Bureau of Transport Statistics (BTS)</a>. See the <a href="https://cran.r-project.org/web/packages/skynet/vignettes/skynet.html">vignette</a>.</p>
<p><img src="/post/2018-07-21-June-Top40_files/skynet.png" alt="" /></p>
<h3 id="data-science">Data Science</h3>
<p><a href="https://cran.r-project.org/package=AdaSampling">AdaSampling</a> v1.1: Implements the adaptive sampling procedure, a framework for both positive unlabeled learning and learning with class label noise. See <a href="doi:10.1109/TCYB.2018.2816984">Yang et al. (2018)</a> and the <a href="https://cran.r-project.org/web/packages/AdaSampling/vignettes/vignette.html">vignette</a>.</p>
<p><a href="https://cran.r-project.org/package=AROC">AROC</a> v1.0: Provides functions to estimate the covariate-adjusted Receiver Operating Characteristic (AROC) curve and pooled (unadjusted) ROC curve. See <a href="arXiv:1806.00473">de Carvalho and Rodriguez-Alvarez (2018)</a>.</p>
<p><img src="/post/2018-07-21-June-Top40_files/AROC.png" alt="" /></p>
<p><a href="https://CRAN.R-project.org/package=cloudml">cloudml</a> v0.5.1: Provides an interface to the Google Cloud Machine Learning Platform. There is a <a href="https://cran.r-project.org/web/packages/cloudml/vignettes/getting_started.html">Getting Started Guide</a> and vignettes on <a href="https://cran.r-project.org/web/packages/cloudml/vignettes/deployment.html">Deploying Models</a>, <a href="https://cran.r-project.org/web/packages/cloudml/vignettes/storage.html">Cloud storage</a>, <a href="https://cran.r-project.org/web/packages/cloudml/vignettes/training.html">Training</a>, and <a href="https://cran.r-project.org/web/packages/cloudml/vignettes/tuning.html">Hyperparameter Tuning</a>.</p>
<p><img src="/post/2018-07-21-June-Top40_files/cloudml.png" alt="" /></p>
<p><a href="https://cran.r-project.org/package=reclin">reclin</a> v0.1.0: Provides functions to assist in performing probabilistic record linkage and deduplication: generating pairs, comparing records, running an EM algorithm to estimate m- and u-probabilities, and forcing one-to-one matching. There is an <a href="Introduction to reclin">Introduction</a> and a vignette on <a href="https://cran.r-project.org/web/packages/reclin/vignettes/deduplication.html">Deduplication</a>.</p>
<p><a href="https://cran.r-project.org/package=vip">vip</a> v0.1.0: Provides a general framework for constructing variable importance plots from various types of machine learning models, based on a novel approach using partial dependence plots and individual conditional expectation curves as described in <a href="arXiv:1805.04755">Greenwell et al. (2018)</a>. See the <a href="https://cran.r-project.org/web/packages/vip/readme/README.html">README</a> for details and examples.</p>
<p><img src="/post/2018-07-21-June-Top40_files/vip.png" alt="" /></p>
<p><a href="https://cran.r-project.org/package=wevid">wevid</a> v0.4.2: Provides functions to quantify the performance of a binary classifier through weight of evidence. These can be used with any test dataset on which you have observed case-control status, and have computed prior and posterior probabilities of case status using a model learned on a training dataset. Look at this <a href="http://www.homepages.ed.ac.uk/pmckeigu/preprints/classify/wevidtutorial.html">website</a> for details and examples.</p>
<p><img src="/post/2018-07-21-June-Top40_files/wevid.png" alt="" /></p>
<h3 id="economics">Economics</h3>
<p><a href="https://CRAN.R-project.org/package=trade">trade</a> v0.5.3: Provides tools for working with trade models, including the ability to calibrate different consumer-demand systems and simulate the effects of tariffs and quotas under different competitive regimes. The <a href="https://cran.r-project.org/web/packages/trade/vignettes/Reference.html">vignette</a> provides details.</p>
<h3 id="science">Science</h3>
<p><a href="https://CRAN.R-project.org/package=linpk">linpk</a> v1.0: Provides functions and a shiny application to generate concentration-time profiles from linear pharmacokinetic (PK) systems. Single or multiple doses may be specified. The <a href="https://cran.r-project.org/web/packages/linpk/vignettes/linpk-intro.html">vignette</a> offers details and examples.</p>
<p><img src="/post/2018-07-21-June-Top40_files/linpk.png" alt="" /></p>
<p><a href="https://cran.r-project.org/package=ratematrix">ratematrix</a> v1.0: Provides functions to estimate the evolutionary rate matrix (R) using Markov chain Monte Carlo (MCMC), as described in <a href="doi:10.1111/2041-210X.12826">Caetano and Harmon (2017)</a>. There is a vignette on <a href="https://cran.r-project.org/web/packages/ratematrix/vignettes/Set_custom_starting_point.html">Setting a custom starting point</a> and another on <a href="https://cran.r-project.org/web/packages/ratematrix/vignettes/Making_prior_on_ratematrix.html">Using prior distributions</a>.</p>
<p><img src="/post/2018-07-21-June-Top40_files/ratematrix.png" alt="" /></p>
<p><a href="https://cran.r-project.org/package=spectralAnalysis">spectralAnalysis</a> v3.12.0: Provides a toolkit for spectral analysis, enabling users to pre-process, visualize, and analyze process analytical data from spectral measurements made during a chemical process.</p>
<h3 id="statistics">Statistics</h3>
<p><a href="https://cran.r-project.org/package=betaboost">betaboost</a> v1.0.1: Implements boosting beta regression for potentially high-dimensional data <a href="doi:10.1093/ije/dyy093">Mayr et al. (2018)</a> using the same parametrization as <code>betareg</code> <a href="doi:10.18637/jss.v034.i02">Cribari-Neto and Zeileis (2010)</a>. The underlying algorithms are implemented via the R add-on packages <code>mboost</code> <a href="doi:10.1007/s00180-012-0382-5">Hofner et al. (2014)</a> and <code>gamboostLSS</code> <a href="doi:10.1111/j.1467-9876.2011.01033.x">Mayr et al. (2012)</a>. The <a href="https://cran.r-project.org/web/packages/betaboost/vignettes/Using_betaboost_IJE.html">vignette</a> offers examples.</p>
<p><a href="https://cran.r-project.org/package=bfw">bfw</a> v0.1.0: Provides a framework for conducting Bayesian analysis using Markov chain Monte Carlo with the <a href="http://mcmc-jags.sourceforge.net/">JAGS</a> sampler. There are vignettes on <a href="https://cran.r-project.org/web/packages/bfw/vignettes/fit_latent_data.html">Fitting Latent Data</a>, <a href="https://cran.r-project.org/web/packages/bfw/vignettes/fit_observed_data.html">Fitting Observed Data</a>, the <a href="https://cran.r-project.org/web/packages/bfw/vignettes/metric.html">Predict Metric</a>, <a href="https://cran.r-project.org/web/packages/bfw/vignettes/plot_data.html">Plotting</a>, and <a href="https://cran.r-project.org/web/packages/bfw/vignettes/regression.html">Regression</a>.</p>
<p><img src="/post/2018-07-21-June-Top40_files/bfw.png" height = "500" width="700"></p>
<p><a href="https://cran.r-project.org/package=CaseBasedReasoning">CaseBasedReasoning</a> v0.1: Given a large set of problems and their individual solutions, case-based reasoning seeks to solve a new problem by referring to the solution of that problem that is “most similar” to the new problem. See <a href="doi:10.1016/S0167-9473(02)00058-0">Dippon et al. (2002)</a>, the vignette on <a href="https://cran.r-project.org/web/packages/CaseBasedReasoning/vignettes/Distance_Measures.html">Motivation</a>, and examples of case-based reasoning with a <a href="https://cran.r-project.org/web/packages/CaseBasedReasoning/vignettes/Cox-Beta-Model.html">Cox-Beta Model</a> and a <a href="https://cran.r-project.org/web/packages/CaseBasedReasoning/vignettes/RandomForest-Model.html">Random Forest Model</a>.</p>
<p><img src="/post/2018-07-21-June-Top40_files/case.png" alt="" /></p>
<p><a href="https://CRAN.R-project.org/package=coxed">coxed</a> v0.1.1: Provides functions for generating, simulating, and visualizing expected durations and marginal changes in duration from the Cox proportional hazards model. There is a vignette on using the <a href="https://cran.r-project.org/web/packages/coxed/vignettes/coxed.html">coxed() function</a> and another on <a href="https://cran.r-project.org/web/packages/coxed/vignettes/simulating_survival_data.html">simulating survival data</a>.</p>
<p><img src="/post/2018-07-21-June-Top40_files/coxed.png" alt="" /></p>
<p><a href="https://cran.r-project.org/package=GLMMadaptive">GLMMadaptive</a> v0.2-0: Provides functions to fit generalized linear mixed models for a single grouping factor under maximum likelihood approximating the integrals over the random effects with an adaptive Gaussian quadrature rule. See <a href="doi:10.1080/10618600.1995.10474663">Pinheiro and Bates (1995)</a> and the vignettes on <a href="https://cran.r-project.org/web/packages/GLMMadaptive/vignettes/Custom_Models.html">Custom Models</a>,
<a href="https://cran.r-project.org/web/packages/GLMMadaptive/vignettes/GLMMadaptive_basics.html">GLMMadaptive Basics</a>, <a href="https://cran.r-project.org/web/packages/GLMMadaptive/vignettes/Methods_MixMod.html">Methods for MixMod Objects</a>, and <a href="https://cran.r-project.org/web/packages/GLMMadaptive/vignettes/ZeroInflated_and_TwoPart_Models.html">Zero-Inflated and Two-Part Mixed Effects Models</a>.</p>
<p><a href="https://cran.r-project.org/package=glmmfields">glmmfields</a> v0.1.0: Implements generalized linear mixed models with robust random fields for spatiotemporal modeling. The <a href="https://cran.r-project.org/web/packages/glmmfields/vignettes/spatial-glms.html">vignette</a> provides examples.</p>
<p><img src="/post/2018-07-21-June-Top40_files/glmmfields.png" height = "500" width="700"></p>
<p><a href="https://cran.r-project.org/package=kendallRandomWalks">kendallRandomWalks</a> v0.9.3: Provides functions for simulating Kendall random walks, continuous-space Markov chains generated by the Kendall generalized convolution. See <a href="arXiv:1412.0220">Jasiulis-Gołdyn (2014)</a> for details and the vignettes <a href="https://cran.r-project.org/web/packages/kendallRandomWalks/vignettes/kendall_rws.html">Kendall Random Walks</a> and <a href="https://cran.r-project.org/web/packages/kendallRandomWalks/vignettes/behaviour.html">Studying the Behavior of Kendall Random Walks</a>.</p>
<p><img src="/post/2018-07-21-June-Top40_files/kendall.png" height = "500" width="500"></p>
<p><a href="https://cran.r-project.org/package=netSEM">netSEM</a> v0.5.0: Provides functions for structural equation modeling. There is an <a href="https://cran.r-project.org/web/packages/netSEM/vignettes/netSEM.html">Introduction</a> and vignettes on <a href="https://cran.r-project.org/web/packages/netSEM/vignettes/Backsheet.html">Backsheet Degradation</a>, <a href="https://cran.r-project.org/web/packages/netSEM/vignettes/Crack.html">Backsheet Cracking</a>, <a href="https://cran.r-project.org/web/packages/netSEM/vignettes/IVfeature.html">Current Voltage Features</a>, and <a href="https://cran.r-project.org/web/packages/netSEM/vignettes/pet.html">Modeling of the Weathering Driven Degradation of Poly(ethylene-terephthalate) Films</a>.</p>
<p><img src="/post/2018-07-21-June-Top40_files/netSEM.png" alt="" /></p>
<p><a href="https://cran.r-project.org/package=umap">umap</a> v0.1.0.3: Implements the uniform manifold approximation and projection technique for dimension reduction as described in <a href="arXiv:1802.03426">McInnes and Healy (2018)</a>. The <a href="https://cran.r-project.org/web/packages/umap/vignettes/umap.html">vignette</a> shows how to use the package.</p>
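<p>As a quick illustration of what a call looks like, the following sketch embeds the four numeric columns of the built-in <code>iris</code> data in two dimensions. It assumes the <code>umap</code> package is installed and uses its default settings; the variable names are chosen for this example.</p>

```r
library(umap)

# Use the four numeric measurement columns of iris as the input matrix.
iris_data <- as.matrix(iris[, 1:4])

# Run UMAP with the package defaults (a two-dimensional embedding).
set.seed(42)
iris_umap <- umap(iris_data)

# The embedding coordinates are stored in the `layout` component,
# with one row per observation.
embedding <- iris_umap$layout
head(embedding)
```

<p>The rows of <code>embedding</code> line up with the rows of <code>iris</code>, so the coordinates can be plotted and colored by <code>iris$Species</code>.</p>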
<p><a href="https://cran.r-project.org/package=vimp">vimp</a> v1.0.0: Provides functions to calculate point estimates of, and valid confidence intervals for, non-parametric variable importance measures in high and low dimensions. For information about the methods, see <a href="https://biostats.bepress.com/uwbiostat/paper422/">Williamson et al. (2017)</a>. The <a href="https://cran.r-project.org/web/packages/vimp/vignettes/introduction_to_vimp.html">vignette</a> contains an introduction to the package.</p>
<p><img src="/post/2018-07-21-June-Top40_files/vimp.png" height = "500" width="600"></p>
<p><a href="https://cran.r-project.org/package=vsgoftest">vsgoftest</a> v0.3-2: Implements Vasicek and Song goodness-of-fit tests (based on Kullback-Leibler divergence) for a family of distributions that includes uniform, Gaussian, log-normal, exponential, gamma, Weibull, Pareto, Fisher, Laplace, and beta distributions. See <a href="arXiv:1806.07244">Lequesne and Regnault (2018)</a> for details and the <a href="https://cran.r-project.org/web/packages/vsgoftest/vignettes/vsgoftest_tutorial.pdf">Tutorial</a>.</p>
<h3 id="time-series">Time Series</h3>
<p><a href="https://cran.r-project.org/package=anomaly">anomaly</a> v1.0.0: Implements the CAPA (Collective And Point Anomaly) algorithm of <a href="arXiv:1806.01947">Fisch, Eckley and Fearnhead (2018)</a> for the detection of anomalies in time series data.</p>
<p><a href="https://cran.r-project.org/package=exuber">exuber</a> v0.1.0: Provides functions for testing and dating periods of explosive dynamics (exuberance) in time series using recursive unit root tests as proposed by <a href="doi:10.1111/iere.12132">Phillips et al. (2015)</a>. See the <a href="https://cran.r-project.org/web/packages/exuber/readme/README.html">README</a> to get started.</p>
<p>The package also simulates a variety of periodically collapsing bubble models. Estimation and simulation use the matrix inversion lemma from the recursive least squares algorithm, which yields a significant speed improvement.</p>
<h3 id="utilities">Utilities</h3>
<p><a href="https://cran.r-project.org/package=BiocManager">BiocManager</a> v1.30.1: Implements a tool to install and update Bioconductor packages. The <a href="https://cran.r-project.org/web/packages/BiocManager/vignettes/BiocManager.html">vignette</a> shows how to use the package.</p>
<p><a href="https://cran.r-project.org/package=IntervalSurgeon">IntervalSurgeon</a> v1.0: Provides functions for manipulating integer-bounded intervals including finding overlaps, piling, and merging. The <a href="https://cran.r-project.org/web/packages/IntervalSurgeon/vignettes/intro.html">vignette</a> shows how to use the package.</p>
<p><img src="/post/2018-07-21-June-Top40_files/interval.png" alt="" /></p>
<p><a href="https://cran.r-project.org/package=pkgbuild">pkgbuild</a> v1.0.0: Provides functions used to build R packages. Locates compilers needed to build R packages on various platforms and ensures the PATH is configured appropriately.</p>
<p><a href="https://cran.r-project.org/package=rqdatatable">rqdatatable</a> v0.1.2: Implements the <code>rquery</code> piped query algebra using <code>data.table</code>. There is a vignette on <a href="https://cran.r-project.org/web/packages/rqdatatable/vignettes/GroupedSampling.html">Grouped Sampling</a> and a <a href="https://cran.r-project.org/web/packages/rqdatatable/vignettes/logisticexample.html">Logistic Example</a>.</p>
<p><a href="https://cran.r-project.org/package=ssh">ssh</a> v0.2: Provides functions to connect to a remote server over SSH to transfer files via SCP, set up a secure tunnel, or run a command or script on the host while streaming stdout and stderr directly to the client. There is a <a href="https://cran.r-project.org/web/packages/ssh/vignettes/intro.html">vignette</a>.</p>
<h3 id="visualization">Visualization</h3>
<p><a href="https://cran.r-project.org/package=mgcViz">mgcViz</a> v0.1.1: An extension of the <code>mgcv</code> package, providing visual tools for Generalized Additive Models (GAMs) that exploit the additive structure of GAMs, scale to large data sets, and can be used in conjunction with a wide range of response distributions. See the <a href="https://cran.r-project.org/web/packages/mgcViz/vignettes/mgcviz.html">vignette</a> for examples.</p>
<p><img src="/post/2018-07-21-June-Top40_files/mgcViz.png" alt="" /></p>
<p><a href="https://cran.r-project.org/package=tiler">tiler</a> v0.2.0: Provides functions to create geographic map tiles from geospatial map files or non-geographic map tiles from simple image files. The <a href="https://cran.r-project.org/web/packages/tiler/vignettes/tiler.html">vignette</a> provides an introduction.</p>
<p><img src="/post/2018-07-21-June-Top40_files/tiler.png" height = "500" width="600"></p>
<script>window.location.href='https://rviews.rstudio.com/2018/07/29/june-2018-top-40-new-packages/';</script>
JSM 2018 Itinerary
https://rviews.rstudio.com/2018/07/25/jsm-2018-itinerary/
Wed, 25 Jul 2018 00:00:00 +0000https://rviews.rstudio.com/2018/07/25/jsm-2018-itinerary/
<p>JSM 2018 is almost here! Usually around this time, I comb through the entire program manually making an itinerary for myself. But this year I decided to try something new – a programmatic way of going through the program, and then building a Shiny app that helps me better navigate the online program.</p>
<p>The end result of the app is below. (I might tweak it a bit further after this post goes live, depending on feedback I receive.) You can interact with the app <a href="https://minecr.shinyapps.io/jsm2018-schedule/">here</a>.</p>
<p>I’m often dissatisfied with conference webpages, and almost always dissatisfied with conference apps, so I thought this was a good opportunity to build one myself. Also, I’m teaching Shiny at JSM and have been wanting to acquaint myself better with the <a href="http://glue.tidyverse.org/">glue</a> package, so I figured spending some time web scraping, text wrangling, and building an app could be fun. (Note to self: Next time do this <em>after</em> you’re done preparing all your presentations!)</p>
<p>This is a three part blog post: (1) the data, (2) the app, and (3) the itinerary.</p>
<p>All relevant source code can be found <a href="https://github.com/mine-cetinkaya-rundel/jsm2018-schedule">here</a>.</p>
<div id="the-data" class="section level2">
<h2>The data</h2>
<p>The data were scraped from the <a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/index.cfm">JSM 2018 Online Program</a>.</p>
<p>Before scraping the data, I checked that the scraping is allowed on this page using <code>robotstxt::paths_allowed()</code>.</p>
<pre class="r"><code>library(robotstxt)
paths_allowed("http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/index.cfm")</code></pre>
<pre><code>##
ww2.amstat.org No encoding supplied: defaulting to UTF-8.</code></pre>
<pre><code>## [1] TRUE</code></pre>
<p>Looks like we’re good to go!</p>
<p>Once you’re on the online program <a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/index.cfm">landing page</a>, you need to click Search without setting any search parameters in order to get to a page with information on all JSM sessions. For convenience, I saved the resulting HTML file for this page in my repo. (You can access it <a href="https://github.com/mine-cetinkaya-rundel/jsm2018-schedule/blob/master/data/jsm2018.html">here</a>.)</p>
<p>The next step is using <a href="https://github.com/hadley/rvest">rvest</a> and the <a href="https://selectorgadget.com/">SelectorGadget</a> to scrape the data. This process allows us to take not-so-tidy data from the web and turn it into a tidy data frame that we can then work with in R.</p>
<div class="figure">
<img src="/post/2018-07-25-jsm-2018-itinerary_files/data-scrape.png" />
</div>
<p>Based on these data, I created two data frames: one for sessions and the other for talks. These two data frames will serve as the source data for the two tabs in the app. The code for scraping the data and wrangling it into these two data frames can be found <a href="https://github.com/mine-cetinkaya-rundel/jsm2018-schedule/blob/master/scrape.R">here</a>.</p>
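<p>For readers who have not used <code>rvest</code> before, the general pattern looks roughly like the sketch below. The HTML snippet, the CSS selectors, and the column names are all made up for illustration; the real program page has a different structure, and the actual scraping code is the one in the repo.</p>

```r
library(rvest)

# Hypothetical markup standing in for the saved program page.
# The class names here are assumptions, not the real page's classes.
page <- read_html('<table>
  <tr class="session"><td class="time">8:30 AM</td><td class="title"><a href="details?id=1">Session A</a></td></tr>
  <tr class="session"><td class="time">10:30 AM</td><td class="title"><a href="details?id=2">Session B</a></td></tr>
</table>')

# Select the link inside each session title cell once, then pull out
# both its text and its href attribute.
title_links <- html_nodes(page, "tr.session td.title a")

sessions <- data.frame(
  time  = html_text(html_nodes(page, "tr.session td.time")),
  title = html_text(title_links),
  url   = html_attr(title_links, "href"),
  stringsAsFactors = FALSE
)

sessions
```

<p>In practice, SelectorGadget is what helps you find selectors like the ones assumed above by clicking on the elements you want in the browser.</p>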
</div>
<div id="the-app" class="section level2">
<h2>The app</h2>
<p>The app is built with Shiny, using a <code>navbarPage</code> to allow for two separate <code>tabPanel</code>s.</p>
<p>The first panel is the session schedule. This tab allows users to subset sessions based on days and times, as well as session sponsors and types. This is similar to the functionality on the JSM page; however, it’s designed to easily subset for sessions I like having on my radar, and it allows me to subset by time of day.</p>
<div class="figure">
<img src="/post/2018-07-25-jsm-2018-itinerary_files/jsm2018-app-sessions.png" />
</div>
<p>The second tab is designed to navigate talks, as opposed to sessions. You can look for keywords in talk titles. Curious how many talks have “R” in their title? How about “tidy”? Take a guess first, then peek at <a href="https://minecr.shinyapps.io/jsm2018-schedule/">the app</a> to check your answer.</p>
<div class="figure">
<img src="/post/2018-07-25-jsm-2018-itinerary_files/jsm2018-app-talks.png" />
</div>
<p>The source code for the app can be found <a href="https://github.com/mine-cetinkaya-rundel/jsm2018-schedule/blob/master/app.R">here</a>.</p>
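<p>If you just want a feel for the structure, here is a stripped-down sketch of a <code>navbarPage</code> with two <code>tabPanel</code>s and a keyword filter on talk titles. The toy <code>talks</code> data frame, the input and output IDs, and the <code>filter_talks()</code> helper are assumptions made for this example and are not taken from the real app.</p>

```r
library(shiny)

# Hypothetical stand-in data; the real app uses the scraped data frames.
talks <- data.frame(
  title = c("Tidy tools for teaching", "Bayesian models in R", "Survey weighting"),
  stringsAsFactors = FALSE
)

# Case-insensitive keyword match on talk titles (illustrative helper).
filter_talks <- function(talks, keyword) {
  if (is.null(keyword) || !nzchar(keyword)) return(talks)  # no filter yet
  keep <- grepl(tolower(keyword), tolower(talks$title), fixed = TRUE)
  talks[keep, , drop = FALSE]
}

ui <- navbarPage(
  "JSM 2018",
  tabPanel("Sessions", p("Session filters would go here.")),
  tabPanel(
    "Talks",
    textInput("keyword", "Keyword in title", value = ""),
    tableOutput("talks")
  )
)

server <- function(input, output, session) {
  # Re-filter the table whenever the keyword input changes.
  output$talks <- renderTable(filter_talks(talks, input$keyword))
}

# Only launch the app when run interactively.
if (interactive()) shinyApp(ui, server)
```

<p>Keeping the filtering logic in a plain function like this makes it easy to test outside of Shiny before wiring it into the server.</p>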
</div>
<div id="the-itinerary" class="section level2">
<h2>The itinerary</h2>
<p>Using a combination of the app I built and good ol’ Ctrl-F on the online program, I came up with the following itinerary for JSM. The foci of the sessions I selected are education, data science, computing, visualization, and social responsibility. I obviously won’t make it to all the sessions I list here, but I plan to at least try to get my hands on the slides for the talks I don’t make.</p>
<p>I also plan on stopping by the <a href="https://ww2.amstat.org/meetings/jsm/2018/dataartshow.cfm">JSM Data Art Show</a> at some point!</p>
<p>If you have suggestions for other sessions (on these topics or others) that you think should be on this list, let me know in the comments!</p>
<div id="saturday-jul-28" class="section level3">
<h3>Saturday, Jul 28</h3>
<ul>
<li><p>8:00 AM - 12:00 PM: <a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215972">Shiny Essentials</a> - This morning I will be teaching a half-day workshop on building Shiny apps and dashboards. If you’re interested, you can sign up <a href="https://www.amstat.org/_EventSolution/EventDisplay.aspx?WebsiteKey=26030f62-5b88-4f45-9fe5-e6cf2757ee09&Eventkey=JSM2018&e0fa4633_3294_4cc7_b5d8_eef47e46e6ea=2#e0fa4633_3294_4cc7_b5d8_eef47e46e6ea">here</a>.</p></li>
<li><p>8:00 AM - 4:00 PM: <a href="https://sites.google.com/view/preparetoteach/">Preparing Graduate Students to Teach Statistics and Data Science</a> - This is a workshop designed to prepare graduate students for a role as undergraduate faculty responsible for teaching statistics and data science. I will be teaching two modules in this workshop in the afternoon.</p></li>
</ul>
</div>
<div id="sunday-jul-29" class="section level3">
<h3>Sunday, Jul 29</h3>
<ul>
<li>2:00 PM - 3:50 PM
<ul>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215165">Data Science Education - Successes and Challenges: Stories from the Classroom and Beyond</a></li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215027">Transparency, Reproducibility and Replicability in Work with Social and Economic Data</a></li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215835">Introductory Overview Lecture: The Deep Learning Revolution</a></li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215591">Leading to Quantitative Literacy</a></li>
</ul></li>
<li>4:00 PM - 5:50 PM: <a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215836">Introductory Overview Lecture: Examining What and How We Teach at All Levels: Key Ideas to Ensure the Progress and Relevance of Statistics — Invited Special Presentation</a> - I will be chairing this session and based on what I’ve seen from the fantastic speakers so far, I strongly recommend you not miss it!</li>
</ul>
</div>
<div id="monday-jul-30" class="section level3">
<h3>Monday, Jul 30</h3>
<ul>
<li>7:00 AM - 8:30 AM: <a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=216530">Section on Statistical Education Officers Meeting</a></li>
<li>8:30 AM - 10:20 AM:
<ul>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215215">Visualization and Reproducibility - Challenges and Best Practices</a></li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215677">Curricular Considerations for Statistics and Data Science Education</a></li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215837">Introductory Overview Lecture: Leading Data Science: Talent, Strategy, and Impact</a></li>
</ul></li>
<li>10:30 AM - 12:20 PM
<ul>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215256">Creating and Sustaining an Undergraduate Research Program</a></li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215482">Statistical Computing and Statistical Graphics: Student Paper Award and Chambers Statistical Software Award</a></li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215938">SPEED: Teaching Statistics: Strategies and Applications</a> (Ends at 11:15 AM)</li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215941">SPEED: Data Expo</a> (Starts at 11:35 AM)</li>
</ul></li>
<li>2:00 PM - 3:50 PM: It will be particularly difficult to choose between these three sessions.
<ul>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215487">Late-Breaking Session: Addressing Sexual Misconduct in the Statistics Community</a></li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=214992">An Emerging Ecosystem for Data Science/Statistics Education</a></li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215035">Academic Publication Is Dead, Long Live Academic Publication</a></li>
</ul></li>
<li>4:00 PM - 5:50 PM: <a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215427">ASA President’s Invited Address - Helping to Save the Business of Journalism, One Data Insight at a Time</a></li>
<li>6:00 PM - 8:00 PM: <a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=216569">Statistical Computing and Statistics Graphics Mixer</a></li>
<li>7:00 PM - 8:30 PM: <a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=216813">Public Lecture: Born on Friday the Thirteenth: The Curious World of Probabilities</a></li>
</ul>
</div>
<div id="tuesday-jul-31" class="section level3">
<h3>Tuesday, Jul 31</h3>
<ul>
<li>8:30 AM - 11:30 AM: <a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=216546">ASA DataFest Steering Committee and Information Session</a> - If you’re running an ASA DataFest or if you’re interested in running one next year, come join us! We’ll be discussing leads for next year’s dataset, and organization tips.</li>
<li>8:30 AM - 10:20 AM:
<ul>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215839">Introductory Overview Lecture: Reproducibility, Efficient Workflows, and Rich Environments</a></li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215319">Student Outcomes in Undergraduate Courses Using a Simulation-Based Inference Approach to Teaching Statistics</a></li>
</ul></li>
<li>10:30 AM - 12:20 PM:
<ul>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215022">The Future of Spatial and Spatio-Temporal Statistics</a></li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215286">Graphics in Statistical Practice: Saying it with Pictures in the Classroom, Boardroom, or the Consulting Cube</a></li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215809">SPEED: Sports to Fire: Fascinating Applications of Statistics</a> - A variety of interesting applications; there may be some good examples for teaching among them.</li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215744">Data Science</a></li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215887">Late-Breaking Session: Statistical Issues in Application of Machine Learning to High-Stakes Decisions</a></li>
</ul></li>
<li>12:30 PM - 2:00 PM: <a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=216815">2019 SDSS Planning Meeting</a> - I will be the short-course organizer for SDSS 2019. If you have ideas for a short course, either that you might want to teach or that you might want to take, let’s chat at JSM!</li>
<li>2:00 PM - 3:50 PM:
<ul>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215024">Bringing Intro Stats into a Multivariate and Data-Rich World</a> - I will be speaking at this session, hope to see you there!</li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215580">Lead with Statistics: Case Studies and Methods for Learning and Improving Healthcare Through EHRs</a></li>
</ul></li>
<li>5:30 PM - 7:30 PM: <a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=216829">Bayesian Mixer</a></li>
</ul>
</div>
<div id="wednesday-aug-1" class="section level3">
<h3>Wednesday, Aug 1</h3>
<ul>
<li>8:30 AM - 11:30 AM:
<ul>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215099">Getting Shots Inside the Box-Cox</a> - I don’t usually go to sports statistics sessions, but (1) this one is about soccer and (2) that session title is pretty witty!</li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215252">Worldwide Statistics Without Borders Projects: Statistics, Data Visualization, and Decision Making</a></li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215533">Innovative and Effective Teaching for Large-Enrollment Statistics and Data Science Courses</a></li>
</ul></li>
<li>10:30 AM - 12:20 PM:
<ul>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215217">The Potential for Web-Scraping in the Production of Official Statistics: An Opportunity for Statistics to Lead?</a></li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215333">Cloud and Distributed Computing for Statisticians</a></li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215674">Fresh Approaches to Statistical Pedagogy</a></li>
</ul></li>
<li>2:00 PM - 3:50 PM:
<ul>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215679">A Mixed Bag of Graphical Delights</a></li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215089">The State of Peer-Review and Publication in Statistics and the Sciences</a></li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215131">Innovations in Teaching Undergraduate Probability</a></li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215260">Staying Statistically Relevant: Keep Your Skills Sharp!</a></li>
</ul></li>
<li>4:00 PM - 5:50 PM: <a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215426">COPSS Awards and Fisher Lecture - The Future: Stratified Micro-Randomized Trials with Applications in Mobile Health</a></li>
<li>6:00 PM - 7:30 PM: <a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=216529">Section on Statistical Education Business Meeting</a> - Come celebrate the section turning 70; we’ll have cake!</li>
</ul>
</div>
<div id="thursday-aug-2" class="section level3">
<h3>Thursday, Aug 2</h3>
<ul>
<li>8:30 AM - 10:20 AM:
<ul>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215033">Foundation or Backdrop? - the Role of Statisticians in Academic Data Science Initiatives</a></li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215525">GAISEing into Introductory Service Courses in Light of Analytics/Data Science</a></li>
</ul></li>
<li>10:30 AM - 12:20 PM:
<ul>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215338">Expanding the Tent: Undergraduate Majors in Data Science</a> - I’ll be a discussant in this session, and I am very much looking forward to hearing others’ ideas on curricular approaches for data science education.</li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215026">Data Science for Social Good</a></li>
<li><a href="http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215317">The ‘Ergonomics’ of Statistics and Data Science</a></li>
<li><a href="https://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=215097">Statistical Computing on Parallel Architectures</a></li>
</ul></li>
</ul>
</div>
</div>
CVXR: A Direct Standardization Example
https://rviews.rstudio.com/2018/07/20/cvxr-a-direct-standardization-example/
Fri, 20 Jul 2018 00:00:00 +0000https://rviews.rstudio.com/2018/07/20/cvxr-a-direct-standardization-example/
<p>In our <a href="https://rviews.rstudio.com/2017/11/27/introduction-to-cvxr/">first blog post</a>, we introduced <code>CVXR</code>, an R package for disciplined convex optimization, and showed how to model and solve a non-negative least squares problem using its interface. This time, we will tackle a non-parametric estimation example, which features new atoms as well as more complex constraints.</p>
<div id="direct-standardization" class="section level2">
<h2>Direct Standardization</h2>
<p>Consider a set of observations <span class="math inline">\((x_i,y_i)\)</span> drawn non-uniformly from an unknown distribution. We know the expected value of the columns of <span class="math inline">\(X\)</span>, denoted by <span class="math inline">\(b \in {\mathbf R}^n\)</span>, and want to estimate the true distribution of <span class="math inline">\(y\)</span>. This situation may arise, for instance, if we wish to analyze the health of a population based on a sample skewed toward young males, given known population-level averages of sex, age, and so on.</p>
<p>A naive approach would be to simply take the empirical distribution that places equal probability <span class="math inline">\(1/m\)</span> on each <span class="math inline">\(y_i\)</span>. However, this is not a good estimation strategy when our sample is unbalanced. Instead, we will use the method of <strong>direct standardization</strong> (Fleiss, Levin, and Paik 2003, 19.5): we solve for weights <span class="math inline">\(w \in {\mathbf R}^m\)</span> of a weighted empirical distribution, <span class="math inline">\(y = y_i\)</span> with probability <span class="math inline">\(w_i\)</span>, which rectifies the skewness of the sample. This can be posed as the convex optimization problem</p>
<p><span class="math display">\[
\begin{array}{ll} \underset{w}{\mbox{maximize}} & \sum_{i=1}^m -w_i\log w_i \\
\mbox{subject to} & w \geq 0, \quad \sum_{i=1}^m w_i = 1,\quad X^Tw = b.
\end{array}
\]</span></p>
<p>Our objective is the total entropy, which is concave on <span class="math inline">\({\mathbf R}_+^m\)</span>. The constraints ensure <span class="math inline">\(w\)</span> is a probability vector that induces our known expectations over the columns of <span class="math inline">\(X\)</span>, i.e., <span class="math inline">\(\sum_{i=1}^m w_iX_{ij} = b_j\)</span> for <span class="math inline">\(j = 1,\ldots,n\)</span>.</p>
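<p>As a sketch of what the solution looks like (a standard Lagrangian argument, not part of the original derivation), introduce multipliers <span class="math inline">\(\mu \in {\mathbf R}\)</span> and <span class="math inline">\(\nu \in {\mathbf R}^n\)</span> for the two equality constraints; both symbols are our own notation. Assuming the problem is feasible with strictly positive weights, so that the constraint <span class="math inline">\(w \geq 0\)</span> is inactive, setting the gradient of the Lagrangian to zero gives</p>
<p><span class="math display">\[
\begin{array}{l}
L(w,\mu,\nu) = -\sum_{i=1}^m w_i\log w_i - \mu\left(\sum_{i=1}^m w_i - 1\right) - \nu^T(X^Tw - b), \\
\frac{\partial L}{\partial w_i} = -\log w_i - 1 - \mu - x_i^T\nu = 0
\quad\Longrightarrow\quad
w_i = \frac{\exp(-x_i^T\nu)}{\sum_{k=1}^m \exp(-x_k^T\nu)},
\end{array}
\]</span></p>
<p>where <span class="math inline">\(x_i^T\)</span> is the <span class="math inline">\(i\)</span>-th row of <span class="math inline">\(X\)</span> and <span class="math inline">\(\mu\)</span> has been eliminated using <span class="math inline">\(\sum_{i=1}^m w_i = 1\)</span>. Under these assumptions, the optimal weights are an exponential tilt of the uniform weights <span class="math inline">\(1/m\)</span>, with <span class="math inline">\(\nu\)</span> chosen so that <span class="math inline">\(X^Tw = b\)</span>.</p>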
</div>
<div id="an-example-with-simulated-data" class="section level2">
<h2>An Example with Simulated Data</h2>
<p>As an example, we generate a population of 1000 data points with <span class="math inline">\(x_{i,1} \sim \mbox{Bernoulli}(0.5)\)</span>, <span class="math inline">\(x_{i,2} \sim \mbox{Uniform}(10,60)\)</span>, and <span class="math inline">\(y_i \sim N(5x_{i,1} + 0.1x_{i,2},1)\)</span>. We calculate <span class="math inline">\(b_j\)</span> to be the population mean of <span class="math inline">\(x_{\cdot,j}\)</span> for <span class="math inline">\(j = 1,2\)</span>. Then we construct a skewed sample of <span class="math inline">\(m = 100\)</span> points that over-represents small values of <span class="math inline">\(y_i\)</span>, thus biasing its distribution downwards.</p>
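<p>The post does not show the data-generation code, so the following is only a minimal sketch of one way to produce data of this shape in base R. The seed, the helper names <code>m_pop</code>, <code>Xpop</code> and <code>p</code>, and the skewing rule (sampling probability proportional to <span class="math inline">\(\exp(-0.5\,y_i)\)</span>) are our assumptions, not the authors’ choices; the names <code>X</code>, <code>y</code>, <code>ypop</code>, <code>b</code> and <code>m</code> are chosen to match the code used later in the post.</p>
<pre class="r"><code>set.seed(1)      # arbitrary seed, for reproducibility only
m_pop <- 1000    # population size
m <- 100         # size of the skewed sample

## Population: x1 ~ Bernoulli(0.5), x2 ~ Uniform(10, 60), y ~ N(5*x1 + 0.1*x2, 1)
x1 <- rbinom(m_pop, size = 1, prob = 0.5)
x2 <- runif(m_pop, min = 10, max = 60)
ypop <- rnorm(m_pop, mean = 5 * x1 + 0.1 * x2, sd = 1)
Xpop <- cbind(x1, x2)

## Known population-level column means, stored as an n x 1 matrix
b <- matrix(colMeans(Xpop), ncol = 1)

## Skewed sample (assumed rule): small y values are more likely to be drawn
p <- exp(-0.5 * ypop)
p <- p / sum(p)
idx <- sample(m_pop, size = m, replace = FALSE, prob = p)
X <- Xpop[idx, , drop = FALSE]
y <- ypop[idx]</code></pre>
<p>With this rule, <code>mean(y)</code> falls below <code>mean(ypop)</code>, which is the downward bias the weights are meant to correct. A different skewing rule would change the sample but not the rest of the workflow.</p>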
<p>Using <code>CVXR</code>, we construct the direct standardization problem. We first define the variable <span class="math inline">\(w\)</span>.</p>
<pre class="r"><code>library(CVXR)
w <- Variable(m)</code></pre>
<p>Then, we form the objective function by combining <code>CVXR</code>’s library of operators and atoms.</p>
<pre class="r"><code>objective <- Maximize(sum(entr(w)))</code></pre>
<p>Here, <code>entr</code> is the element-wise entropy atom; the S4 object <code>entr(w)</code> represents an <span class="math inline">\(m\)</span>-dimensional vector with entries <span class="math inline">\(-w_i\log(w_i)\)</span> for <span class="math inline">\(i=1,\ldots,m\)</span>. The <code>sum</code> operator acts exactly as expected, forming an expression that is the sum of the entries in this vector. (For a full list of atoms, see the <a href="http://cvxr.rbind.io/post/cvxr_functions/">function reference</a> page.)</p>
<p>Our next step is to generate the list of constraints. Note that, by default, the relational operators apply over all entries in a vector or matrix.</p>
<pre class="r"><code>constraints <- list(w >= 0, sum(w) == 1, t(X) %*% w == b)</code></pre>
<p>Finally, we are ready to formulate and solve the problem.</p>
<pre class="r"><code>prob <- Problem(objective, constraints)
result <- solve(prob)
weights <- result$getValue(w)</code></pre>
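<p>One way to sanity-check weights of this kind without a convex solver is to minimize the Lagrangian dual of the entropy problem, <span class="math inline">\(\log\sum_{i=1}^m \exp(-(x_i - b)^T\nu)\)</span>, over <span class="math inline">\(\nu\)</span> and recover <span class="math inline">\(w_i \propto \exp(-(x_i - b)^T\nu)\)</span>. The sketch below does this with base R’s <code>optim</code>. It is a swapped-in technique for illustration, not how <code>CVXR</code> solves the problem, and the toy data and the names <code>Z</code>, <code>weights_from</code>, <code>dual</code>, <code>dual_grad</code> and <code>w_dual</code> are hypothetical.</p>
<pre class="r"><code>## Hypothetical toy data (not the post's): 22 grid points and target means b
X <- as.matrix(expand.grid(x1 = c(0, 1), x2 = seq(10, 60, by = 5)))
b <- c(0.6, 40)
Z <- sweep(X, 2, b)   # centre each column at its target mean

## Weights implied by multipliers nu: w_i proportional to exp(-z_i' nu)
weights_from <- function(nu) {
  a <- as.vector(-Z %*% nu)
  e <- exp(a - max(a))   # subtract the maximum for numerical stability
  e / sum(e)
}

## Dual objective log(sum_i exp(-z_i' nu)), computed stably
dual <- function(nu) {
  a <- as.vector(-Z %*% nu)
  max(a) + log(sum(exp(a - max(a))))
}

## Its gradient is -Z' w, which vanishes exactly when X' w = b
dual_grad <- function(nu) -as.vector(t(Z) %*% weights_from(nu))

fit <- optim(c(0, 0), dual, dual_grad, method = "BFGS",
             control = list(reltol = 1e-12, maxit = 500))
w_dual <- weights_from(fit$par)

## Largest violation of the moment constraints X' w = b
resid <- max(abs(as.vector(t(X) %*% w_dual) - b))</code></pre>
<p>If the post’s <code>X</code> and <code>b</code> are substituted for the toy data, <code>w_dual</code> should agree with <code>weights</code> up to solver tolerance, provided <span class="math inline">\(b\)</span> lies inside the convex hull of the sampled rows so that the problem is feasible.</p>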
<p>Using our optimal <code>weights</code>, we can then re-weight our skewed sample and compare it to the population distribution. Below, we plot the density functions using linear approximations for the range of <span class="math inline">\(y\)</span>.</p>
<pre class="r"><code>library(tidyr)    # gather()
library(ggplot2)  # ggplot(), geom_line()

## Approximate density functions
dens1 <- density(ypop)
dens2 <- density(y)
dens3 <- density(y, weights = weights)
yrange <- seq(-3, 15, 0.01)
d <- data.frame(x = yrange,
                True = approx(x = dens1$x, y = dens1$y, xout = yrange)$y,
                Sample = approx(x = dens2$x, y = dens2$y, xout = yrange)$y,
                Weighted = approx(x = dens3$x, y = dens3$y, xout = yrange)$y)
## Plot probability distribution functions
plot.data <- gather(data = d, key = "Type", value = "Estimate", True, Sample, Weighted,
                    factor_key = TRUE)
ggplot(plot.data) +
    geom_line(mapping = aes(x = x, y = Estimate, color = Type)) +
    theme(legend.position = "top")</code></pre>
<pre><code>## Warning: Removed 300 rows containing missing values (geom_path).</code></pre>
<div class="figure"><span id="fig:unnamed-chunk-6"></span>
<img src="/post/2018-07-20-cvxr-a-direct-standardization-example_files/figure-html/unnamed-chunk-6-1.png" alt="Probability distribution functions: population, skewed sample and reweighted sample" width="672" />
<p class="caption">
Figure 1: Probability distribution functions: population, skewed sample and reweighted sample
</p>
</div>
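<p>The warning about removed rows most likely comes from <code>approx</code>: with its default <code>rule = 1</code>, it returns <code>NA</code> for any <code>xout</code> value outside the range of the supplied <code>x</code>, and <code>yrange</code> extends beyond the grid on which each density was evaluated. The toy numbers below are hypothetical and only illustrate that behaviour, along with the <code>yleft</code> and <code>yright</code> arguments as one possible way to return zero outside the range instead.</p>
<pre class="r"><code>## Toy grid: y is known only for x in [1, 3]
x_known <- 1:3
y_known <- c(10, 20, 30)
x_new <- c(0, 2.5, 4)   # 0 and 4 lie outside [1, 3]

## Default rule = 1: NA outside the range, linear interpolation inside
out_default <- approx(x = x_known, y = y_known, xout = x_new)$y

## Return 0 outside the range instead of NA
out_zero <- approx(x = x_known, y = y_known, xout = x_new,
                   yleft = 0, yright = 0)$y</code></pre>
<p>Here <code>out_default</code> is <code>NA, 25, NA</code> and <code>out_zero</code> is <code>0, 25, 0</code>. Whether padding with zeros is appropriate depends on how far the tails of the density estimate have decayed at the edge of its grid.</p>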
<pre class="r"><code>## Return the (weighted) empirical cumulative distribution function
## as a two-column matrix: sorted data values and cumulative probabilities
get_cdf <- function(data, probs) {
    ## Default to equal weights 1/length(data)
    if(missing(probs))
        probs <- rep(1.0/length(data), length(data))
    distro <- cbind(data, probs)
    dsort <- distro[order(distro[,1]),]
    cum_probs <- base::cumsum(dsort[,2])
    cbind(dsort[,1], cum_probs)
}
## Plot cumulative distribution functions
d1 <- data.frame("True", get_cdf(ypop))
d2 <- data.frame("Sample", get_cdf(y))
d3 <- data.frame("Weighted", get_cdf(y, weights))
names(d1) <- names(d2) <- names(d3) <- c("Type", "x", "Estimate")
plot.data <- rbind(d1, d2, d3)
ggplot(plot.data) +
    geom_line(mapping = aes(x = x, y = Estimate, color = Type)) +
    theme(legend.position = "top")</code></pre>
<div class="figure"><span id="fig:unnamed-chunk-7"></span>
<img src="/post/2018-07-20-cvxr-a-direct-standardization-example_files/figure-html/unnamed-chunk-7-1.png" alt="Cumulative distribution functions: population, skewed sample and reweighted sample" width="672" />
<p class="caption">
Figure 2: Cumulative distribution functions: population, skewed sample and reweighted sample
</p>
</div>
<p>As is clear from the plots, the sample probability distribution peaks around <span class="math inline">\(y = 2.0\)</span>, and its cumulative distribution is shifted left from the population’s curve, a result of the downward bias in our sampled <span class="math inline">\(y_i\)</span>. However, with the direct standardization weights, the new empirical distribution lies much closer to the true distribution shown in red.</p>
<p>We hope you’ve enjoyed this demonstration of <code>CVXR</code>. For more examples, check out our <a href="http://cvxr.rbind.io">official site</a> and recent presentation <a href="https://www.youtube.com/watch?v=MyglbtnmQ8A">“Disciplined Convex Optimization with CVXR”</a> at useR! 2018.</p>
</div>