A Guide to Binge Watching R / Medicine 2021

2021-09-09

by Joseph Rickert

R / Medicine is a big deal. This year, the conference grew by 13%, with 665 people from over 60 countries signing up for the virtual event, which was held last month. Of the registrants, 34% were from outside the United States and 17% identified as physicians.

Global map with locations of R Medicine registrants indicated

The conference is now an established international event where experts report on the advanced use of the R language, Machine Learning, and statistical analysis, and discuss the successes and challenges associated with bringing these technologies to day-to-day medical practice.

Almost all of the talks, including keynotes, regular talks, lightning talks, pre-conference workshops, and poster sessions, are available online. Find the links on the R / Medicine site or look through the playlist on the R Consortium YouTube Channel. Note that the posters can be viewed by going to the conference spatial.chat site. (If you and a friend visit at the same time, you should be able to “walk around” the posters and chat about what you see.)

To kick off an evening of binge watching the conference, I would begin with the keynotes.

The Keynotes

Dr. Karandeep Singh sets the hook for his talk, Bringing Machine Learning Models to the Bedside at Scale, two minutes into the video when he asks:

Who are the twenty sickest patients in the hospital right now who are not in the ICU?

This straightforward question immediately gets to the promise and the problems of introducing large-scale machine learning algorithms into the hospital, and indicates how medical practice interacts with big money questions about allocating resources. Both physicians and administrators would like to identify high-risk patients and treat them proactively while being able to confidently spend less on unnecessary tests for low-risk patients. At about 5:10 into the talk, Karandeep begins discussing the challenges associated with introducing machine learning models.

Slide with list of challenges discussed. Is there infrastructure to support models? Should we implement a model? Once implemented, how do we measure model performance? Is a model “good enough” to use? Do users agree on how to use the model? Is the model effective when used? What does governance look like for machine learning models?

In the remainder of the talk he describes the technical infrastructure and then the governance or “social infrastructure” needed for success.

If you enjoy a good detective story and take pride in your ability to interpret a well-done statistical plot, you are certainly going to want to watch Ziad Obermeyer’s keynote, Dissecting Algorithmic Bias. About two minutes into the video, Professor Obermeyer sets the stage with the warning:

The single greatest threat to all of the gains that we can make in using algorithms in medicine is letting them go wrong in increasingly well known ways.

and the observation that, because of the US health care system’s focus on “high risk care management”, an estimated 150 to 200 million Americans are sorted by algorithms every year. He goes on to work through a case study that illustrates how an algorithm built with good intentions had the effect of scaling up racial bias.

Dot plot with regression line of algorithm risk score versus realized cost to show the racial bias in high risk care management

A second case study features an algorithm that “fights against” racial bias. Along the way, Ziad weaves two common themes into his presentation:

So many of the ways that algorithms can go wrong come from training algorithms with the wrong target variables, often “convenient and tempting proxies”.
The necessity of follow-up work to fix underlying problems.

In the remainder of this post, I have organized the talks into six categories that you may find helpful for planning your viewing: Clinical Practice, Clinical Trials, Medical Data, R in Production, R Tools, and Short Courses. The majority of the talks have a machine learning angle. There is quite a bit of Shiny, and several R packages, not all of them on CRAN, are featured. I have provided links when I could find them. I don’t want to spoil anyone’s fun in searching through the videos for “Easter Eggs”, but the Reproducible Research with R short course contains the first preview of the Quarto publishing system given in a talk by anyone at RStudio. (Note that the video needs some editing. Start watching at 9 minutes.)

Clinical Practice

Building an Interpretable ML Model API for Interpretation of CNVs in Patients with Rare Diseases - Francisco Requena
Subgroup Identification and Precision Medicine with the personalized R Package - Jared Huling
R and Shiny Dashboards to Facilitate Quality Improvement in Anesthesiology and Perioperative Care - Robert Lobato
tidytof: Predicting Patient Outcomes from Single-cell Data using Tidy Data Principles - Timothy Keyes
Assessing ML Model Performance in Diverse Populations and Across Time - Victor Castro, Roy Perlis

Clinical Trials

Designing Early Phase Clinical Trials with ppseq - Emily Zabor
Collaborative, Reproducible Exploration of Clinical Trial Data - Michael Kane
Graphical Displays in R for Clinical Trials - Steven Schwager
ctrialsgov: Access, Visualization, and Discovery of the ClinicalTrials.gov Database - Taylor Arnold

Medical Data

Scaling Up and Deploying Shiny and Text Mining for National Health Decisions - Andreas Soteriade, Chris Beeley
Mapping African Health Data with afrimapr Package, Training & Community - Andy South
You R What You Measure: Digital Biomarkers for Insights in Personalized Health - Irene van den Broek
Shiny and REDCap for a Global Research Consortium - Judith Lewis, Stephany Duda
Diving into Registry Data: Using R for Large Norwegian Health Registries - Julia Romanowska
ReviewR: A Shiny App for Reviewing Clinical Records - Laura Wiley, David Mayer
DOPE: An R package for Processing and Classifying Drug Names - Layla Bouzoubaa
medicaldata for Teaching #Rstats - Peter Higgins
Stem Cell Transplant Outcomes Reporting using R/Shiny - Richard Hanna, Stephan Kadauke

R in Production

Second Server to the Right and Straight On ‘til Production: Deploying a GxP Shiny Application - Marcus Adams
Target Markdown and stantargets for Bayesian model validation pipelines - Will Landau
GENETEX: A Genomics Report Text Mining R Package to Capture Real-world Clinico-genomic Data - David Miller, Sophia Shalhout

R Tools

Generalized Additive Models for Longitudinal Biomedical Data - Ariel Mundo
Multistate Data Using the survival Package - Beth Atkinson
Bayesian Random-Effects Meta-analysis using bayesmeta - Christian Röver
An arsenal of R Functions for Statistical Summaries - Ethan Heinzen, Beth Atkinson, Jason Sinnwell
R Markdown and officedown to Automate Clinical Trial Reporting - Damian Rodziewicz
Creating and Styling PPTX Slides with rmarkdown - Emil Hvitfeldt
runway: an R Package to Visualize Prediction Model Performance - Jie Cao, Karandeep Singh
clinspacy: An R package for Clinical Natural Language Processing - Jie Cao, Karandeep Singh
Data Visualization for Machine Learning Practitioners - Julia Silge
Animated Data Visualizations with gganimate for Science Communication during the Pandemic - Kristen Panthagani
Incorporating Risk-of-Bias Assessments into Evidence Syntheses with robvis - Luke McGuinness, Randall Boyes, Alex Fowler
‘gpmodels’: A Grammar of Prediction Models - Sean Meyer, Karandeep Singh
CONSORT Diagrams in R with ggconsort - Travis Gerke

Short Courses

Secure Medical Data Collection: Best Practices with Excel, and Leveling Up to REDCap and CollaboratoR - Peter Higgins, Will Beasley, Kenneth MacLean, Amanda Miller
Introduction to R for Medical Data - Ted Laderas, Daniel Chen, Mara Alexeev
An Introductory R Guide for Targeted Maximum Likelihood Estimation in Medical Research - Ehsan Karim, Hanna Frank
Mapping Spatial Health Data - Marynia Kolak, Susan Paykin
From SAS to R - Joe Korszun
Reproducible Research with R - Alison Hill, Stephan Kadauke, Paul Villanueva