COVID-19 Data Forum: Data Journalism

The COVID-19 Data Forum, a joint project of the Stanford Data Science Institute and the R Consortium, is an ongoing series of multidisciplinary webinars, open to the public, in which topic experts discuss data-related aspects of the scientific response to the pandemic. This post walks through the video recording of the most recent event, held on March 18, 2021, which explored the role of data journalism in the pandemic. The comments and time stamps should help in viewing the video.

What does it take to do a t-test?

In this post, I examine the fundamental assumption of independence underlying the basic independent two-sample t-test for comparing the means of two random samples. In addition to independence, we assume that both samples are draws from normal distributions where the population means and common variance are unknown. I am going to assume that you are familiar with this kind of test, but even if you are not, you are still in the right place.

February 2021: "Top 40" New CRAN Packages

In February, two hundred forty-three new packages made it to CRAN, many of them very interesting and at least one entertaining. It was exceptionally difficult to pick the “Top 40”, but here they are, more or less, in twelve categories: Computational Methods, Data, Finance, Games, Genomics, Machine Learning, Mathematics, Medicine, Networks and Graphs, Statistics, Utilities, and Visualization. iconr in the Networks and Graphs section is a package for doing computational archaeology, a relatively new field that I hope will dig R. I also hope that sassy in the Statistics section helps some statisticians find their way to R.

Cheat Sheets

In a previous post, I described how I was captivated by the virtual landscape imagined by the RStudio education team while looking for resources on the RStudio website. In this post, I’ll take a look at Cheatsheets, another amazing resource hiding in plain sight.

2021 R Conferences

It is not yet clear what lasting impact the COVID-19 pandemic will ultimately have on R conferences. We are still adapting to our inability to attend large events, and trying to make the best of the “silver lining” of virtual events, which permit worldwide participation. The following is an attempt to list 2021 conferences that are likely to have interesting R content. I suspect that it is incomplete. If you know of an R conference that is not mentioned, please add it to the comments section for this post.

January 2021: "Top 40" New CRAN Packages

Two hundred thirty new packages made it to CRAN in January. Below are my “Top 40” selections (AlleleShift, autoharp, autoMrP, autostsm, aweSOM, bayesforecast, cachem, circularEV, cmprskcoxmsm, coder, dataquieR, eList, GenomeAdmixR, ggmulti, ggOceanMaps, ghcm, gplite, igoR, LPDynR, LSMRealOptions, Microsoft365R, MOSS, multibridge, NHSDataDictionaRy, OTrecod, pacviz, parallelPlot, partR2, pwt10, RandomForestsGLS, rgee, rtables, SAMtool, spNetwork, targets, thematic, torchaudio, trainR, ubms, and vimpclust) in ten categories: Data, Finance, Genomics, Machine Learning, Medicine, Science, Statistics, Time Series, Utilities, and Visualization.

R Interface for MiniZinc

Constraint programming is a paradigm for solving combinatorial problems that draws on a wide range of techniques from artificial intelligence, computer science, and operations research. MiniZinc is a free and open-source constraint modeling language designed for formulating constraint satisfaction and discrete optimization problems. Models are compiled into an intermediate representation that is understood by a wide range of solvers.

Some thoughts on rstudio::global talks

The videos from the rstudio::global conference are now available online. I believe that you will find the content of most of these talks to be nothing less than compelling. The themes of the talks range from deeply technical R issues to data science, journalism, art, visualization, education, and public service. In this post, I profile five talks that particularly stood out to me and provide a few thoughts on their significance.

Dec 2020: "Top 40" New CRAN Packages

One hundred twenty-three new packages made it to CRAN in December. Here are my “Top 40” selections in nine categories: Computational Methods, Data, Genomics, Machine Learning, Medicine, Science, Statistics, Utilities, and Visualization.

SEM Time Series Modeling

Structural Equation Models (SEM), which are common in many economic modeling efforts, require fitting and simulating whole systems of equations, where each equation may depend on the results of other equations. In this post, we will show how to do structural equation modeling in R by working through the Klein Model of the United States economy, one of the oldest and most elementary models of its kind.
