May 2018: “Top 40” New Packages

While looking over the 215 or so new packages that made it to CRAN in May, I was delighted to find several packages devoted to subjects a little bit out of the ordinary; for instance, bioacoustics analyzes audio recordings, freegroup looks at some abstract mathematics, RQEntangle computes quantum entanglement, stemmatology analyzes textual musical traditions, and treedater estimates clock rates for evolutionary models. I take this as evidence that R is expanding beyond its traditional strongholds of statistics and finance as people in other fields with serious analytic and computational requirements become familiar with the language.

Reading and analysing log files in the RRD database format

I have frequent conversations with R champions and systems administrators responsible for R, in which they ask how they can measure and analyze the usage of their servers. Among the many solutions to this problem, one of my favourites is to use an RRD database and RRDtool. From Wikipedia: RRDtool (round-robin database tool) aims to handle time series data such as network bandwidth, temperatures or CPU load. The data is stored in a circular buffer based database, thus the system storage footprint remains constant over time.
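
The circular-buffer idea in the Wikipedia description can be illustrated with a small sketch. This is not RRDtool's file format or API; the class name `RoundRobinSeries` and its methods are hypothetical, and the sketch only shows why storage stays constant: once the buffer is full, each new sample overwrites the oldest one.

```python
from collections import deque


class RoundRobinSeries:
    """Hypothetical fixed-capacity time series (illustration only, not RRDtool)."""

    def __init__(self, capacity):
        # A deque with maxlen silently discards the oldest entry when full,
        # so memory use never grows beyond `capacity` samples.
        self._slots = deque(maxlen=capacity)

    def record(self, timestamp, value):
        """Store one (timestamp, value) sample."""
        self._slots.append((timestamp, value))

    def values(self):
        """Return the retained samples, oldest first."""
        return list(self._slots)


# Record five CPU-load samples into a buffer that holds only three.
cpu_load = RoundRobinSeries(capacity=3)
for t, v in enumerate([0.2, 0.5, 0.9, 0.4, 0.1]):
    cpu_load.record(t, v)

print(cpu_load.values())  # only the three most recent samples remain
```

The trade-off is that older history is lost once the buffer wraps around, which is acceptable when only recent usage matters.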

Player Data for the 2018 FIFA World Cup

The World Cup starts today! The tournament, which runs from June 14 through July 15, is probably the most popular sporting event in the world. If you are a soccer fan, you know that learning about the players and their teams and talking about it all with your friends greatly enhances the experience. In this post, I will show you how to gather and explore data for the 736 players from the 32 teams at the 2018 FIFA World Cup.

Monte Carlo Part Two

In a previous post, we reviewed how to set up and run a Monte Carlo (MC) simulation of future portfolio returns and growth of a dollar. Today, we will run that simulation many, many times and then visualize the results.

Monte Carlo

Today, we change gears from our previous work on Fama French and run a Monte Carlo (MC) simulation of future portfolio returns. Monte Carlo relies on repeated, random sampling. We will sample based on two parameters: mean and standard deviation of portfolio returns. Our long-term goal (long-term == over the next two or three blog posts) is to build a Shiny app that allows an end user to build a custom portfolio, simulate returns and visualize the results.
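
As a rough illustration of sampling from a mean and a standard deviation, here is a minimal sketch of a single simulated path. It is not the post's own code: it is written in Python, the function name `simulate_growth` is hypothetical, the normal distribution is an assumption, and the monthly mean (0.6%) and standard deviation (3%) are made-up values.

```python
import random


def simulate_growth(mean_return, sd_return, n_periods, seed=None):
    """Simulate the growth of one dollar over n_periods.

    Each period's return is drawn independently from a normal distribution
    (an assumption of this sketch) with the given mean and standard deviation.
    """
    rng = random.Random(seed)  # a private generator keeps runs reproducible
    growth = [1.0]             # start with one dollar
    for _ in range(n_periods):
        period_return = rng.gauss(mean_return, sd_return)
        growth.append(growth[-1] * (1.0 + period_return))
    return growth


# One simulated path: 120 months with illustrative (made-up) parameters.
path = simulate_growth(mean_return=0.006, sd_return=0.03, n_periods=120, seed=42)
print(f"Simulated value of $1 after 120 months: {path[-1]:.2f}")
```

A single path says little by itself; the method becomes informative when many such paths are generated and their distribution is examined.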

Exploring R Packages with cranly

In a previous post, I showed a very simple example of using the R function tools::CRAN_package_db() to analyze information about CRAN packages. CRAN_package_db() extracts the metadata CRAN stores on all of its 12,000-plus packages and arranges it into a “database”, actually a complicated data frame in which some columns have vectors or lists as entries. It’s simple to run the function, and it doesn’t take very long on my MacBook Air.

April 2018: “Top 40” New Packages

Below are my “Top 40” picks from the approximately 212 new packages that made it to CRAN in April. They are organized into ten categories: Computational Methods, Data, Data Science, Machine Learning, Music, Science, Statistics, Time Series, Utilities, and Visualizations.

Computational Methods: diffeqr v0.1.1 provides an interface to DifferentialEquations.jl, which offers high-performance methods for solving ordinary differential equations (ODE), stochastic differential equations (SDE), delay differential equations (DDE), differential-algebraic equations (DAE), and more.

Enterprise Dashboards with R Markdown

This is the second post in a series on enterprise dashboards. See our previous post, Enterprise-ready dashboards with Shiny Databases. We have been living with spreadsheets for so long that most office workers think it is obvious that spreadsheets generated with programs like Microsoft Excel make it easy to understand data and communicate insights. Everyone in a business, from the newest intern to the CEO, has had some experience with spreadsheets.

2018 R Conferences

rstudio::conf 2018 and the New York R Conference are both behind us, but we are rushing headlong into the season for conferences focused on the R Language and its applications. The European R Users Meeting (eRum) begins this coming Monday, May 14th, in Budapest with three days of workshops and talks. Headlined by R Core member Martin Mächler and fellow keynote speakers Achim Zeileis, Nathalie Villa-Vialaneix, Stefano Maria Iacus, and Roger Bivand, the program features an outstanding array of accomplished speakers including RStudio’s own Barbara Borges Ribeiro, Andrie de Vries, and Lionel Henry.

Rolling Fama French

In a previous post, we reviewed how to import the Fama French 3-Factor data, wrangle that data, and then regress our portfolio returns on the factors. Please have a look at that previous post, as the following work builds upon it.
