An Introduction to Greta

I was surprised by greta. I had assumed that the tensorflow and reticulate packages would eventually enable R developers to look beyond deep learning applications and exploit the TensorFlow platform to create all manner of production-grade statistical applications. But I wasn’t thinking Bayesian. After all, Stan is probably everything a Bayesian modeler could want. Stan is a powerful, production-level probability distribution modeling engine with a slick R interface, deep documentation, and a dedicated development team.

Reticulated Shiny

RStudio recently announced the reticulate package, which is designed to help R users inter-operate with Python code. I was immediately excited by this announcement. In a past life, I worked with a team at the National Renewable Energy Lab (NREL) on vehicle simulations. Their models could predict MPG for vehicles based on driving routes. At the time, I had wanted to build a web app that would allow users to predict MPG for different vehicles based on their daily commutes.

Introduction to Fama French

In two previous posts, we calculated and then visualized the CAPM beta of a portfolio by fitting a simple linear model. Today, we move beyond CAPM’s simple linear regression and explore the Fama French (FF) multi-factor model of equity risk/return. For more background, have a look at the original article published in the Journal of Financial Economics, "Common risk factors in the returns on stocks and bonds". The FF model extends CAPM by regressing portfolio returns on several variables, in addition to market returns.
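As a rough sketch of what "regressing portfolio returns on several variables" looks like, the widely used three-factor form of the FF model can be written as below. The excerpt does not say which factors the post works with, so the choice of SMB (size) and HML (value) is an assumption here, as are the symbol names.

```latex
% Assumed three-factor form; symbols are illustrative, not taken from the post.
% R_{p,t}: portfolio return, R_{f,t}: risk-free rate, R_{m,t}: market return
R_{p,t} - R_{f,t} = \alpha
  + \beta_{m}\,\bigl(R_{m,t} - R_{f,t}\bigr)
  + \beta_{s}\,\mathrm{SMB}_{t}
  + \beta_{h}\,\mathrm{HML}_{t}
  + \varepsilon_{t}
```

Dropping the SMB and HML terms leaves the single-factor CAPM regression, which is the sense in which FF extends CAPM.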

R and TensorFlow Presentations

In early March, the Bay Area useR Group was able to hold an R and TensorFlow mini-conference on Google’s new Sunnyvale campus. Pete Mohanty, a Stanford researcher and frequent BARUG speaker, led off with a talk on his recent kerasformula package, which allows R users to call a keras-based neural net with R formula objects. Pete’s slides show an example of using a regression-style formula with the kerasformula::kms() function to fit a sequential TensorFlow model.

Feb 2018: "Top 40" New Package Picks

Here are my picks for the “Top 40” packages of the 171 new packages that made it to CRAN (and stuck) in February, organized into the following categories: Computational Methods, Data, Finance, Science, Statistics, Time Series, and Utilities.

Computational Methods

adnuts v1.0.0: Provides an implementation of the no-U-turn (NUTS) algorithm by Hoffman and Gelman (2014) for ADMB and TMB models. The vignette will get you started.

CholWishart v0.9.2: Provides functions to sample from the Cholesky factorization of a Wishart random variable, the inverse Wishart distribution and the Cholesky factorization of an inverse Wishart random variable.

Multiple Versions of R

Data scientists prefer using the latest R packages to analyze their data. To ensure a good user experience, you will need a recent version of R running on a modern operating system. If you run R on a production server – and especially if you use RStudio Connect – plan to support multiple versions of R side by side so that your code, reports, and apps remain stable over time. You can support multiple versions of R concurrently by building R from source.

Alternative Design for Shiny

Shiny’s Design

Most Shiny apps out there have a similar design style. It is usually easy for a seasoned Shiny developer to tell the difference between a Shiny app and a standard website. Why is this? Shiny apps ARE websites for all intents and purposes. Why do they not vary as greatly as the rest of the sites we encounter when surfing the web? This is partly because many Shiny developers leverage pre-written code (e.

Analyzing Metadata for CRAN Packages

I have been searching for various ways to find information about R packages for some time now, but I only recently learned about the CRAN_package_db() function in the base tools package. If a colleague hadn’t pointed it out to me, I am sure I would never have found it on my own.

    pdb <- tools:::CRAN_package_db()

When invoked, this function goes out to the CRAN mirror specified by the environment variable R_CRAN_WEB and returns a data frame containing a whole lot of information about each package currently on CRAN.

Visualizing the Capital Asset Pricing Model

In a previous post, we covered how to calculate CAPM beta for our usual portfolio consisting of:

+ SPY (S&P500 fund) weighted 25%
+ EFA (a non-US equities fund) weighted 25%
+ IJS (a small-cap value fund) weighted 20%
+ EEM (an emerging-mkts fund) weighted 20%
+ AGG (a bond fund) weighted 10%

Today, we will move on to visualizing the CAPM beta and explore some ggplot and highcharter functionality, along with the broom package.
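For reference, the quantity being visualized can be sketched as follows, using the five weights listed in the excerpt. The symbols are illustrative assumptions and are not taken from the post itself.

```latex
% Portfolio return as the weighted sum of the five fund returns (weights sum to 1)
R_{p,t} = 0.25\,R_{\mathrm{SPY},t} + 0.25\,R_{\mathrm{EFA},t}
        + 0.20\,R_{\mathrm{IJS},t} + 0.20\,R_{\mathrm{EEM},t}
        + 0.10\,R_{\mathrm{AGG},t}

% CAPM beta: the slope from regressing portfolio returns on market returns R_m
\beta_{p} = \frac{\operatorname{Cov}(R_{p}, R_{m})}{\operatorname{Var}(R_{m})}
```

The second expression is the slope of a simple linear regression of portfolio returns on market returns.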

Jan 2018: "Top 40" New Package Picks

Here are my “Top 40” picks from the two hundred or so new packages that stuck to CRAN in January, listed under seven categories: Data, Data Science, Science, Statistics, Time Series, Utilities and Visualizations (I say “stuck to” because I counted at least six packages that were accepted onto CRAN in January but removed within the month. Having packages quickly removed from CRAN is a phenomenon I have observed in recent months.
