The SeaClass R Package

The Operations Technology and Advanced Analytics Group (OTAAG) at Seagate Technology has decided to share an internal project that helps accelerate development of classification models. The interactive SeaClass tool is contained in an R-based package built using shiny and other CRAN packages commonly used for binary classification. The package is free to use and develop further, but any analysis mistakes are the sole responsibility of the user.

Read more

Database Queries With R

There are many ways to query data with R. This post shows you three of the most common ways: using DBI, using dplyr syntax, and using R Notebooks.

Background: Several recent package improvements make it easier for you to use databases with R. The query examples below demonstrate some of the capabilities of these R packages.

DBI: The DBI specification has gone through many recent improvements. When working with databases, you should always use packages that are DBI-compliant.

Read more

Introduction to Portfolio Returns

Today, we go back a bit to where we probably should have started in the first place, but it wouldn’t have been as much fun. In our previous work on volatility, we zipped through the steps of data import, tidy and transformation. Let’s correct that oversight and do some spade work on transforming daily asset prices to monthly portfolio log returns. Our five-asset portfolio will consist of the following securities and weights:

Read more

Climate Change and Population Modeling in R

A recent paper in Nature Climate Change, “Less than 2°C warming by 2100 unlikely” (Raftery et al. 2017), concludes that the goal of the Paris Agreement is unlikely to be met. Although the conclusion is disheartening, the paper advances the science of climate modeling by developing a joint Bayesian hierarchical model for Gross Domestic Product per capita and carbon intensity. This ensemble of models, in turn, depends on the availability of probabilistic population projections developed by the BayesPop Project at the University of Washington and available on CRAN.

Read more

WordR - A New R Package for Rendering Documents in MS Word Format

Motivation: One day earlier this year, I was faced with the challenge of creating a report for management. It had to be an MS Word document (corporate requirement, you know). It was supposed to be polished and use many of the standard MS Word features like headers, footers, table of contents, and styles. I am not a Word guy, and besides, I wanted to make a reproducible document that would make it easy for me to include R code and plots in the text.

Read more

August 2017 New Package Picks

August was a relatively slow month for new R packages; “only” 180 new packages stuck to CRAN. Here are my “Top 40” picks organized into seven categories: Data, Machine Learning, Miscellaneous, Science, Statistics, Utilities and Visualizations. Although they have been written for specialized audiences, I have included the three “Science” packages because, in my layman’s opinion, they not only seem to be useful, but they are each documented well enough to give an interested person some idea of what they do.

Read more

Survival Analysis with R

With roots dating back to at least 1662 when John Graunt, a London merchant, published an extensive set of inferences based on mortality records, survival analysis is one of the oldest subfields of Statistics [1]. Basic life-table methods, including techniques for dealing with censored data, were discovered before 1700 [2], and in the early eighteenth century, the old masters - de Moivre working on annuities, and Daniel Bernoulli studying competing risks for the analysis of smallpox inoculation - developed the modern foundations of the field [2].

Read more

Report from Mexico City

Editor’s Note: It has been heartbreaking watching the images from México City. Teresa Ortiz, co-organizer of R-Ladies CDMX, reports on the efforts of data scientists to help. Our thoughts are with them, and with the people of México.

It has been a hard couple of days around here. In less than two weeks, México has gone through two devastating earthquakes, and the damage keeps mounting. Nevertheless, the response from the citizens has been outstanding, and Mexican data-driven initiatives have not lagged behind.

Read more

Enterprise-ready dashboards with Shiny and databases

Inside the enterprise, a dashboard is expected to have up-to-the-minute information, to have a fast response time despite the large amount of data that supports it, and to be available on any device. An end user may expect that clicking on a bar or column inside a plot will result in either a more detailed report, or a list of the actual records that make up that number. This article will cover how to use a set of R packages, along with Shiny, to meet those requirements.

Read more

Asset Contribution to Portfolio Volatility

In our previous portfolio volatility work, we covered how to import stock prices, convert to returns and set weights, calculate portfolio volatility, and calculate rolling portfolio volatility. Now we want to break that total portfolio volatility into its constituent parts and investigate how each asset contributes to the volatility.
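The excerpt stops before the method itself, so the following is only a sketch of the standard decomposition, not necessarily the one the post uses. The notation is assumed: $w$ is the vector of portfolio weights, $\Sigma$ is the covariance matrix of asset returns, $\mathrm{MC}_i$ is the marginal contribution of asset $i$, and $\mathrm{CC}_i$ is its component contribution.

```latex
\sigma_p = \sqrt{w^\top \Sigma w}, \qquad
\mathrm{MC}_i = \frac{\partial \sigma_p}{\partial w_i} = \frac{(\Sigma w)_i}{\sigma_p}, \qquad
\mathrm{CC}_i = w_i \, \mathrm{MC}_i, \qquad
\sum_i \mathrm{CC}_i = \frac{w^\top \Sigma w}{\sigma_p} = \sigma_p
```

Because the component contributions sum to the total, the ratio $\mathrm{CC}_i / \sigma_p$ can be read as asset $i$'s share of portfolio volatility.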

Read more
