IPO Exploration

by Jonathan Regenstein

Inspired by recent headlines like “Fear Overtakes Greed in IPO Market after WeWork Debacle” and “This Year’s IPO Class is Least Profitable since the Tech Bubble”, today we’ll explore historical IPO data, and next time we’ll look at the the performance of IPO-driven portfolios constructed during the ten-year period from 2004 to 2014. I’ll admit, I’ve often wondered how a portfolio that allocated money to new IPOs each year might perform since this has to be an ultimate example of a few headline-gobbling whales dominating the collective consciousness. We hear a lot about a few IPOs each year, but there are dozens about which we hear nothing.

Here are the packages we’ll be using today.

library(tidyverse)
library(tidyquant)
library(dplyr)
library(plotly)
library(riingo)
library(roll)
library(tictoc)

Let’s get all the companies listed on the NASDAQ, NYSE, and AMEX exchanges and their IPO dates. That’s not every company that IPO’d in those years, of course, but we’ll go with it as a convenience for today’s purposes. Fortunately, the tq_exchange() function from tidyquant makes it painless to grab this data.

nasdaq <-
  tq_exchange("NASDAQ")

amex  <- 
  tq_exchange("AMEX")

nyse <- 
  tq_exchange("NYSE")

Big-time warning alert: not only have we missed companies that IPO’d and are not listed on those exchanges, we have also missed companies that IPO’d and have ceased to exist, i.e., the companies that went bust, i.e., the very companies that would absolutely scare the heck out of us before we invested in things like recent IPOs. You would need to correct for major survivor bias should you choose to explore this for actual trading. As we’ll see next time, even without those dead companies, portfolios built upon these IPOs are very risky. And, while we’re on the caveat theme, nothing in this post is financial advice in any way.

Back to the exciting part, the code!

Notice how we pulled in data from three different data sources, but our objects have the same column structures. That’s due to some nice work from the tidyquant authors, and it makes our lives easier in the next step, wherein we use bind_rows() to combine the data into one object.

After binding these data together, we select(symbol, company, ipo.year, sector) to isolate a few columns of interest. We will also filter out any tickers with ipo.year equal to NA with filter(!is.na(ipo.year)).

company_ipo_sector <-
  nasdaq %>% 
  bind_rows(amex) %>% 
  bind_rows(nyse) %>% 
  select(symbol, company, ipo.year, sector) %>% 
  filter(!is.na(ipo.year))


company_ipo_sector %>% 
  head()
# A tibble: 6 x 4
  symbol company                                ipo.year sector           
  <chr>  <chr>                                     <dbl> <chr>            
1 TXG    10x Genomics, Inc.                         2019 Capital Goods    
2 YI     111, Inc.                                  2018 Health Care      
3 PIH    1347 Property Insurance Holdings, Inc.     2014 Finance          
4 FLWS   1-800 FLOWERS.COM, Inc.                    1999 Consumer Services
5 BCOW   1895 Bancorp of Wisconsin, Inc.            2019 Finance          
6 VNET   21Vianet Group, Inc.                       2011 Technology       

Before we start implementing and testing portfolio strategies in next week’s post, let’s spend today on some exploration of this data set. We have the sector and IPO year of each sector, and a good place to start is visualizing the number of IPOs by year. The key here is to call count(ipo.year), which will do exactly what we hope: give us a count of the number of IPOs by year

company_ipo_sector %>% 
  group_by(ipo.year) %>%
  count(ipo.year) %>% 
  tail()
# A tibble: 6 x 2
# Groups:   ipo.year [6]
  ipo.year     n
     <dbl> <int>
1     2014   258
2     2015   210
3     2016   184
4     2017   274
5     2018   397
6     2019   310

Then we want to pipe straight to ggplot() and put the new n column on the y-axis.

company_ipo_sector %>% 
  group_by(ipo.year) %>%
  count(ipo.year) %>% 
  ggplot(aes(x = ipo.year, y = n)) +
  geom_col(color = "cornflowerblue") + 
  scale_x_continuous(breaks = scales::pretty_breaks(n = 20)) +
  theme(axis.text.x = element_text(angle = 90))

I like that chart, but it would be nice to be able to hover on the bars and get some more information. Let’s wrap the whole code flow inside of the ggplotly() function from plotly, which will convert this to an interactive chart. The names of the columns will be displayed in the tooltip, so let’s use rename(num IPOs = n, year = ipo.year) to create better labels.

ggplotly(
company_ipo_sector %>% 
  group_by(ipo.year) %>%
  count(ipo.year) %>% 
  rename(`num IPOs` = n, year = ipo.year) %>% 
  ggplot(aes(x = year, y = `num IPOs`)) +
  geom_col(color = "cornflowerblue") + 
  scale_x_continuous(breaks = scales::pretty_breaks(n = 20)) +
  theme(axis.text.x = element_text(angle = 90))
)

We see a big decline in 2008 due to the financial crisis, and a steady rise until 2014 when things jump, but that might be due to the fact that since 2014, not as many companies have had a chance to be delisted. I’ll leave it to an IPO maven to explain things further. I did come across this treasure trove of data on the IPO market for the curious. There’s a lot of interesting stuff in there, but one thing to note about this data source and others I stumbled upon is that IPO data tends to focus on companies with a certain market cap, generally greater than $50 million. We didn’t make any cutoff based on market cap, and thus will have more observations than you might find if you Google something like ‘number of IPOs in year XXXX’. For the curious, I’ll post how to create this market cap filter on linkedin, and more importantly, it does set off some neurons in my brain to think that researchers tend to focus on IPOs of a certain market cap. That usually means there’s weird data stuff going on in the ignored area, or it’s risky, or it’s not worth the time to institutional investors because of market structure issues - or any of a host of reasons to investigate the stuff that other people find unattractive.

Let’s get back on course and chart IPOs by sector by year. Instead of using count, we’ll use add_count(), which is a short-hand for group_by() + add_tally().

company_ipo_sector %>% 
  group_by(ipo.year, sector) %>% 
  select(ipo.year, sector) %>% 
  add_count(ipo.year, sector) %>% 
  slice(1) %>% 
  filter(ipo.year > 2003)
# A tibble: 193 x 3
# Groups:   ipo.year, sector [193]
   ipo.year sector                    n
      <dbl> <chr>                 <int>
 1     2004 Basic Industries          3
 2     2004 Capital Goods             4
 3     2004 Consumer Durables         1
 4     2004 Consumer Non-Durables     2
 5     2004 Consumer Services        14
 6     2004 Energy                    2
 7     2004 Finance                   8
 8     2004 Health Care              13
 9     2004 Miscellaneous             2
10     2004 Public Utilities          2
# … with 183 more rows

Now let’s take that data and pipe it to ggplot(). I want to highlight sector differences by year, so will use the fill = sector aesthetic mapping along with facet_wrap(~ipo.year). Let’s also save some room on the x-axis labels by removing the word Consumer from the sector column using mutate(sector = str_remove(sector, "Consumer")).

company_ipo_sector %>% 
  group_by(ipo.year) %>% 
  filter(ipo.year > 2003 & !is.na(sector)) %>%
  mutate(sector = str_remove(sector, "Consumer")) %>% 
  count(sector) %>% 
  ggplot(aes(x = sector, y = n, fill = sector)) +
  geom_col() +
  facet_wrap(~ipo.year) +
  theme(axis.text.x = element_text(angle = 90)) +
  labs(x = "")

Not perfect, but better. Looking at 2013 through 2019, it immediately jumps out that the Health Care and Finance sectors have the most IPOs. Let’s use between(ipo.year, 2004, 2019) to cut down on the number of years and enforce three rows with facet_wrap(~ipo.year, nrow = 3).

We’ll also wrap our entire code flow in parentheses, and then pipe to ggplotly().

(
  company_ipo_sector %>% 
  group_by(ipo.year) %>% 
  filter(between(ipo.year, 2004, 2019) & !is.na(sector)) %>%
  mutate(sector = str_remove(sector, "Consumer")) %>% 
  count(sector) %>% 
  ggplot(aes(x = sector, y = n, fill = sector)) +
  geom_col() +
  facet_wrap(~ipo.year,  nrow = 5) +
  theme(axis.text.x = element_text(angle = 90)) +
  labs(x = "")
  ) %>% 
  ggplotly()

Hover on that chart and notice that the tooltip is displaying the sector twice - that’s because we used sector in both the x and fill aesthetic mappings. To customize the tooltip, we can add an explicit aesthetic mapping called text and paste together exactly what we want to display. I want to show the sector and the number of IPOs, and label those values as such, so will set text = paste("sector:", sector, "<br> Number of IPOs:", n, sep = " "). That creates a new aesthetic mapping inside ggplot(). Now we need to tell ggplotly() to use just that text by adding the argument tooltip = "text" to ggplotly().

(company_ipo_sector %>% 
  group_by(ipo.year) %>% 
  filter(between(ipo.year, 2004, 2019) & !is.na(sector)) %>%
  mutate(sector = str_remove(sector, "Consumer")) %>% 
  count(sector) %>% 
  ggplot(aes(x = sector, y = n, fill = sector, text = paste("sector:", sector, "<br> Number of IPOs:", n, sep = " "))) +
  geom_col() +
  facet_wrap(~ipo.year,  nrow = 5) +
  theme(axis.text.x = element_text(angle = 90)) +
  labs(x = "")
 ) %>% 
  ggplotly(tooltip = "text")

Alright, the tooltip looks how I want it to - notice that we gained another nice functionality. We can double click in the legend to isolate the trace of just one sector throughout these years.

Here’s one more way to organize this data: facet_wrap() ing by sector instead of year. This emphasizes the history of each sector rather than the structure of each year’s IPO market.

(
company_ipo_sector %>% 
  group_by(ipo.year) %>% 
  filter(between(ipo.year, 2004, 2019) & !is.na(sector)) %>%
  mutate(sector = str_remove(sector, "Consumer")) %>% 
  count(sector) %>%  
  ggplot(aes(x = ipo.year, y = n, fill = sector)) +
  geom_col() +
  facet_wrap(~sector) +
  theme(axis.text.x = element_text(angle = 90))
) %>% 
  ggplotly()

This chart really emphasizes the paths of the different sectors - more than 100 Finance IPOs in each of the last two years. If we start allocating dollars based on IPO year, we’re going to get imbalanced exposure to various sectors. In 2018 and 2019, we would have been making heavy tilts to financial and health care.

Let’s move on to evaluating the performance of these companies since their IPOs, which means we need to import their price histories. We have tickers saved in our company_ipo_sector data frame, but before we grab those tickers, let’s do a little clean up. First, some companies have different share classes and multiple tickers, which means they appear more than once. I’m going to get rid of those by using distinct(company, .keep_all = TRUE).

company_ipo_sector %>% 
  distinct(company, .keep_all = TRUE) %>% 
  head()
# A tibble: 6 x 4
  symbol company                                ipo.year sector           
  <chr>  <chr>                                     <dbl> <chr>            
1 TXG    10x Genomics, Inc.                         2019 Capital Goods    
2 YI     111, Inc.                                  2018 Health Care      
3 PIH    1347 Property Insurance Holdings, Inc.     2014 Finance          
4 FLWS   1-800 FLOWERS.COM, Inc.                    1999 Consumer Services
5 BCOW   1895 Bancorp of Wisconsin, Inc.            2019 Finance          
6 VNET   21Vianet Group, Inc.                       2011 Technology       

Next, let’s remove missing sector labels with filter(!is.na(sector)) and work with just 10 years of data by adding between(ipo.year, 2004, 2014).

company_ipo_sector %>% 
  distinct(company, .keep_all = TRUE) %>% 
  filter(!is.na(sector) &  between(ipo.year, 2004, 2014)) %>% 
  head()
# A tibble: 6 x 4
  symbol company                                ipo.year sector       
  <chr>  <chr>                                     <dbl> <chr>        
1 PIH    1347 Property Insurance Holdings, Inc.     2014 Finance      
2 VNET   21Vianet Group, Inc.                       2011 Technology   
3 TWOU   2U, Inc.                                   2014 Technology   
4 JOBS   51job, Inc.                                2004 Technology   
5 ABIL   Ability Inc.                               2014 Capital Goods
6 XLRN   Acceleron Pharma Inc.                      2013 Health Care  

That leaves us with 842 tickers.

company_ipo_sector %>% 
  distinct(company, .keep_all = TRUE) %>% 
  filter(!is.na(sector) &  between(ipo.year, 2004, 2014)) %>% 
  nrow()
[1] 843

If we want to examine the performance of these companies over time, we’ll need their price histories. 842 tickers is a lot, so for today’s purposes, let’s randomly sample 10 tickers per year, which will give us a total of 110 tickers. To do so, we need to group_by(ipo.year) and then sample_n(10) before a call to pull(symbol).

set.seed(1234)

tickers <-
company_ipo_sector %>% 
  distinct(company, .keep_all = TRUE) %>% 
  filter(!is.na(sector) &  between(ipo.year, 2004, 2014)) %>% 
  group_by(ipo.year) %>%
  # randomly sample 10 tickers per yer
  # if you wish to run this in a more complete way, remove the next line
  sample_n(10) %>%  
  pull(symbol)

We will use the riingo package to import data from the excellent tiingo. We covered that package previously in this post, if a quick refresher is useful.

We do first need to create an API key. riingo makes that quite convenient:

riingo_browse_signup()
# This requires that you are signed in on the site once you sign up
riingo_browse_token() 

Now set your token for this session:

# Need an API key for tiingo

riingo_set_token("your API key here")

We’re ready to rock and roll.

Again, to ease the amount of data, we’ll import monthly data instead of daily data by setting resample_frequency = "monthly" inside of riingo_prices().

prices_riingo <-
  tickers %>%
  riingo_prices(start_date = "2004-01-01",  resample_frequency = "monthly") %>% 
  group_by(ticker)

Here’s a look at the first three months of prices for tickers that IPO’d in the first half 2004. Notice that each price series starts on a different date, depending on what month the company went public.

prices_riingo %>% 
  filter(date <= "2004-07-01") %>% 
  slice(1:3) 
# A tibble: 11 x 14
# Groups:   ticker [4]
   ticker date                close  high   low  open volume adjClose
   <chr>  <dttm>              <dbl> <dbl> <dbl> <dbl>  <int>    <dbl>
 1 ABR    2004-04-30 00:00:00  19.4  20.6  19    20.0 5.61e6     6.21
 2 ABR    2004-05-31 00:00:00  19.0  20.0  18.4  19.4 6.20e5     6.05
 3 ABR    2004-06-30 00:00:00  20.0  21    18.5  19   1.46e6     6.37
 4 BCBP   2004-01-30 00:00:00  21    24.8  21    21   9.90e3    12.0 
 5 BCBP   2004-02-27 00:00:00  28    28    21.1  22   9.80e3    15.9 
 6 BCBP   2004-03-31 00:00:00  27    28    24    27   1.21e4    15.4 
 7 NUVA   2004-05-31 00:00:00  11.2  11.6  10.3  11.0 7.59e6    11.2 
 8 NUVA   2004-06-30 00:00:00  11.0  12.2  10.6  11.2 1.90e6    11.0 
 9 NWBI   2004-01-30 00:00:00  22.5  23.2  20.2  21.2 1.01e6     5.27
10 NWBI   2004-02-27 00:00:00  25.3  26.7  22.5  23.0 1.36e6     5.94
11 NWBI   2004-03-31 00:00:00  25.4  26.1  23.7  25.1 7.87e5     5.96
# … with 6 more variables: adjHigh <dbl>, adjLow <dbl>, adjOpen <dbl>,
#   adjVolume <int>, divCash <dbl>, splitFactor <dbl>

A quick caveat, we’re working with monthly prices here to avoid the load of daily frequencies and, thus, we get the price on the last day of the month of the IPO. Taking NUVA as an example, the stock was listed on 2004-05-13 so our first price observation is 2004-05-31.

prices_riingo %>% 
  filter(ticker == "NUVA") %>% 
  head()
# A tibble: 6 x 14
# Groups:   ticker [1]
  ticker date                close  high   low  open volume adjClose
  <chr>  <dttm>              <dbl> <dbl> <dbl> <dbl>  <int>    <dbl>
1 NUVA   2004-05-31 00:00:00 11.2   11.6 10.3  11.0  7.59e6    11.2 
2 NUVA   2004-06-30 00:00:00 11.0   12.2 10.6  11.2  1.90e6    11.0 
3 NUVA   2004-07-30 00:00:00 10     11.3  9.35 11.1  1.08e6    10   
4 NUVA   2004-08-31 00:00:00  9.49  10.4  8.97  9.9  9.21e5     9.49
5 NUVA   2004-09-30 00:00:00 10.6   11.2  9.32  9.63 6.47e5    10.6 
6 NUVA   2004-10-29 00:00:00  9.86  11.6  9.25 10.4  3.14e6     9.86
# … with 6 more variables: adjHigh <dbl>, adjLow <dbl>, adjOpen <dbl>,
#   adjVolume <int>, divCash <dbl>, splitFactor <dbl>

Let’s go ahead and calculate monthly returns for each of our tickers by calling mutate(monthly_return = close/lag(close) - 1).

prices_riingo %>% 
  select(ticker, date, close) %>% 
  mutate(monthly_return = close/lag(close) - 1) %>% 
  head()
# A tibble: 6 x 4
# Groups:   ticker [1]
  ticker date                close monthly_return
  <chr>  <dttm>              <dbl>          <dbl>
1 ABR    2004-04-30 00:00:00  19.4        NA     
2 ABR    2004-05-31 00:00:00  19.0        -0.0252
3 ABR    2004-06-30 00:00:00  20.0         0.0522
4 ABR    2004-07-30 00:00:00  19          -0.0476
5 ABR    2004-08-31 00:00:00  20.9         0.0984
6 ABR    2004-09-30 00:00:00  22.2         0.0637

We have monthly returns for each of our tickers since inception. Since we ultimately plan to build some toy portfolios based on different selection methods or algorithms, let’s take a quick look at how these companies have performed.

We’ll use a few methods to get a feel for performance. First, we’ll calculate the Sharpe Ratio for each company. For more information on the Sharpe Ratio, see this post, but in short, it’s a measure of the ratio of return to risk, so a higher Sharpe Ratio is better than a lower Sharpe Ratio.

There is a SharpeRatio function in the PerformanceAnalytics package, and we can use tq_performance() to apply that function to our tidy tibble of monthly returns.

We have 110 tickers so we’ll expect to get 110 observations back.

prices_riingo %>% 
  select(ticker, date, close) %>% 
  mutate(monthly_return = close/lag(close) - 1) %>%
  tq_performance(Ra = monthly_return,
                 performance_fun = SharpeRatio,
                 FUN = "StdDev")
# A tibble: 110 x 2
# Groups:   ticker [110]
   ticker `StdDevSharpe(Rf=0%,p=95%)`
   <chr>                        <dbl>
 1 ABR                         0.0703
 2 MNKD                        0.0659
 3 NUVA                        0.142 
 4 DPZ                         0.205 
 5 HEP                         0.0384
 6 KNL                         0.0704
 7 CHCI                        0.0743
 8 BECN                        0.0911
 9 ENS                         0.136 
10 WCG                         0.190 
# … with 100 more rows

Let’s go ahead and order these by Sharpe Ratio, from highest to lowest, using arrange(desc(better_name_for_this_column)).

prices_riingo %>% 
  select(ticker, date, close) %>% 
  mutate(monthly_return = close/lag(close) - 1) %>%
  tq_performance(Ra = monthly_return,
                 performance_fun = SharpeRatio,
                 FUN = "StdDev") %>%
  `colnames<-`(c("ticker", "sharpe_ratio")) %>% 
  arrange(desc(sharpe_ratio)) %>% 
  head(10)
# A tibble: 10 x 2
# Groups:   ticker [10]
   ticker sharpe_ratio
   <chr>         <dbl>
 1 RNG           0.317
 2 AWK           0.288
 3 DG            0.285
 4 GWRE          0.277
 5 IBP           0.248
 6 POST          0.223
 7 WD            0.217
 8 MELI          0.206
 9 DPZ           0.205
10 TRNO          0.202

Since the start dates are jagged, a cumulative return won’t really be a fair comparison - one ticker might have a higher or lower Sharpe Ratio simply by virtue of being in existence longer. To account for that, let’s add the annualize = TRUE argument.

prices_riingo %>% 
  select(ticker, date, close) %>% 
  mutate(monthly_return = close/lag(close) - 1) %>%
  tq_performance(Ra = monthly_return,
                 performance_fun = SharpeRatio,
                 FUN = "StdDev", 
                 annualize = TRUE) %>% 
  `colnames<-`(c("ticker", "sharpe_ratio")) %>% 
  arrange(desc(sharpe_ratio))  %>% 
  head(10)
# A tibble: 10 x 2
# Groups:   ticker [10]
   ticker sharpe_ratio
   <chr>         <dbl>
 1 RNG           1.07 
 2 AWK           0.981
 3 DG            0.960
 4 GWRE          0.919
 5 IBP           0.761
 6 POST          0.699
 7 TRNO          0.643
 8 WD            0.640
 9 DPZ           0.561
10 LOPE          0.533

The tickers in the top 10 didn’t change, but the order of the tickers did change. That would make a big difference if we implemented a sorting algorithm based on Sharpe Ratio cut-points.

Let’s quickly port this to ggplot(). We’ll remove the x-axis labels and tick marks by supplying element_blank() to axis.text.x and axis.ticks.x, customize the tooltip with the text aesthetic mapping again and pipe to ggplotly(). That will give us an interactive, scrollable legend.

(
prices_riingo %>% 
  select(ticker, date, close) %>% 
  mutate(monthly_return = close/lag(close) - 1) %>%
  tq_performance(Ra = monthly_return,
                 performance_fun = SharpeRatio,
                 FUN = "StdDev", 
                 annualize = TRUE) %>%
  `colnames<-`(c("ticker", "sharpe_ratio")) %>% 
  arrange(desc(sharpe_ratio)) %>% 
  ggplot(aes(x = ticker, y = sharpe_ratio, color = ticker, text = paste("ticker:", ticker, "<br> sharpe ratio:", round(sharpe_ratio, 3), sep = " "))) +
  geom_point(aes(color = ticker)) +
  theme(axis.text.x = element_blank(),
        axis.ticks.x = element_blank())
) %>% 
  ggplotly(tooltip = "text")

Note, though, that the Sharpe Ratio is an aggregate measure in the sense that it gives us one number as a measure of performance over time, and could be hiding pockets of high turbulence. To adjust for that, let’s calculate the rolling mean return, rolling standard deviation of returns, and the ratio of those two will give the rolling Sharpe Ratio. I’ve heard rumors that there’s a blazing fast package for rolling calculations called, appropriately, the roll package. roll has a built-in rolling mean function called roll_mean and a rolling standard deviation function called roll_sd.

We can call those in inside of mutate() as long as we convert our monthly_returns column to a matrix inside the pipe. Here’s the full code flow:

prices_riingo %>% 
  select(ticker, date, close) %>% 
  mutate(monthly_returns = close/lag(close) - 1,
         rolling_mean = roll_mean(as.matrix(monthly_returns), 6, na_restore = T),
         rolling_sd = roll_sd(as.matrix(monthly_returns), 6, na_restore = T),
         rolling_sharpe = rolling_mean/rolling_sd) %>% 
  na.omit() %>% 
  head()
# A tibble: 6 x 7
# Groups:   ticker [1]
  ticker date                close monthly_returns rolling_mean rolling_sd
  <chr>  <dttm>              <dbl>           <dbl>        <dbl>      <dbl>
1 ABR    2004-10-29 00:00:00  20.9         -0.0586       0.0138     0.0658
2 ABR    2004-11-30 00:00:00  23.6          0.129        0.0396     0.0768
3 ABR    2004-12-31 00:00:00  24.5          0.0398       0.0375     0.0765
4 ABR    2005-01-31 00:00:00  23.8         -0.0302       0.0404     0.0729
5 ABR    2005-02-28 00:00:00  25            0.0504       0.0324     0.0677
6 ABR    2005-03-31 00:00:00  24.8         -0.01         0.0201     0.0676
# … with 1 more variable: rolling_sharpe <dbl>

And for timing purposes:

tic()
rolling_sharpe_roll_timed <- 
  prices_riingo %>% 
  select(ticker, date, close) %>% 
  mutate(monthly_returns = close/lag(close) - 1,
         rolling_mean = roll_mean(as.matrix(monthly_returns), 6, na_restore = T),
         rolling_sd = roll_sd(as.matrix(monthly_returns), 6, na_restore = T),
         rolling_sharpe = rolling_mean/rolling_sd) %>% 
  na.omit()
toc()
0.021 sec elapsed

On my machine, that took about 0.1 seconds. That is one-tenth of a second to perform three rolling calcs on 110 tickers.

Here’s the same calculation with tidyquant.

sharpe_tq_roll <- function(df){
  SharpeRatio(df, 
              Rf = 0, 
              FUN = "StdDev")
}

tic()
rolling_sharpe_tq_timed <- 
prices_riingo %>% 
  select(ticker, date, close) %>% 
  mutate(monthly_returns = close/lag(close) - 1) %>% 
  na.omit() %>% 
tq_mutate(
        select     = monthly_returns,
        mutate_fun = rollapply, 
        width      = 6,
        align      = "right",
        FUN        = sharpe_tq_roll,
        col_rename = "tq_sharpe"
    ) %>% 
  na.omit()
toc()
11.208 sec elapsed

That took about 10 seconds on my machine.

Alright, speed tests are fun but let’s get these rolling calculations into some visualizations. Let’s see how a first try with ggplotly() treats us.

(
rolling_sharpe_roll_timed %>% 
  ggplot(aes(x = date, y = rolling_sharpe, color = ticker)) +
  geom_line(show.legend = FALSE)
) %>% 
  ggplotly()

Well, that chart isn’t bad - it gives some sense of the range for these rolling Sharpes and we can again double-click the legend to isolate the fund of our choice.

That’s all for today. Next time we’ll get into some CAGR calculations and portfolio building. If you like this sort of code through check out my book, Reproducible Finance with R. Thanks for reading and see you next time!

Some final notes:
1. Not specific to finance, but several of the stringr and ggplot tricks in this post came from this awesome Business Science University course. 2. I’m also going to be posting weekly code snippets on linkedin, connect with me there if you’re keen for some R finance stuff.

Share Comments · · · ·

You may leave a comment below or discuss the post in the forum community.rstudio.com.