5 Takeaways from rstudio::conf 2020

February 18, 2020

I attended rstudio::conf 2020 and boy, was it a valuable experience.

In this post, I share my 5 biggest takeaways from the conference.

What is rstudio::conf?

rstudio::conf is an annual conference for R users hosted by RStudio, the company behind the tidyverse, Shiny, R Markdown, and many other popular R packages.

This year the conference was hosted in San Francisco, California at the Hilton in Union Square. While many folks flew in from across North America, and some from overseas, I live a short distance away and traveled 30 minutes on the Muni. My shortest conference trip, by far!

The conference ran from Monday, January 27 to Thursday, January 30. I didn't attend the workshops on Monday and Tuesday, but I did make it to the reception on Tuesday evening where I presented 25 Data Viz Trends in the #TidyTuesday Community. The purpose of the presentation was to showcase the work of folks who participate in Tidy Tuesday, a weekly social data project in R. Across more than 3,000 (!) data visualizations submitted since the project began in April 2018, I identified 25 emerging trends, including beeswarm plots, calendar heatmaps, the use of annotations to highlight interesting data points, dark mode, screencasts, and more.

Wednesday opened with a brief "We're so happy you're here" from Hadley Wickham, followed by a keynote talk from JJ Allaire, CEO of RStudio, who announced that RStudio, Inc. is now officially RStudio, PBC. As a public-benefit corporation, RStudio's corporate decisions as a for-profit company will always align with its open-source mission. Great news!

The rest of Wednesday was filled with talks, with occasional breaks for food, coffee, and socializing. The day ended with a nighttime trip to the California Academy of Sciences in Golden Gate Park.

Thursday opened with a keynote from Jenny Bryan on how to debug R code and how to help others help you by sharing a reproducible example, or reprex. The rest of the day was much like Wednesday, with talks and short breaks, and it ended with dozens of 5-minute lightning talks.

It was an action-packed few days, and I've done my best to distill my personal rstudio::conf 2020 experience into five key takeaways.

1. You can put R in production without too much fuss

Putting R in production was a hot topic this year.

Alex Gold kicked off a session on R in production with Deploying end-to-end data science with Shiny, Plumber, and Pins in which he introduced a new package from RStudio called pins. Pins allows you to "pin" a resource, like a remote file or an expensive local computation, and then retrieve it later in your pipeline. By doing so, you avoid the trouble of writing intermediate data to a file or storing resources temporarily in a database. Pins can also be useful for easily sharing resources across your organization.

Heather and Jacqueline Nolis presented We're hitting R a million times a day so we made a talk about it, which chronicled their experience running R in an enterprise setting at T-Mobile. They highlighted several packages that help get the job done, including:

  • plumber to create REST APIs to host models,
  • testthat to test API endpoints,
  • loadtest to see how an API performs under a high volume of traffic, and
  • log4r to record everything.
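To give a feel for the first item in that list, here is a minimal sketch of a plumber endpoint. It assumes plumber 1.0 or later is installed. The handler name predict_handler, the /predict path, and the toy scoring rule are hypothetical stand-ins of mine, not anything shown in the talk.

```r
# Hypothetical "model": a fixed linear rule standing in for a real fitted model.
# Query parameters arrive as strings, so the handler converts them first.
predict_handler <- function(x) {
  x <- as.numeric(x)
  list(input = x, prediction = 2 * x + 1)
}

# Only build the API if plumber is available (assumption: plumber >= 1.0).
if (requireNamespace("plumber", quietly = TRUE)) {
  api <- plumber::pr()
  api <- plumber::pr_get(api, "/predict", predict_handler)
  # Uncomment to serve the API locally, then visit /predict?x=3
  # plumber::pr_run(api, port = 8000)
}
```

Because the handler is an ordinary R function, it can be checked with testthat before it ever sits behind an endpoint, which is one reason these packages pair so nicely.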

For resources on putting R in production, visit putrinprod.com.

2. Data visualization is fundamental

I'm a sucker for a good data visualization. So naturally I gravitated to the sessions on visualization and ggplot2.

The talks were great, and there were two that stood out to me in particular.

The first was The Glamour of Graphics by Will Chase. As you may know, the Grammar of Graphics, on which ggplot2 is based, asserts that the components of a graph can be described by data, aesthetics, and geometries. Similarly, the Glamour of Graphics as outlined by Will asserts that the design of a graph can be described by:

  1. layout, which includes alignment of text, white space, grids, lines, and borders,
  2. typography (visit practicaltypography.com to learn more), and
  3. color.

Will excelled in making fundamental design principles accessible to an audience of non-designers.

The second talk was Designing Effective Visualizations by Miriah Meyer who heads the Visualization Design Lab (VDL) at University of Utah. Miriah and the VDL team work closely with researchers from a variety of domains to design and develop visualizations for their big, heterogeneous, and complex datasets.

In her talk, Miriah recounted a collaboration with USAID to visualize efforts to combat the Zika outbreak in Central and South America. After some initial development, the VDL team delivered an interactive visualization to the Zika researchers that seemed to satisfy all their requirements. Job well done! But after a short time, it became apparent that nobody was bothering to use it. Why?

As it later became clear, there were large discrepancies in the data across different regions. Country A might only report cases of Zika which resulted in death, while Country B might report any possible case of Zika, confirmed or otherwise. To address this, the VDL team added the capability for Zika researchers to comment on data discrepancies and discuss them within the visualization. (For more info, read their paper A Framework for Externalizing Implicit Error Using Visualization.)

I found Miriah's story fascinating because it teaches that the value of data visualization is more than just the visualization itself. It's about how it empowers someone to make sense of their data.

3. R is more web friendly than ever

Shiny has come a long way since its release in 2012.

I remember trying to create an application with it in its early days. I was in grad school at the time and I thought a nice Shiny app would impress my new advisor. "Wow, what an amazing, talented student!" he would say.

Except he didn't, because I never finished it. I tried to reverse engineer one of the handful of example applications on the Shiny website, but I couldn't quite figure it out.

Thankfully there are many resources for a budding Shiny developer today. In fact, there are now more than 100 example applications in the Shiny Gallery (of which, I'm proud to say, my tidytuesday.rocks app is one).

RStudio has done a great job encouraging folks to develop and share their Shiny apps. In Making the Shiny Contest, Mine Çetinkaya-Rundel reviewed the results of the 2019 Shiny Contest, which received a total of 136 submissions across a range of applications. Among the winners of the contest were a tool for interactive exploration of high-throughput biological datasets, a lyrical analysis of 69 Love Songs by The Magnetic Fields, a hex sticker memory game, and Jenna Allen's Pet Records app, which keeps track of her dogs' medical and vaccine records. Mine ended her talk by announcing the 2020 Shiny Contest. The submission deadline is March 20, 2020, and I look forward to seeing what everyone creates this time around!

In another session, Joe Cheng presented Styling Shiny apps with Sass and Bootstrap 4 in which he introduced bootstraplib, an R package for styling Shiny apps directly from R instead of writing raw CSS and HTML. It even includes a way to interactively theme your application, producing code which you can copy and paste into your existing application.

Shiny is not the only way to make R more web friendly, however.

A great example is Toward a grammar of behavioural experiments by Danielle Navarro. Every semester, Danielle watched her psychology students struggle to use jsPsych, a JavaScript library for running behavioral experiments in the web browser. The students were already using R for statistical analysis, data wrangling, and general programming. For many, it was their first programming language. Why couldn't they also use R for experiments? So Danielle developed jaysire, an R package that helps her students build jsPsych experiments in R, no JavaScript required.

Like it or not, JavaScript is the language of the web. But that doesn't mean everyone should be required to learn it in order to share their work online. Is there a popular JavaScript library in your field that is in need of a related R package? Are you up for the challenge?

4. Always tell a good story

You've probably heard this advice before — when you're presenting, be sure to tell a good story.

Sure, of course! That's easy to say, but it's difficult to do in practice. How exactly are you supposed to strike the right balance between storytelling and sharing important details?

Frankly, I don't know. There were several speakers at rstudio::conf who did a heck of a job though, so I encourage you to learn from them.

Jenny Bryan spoke about the dreaded Object of type 'closure' is not subsettable, a common error in R that arises when you mistakenly try to subset a function. Jenny used this error as a jumping-off point to talk about best practices for debugging your R code. First there's the hard reset ("have you tried turning it off and on again?"). Then there are debugging tools like traceback() and browser(), and the reproducible example, or reprex, which helps others help you fix your problem. Finally, after you've solved it, you can write unit tests and informative error messages to make sure you don't face the same problem again. Jenny's talk was memorable because she shared experiences that every R user could relate to.
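If you've never hit this error yourself, here is a small base R sketch of one way it can come up and of the "informative error message" idea. It relies on df being a function in the stats package, which is attached by default. The helper get_column is a hypothetical name of mine, not something from Jenny's talk.

```r
# `df` is a function in the stats package, so if you forget to create
# your own data frame called `df`, `df$x` ends up subsetting a function.
msg <- tryCatch(
  df$x,
  error = function(e) conditionMessage(e)
)
print(msg)

# A hypothetical helper that fails early with a friendlier message.
get_column <- function(data, name) {
  if (!is.data.frame(data)) {
    stop(
      "`data` must be a data frame, not an object of class ",
      class(data)[1],
      call. = FALSE
    )
  }
  data[[name]]
}

# Passing the function `df` by mistake now produces the friendlier error.
guarded <- tryCatch(
  get_column(df, "x"),
  error = function(e) conditionMessage(e)
)
print(guarded)
```

When the error does show up interactively, calling traceback() right afterwards shows the sequence of calls that led to it.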

In a similar fashion, Claus Wilke focused his attention on the pain of formatting text in ggplot2. In Spruce up your ggplot2 visualizations with formatted text, Claus expressed his frustration with the current approach to text formatting. "It shouldn't be this hard to italicize my axis labels," he thought. So he took it upon himself to make it easier and created ggtext, an R package that provides improved text rendering support for ggplot2. Using ggtext, Claus dazzled the audience (and I do mean dazzled, I heard many oohs and aahs) with plots filled with italicized text, colorful text, images as axis labels, and beautiful mathematical formulas. Claus made it all look so easy, and I couldn't wait to give it a try.
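For the curious, here is a minimal sketch of the italic axis label case. It assumes the ggplot2 and ggtext packages are installed and uses the built-in mtcars dataset. The label text is my own example rather than one from Claus's slides.

```r
library(ggplot2)

# Markdown in the labels is only rendered for theme elements that are
# switched to ggtext::element_markdown().
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(
    x = "*Weight* (1,000 lbs)",    # asterisks mark italic text
    y = "Fuel economy (**mpg**)"   # double asterisks mark bold text
  ) +
  theme(
    axis.title.x = ggtext::element_markdown(),
    axis.title.y = ggtext::element_markdown()
  )
```

Calling print(p) draws the plot. Without the theme() call, the asterisks would show up literally in the axis titles.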

A third story that stood out to me was Learning R with humorous side projects by Ryan Timpe. Ryan's talk had one clear message: the best way to learn R is to have fun with it. And of course he led by example. Ryan made a drinking game based on The Golden Girls using the tidytext package, a fake dinosaur name generator with keras, and he even wrote a package to build 3D LEGO mosaics with R called brickr. The package was so successful that LEGO reached out to Ryan and offered him a role as a Senior Data Scientist. Now that's a happy ending!

5. R's best feature is its community

The R community is simply outstanding.

It's one of the features that really sets R apart from other programming languages like Python, whose community I have personally found to be less welcoming.

Here's what other folks are saying:

  • Will Chase was pleased to find that the community was as friendly as he had heard it would be.
  • Kasia Kulma appreciated that rstudio::conf prioritized inclusivity and tolerance.
  • Kieran Healy nailed the "vibe" of the conference.

And that's that

If you were also at rstudio::conf 2020, what were your takeaways? If not, but you made some time to watch the talks online, is there a particular topic that resonated with you? I want to hear about it, so let me know on Twitter.

See you at rstudio::conf 2021 in Orlando, Florida!