This curated list of resources is meant to welcome environmental scientists who are interested in engaging in open data science. This R-focused list is not meant to cover all the incredible resources available; it represents a selection of material immediately useful to scientists we have worked with. It is an evolving effort, and Openscapes resources are also listed below.
WORKING COLLABORATIVELY, OPENLY, & EFFICIENTLY
Thinking deliberately about data and embracing existing open tools and workflows for science increases efficiency, broadens the scientific questions that can be asked, introduces a network of online allies outside scientific domains, and makes it easier to ask for help.
For example, the Ocean Health Index describes how open software and practices borrowed from software engineers were game-changing for collaborative coding and communication.
- Our path to better science in less time using open data science tools - Lowndes et al. 2017 Nature Ecology & Evolution
- Data-intensive ecological research is catalyzed by open science and team science - Cheruvelil & Soranno 2018 BioScience
- Skills and knowledge for data-intensive environmental research - Hampton et al. 2017 BioScience
- Creating and maintaining high‐performing collaborative research teams: the importance of diversity and interpersonal skills - Cheruvelil et al. 2014 Frontiers in Ecology & the Environment
- PREReview Outbreak
JOINING & BUILDING LEARNING COMMUNITIES
Learning to code should be fun: these are powerful, empowering tools that will be game-changing for your science. So learn with friends, and grow your community.
R WITH RSTUDIO & GITHUB
We recommend using R with RStudio, which not only provides a supportive infrastructure but also enables direct connection to GitHub.com (i.e. without additional software or use of the command line). Further, working with RStudio’s tidyverse & friends and with RMarkdown builds good practices for collaborative, reproducible workflows.
- So you want to learn R?
- Resources for R and data science
- The importance of open data science tools in science: a list of references
Teaching official university courses is a great idea! These are a few examples of course materials that you can reuse as-is or adapt for your own context; many are focused on environmental science. From our experience, we don’t recommend mixing and matching between resources, especially if this is your first time teaching. Each one has its own logic and cadence, and trying to combine them can result in reinventing more wheels than intended. Instead, we recommend finding the one that best suits you (based on length, content, or audience) and potentially adapting it to your needs.
- STAT 545: Data wrangling, exploration, and analysis with R by Dr. Jenny Bryan
- Introduction and advanced environmental data analysis & stats in R by Dr. Allison Horst
- Reproducible quantitative methods by Dr. Christie Bahlai
- Data carpentry for biologists by Dr. Ethan White & Dr. Zachary Brym
- Reproducibility for Data Science by Dr. Ben Marwick
OPENSCAPES RESOURCES, MEDIA, & PRESENTATIONS
We have focused on openly communicating the process behind our development of Openscapes. Here are presentations we have given about Openscapes, and materials for the Openscapes Champions series. They are licensed under a Creative Commons Attribution 4.0 International License; you are welcome to reuse and remix with attribution.
- Supercharge your research: a ten-week plan for open data science — Nature Career Column co-authored by Openscapes Champions
- Media & Presentations — media and slides about Openscapes for various audiences
- Champions Lesson Series — lesson book for the Champions program, with links to slides and course agendas