By Jenny Seifert
Julie Lowndes likens “open data science” to the Force (yes, as in Star Wars), a penetrating energy that empowers scientists to wield their data more quickly and efficiently than they ever could before. A hybridized term that Lowndes quite possibly coined, open data science is both the movement and the art of turning raw data into understanding (data science) through collaboration and open principles (open science). With a contagious passion, Lowndes is at the forefront of spreading its good word to environmental scientists.
Lowndes recently launched Openscapes, a mentorship program aimed at awakening the open data science Force in early-career environmental scientists. The program is an answer to the deficit of data science training in many graduate programs, a gap that then haunts early-career scientists, particularly as they take on leadership roles and all the accompanying pressures.
“All ecologists work with data, but we all find out the hard way that we have no idea how,” said Lowndes. “With Openscapes, I want to welcome more people to engage with existing open data science tools and communities, no matter where they are starting from.”
Lowndes understands the pain firsthand. Originally a field biologist herself, Lowndes’ own awakening began while on a research ship off the coast of Mexico, where she was collecting data for her dissertation on Humboldt squid, a dataset that would eventually grow so huge that Excel would refuse to open it.
Fortuitously, on the ship with her was a scientist who would become one of her own mentors, Steven Haddock of the Monterey Bay Aquarium Research Institute. He introduced her to the wonders of code and what we now call data science, and the lessons she learned would help her liberate her data and ultimately shift her path towards Openscapes.
Her next big hurdle-turned-aha-moment occurred after she joined NCEAS as a scientist on the Ocean Health Index, a collaborative data synthesis project of global proportions. In seeking a better way to work than fumbling with unruly files with names like “data_analysis_final_v2b.xls,” Lowndes and her teammates discovered the “power of welcome” in the open science community, with its ethos of working and learning together and online. (Read more about their path to better science in less time.)
“I have had incredible mentors who have given me guidance and confidence to learn software tools that have been game-changing for my science, and for my life really. And I love doing the same for others,” said Lowndes.
Through Openscapes, Lowndes is paying it forward. Hatched thanks to her fellowship with Mozilla, the program provides mentees, called Champions, a safe space to share their data challenges as they learn the ever-expanding landscape of open data science, and then go forth and champion the Force within their labs and beyond.
Openscapes is now a cornerstone of NCEAS’ environmental data science Learning Hub. If this month’s NCEAS Portrait leaves you as excited about it as we are, we encourage you to get involved as a community member or as a potential future Champion.
Why do you think the culture of science needs to be more open?
JL: Coming from an environmental angle, where we have climate change, pollution, and politics working against us, we can’t afford to lose time by reinventing the wheels of data analysis. We can learn so much from each other, build off of each other, and do so much more together by working openly. And time is of the essence.
Plus, in my experience, an open culture is a more positive culture: expectations are clear, including appropriate behavior and the expectation and channels to address issues head-on. Openness can ignite knowledge sharing and innovation, and it puts people first. And we need that.
In what ways does Openscapes answer this need for openness?
JL: We start off by thinking about openness within the lab. Traditionally, we tend to think about scientists by what makes them unique – the vertical categories of studying organisms, locations, techniques, or disciplines. With Openscapes we focus on what unites these scientists: the horizontal attributes and needs that come from working with data, including organizing it, collaborating around it, establishing workflows, learning new techniques, and communicating about it internally and externally. Openscapes creates a space to learn new options and talk about these issues. As soon as we start talking about them, we find similarities and shared challenges, and we can brainstorm solutions and share lessons learned. It’s so empowering.
Why Openscapes, and why now?
JL: Since my generation of scientists first started developing our homegrown ways back in graduate school, the whole landscape of analytical and collaborative tools has changed dramatically. There are incredible software tools and communities dedicated to welcoming scientists to engage – including R, RStudio, rOpenSci, RLadies, Mozilla, and the Carpentries – but not everyone is aware of them or feels that they could be a part of their futures. I want to change that.
I think it’s critical now because there is incredible potential for data science and machine learning to accelerate scientific discovery, but so many scientists will be excluded from this if they lack the prerequisite skillsets and mindsets that would enable them to engage. We need to reduce friction in our analyses, as well as in our day-to-day interactions. We are all overworked and trying to navigate endless amounts of information in both our personal and professional lives. How can we streamline our interactions with data and each other? I think the same tools that streamline analyses and project management can also streamline communication both online and in the real world.
For example, something unique about Openscapes is that Champions learn together with their lab members. This accelerates what would otherwise be a slow process if the head of the lab had to become the expert on top of their other obligations.
What was your ultimate “aha moment” that turned into Openscapes?
JL: I have been kicking around the idea for Openscapes for years – I have so many lists of failed name ideas, as I tried to imagine its shape. The outcome I wanted has been clear; I just did not know what I was going to do to make it happen. My fellowship with Mozilla has given me the time and mentorship to figure it all out.
The “aha moment” came when I took part in the Mozilla Open Leaders program. Mozilla has created an online mentorship program that gives both personal one-on-one attention and time for group discussion, bonding, and peer-learning. It grows community by including people as mentees, mentors, and experts, and teaches concepts that help everyone improve their practices incrementally, no matter what their project is or where they are in their process.
And I thought: this is what I need for Openscapes. Mentorship has been such a critical thread in my experience as a scientist, and I could create a mentorship program for environmental scientists to engage and be empowered by open data science.
In five years, what do you hope the culture in science will look like, and what impact do you hope Openscapes will have had on that future?
JL: In five years, demand and value for these skillsets will have increased, which will ultimately fuel academia to meet that demand with formal education and well-paid academic jobs in open data science, which will provide more stability for not just individuals but also teams and labs. Scientific culture will improve as open practices – including collaboration, kindness, and empathy – become the norm. And all of this will allow research to unveil environmental solutions faster.
I think Openscapes can play a role in this future. In our 2017 paper we said the biggest barriers to engagement were a lack of exposure to relevant open data science tools and a lack of confidence to use them. I would now add a lack of support within the scientific culture. Openscapes is designed to tackle these three head-on: by making scientists aware of and excited about open data science, by empowering them with existing open tools and communities, and by amplifying their successes and challenges to create more champions and build community.
What’s your favorite R package and why?
JL: Oh man, this is an unbelievably difficult question. Cornered, I would say RMarkdown. It is transformative for reproducible research, full stop. It is absolutely game-changing for communication, publishing, and open science.
For example, you can write a sentence saying “Our results show that [calculated_value] percent of species do this thing,” and that percent value will be calculated automagically with the click of a button. It means that, if you get more data at the eleventh hour, you don’t need to go back and pick through your text to change values and re-create, re-save, and re-paste figures.
This 1-minute video about RMarkdown shows the amazing possibilities. I use RMarkdown every day, even though I don’t analyze data every day. I use it to write and publish free online books, like the Openscapes lesson series, and websites, like openscapes.org.
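Editor's note: the idea Lowndes describes, a sentence whose number is recomputed from the data each time the document is rebuilt, can be sketched in a few lines. The example below is only an analogy written in Python, not RMarkdown itself and not her workflow; the species names, the observations, and the helper functions `percent_true` and `render_sentence` are all hypothetical and exist purely for illustration.

```python
from string import Template

# Hypothetical data: True means the species "does this thing".
observations = {
    "species_a": True,
    "species_b": False,
    "species_c": True,
    "species_d": True,
}


def percent_true(flags):
    """Return the percentage of True values, rounded to one decimal place.

    Assumes at least one observation; an empty input would divide by zero.
    """
    flags = list(flags)
    return round(100 * sum(flags) / len(flags), 1)


def render_sentence(obs):
    """Fill the computed percentage into the sentence template."""
    template = Template(
        "Our results show that $value percent of species do this thing."
    )
    return template.substitute(value=percent_true(obs.values()))


# The sentence is rebuilt from whatever data exists at the time.
print(render_sentence(observations))

# If more data arrives at the eleventh hour, only the data changes;
# the text is regenerated rather than edited by hand.
observations["species_e"] = False
print(render_sentence(observations))
```

The point of the sketch is that the number never lives in the prose: the text holds a placeholder, and the value is derived from the data on every run.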
NCEAS Portraits feature the people behind our work and impact.