Open Science for Synthesis: Gulf Research Program

July 10 - July 28, 2017
NCEAS, Santa Barbara, CA

Applications are now CLOSED. We will make announcements in March.
 

Training in Open Science Enables Synthetic Science within the Gulf Research Program

Open Science for Synthesis: Gulf Research Program is a hands-on data science course for both early career and established researchers to gain skills in data science, including scientific synthesis, reproducible science, and data management. These skills are critical for understanding the complex environmental, human, and energy systems in the Gulf of Mexico, especially following large disturbance events like the Deepwater Horizon oil spill in 2010. This 3-week intensive training, convening in July 2017 at the National Center for Ecological Analysis and Synthesis (NCEAS) in Santa Barbara, CA, will revolve around scientific computing and scientific software for reproducible science.
 
The course will focus on data management, scientific programming, synthetic analysis, and collaboration techniques, all taught through open-source, community-supported tools. Participants will learn skills for rapid and robust use of open-source scientific software. These approaches will be explored and applied to scientific synthesis projects related to the Gulf of Mexico’s human, environmental, and energy systems, and will increase community capability and efficiency in synthesis research. Funding for this project is provided by a grant from the Gulf Research Program, which is dedicated to improving understanding of the Gulf of Mexico’s human, environmental, and energy systems in response to the Deepwater Horizon oil spill.

Topics

This is a hands-on course that combines skill building with group synthesis projects that cement the skills in real-world practice. The course will weave together several core themes, which are reinforced and integrated into the synthetic scientific research process through daily work on group synthesis projects relevant to the Gulf Research Program. Core training themes will address:
  • Collaboration modes and technologies, virtual collaboration
  • Data management, preservation, and sharing
  • Data manipulation, integration, and exploration
  • Scientific workflows and reproducible research
  • Programming using agile and sustainable software practices
  • Data analysis and modeling
  • Communicating results to broad communities
Throughout the course participants will receive a solid foundation in computing fundamentals for doing synthetic research in today’s computational- and data-intensive era. This includes:
  • Instruction on languages like R and Python for data manipulation, analysis, and visualization
  • Analytical techniques for synthesis research, including meta-analysis and systematic reviews
  • Survey of general programming constructs, paradigms, and best practices
  • Exposure to the Linux/UNIX command line environment and useful tools
  • Discussion of cyberinfrastructure trends supporting open, networked, reproducible science

Group Synthesis Projects

Participants will form small synthesis teams that focus on utilizing the software skills they learn each day in the context of cross-cutting science research projects. Using an open community engagement process, participants will maximize their success in collaborative research that could potentially lead to publishable results. Project topics will focus on themes relevant to the Gulf Research Program, but will be participant-selected and executed with consultation from instructors.

Eligibility

Both early career researchers (upper-level graduate students) and established researchers from the Gulf research community are encouraged to apply. Participants will be selected on the basis of their current research or work activities; their previous experience with open science practices, data management techniques, and analysis methods; and their current or former opportunities to access training in these areas. International applicants are eligible; however, travel reimbursement will be restricted to that indicated below.

Travel and Accommodation Support

Participants will receive support to cover the cost of economy round-trip airfare within the contiguous United States. Course participants will also be provided with accommodation in Santa Barbara for the duration of the course.

How to Apply

Applications are now CLOSED.

 

Instructors

Matthew B. Jones (PI) is the Director of Informatics Research and Development at NCEAS and has expertise in environmental informatics, particularly software for management, integration, analysis, and modeling of data. Over a decade, Jones has taught at more than 20 training workshops on data science topics including analysis in R, GitHub, programming (e.g., Python), data management, quality assessment and reporting, metadata and data infrastructure, and scientific workflow systems.

Amber Budden (co-PI) is the Director for Community Engagement and Outreach at DataONE. She holds a PhD in Ecology in addition to research experience in bibliometrics. She has coordinated and taught numerous workshops focused on data management for Earth and environmental science. Her skills include data management, science communication and outreach, and training evaluation.

Tracy Teal is a co-founder and the Executive Director of Data Carpentry. As an Assistant Professor at Michigan State University in bioinformatics, she saw that effective data skills have become foundational for research and that data training needs to scale along with data production. She is involved in the open source software and reproducible research communities, including as an Editor at the Journal of Open Source Software.

Mark Schildhauer is Director of Computing at NCEAS. His research interests include informatics, the semantic web, and scientific workflows, with a focus on environmental science. Schildhauer and colleagues developed the extensible observation ontology, OBOE, and a semantic annotation architecture that improves data discovery and re-use. He helped develop Ecological Metadata Language, is a co-founder of the Kepler scientific workflow project, and led the SEEK Knowledge Representation group.

Bryce Mecum is a scientific software engineer with expertise in data analysis, programming, and data management systems, including tools like R, GitHub, repository software, Python, and UNIX. He has a background in fisheries modeling and management, and builds software systems supporting environmental synthesis.

Chris Lortie is an integrative scientist with expertise in community theory, sociology, and quantitative methods, specifically systematic reviews, meta-analysis, experimental design, R, and statistical analyses. Collaboration and networks are central to his research both conceptually and internationally. As such, his empirical research involves biogeographical comparisons of many forms of community dynamics (plants, animals, and people). He is a Professor at York University in Canada and a Research Associate at NCEAS. He is the Editor in Chief for Oikos for all formal synthesis papers, and an editor for PLOS ONE, PeerJ, GigaScience, and Nature Scientific Data.

Julien Brun is a scientific programmer at NCEAS with expertise in data analysis and programming, data management systems, GIS, and analytical modeling. He has worked extensively in systems like R, GitHub, Python, and UNIX. His scientific background is in Ecohydrology and Earth observation techniques (remote sensing and GIS).