Discovering existing Earth and environmental science data that can be applied to answering new questions or testing new hypotheses is ever more important in the era of “big data.” The Data Observation Network for Earth (DataONE) is building a cyberinfrastructure that facilitates data discovery and access, and is working to foster a culture of data sharing and sound data management. As part of my work with DataONE’s Community Engagement and Education working group, I have contributed to the development of a data management education curriculum. A formal evaluation of the curriculum, as implemented in a 2-day data management workshop, highlighted the need for more ‘real-life’ stories in the education materials and inspired the launch of the Data Stories project. For this project, Jessica Bragg and I are conducting interviews with researchers and data managers to collect success stories and cautionary tales related to data management and sharing. As the project progresses, we will integrate these stories into our curriculum and publish them online as a resource for the data management education community.
I am also collaborating with members of the DataONE Data Integration and Semantics working group and the SONet project to enhance environmental and Earth science ontologies (formally described sets of concepts and the relationships between them) and apply them in the context of DataONE query tools. By integrating these formal ontologies and taxonomies with statistical representations derived from the keywords and descriptions researchers use to describe their data, we will make searching for data easier by suggesting query refinements through an interactive interface. The image shown here is based on the outputs of a statistical topic model (Latent Dirichlet Allocation) and illustrates the thematic breadth of data currently available for discovery through DataONE.
Learn more about Stacy Rebich Hespanha