NCEAS is seeking applications for the 2020 Data Science Fellows program. Our next session will begin as early as November 2019. This practicum-style program gives fellows the opportunity to gain the practical knowledge and skills needed to manage national-scale data repositories.
Fellows will gain experience and mentorship in activities directly related to the research project they undertake. Example research and development opportunities are listed below; however, we also encourage the development of new projects and collaborations within this fellowship.
Example project opportunities
Python data and metadata processing library
We have a comprehensive suite of R packages that allow us to process data and metadata. We can make these tools accessible to a broader community of scientists by converting useful functions from our current R packages (arcticdatautils, datamgmt, metajam, recordr) to Python. This library, or libraries, should be built using the pre-existing DataONE Python library (https://github.com/DataONEorg/d1_python) as a dependency. This project will require familiarity with the DataONE infrastructure and the current packages before advancing into the development phase for the Python package.
Quality Assessment of Arctic Data Center metadata and data
Harmonization of data publishing and documentation R packages
We currently have several R packages that have grown organically over several rounds of development. The goal of this project is to consolidate these packages so that our users can interact with the DataONE API in a coherent way. We will use a hierarchical approach, with low-level packages designed for advanced users and public-facing packages for high-level interactions. The packages within the scope of the project are: R dataone, datapack (helpers, public-facing), arcticdatautils (helpers, non-public-facing), datamgmt (helpers, non-public-facing), recordr (prov), and metajam (download helper, public-facing). There will also be a need to organize interactions with external packages such as EML, Assembly line from EDI, rdryad and zenodo. We need to determine how to take the best parts of each of these, remove redundancy, and put them into a coherent set of packages on CRAN.
Education modules for undergraduate instruction
Data citation and reuse
Data Science support of LTER synthesis working groups (data harmonization and analysis)
The department is especially interested in candidates who can contribute to the diversity and excellence of the academic community through research, teaching and service.
The University of California is an Equal Opportunity/Affirmative Action Employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability status, protected veteran status, or any other characteristic protected by law.