This module focuses on the practice and application of open science for data. It provides a 'how to' process for finding and assessing open data for use, for making open data, and for sharing open data. The step-by-step flows are easy to follow and can be used as checklists after you complete the module. Key topics include data management plans, the process for assessing data for reuse, creating a plan for making data (including choosing open formats and adding documentation), and the considerations for sharing data and making your data citable.
After completing this module, you should be able to:
- Describe the meaning and purpose of open data, its benefits, and how FAIR principles are used.
- Recall methods to assess the reusability of data based on its documentation, and cite the data as instructed.
- Implement an open data management plan, select open data formats, and add the needed documentation (including metadata, README files, and version control) to make the data reusable and findable.
- Evaluate whether your data should and can be shared.
- Recall practices to make data more accessible, including the registration of an affiliated DOI and the inclusion of citation instructions in documentation.
This lesson defines open data, its benefits, and the practices that enable data to be open. In addition, the lesson takes a closer look at how FAIR applies to open data and at the critical role of metadata. It wraps up with a brief discussion on how to plan for open data in the scientific workflow and on tasks guided by the use, make, share framework.
In this lesson, you learn how to discover, assess, and cite an open dataset. You start by exploring repositories and learning about the issues and considerations for searching for datasets. You then learn how to determine if a dataset is suitable for your use by learning what to review in its documentation, licenses, and file formats. The lesson wraps up with a discussion about the importance of citing datasets and how to read and follow citation instructions.
In this lesson, you learn the criteria and tasks needed to ensure that the datasets you make are open and reusable. The lesson starts with a discussion on creating a data management plan and then continues with topics on selecting open data formats and including metadata, README files, and version control for your data. It wraps up with a discussion on open licenses for data.
In this lesson, you learn about the practice of sharing your data. The discussion starts with a review of the sharing process and how to evaluate if your data are sharable. Next, you look at ensuring your data are accessible, with a closer look at repositories and the lifecycle of data accessibility, from selecting a repository to maintaining and archiving your data. The lesson then discusses some steps to make the data as reusable as possible, and concludes with a section about considering who will help with the data sharing process.
In this lesson, you will get some practice writing a data management plan. You will then learn how you can get involved in open data communities. You will also learn about resources you can start to use and training you can take to start your journey with open data.
In addition to the TOPS module training, the community resources below are excellent information sources about Open Data.
- NASA's Transform to Open Science GitHub landing page for TOPS
- NASA SMD's Open-Source Science Guidance
- Repository of Federal Enterprise Data Resources
- The Open Data Handbook
- GODAN MOOC on Open Data Management in Agriculture and Nutrition
- Carpentries Data Management for Scientific Computing
- Great open data in Africa farming MOOC
- FAIRsharing.org; The FAIR Principles
- Reproducible Research and Data Analysis. FOSTER
- Some guides on publishing data are available here and here
- An example of tagging and sharing data from Harvard University, DataTags