ADVANCED METHODS OF DATA SCIENCE IN BIOMEDICAL RESEARCH
Webinar 1 – New NIH Requirements for Scientific Data Management and Sharing
This webinar was livestreamed on Wednesday, January 18, 2023. A recording is available below, followed by additional responses to questions.
There were many excellent questions posed during the live question and answer sessions. Some questions submitted were not answered due to time constraints. Additional answers to some of those questions are below. The responses by NIH were submitted by the NIH Office of Science Policy.
- Do you expect further changes in NIH data management policies and in what directions?
Our hope is that there will be greater specificity about which data management and sharing practices are considered best practice for NIH-funded studies. Peter Elkin
NIH: At this time, we are not planning changes to the NIH DMS Policy. However, we will continue to provide guidance or resources for implementation as necessary, such as additional FAQs, and will continue to monitor evaluation of the Policy. We will also continue to evaluate the role of the NIH Genomic Data Sharing Policy, especially in light of responses received to NOT-OD-22-029.
- As data becomes reusable in further studies, what quality control do you recommend?
Use strong data definitions with pointers to metadata that describe data governance and data provenance. Links to standardized ontologies should be provided when available. Peter Elkin
NIH: The DMS Policy does not require a particular approach to quality control. NIH recommends use of repositories that have quality control processes. Funding Opportunity Announcements may in some cases provide further specificity on expectations for particular awards.
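As one illustration of the data definitions suggested above, the following is a minimal sketch of a data dictionary entry that carries an ontology link plus governance and provenance notes. The class name, field names, and the example values are all hypothetical and are not prescribed by the DMS Policy or by any repository; the LOINC code shown (8480-6, systolic blood pressure) is used only as a sample ontology link.

```python
from dataclasses import dataclass, asdict


@dataclass
class VariableDefinition:
    """One data dictionary entry (hypothetical structure, for illustration only)."""
    name: str
    description: str
    unit: str
    ontology_uri: str  # link to a standardized term, when one is available
    provenance: str    # where the values came from and how they were derived
    governance: str    # who controls access and under what terms


def missing_fields(definition: VariableDefinition) -> list[str]:
    """Return the names of fields that were left empty."""
    return [key for key, value in asdict(definition).items() if not value]


# Example entry; every value below is made up for the sketch.
sbp = VariableDefinition(
    name="sbp_mmhg",
    description="Systolic blood pressure, seated, mean of two readings",
    unit="mm[Hg]",
    ontology_uri="https://loinc.org/8480-6",
    provenance="Clinic visit measurements, averaged by a site script",
    governance="Controlled access; requests reviewed by the study data committee",
)

incomplete = VariableDefinition(
    name="hr_bpm",
    description="Resting heart rate",
    unit="/min",
    ontology_uri="",
    provenance="",
    governance="Controlled access",
)

print(missing_fields(sbp))
print(missing_fields(incomplete))
```

A completeness check like `missing_fields` is only a basic quality-control step; it does not replace the curation processes that a repository may apply.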
- How is NIH planning to support the additional costs that research teams incur in making data rapidly publicly available (mostly personnel costs)?
You can put data management costs as direct costs in your grant applications. Peter Elkin
NIH: NIH permits applicants to request reasonable costs of data management and sharing in their budget requests. See: https://sharing.nih.gov/data-management-and-sharing-policy/planning-and-budgeting-for-data-management-and-sharing/budgeting-for-data-management-sharing
- Can you tell us about your persistent identifier timeline related to OSTP policies again, please?
This regulation takes effect in 2025. Peter Elkin
NIH: Federal agencies are required to develop a plan by December 31, 2024, that includes assigning unique digital persistent identifiers for a variety of entities and objects, among other requirements. Agencies must implement final policies by December 31, 2026. For more details, see: https://www.whitehouse.gov/wp-content/uploads/2022/08/08-2022-OSTP-Public-Access-Memo.pdf
- How much does NIH project it will cost (e.g., annually) for NIH to financially support its certified data repositories and related oversight?
I have not seen such a figure. Peter Elkin
NIH: NIH will continue to support numerous data repositories. NIH has not provided specific figures for the cost of supporting repositories, but information on NIH-supported projects, including funding, is available through NIH RePORTER: https://reporter.nih.gov/
- How do we protect the interests of young, international and under-resourced investigators who collected data but do not have the funding/skills to do analysis?
People are encouraged to collaborate with data scientists to meet the new requirements. Peter Elkin
NIH: The DMS Policy does not expect that data will be shared until they are the subject of a peer-reviewed journal publication, or by the end of the award, whichever comes first, which provides investigators the opportunity to publish first on data that they collect. Please note that most training awards, such as Ts, Fs, and some Ks, are not subject to the DMS Policy.
- Are there plans for publicly available analysis environments connected to repositories? Both to lower barriers to meaningful access and to improve security?
Many institutes and centers have set up such networks and communities. Peter Elkin
NIH: NIH already provides a number of platforms that can analyze data from repositories, such as Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability (STRIDES) Initiative, NIMH Data Archive (NDA), ImmPort, and Genomic Analysis, Visualization and Informatics Lab-space (AnVIL), and will continue to consider the needs for these in the future.
- Considering that there will be additional cost for DMSP, will the NIH increase its $500,000 direct cost policy?
Not that we are aware of. Peter Elkin
NIH: Data management and sharing costs are included as direct costs. Applicants must seek agreement to accept assignment from Institute/Center staff at least 6 weeks prior to the anticipated submission of any application when the total direct costs (excluding consortium F&A costs) are $500,000 or more in any budget period. See NIH Grants Policy Statement, Section 2.3.7.2, Acceptance for Review of Unsolicited Applications Requesting $500,000 or More in Direct Costs, for additional policy information. If any policies change, the community will be alerted.
- Which persistent identifiers would you like to see in DMPs?
UUIDs. Peter Elkin
NIH: The DMS Policy does not require the use of a particular persistent identifier. In the future, as required by the OSTP Public Access Memo, NIH will provide additional guidance about the use of persistent identifiers. Note that specific repositories may set additional expectations for the type of persistent identifier to be used.
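For readers unfamiliar with UUIDs, the sketch below shows how they can be generated with Python's standard `uuid` module. The function names and the example namespace URL are hypothetical. Note that a UUID is unique but is not resolvable on its own, and, as NIH states above, a repository may expect a different identifier type, so this is an illustration rather than a recommendation for any particular plan.

```python
import uuid


def new_dataset_id() -> str:
    """Random (version 4) UUID: unique, but carries no meaning."""
    return str(uuid.uuid4())


def derived_id(namespace_url: str, name: str) -> str:
    """Name-based (version 5) UUID: the same inputs always give the same ID."""
    namespace = uuid.uuid5(uuid.NAMESPACE_URL, namespace_url)
    return str(uuid.uuid5(namespace, name))


# The URL and file name below are placeholders, not real resources.
random_id = new_dataset_id()
stable_id = derived_id("https://example.org/my-study", "baseline_visits.csv")

print(random_id)
print(stable_id)
```

The version 5 form can be convenient when the same file must receive the same identifier each time a pipeline is rerun; the version 4 form suits objects that have no stable name.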
- Is there a specific NIH expectation regarding the choice of data repository for the results? Is there a particular threshold?
No. However, there are emerging national standards from which one can choose, for example OMOP, i2b2, and PCORnet for observational databases, and SNOMED CT, LOINC, and RxNorm for terminologies beyond ICD-10 and CPT. Peter Elkin
NIH: It isn’t clear what threshold means in this case, but NIH has provided resources for identifying an appropriate repository. For more information, see: https://sharing.nih.gov/data-management-and-sharing-policy/sharing-scientific-data
- At this point, do we know what kinds or what depth of metadata for the data sets will need to be shared?
The goal is open science, so unless sharing would violate HIPAA, data should be shared liberally. Peter Elkin
NIH: The DMS Policy does not indicate which metadata should be shared, but the recommended elements of a DMS Plan include describing the metadata that will be generated and shared with the scientific data. Note that specific repositories may set additional expectations for the type of metadata to be submitted.
- Is ORCID recommended for researchers in NIH DMPs (i.e. use of DMPTool)?
Yes. Peter Elkin
NIH: The DMS Policy does not require the use of a particular persistent identifier. However, NIH does have a requirement for ORCID IDs for individuals supported by research training, fellowship, research education, and career development awards, but most of these are not subject to the DMS Policy. For more details, see: https://grants.nih.gov/grants/guide/notice-files/NOT-OD-19-109.html
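If ORCID iDs are listed in a plan, a simple format check can catch typing errors before submission. ORCID iDs end in a check character computed with the ISO 7064 MOD 11-2 algorithm; the sketch below implements that check. The function names are hypothetical, the sample iDs are the fictitious examples used in ORCID's own documentation, and the check confirms only that an iD is well formed, not that it is registered to a particular person.

```python
import re

# Four hyphen-separated groups of four; the last character may be X.
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_char(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True when the iD has the right shape and a matching check character."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_char(compact[:15]) == compact[15]


print(is_valid_orcid("0000-0002-1825-0097"))
print(is_valid_orcid("0000-0002-1825-0098"))
```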
- Do you envision electronic lab notebooks (ELNs), also known as electronic research notebooks (ERNs), becoming more integrated into current and future DMPs?
Many have used Jupyter Notebooks for this purpose. Peter Elkin
NIH: Use of these resources is not required by the DMS Policy but may simplify management and sharing of scientific data and compliance with the Policy.
- Can you provide some guidance on the oversight question in Element 6? What are you looking for from our institutions (rather than NIH)?
NIH: The DMS Policy does not create any expectations about who will be responsible for Plan oversight at the institution.
- Young and international investigators need protection and funding so they can analyze their own data before it is exploited as secondary data by well-funded, large programs.
NIH: The DMS Policy does not expect that data will be shared until they are the subject of a peer-reviewed journal publication, or by the end of the award, whichever comes first, which provides investigators the opportunity to publish first on data that they collect. NIH allows reasonable costs of data management and sharing to be included in budget requests. Please note that most training awards, such as Ts, Fs, and some Ks, are not subject to the DMS Policy.