NIH DMSP Mad Libs Contributions
Help the FH DaSL Create a Collection of Mad Libs Data Submission Forms! Please submit example sentences in as generic a form as possible for specific technologies you are aware of so we can share these examples with the research community. 

Element 1: A. Data Type (raw data)

Please submit sentences in the form: 

We will collect data using [insert technology description]. File Type: Data for this study will generate [insert raw data file description]. The amount of data generated per sample is [insert average file size]. Number of files: We anticipate collecting data from [insert total number of samples/files to be collected] for a total data volume of [multiply file size and total number of files].
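The total data volume in the last blank is the average file size multiplied by the number of files. A minimal sketch of that arithmetic follows; the function names and the example figures (2.5 GB per sample, 400 samples) are hypothetical and only illustrate the calculation.

```python
def total_data_volume_gb(avg_file_size_gb: float, n_files: int) -> float:
    """Return total volume in GB: average file size times number of files."""
    if avg_file_size_gb < 0 or n_files < 0:
        raise ValueError("file size and file count must be non-negative")
    return avg_file_size_gb * n_files


def format_volume(volume_gb: float) -> str:
    """Render a volume in GB, switching to TB at 1000 GB (decimal units)."""
    if volume_gb >= 1000:
        return f"{volume_gb / 1000:.1f} TB"
    return f"{volume_gb:.1f} GB"


# Hypothetical example values, not taken from any real study.
volume = total_data_volume_gb(avg_file_size_gb=2.5, n_files=400)
print(format_volume(volume))
```

The formatted string can be pasted directly into the "total data volume" blank of the sentence above.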


Element 1: A. Data Type (processed data)

Please submit sentences in the form: 

We process the [insert technology type corresponding to raw data type above] using [insert brief pipeline description here]. File Type: The data processing will result in [insert processed data file description]. The amount of data generated per sample is [insert average file size]. Number of files: We anticipate generating [insert total number of samples/files to be collected] for a total data volume of [multiply file size and total number of files].



Element 1: B. Scientific data that will be preserved and shared, and the rationale for doing so:

Please submit sentences in the form: 

Raw data: [insert technologies from list above to be shared] will be shared to facilitate re-analysis and re-use of the data by other investigators.

Processed data: [insert processed data to be shared] will be shared to facilitate re-analysis and re-use of the data by other investigators.


Element 1: C. Metadata, other relevant data, and associated documentation:

Please submit sentences in the form: 

Metadata on [insert metadata descriptors] will be collected via [insert process for collecting metadata here] and will be submitted in accordance with FAIR data principles according to [if one exists, insert the FAIR standard for the data type from https://www.nature.com/articles/sdata201618].

Metadata on [insert metadata descriptors] will be collected via [insert process for collecting metadata here] and will be released in accordance with FAIR data principles in the form of a spreadsheet with consistent sample labels, dates in ISO 8601 format (YYYY-MM-DD), no empty cells, and one data item per cell, organized as a single rectangle (with subjects as rows, variables as columns, and a single header row), with a corresponding data dictionary. Metadata will be released in raw form, without calculations on the raw data files and without font color or highlighting used as data, with human- and machine-readable variable names and links to raw data URLs for [insert raw data type], saved as a plain text file and uploaded to [insert location where metadata will be deposited]. For more information on data formatting, see https://www.tandfonline.com/doi/full/10.1080/00031305.2017.1375989.
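Some of the formatting rules in the sentence above can be checked automatically before deposit. The sketch below, which uses only the Python standard library, tests three of them on a comma-separated plain text file: a rectangular layout, no empty cells, and YYYY-MM-DD dates. The function name and the column names (sample_id, collection_date, tissue) are hypothetical, and the check is illustrative rather than a complete validator.

```python
import csv
import io
import re
from datetime import date

ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def check_metadata(text: str, date_columns: list[str]) -> list[str]:
    """Return a list of formatting problems found in comma-separated metadata."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]
    header = rows[0]
    problems = []
    for line_no, row in enumerate(rows[1:], start=2):
        # Rectangle check: every row has as many cells as the header.
        if len(row) != len(header):
            problems.append(f"line {line_no}: expected {len(header)} cells, found {len(row)}")
            continue
        for name, cell in zip(header, row):
            if cell.strip() == "":
                problems.append(f"line {line_no}: empty cell in '{name}'")
            elif name in date_columns:
                # Require the literal YYYY-MM-DD shape, then a real calendar date.
                valid = bool(ISO_DATE.fullmatch(cell))
                if valid:
                    try:
                        date.fromisoformat(cell)
                    except ValueError:
                        valid = False
                if not valid:
                    problems.append(f"line {line_no}: '{cell}' is not a YYYY-MM-DD date")
    return problems


# Hypothetical metadata with one bad date and one empty cell.
example = (
    "sample_id,collection_date,tissue\n"
    "S001,01/05/2024,liver\n"
    "S002,2024-02-10,\n"
)
for problem in check_metadata(example, date_columns=["collection_date"]):
    print(problem)
```

An empty result list means the file passed these three checks; the remaining rules (consistent labels, a data dictionary, no formatting used as data) still need manual review.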


Element 2: Related Tools, Software and/or Code:

Please submit sentences of the form

All software developed as a part of this proposal will be developed in the open on [pick one of GitHub/GitLab/Bitbucket] and released as a collection of open source [pick one or more of R scripts/R packages/Python libraries/Jupyter Notebooks/WDL workflows/Galaxy workflows/Nextflow workflows] with an MIT license. Software packages will be formally released via [pick one or more of Bioconductor/CRAN/conda/PyPI/Galaxy Workbench/AnVIL].

All software developed as a part of this proposal will be deposited after publication on [pick one of GitHub/GitLab/Bitbucket] and released as a collection of open source [pick one or more of R scripts/R packages/Python libraries/Jupyter Notebooks/WDL workflows/Galaxy workflows/Nextflow workflows] with an MIT license. Software packages will be formally released via [pick one or more of Bioconductor/CRAN/conda/PyPI/Galaxy Workbench/AnVIL].

 

Software used in this proposal is proprietary, and there is no mechanism for sharing it publicly. We will provide step-by-step analysis instructions, including complete descriptions of all software versions, parameter settings, intermediate analysis steps, and intermediate calculations. Where possible, we will include screenshots of analysis steps to support reproducibility of results.



Element 3: Standards

Please submit sentences of the form 


Element 4: Data Preservation, Access, and Associated Timelines

A. Repository where scientific data and metadata will be archived:

Please submit sentences of the form 

Primary repositories for raw sequence data will be the Gene Expression Omnibus (GEO) for data that can be made publicly available and dbGaP for data that require access controls. Both repositories are backed by the Sequence Read Archive (SRA) for storage of raw sequence data, typically in FASTQ format, although uBAM can also be submitted. The SRA is managed and supported by the NCBI. Data will be deposited within [x months] of publication and stored for [y years] in accordance with generally accepted data storage policies.




B. How scientific data will be findable and identifiable:

Please submit sentences of the form 

GEO, dbGaP, and SRA all provide stable IDs to various levels (Project accession, SRA read accession, sequencing platform, etc.). Primary references would be to a GEO series accession (e.g. GSE198265), dbGaP study accession (e.g. phs001805.v1.p1 to pick a random yet compelling example), or Sequence Read Archive run accession (e.g. SRR18284544). 




C. When and how long the scientific data will be made available:

Please submit sentences of the form 

Sequence-level data will be deposited in dbGaP within 3 months of data generation and preserved for the duration of the grant funding.

Processed gene-level summaries will be deposited within 3 months of data generation and preserved for the duration of the grant funding.

Sequence-level data will be deposited in dbGaP at the time of publication and preserved for the duration of the grant funding.

Sequence-level data will be deposited in dbGaP at the time of publication and preserved according to SRA preservation standards.




Element 5: Access, Distribution, or Reuse Considerations

A. Factors affecting subsequent access, distribution, or reuse of scientific data:

Please submit sentences of the form 

Raw data: [insert technologies from list above] will not be shared because they are Level 0 data (https://sharing.nih.gov/genomic-data-sharing-policy/submitting-genomic-data/data-submission-and-release-expectations) and can only be opened in a limited number of non-open source software programs.

Raw data: [insert technologies from list above] will not be shared because they are Level 0 data (https://sharing.nih.gov/genomic-data-sharing-policy/submitting-genomic-data/data-submission-and-release-expectations) and can only be easily opened in proprietary, licensed viewing software.


Raw data: [insert technologies from list above] will not be shared because they are Level 0 data (https://sharing.nih.gov/genomic-data-sharing-policy/submitting-genomic-data/data-submission-and-release-expectations), are very large, and are only needed for advanced data processing or to reconstruct the scientifically accepted raw data type [insert raw data type here].


Raw data: [insert technologies from list above] will not be shared because they are Level 1 data (https://sharing.nih.gov/genomic-data-sharing-policy/submitting-genomic-data/data-submission-and-release-expectations), are very large, and are only needed for advanced data processing or to reconstruct the scientifically accepted raw data type [insert raw data type here].


Raw data/processed data: [insert technologies from list above] will not be shared because the IRB for this protocol does not include consent for public data sharing. 


Raw data/processed data: [insert technologies from list above] are not suitable to be shared in identified form due to IRB restrictions. However, the data will be shared in de-identified form, with randomly generated participant or sample IDs applied.
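One way to produce the randomly generated participant or sample IDs mentioned above is sketched below, using Python's standard secrets module. The function name, the "P" prefix, and the ID length are hypothetical choices. The mapping itself is identifying information and would need to be stored under the same access restrictions as the original data; replacing IDs alone does not de-identify the other variables in a dataset.

```python
import secrets


def assign_random_ids(original_ids, prefix="P", n_bytes=4):
    """Map each distinct original ID to a unique random ID.

    The random IDs carry no information about the originals; repeated
    original IDs reuse the same random ID so records stay linked.
    """
    mapping = {}
    used = set()
    for original in original_ids:
        if original in mapping:
            continue
        new_id = prefix + secrets.token_hex(n_bytes)
        while new_id in used:  # regenerate on the rare collision
            new_id = prefix + secrets.token_hex(n_bytes)
        used.add(new_id)
        mapping[original] = new_id
    return mapping


# Hypothetical identifiers; the second "MRN-1001" maps to the same random ID.
id_map = assign_random_ids(["MRN-1001", "MRN-1002", "MRN-1001", "MRN-1003"])
for original, random_id in id_map.items():
    print(original, "->", random_id)
```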


Raw data/processed data: [insert technologies from list above] are not suitable to be shared due to sovereignty restrictions related to individuals from the population sampled.  





Element 5: Access, Distribution, or Reuse Considerations

B. Whether access to scientific data will be controlled:

Please submit sentences of the form 

All requests for the raw genomic and limited phenotype data (as described above) that are stored in dbGaP will be submitted to and processed by the NIH-designated data repository under its "controlled access" process.


C. Protections for privacy, rights, and confidentiality of human research participants:

Please submit sentences of the form 

In order to achieve our goal of data sharing with the research community while not violating assurances and rights of study participants, we will create a dataset for sharing that (1) excludes participants whose consent forms specifically state that their data will not be shared outside of the study team and (2) incorporates standard blurring or masking techniques for demographic, phenotypic, and descriptive variables so as to reduce risks of identifiability and/or confidentiality violation. 

Element 6: Oversight of Data Management and Sharing:

Please submit sentences of the form 

Data management will be overseen by the laboratory manager and will be executed by the Bioinformatics Shared Resource at the Fred Hutchinson Cancer Center. 

Data management will be executed through the use of the Data Portal by computational biology staff in our lab. 

Data management will be executed through collaboration with the [insert computational/biostatistical lab]. 

 
