DMID Metadata Standards Sequencing Assay

GSCID/BRC Project and Sample Application Standard
Sequencing Assay
v1.4
Finalized by the GSCID/BRC Metadata Working Group

How to interpret the document:
BOLD: Field Name
ITALICS: Attributes of the field


1. Sample ID - Sequencing Facility
Sequencing Assay Field ID: SA1
Field Name: Sample ID - Sequencing Facility
Description: Unique identifier used by the relevant sequencing center to identify the sample submitted by the sample provider.
Privacy Risks: No
GenBank Structured Comment Synonym: SRA*
OBO Foundry Synonym: specimen identifier assigned by sequencing facility
OBO Foundry ID: http://purl.obolibrary.org/obo/OBI_0001901
OBI Mapping Comments:  
Definition (OBI): A specimen identifier which is assigned by a sequencing facility
MIxS Equivalent:  
Other Synonyms:  
Data Categories: Sample Shipment
Allowed Values: free text
Unknown/Not Applicable/Censored Allowed:  
Syntax: Alphanumeric
Example Values: V506
Data Source: GSCID
Comments: This is a candidate for the key ID that would be embedded in the GenBank record as a dbxref for linkage of BRC metadata records with GenBank sequences
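The Syntax attribute of this field can be checked mechanically. The following is a minimal sketch, assuming that "Alphanumeric" means one or more ASCII letters or digits with no spaces or punctuation; the standard does not spell this out, and the function name is_valid_sample_id is hypothetical.

```python
import re

# Assumption: "Alphanumeric" = one or more ASCII letters/digits, nothing else.
_SAMPLE_ID_PATTERN = re.compile(r"[A-Za-z0-9]+")


def is_valid_sample_id(value: str) -> bool:
    """Return True if value satisfies the assumed SA1 syntax."""
    return _SAMPLE_ID_PATTERN.fullmatch(value) is not None


print(is_valid_sample_id("V506"))   # example value from the standard
print(is_valid_sample_id("V 506"))  # contains a space, so it is rejected
```

If a sequencing center's identifiers legitimately contain hyphens or underscores, the pattern would need to be widened accordingly.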
2. Nucleic Acid Extraction Method
Sequencing Assay Field ID: SA2
Field Name: Nucleic Acid Extraction Method
Description: Experimental procedure used to derive the nucleic acid fraction from the submitted sample used for the sequencing reaction.
Privacy Risks: No
GenBank Structured Comment Synonym: SRA*
OBO Foundry Synonym: nucleic acid extraction
OBO Foundry ID: http://purl.obolibrary.org/obo/OBI_0666667
OBI Mapping Comments:  
Definition (OBI): a protocol is a plan specification which has sufficient level of detail and quantitative information to communicate it between domain experts, so that different domain experts will reliably be able to independently reproduce the process.
MIxS Equivalent: sample material processing
Other Synonyms: Nucleic acid preparation - extraction method
Data Categories: Sequencing Sample Preparation
Allowed Values: OBI, http://bioportal.bioontology.org/ontologies/40832
Unknown/Not Applicable/Censored Allowed:  
Syntax:  
Example Values: Illumina suggested standard method; CTAB/phenol/chloroform
Data Source: Sample Provider or GSCID
Comments:  
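The OBO Foundry ID values in this standard are PURLs whose final path segment carries the ontology prefix and the local number. The sketch below converts such a PURL to the compact PREFIX:NUMBER form commonly used for OBO terms; the helper name purl_to_curie is hypothetical and is not part of the standard.

```python
OBO_PURL_BASE = "http://purl.obolibrary.org/obo/"


def purl_to_curie(purl: str) -> str:
    """Convert an OBO PURL such as .../obo/OBI_0666667 to 'OBI:0666667'."""
    if not purl.startswith(OBO_PURL_BASE):
        raise ValueError(f"not an OBO PURL: {purl!r}")
    local = purl[len(OBO_PURL_BASE):]
    # Split on the first underscore: prefix before it, local number after it.
    prefix, sep, number = local.partition("_")
    if not sep or not prefix or not number:
        raise ValueError(f"unexpected OBO local ID: {local!r}")
    return f"{prefix}:{number}"


print(purl_to_curie("http://purl.obolibrary.org/obo/OBI_0666667"))
```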
3. Nucleic Acid Preparation Method
Sequencing Assay Field ID: SA3
Field Name: Nucleic Acid Preparation Method
Description: Details about the preparation of DNA samples for sequencing including if amplification was used (e.g., in the case of sequencing a single mosquito), and any other relevant molecular biology protocols done prior to sequencing.
Privacy Risks: No
GenBank Structured Comment Synonym: SRA*
OBO Foundry Synonym: sample preparation for sequencing assay
OBO Foundry ID: http://purl.obolibrary.org/obo/OBI_0001902
OBI Mapping Comments:  
Definition (OBI): a protocol is a plan specification which has sufficient level of detail and quantitative information to communicate it between domain experts, so that different domain experts will reliably be able to independently reproduce the process.
MIxS Equivalent: sample material processing
Other Synonyms: Nucleic acid preparation by GSC; including amplification procedure
Data Categories: Sequencing Sample Preparation
Allowed Values:  
Unknown/Not Applicable/Censored Allowed:  
Syntax:  
Example Values: Standard 454 LC
Data Source: GSCID
Comments: required with not applicable as an allowed value
4. Sequencing Method
Sequencing Assay Field ID: SA4
Field Name: Sequencing Method
Description: Experimental procedure used to derive sequence data from the input assay sample including both method and device. Type of sequencing used based on approach (pyrosequencing) and technology (454).
Privacy Risks: No
GenBank Structured Comment Synonym: SRA*
OBO Foundry Synonym: sequencing assay
OBO Foundry ID: http://purl.obolibrary.org/obo/OBI_0600047
OBI Mapping Comments:  
Definition (OBI): a protocol is a plan specification which has sufficient level of detail and quantitative information to communicate it between domain experts, so that different domain experts will reliably be able to independently reproduce the process.
MIxS Equivalent: sequencing method
Other Synonyms: Sequencing method (e.g. dideoxysequencing, pyrosequencing, polony)
Data Categories: Sequencing Assay
Allowed Values: OBI, http://bioportal.bioontology.org/ontologies/40832
Unknown/Not Applicable/Censored Allowed:  
Syntax: Consensus or Intra-Host
Example Values: Pyrosequencing
Data Source: GSCID
Comments:  
5. Assembly Name
Sequencing Assay Field ID: SA5
Field Name: Assembly Name
Description: A unique name given to a specific assembled genome build.
Privacy Risks: No
GenBank Structured Comment Synonym: Assembly Name
OBO Foundry Synonym: IAO:'written name' denotes some 'sequence assembly'
OBO Foundry ID:  
OBI Mapping Comments: 'sequence assembly' has been submitted to OBI
Definition (OBI):  
MIxS Equivalent:  
Other Synonyms:  
Data Categories: Data Transformation
Allowed Values:  
Unknown/Not Applicable/Censored Allowed:  
Syntax:  
Example Values:  
Data Source:  
Comments: Newly-proposed GenBank Structured Comment Field ID
6. Assembly Method
Sequencing Assay Field ID: SA6
Field Name: Assembly Method
Description: The name and version of the software or pipeline used to assemble individual sequence reads into larger contigs. Any dependencies such as reference genome files should be listed.
Privacy Risks: No
GenBank Structured Comment Synonym: Assembly Method*
OBO Foundry Synonym: software pipeline
OBO Foundry ID: http://purl.obolibrary.org/obo/OBI_0001943
OBI Mapping Comments:  
Definition (OBI): A plan specification that specifies a chain encoded in software of processing elements (processes, threads, coroutines, etc.), arranged so that the output of each element is the input of the next. Usually some amount of buffering is provided between consecutive elements.
MIxS Equivalent: assembly
Other Synonyms: Assembly (Assembly method, estimated error rate and method of calculation)
Data Categories: Data Transformation
Allowed Values: pick list
Unknown/Not Applicable/Censored Allowed:  
Syntax: free text
Example Values: Illumina GA pipeline ver1.3; Newbler MapAsmResearch-03/15/2010 -vs C_elegans -e 45; No assembly. Reads mapped to reference genome with TopHat; Newbler de novo hybrid assembly/CLC reference mapping of 454 reads; AV454 v1.0
Data Source: GSCID
Comments:  
7. Depth of Coverage - Average
Sequencing Assay Field ID: SA7
Field Name: Depth of Coverage - Average
Description: Depth of sequence coverage based both on external (e.g. Cot-based size estimates) and internal (average coverage in the assembly) measures of genome size.
Privacy Risks: No
GenBank Structured Comment Synonym: Genome Coverage*
OBO Foundry Synonym: average depth of sequence coverage
OBO Foundry ID: http://purl.obolibrary.org/obo/OBI_0001618
OBI Mapping Comments:  
Definition (OBI): An average value of the depth of sequence coverage based both on external (e.g. Cot-based size estimates) and internal (average coverage in the assembly) measures of genome size.
MIxS Equivalent: finishing strategy
Other Synonyms: average depth of sequence coverage
Data Categories: Data Transformation
Allowed Values:  
Unknown/Not Applicable/Censored Allowed:  
Syntax:  
Example Values: 45X average
Data Source: GSCID
Comments: core, but not available until genome submitted - average? min? max?
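The internal measure mentioned in the description (average coverage in the assembly) is commonly computed as the total number of sequenced bases divided by the genome or assembly length. A minimal sketch under that assumption follows; the function names, the input numbers, and the "NX average" rendering (modelled on the single example value above) are illustrative and are not prescribed by the standard.

```python
def average_depth(total_bases: int, genome_length: int) -> float:
    """Average depth of coverage = total sequenced bases / genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return total_bases / genome_length


def format_depth(depth: float) -> str:
    """Render a depth in the style of the example value, e.g. '45X average'."""
    return f"{round(depth)}X average"


# Hypothetical numbers: 225 Mb of reads over a 5 Mb genome.
depth = average_depth(225_000_000, 5_000_000)
print(format_depth(depth))  # 45X average
```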
8. Annotation Algorithm
Sequencing Assay Field ID: SA8
Field Name: Annotation Algorithm
Description: Computational algorithm used to identify sequence features (e.g. protein coding regions) in the assembled contig sequence. This may also include a description of any manual curation that may have generated or validated the annotation.
Privacy Risks: No
GenBank Structured Comment Synonym: Annotation Algorithm
OBO Foundry Synonym: sequence annotation algorithm
OBO Foundry ID: http://purl.obolibrary.org/obo/OBI_0001625
OBI Mapping Comments:  
Definition (OBI): A plan specification which describes inputs, output of mathematical functions as well as workflow of execution for achieving a predefined objective. Algorithms are realized usually by means of implementation as computer programs for execution by automata.
MIxS Equivalent:  
Other Synonyms: Annotation source
Data Categories: Data Transformation
Allowed Values: pick list
Unknown/Not Applicable/Censored Allowed:  
Syntax: free text
Example Values: RAST v4.0; reference annotation transfer
Data Source: GSCID
Comments: required with not applicable as an allowed value; core, but not available until genome submitted
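The comment "required with not applicable as an allowed value" (used for SA3 and SA8) implies that a record must carry some value for the field, and that a not-applicable marker counts as a value. A minimal sketch of that check, assuming records are held as a dict keyed by Sequencing Assay Field ID; the dict layout, the literal spelling "not applicable", and the function name are assumptions.

```python
# Fields whose Comments read "required with not applicable as an allowed value".
REQUIRED_ALLOWING_NA = ("SA3", "SA8")


def missing_required(record: dict) -> list:
    """Return the required field IDs that are absent or blank in record.

    Any non-blank string passes, including the assumed marker
    "not applicable"; only empty or whitespace-only values are flagged.
    """
    missing = []
    for field_id in REQUIRED_ALLOWING_NA:
        value = str(record.get(field_id, "")).strip()
        if not value:
            missing.append(field_id)
    return missing


print(missing_required({"SA3": "Standard 454 LC", "SA8": "not applicable"}))
print(missing_required({"SA3": "   "}))
```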
9. Annotation Provider
Sequencing Assay Field ID: SA9
Field Name: Annotation Provider
Description:  
Privacy Risks: No
GenBank Structured Comment Synonym: Annotation Provider
OBO Foundry Synonym: (organization or 'Homo sapiens') and participates_in some 'sequence annotation'
OBO Foundry ID:  
OBI Mapping Comments: 'sequence annotation' has been submitted to OBI
Definition (OBI):  
MIxS Equivalent:  
Other Synonyms:  
Data Categories: Data Transformation
Allowed Values:  
Unknown/Not Applicable/Censored Allowed:  
Syntax:  
Example Values:  
Data Source:  
Comments: Newly-proposed GenBank Structured Comment Field ID
10. Annotation Status
Sequencing Assay Field ID: SA10
Field Name: Annotation Status
Description:  
Privacy Risks: No
GenBank Structured Comment Synonym: Annotation Status
OBO Foundry Synonym: report is_about some (is_specified_output_of some 'sequence annotation')
OBO Foundry ID:  
OBI Mapping Comments: 'sequence annotation' has been submitted to OBI
Definition (OBI):  
MIxS Equivalent:  
Other Synonyms:  
Data Categories: Data Transformation
Allowed Values:  
Unknown/Not Applicable/Censored Allowed:  
Syntax:  
Example Values:  
Data Source:  
Comments: Newly-proposed GenBank Structured Comment Field ID
11. Annotation Version
Sequencing Assay Field ID: SA11
Field Name: Annotation Version
Description:  
Privacy Risks: No
GenBank Structured Comment Synonym: Annotation Version
OBO Foundry Synonym: 'version number' and is_about some (is_specified_output_of some 'sequence annotation')
OBO Foundry ID:  
OBI Mapping Comments: 'sequence annotation' has been submitted to OBI
Definition (OBI):  
MIxS Equivalent:  
Other Synonyms:  
Data Categories: Data Transformation
Allowed Values:  
Unknown/Not Applicable/Censored Allowed:  
Syntax:  
Example Values:  
Data Source:  
Comments: Newly-proposed GenBank Structured Comment Field ID
12. Annotation Pipeline
Sequencing Assay Field ID: SA12
Field Name: Annotation Pipeline
Description:  
Privacy Risks: No
GenBank Structured Comment Synonym: Annotation Pipeline
OBO Foundry Synonym: 'software pipeline' and 'is concretized as' some ('is realized by' some 'sequence annotation')
OBO Foundry ID:  
OBI Mapping Comments: 'software pipeline' and 'sequence annotation' have been submitted to OBI
Definition (OBI):  
MIxS Equivalent:  
Other Synonyms:  
Data Categories: Data Transformation
Allowed Values:  
Unknown/Not Applicable/Censored Allowed:  
Syntax:  
Example Values:  
Data Source:  
Comments: Newly-proposed GenBank Structured Comment Field ID
13. Annotation Method
Sequencing Assay Field ID: SA13
Field Name: Annotation Method
Description:  
Privacy Risks: No
GenBank Structured Comment Synonym: Annotation Method
OBO Foundry Synonym: protocol and 'is concretized as' some ('is realized by' some 'sequence annotation')
OBO Foundry ID:  
OBI Mapping Comments: 'sequence annotation' has been submitted to OBI
Definition (OBI):  
MIxS Equivalent:  
Other Synonyms:  
Data Categories: Data Transformation
Allowed Values:  
Unknown/Not Applicable/Censored Allowed:  
Syntax:  
Example Values:  
Data Source:  
Comments: Newly-proposed GenBank Structured Comment Field ID
14. Features Annotated
Sequencing Assay Field ID: SA14
Field Name: Features Annotated
Description:  
Privacy Risks: No
GenBank Structured Comment Synonym: Features Annotated
OBO Foundry Synonym: sequence_feature
OBO Foundry ID: http://purl.obolibrary.org/obo/SO_0000110
OBI Mapping Comments:  
Definition (OBI): An extent of biological sequence.
MIxS Equivalent:  
Other Synonyms:  
Data Categories: Data Transformation
Allowed Values:  
Unknown/Not Applicable/Censored Allowed:  
Syntax:  
Example Values:  
Data Source:  
Comments: Newly-proposed GenBank Structured Comment Field ID
15. GenBank Record ID
Sequencing Assay Field ID: SA15
Field Name: GenBank Record ID
Description: Unique identifier of the submitted GenBank sequence record(s).
Privacy Risks: No
GenBank Structured Comment Synonym:  
OBO Foundry Synonym: GenBank ID
OBO Foundry ID: http://purl.obolibrary.org/obo/OBI_0001614
OBI Mapping Comments:  
Definition (OBI): An information content entity that consists of a CRID symbol and additional information about which CRID registry it belongs.
MIxS Equivalent:  
Other Synonyms:  
Data Categories: Data Transformation
Allowed Values: free text
Unknown/Not Applicable/Censored Allowed:  
Syntax: free text
Example Values:  
Data Source: GSCID
Comments: core, but not available until genome submitted
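Fields SA5 through SA14 each list a named GenBank Structured Comment Synonym. The sketch below shows one way a record keyed by Sequencing Assay Field ID could be relabelled with those synonyms. The labels are copied from the entries above with the trailing asterisks dropped (their meaning is not defined in this document); the record layout and function name are hypothetical, and the output is a plain list of pairs, not a GenBank submission format.

```python
# GenBank Structured Comment Synonyms as listed in this standard (SA5 to SA14).
# SA1 to SA4 are listed only as "SRA*" and SA15 has no synonym, so they are omitted.
GENBANK_SYNONYMS = {
    "SA5": "Assembly Name",
    "SA6": "Assembly Method",
    "SA7": "Genome Coverage",
    "SA8": "Annotation Algorithm",
    "SA9": "Annotation Provider",
    "SA10": "Annotation Status",
    "SA11": "Annotation Version",
    "SA12": "Annotation Pipeline",
    "SA13": "Annotation Method",
    "SA14": "Features Annotated",
}


def to_structured_comment(record: dict) -> list:
    """Return (label, value) pairs for fields that have a synonym and a value.

    Pairs follow the field order of the standard; blank values are skipped.
    """
    pairs = []
    for field_id, label in GENBANK_SYNONYMS.items():
        value = str(record.get(field_id, "")).strip()
        if value:
            pairs.append((label, value))
    return pairs


# Values taken from the Example Values entries in this standard.
record = {"SA1": "V506", "SA6": "AV454 v1.0", "SA7": "45X average", "SA8": "RAST v4.0"}
for label, value in to_structured_comment(record):
    print(f"{label}\t{value}")
```

SA1 is left out of the output here because its synonym is given only as "SRA*"; how it would be embedded as a dbxref, as the SA1 comment suggests, is not specified in this document.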