Glossary¶
- BAM¶
Binary Alignment/Map. A binary format for storing sequence data. BAM files hold the comprehensive raw data of genome sequencing.
- base-quality¶
The base quality score is a measure of the quality of the identification of the nucleobases generated by automated DNA sequencing. Essentially, this is an indication of the likelihood of this base call being correct.
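As an illustration of how a score relates to the likelihood of a correct call, here is a minimal sketch. It assumes the scores are Phred-scaled (Q = -10 · log10(P), where P is the probability that the call is wrong), which is the usual convention for sequencing data but is not stated in this glossary. The function names are hypothetical.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Probability that a base call with Phred score q is wrong (assumes Phred scaling)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred score corresponding to an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


# Print the error probability and implied accuracy for a few common scores.
for q in (10, 20, 30):
    p = phred_to_error_prob(q)
    print(f"Q{q}: error probability {p:.3f}, accuracy {1 - p:.1%}")
```

Under this scaling, each increase of 10 in the score corresponds to a tenfold drop in the chance that the base call is wrong.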
- child¶
An object or node found under another node in a hierarchy (e.g., in the PhenoCODE Metadata tree).
- cohort¶
A group of people with a shared characteristic. In Sequence Miner, a group of subjects in either the case group or the control group. See cohort analysis.
- cohort analysis¶
A subset of analytics that identifies relationships between the phenotypic and genomic data of two groups: cases and controls. Two main types of report builders exist for cohort analysis in Sequence Miner: 1) Variant-based association tests and 2) Gene-based association tests.
- collapse¶
Hide an item or object in a menu or hierarchy by clicking the arrow next to it.
- Clinical Sequence Analyzer¶
The Clinical Sequence Analyzer (CSA) is a GUI tool used for data mining and report generation from raw genetic data.
- expand¶
Show an item or object in a menu or hierarchy by clicking the arrow next to it.
- gene association¶
Definition
- Gene Ontology¶
Definition
- genome¶
The complete set of genes or genetic material present in a cell or organism.
- genomic ordered data¶
Genomic data ordered by the genomic position of the data.
- GOR¶
Genomic Ordered Relation. The main component of GOR is the GOR data processing language, but other components are the GORServer, GORWorker, and AppServer.
The term “GOR” may be used in reference to:
The declarative query language used to structure commands to access information in the GOR database.
The GOR database itself.
The GOR architecture as a whole.
The act of merging two streams together, e.g. to “gor” together two files.
- GORpipe¶
A command line interface for the GOR language.
- GOR stream¶
A stream of data that is genomic-ordered. In other words, the output from a GOR query.
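The idea of genomic-ordered streams, and of merging two of them, can be sketched in plain Python. This is not GOR code and does not reflect how GOR is implemented. The row layout, the helper names, and the choice to compare chromosome names as plain strings are all assumptions made for illustration.

```python
import heapq
from typing import Iterable, Iterator, Tuple

# Hypothetical row layout: (chromosome, position, payload).
Row = Tuple[str, int, str]


def sort_key(row: Row) -> Tuple[str, int]:
    # Assumption: chromosomes compare as plain strings, positions as integers.
    return (row[0], row[1])


def merge_streams(a: Iterable[Row], b: Iterable[Row]) -> Iterator[Row]:
    """Lazily merge two streams that are each already in genomic order."""
    return heapq.merge(a, b, key=sort_key)


# Two small inputs, each already sorted by chromosome and position.
variants = [("chr1", 100, "A>G"), ("chr1", 500, "C>T"), ("chr2", 50, "G>A")]
genes = [("chr1", 300, "GENE_A"), ("chr2", 10, "GENE_B")]

merged = list(merge_streams(variants, genes))
for row in merged:
    print(*row, sep="\t")
```

Because both inputs are already ordered, the merge reads each stream once and never needs to hold a whole file in memory, which is the practical benefit of keeping data in genomic order.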
- GOR Query Language¶
The subject of this manual. A query language for processing, filtering, and outputting genomic-ordered (and non-ordered) relational data.
- heterozygous¶
In the genotype of a diploid organism, a pair of different alleles at a single locus on the DNA.
- homozygous¶
In the genotype of a diploid organism, a pair of identical alleles at a single locus on the DNA.
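The distinction between heterozygous and homozygous can be expressed as a one-line check. This is a minimal sketch: the function name and the representation of a genotype as two allele strings are assumptions for illustration, not part of any tool described here.

```python
def zygosity(allele_a: str, allele_b: str) -> str:
    """Classify a diploid genotype at one locus from its two alleles."""
    return "homozygous" if allele_a == allele_b else "heterozygous"


# Example genotypes at a single locus.
for alleles in (("A", "A"), ("A", "G"), ("G", "G")):
    print(alleles, "->", zygosity(*alleles))
```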
- leaf¶
The end object or node in an object hierarchy or tree, found by expanding the parent nodes (e.g., in the PhenoCODE Metadata tree). See expand.
- mitochondrial DNA¶
(also known as mtDNA) DNA located in the mitochondria, the organelles within eukaryotic cells that convert chemical energy from food into a form that cells can use. mtDNA is passed down from mothers to both sons and daughters.
- OMIM¶
Online Mendelian Inheritance in Man. A continuously updated catalog of human genes and genetic disorders and traits, with a particular focus on the gene-phenotype relationship.
- NOR¶
Non-Ordered Relations. A subset of the commands in the GOR query language is also usable in NOR.
- paralogs¶
Definition
- parent¶
In a hierarchy or tree, the parent is the next object or node above a selected node (e.g., in the PhenoCODE Metadata tree).
- PhenoCODE¶
A phenotype exploration module in Sequence Miner.
- position¶
Also labeled as “POS”. The (first) base-pair position of the sequence variant, i.e. the position of the first nucleotide in the Reference column.
- relation¶
In PhenoCODE, data is kept together in a relation.
- root¶
A single top-most object or node in a hierarchy or tree (e.g. PN in the PhenoCODE Metadata tree).
- Sequence Miner¶
The GUI tool used by advanced users of WuXi NextCODE’s tools, which enables deep data mining and custom queries on top of raw genetic data as well as derived data.
- variant association¶
Definition
- variants¶
All the different ways that one person’s DNA sequence can differ from the reference DNA sequence (e.g. single-nucleotide polymorphisms, insertions, deletions, substitutions, structural variants).
- Variant Effect Predictor¶
The Variant Effect Predictor, or VEP, determines the effect of your variants (see variants) on genes, transcripts, and protein sequences, as well as regulatory regions.
- zero-based position¶
A numbering format that starts from zero, in which individual bases in the genomic sequence occupy the spaces between the numbers. UCSC uses a 0-based system, whereas other systems such as Ensembl are 1-based. GOR is 1-based.
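Converting between the two numbering schemes is a common source of off-by-one errors, so a short sketch may help. It assumes the 0-based system uses half-open intervals (start included, end excluded) and the 1-based system uses closed intervals (both ends included), which is the common pairing but is an assumption here. The function names are hypothetical.

```python
from typing import Tuple


def zero_to_one_based(start0: int, end0: int) -> Tuple[int, int]:
    """Convert a 0-based half-open interval [start0, end0) to 1-based closed."""
    return start0 + 1, end0


def one_to_zero_based(start1: int, end1: int) -> Tuple[int, int]:
    """Convert a 1-based closed interval [start1, end1] to 0-based half-open."""
    return start1 - 1, end1


# The first ten bases of a sequence in each convention.
print(zero_to_one_based(0, 10))
print(one_to_zero_based(1, 10))
```

Only the start coordinate shifts. The end coordinate is numerically the same in both schemes because the 0-based end is exclusive while the 1-based end is inclusive.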