Pathways to genes¶
The Pathways to genes report builder generates a list of genes found in the selected pathway(s). The list of genes can be reported in any of the following formats:
Single row per gene per pathway related to each gene identified
Single row per gene in the pathway(s) related to each gene identified
Single row per pathway for a set of genes related to each gene identified

Pathways To Genes module in Sequence Miner¶
Example use case¶
The user has identified a candidate pathway for a phenotype and wishes to identify the genes in that pathway in order to create a candidate gene list for a study.
Description of the algorithm¶
Gene and pathway information is stored in the ensgenes_gene2pathway.mmap file in the ref folder in the Sequence Miner File Explorer.
The pathways are derived from:
BIOCYC: BioCyc is a collection of 9387 Pathway/Genome Databases (PGDBs).
HGMD: Qiagen’s knowledge base for their Ingenuity Pathway Analysis platform.
KEGG: KEGG PATHWAY is a collection of manually drawn pathway maps representing current knowledge on molecular interaction and reaction networks. KEGG PATHWAY mapping is the process of mapping molecular datasets (e.g., genomics, transcriptomics, proteomics, and metabolomics) to KEGG pathway maps for biological interpretation.
Pathway Interaction Database: The PID consists of NCI/Nature-curated pathway information, now available through ndexbio.org, hosted by the Ideker Lab at the UC San Diego School of Medicine.
REACTOME: REACTOME is an open-source, open access, manually curated and peer-reviewed pathway database.
WikiPathways: WikiPathways was established to facilitate the contribution and maintenance of pathway information by the biology community. WikiPathways is an open, collaborative platform dedicated to the curation of biological pathways.
The user types a search term in the pathways field and selects one or more pathways from the drop-down list. The pathway term is mapped to the ensgenes_gene2pathway.mmap file and genes matched to the pathway are then reported in the results.
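The lookup described above can be sketched in Python. This is only an illustration: the two-column (gene symbol, pathway) layout, the sample rows, and the function name `genes_for_pathways` are assumptions made for the example, not the actual structure of ensgenes_gene2pathway.mmap or the module's implementation.

```python
from collections import defaultdict

# Hypothetical (gene symbol, pathway) pairs standing in for the mapping file.
GENE2PATHWAY = [
    ("GENE_A", "Pathway 1"),
    ("GENE_B", "Pathway 1"),
    ("GENE_B", "Pathway 2"),
    ("GENE_C", "Pathway 2"),
]


def genes_for_pathways(rows, selected):
    """Return {pathway: sorted gene list} for each selected pathway with matches."""
    wanted = set(selected)
    hits = defaultdict(set)
    for gene, pathway in rows:
        if pathway in wanted:
            hits[pathway].add(gene)
    return {pathway: sorted(genes) for pathway, genes in hits.items()}


if __name__ == "__main__":
    print(genes_for_pathways(GENE2PATHWAY, ["Pathway 2"]))
```

A pathway with no matching rows is simply absent from the returned dictionary in this sketch.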
Interpreting the output¶
The query returns a table in GOR report format displaying the results in one of three available formats:
line_per_gene_pathway results display:
One gene per row in the Gene_Symbol column
One pathway per row in the Pathway column
All genes in that pathway are in a comma-delimited list in the Genes_in_pathway column
line_per_gene results display:
One gene per row in the Gene_Symbol column
All pathways for that gene are in a comma-delimited list in the pathways column
All genes in those pathways are in a comma-delimited list in the Genes_in_pathway column
line_per_gene_pathway_relatedgene results display:
One gene per row in the Gene_Symbol column
One pathway per row in the Pathway column
Every other gene in the same pathway is in the Gene_in_pathway column, one related gene per row
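To make the difference between the three formats concrete, the following Python sketch builds each row shape from the same small set of gene-pathway pairs. The gene and pathway names are made up, the function names merely mirror the format names, and two details are assumptions: that the gene list in line_per_gene is the union across all of that gene's pathways, and that rows are sorted. The real report is produced by Sequence Miner, not by this code.

```python
from collections import defaultdict

# Hypothetical (gene symbol, pathway) pairs used only for illustration.
PAIRS = [
    ("GENE_A", "Pathway 1"),
    ("GENE_B", "Pathway 1"),
    ("GENE_B", "Pathway 2"),
    ("GENE_C", "Pathway 2"),
]


def _members(pairs):
    """Map each pathway to the set of genes it contains."""
    by_pathway = defaultdict(set)
    for gene, pathway in pairs:
        by_pathway[pathway].add(gene)
    return by_pathway


def line_per_gene_pathway(pairs):
    """One row per (gene, pathway); pathway members as a comma-delimited list."""
    members = _members(pairs)
    return [(g, p, ",".join(sorted(members[p]))) for g, p in sorted(set(pairs))]


def line_per_gene(pairs):
    """One row per gene; its pathways and their member genes as lists."""
    members = _members(pairs)
    by_gene = defaultdict(set)
    for g, p in pairs:
        by_gene[g].add(p)
    rows = []
    for g in sorted(by_gene):
        pathways = sorted(by_gene[g])
        # Assumption: genes are pooled across all pathways of this gene.
        genes = sorted(set().union(*(members[p] for p in pathways)))
        rows.append((g, ",".join(pathways), ",".join(genes)))
    return rows


def line_per_gene_pathway_relatedgene(pairs):
    """One row per (gene, pathway, other gene in the same pathway)."""
    members = _members(pairs)
    return [
        (g, p, other)
        for g, p in sorted(set(pairs))
        for other in sorted(members[p])
        if other != g
    ]


if __name__ == "__main__":
    for build in (line_per_gene_pathway, line_per_gene,
                  line_per_gene_pathway_relatedgene):
        print(build.__name__)
        for row in build(PAIRS):
            print("  ", row)
```

With this sample input, the first and third formats each yield four rows, while line_per_gene collapses them to three rows, one per gene.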
Column descriptions¶
Group | Column | Description |
---|---|---|
gene | end | The end base pair position of the gene |
gene | start | The start base pair position of the gene (zero-based, i.e., the position of the base pair before the first base pair in the gene) |
gene | Symbol | Gene identified by viewing the variant in Ensembl; outputs the HGNC gene symbol for the identified gene (the clone name is provided when no HGNC symbol is available) |
Other columns | Chrom | The chromosome of the given gene |
Other columns | Genes_in_pathway | Gene(s) in the selected pathway |
Other columns | Pathway/pathways | The selected pathway(s) |
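Because start is zero-based and points at the base pair before the gene, the first base pair of the gene sits at start + 1. If end is taken to be the 1-based position of the last base pair (an assumption; the table does not state this explicitly), the gene length is end - start. A minimal Python sketch, with hypothetical function names and made-up coordinates:

```python
def first_base(start):
    """1-based position of the first base pair, given a zero-based start."""
    return start + 1


def gene_length(start, end):
    """Length in base pairs, assuming end is the 1-based last position."""
    return end - start


if __name__ == "__main__":
    # Made-up coordinates for illustration only.
    print(first_base(999), gene_length(999, 1500))
```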