Pathways to genes

The Pathways to genes report builder generates a list of the genes found in one or more selected pathways. The list can be reported in any of the following formats:

  • Single row per gene per pathway related to each gene identified

  • Single row per gene in the pathway(s) related to each gene identified

  • Single row per pathway for a set of genes related to each gene identified

Figure: Pathways To Genes module in Sequence Miner

Example use case

The user has identified a candidate pathway for a phenotype and wants to identify the genes in that pathway in order to create a candidate gene list for a study.

Description of the algorithm

Gene and pathway information is stored in the ensgenes_gene2pathway.mmap file in the ref folder in the Sequence Miner File Explorer. The pathways are derived from:

  • BIOCYC: BioCyc is a collection of 9387 Pathway/Genome Databases (PGDBs).

  • HGMD: Qiagen’s knowledge base for their Ingenuity Pathway Analysis platform.

  • KEGG: KEGG PATHWAY is a collection of manually drawn pathway maps representing current knowledge of molecular interaction and reaction networks. KEGG PATHWAY mapping is the process of mapping molecular datasets (e.g., genomics, transcriptomics, proteomics, and metabolomics) to KEGG pathway maps for biological interpretation.

  • Pathway Interaction Database: The PID consists of NCI/Nature-curated pathway information, now available through ndexbio.org, hosted by the Ideker Lab at the UC San Diego School of Medicine.

  • REACTOME: REACTOME is an open-source, open access, manually curated and peer-reviewed pathway database.

  • WikiPathways: WikiPathways was established to facilitate the contribution and maintenance of pathway information by the biology community. WikiPathways is an open, collaborative platform dedicated to the curation of biological pathways.

The user types a search term in the pathways field and selects one or more pathways from the drop-down list. Each selected pathway is matched against the ensgenes_gene2pathway.mmap file, and the genes that belong to it are reported in the results.
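The lookup described above can be illustrated with a minimal Python sketch. It is not the Sequence Miner implementation: the layout of ensgenes_gene2pathway.mmap is not described here, so the sketch assumes the file has already been loaded as (gene, pathway) pairs. The gene and pathway names, the function names, and the case-insensitive substring search are all hypothetical.

```python
# Hypothetical (gene, pathway) pairs standing in for the contents of
# ensgenes_gene2pathway.mmap; the real file layout is an assumption.
GENE2PATHWAY = [
    ("GENE1", "Pathway alpha"),
    ("GENE2", "Pathway alpha"),
    ("GENE2", "Pathway beta"),
    ("GENE3", "Pathway beta"),
]


def search_pathways(rows, term):
    """Return sorted pathway names containing the term (case-insensitive).

    Substring matching is an assumption about how the search field behaves.
    """
    needle = term.lower()
    return sorted({pathway for _, pathway in rows if needle in pathway.lower()})


def genes_in_pathways(rows, selected):
    """Return the sorted gene symbols that belong to any selected pathway."""
    wanted = set(selected)
    return sorted({gene for gene, pathway in rows if pathway in wanted})


if __name__ == "__main__":
    choices = search_pathways(GENE2PATHWAY, "beta")
    print(choices)
    print(genes_in_pathways(GENE2PATHWAY, choices))
```

In this sketch the search step only narrows the list of pathway names to choose from. The gene list is built from the pathways that were actually selected.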

Interpreting the output

The query returns a table in GOR report format displaying the results in one of three available formats:

  • line_per_gene_pathway results display:

    • One gene per row in the Gene_Symbol column

    • One pathway per row in the Pathway column

    • All genes in that pathway are in a comma-delimited list in the Genes_in_pathway column

  • line_per_gene results display:

    • One gene per row in the Gene_Symbol column

    • All pathways for that gene are in a comma-delimited list in the pathways column

    • All genes in those pathways are in a comma-delimited list in the Genes_in_pathway column

  • line_per_gene_pathway_relatedgene results display:

    • One gene per row in the Gene_Symbol column

    • One pathway per row in the Pathway column

    • Every other gene in the same pathway is in the Gene_in_pathway column, one related gene per row
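To make the difference between the three formats concrete, the following Python sketch builds each one from the same small set of (gene, pathway) pairs. It is an illustration only, not the query the report builder runs. The input data is made up, and two points are assumptions: in line_per_gene the gene list is taken as the union over all of the gene's pathways, and all rows are sorted alphabetically.

```python
from collections import defaultdict

# Made-up (gene, pathway) pairs used only for illustration.
ROWS = [
    ("GENE1", "Pathway alpha"),
    ("GENE2", "Pathway alpha"),
    ("GENE2", "Pathway beta"),
    ("GENE3", "Pathway beta"),
]


def build_indexes(rows):
    """Index the pairs both ways: pathway -> genes and gene -> pathways."""
    genes_by_pathway = defaultdict(set)
    pathways_by_gene = defaultdict(set)
    for gene, pathway in rows:
        genes_by_pathway[pathway].add(gene)
        pathways_by_gene[gene].add(pathway)
    return genes_by_pathway, pathways_by_gene


def line_per_gene_pathway(rows):
    """One row per (gene, pathway) with all genes of that pathway listed."""
    genes_by_pathway, _ = build_indexes(rows)
    return [
        (gene, pathway, ",".join(sorted(genes_by_pathway[pathway])))
        for gene, pathway in sorted(set(rows))
    ]


def line_per_gene(rows):
    """One row per gene with its pathways and the genes in those pathways."""
    genes_by_pathway, pathways_by_gene = build_indexes(rows)
    result = []
    for gene in sorted(pathways_by_gene):
        pathways = sorted(pathways_by_gene[gene])
        # Assumption: the gene list is the union over all of the gene's pathways.
        genes = set().union(*(genes_by_pathway[p] for p in pathways))
        result.append((gene, ",".join(pathways), ",".join(sorted(genes))))
    return result


def line_per_gene_pathway_relatedgene(rows):
    """One row per (gene, pathway, other gene in the same pathway)."""
    genes_by_pathway, _ = build_indexes(rows)
    return [
        (gene, pathway, other)
        for gene, pathway in sorted(set(rows))
        for other in sorted(genes_by_pathway[pathway])
        if other != gene
    ]
```

The first two formats are compact and suit building a candidate gene list. The third has one related gene per row, so it can be joined or filtered without first splitting a comma-delimited field.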

Column descriptions

Report output columns and descriptions

| Group | Column | Description |
|---|---|---|
| gene | end | The end base pair position of the gene |
| gene | start | The start base pair position of the gene (zero based, i.e., the position of the base pair before the first base pair in the gene) |
| gene | Symbol | Gene identified by viewing variant in Ensembl; outputs HGNC gene symbol for gene identified (clone name provided when HGNC unavailable) |
| Other columns | Chrom | The chromosome of the given gene |
| Other columns | Genes_in_pathway | Gene(s) in the selected pathway |
| Other columns | Pathway/pathways | The selected pathway(s) |
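The zero-based start convention is easy to misread, so here is a small Python sketch of what it implies. It assumes that the end column is the position of the last base pair of the gene, which the description suggests but does not state outright. The function names are hypothetical.

```python
def first_base_position(start):
    """Position of the first base pair in the gene.

    The start column holds the position just before the gene,
    so the first base pair of the gene is at start + 1.
    """
    return start + 1


def gene_length(start, end):
    """Number of base pairs in the gene.

    Assumes end is the position of the last base pair, so the gene
    covers positions start + 1 through end inclusive.
    """
    return end - start


if __name__ == "__main__":
    # Made-up coordinates for illustration.
    print(first_base_position(100))
    print(gene_length(100, 250))
```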