Input file formats

The following is a list of input file formats and the report builders that use them.

Input argument

candidategenereportfile, candidategenefile, or genepanelinfofile

Report builders

Multi-family Mendelian analysis, De novo/CHZ for trios, Sample QC and statistics

Format

This parameter points to the candidate gene file (.tsv, .rep, or .gor) that the CSA generates for each study (located in the corresponding study folder under studies in the folder tree). There are four required columns: chrom, gene_start, gene_end, and gene_symbol.

guides/images/candidateGeneReportFile.png

candidategenereportfile, candidategenefile, or genepanelinfofile
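As a minimal sketch of the candidate gene file layout described above, the following Python snippet writes a tab-separated table with the four required columns and checks a header for missing ones. The gene symbols and coordinates are placeholders, and the helper name `missing_columns` and the case-insensitive comparison are assumptions made for illustration; they are not part of the product.

```python
import csv
import io

# The four required columns named in the format description.
REQUIRED_COLUMNS = ["chrom", "gene_start", "gene_end", "gene_symbol"]


def missing_columns(header):
    """Return the required columns that are absent from a header row.

    Comparison is case-insensitive, which is an assumption of this sketch.
    """
    present = {name.strip().lower() for name in header}
    return [col for col in REQUIRED_COLUMNS if col not in present]


# Placeholder rows for illustration only; these are not real gene coordinates.
rows = [
    ["chr1", "1000", "2000", "GENE_A"],
    ["chr2", "5000", "7500", "GENE_B"],
]

# Build the tab-separated content in memory instead of touching the disk.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerow(REQUIRED_COLUMNS)
writer.writerows(rows)
candidate_gene_tsv = buffer.getvalue()

header = candidate_gene_tsv.splitlines()[0].split("\t")
print(missing_columns(header))  # prints []
```

Running a check like this before passing the file to a report builder can catch a misspelled column name early.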

Input argument

candidategenereportgrid

Report builders

Multi-family Mendelian analysis, Mendelian analysis, De novo/CHZ for trios

Format

This is an alternative to candidategenereportfile; GOR grid in an open tab with four required columns: chrom, gene_start, gene_end, gene_symbol

guides/images/candidateGeneReportGrid.png

candidategenereportgrid

Input argument

customAlleleFreqFile

Report builders

Multi-family Mendelian analysis, Mendelian analysis, De novo/CHZ for trios

Format

A GOR variant file with additional columns for alleleFreq and PNcount (the Variant QC and statistics report builder can be used to calculate such allele frequencies); the column containing allele frequencies must be labeled “alleleFreq” and the column containing the number of carriers of the variants must be labeled “PNcount”

guides/images/customAlleleFreqFile.png

customAlleleFreqFile
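The labeling rules for the custom allele frequency file can be checked with a short script. The sketch below assumes a tab-separated export whose variant columns are named chrom, pos, ref, and alt (those four names, the sample values, and the function name `check_allele_freq_rows` are hypothetical). It verifies that the “alleleFreq” and “PNcount” columns exist, that each frequency lies between 0 and 1, and that each carrier count is a non-negative integer.

```python
import csv
import io


def check_allele_freq_rows(text):
    """Return a list of problems found in a tab-separated allele frequency table."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    fieldnames = reader.fieldnames or []

    # Both column labels must match exactly, as stated in the format description.
    problems = [
        f"missing column: {name}"
        for name in ("alleleFreq", "PNcount")
        if name not in fieldnames
    ]
    if problems:
        return problems

    # Data starts on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        try:
            freq = float(row["alleleFreq"])
            count = int(row["PNcount"])
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: non-numeric alleleFreq or PNcount")
            continue
        if not 0.0 <= freq <= 1.0:
            problems.append(f"line {line_no}: alleleFreq {freq} is outside 0 to 1")
        if count < 0:
            problems.append(f"line {line_no}: PNcount {count} is negative")
    return problems


# Placeholder variant for illustration only.
sample = (
    "chrom\tpos\tref\talt\talleleFreq\tPNcount\n"
    "chr1\t12345\tA\tG\t0.012\t7\n"
)
print(check_allele_freq_rows(sample))  # prints []
```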

Input argument

customAnnotationFile

Report builders

Mendelian analysis, De novo/CHZ for trios

Format

A GOR variant file with annotation columns; maximum one line per variant

guides/images/customAnnotationFile.png

customAnnotationFile

Input argument

customRegionFile

Report builders

Mendelian analysis, De novo/CHZ for trios

Format

A GOR region file to add custom annotations such as CNVs (not to filter); file should contain at least three columns (chrom, bpstart, bpstop) as the first three columns (column names do not need to match these terms)

guides/images/customRegionFile.png

customRegionFile

Input argument

exclusionRegions

Report builders

Variant association

Format

A GOR segment file (containing chrom, bpstart, bpstop columns) that defines regions to be excluded (column names do not need to match these terms)

guides/images/exclusionRegions.png

exclusionRegions

Input argument

exclusionVarfile

Report builders

Gene association, Multi-family Mendelian analysis, Mendelian analysis, De novo/CHZ for trios

Format

A GOR variant file (containing chrom, pos, ref, and alt columns) with variants to be excluded from the analysis (column names do not need to match these terms)

guides/images/exclusionVarFile.png

exclusionVarfile

Input argument

genelist

Report builders

Used in many report builders

Format

One single-column grid containing gene symbols; the column name is either gene_symbol or gene

guides/images/geneList.png

genelist

Input argument

Genepanelinfofile

Report builders

Multi-family Mendelian analysis, Mendelian analysis, De novo/CHZ for trios, Transcripts

Format

A .tsv, .rep, or .gor file with the following columns: gene_symbol, gene_DmaxAf, gene_MOI, gene_RmaxAf, gene_RmaxGf. This file allows the default RmaxAf, RmaxGf, and DmaxAf to be overridden per gene symbol. Column names should match the five column names listed, and the gene_symbol column should be the first column.

guides/images/genePanelInfoFile.png

Genepanelinfofile

Input argument

geneReport

Report builders

Genes to pathways, Genes to paralogs

Format

A data grid containing gene information; the first column should be the chromosome (the column name is not important), and the second to fourth columns should be the start position, end position, and gene symbol. These three columns should be named “gene_start” or “gene_begin”, “gene_end” or “gene_stop”, and “gene_symbol” or “gene”, respectively.

guides/images/geneReport.png

geneReport
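Because the geneReport grid is checked by column position as well as by name, a positional header check may help. The sketch below encodes the accepted names from the description above; the function name `check_gene_report_header` and the case-insensitive comparison are assumptions of this example.

```python
# Accepted names per column position, taken from the format description.
# Position 0 holds the chromosome and may have any name, so it is None here.
ACCEPTED_NAMES = [
    None,
    {"gene_start", "gene_begin"},
    {"gene_end", "gene_stop"},
    {"gene_symbol", "gene"},
]


def check_gene_report_header(header):
    """Return a list of problems with the first four column names."""
    if len(header) < len(ACCEPTED_NAMES):
        return [f"expected at least {len(ACCEPTED_NAMES)} columns, found {len(header)}"]

    problems = []
    for index, allowed in enumerate(ACCEPTED_NAMES):
        if allowed is None:
            continue  # any name is fine for the chromosome column
        # Case-insensitive match is an assumption of this sketch.
        if header[index].strip().lower() not in allowed:
            problems.append(
                f"column {index + 1} is '{header[index]}', "
                f"expected one of {sorted(allowed)}"
            )
    return problems


print(check_gene_report_header(["Chromo", "gene_begin", "gene_stop", "gene"]))  # prints []
```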

Input argument

gorReport or variantreport

Report builders

Used in many report builders

Format

A data grid containing variant information; the first two columns should be chromosome and position (the column names are not important), the third column should be named “Ref” or “reference”, and the fourth column should be named “Call”, “Allele”, or “Alt”

guides/images/gorReport.png

gorReport or variantreport

Input argument

pedigree

Report builders

De novo/CHZ for trios

Format

Three required columns with the following column names: PN (sample ID of index case), FN (sample ID of father), and MN (sample ID of mother)

guides/images/pedigree.png

pedigree
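The pedigree layout above maps naturally onto a list of (index case, father, mother) trios. As a rough sketch, assuming the pedigree has been exported as tab-separated text, the snippet below reads such a table and fails early if one of the PN, FN, or MN columns is missing. The sample IDs and the function name `read_trios` are hypothetical.

```python
import csv
import io

# Column names required by the pedigree format description.
REQUIRED_COLUMNS = ("PN", "FN", "MN")


def read_trios(text):
    """Parse tab-separated pedigree text into (PN, FN, MN) tuples."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    fieldnames = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in fieldnames]
    if missing:
        raise ValueError(f"missing pedigree columns: {', '.join(missing)}")
    return [(row["PN"], row["FN"], row["MN"]) for row in reader]


# Placeholder sample IDs for illustration only.
pedigree_text = (
    "PN\tFN\tMN\n"
    "SAMPLE_001\tSAMPLE_002\tSAMPLE_003\n"
)
print(read_trios(pedigree_text))
# prints [('SAMPLE_001', 'SAMPLE_002', 'SAMPLE_003')]
```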

Input argument

Regionfile

Report builders

Multi-family Mendelian analysis, Mendelian analysis, De novo/CHZ for trios, Sample QC and statistics

Format

A GOR segment file (containing chrom, bpstart, bpstop columns) that defines regions to be included or excluded (column names do not need to match these terms)

guides/images/regionFile.png

Regionfile

Input argument

rsIDfile

Report builders

Risk SNPs

Format

A grid (.rep) file containing a list of rsIDs in a column named “rsids”

guides/images/rsIDFile.png

rsIDfile
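Before building the rsIDfile, it can be useful to screen the values destined for the “rsids” column. The sketch below flags entries that do not follow the usual dbSNP form of a lowercase "rs" prefix followed by digits. Requiring a lowercase prefix is an assumption of this example, and the IDs shown are placeholders.

```python
import re

# dbSNP reference SNP IDs are written as "rs" followed by digits.
RSID_PATTERN = re.compile(r"rs\d+")


def invalid_rsids(values):
    """Return the values that are not well-formed rsIDs."""
    return [value for value in values if not RSID_PATTERN.fullmatch(value.strip())]


# Placeholder identifiers for illustration only.
print(invalid_rsids(["rs123", "rs0045", "123", "RS9"]))  # prints ['123', 'RS9']
```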

Input argument

rsIDgrid

Report builders

Risk SNPs

Format

An open grid containing a list of rsIDs in a column named “rsids”

guides/images/rsIDGrid.png

rsIDgrid

Input argument

Sample ID grid

Input names in many report builders

CASEs, CTRLs, subjects

Format

A one-column grid containing only PNs (the column can have any name)

guides/images/sampleIDGrid.png

Sample ID grid

Input argument

SubjectAnnotation

Report builders

Gene association, Gene analysis

Format

Adds columns to the output from an open grid that has a column of subject IDs and associated columns (e.g., subject phenotypes); the first column should contain PNs (subject/sample IDs) and should be named “PN”

guides/images/subjectAnnotation.png

SubjectAnnotation