
Visualization Scheme

Representation of Variation: Graph Tracks

Each individual chromosome, scaffold, or contig sequence (a graph path) is shown as a track in its own row. At the single-bp zoom level, an artificial pangenome sequence is generated and shown on top; it is the concatenation of all sequences in the graph and thus contains all variation in the samples.

Pantograph Visualization Scheme

Because sequence is represented as presence or absence, a single nucleotide polymorphism (SNP) occupies two consecutive columns: a filled column indicates that the sequence is present in the track, a blank column that it is absent.
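The two-column SNP layout can be illustrated with a minimal sketch. The column identifiers, sample names, and the `presence_matrix` helper below are hypothetical and only serve to show the idea; they are not part of Pantograph.

```python
def presence_matrix(pangenome_columns, paths):
    """Return one row of booleans per path: True where the path contains the column."""
    return {
        name: [col in visited for col in pangenome_columns]
        for name, visited in paths.items()
    }


# Hypothetical toy pangenome: a shared base, a SNP (A vs. G), and another shared base.
columns = ["c1:T", "c2:A", "c3:G", "c4:C"]
paths = {
    "sample1": {"c1:T", "c2:A", "c4:C"},  # carries allele A
    "sample2": {"c1:T", "c3:G", "c4:C"},  # carries allele G
}

matrix = presence_matrix(columns, paths)
for name, row in matrix.items():
    # '#' stands for a filled cell, '.' for a blank cell
    print(name, "".join("#" if present else "." for present in row))
```

Each sample fills exactly one of the two SNP columns (`c2:A` or `c3:G`) and leaves the other blank, which is the pattern described above.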

Different kinds of variation can be represented with colors:

  • duplicated regions are colored in shades of turquoise based on the number of traversals of the respective sequences in the graph
  • inverted regions are denoted with a small band in shades of red on top of each track; such regions must be read from right to left

Variation that cannot be represented in the linear matrix scheme is displayed by links. An arrow pointing upwards indicates that the path's sequence continues at the other end of the link, in the direction of the tiny 'landing arrow'. By right-clicking on a link column and selecting 'Follow link', the user can navigate to the other end of the link.

TIP

A legend explaining the track colors can be opened in a popup at any time by clicking the info icon in the top right corner.

Gene Annotations

Available gene annotations for individual graph paths are overlaid as blue bands at the bottom of each track (exons: dark blue, introns: light blue; reverse-oriented genes are shown in darker shades of blue than forward-oriented genes).

Zoom levels / Bin Widths

Pantograph supports pre-computed zoom levels known as "bin widths", where each bin represents a defined number of consecutive bases in the pangenome sequence. The tool displays the fraction of bases covered in each bin through tooltips over the track's cells. Cell colors visually represent the coverage value using shades of grey, ranging from white (coverage 0: no sequence in the bin is present in the track) to dark grey (coverage 1: all bases in the bin are present in the track).

For example, if a bin width of 100 is selected, each column in Pantograph represents 100 consecutive bases in the sorted pangenome sequence. If the coverage value (indicated as 'cov' in the tooltip) is 0.7, it means that 70 of the 100 bases in that column are covered in the selected path. Similarly, the 'inv' value indicates the fraction of the 100 bases that are inverted (traversed from right to left).

INFO

If all bases in a bin are inverted, the cov and inv values are identical.
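As a rough illustration of how the 'cov' and 'inv' values relate to a bin, the following sketch computes both per bin from per-base flags. The function name, the input representation, and the handling of a shorter final bin (divided by its actual length) are assumptions made for this example, not Pantograph's actual implementation.

```python
def bin_stats(present, inverted, bin_width):
    """Return a list of (cov, inv) tuples, one per bin, for a single path.

    present[i]:  True if pangenome base i is present in the path
    inverted[i]: True if the path traverses base i in reverse orientation
    """
    stats = []
    for start in range(0, len(present), bin_width):
        chunk_present = present[start:start + bin_width]
        chunk_inverted = inverted[start:start + bin_width]
        size = len(chunk_present)  # assumption: a shorter last bin uses its own length
        cov = sum(chunk_present) / size
        # only bases that are present in the path can count as inverted
        inv = sum(p and i for p, i in zip(chunk_present, chunk_inverted)) / size
        stats.append((cov, inv))
    return stats


# Bin 1: 70 of 100 bases present, none inverted. Bin 2: all bases present and inverted.
present = [True] * 70 + [False] * 30 + [True] * 100
inverted = [False] * 100 + [True] * 100
stats = bin_stats(present, inverted, bin_width=100)
print(stats)
```

The first bin corresponds to the 'cov' 0.7 example above; in the second bin every base is inverted, so 'cov' and 'inv' are equal.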

Zoom levels can be changed via the "Bin width" dropdown in the toolbar or via the right-click menu on any column.

Variant Tracks

Any variant data obtained from re-sequencing, targeted sequencing, or genotyping can be seamlessly integrated into Pantograph, provided the reference genome used is included in the pangenome graph. Variants for each sample are displayed as a single variant track, with variant sites highlighted in the corresponding columns of the reference genome track.

Cells are color-coded in shades of green to represent genotypes: dark green indicates homozygous alternative alleles, while light green signifies homozygous reference calls.

Hovering over a cell displays detailed variant information, including the location on the reference genome, the called genotype ('GT'), the reference ('Ref') and alternative alleles ('Alt'), as well as comma-separated read coverages for these alleles ('covs'). If more than five variants are present in the same bin, the total number of variants is shown instead.
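The tooltip rule described above can be sketched as follows. The `Variant` record, the `tooltip_text` function, and the exact text layout are hypothetical; only the field labels ('GT', 'Ref', 'Alt', 'covs') and the five-variant threshold come from the description.

```python
from dataclasses import dataclass

MAX_DETAILED = 5  # more than five variants in a bin: show only the count


@dataclass
class Variant:
    pos: int      # location on the reference genome
    gt: str       # called genotype
    ref: str      # reference allele
    alt: str      # alternative allele
    covs: tuple   # read coverages for the reference and alternative alleles


def tooltip_text(variants):
    """Return detailed lines for up to five variants, otherwise a summary count."""
    if len(variants) > MAX_DETAILED:
        return f"{len(variants)} variants"
    return "\n".join(
        f"pos={v.pos} GT={v.gt} Ref={v.ref} Alt={v.alt} "
        f"covs={','.join(str(c) for c in v.covs)}"
        for v in variants
    )


single = [Variant(pos=1200, gt="1/1", ref="A", alt="G", covs=(0, 14))]
many = [Variant(pos=1200 + i, gt="0/0", ref="A", alt="G", covs=(9, 0)) for i in range(6)]
print(tooltip_text(single))
print(tooltip_text(many))
```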

Variant tracks can be sorted by genotypes in a specific column by right-clicking the desired column and selecting "Sort variant tracks."
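Conceptually, sorting by a column groups samples with the same genotype at that column. The sketch below assumes VCF-style genotype strings, a sort order that places homozygous alternative calls first, and a hypothetical data layout (sample name mapped to a dictionary of column to genotype); the order Pantograph actually uses may differ.

```python
# Assumed ranking: homozygous alternative first, then homozygous reference.
GT_RANK = {"1/1": 0, "0/0": 1}


def sort_variant_tracks(tracks, column):
    """Return sample names ordered by their genotype in the given column.

    Samples without a call in that column are placed last. Python's sort is
    stable, so samples with equal genotypes keep their original order.
    """
    return sorted(
        tracks,
        key=lambda name: GT_RANK.get(tracks[name].get(column), len(GT_RANK)),
    )


# Hypothetical variant tracks keyed by sample name.
tracks = {
    "s1": {10: "0/0"},
    "s2": {10: "1/1"},
    "s3": {},            # no call in column 10
    "s4": {10: "1/1"},
}
order = sort_variant_tracks(tracks, column=10)
print(order)
```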