Update Data
Most data in Pantograph can be extended or updated by admin users:
- Upload metadata
- Upload variation tracks
- Upload expression data
Upload Metadata
Admin users can upload metadata.
New or updated metadata should be stored in comma- or tab-separated text files (CSV/TSV) and can then be uploaded via the 'Metadata management' tab on Pantograph's Pipeline page.
Requirements
The metadata file must have a header line naming the metadata categories that will be displayed in Pantograph, and the first column must be named genome_name. The metadata values for each track must be listed in the same order as the categories in the header; see the example metadata file below.
INFO
Both categorical and numerical metadata values are supported and will be automatically identified.
IMPORTANT
When a new file is uploaded, its metadata is added to the existing metadata in Pantograph. Each upload is assigned a unique group label, which makes the metadata easier to organize and allows all metadata belonging to a specific group/file to be deleted on its own. For the same reason, uploading a metadata category that already exists from a previously uploaded file will duplicate that category.
Example Metadata CSV File
genome_name,cultivation,yield_random
pathA,cultivated,0.146957
pathB,landrace,0.326076
vcfTrack1,cultivated,0.461396
vcfTrack2,,0.473928
INFO
Missing data is represented by an empty entry in the respective metadata column (e.g., cultivation for vcfTrack2 in the example above).
After the upload, the new data becomes available in Pantograph once the Pantograph website is reloaded.
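The requirements above can be checked locally before uploading. The following is a minimal sketch, not part of Pantograph: the function name check_metadata is hypothetical, and the check for an empty genome_name entry is an assumption that goes beyond the stated requirements.

```python
import csv
import io


def check_metadata(text, delimiter=","):
    """Return a list of problems found in metadata text (empty if none)."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return ["file is empty"]
    header, records = rows[0], rows[1:]
    problems = []
    # Stated requirement: the first column is named genome_name.
    if not header or header[0] != "genome_name":
        problems.append("first column must be named genome_name")
    for line_no, row in enumerate(records, start=2):
        # Every row needs one field per header category; missing values
        # are empty entries, so the field count stays the same.
        if len(row) != len(header):
            problems.append(
                f"line {line_no}: expected {len(header)} fields, found {len(row)}"
            )
        elif not row[0]:
            # Assumption: a track without a name cannot be matched.
            problems.append(f"line {line_no}: genome_name is empty")
    return problems


example = """genome_name,cultivation,yield_random
pathA,cultivated,0.146957
pathB,landrace,0.326076
vcfTrack1,cultivated,0.461396
vcfTrack2,,0.473928
"""

print(check_metadata(example))  # prints []
```

For a tab-separated file, the same function could be called with delimiter="\t".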
Upload Variation Tracks
Admin users can upload new variation tracks.
To upload new variation tracks, ensure the following requirements are met:
Samplesheet Format:
The samplesheet must be in CSV format (comma-separated).
The header line must follow this format: "group,reference,vcf".
The required samplesheet columns are:
- Column 1: Group name for the VCF files (user-specified; this name will show up in the track menu; e.g., "vcf_group_1")
- Column 2: Name of the reference genome (as specified in the graph; without the sequence suffix, e.g., use 'genomeA' and not 'genomeA_Chr01')
- Column 3: Absolute path of the uncompressed (.vcf) or bgzip-compressed (.vcf.gz) VCF file in the object storage bucket
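As an illustration of the samplesheet layout described above, the sketch below writes a samplesheet with Python's csv module. The function name build_samplesheet and the file paths are hypothetical; the exact form of an absolute path depends on how your object storage is set up.

```python
import csv
import io

# Header line required by the samplesheet format.
HEADER = ["group", "reference", "vcf"]


def build_samplesheet(entries):
    """Return samplesheet CSV text for (group, reference, vcf_path) tuples."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(HEADER)
    for group, reference, vcf_path in entries:
        # Only uncompressed (.vcf) or bgzip-compressed (.vcf.gz) files.
        if not vcf_path.endswith((".vcf", ".vcf.gz")):
            raise ValueError(f"not a .vcf or .vcf.gz file: {vcf_path}")
        writer.writerow([group, reference, vcf_path])
    return out.getvalue()


# Hypothetical group, reference, and paths, for illustration only.
sheet = build_samplesheet([
    ("vcf_group_1", "genomeA", "/variants/upload_01/set1.vcf.gz"),
    ("vcf_group_1", "genomeA", "/variants/upload_01/set2.vcf"),
])
print(sheet)
```

The first printed line is the required header, group,reference,vcf, followed by one line per VCF file.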
VCF Files
Place all VCF files listed in the samplesheet into a unique folder on the object storage.
Ensure the file paths specified in Column 3 of the samplesheet correctly point to their respective files in the object storage's folder.
Notes:
- Variant track names are derived from the SAMPLE columns in the header line of the VCF file.
- Variant track names must be unique across all variant groups (they may match graph track names, if desired).
- Uploading data from samples with names that already exist in the indicated variant group will overwrite the corresponding existing data.
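To see which track names an upload would create, the sample names can be read from each VCF header before uploading. The sketch below is a local helper, not part of Pantograph; both function names are hypothetical. It relies on the VCF convention that sample names follow the nine fixed columns of the #CHROM header line, and on bgzip output being readable with Python's gzip module.

```python
import gzip


def vcf_sample_names(path):
    """Return the sample names from the #CHROM header line of a VCF file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                # Nine fixed columns (CHROM through FORMAT) precede the samples.
                return line.rstrip("\r\n").split("\t")[9:]
            if not line.startswith("#"):
                break  # data lines reached without finding a header line
    raise ValueError(f"no #CHROM header line found in {path}")


def names_in_multiple_groups(group_to_paths):
    """Map each sample name found in more than one group to those groups."""
    groups_by_name = {}
    for group, paths in group_to_paths.items():
        for path in paths:
            for name in vcf_sample_names(path):
                groups_by_name.setdefault(name, set()).add(group)
    return {
        name: sorted(groups)
        for name, groups in groups_by_name.items()
        if len(groups) > 1
    }
```

An empty result from names_in_multiple_groups suggests that no name is shared between the listed groups. A name repeated within a single group is not reported here, since such an upload overwrites the existing data for that sample.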
After the upload, the new data appears as new variant track groups in the Tracks Menu once the Pantograph website is reloaded.