File I/O

Pedigree Files

Pydigree supports pedigree files specified in the following delimited format:

PEDID ID FATHER MOTHER SEX

Reading these is supported through pydigree.io.read_ped. Pedigree files are not required to be sorted, and IDs are not required to be numeric. These files may contain extra data following sex, which can be handled with a user-specified function.

VCF format

Variant Call Format files can contain large amounts of genotype data. Since the bulk of (human) genome variation is rare, naive input of VCF files can result in huge amounts of memory usage. Pydigree reads VCF files into the SparseAlleles container to avoid this. This functionality is found in pydigree.io.vcf. VCF files can contain large amounts of metadata, and parsing them is more intensive than other formats.

Pydigree does not store any metadata when reading VCF files, so writing them is not supported.