pydigree.genotypes package¶
Submodules¶
pydigree.genotypes.alleles module¶
-
class
pydigree.genotypes.alleles.Alleles¶ Bases:
numpy.ndarray, pydigree.genotypes.genoabc.AlleleContainer
A class for holding genotypes
-
copy_span(template, copy_start, copy_stop)¶ Copies a span of another AlleleContainer to this one
Parameters: - template (AlleleContainer) – Container to copy from
- copy_start (int) – start point for copy (inclusive)
- copy_stop (int) – end point for copy (exclusive)
Return type: None
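The inclusive start and exclusive stop follow the same half-open convention as Python slicing. A minimal standalone sketch of that convention, using plain lists rather than pydigree containers; the name copy_span_sketch is hypothetical and is not part of the library:

```python
def copy_span_sketch(target, template, copy_start, copy_stop):
    """Copy template[copy_start:copy_stop] into target in place.

    Illustration only: plain lists stand in for allele containers.
    """
    if len(target) != len(template):
        raise ValueError("containers must have the same number of markers")
    # Slice assignment covers copy_start (inclusive) up to copy_stop (exclusive)
    target[copy_start:copy_stop] = template[copy_start:copy_stop]


recipient = [0, 0, 0, 0, 0]
donor = [1, 1, 1, 1, 1]
copy_span_sketch(recipient, donor, 1, 3)
print(recipient)  # [0, 1, 1, 0, 0]
```

Markers 1 and 2 are overwritten; marker 3 is left alone because the stop index is exclusive.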
-
empty_like()¶ Returns an empty Alleles object like this one
-
missing¶ Returns a numpy array indicating which markers have missing data
Returns: missingness array
Return type: np.array
-
missingcode¶
-
nmark()¶ Return the number of markers represented by the Alleles object
Returns: number of markers
Return type: int
-
pydigree.genotypes.chromosometemplate module¶
Classes for storing population level genotype information
-
class
pydigree.genotypes.chromosometemplate.ChromosomeSet¶ Bases:
object
An object representing the full complement of variants in a population
-
add_chromosome(template)¶ Add a chromosome to the set.
Parameters: template (ChromosomeTemplate) – the chromosome to add
-
finalize()¶ Finalize each template in the set
-
frequency(chrom, variant)¶ Get the frequency of a variant
Parameters: - chrom (int) – index of chromosome
- variant (int) – index of marker on the chromosome
-
marker_label(chrom, variant)¶ Get the label of a variant
Parameters: - chrom (int) – index of chromosome
- variant (int) – index of marker on the chromosome
-
nchrom()¶ Returns the number of chromosomes in the set
-
nloci()¶ Returns the total number of variants in the set
-
physical_map(chrom, variant)¶ Get the physical position of a variant
Parameters: - chrom (int) – index of chromosome
- variant (int) – index of marker on the chromosome
-
select_random_loci(nloc)¶ Chooses nloc random sites throughout the set of chromosomes
Parameters: nloc (int) – number of loci to select
Return type: generator of locations
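One way to picture a generator of locations is as a stream of (chromosome index, marker index) pairs. The sketch below is a standalone illustration, not pydigree's implementation; the function name, the marker_counts argument, and the choice to sample uniformly without replacement are all assumptions:

```python
import random


def select_random_loci_sketch(marker_counts, nloc, rng=None):
    """Yield nloc (chromosome index, marker index) pairs chosen at random.

    marker_counts[i] is the number of markers on chromosome i.
    Assumption: loci are drawn uniformly over all sites, without replacement.
    """
    if rng is None:
        rng = random.Random()
    total = sum(marker_counts)
    if nloc > total:
        raise ValueError("cannot select more loci than exist")
    for flat in rng.sample(range(total), nloc):
        # Translate the flat site index into (chromosome, marker) coordinates
        for chrom, count in enumerate(marker_counts):
            if flat < count:
                yield chrom, flat
                break
            flat -= count


counts = [3, 5, 2]
loci = list(select_random_loci_sketch(counts, 4, rng=random.Random(0)))
print(loci)
```

Each yielded pair could then be passed as the chrom and variant arguments of lookups such as frequency or marker_label.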
-
-
class
pydigree.genotypes.chromosometemplate.ChromosomeTemplate(label=None)¶ Bases:
object
ChromosomeTemplate is a class that keeps track of marker frequencies and distances. It is not an actual chromosome with genotypes, which you would find under Individual.
Markers are currently diallelic and frequencies are given for minor alleles. The two allele frequencies at a marker must sum to 1, so the major allele frequency is f_major = 1 - f_minor.
linkageequilibrium_chromosome generates chromosomes by simulating all markers with complete independence (linkage equilibrium). This is not typically what you want: the result contains no LD, so it is unsuitable for association studies and similar analyses. linkageequilibrium_chromosome is used for ‘seed’ chromosomes when initializing a population pool or when simulating purely family-based studies for linkage analysis.
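Simulating under linkage equilibrium amounts to drawing each marker independently from its own allele frequency. A minimal standalone sketch of that idea follows. It does not call pydigree, the function name is hypothetical, and the allele coding (1 for major, 2 for minor) is an assumption made for illustration:

```python
import random


def linkage_equilibrium_sketch(minor_freqs, rng=None):
    """Draw one haplotype with every marker sampled independently.

    Assumption: alleles are coded 1 (major) and 2 (minor); the coding
    pydigree itself uses is not stated in this reference.
    """
    if rng is None:
        rng = random.Random()
    # Each marker is an independent Bernoulli draw on its minor allele frequency
    return [2 if rng.random() < f else 1 for f in minor_freqs]


freqs = [0.0, 0.5, 1.0]
hap = linkage_equilibrium_sketch(freqs, rng=random.Random(42))
major_freqs = [1 - f for f in freqs]  # f_major = 1 - f_minor
print(hap, major_freqs)
```

Because no draw depends on any other, neighbouring markers are uncorrelated, which is why such chromosomes carry no LD.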
-
add_genotype(frequency=None, map_position=None, label=None, bp=None, reference=None, alternates=None)¶ Adds a variant to the chromosome
-
closest_marker(position, map_type='physical')¶ Returns the index of the closest marker to a position
Parameters: - position – desired location
- map_type ('physical' or 'genetic') – distance metric to use
Returns: closest index
Return type: int
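A nearest-marker lookup over sorted positions can be done with a binary search. The sketch below is a standalone illustration rather than pydigree's implementation; the function name is hypothetical, and it assumes map_type only selects which sorted position list (base pairs or centimorgans) is searched, with ties going to the left-hand marker:

```python
from bisect import bisect_left


def closest_marker_sketch(positions, position):
    """Return the index of the entry in sorted `positions` nearest `position`."""
    if not positions:
        raise ValueError("no markers")
    i = bisect_left(positions, position)
    if i == 0:
        return 0
    if i == len(positions):
        return len(positions) - 1
    # Compare the neighbours on either side; ties go to the left marker
    left, right = positions[i - 1], positions[i]
    return i - 1 if position - left <= right - position else i


bp = [100, 250, 900, 1500]
print(closest_marker_sketch(bp, 300))   # 1
print(closest_marker_sketch(bp, 1200))  # 2 (equidistant, left marker wins)
```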
-
empty_chromosome(dtype=<class 'numpy.uint8'>, sparse=False, refcode=None)¶ Produces a completely empty chromosome associated with this template.
Parameters: - sparse (bool) – Should a SparseAlleles object be returned
- refcode (int8_t) – if sparse, what should the refcode be?
Returns: empty alleles container
-
finalize()¶ When no more major modifications (i.e. adding or removing sites) are expected, the ChromosomeTemplate can be reorganized into something more efficient: numpy arrays instead of lists, for example.
Return type: None
-
static
from_genomesimla(filename)¶ Reads positions and frequencies from a genomeSIMLA template file
Parameters: filename (string) – path to the template
Returns: Template containing the data from the file
Return type: ChromosomeTemplate
-
iterinfo()¶ Iterator over genotype labels, cM position, bp position, MAF
-
linkageequilibrium_chromosome(sparse=False)¶ Returns a randomly generated chromosome in linkage equilibrium
Parameters: sparse (bool) – Should the output be sparse
Returns: random chromosome
Return type: Alleles or SparseAlleles
-
linkageequilibrium_chromosomes(nchrom)¶ Returns a numpy array of many randomly generated chromosomes
-
nmark()¶ Returns the number of markers on the chromosome
Returns: marker count
Return type: int
-
outputlabel¶ The label used when writing to disk
-
set_frequency(position, frequency)¶ Manually change an allele’s frequency
Parameters: - position (int) – Index to change
- frequency (float) – new minor allele frequency
-
size()¶ Returns the size of the chromosome in centimorgans
Return type: float
-
pydigree.genotypes.genoabc module¶
pydigree.genotypes.labelledalleles module¶
-
class
pydigree.genotypes.labelledalleles.AncestralAllele(anc, hap)¶ Bases:
object
-
ancestor¶
-
haplotype¶
-
-
class
pydigree.genotypes.labelledalleles.InheritanceSpan(ancestor, chromosomeidx, haplotype, start, stop)¶ Bases:
object
-
ancestor¶
-
ancestral_allele¶
-
ancestral_chromosome¶
-
chromosomeidx¶
-
contains(index)¶ Returns True if the specified index falls within this span
-
haplotype¶
-
interval¶
-
start¶
-
stop¶
-
to_tuple()¶
-
-
class
pydigree.genotypes.labelledalleles.LabelledAlleles(spans=None, chromobj=None, nmark=None)¶ Bases:
pydigree.genotypes.genoabc.AlleleContainer
-
add_span(new_span)¶
-
copy_span(template, copy_start, copy_stop)¶
-
delabel()¶
-
dtype¶
-
empty_like()¶
-
static
founder_chromosome(ind, chromidx, hap, chromobj=None, nmark=None)¶
-
pydigree.genotypes.sparsealleles module¶
-
class
pydigree.genotypes.sparsealleles.SparseAlleles(data=None, refcode=0, size=None, template=None)¶ Bases:
pydigree.genotypes.genoabc.AlleleContainer
An object representing a set of haploid genotypes efficiently by storing allele differences from a reference. Useful for manipulating genotypes from sequence data (e.g. VCF files).
In the interest of conserving memory for sequencing data, all alleles must be represented by a signed 8-bit integer (i.e. between -128 and 127). Negative values are interpreted as missing.
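The storage idea can be illustrated with a toy container that keeps only the sites differing from the reference code. This is a standalone sketch, not the SparseAlleles implementation; the class name SparseSketch and its dictionary-based layout are assumptions chosen for clarity:

```python
class SparseSketch:
    """Toy sparse allele store: only non-reference sites are kept."""

    def __init__(self, alleles, refcode=0):
        for allele in alleles:
            if not -128 <= allele <= 127:
                raise ValueError("alleles must fit in a signed 8-bit integer")
        self.refcode = refcode
        self.size = len(alleles)
        # Map marker index -> allele, skipping sites equal to the reference
        self.nonref = {i: a for i, a in enumerate(alleles) if a != refcode}

    def __getitem__(self, index):
        if not 0 <= index < self.size:
            raise IndexError(index)
        # Sites that were not stored are implicitly the reference allele
        return self.nonref.get(index, self.refcode)

    def missing(self):
        """Boolean list: True where the allele is negative (missing)."""
        return [self[i] < 0 for i in range(self.size)]


chrom = SparseSketch([0, 0, 1, 0, -1, 0, 0, 2])
print(len(chrom.nonref))  # 3 of the 8 sites are stored
print(chrom.missing())
```

The memory saving grows with the share of reference alleles, which is typically large for sequence data.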
-
copy()¶ Creates a copy of the current data
Returns: cloned allele set
Return type: SparseAlleles
-
copy_span(template, copy_start, copy_stop)¶ Copies one segment of a chromosome over to the other
Parameters: - template (AlleleContainer) – the data to be copied from
- copy_start (int) – where to start copying (inclusive)
- copy_stop (int) – where to stop copying (exclusive)
Return type: None
-
dtype¶
-
static
empty(template)¶ Creates an empty SparseAlleles (everybody is wild-type)
Parameters: template (ChromosomeTemplate) – The chromosome info associated with this set of alleles
Returns: Empty container
Return type: SparseAlleles
-
empty_like()¶ Creates a blank SparseAlleles with same parameters
Returns: empty SparseAlleles
-
keys()¶
-
missing¶ Returns a numpy array indicating which markers have missing data
-
missingcode¶ Returns the code used for missing values
-
nmark()¶ Return the number of markers (both reference and non-reference) represented by the SparseAlleles object
Returns: marker count
Return type: int
-
refcode¶ Returns the sparse value in the container (the reference code assumed at positions that are not stored explicitly)
Return type: int8_t
-
todense()¶ Converts to a dense representation of the same genotypes (Alleles).
Returns: dense version
Return type: Alleles
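Converting to a dense representation means filling every position with the reference code and then overwriting the stored differences. A minimal standalone sketch of that step, assuming the differences are held as an index-to-allele mapping; the function name and that mapping are hypothetical and do not describe pydigree's internals:

```python
def todense_sketch(nonref, size, refcode=0):
    """Expand {index: allele} differences into a full list of alleles."""
    # Start from an all-reference chromosome, then apply the differences
    dense = [refcode] * size
    for index, allele in nonref.items():
        dense[index] = allele
    return dense


dense = todense_sketch({2: 1, 7: 2}, size=8)
print(dense)  # [0, 0, 1, 0, 0, 0, 0, 2]
```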
-
values()¶
-