pydigree.cydigree package¶
Submodules¶
pydigree.cydigree.cyfuncs module¶
-
class
pydigree.cydigree.cyfuncs.Segment¶ Bases:
object
-
marker_labels¶
-
nmark¶
-
physical_position¶
-
physical_start¶
-
physical_stop¶
-
start¶
-
stop¶
-
-
pydigree.cydigree.cyfuncs.all_same_type()¶ Quickly checks if all items in iterable are the same type
Parameters: - iter – sequence to be checked
- t – type desired
Returns: items are all same type
Return type: bool
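The check described above can be approximated in pure Python. This is a sketch of the documented behaviour, not the Cython implementation: the name `all_same_type_sketch` is hypothetical, and comparing exact types (rather than accepting subclasses) is an assumption.

```python
def all_same_type_sketch(items, t):
    """Return True if every item in `items` has exactly type `t`."""
    # Assumption: exact type match, so subclasses of `t` do not count.
    return all(type(item) is t for item in items)


print(all_same_type_sketch([1, 2, 3], int))
print(all_same_type_sketch([1, 2.0, 3], int))
```

Note that an empty iterable passes the check in this sketch, because `all()` of nothing is true; the real function may treat that case differently.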
-
pydigree.cydigree.cyfuncs.fastfirstitem()¶ Rapidly gets the first item from each element in an iterable of iterables
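A pure-Python equivalent of the behaviour described might look like the following. The name `first_items_sketch` is hypothetical, and the sketch assumes that each element supports indexing and that the result is a list; neither is stated in the documentation above.

```python
def first_items_sketch(iterable):
    """Collect element[0] for every element of `iterable`."""
    # Assumption: every element is indexable and non-empty.
    return [element[0] for element in iterable]


print(first_items_sketch([(1, "a"), (2, "b"), (3, "c")]))
```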
-
pydigree.cydigree.cyfuncs.ibs()¶ Returns how many alleles (0, 1, or 2) are identical-by-state between two diploid genotypes
Parameters: - g1 (tuple) – genotype 1
- g2 (tuple) – genotype 2
- missingval – value if either g1 or g2 is missing
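Identity-by-state counting for two diploid genotypes can be sketched as below. This illustrates the standard definition rather than the library's code: the name `ibs_sketch` is hypothetical, and representing a missing allele as `None` is an assumption made only for this example.

```python
def ibs_sketch(g1, g2, missingval=None):
    """Count alleles (0, 1, or 2) shared by state between two genotypes."""
    a, b = g1
    c, d = g2
    # Assumption: a genotype is missing if either of its alleles is None.
    if None in (a, b, c, d):
        return missingval
    # Both alleles match, in either order.
    if (a == c and b == d) or (a == d and b == c):
        return 2
    # At least one allele is shared.
    if a in (c, d) or b in (c, d):
        return 1
    return 0


print(ibs_sketch((1, 2), (2, 1)))
print(ibs_sketch((1, 1), (1, 2)))
print(ibs_sketch((1, 2), (3, 4)))
```

Allele order within a genotype does not matter here, so `(1, 2)` and `(2, 1)` share both alleles.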
-
pydigree.cydigree.cyfuncs.interleave()¶ Takes two lists and interleaves them. For example, interleave("AAA", "BBB") gives ["A", "B", "A", "B", "A", "B"]
Returns: interleaved iterables
Return type: list
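The documented example can be reproduced with a short pure-Python sketch. The name `interleave_sketch` is hypothetical, and stopping at the shorter input when the lengths differ is an assumption; the documentation above only shows equal-length inputs.

```python
def interleave_sketch(first, second):
    """Alternate items from `first` and `second` into one list."""
    out = []
    # Assumption: zip() truncates to the shorter input.
    for x, y in zip(first, second):
        out.extend((x, y))
    return out


print(interleave_sketch("AAA", "BBB"))
```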
-
pydigree.cydigree.cyfuncs.is_sorted()¶ Check if the sequence is sorted
Returns: sorted?
Return type: bool
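One way to express the check in pure Python is to compare each item with its successor. The name `is_sorted_sketch` is hypothetical, and treating "sorted" as ascending with ties allowed is an assumption; the documentation above does not specify the ordering.

```python
def is_sorted_sketch(sequence):
    """Return True if each item is <= the item that follows it."""
    items = list(sequence)
    # Assumption: ascending order, with equal neighbours allowed.
    return all(a <= b for a, b in zip(items, items[1:]))


print(is_sorted_sketch([1, 2, 2, 5]))
print(is_sorted_sketch([3, 1, 2]))
```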
-
pydigree.cydigree.cyfuncs.runs()¶ Identifies runs of values in a sequence for which predicate(value) evaluates True and yields 2-tuples of the start and end (inclusive) indices
Parameters: - sequence (iterable) – Sequence to run through
- predicate (callable) – function to call
- minlength (int) – shortest allowable run
Returns: Runs
Return type: list of tuples
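The run-finding behaviour described above, with inclusive end indices, can be sketched in pure Python as follows. The name `runs_sketch` is hypothetical and the code is an illustration, not the Cython implementation; it returns a list, matching the documented return type.

```python
def runs_sketch(sequence, predicate, minlength):
    """Find (start, end) index pairs of runs where predicate(value) is true."""
    found = []
    start = None  # index where the current run began, or None outside a run
    length = 0
    for i, value in enumerate(sequence):
        if predicate(value):
            if start is None:
                start = i
            length += 1
        else:
            # The run (if any) ended at the previous index.
            if start is not None and length >= minlength:
                found.append((start, i - 1))
            start, length = None, 0
    # Close a run that reaches the end of the sequence.
    if start is not None and length >= minlength:
        found.append((start, start + length - 1))
    return found


print(runs_sketch([1, 2, 2, 3, 2, 2, 2], lambda v: v == 2, 2))
```

Because `minlength` is the shortest allowable run, a run of exactly `minlength` items is kept.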
-
pydigree.cydigree.cyfuncs.runs_gte()¶ Identifies runs of values in an iterable where each value is greater than or equal to a value minval, and returns a list of 2-tuples with the start and end (inclusive) indices of the runs
Parameters: - sequence – iterable to search for runs
- minval – minimum value to occur in run
- minlength – minimum allowable runlength
Returns: runs
Return type: list of tuples
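The same idea with a fixed threshold can be written compactly with `itertools.groupby`. This is a pure-Python sketch of the documented behaviour, not the library's code, and the name `runs_gte_sketch` is hypothetical.

```python
from itertools import groupby


def runs_gte_sketch(sequence, minval, minlength):
    """Find (start, end) index pairs of runs where every value >= minval."""
    found = []
    index = 0  # position of the first item of the current group
    for in_run, group in groupby(sequence, key=lambda v: v >= minval):
        n = sum(1 for _ in group)
        if in_run and n >= minlength:
            found.append((index, index + n - 1))  # end index is inclusive
        index += n
    return found


print(runs_gte_sketch([0, 3, 4, 5, 1, 6, 7], 3, 2))
```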
-
pydigree.cydigree.cyfuncs.runs_gte_uint8()¶
-
pydigree.cydigree.cyfuncs.set_intervals_to_value()¶ Creates a numpy integer array and sets intervals to a single value
Parameters: - intervals (iterable of 2-tuples) – Intervals tuples in format (start_idx, stop_idx_inclusive)
- size (unsigned int) – outgoing array size
- value (np.int) – value to set intervals to
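The interval convention (inclusive stop index) is easy to get wrong, so a small sketch may help. To keep it dependency-free, a plain list stands in for the numpy integer array; the name `set_intervals_sketch` is hypothetical, and filling positions outside every interval with 0 is an assumption.

```python
def set_intervals_sketch(intervals, size, value):
    """Build a list of length `size` with `value` written over each interval."""
    out = [0] * size  # assumption: sites outside every interval stay 0
    for start, stop in intervals:
        # The stop index is inclusive, hence stop + 1.
        for i in range(start, stop + 1):
            out[i] = value
    return out


print(set_intervals_sketch([(1, 2), (4, 4)], 6, 7))
```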
pydigree.cydigree.datastructures module¶
-
class
pydigree.cydigree.datastructures.IntTree¶ Bases:
object
-
clear()¶ Removes all nodes from tree
-
delete()¶
-
delrange()¶ Deletes keys where start <= key < stop
-
empty()¶
-
find()¶
-
static
from_keys()¶
-
static
from_pairs()¶
-
get()¶
-
getrange()¶
-
insert()¶
-
intersection()¶
-
keys()¶
-
size()¶
-
to_stack()¶
-
traverse()¶
-
union()¶
-
values()¶
-
verify()¶
-
-
class
pydigree.cydigree.datastructures.NodeStack¶ Bases:
object
-
class
pydigree.cydigree.datastructures.SparseArray¶ Bases:
object
-
all()¶
-
any()¶
-
container¶
-
copy()¶
-
static
from_dense()¶
-
static
from_items()¶
-
static
from_numpy()¶
-
indices¶
-
items()¶
-
keys()¶
-
logical_not()¶
-
refcode¶
-
set_item()¶
-
size¶
-
sparsity()¶ Returns the proportion of sparse sites in the array
-
tolist()¶
-
values()¶
-
-
pydigree.cydigree.datastructures.print_sizes()¶
pydigree.cydigree.sparsearray module¶
-
class
pydigree.cydigree.sparsearray.SparseArray¶ Bases:
object
A data structure for working with sparse sets of small ints. Can support an array of size \(2^{32}-1\).
Dense values are stored in a self-balancing tree, so lookups, setting a dense value, or changing a dense value to sparse will have slower algorithmic performance (O(log n) instead of O(1)). The bookkeeping of the tree will also incur some penalties in memory use. For each non-sparse value, a uint32_t (4 bytes) is used for the key and an int8_t (1 byte) for the value.
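The per-value sizes quoted above allow a rough lower bound on memory use. The sketch below counts only the key and value payload; the tree's own bookkeeping is not quantified in this documentation and is left out. The function names are hypothetical.

```python
KEY_BYTES = 4    # uint32_t key, per the description above
VALUE_BYTES = 1  # int8_t value, per the description above


def payload_bytes(n_nonsparse):
    """Key/value payload for the non-sparse sites, excluding tree overhead."""
    return n_nonsparse * (KEY_BYTES + VALUE_BYTES)


def dense_bytes(size):
    """Storage for a plain one-byte-per-site array of the same length."""
    return size * VALUE_BYTES


# A 1,000,000-site array with 10,000 non-sparse sites.
print(payload_bytes(10_000), dense_bytes(1_000_000))
```

On payload alone, the sparse layout is smaller whenever fewer than one site in five is non-sparse; the real break-even point is lower once tree overhead is included.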
-
all()¶ Returns: are all values nondense?
Return type: bool
-
any()¶ Are there any non-sparse values?
-
clear()¶ Removes a non-sparse value
Parameters: k (uint32_t) – the key to remove
-
clear_range()¶ Removes all non-sparse values in a region
Parameters: - start (uint32_t) – start of the location (inclusive)
- stop (uint32_t) – the end of the region (exclusive)
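The half-open range convention (start inclusive, stop exclusive) can be illustrated with a dict standing in for the tree of non-sparse values. This is a sketch only; the name `clear_range_sketch` and the dict representation are assumptions made for the example.

```python
def clear_range_sketch(nonsparse, start, stop):
    """Delete every key k with start <= k < stop from the dict in place."""
    # Collect the keys first so the dict is not modified while iterating.
    for key in [k for k in nonsparse if start <= k < stop]:
        del nonsparse[key]


sites = {1: 1, 5: 2, 9: 3, 10: 4}
clear_range_sketch(sites, 5, 10)
print(sites)
```

The key at index 10 survives because the stop bound is exclusive.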
-
cmp_single()¶
-
copy()¶ Creates a copy of the array.
Returns: the copy
Return type: SparseArray
-
data¶
-
dense_cmp()¶
-
density()¶ Proportion of non-sparse sites
Returns: Percent non-sparse
Return type: float
-
eq_single()¶
-
static
from_dense()¶ Creates a SparseArray from a dense sequence
Returns: resulting array
Return type: SparseArray
-
static
from_items()¶ Creates a SparseArray from pairs of items
Parameters: - seq – A sequence of pairs of type (uint32_t, int8_t)
- size (uint32_t) – the size of the array
- refcode (int8_t) – the sparse value of the array
Returns: the resulting array
Return type:
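How (index, value) pairs, a size, and a sparse value combine into an array can be shown with a toy stand-in. `SparseSketch` is hypothetical and uses a dict where the real class uses a tree; the default `refcode` of 0 and the choice to skip pairs whose value equals `refcode` are assumptions.

```python
class SparseSketch:
    """Toy stand-in: a dict of non-sparse sites plus one sparse value."""

    def __init__(self, size, refcode=0):
        self.size = size
        self.refcode = refcode
        self.data = {}  # index -> non-sparse value

    @staticmethod
    def from_items(seq, size, refcode=0):
        arr = SparseSketch(size, refcode)
        for index, value in seq:
            if not 0 <= index < size:
                raise IndexError(index)
            # Assumption: pairs holding the sparse value are not stored.
            if value != refcode:
                arr.data[index] = value
        return arr

    def items(self):
        """Non-sparse (index, value) pairs in index order."""
        return sorted(self.data.items())

    def tolist(self):
        """Dense list with refcode at every unset index."""
        return [self.data.get(i, self.refcode) for i in range(self.size)]


arr = SparseSketch.from_items([(4, 2), (1, 1)], size=6, refcode=0)
print(arr.items())
print(arr.tolist())
```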
-
get_item()¶
-
get_slice()¶
-
items()¶ Gets the non-sparse indices and their values
Returns: non-sparse locations and values
Return type: list of (uint32_t, int8_t) tuples
-
keys()¶ Gets the non-sparse locations
Returns: locations of the non-sparse values
Return type: list
-
logical_not()¶ Performs a logical not on the entire array
Returns: the not-ed array
-
ndense()¶ The number of non-sparse sites in the array
Returns: number of non-sparse items
Return type: int
-
ref¶
-
set_item()¶
-
size¶
-
sparse_cmp()¶
-
sparse_eq()¶
-
sparsity()¶ Proportion of array that is sparse
Returns: Percent sparse
Return type: float
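The relationship between density() and sparsity() can be sketched from the count of non-sparse sites and the array size. The function names are hypothetical, and the sketch assumes both values are fractions between 0 and 1 that sum to 1; the documentation uses both "proportion" and "percent", so the real scale should be checked against the library.

```python
def density_sketch(ndense, size):
    """Fraction of sites that hold a non-sparse value."""
    return ndense / size


def sparsity_sketch(ndense, size):
    """Fraction of sites that hold the sparse value."""
    # Assumption: sparsity is the complement of density.
    return 1.0 - density_sketch(ndense, size)


print(density_sketch(2, 8), sparsity_sketch(2, 8))
```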
-
tolist()¶ Returns the SparseArray in a dense format
Return type: list in Python, C++ std::vector<uint8_t>
-
values()¶ Gets the non-sparse values
Returns: non-sparse values, in order
Return type: list
-
pydigree.cydigree.varianttree module¶
pydigree.cydigree.vcfparse module¶
-
pydigree.cydigree.vcfparse.assign_genorow()¶
-
pydigree.cydigree.vcfparse.vcf_allele_parser()¶