Simulations
Extremely naive simulation functions for generating genotype data, intended to illustrate other features of the anhima package.
- anhima.sim.simulate_biallelic_genotypes(n_variants, n_samples, af_dist, p_missing=0.1, ploidy=2)
Simulate genotypes at biallelic variants for a population in Hardy-Weinberg equilibrium.
Parameters: n_variants : int
The number of variants.
n_samples : int
The number of samples.
af_dist : frozen continuous random variable
The distribution of allele frequencies.
p_missing : float, optional
The fraction of missing genotype calls.
ploidy : int, optional
The sample ploidy.
Returns: genotypes : ndarray, int8
An array of shape (n_variants, n_samples, ploidy) where each element of the array is an integer corresponding to an allele index (-1 = missing, 0 = reference allele, 1 = alternate allele).
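The sampling scheme described above can be sketched with the standard library alone. This is a minimal illustration of drawing genotypes under Hardy-Weinberg equilibrium, not anhima's actual implementation. The helper name `sketch_biallelic_genotypes` and its `seed` argument are hypothetical, `draw_af` is a plain callable standing in for the frozen `af_dist` distribution, and the result is a nested list rather than an int8 ndarray. It also assumes that missingness is applied per genotype call, which the documentation above does not spell out.

```python
import random


def sketch_biallelic_genotypes(n_variants, n_samples, draw_af,
                               p_missing=0.1, ploidy=2, seed=None):
    """Hypothetical sketch; genotypes[variant][sample] is a list of allele indices."""
    rng = random.Random(seed)
    genotypes = []
    for _ in range(n_variants):
        # Alternate allele frequency for this variant.
        af = draw_af(rng)
        calls = []
        for _ in range(n_samples):
            if rng.random() < p_missing:
                # Assumption: a missing call sets every allele to -1.
                calls.append([-1] * ploidy)
            else:
                # Independent allele draws give Hardy-Weinberg proportions.
                calls.append([1 if rng.random() < af else 0
                              for _ in range(ploidy)])
        genotypes.append(calls)
    return genotypes


# Allele frequencies drawn from Beta(2, 2), an arbitrary choice for illustration.
g = sketch_biallelic_genotypes(5, 4, lambda rng: rng.betavariate(2, 2), seed=42)
```

Here `g` has the same logical shape as the documented return value, (n_variants, n_samples, ploidy), with the same allele coding (-1 = missing, 0 = reference, 1 = alternate).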
- anhima.sim.simulate_genotypes_with_ld(n_variants, n_samples, correlation=0.2)
A very simple function to simulate a set of genotypes, where variants are in some degree of linkage disequilibrium with their neighbours.
Parameters: n_variants : int
The number of variants to simulate data for.
n_samples : int
The number of individuals to simulate data for.
correlation : float, optional
The fraction of samples for which genotypes are copied between neighbouring variants.
Returns: gn : ndarray, int8
A 2-dimensional array of shape (n_variants, n_samples) where each element is a genotype call coded as a single integer counting the number of non-reference alleles.
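One way to read the `correlation` parameter is sketched below using only the standard library. This is an assumption-laden illustration, not the library's code: the helper name `sketch_genotypes_with_ld` and its `seed` argument are hypothetical, the starting genotypes are drawn uniformly from 0, 1 and 2, and the output is a nested list rather than an int8 ndarray.

```python
import random


def sketch_genotypes_with_ld(n_variants, n_samples, correlation=0.2, seed=None):
    """Hypothetical sketch; gn[variant][sample] counts non-reference alleles."""
    rng = random.Random(seed)
    # Start from independent random genotype calls (0, 1 or 2 alternate alleles).
    gn = [[rng.randint(0, 2) for _ in range(n_samples)]
          for _ in range(n_variants)]
    # Number of samples whose calls are copied from the previous variant.
    n_copy = int(n_samples * correlation)
    for i in range(1, n_variants):
        # Copying calls for a random subset of samples makes each variant
        # resemble its predecessor, which induces linkage disequilibrium.
        for j in rng.sample(range(n_samples), n_copy):
            gn[i][j] = gn[i - 1][j]
    return gn


gn = sketch_genotypes_with_ld(n_variants=10, n_samples=20, correlation=0.5, seed=1)
```

With `correlation=0.5`, at least half of the samples share a genotype call between any two adjacent variants; the remainder agree only by chance.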
Simulate relatedness by randomly copying genotypes between individuals.
Parameters: genotypes : array_like
An array of shape (n_variants, n_samples, ploidy) where each element of the array is an integer corresponding to an allele index (-1 = missing, 0 = reference allele, 1 = first alternate allele, 2 = second alternate allele, etc.).
relatedness : float, optional
Fraction of variants to copy genotypes for.
n_iter : int, optional
Number of times to randomly copy genotypes between individuals.
copy : bool, optional
If False, modify genotypes in place.
Returns: genotypes : ndarray, shape (n_variants, n_samples, ploidy)
The input genotype array but with relatedness simulated.
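The copying procedure described by these parameters can be sketched as follows. This is a rough standard-library illustration under stated assumptions, not anhima's implementation: the helper name `sketch_relatedness` and its `seed` argument are hypothetical, the default values for `relatedness` and `n_iter` are arbitrary placeholders (the library's defaults are not given above), and each iteration is assumed to copy calls from one randomly chosen individual to another.

```python
import random
from copy import deepcopy


def sketch_relatedness(genotypes, relatedness=0.5, n_iter=10, copy=True, seed=None):
    """Hypothetical sketch operating on nested lists [variant][sample][allele]."""
    rng = random.Random(seed)
    if copy:
        # Work on a copy so the caller's data is left untouched.
        genotypes = deepcopy(genotypes)
    n_variants = len(genotypes)
    n_samples = len(genotypes[0])
    # Number of variants whose calls are copied in each iteration.
    n_copy = int(relatedness * n_variants)
    for _ in range(n_iter):
        # Assumption: pick a donor and a distinct recipient at random.
        donor, recipient = rng.sample(range(n_samples), 2)
        for v in rng.sample(range(n_variants), n_copy):
            genotypes[v][recipient] = list(genotypes[v][donor])
    return genotypes


original = [[[0, 0], [0, 1], [1, 1]] for _ in range(6)]
related = sketch_relatedness(original, relatedness=0.5, n_iter=3, seed=0)
```

Because `copy=True` here, `original` is unchanged and `related` is a new nested list of the same shape. Passing `copy=False` would modify the input in place and return the same object.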