% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/simulatePedigree.R
\name{simulatePedigree}
\alias{simulatePedigree}
\alias{SimPed}
\title{Simulate Pedigrees}
\usage{
simulatePedigree(
  kpc = 3,
  Ngen = 4,
  sexR = 0.5,
  marR = 2/3,
  rd_kpc = FALSE,
  balancedSex = TRUE,
  balancedMar = TRUE,
  verbose = FALSE,
  personID = "ID",
  momID = "momID",
  dadID = "dadID",
  spouseID = "spouseID",
  code_male = "M",
  code_female = "F",
  fam_shift = 1L,
  remap_ids = FALSE,
  beta = FALSE
)

SimPed(...)
}
\arguments{
\item{kpc}{Number of kids per couple. An integer >= 2 that determines how
many kids each mated (fertilized) couple has in the pedigree. Default
value is 3. An error is raised when \code{kpc} equals 1.}
\item{Ngen}{Number of generations. An integer >= 2 that determines how many
generations the simulated pedigree will have. The first generation is always
a fertilized couple. The last generation has no mated individuals.}
\item{sexR}{Sex ratio of offspring. A numeric value ranging from 0 to 1 that
determines the proportion of males in all offspring in this pedigree. For
instance, 0.4 means 40 percent of the offspring will be male.}
\item{marR}{Mating rate. A numeric value ranging from 0 to 1 that determines
the proportion of mated (fertilized) couples in the pedigree within each
generation. For instance, \code{marR = 0.5} means that 50 percent of the
offspring in a given generation will be mated and have offspring of their own.}
\item{rd_kpc}{logical. If TRUE, the number of kids per couple is drawn at
random from a Poisson distribution with mean \code{kpc}. If FALSE, the number
of kids per couple is fixed at \code{kpc}.}
\item{balancedSex}{Not fully developed yet. Always \code{TRUE} in the
current version.}
\item{balancedMar}{Not fully developed yet. Always \code{TRUE} in the
current version.}
\item{verbose}{logical. If TRUE, print progress messages for each stage of the algorithm.}
\item{personID}{character. Name of the column in ped for the person ID variable.}
\item{momID}{character. Name of the column in ped for the mother ID variable.}
\item{dadID}{character. Name of the column in ped for the father ID variable.}
\item{spouseID}{character. Name of the column that will contain the spouse ID in the output data frame. Default is "spouseID".}
\item{code_male}{The value to use for males. Default is "M".}
\item{code_female}{The value to use for females. Default is "F".}
\item{fam_shift}{An integer to shift the person ID. Default is 1L.
This is useful when simulating multiple pedigrees to avoid ID conflicts.}
\item{remap_ids}{logical. If TRUE, remap all ID columns to sequential integers (1, 2, 3, ...) in row order.}
\item{beta}{logical or character. Controls which algorithm version to use:
\itemize{
\item{\code{FALSE}, \code{"base"}, or \code{"original"} (default): Use the original algorithm.
Slower, but ensures exact reproducibility with \code{set.seed()}.}
\item{\code{TRUE} or \code{"optimized"}: Use the optimized algorithm, which is 4-5x faster.
Its results are statistically equivalent to those of the base version but not identical,
because it consumes random numbers differently. Recommended for large simulations
where speed matters more than exact reproducibility.}
}
Note: Both versions are mathematically correct and produce valid pedigrees with the
same statistical properties (sex ratios, mating rates, etc.). The optimized version
uses vectorized operations instead of loops, making it much faster for large pedigrees.}
\item{...}{Additional arguments to be passed to other functions.}
}
\value{
A \code{data.frame} with each row representing a simulated individual. The columns are as follows:
\itemize{
\item{fam: The family id of each simulated individual. It is 'fam1' in a single simulated pedigree.}
\item{ID: The unique personal ID of each simulated individual. The first digit is the family id; the fourth digit is the generation the individual is in; the following digits give the order of the individual within that generation. For example, 100411 indicates that this individual has a family id of 1, is in the 4th generation, and is the 11th individual in the 4th generation.}
\item{gen: The generation the simulated individual is in.}
\item{dadID: Personal ID of the individual's father.}
\item{momID: Personal ID of the individual's mother.}
\item{spID: Personal ID of the individual's mate.}
\item{sex: Biological sex of the individual. F - female; M - male.}
}
}
\description{
Simulates "balanced" pedigrees from four parameters:
\enumerate{
\item \code{kpc} (k): kids per couple;
\item \code{Ngen} (G): number of generations;
\item \code{sexR} (p): proportion of males in offspring;
\item \code{marR} (r): mating rate.
}
}
\examples{
set.seed(5)
df_ped <- simulatePedigree(
  kpc = 4,
  Ngen = 4,
  sexR = .5,
  marR = .7
)
summary(df_ped)
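
# A further sketch that uses only the arguments documented above.
# It assumes fam_shift offsets the generated IDs as described, so a
# second pedigree can be simulated without reusing the IDs in df_ped.
set.seed(6)
df_ped2 <- simulatePedigree(
  kpc = 4,
  Ngen = 4,
  sexR = .5,
  marR = .7,
  fam_shift = 2L
)
# IDs shared by the two pedigrees; none are expected if the shift
# behaves as documented.
intersect(df_ped$ID, df_ped2$ID)

# Individuals per generation and the realized proportion of males,
# using the default male code M.
table(df_ped$gen)
mean(df_ped$sex == "M")

# Family sizes drawn from a Poisson distribution with mean kpc.
set.seed(5)
df_rand <- simulatePedigree(kpc = 4, Ngen = 4, rd_kpc = TRUE)
nrow(df_rand)

# The optimized algorithm; with the same seed its output is not
# expected to match the base version row for row.
set.seed(5)
df_fast <- simulatePedigree(kpc = 4, Ngen = 4, beta = TRUE)
nrow(df_fast)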
}