Commit f0d3ab3

ce
1 parent 818aca5 commit f0d3ab3

4 files changed

Lines changed: 39 additions & 11 deletions

NEWS.md

Lines changed: 2 additions & 0 deletions
@@ -4,6 +4,8 @@
 
 * Optimized sliceFamilies to be more abstract
 * Created `.require_openmx()` to make it easier to use OpenMx functions without making OpenMx a dependency
+* Smarter string ID handling for ped2id
+
 
 # BGmisc 1.7.0.0
 * Fixed bug in parList
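
The "Smarter string ID handling" entry refers to the check in `.ped2id()` that decides whether IDs can safely be stored as numbers. A minimal base-R sketch of that kind of check follows. Both `ids_survive_numeric()` and `choose_id_column()` are hypothetical helpers written for illustration, not BGmisc functions, and the sketch tests for `NA` directly instead of comparing counts of unique values as the commit does.

```r
# Hypothetical helper (not part of BGmisc): TRUE when every ID converts to a
# number and no two distinct IDs collapse onto the same numeric value.
ids_survive_numeric <- function(ids) {
  converted <- suppressWarnings(as.numeric(ids))
  !anyNA(converted) && length(unique(converted)) == length(unique(ids))
}

# Hypothetical helper: keep IDs as strings unless numeric conversion is lossless.
choose_id_column <- function(ids) {
  if (ids_survive_numeric(ids)) as.numeric(ids) else as.character(ids)
}

choose_id_column(c("1", "2", "3")) # converted to numeric
choose_id_column(c("1", "2", "a")) # kept as character: "a" would become NA
choose_id_column(c("1", "01"))     # kept as character: both would become 1
```

With `c("1", "2", "a")`, `unique(as.numeric(ids))` has three elements, the same count as the IDs themselves, so a length comparison alone would not flag that input. That is why this sketch checks `anyNA()` as well.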

R/segmentPedigree.R

Lines changed: 27 additions & 4 deletions
@@ -11,6 +11,7 @@
 #' @param dadID character. Name of the column in ped for the father ID variable
 #' @param famID character. Name of the column to be created in ped for the family ID variable
 #' @param twinID character. Name of the column in ped for the twin ID variable, if applicable
+#' @param overwrite logical. If TRUE, will overwrite existing famID variable if it exists. Default is TRUE.
 #' @param ... additional arguments to be passed to \code{\link{ped2com}}
 #' @details
 #' The general idea of this function is to use person ID, mother ID, and father ID to
@@ -31,18 +32,20 @@
 ped2fam <- function(ped, personID = "ID",
                     momID = "momID", dadID = "dadID", famID = "famID",
                     twinID = "twinID",
+                    overwrite = TRUE,
                     ...) {
   # Call to wrapper function
   .ped2id(
     ped = ped, personID = personID, momID = momID, dadID = dadID, famID = famID, twinID = twinID,
-    type = "parents"
+    type = "parents",
+    overwrite = overwrite
   )
 }
 
 .ped2id <- function(ped,
                     personID = "ID", momID = "momID", dadID = "dadID",
                     famID = "famID", twinID = "twinID",
-                    type,
+                    type, overwrite = TRUE,
                     ...) {
   # Turn pedigree into family
   pg <- ped2graph(
@@ -55,23 +58,43 @@ ped2fam <- function(ped, personID = "ID",
 
   # Create famID data.frame
   # Convert IDs to numeric, with warning if coercion collapses IDs
+
   uniques <- suppressWarnings(unique(as.numeric(names(wcc$membership))))
+  keep_string <- FALSE
 
   if (length(uniques) == 1L && is.na(uniques)) {
     warning("After converting IDs to numeric, all IDs became NA. This indicates ID coercion collapsed IDs. Please ensure IDs aren't character or factor variables.")
-
+    keep_string <- TRUE
+  } else if (length(uniques) < length(wcc$membership)) {
+    warning("After converting IDs to numeric, some IDs became NA. This indicates ID coercion collapsed some IDs. Please ensure IDs aren't character or factor variables.")
+    keep_string <- TRUE
+  }
+  if(keep_string==TRUE) {
     fam <- data.frame(
       V1 = names(wcc$membership),
       V2 = wcc$membership
     )
-  } else {
+  } else {
     fam <- data.frame(
       V1 = as.numeric(names(wcc$membership)),
       V2 = wcc$membership
     )
   }
 
   names(fam) <- c(personID, famID)
+
+  if(famID %in% names(ped)) {
+    if(overwrite) {
+      overwrite_message <- "be overwritten."
+      ped[[famID]] <- NULL
+    } else {
+      overwrite_message <- "not be overwritten."
+    }
+
+    warning(sprintf("The famID variable '%s' already exists in the pedigree. The existing variable will %s", famID, overwrite_message))
+
+  }
+
   ped2 <- merge(fam, ped,
     by = personID, all.x = FALSE, all.y = TRUE
   )
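
To see why the existing `famID` column is dropped before the merge when `overwrite = TRUE`, here is a minimal base-R sketch with made-up data. The data frames `ped_old`, `fam_new`, and `ped_dropped` are illustrative assumptions; only `merge()` and its default column-suffix behavior come from base R.

```r
# Made-up pedigree that already carries a famID column (illustrative only)
ped_old <- data.frame(ID = 1:3, famID = c(9, 9, 9))
# Newly computed family IDs, shaped like the `fam` data frame in .ped2id()
fam_new <- data.frame(ID = 1:3, famID = c(1, 1, 2))

# Without dropping the old column, merge() keeps both and suffixes them
kept <- merge(fam_new, ped_old, by = "ID", all.x = FALSE, all.y = TRUE)
names(kept)
# "ID" "famID.x" "famID.y"

# Dropping the old column first leaves a single famID column
ped_dropped <- ped_old
ped_dropped[["famID"]] <- NULL
replaced <- merge(fam_new, ped_dropped, by = "ID", all.x = FALSE, all.y = TRUE)
names(replaced)
# "ID" "famID"
```

With `overwrite = FALSE` the hunk above leaves the old column in place, so the merged result would carry both suffixed columns unless later code, not shown in this diff, handles them.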

man/ped2fam.Rd

Lines changed: 3 additions & 0 deletions
Generated file; diff not rendered by default.

vignettes/articles/tutorialmanuscript.Xmd

Lines changed: 7 additions & 7 deletions
@@ -7,7 +7,7 @@ author:
     corresponding: true
     email: "garrissm@wfu.edu"
 abstract: |
-  Twin studies remain the dominant design in behavior genetics, yet most twin half-siblings, cousins, and multi-generational relatives whose distinct kinship coefficients jointly identify a richer set of variance components than any MZ/DZ comparison alone. We demonstrate how to fit extended pedigree models using the BGmisc package and OpenMx.
+  Twin studies remain the dominant design in behavior genetics, yet most twin half-siblings, cousins, and multi-generational relatives whose distinct kinship coefficients jointly identify a richer set of variance components than any MZ/DZ comparison alone. We demonstrate how to fit extended pedigree models using the BGmisc package and OpenMx.
   We apply the extended pedigree model to mutiple datasets
   of Youth (a large human panel study with researcher-linked kinship), the Kluane Red Squirrel Project
   (a multi-generational animal field study), and a children-of-twins dataset.
@@ -31,8 +31,8 @@ vignette: >
   %\VignetteIndexEntry{Extended Family Modeling with BGmisc}
   %\VignetteEncoding{UTF-8}
   %\VignetteEngine{knitr::rmarkdown}
-editor_options:
-  markdown:
+editor_options:
+  markdown:
     wrap: 100
 ---
 
@@ -65,7 +65,7 @@ studies, either intentionally (e.g., twin registries that also include siblings
 byproduct of large panel studies (e.g., the National Longitudinal Survey of Youth, which includes
 researcher-linked kinship). In most cases, the additional relatives are excluded from analysis, and
 the twin design is applied to a subset of the data, even though these relatives carry independent
-information about the genetic and environmental architecture of the phenotype.
+information about the genetic and environmental architecture of the phenotype. For example, many of the twin registries reviewed in FOO, include triplets, sibles, children, parents. https://helda.helsinki.fi/server/api/core/bitstreams/f0b6dc08-69df-449e-a8fe-e2c78abf7f60/content
 
 The extended pedigree model, which we have introduced elsewhere (see ETC), leverages the full range
 of kinship coefficients in a pedigree to identify a richer set of variance components than the
@@ -87,15 +87,15 @@ value provides independent leverage for disentangling genetic from environmental
 the number of distinct kinship types increases, so does the number of identifiable variance
 components.
 
-Extended pedigree designs have been used in behavior genetics since at least the 1970s [@eaves1978; @fulker_multiple_1988], but they have remained a minority practice. Partially over concerns about model identification and power (Wilson, 1982, 1989), the complexity of fitting these models, and the relative costs of collecting twin data compared to extended family data.
+Extended pedigree designs have been used in behavior genetics since at least the 1970s [@eaves1978; @fulker_multiple_1988], but they have remained a minority practice. Partially over concerns about model identification and power (Wilson, 1982, 1989), the complexity of fitting these models, and the relative costs of collecting twin data compared to extended family data.
 
-<! -- https://onlinelibrary.wiley.com/doi/10.1002/bimj.4710310511
+<! -- https://onlinelibrary.wiley.com/doi/10.1002/bimj.4710310511 -->
 
 but also because the twin design has been so successful and widely adopted. The twin design is often seen as the "gold standard" in behavior genetics, and many researchers may be hesitant to deviate from this established approach. Additionally, many human datasets simply do not include the necessary family structure to fit extended pedigree models, which may limit their applicability in certain contexts.
 
 <the reasosn are numerous for why this is the case, but a key factor is that many human datasets simply do not include the necessary family structure to fit these models. And the twin design is often the default analytic approach, even when more complex family data are available.
 
-Deriving
+
 
 In contrast, similar
 models are common in plant and animal breeding, where pedigree data is more routinely collected and
