Commit 5b00852

Update README
1 parent 7fbef12 commit 5b00852

5 files changed

Lines changed: 51 additions & 40 deletions

README.Rmd

Lines changed: 13 additions & 12 deletions
@@ -20,7 +20,7 @@ knitr::opts_chunk$set(
 # childdevdata

 <!-- badges: start -->
-[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
+[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)
 <!-- badges: end -->

 The goal of `childdevdata` is to support innovation in child development. The package
@@ -30,8 +30,8 @@ The goal of `childdevdata` is to support innovation in child development. The pa
 3. Supports multiple measurement instruments;
 4. Eases joint analyses of the data.

-The current version bundles milestone data from eight studies, x children and x
-measurement made on z instruments.
+The current version bundles milestone data from ten studies, containing 1,116,061 assessments
+made on 10831 unique children during 28465 visits, covering 21 different instruments.

 ## Installation

@@ -48,7 +48,7 @@ remotes::install_github(repo = "d-score/childdevdata")
 
 The following example visualises how the proportion of toddlers that are able to walk increases with age.

-```{r example}
+```{r example, fig.retina=2}
 library(childdevdata)
 library(ggplot2)
@@ -94,20 +94,20 @@ The first seven columns are administrative and background variables. Column numb
 
 ## Combining data

-Concatenating two or more data is straightforward using `dplyr`. The following code concatenates all datasets.
+Concatenating two or more data is straightforward using `dplyr`. The following code concatenates all avialable GCDG datasets.

 ```{r concatenate}
 library(dplyr)
-alldata <- bind_rows(gcdg_col_lt42m, gcdg_col_lt45m, gcdg_ecu, gcdg_jam_lbw, gcdg_jam_stunted, gcdg_mdg, gcdg_nld_smocc, gcdg_zaf)
+alldata <- bind_rows(gcdg_chl_1, gcdg_chn, gcdg_col_lt42m, gcdg_col_lt45m, gcdg_ecu, gcdg_jam_lbw, gcdg_jam_stunted, gcdg_mdg, gcdg_nld_smocc, gcdg_zaf)
 dim(alldata)
 ```

 Both the number of rows and the number of columns have increased. Milestones not appearing in a particular data obtain all missing (`NA`) scores.

-The number of records per cohort is
+The number of records per cohort by sex is

 ```{r}
-table(alldata$cohort)
+table(alldata$cohort, alldata$sex)
 ```

 ## Calculating D-score and DAZ
@@ -124,19 +124,20 @@ dim(d)
 
 We visualise the D-score distribution by age per cohort as

-```{r}
+```{r fig.retina=2}
 alldata <- bind_cols(alldata, d)
 ggplot(alldata, aes(age, d, group = cohort)) +
 geom_point(cex = 0.3) +
 facet_wrap(~ cohort) +
+ylab("D-score") + xlab("Age (years)") +
 theme_bw()
 ```

 ## Why this package?

 We all want our children to grow and prosper. While there is no shortage of apps and instruments to track child development, it is often unclear which data went into the construction of these tools. In order to improve measurement and norm setting of child development, we need child-level response data per milestone and age. However, no such public dataset seem to exist. The `childdevdata` package fills that void.

-The package grew out of a project in which we collected milestone data from 16 cohorts. See @weber2019 and <http://d-score.org/dbook2/> for results. Eight of the cohort owners graciously decided to make their data available for third parties. We are grateful to them.
+The package grew out of a project in which we collected milestone data from 16 cohorts. See @weber2019 and <http://d-score.org/dbook2/> for results. Ten cohort owners graciously decided to make their data available for third parties. We are grateful to them.

 ## How to use the data?

@@ -154,9 +155,9 @@ The citation of the `childevdata` package is
 title = {Child Development Data},
 author = {Stef {van Buuren} and Iris Eekhout and Marta Rubio Codina and Orazio Attanasio and Costas Meghir and
 Emla Fitzsimons and Sally Grantham-McGregor and Maria Caridad Araujo and Susan Walker and Susan Chang and
-Christine Powell and Ann Weber and Lia Fernald and Paul Verkerk and Linda Richter},
+Christine Powell and Ann Weber and Lia Fernald and Paul Verkerk and Linda Richter and Betsy Lozoff},
 year = {2021},
-note = {R package version 0.1.0},
+note = {R package version 1.0.0},
 url = {https://github.com/d-score/childdevdata},
 }
```
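
Note on the `concatenate` chunk in this file: the README text says that milestones absent from a dataset end up as all-`NA` columns after `bind_rows()`. The following is a minimal sketch of that behaviour with two toy data frames. The names `cohort_1`, `cohort_2`, `item_a` and `item_b` are made up for illustration and are not taken from the package.

```r
library(dplyr)

# Two toy cohorts; `item_a` and `item_b` are hypothetical milestone columns.
cohort_1 <- data.frame(cohort = "TOY-1", agedays = c(100, 200), item_a = c(1, 0))
cohort_2 <- data.frame(cohort = "TOY-2", agedays = 300, item_b = 1)

# bind_rows() matches columns by name and fills unmatched ones with NA,
# so both the row count and the column count grow.
toy <- bind_rows(cohort_1, cohort_2)

dim(toy)           # 3 rows, 4 columns
is.na(toy$item_b)  # TRUE TRUE FALSE
```

The same mechanism would explain why the column count of `alldata` grows as cohorts with different instruments are added.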

README.md

Lines changed: 38 additions & 28 deletions
@@ -6,7 +6,7 @@
 <!-- badges: start -->

 [![Lifecycle:
-experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
+stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)
 <!-- badges: end -->

 The goal of `childdevdata` is to support innovation in child
@@ -17,8 +17,9 @@ development. The package
 3. Supports multiple measurement instruments;
 4. Eases joint analyses of the data.

-The current version bundles milestone data from eight studies, x
-children and x measurement made on z instruments.
+The current version bundles milestone data from ten studies, containing
+1,116,061 assessments made on 10831 unique children during 28465 visits,
+covering 21 different instruments.

 ## Installation

@@ -60,8 +61,9 @@ The package contains multiple datasets. Obtain the list of datasets by
 
 ``` r
 data(package = "childdevdata")$results[, "Item"]
-#> [1] "gcdg_col_lt42m"   "gcdg_col_lt45m"   "gcdg_ecu"         "gcdg_jam_lbw"
-#> [5] "gcdg_jam_stunted" "gcdg_mdg"         "gcdg_nld_smocc"   "gcdg_zaf"
+#> [1] "gcdg_chl_1"       "gcdg_chn"         "gcdg_col_lt42m"   "gcdg_col_lt45m"
+#> [5] "gcdg_ecu"         "gcdg_jam_lbw"     "gcdg_jam_stunted" "gcdg_mdg"
+#> [9] "gcdg_nld_smocc"   "gcdg_zaf"
 ```

 The documentation of the data can be found by typing into the console:
@@ -98,7 +100,7 @@ Column numbers eight and up hold the milestone scores.
 ## Combining data

 Concatenating two or more data is straightforward using `dplyr`. The
-following code concatenates all datasets.
+following code concatenates all avialable GCDG datasets.

 ``` r
 library(dplyr)
@@ -110,24 +112,31 @@ library(dplyr)
 #> The following objects are masked from 'package:base':
 #>
 #>     intersect, setdiff, setequal, union
-alldata <- bind_rows(gcdg_col_lt42m, gcdg_col_lt45m, gcdg_ecu, gcdg_jam_lbw, gcdg_jam_stunted, gcdg_mdg, gcdg_nld_smocc, gcdg_zaf)
+alldata <- bind_rows(gcdg_chl_1, gcdg_chn, gcdg_col_lt42m, gcdg_col_lt45m, gcdg_ecu, gcdg_jam_lbw, gcdg_jam_stunted, gcdg_mdg, gcdg_nld_smocc, gcdg_zaf)
 dim(alldata)
-#> [1] 25336  1280
+#> [1] 28465  1306
 ```

 Both the number of rows and the number of columns have increased.
 Milestones not appearing in a particular data obtain all missing (`NA`)
 scores.

-The number of records per cohort is
+The number of records per cohort by sex is

 ``` r
-table(alldata$cohort)
-#>
-#>   GCDG-COL-LT42M   GCDG-COL-LT45M         GCDG-ECU     GCDG-JAM-LBW
-#>             1311             1335              667              443
-#> GCDG-JAM-STUNTED         GCDG-MDG   GCDG-NLD-SMOCC         GCDG-ZAF
-#>              477              205            16722             4176
+table(alldata$cohort, alldata$sex)
+#>
+#>                    Female Male
+#>   GCDG-CHL-1          970 1169
+#>   GCDG-CHN            509  481
+#>   GCDG-COL-LT42M      646  665
+#>   GCDG-COL-LT45M      651  684
+#>   GCDG-ECU            337  330
+#>   GCDG-JAM-LBW        242  201
+#>   GCDG-JAM-STUNTED    207  270
+#>   GCDG-MDG            113   92
+#>   GCDG-NLD-SMOCC     8499 8223
+#>   GCDG-ZAF           2154 2018
 ```

 ## Calculating D-score and DAZ
@@ -141,15 +150,15 @@ library(dscore)
 alldata$age <- round(alldata$agedays/365.25, 4)
 d <- dscore(alldata)
 head(d)
-#>      a  n     p    d   sem    daz
-#> 1 1.81 29 0.345 58.0 0.950 -0.932
-#> 2 3.19 49 0.857 71.8 0.566  0.515
-#> 3 0.86 41 0.537 45.4 0.541 -0.282
-#> 4 3.39 36 0.556 69.3 0.523 -0.773
-#> 5 1.86 42 0.691 65.3 0.542  1.475
-#> 6 2.94 44 0.795 70.0 0.511  0.297
+#>       a  n     p    d   sem    daz
+#> 1 1.024 29 0.690 50.4 0.666  0.286
+#> 2 1.509 22 0.955 57.8 1.445  0.269
+#> 3 0.975 29 0.724 50.8 0.682  0.742
+#> 4 1.016 29 0.759 51.3 0.700  0.649
+#> 5 1.016 22 0.682 49.1 0.677 -0.099
+#> 6 1.517 25 0.840 56.9 1.058 -0.070
 dim(d)
-#> [1] 25336     6
+#> [1] 28465     6
 ```

 We visualise the D-score distribution by age per cohort as
@@ -159,8 +168,9 @@ alldata <- bind_cols(alldata, d)
 ggplot(alldata, aes(age, d, group = cohort)) +
 geom_point(cex = 0.3) +
 facet_wrap(~ cohort) +
+ylab("D-score") + xlab("Age (years)") +
 theme_bw()
-#> Warning: Removed 253 rows containing missing values (geom_point).
+#> Warning: Removed 380 rows containing missing values (geom_point).
 ```

 <img src="man/figures/README-unnamed-chunk-7-1.png" width="100%" />
@@ -177,8 +187,8 @@ dataset seem to exist. The `childdevdata` package fills that void.
 The package grew out of a project in which we collected milestone data
 from 16 cohorts. See [Weber et al.](#ref-weber2019)
 ([2019](#ref-weber2019)) and <http://d-score.org/dbook2/> for results.
-Eight of the cohort owners graciously decided to make their data
-available for third parties. We are grateful to them.
+Ten cohort owners graciously decided to make their data available for
+third parties. We are grateful to them.

 ## How to use the data?

@@ -203,9 +213,9 @@ The citation of the `childevdata` package is
 title = {Child Development Data},
 author = {Stef {van Buuren} and Iris Eekhout and Marta Rubio Codina and Orazio Attanasio and Costas Meghir and
 Emla Fitzsimons and Sally Grantham-McGregor and Maria Caridad Araujo and Susan Walker and Susan Chang and
-Christine Powell and Ann Weber and Lia Fernald and Paul Verkerk and Linda Richter},
+Christine Powell and Ann Weber and Lia Fernald and Paul Verkerk and Linda Richter and Betsy Lozoff},
 year = {2021},
-note = {R package version 0.1.0},
+note = {R package version 1.0.0},
 url = {https://github.com/d-score/childdevdata},
}
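
Note on the `table(alldata$cohort, alldata$sex)` change in this file: the removed one-way table lists 4176 records for GCDG-ZAF, while the Female and Male counts in the added two-way table sum to 4172. One possible explanation, not confirmed by the commit, is that a few records have a missing `sex`, since `table()` drops `NA` values by default. A minimal sketch with invented toy vectors:

```r
# Toy vectors; the values are invented for illustration only.
cohort <- c("TOY-1", "TOY-1", "TOY-2", "TOY-2", "TOY-2")
sex    <- c("Female", "Male", "Male", NA, "Female")

# Default behaviour: the record with a missing sex is silently dropped.
tab <- table(cohort, sex)
sum(tab)     # 4

# useNA = "ifany" adds an <NA> column, so every record is counted.
tab_na <- table(cohort, sex, useNA = "ifany")
sum(tab_na)  # 5
```

If the totals are meant to match `dim(alldata)`, passing `useNA = "ifany"` is one way to make any missing values visible in the README output.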

man/figures/README-example-1.png

39.7 KB

man/figures/README-pressure-1.png

-22.4 KB
Binary file not shown.
85.1 KB