Commit 16bbba6

Adding detail on the cytosine contexts
1 parent 675b579 commit 16bbba6

1 file changed

Lines changed: 42 additions & 0 deletions

documentation/docs/cytosine_contexts.md

@@ -81,3 +81,45 @@ contexts would be:
13 + CHG
15 - CHG
```

The following table shows which nucleotide triplets count towards
each context. Every triplet begins with a C and, in our formats, this
is the cytosine whose methylation level or state is in question.

| Triplet | CpG | CHH | CHG | CWG | CCG |   |
|---------|-----|-----|-----|-----|-----|---|
| CAA     |     | 1   |     |     |     |   |
| CAC     |     | 1   |     |     |     |   |
| CAG     |     |     | 1   | 1   |     | * |
| CAT     |     | 1   |     |     |     |   |
| CCA     |     | 1   |     |     |     |   |
| CCC     |     | 1   |     |     |     |   |
| CCG     |     |     | 1   |     | 1   | * |
| CCT     |     | 1   |     |     |     |   |
| CGA     | 1   |     |     |     |     |   |
| CGC     | 1   |     |     |     |     |   |
| CGG     | 1   |     |     |     |     |   |
| CGT     | 1   |     |     |     |     |   |
| CTA     |     | 1   |     |     |     |   |
| CTC     |     | 1   |     |     |     |   |
| CTG     |     |     | 1   | 1   |     | * |
| CTT     |     | 1   |     |     |     |   |

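The table can be restated as a small classification rule. The following is a minimal Python sketch, not code from this project: the function name `contexts` and the constants `H` and `W` are hypothetical, and the rule assumes the usual IUPAC meanings (H is A, C or T; W is A or T).

```python
from itertools import product

# IUPAC codes used by the context names (assumed): H = A, C or T; W = A or T.
H = set("ACT")
W = set("AT")


def contexts(triplet):
    """Return the set of context names a C-initial triplet counts towards."""
    triplet = triplet.upper()
    if len(triplet) != 3 or any(base not in "ACGT" for base in triplet):
        raise ValueError("expected three bases from A, C, G, T")
    first, second, third = triplet
    if first != "C":
        raise ValueError("triplet must start with C")
    if second == "G":
        return {"CpG"}
    # From here on the second base is A, C or T, i.e. it matches H.
    if third in H:
        return {"CHH"}
    # Third base is G: the triplet is a CHG and also either a CWG or a CCG.
    return {"CHG", "CWG" if second in W else "CCG"}


if __name__ == "__main__":
    # Print one line per triplet, in the same order as the table rows.
    for rest in product("ACGT", repeat=2):
        triplet = "C" + "".join(rest)
        print(triplet, sorted(contexts(triplet)))
```

Each of the 16 triplets falls into exactly one of CpG, CHH and CHG in this sketch; CWG and CCG only ever appear alongside CHG, matching the rows in the table that carry two marks.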
The contexts traditionally considered are CpG, CHH and CHG. CHH and
CHG have been of more interest in plants, especially Arabidopsis.
Together, the CpG, CHH and CHG contexts cover all trinucleotides that
start with C and partition them unambiguously; these are the first
three context columns in the table above. CWG and CCG are of interest
mostly because CWG is so important in vertebrate species (and more so
in mammals), where a combination of deamination-induced loss of the
middle cytosine of a CCG (that cytosine sits in a CpG, and when
methylated it deaminates to T, giving CTG) and tandem expansion of
CAG/CTG repeats (including within human populations) has led to a
relative abundance of CWG in important places in the genome. CWG may
also be called "symmetric" in the same way as CHG. In a strict sense,
however, calling every CHG symmetric would include a CCG on one
strand paired with a CGG on the other, and the CGG is not a CHG.
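
The strand caveat above can be checked by reverse-complementing each CHG triplet and classifying the result on the opposite strand. This is an illustrative sketch only; `reverse_complement` and `primary_context` are hypothetical helper names, not functions from this project.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposite-strand sequence, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primary_context(triplet):
    """Classify a C-initial triplet as CpG, CHG or CHH."""
    triplet = triplet.upper()
    if len(triplet) != 3 or triplet[0] != "C":
        raise ValueError("expected a three-base triplet starting with C")
    if triplet[1] == "G":
        return "CpG"
    return "CHG" if triplet[2] == "G" else "CHH"


if __name__ == "__main__":
    for triplet in ("CAG", "CTG", "CCG"):
        opposite = reverse_complement(triplet)
        print(triplet, "->", opposite, primary_context(opposite))
```

Under these assumptions CAG and CTG map onto each other, so both strands see a CWG, while CCG maps onto CGG, which is classified as CpG on the opposite strand.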

The above table does not mention the CXG; we plan to remove the CXG
and only include contexts from the above table.
