Commit 1ff831c

author
zhangrui2018@picb.ac.cn
committed
toydata/
1 parent b257879 commit 1ff831c

48 files changed

Lines changed: 804 additions & 0 deletions

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+a 1 100 350
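
This one-line file has the layout the readme in this commit gives for `--region` (region ID, chromosome ID, start, end; no header; tab or space separated). A minimal parsing sketch under that assumption follows; the names `Region` and `parse_regions` are hypothetical and are not taken from Summary_Statistics.py.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    """One target region (hypothetical helper type)."""
    region_id: str
    chrom: str
    start: int
    end: int


def parse_regions(text: str) -> List[Region]:
    """Parse a headerless, whitespace-separated 4-column region file."""
    regions = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()  # splits on tabs and on spaces
        if not fields:
            continue  # skip blank lines
        if len(fields) != 4:
            raise ValueError(f"line {lineno}: expected 4 columns, got {len(fields)}")
        region_id, chrom, start, end = fields
        regions.append(Region(region_id, chrom, int(start), int(end)))
    return regions


if __name__ == "__main__":
    print(parse_regions("a 1 100 350\n"))
```

The chromosome ID is kept as a string so that labels such as `X` would not need special handling.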
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Ind1 Ind2 Ind3 Ind4 Ind5
+1 100 . A T . . . GT 0|1 0|0 0|0 1|0 0|0
+1 110 . G T . . . GT 0|1 0|0 0|0 1|0 0|0
+1 120 . A C . . . GT 1|1 1|1 1|1 1|1 1|1
+1 130 . A T . . . GT 0|0 0|0 0|0 0|0 0|0
+1 140 . G T . . . GT 0|1 0|0 0|0 0|0 0|0
+1 150 . G T . . . GT 0|0 0|0 1|0 0|1 0|0
+1 160 . A G . . . GT 0|0 0|0 0|0 0|0 0|0
+1 170 . A C . . . GT 0|0 0|0 0|0 0|0 0|0
+1 180 . A G . . . GT 0|0 0|0 0|0 0|0 0|0
+1 190 . G A . . . GT 1|1 0|1 1|1 1|1 1|1
+1 200 . A G . . . GT 1|0 0|0 0|0 0|0 0|0
+1 210 . A T . . . GT 0|0 1|0 0|0 0|0 0|0
+1 220 . A C . . . GT 1|0 0|1 1|0 0|1 1|0
+1 230 . C G . . . GT 0|0 0|0 0|0 0|0 0|0
+1 240 . C T . . . GT 0|0 0|0 0|0 0|0 0|0
+1 250 . T G . . . GT 1|0 0|1 1|0 0|1 1|0
+1 260 . T C . . . GT 1|1 1|1 1|1 1|1 1|1
+1 270 . G T . . . GT 0|0 0|0 0|0 0|0 0|0
+1 280 . G C . . . GT 0|0 0|0 0|0 0|0 0|0
+1 290 . T A . . . GT 0|0 0|0 0|0 0|0 0|0
+1 300 . C T . . . GT 0|0 0|0 0|0 0|0 0|0
+1 310 . A C . . . GT 0|0 0|0 1|0 0|1 0|0
+1 320 . T A . . . GT 0|0 0|0 0|0 0|0 0|0
+1 330 . T C . . . GT 0|0 0|0 1|0 0|1 0|0
+1 340 . T G . . . GT 0|0 0|0 1|0 0|1 0|0
+1 350 . G T . . . GT 0|0 0|1 0|0 0|0 1|0
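
Because the genotypes are phased (`|`), each individual contributes two haplotypes, which is why five samples give ten sequences. The sketch below shows one way to turn the GT columns into per-haplotype 0/1 strings. It assumes a FORMAT column of exactly `GT`, biallelic sites, and whitespace-separated fields; `vcf_to_haplotypes` is a hypothetical name, and the excerpt holds only the first three records of the file above.

```python
VCF_EXCERPT = """\
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Ind1 Ind2 Ind3 Ind4 Ind5
1 100 . A T . . . GT 0|1 0|0 0|0 1|0 0|0
1 110 . G T . . . GT 0|1 0|0 0|0 1|0 0|0
1 120 . A C . . . GT 1|1 1|1 1|1 1|1 1|1
"""


def vcf_to_haplotypes(vcf_text):
    """Return (sample_names, haplotype_strings) from a phased, GT-only VCF.

    Haplotypes are ordered sample by sample: left allele, then right allele.
    """
    samples, haps = [], []
    for line in vcf_text.splitlines():
        if not line.strip() or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split()
        if line.startswith("#CHROM"):
            samples = fields[9:]
            haps = [[] for _ in range(2 * len(samples))]
            continue
        for i, gt in enumerate(fields[9:]):
            # Unpacking raises ValueError for unphased genotypes such as 0/1.
            left, right = gt.split("|")
            haps[2 * i].append(left)
            haps[2 * i + 1].append(right)
    return samples, ["".join(h) for h in haps]


if __name__ == "__main__":
    names, haplotypes = vcf_to_haplotypes(VCF_EXCERPT)
    for k, hap in enumerate(haplotypes):
        print(names[k // 2], k % 2 + 1, hap)
```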
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+regionID chr start end #sequence #marker ThetaPI ThetaK #segregating #haplotype Hap_diversity Dtajima Dtajima_P Dtajima_adj.P
+a 1 100 350 10 26 4.4 4.5953149109272 13 7 0.9333333333333333 -0.193407676442191 0.896843046548428 0.896843046548428
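
The row above can be cross-checked against the toy VCF. The following is a minimal sketch assuming the textbook definitions: mean pairwise differences for ThetaPI, S/a1 for ThetaK, n/(n-1)·(1 − Σp²) for haplotype diversity, and Tajima's (1989) variance constants for D. It is not the code of Summary_Statistics.py, the name `summary_stats` is hypothetical, and the two p-value columns are not reproduced. The `HAPS` list is transcribed from the GT columns of the toy VCF (Ind1 to Ind5, left allele then right allele, one character per site).

```python
from collections import Counter
from math import sqrt

HAPS = [
    "00100000011010011000000000",  # Ind1, left
    "11101000010000001000000000",  # Ind1, right
    "00100000000100001000000000",  # Ind2, left
    "00100000010010011000000001",  # Ind2, right
    "00100100010010011000010110",  # Ind3, left
    "00100000010000001000000000",  # Ind3, right
    "11100000010000001000000000",  # Ind4, left
    "00100100010010011000010110",  # Ind4, right
    "00100000010010011000000001",  # Ind5, left
    "00100000010000001000000000",  # Ind5, right
]


def summary_stats(haps):
    n = len(haps)
    n_sites = len(haps[0])
    # Alternate-allele count at each site.
    counts = [sum(int(h[j]) for h in haps) for j in range(n_sites)]
    seg = sum(1 for k in counts if 0 < k < n)  # segregating sites
    # Mean number of pairwise differences, summed over sites.
    pi = sum(2 * k * (n - k) for k in counts) / (n * (n - 1))
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    theta_s = seg / a1  # assumed to correspond to the ThetaK column
    hap_counts = Counter(haps)
    hap_div = n / (n - 1) * (1 - sum((c / n) ** 2 for c in hap_counts.values()))
    # Variance constants from Tajima (1989).
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    if seg == 0:
        tajima_d = float("nan")  # D is undefined without segregating sites
    else:
        tajima_d = (pi - theta_s) / sqrt(e1 * seg + e2 * seg * (seg - 1))
    return {
        "n_sequence": n, "n_marker": n_sites, "theta_pi": pi,
        "theta_s": theta_s, "n_segregating": seg,
        "n_haplotype": len(hap_counts), "hap_diversity": hap_div,
        "tajima_d": tajima_d,
    }


if __name__ == "__main__":
    for name, value in summary_stats(HAPS).items():
        print(name, value)
```

Under these assumptions the sketch agrees with the non-p-value columns of the row: 10 sequences, 26 markers, 13 segregating sites, 7 haplotypes, ThetaPI 4.4, and matching ThetaK, haplotype diversity, and Tajima's D up to floating-point rounding.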

toydata/Summary_Statistics/readme

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+Calculate the following statistics within given regions:
+1. number of sequences
+2. number of genetic markers
+3. number of singletons
+4. ThetaPI
+5. ThetaK
+6. number of segregating sites
+7. number of haplotypes
+8. haplotype diversity
+9. Tajima's D (with p-values and BH-corrected p-values)
+
+input:
+1. phased VCF file [required]
+2. target regions to be analyzed [required]
+3. samples to be included [optional, default = 'all']
+4. length of sliding windows (bp) and increment [optional, default = 'target_region']
+5. name of output file [optional, default = 'out']
+
+output:
+*.stat
+
+parameters:
+--vcf: input VCF file
+--region: region file with 4 columns: <region ID> <chrom ID> <start pos> <end pos>; no header line; tab- or space-separated
+--samples: list of sample IDs to include
+--window_shift: windowsize@increment
+--out: prefix of output file
+
+command line for the example data:
+python Summary_Statistics.py --vcf input.vcf --region input.region --out output
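
The readme gives the `--window_shift` value only as `windowsize@increment`. The sketch below shows one plausible reading, assuming both numbers are in base pairs, window coordinates are inclusive, and the last windows are clipped at the region end. The script's actual boundary handling is not documented here and may differ; `parse_window_shift`, `sliding_windows`, and the example value `100@50` are hypothetical.

```python
def parse_window_shift(spec):
    """Split a 'windowsize@increment' string into two positive integers."""
    size_text, step_text = spec.split("@")
    size, step = int(size_text), int(step_text)
    if size <= 0 or step <= 0:
        raise ValueError("window size and increment must be positive")
    return size, step


def sliding_windows(start, end, size, step):
    """Yield inclusive (win_start, win_end) pairs covering start..end.

    Assumption: windows that would run past `end` are clipped to `end`.
    """
    pos = start
    while pos <= end:
        yield pos, min(pos + size - 1, end)
        pos += step


if __name__ == "__main__":
    size, step = parse_window_shift("100@50")
    for window in sliding_windows(100, 350, size, step):
        print(window)
```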
Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+AdmixSim 2
+Arguments and Options:
+Modfile = test.mod
+SNVfile = test.snv
+Hapfile = test.hap
+Indfile = test.ind
+Output population list and corresponding generation and population size: Admixed 8g 4;
+Seed = 1598518282
+Chr = A
+Recombinaion Setting: Using position-specific genetic distance in snv file
+Mutation Setting: General mutation rate = 1e-08
+Out prefix = out1
+==========================================================================================================
+
+Simulation start! Thu Aug 27 16:51:22 2020
+
+Reading snvfile time: 0s
+Reading indfile time: 0s
+Reading modfile time: 0s
+Reading hapfile time: 0s
+
+Simulation time: 0s
+
+Saving ancestral sequence data time: 0s
+
+Final add de novo mutation and output snvfile time: 0.01s
+
+Population Admixed at generation 8 output time: 0s
+
+Simulation end! Thu Aug 27 16:51:23 2020
+

toydata/example1/out1.af

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+Gen Pop Position(s) Condition Frequency Male_Frequency Female_Frequency

toydata/example1/out1.hap

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+00100000011010011000000000
+11101000010000001000000000
+00100000000100001000000000
+00100000010010011000000001
+00100100010010011000010110
+00100000010000001000000000
+11100000010000001000000000
+00100100010010011000010110

toydata/example1/out1.ind

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+Ind1 Admixed_8 0
+Ind2 Admixed_8 0
+Ind3 Admixed_8 0
+Ind4 Admixed_8 0

toydata/example1/out1.log

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+AdmixSim 2
+Arguments and Options:
+Modfile = test.mod
+SNVfile = test.snv
+Hapfile = test.hap
+Indfile = test.ind
+Output population list and corresponding generation and population size: Admixed 8g 4;
+Seed = 1599362113
+Chr = A
+Recombinaion Setting: Using locus-specific genetic distance in snv file
+Mutation Setting: Uniform mutation rate = 1e-08
+Out prefix = out1
+==========================================================================================================
+
+Simulation start! Sun Sep 6 11:15:13 2020
+
+Reading snvfile time: 0s
+Reading indfile time: 0s
+Reading modfile time: 0s
+Reading hapfile time: 0s
+
+Simulation time: 0s
+
+Saving ancestral sequence data time: 0s
+
+Final add de novo mutation and output snvfile time: 0s
+
+Population Admixed at generation 8 output time: 0s
+
+Simulation end! Sun Sep 6 11:15:13 2020
+

toydata/example1/out1.seg

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+Ind1 Hap 1
+16469059 0.02114202 23976120 0.2125382477 Anc1
+23976120 0.2125382477 47601123 0.6243576526 Anc2
+47601123 0.6243576526 48849346 0.66376859 Anc1
+Ind1 Hap 2
+16469059 0.02114202 36275930 0.4250721252 Anc4
+36275930 0.4250721252 37634829 0.455307771 Anc2
+37634829 0.455307771 46661084 0.6041259271 Anc4
+46661084 0.6041259271 47601123 0.6243576526 Anc2
+47601123 0.6243576526 48849346 0.66376859 Anc1
+Ind2 Hap 1
+16469059 0.02114202 32449091 0.3657696056 Anc1
+32449091 0.3657696056 48849346 0.66376859 Anc4
+Ind2 Hap 2
+16469059 0.02114202 17439688 0.04906685992 Anc1
+17439688 0.04906685992 48849346 0.66376859 Anc2
+Ind3 Hap 1
+16469059 0.02114202 48849346 0.66376859 Anc2
+Ind3 Hap 2
+16469059 0.02114202 17169157 0.04128372665 Anc2
+17169157 0.04128372665 28162591 0.3167143957 Anc1
+28162591 0.3167143957 48849346 0.66376859 Anc4
+Ind4 Hap 1
+16469059 0.02114202 20281233 0.1199932635 Anc4
+20281233 0.1199932635 28162591 0.3167143957 Anc1
+28162591 0.3167143957 48849346 0.66376859 Anc4
+Ind4 Hap 2
+16469059 0.02114202 16710569 0.02809022378 Anc1
+16710569 0.02809022378 48849346 0.66376859 Anc2
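
Each block of out1.seg starts with an `<individual> Hap <n>` header followed by one line per ancestry segment. The sketch below reads that layout and totals the physical length assigned to each ancestry label. The column interpretation is an assumption based only on the values shown (start position in bp, start genetic position, end position in bp, end genetic position, ancestry label); it is not taken from AdmixSim 2 documentation. `parse_seg` and `ancestry_bp` are hypothetical names, and the excerpt reuses two blocks from the file above.

```python
from collections import defaultdict

SEG_EXCERPT = """\
Ind1 Hap 1
16469059 0.02114202 23976120 0.2125382477 Anc1
23976120 0.2125382477 47601123 0.6243576526 Anc2
47601123 0.6243576526 48849346 0.66376859 Anc1
Ind3 Hap 1
16469059 0.02114202 48849346 0.66376859 Anc2
"""


def parse_seg(text):
    """Map (individual, haplotype_number) to a list of segment tuples."""
    segments = {}
    key = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) == 3 and fields[1] == "Hap":
            key = (fields[0], int(fields[2]))
            segments[key] = []
            continue
        if key is None:
            raise ValueError("segment line found before any haplotype header")
        # Assumed columns: start bp, start genetic pos, end bp, end genetic pos, label.
        start_bp, start_gen, end_bp, end_gen, label = fields
        segments[key].append(
            (int(start_bp), float(start_gen), int(end_bp), float(end_gen), label)
        )
    return segments


def ancestry_bp(segs):
    """Total physical length (bp) per ancestry label for one haplotype."""
    totals = defaultdict(int)
    for start_bp, _start_gen, end_bp, _end_gen, label in segs:
        totals[label] += end_bp - start_bp
    return dict(totals)


if __name__ == "__main__":
    for key, segs in parse_seg(SEG_EXCERPT).items():
        print(key, ancestry_bp(segs))
```

Dividing each total by the summed length of the haplotype would give per-haplotype ancestry proportions under the same column assumption.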
