Commit 81ef5f8
add scripts to make sacCer3_VCF db
This commit updates the sacCer3_VCF variant database so that the VCF file for each strain contains only unique, unambiguous variants. The full, original VCF files are moved into the subdirectory `sacCer3_VCF/full_VCF`. The commit also includes scripts to regenerate these databases from scratch.
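As a rough illustration of what "unique, unambiguous variants" could mean, the sketch below keeps, for each strain, only the records whose position is not seen in any other strain. This reading is an assumption: the commit message does not define "ambiguous", and the real pipeline uses bedtools intersect, which compares coordinate intervals rather than exact positions. The function names `variant_key` and `unique_variants` are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def variant_key(record: str) -> Tuple[str, int]:
    """Return (chromosome, position) for one tab-separated VCF data line."""
    chrom, pos = record.rstrip("\n").split("\t")[:2]
    return chrom, int(pos)


def unique_variants(strains: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Keep only records whose position occurs in exactly one strain.

    Assumption: a variant is "ambiguous" when another strain also has a
    variant at the same chromosome and position.
    """
    seen_in = Counter()
    for records in strains.values():
        # Count each position once per strain, even if it is listed twice.
        seen_in.update({variant_key(r) for r in records})
    return {
        strain: [r for r in records if seen_in[variant_key(r)] == 1]
        for strain, records in strains.items()
    }
```

Called with a mapping from strain name to its VCF data lines (header lines starting with `#` excluded), it returns the same mapping with shared positions dropped from every strain.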
sacCer3_VCF/*.gatk.vcf
-These are the unambiguous variants for each sacCer3 strain
sacCer3_VCF/full_VCF/*.gatk.vcf
-These are the full VCF files saved for reference
utility_scripts/generate_sacCer3_VariantDB.sh
-This Bash script downloads VCF files from SGD, converts their chromosome names from Roman to Arabic numerals, and strips out ambiguous variants with `bedtools intersect`
utility_scripts/parsers/parse_sacCer3_VCF.pl
-This is the perl script for converting the chromosome naming system of the VCF files
1 parent 9fc4413 commit 81ef5f8
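The chromosome renaming step described in the commit message can be sketched as follows. This is a Python illustration, not the Perl script from the commit. It assumes input names such as `chrI` to `chrXVI` and output names such as `chr1` to `chr16`, and that names which are not Roman numerals (for example a mitochondrial contig) pass through unchanged. The actual naming conventions used by `parse_sacCer3_VCF.pl` are not shown in the commit, and all function names here are hypothetical.

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10}
# Optional "chr" prefix followed by a Roman numeral built from I, V, X only.
CHROM_RE = re.compile(r"^(chr)?([IVX]+)$")


def roman_to_int(numeral: str) -> int:
    """Convert a Roman numeral made of I, V and X to an integer."""
    total = 0
    for i, symbol in enumerate(numeral):
        value = ROMAN_VALUES[symbol]
        following = ROMAN_VALUES[numeral[i + 1]] if i + 1 < len(numeral) else 0
        # A smaller value before a larger one is subtracted (IV = 4, IX = 9).
        total += -value if value < following else value
    return total


def rename_chrom(name: str) -> str:
    """Rename e.g. 'chrIV' to 'chr4'; leave non-Roman names untouched."""
    match = CHROM_RE.match(name)
    if match is None:
        return name
    prefix = match.group(1) or ""
    return f"{prefix}{roman_to_int(match.group(2))}"


def convert_vcf_line(line: str) -> str:
    """Rewrite the CHROM column of a VCF data line.

    Header lines (starting with '#') are returned as-is, so any
    '##contig' entries would need separate handling.
    """
    if line.startswith("#"):
        return line
    fields = line.split("\t", 1)
    fields[0] = rename_chrom(fields[0])
    return "\t".join(fields)
```

Applying `convert_vcf_line` to every line of a file leaves all columns other than CHROM exactly as they were, including the trailing newline.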
26 files changed
Lines changed: 258914 additions & 200693 deletions
File tree
- StrainID
- sacCer3_VCF
  - full_VCF
- utility_scripts
  - parsers
Large diffs are not rendered by default.
This file was deleted.