|
| 1 | +############################################################### |
| 2 | +# This is a readme for analyzing pairwise fasta alignments |
| 3 | +# output from lastz. |
| 4 | +# |
| 5 | +# Required programs: Lastz |
| 6 | +# AlignmentProcessor1.2 |
| 7 | +# PhyML |
| 8 | +# paml |
| 9 | +############################################################### |
| 10 | + |
| 11 | +#------------------------------- |
| 12 | +# 1. Run Lastz for Alignment |
| 13 | +#------------------------------- |
| 14 | + |
| 15 | +# Download scripts from the Avian Genome Project and upload to server. |
| 16 | +# Submit falcon_chicken.sh script on Saguaro. See comments at the end for changing file, target, and query names. |
| 17 | + |
| 18 | +#------------------------------- |
| 19 | +# 2. Run Stich Gene Blocks on Galaxy |
| 20 | +#------------------------------- |
| 21 | + |
| 22 | +# Upload the Lastz output maf file to Galaxy, and import the GalGal4 bed12 file from UCSC (select Genes and Gene Predictions, |
| 23 | +# Ensembl genes, genome, and BED format before getting output. On the next page select whole gene). Remove the "chr_UN" and "chr" |
| 24 | +# chromosome prefixes from the file, and resubmit the file to Galaxy. Also submit the chicken genome used in the alignment as a |
| 25 | +# custom build and set the builds of all three files to the custom build. (Alternatively, just use the UCSC chicken genome in the alignment.) |
| 26 | + |
| 27 | +#------------------------------- |
| 28 | +# 3. Run AlignmentProcessor |
| 29 | +#------------------------------- |
| 30 | + |
| 31 | +# KaKs_Calculator |
| 32 | + cd AlignmentProcessor/ |
| 33 | + python AlignmentProcessor.py -t 4 --axt --kaks -r galGal4 -i Pairwise/galgal_falper.fa -o KaKs/ |
| 34 | + |
| 35 | +# CodeML |
| 36 | + cd AlignmentProcessor/ |
| 37 | + cp controlFiles/pairwiseAlt.ctl PairwiseCodeML1.2 |
| 38 | + python AlignmentProcessor.py -t 4 --phylip --codeml -r galGal4 -i Pairwise/galgal_falper.fa -o PairwiseCodeML/ |
| 39 | + |
| 40 | +#------------------------------- |
| 41 | +# 4. Prepare Output for analysis in R |
| 42 | +#------------------------------- |
| 43 | + |
| 44 | +# Download a list of gene and transcript IDs and their chromosome for galGal4 from Ensembl BioMart (v83). |
| 45 | +# Join this list with the Ka/Ks results to add gene, scaffold, and chromosome information to the results. |
| 46 | +# Prior to joining, open each list in Excel and sort both files either ascending or descending by their transcript IDs. |
| 47 | +# It does not matter which one you use, as long as you use the same one for each (join will not work properly if they are sorted differently). |
| 48 | + |
| 49 | +# KaKs |
| 50 | + join -t "," --header --check-order -1 2 -2 1 "Pairwise/galGal4GeneTranscriptIDs.txt" "KaKs1.2/KaKs.csv" > "KaKs1.2/galGal4KaKs.csv" |
| 51 | + |
| 52 | +# CodeML |
| 53 | + python bin/ConcatenateCodeML.py --pairwise -i PairwiseCodeML1.2/04_CodemlOutput -o PairwiseCodeML1.2/pairwiseAlt.csv |
| 54 | + join -t "," --header --check-order -1 2 -2 1 "Pairwise/galGal4GeneTranscriptIDs.txt" "PairwiseCodeML1.2/pairwiseAlt.csv" > "PairwiseCodeML1.2/galGal4Pairwise.csv" |
| 55 | + |
| 56 | +#------------------------------- |
| 57 | +# 5. Permutation test for Z chromosome vs. Autosomes |
| 58 | +#------------------------------- |
| 59 | + |
| 60 | +# Make two csv files from the joined file in step 4, one for only autosomal genes and one for only Z-linked genes. |
| 61 | +# Manually remove any genes with NAs for Ka, Ks, or Ka/Ks. |
| 62 | + |
| 63 | + cd PermutationScripts/ |
| 64 | + |
| 65 | + python permutation.py --c1 7 --c2 7 -i1 KaKsOut/galGal_falPerAutosomalKaKs.csv -i2 KaKsOut/galGal_falPerZKaKs.csv |
| 66 | + |
| 67 | + |
0 commit comments