Commit 2b4e56e

Merge pull request #9 from OpenTreeOfLife/ms
Ms
2 parents 22c9715 + 005b765 commit 2b4e56e

36 files changed

Lines changed: 2233 additions & 231 deletions

.coveragerc

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 # .coveragerc to control coverage.py
 [run]
 branch = True
+include = opentree/*.py
 
 [report]
 # Regexes for lines to exclude from consideration

.travis.yml

Lines changed: 4 additions & 0 deletions
@@ -5,6 +5,10 @@ python:
 # command to install dependencies
 install:
 - pip install -r requirements.txt
+- pip install codecov
 # command to run tests
 script:
 - ./run_tests.sh
+
+after_success:
+- codecov

README.md

Lines changed: 7 additions & 1 deletion
@@ -1,10 +1,11 @@
 python-opentree
 ===============
-[![Build Status](https://travis-ci.org/OpenTreeOfLife/python-opentree.svg?branch=master)](https://travis-ci.org/OpenTreeOfLife/python-opentree)[![Documentation](https://readthedocs.org/projects/opentree/badge/?version=latest&style=flat)](https://opentree.readthedocs.io/en/latest/)
+[![Build Status](https://travis-ci.org/OpenTreeOfLife/python-opentree.svg?branch=master)](https://travis-ci.org/OpenTreeOfLife/python-opentree)[![Documentation](https://readthedocs.org/projects/opentree/badge/?version=latest&style=flat)](https://opentree.readthedocs.io/en/latest/)[![codecov](https://codecov.io/gh/OpenTreeOfLife/python-opentree/branch/main/graph/badge.svg)](https://codecov.io/gh/OpenTreeOfLife/python-opentree)
 
 This package is a python library designed to make it easier to work with web services and
 data resources associated with the [Open Tree of Life](https://opentreeoflife.github.io)
 project.
+The git repo is at https://github.com/OpenTreeOfLife/python-opentree.
 
 
 Prior work / design / road map
@@ -27,6 +28,11 @@ If you don't need the latest version you, can simply use:
 
 If you are developer who wants to install multiple times, you probably want to use:
 
+    git clone https://github.com/OpenTreeOfLife/python-opentree.git
+    cd python-opentree
+
+to get a local copy of the code, then:
+
     python3 -m venv env
     source env/bin/activate
     pip install -r requirements.txt

docs/manuscript.md

Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
+---
+title: 'OpenTree: A Python package for accessing and analyzing data from the Open Tree of Life'
+tags:
+  - Python
+  - phylogenetics
+  - taxonomy
+  - evolution
+authors:
+  - name: Emily Jane McTavish ^[Custom footnotes for e.g. denoting who the corresponding author is can be included like this.]
+    orcid: 0000-0001-9766-5727
+    affiliation: 1 # (Multiple affiliations must be quoted)
+  - name: Luna Luisa Sanchez Reyes
+    orcid: 0000-0001-7668-2528
+    affiliation: 1
+  - name: Mark T. Holder
+    orcid: 0000-0001-5575-0536
+    affiliation: 2
+affiliations:
+  - name: University of California, Merced
+    index: 1
+  - name: University of Kansas
+    index: 2
+date: 13 August 2017
+bibliography: paper.bib
+
+
+---
+
+# Summary
+
+The Open Tree of Life project constructs a comprehensive, dynamic and digitally-available tree of life by synthesizing published phylogenetic trees along with taxonomic data.
+Open Tree of Life provides web-service APIs to make the tree estimates, unified taxonomy, and input phylogenetic data available to anyone.
+`OpenTree` provides a Python wrapper for these APIs and downstream data analysis functionality.
+
+
+# Statement of need
+
+`OpenTree` is a Python package for accessing and analyzing data from the Open Tree of Life project.
+Open Tree of Life stores a wealth of taxonomic and phylogenetic data gathered together in an open-access interoperable framework.
+The current synthetic tree [@opentreeoflife_open_2019] comprises 2.4 million tips (largely species).
+The framework of this tree is provided by a unified taxonomy [@opentreeoflife_open_2019-1; @rees_automated_2017].
+This taxonomy links unique identifiers across many online taxonomic resources, including NCBI [CITE] and GBIF [CITE], as well as user-contributed taxonomic amendments contained in [https://github.com/OpenTreeOfLife/amendments-1].
+These taxonomic relationships are refined by evolutionary estimates from 1,216 published papers including 87,000 tip taxa [@opentreeoflife_open_2019; @redelings_supertree_2017].
+The Open Tree data store, `Phylesystem` [@mctavish_phylesystem:_2015], contains all of those published studies, including the mappings between the tips in these studies and unique taxonomic identifiers.
+
+All of these data are freely accessible via API calls [https://github.com/OpenTreeOfLife/germinator/wiki/Open-Tree-of-Life-Web-APIs].
+`OpenTree` provides a user-friendly wrapper for calling these APIs.
+In addition, it converts these data between commonly used file formats and data types.
+This package allows users to generate data objects in DendroPy, a phylogenetic computing library [@sukumaran_dendropy_2010].
+
+
+`OpenTree` brings to Python the functionality available in `rotl`, an R package to interact with the Open Tree of Life data [@michonneau_rotl:_2016], as well as additional downstream analysis and interoperability tools.
+`rotl` has been cited 113 times in the 4 years since its publication, demonstrating a demand for user-friendly access to these data.
+By providing a Python package to interact with these data, we make it straightforward for Python users to access and analyze them.
+A Python wrapper for Open Tree of Life also makes it much easier to link these data with the stable of other Python biodiversity informatics tools such as ETC ETC.
+
+
+
+# Figures
+
+
+
+Fenced code blocks are rendered with syntax highlighting:
+```python
+for n in range(10):
+    yield f(n)
+```
+
+# Acknowledgements
+
+Research was supported by the grant "Sustaining the Open Tree of Life", National Science Foundation ABI No. 1759838, and ABI No. 1759846.
+Compute time was provided by the Multi-Environment Research Computer for Exploration and Discovery (MERCED) cluster from the University of California, Merced (UCM), supported by the NSF Grant No. ACI-1429783.
+
+
+# References
+
+
+
+# Citations
+
+Citations to entries in paper.bib should be in
+[rMarkdown](http://rmarkdown.rstudio.com/authoring_bibliographies_and_citations.html)
+format.
+
+
+For a quick reference, the following citation commands can be used:
+- `@author:2001` -> "Author et al. (2001)"
+- `[@author:2001]` -> "(Author et al., 2001)"
+- `[@author1:2001; @author2:2001]` -> "(Author1 et al., 2001; Author2 et al., 2002)"

docs/paper.bib

Lines changed: 146 additions & 0 deletions
@@ -0,0 +1,146 @@
+
+@article{hinchliff_synthesis_2015,
+  title = {Synthesis of phylogeny and taxonomy into a comprehensive tree of life},
+  volume = {112},
+  issn = {0027-8424, 1091-6490},
+  url = {http://www.pnas.org/content/early/2015/09/16/1423041112},
+  doi = {10.1073/pnas.1423041112},
+  abstract = {Reconstructing the phylogenetic relationships that unite all lineages (the tree of life) is a grand challenge. The paucity of homologous character data across disparately related lineages currently renders direct phylogenetic inference untenable. To reconstruct a comprehensive tree of life, we therefore synthesized published phylogenies, together with taxonomic classifications for taxa never incorporated into a phylogeny. We present a draft tree containing 2.3 million tips—the Open Tree of Life. Realization of this tree required the assembly of two additional community resources: (i) a comprehensive global reference taxonomy and (ii) a database of published phylogenetic trees mapped to this taxonomy. Our open source framework facilitates community comment and contribution, enabling the tree to be continuously updated when new phylogenetic and taxonomic data become digitally available. Although data coverage and phylogenetic conflict across the Open Tree of Life illuminate gaps in both the underlying data available for phylogenetic reconstruction and the publication of trees as digital objects, the tree provides a compelling starting point for community contribution. This comprehensive tree will fuel fundamental research on the nature of biological diversity, ultimately providing up-to-date phylogenies for downstream applications in comparative biology, ecology, conservation biology, climate change, agriculture, and genomics.},
+  language = {en},
+  number = {41},
+  urldate = {2015-09-30},
+  journal = {Proceedings of the National Academy of Sciences},
+  author = {Hinchliff, Cody E. and Smith, Stephen A. and Allman, James F. and Burleigh, J. Gordon and Chaudhary, Ruchi and Coghill, Lyndon M. and Crandall, Keith A. and Deng, Jiabin and Drew, Bryan T. and Gazis, Romina and Gude, Karl and Hibbett, David S. and Katz, Laura A. and Laughinghouse, H. Dail and McTavish, Emily Jane and Midford, Peter E. and Owen, Christopher L. and Ree, Richard H. and Rees, Jonathan A. and Soltis, Douglas E. and Williams, Tiffani and Cranston, Karen A.},
+  month = sep,
+  year = {2015},
+  pmid = {26385966},
+  keywords = {biodiversity, Phylogeny, synthesis, Taxonomy, tree of life},
+  pages = {12764--12769},
+  file = {Full Text PDF:/home/ejmctavish/.zotero/zotero/n7lb4sp9.default/zotero/storage/3EXXWC85/Hinchliff et al. - 2015 - Synthesis of phylogeny and taxonomy into a compreh.pdf:application/pdf;Snapshot:/home/ejmctavish/.zotero/zotero/n7lb4sp9.default/zotero/storage/98IFM3X2/1423041112.html:text/html}
+}
+
+@article{redelings_supertree_2017,
+  title = {A supertree pipeline for summarizing phylogenetic and taxonomic information for millions of species},
+  volume = {5},
+  issn = {2167-8359},
+  url = {https://peerj.com/articles/3058},
+  doi = {10.7717/peerj.3058},
+  abstract = {We present a new supertree method that enables rapid estimation of a summary tree on the scale of millions of leaves. This supertree method summarizes a collection of input phylogenies and an input taxonomy. We introduce formal goals and criteria for such a supertree to satisfy in order to transparently and justifiably represent the input trees. In addition to producing a supertree, our method computes annotations that describe which grouping in the input trees support and conflict with each group in the supertree. We compare our supertree construction method to a previously published supertree construction method by assessing their performance on input trees used to construct the Open Tree of Life version 4, and find that our method increases the number of displayed input splits from 35,518 to 39,639 and decreases the number of conflicting input splits from 2,760 to 1,357. The new supertree method also improves on the previous supertree construction method in that it produces no unsupported branches and avoids unnecessary polytomies. This pipeline is currently used by the Open Tree of Life project to produce all of the versions of project’s “synthetic tree” starting at version 5. This software pipeline is called “propinquity”. It relies heavily on “otcetera”—a set of C++ tools to perform most of the steps of the pipeline. All of the components are free software and are available on GitHub.},
+  language = {en},
+  urldate = {2017-03-17},
+  journal = {PeerJ},
+  author = {Redelings, Benjamin D. and Holder, Mark T.},
+  month = mar,
+  year = {2017},
+  pages = {e3058},
+  file = {Full Text PDF:/home/ejmctavish/.zotero/zotero/n7lb4sp9.default/zotero/storage/J9TD5XBF/Redelings and Holder - 2017 - A supertree pipeline for summarizing phylogenetic .pdf:application/pdf;Snapshot:/home/ejmctavish/.zotero/zotero/n7lb4sp9.default/zotero/storage/MHCI8CSR/3058.html:text/html}
+}
+
+@article{michonneau_rotl:_2016,
+  title = {rotl: an {R} package to interact with the {Open} {Tree} of {Life} data},
+  volume = {7},
+  shorttitle = {rotl},
+  url = {http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12593/full},
+  number = {12},
+  journal = {Methods in Ecology and Evolution},
+  author = {Michonneau, François and Brown, Joseph W. and Winter, David J.},
+  year = {2016},
+  pages = {1476--1481},
+  file = {[PDF] peerj.com:/home/ejmctavish/.zotero/zotero/n7lb4sp9.default/zotero/storage/5ECZPK2B/Michonneau et al. - 2016 - rotl an R package to interact with the Open Tree .pdf:application/pdf;Snapshot:/home/ejmctavish/.zotero/zotero/n7lb4sp9.default/zotero/storage/KMEACUG5/full.html:text/html}
+}
+
+@article{mctavish_phylesystem:_2015,
+  title = {Phylesystem: a git-based data store for community-curated phylogenetic estimates},
+  volume = {31},
+  issn = {1367-4803},
+  shorttitle = {Phylesystem},
+  url = {https://academic.oup.com/bioinformatics/article/31/17/2794/183373/Phylesystem-a-git-based-data-store-for-community},
+  doi = {10.1093/bioinformatics/btv276},
+  abstract = {Motivation: Phylogenetic estimates from published studies can be archived using general platforms like Dryad (Vision, 2010) or TreeBASE (Sanderson et al., 1994). Such services fulfill a crucial role in ensuring transparency and reproducibility in phylogenetic research. However, digital tree data files often require some editing (e.g. rerooting) to improve the accuracy and reusability of the phylogenetic statements. Furthermore, establishing the mapping between tip labels used in a tree and taxa in a single common taxonomy dramatically improves the ability of other researchers to reuse phylogenetic estimates. As the process of curating a published phylogenetic estimate is not error-free, retaining a full record of the provenance of edits to a tree is crucial for openness, allowing editors to receive credit for their work and making errors introduced during curation easier to correct.Results: Here, we report the development of software infrastructure to support the open curation of phylogenetic data by the community of biologists. The backend of the system provides an interface for the standard database operations of creating, reading, updating and deleting records by making commits to a git repository. The record of the history of edits to a tree is preserved by git’s version control features. Hosting this data store on GitHub (http://github.com/) provides open access to the data store using tools familiar to many developers. We have deployed a server running the ‘phylesystem-api’, which wraps the interactions with git and GitHub. The Open Tree of Life project has also developed and deployed a JavaScript application that uses the phylesystem-api and other web services to enable input and curation of published phylogenetic statements.Availability and implementation: Source code for the web service layer is available at https://github.com/OpenTreeOfLife/phylesystem-api. The data store can be cloned from: https://github.com/OpenTreeOfLife/phylesystem. A web application that uses the phylesystem web services is deployed at http://tree.opentreeoflife.org/curator. Code for that tool is available from https://github.com/OpenTreeOfLife/opentree.Contact: mtholder@gmail.com},
+  number = {17},
+  urldate = {2017-10-20},
+  journal = {Bioinformatics},
+  author = {McTavish, Emily Jane and Hinchliff, Cody E. and Allman, James F. and Brown, Joseph W. and Cranston, Karen A. and Holder, Mark T. and Rees, Jonathan A. and Smith, Stephen A.},
+  month = sep,
+  year = {2015},
+  pages = {2794--2800},
+  file = {Full Text PDF:/home/ejmctavish/.zotero/zotero/n7lb4sp9.default/zotero/storage/QG8IM5KP/McTavish et al. - 2015 - Phylesystem a git-based data store for community-.pdf:application/pdf;Snapshot:/home/ejmctavish/.zotero/zotero/n7lb4sp9.default/zotero/storage/8KM392HQ/btv276.html:text/html}
+}
+
+@article{mctavish_how_2017,
+  title = {How and {Why} to {Build} a {Unified} {Tree} of {Life}},
+  volume = {39},
+  issn = {1521-1878},
+  url = {https://onlinelibrary.wiley.com/doi/abs/10.1002/bies.201700114},
+  doi = {10.1002/bies.201700114},
+  language = {en},
+  number = {11},
+  urldate = {2018-04-10},
+  journal = {BioEssays},
+  author = {McTavish, Emily Jane and Drew, Bryan T. and Redelings, Ben and Cranston, Karen A.},
+  month = nov,
+  year = {2017},
+  file = {Full Text PDF:/home/ejmctavish/.zotero/zotero/n7lb4sp9.default/zotero/storage/H4QUTY4S/McTavish et al. - 2017 - How and Why to Build a Unified Tree of Life.pdf:application/pdf;Snapshot:/home/ejmctavish/.zotero/zotero/n7lb4sp9.default/zotero/storage/I35KQEQC/bies.html:text/html}
+}
+
+@article{rees_automated_2017,
+  title = {Automated assembly of a reference taxonomy for phylogenetic data synthesis},
+  issn = {1314-2828},
+  url = {https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5515096/},
+  doi = {10.3897/BDJ.5.e12581},
+  number = {5},
+  urldate = {2018-04-10},
+  journal = {Biodiversity Data Journal},
+  author = {Rees, Jonathan A. and Cranston, Karen},
+  month = may,
+  year = {2017},
+  pmid = {28765728},
+  pmcid = {PMC5515096},
+  file = {PubMed Central Full Text PDF:/home/ejmctavish/.zotero/zotero/n7lb4sp9.default/zotero/storage/PIA33BDY/Rees and Cranston - 2017 - Automated assembly of a reference taxonomy for phy.pdf:application/pdf}
+}
+
+@article{sukumaran_dendropy_2010,
+  title = {{DendroPy}: a {Python} library for phylogenetic computing},
+  volume = {26},
+  issn = {1367-4803},
+  shorttitle = {{DendroPy}},
+  url = {https://academic.oup.com/bioinformatics/article/26/12/1569/287181},
+  doi = {10.1093/bioinformatics/btq228},
+  abstract = {Abstract. Summary: DendroPy is a cross-platform library for the Python programming language that provides for object-oriented reading, writing, simulation and},
+  language = {en},
+  number = {12},
+  urldate = {2020-07-20},
+  journal = {Bioinformatics},
+  author = {Sukumaran, Jeet and Holder, Mark T.},
+  month = jun,
+  year = {2010},
+  note = {Publisher: Oxford Academic},
+  pages = {1569--1571},
+  file = {Full Text PDF:/home/ejmctavish/.zotero/zotero/n7lb4sp9.default/zotero/storage/IVHQJ9ZY/Sukumaran and Holder - 2010 - DendroPy a Python library for phylogenetic comput.pdf:application/pdf;Snapshot:/home/ejmctavish/.zotero/zotero/n7lb4sp9.default/zotero/storage/VDQSUIWZ/287181.html:text/html}
+}
+
+@article{opentreeoflife_open_2019,
+  title = {Open {Tree} of {Life} {Synthetic} {Tree}},
+  url = {https://zenodo.org/record/3937742#.XxXISZJKhhH},
+  abstract = {Open Tree of Life aims to construct a comprehensive, dynamic and digitally-available tree of life by synthesizing published phylogenetic trees along with taxonomic data.},
+  urldate = {2020-07-20},
+  author = {OpenTreeOfLife and Benjamin Redelings and Luna Luisa Sanchez Reyes and Karen A. Cranston and Jim Allman and Mark T. Holder and Emily Jane McTavish},
+  month = dec,
+  year = {2019},
+  doi = {10.5281/zenodo.3937742},
+  note = {Publisher: Zenodo},
+  file = {Zenodo Snapshot:/home/ejmctavish/.zotero/zotero/n7lb4sp9.default/zotero/storage/63DFWJRH/3937742.html:text/html}
+}
+
+@article{opentreeoflife_open_2019-1,
+  title = {Open {Tree} of {Life} {Taxonomy}},
+  url = {https://zenodo.org/record/3937751#.XxXIwJJKhhE},
+  abstract = {The reference taxonomy is an algorithmic combination of several source taxonomies. For code, see the source code repository. Version 3.2 draft 9 was generated using commit 618bab0ca3.},
+  urldate = {2020-07-20},
+  author = {OpenTreeofLife and Karen A. Cranston and Benjamin Redelings and Luna Luisa Sanchez Reyes and Jim Allman and Emily Jane McTavish and Mark T. Holder},
+  month = oct,
+  year = {2019},
+  doi = {10.5281/zenodo.3937751},
+  note = {Publisher: Zenodo},
+  file = {Zenodo Snapshot:/home/ejmctavish/.zotero/zotero/n7lb4sp9.default/zotero/storage/V8N696V9/3937751.html:text/html}
+}

examples/about.py

Lines changed: 18 additions & 9 deletions
@@ -3,14 +3,23 @@
 import sys
 from opentree import OTCommandLineTool
 
-cli = OTCommandLineTool(usage='Display taxonomy and synthetic tree information '
+def main(arg_list, out, list_for_results=None):
+    cli = OTCommandLineTool(usage='Display taxonomy and synthetic tree information '
                         'returned by the "about" API calls.')
-OT = cli.parse_cli()[0]
+    OT = cli.parse_cli(arg_list)[0]
+    about = OT.about()
+    if list_for_results is not None:
+        list_for_results.append(about)
+    if out is None:
+        return 0
+    for k in about.keys():
+        call_record = about[k]
+        if call_record:
+            print(k)
+            call_record.write_response(out)
+            print('')
+    return 0
 
-about = OT.about()
-for k in about.keys():
-    call_record = about[k]
-    if call_record:
-        print(k)
-        call_record.write_response(sys.stdout)
-        print('')
+if __name__ == '__main__':
+    rc = main(sys.argv, sys.stdout)
+    sys.exit(rc)
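
Note on the `examples/about.py` change: moving the script body into `main(arg_list, out, list_for_results=None)` lets a test call the example in-process, capture its output in a buffer, and inspect the collected results. The sketch below illustrates that pattern only; it is not the project's test code. `StubCallRecord`, `fake_about`, and the `about_fn` parameter are hypothetical stand-ins introduced here so the snippet runs without the `opentree` package or network access, and the dictionary keys are made up. Unlike the committed code, which prints the key with `print`, this sketch also writes the key to `out` so that everything lands in the buffer.

```python
import io


class StubCallRecord:
    """Hypothetical stand-in for the call records returned by OT.about()."""

    def __init__(self, text):
        self.text = text

    def __bool__(self):
        # An empty record is falsy, so main() skips it.
        return bool(self.text)

    def write_response(self, out):
        out.write(self.text + "\n")


def fake_about():
    """Hypothetical replacement for OT.about(): canned data, no network."""
    return {
        "taxonomy_about": StubCallRecord('{"source": "stub"}'),
        "synth_tree_about": StubCallRecord(""),  # falsy, skipped on output
    }


def main(arg_list, out, list_for_results=None, about_fn=fake_about):
    """Same control flow as the refactored example, with the API call injected."""
    about = about_fn()
    if list_for_results is not None:
        list_for_results.append(about)
    if out is None:
        return 0
    for k in about.keys():
        call_record = about[k]
        if call_record:
            out.write(k + "\n")
            call_record.write_response(out)
            out.write("\n")
    return 0


# Capture the output in memory instead of writing to the terminal.
buffer = io.StringIO()
results = []
rc = main(["about.py"], buffer, list_for_results=results)
```

Passing `out=None` together with a `list_for_results` list returns the raw results without writing anything, which is the path a test harness would likely use.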
