The PorToL – Porifera Tree of Life project is an interdisciplinary, multi-organizational effort to define the family tree for the phylum Porifera (commonly known as sponges), which contains 8,122 valid species, with an estimated 4,000 more awaiting discovery and/or description. The CCL lab is participating in PorToL by building user interfaces and processing pipelines that will allow researchers to submit genetic sequencer data for processing, cataloging, and sharing among the team.

The current work centers on building a processing pipeline that takes raw sequencer data as input and performs quality scoring on the bases (Phred) and sequence alignment (Phrap) to produce a contig, or consensus sequence, of the gene or specimen that was sequenced. This contig is then compared to known sequences (BLAST), both within the existing project data and among the publicly available sequences in GenBank, to determine whether the gene or specimen represents a known species or a new discovery. MUSCLE is then used to do a final alignment of the contigs, and GARLI assembles a tree from these alignments.
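
The stage ordering described above can be sketched as a simple chain of functions. This is only an illustration of the data flow, not the project's implementation: the `PipelineState` record, the `make_stage` helper, and the names given to the intermediate data products are all hypothetical, and the placeholder stages do not invoke Phred, Phrap, BLAST, MUSCLE, or GARLI.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PipelineState:
    """Hypothetical record of the current data product and how it was reached."""
    artifact: str
    history: List[str] = field(default_factory=list)


Stage = Callable[[PipelineState], PipelineState]


def make_stage(tool: str, produces: str) -> Stage:
    """Build a placeholder stage; a real stage would run the external tool."""
    def stage(state: PipelineState) -> PipelineState:
        step = f"{tool}: {state.artifact} -> {produces}"
        return PipelineState(artifact=produces, history=state.history + [step])
    return stage


# Stage order follows the text; the product names are illustrative assumptions.
STAGES: List[Stage] = [
    make_stage("Phred", "quality-scored bases"),
    make_stage("Phrap", "contig"),
    make_stage("BLAST", "contig with similarity hits"),
    make_stage("MUSCLE", "multiple alignment"),
    make_stage("GARLI", "tree"),
]


def run_pipeline(raw_input_name: str) -> PipelineState:
    """Pass the input through every stage in order and return the final state."""
    state = PipelineState(artifact=raw_input_name)
    for stage in STAGES:
        state = stage(state)
    return state


if __name__ == "__main__":
    for line in run_pipeline("raw sequencer data").history:
        print(line)
```

Keeping each stage behind the same function signature means a placeholder could later be swapped for a wrapper around the real tool without changing the surrounding loop.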

We hope that the processor-intensive portions of this process (BLAST, and possibly MUSCLE and GARLI) can be distributed onto the grid by building on the CCL lab's past and present work in this area.
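
One common way to prepare a comparison step like BLAST for distribution is to split the query sequences into independent batches, each of which could be handed to a separate worker. The sketch below shows only that splitting step, under the assumption that contigs are stored as FASTA text; the function names `parse_fasta` and `batches` are hypothetical, and nothing here submits work to a grid or runs BLAST.

```python
from typing import Iterator, List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def parse_fasta(text: str) -> List[Record]:
    """Parse FASTA text into (header, sequence) pairs, joining wrapped lines."""
    records: List[Record] = []
    header = None
    seq_lines: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(seq_lines)))
            header = line[1:]
            seq_lines = []
        elif header is not None:
            seq_lines.append(line)
    if header is not None:
        records.append((header, "".join(seq_lines)))
    return records


def batches(records: List[Record], batch_size: int) -> Iterator[List[Record]]:
    """Yield consecutive groups of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


if __name__ == "__main__":
    sample = ">contig_a\nACGT\nAC\n>contig_b\nGG\n>contig_c\nTT\n"
    for number, group in enumerate(batches(parse_fasta(sample), 2), start=1):
        print(number, [header for header, _ in group])
```

The batch size is a tuning choice: smaller batches spread work more evenly across workers, while larger batches reduce per-job overhead.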


Publications and Presentations

None at this time.

Project Documentation

Glossary of PorToL Processing Pipeline Terms

PorToL Lifecycle

Toolset Documentation

PorToL – PHRED Documentation

PorToL – PHRAP Documentation (includes cross_match)

Additional links and documents