A database for genome-wide collections of gene phylogenies.
Comparative Genomics Group at CRG (Barcelona, Spain).
PhylomeDB (see reference below) is a public database for complete collections of gene phylogenies (phylomes) and for phylogeny-based orthology and paralogy predictions. It allows users to explore the evolutionary history of genes by visualizing phylogenetic trees and alignments. All the phylogenetic trees hosted in PhylomeDB have been pre-computed using a high-quality phylogenetic pipeline that includes alignment refinement and trimming, model testing, and Maximum Likelihood reconstruction. Moreover, phylogeny-based orthology and paralogy predictions are provided for all genomes included in the database. For each sequence in a seed genome several trees are available: the tree reconstructed using that sequence as a seed and the model best fitting the data, additional trees reconstructed using homologs of that sequence as a seed, and trees reconstructed with different models or methods. Consistency among the predictions from all these trees is used as a measure of the accuracy of each orthology and paralogy prediction.
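The consistency idea in the last sentence of the paragraph above can be illustrated with a minimal sketch. It assumes that every tree containing a given pair of proteins yields one call, "ortholog" or "paralog", and scores the pair by the fraction of trees that agree with the majority call. The function name `consistency_score`, the input format and the scoring rule are illustrative assumptions, not the formula PhylomeDB actually uses.

```python
from collections import Counter


def consistency_score(calls):
    """Return (majority_call, fraction of trees agreeing with it).

    `calls` is a hypothetical list with one entry per tree, each either
    "ortholog" or "paralog". This is an illustrative scoring rule only,
    not PhylomeDB's own implementation.
    """
    if not calls:
        raise ValueError("at least one tree-based call is required")
    counts = Counter(calls)
    majority_call, support = counts.most_common(1)[0]
    return majority_call, support / len(calls)


# Hypothetical calls for one protein pair across four trees
# (e.g. a seed tree, two collateral trees, one tree built with another model).
calls = ["ortholog", "ortholog", "paralog", "ortholog"]
call, score = consistency_score(calls)
print(call, score)
```

Under this assumed rule, a pair supported by every tree scores 1.0, and lower values flag predictions that depend on which tree is consulted.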
Currently, PhylomeDB hosts 8 public phylomes comprising 253,297 trees and 79,570 alignments, and provides phylogeny-based orthology and paralogy predictions for almost 5,000,000 proteins in 609 fully sequenced genomes.
You can search for a protein using PhylomeDB's quick search box. Protein codes that work in PhylomeDB include the internal Phylome ID and Ensembl, SwissProt, SwissProt TrEMBL or NCBI NonRedundant IDs. Once a search retrieves the entry or entries that match the query, each entry page has the following parts (see image below):
- Basic protein information: Information about the query protein that includes the protein sequence and any comments retrieved from the original proteome database. There may be more than one entry for a given protein ID if two or more proteomes of the same species have been used in PhylomeDB.
- Seed trees: All trees that were generated using the query protein as the seed.
- Collateral trees: All trees where the query protein appears but was not used as seed.
- Orthology prediction: Orthology predictions are provided for all the phylomes in which the query protein appears. The results are formatted into tables: the first table combines the orthology predictions from all phylomes, while the remaining tables separate those predictions by phylome.
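The distinction between seed and collateral trees, and the split of orthology predictions into one overall table plus per-phylome tables, can be sketched as follows. The record layout (dictionaries with `seed`, `members` and `phylome` keys), the function names and all identifiers are hypothetical and do not reflect PhylomeDB's internal schema.

```python
from collections import defaultdict


def split_trees(query, trees):
    """Partition the trees containing `query` into seed and collateral trees."""
    seed, collateral = [], []
    for tree in trees:
        if tree["seed"] == query:
            seed.append(tree)          # query was the seed of this tree
        elif query in tree["members"]:
            collateral.append(tree)    # query appears, but another protein was the seed
    return seed, collateral


def orthologs_by_phylome(predictions):
    """Group (phylome, ortholog) pairs into an overall set and per-phylome sets."""
    overall = set()
    per_phylome = defaultdict(set)
    for phylome, ortholog in predictions:
        overall.add(ortholog)
        per_phylome[phylome].add(ortholog)
    return overall, dict(per_phylome)


# Hypothetical example data.
trees = [
    {"seed": "P1", "members": {"P1", "P2", "P3"}, "phylome": "A"},
    {"seed": "P2", "members": {"P1", "P2"}, "phylome": "A"},
    {"seed": "P4", "members": {"P4", "P5"}, "phylome": "B"},
]
seed, collateral = split_trees("P1", trees)
print(len(seed), len(collateral))

overall, per_phylome = orthologs_by_phylome([("A", "P2"), ("A", "P3"), ("B", "P2")])
print(sorted(overall), sorted(per_phylome))
```

Here the overall set plays the role of the first table and each per-phylome set corresponds to one of the remaining tables.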
PhylomeDB also allows searches through the Blast Search! option. Clicking the Blast search button opens a box in which a protein sequence can be entered, and a standard Blast search is then executed.
The image below shows the use of Blast to query PhylomeDB. The Blast search only returns proteins that are homologous to the query protein and that have been used as the seed protein in one of the phylomes. The results are formatted into a table in which each entry shows the name of the seed protein, the phylomes in which it appears and the statistics that result from the Blast search. Selecting any of the resulting entries opens a page like the one described above.
You can find an exhaustive manual for PhylomeDB on the PhylomeDB Help & Support wiki page.
- Huerta-Cepas, J et al. (2008) PhylomeDB: a database for genome-wide collections of gene phylogenies. Nucleic Acids Res. 36:D491-6.