From EcoliWiki






Detect remote homologies by comparison of hidden Markov models


Dep. of Protein Evolution at the Max-Planck Institute for Developmental Biology


Short Description


Home Page

HHsearch is a program for protein sequence searching that is free for non-commercial use.[1] HHpred is a free protein function and protein structure prediction server based on the HHsearch method.[2] HHpred/HHsearch are among the most popular methods for protein structure prediction and the detection of remotely related sequences, having been cited over 380 times.

Biologists frequently perform sequence searches to infer the function of an unknown protein from its sequence. For this purpose, the protein's sequence is compared to the sequences of other proteins in public databases, and its function is deduced from those of the most similar sequences. Often, such a search finds no sequences with annotated functions. In this case, more sensitive methods are required to identify more remotely related proteins or protein families. From these relationships, hypotheses about the protein's functions, structure, and domain composition can be inferred. HHsearch searches databases with a query protein sequence. The HHpred server and the HHsearch software package offer many popular, regularly updated databases, such as the Protein Data Bank, as well as the InterPro, Pfam, COG, and SCOP databases.

HHpred/HHsearch belongs to the class of profile-profile comparison tools, which includes the most sensitive sequence search methods to date.[3][4][5][1] These tools represent both the query sequence and the database sequences by sequence profiles, also called position-specific scoring matrices (PSSMs). Profiles are calculated from a multiple sequence alignment of related sequences, which are typically collected using the PSI-BLAST program from the National Center for Biotechnology Information (NCBI). A profile is a matrix that contains, for each position in the query sequence, a similarity score for each of the 20 amino acids. These scores are calculated from the frequencies of the amino acids at the corresponding positions in the multiple sequence alignment. Because profiles contain much more information than a single sequence (e.g. the position-specific degree of conservation), profile-profile comparison methods are much more powerful than sequence-sequence comparison methods like BLAST or profile-sequence comparison methods like PSI-BLAST.[3]
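
The calculation of a profile from alignment frequencies can be sketched in a few lines of Python. This is only an illustration, not the procedure used by PSI-BLAST or HHsearch: the function name `profile_from_alignment`, the uniform background frequency, and the simple additive pseudocount are all assumptions made for the example, and real tools use sequence weighting and more elaborate pseudocounts.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def profile_from_alignment(alignment, pseudocount=1.0):
    """Return one dict of log-odds scores (in bits) per alignment column.

    Assumptions for this sketch: a uniform background frequency of 1/20
    and an additive pseudocount; gap characters are simply ignored.
    """
    background = 1.0 / len(AMINO_ACIDS)
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    profile = []
    for col in range(length):
        # Count the amino acids observed in this column, skipping gaps.
        counts = Counter(row[col] for row in alignment if row[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        scores = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount) / total
            # Positive score: the residue is more frequent than background.
            scores[aa] = math.log2(freq / background)
        profile.append(scores)
    return profile


# Toy alignment of four related sequences ("-" marks a gap).
alignment = ["MKVL", "MKIL", "MRVL", "M-VL"]
profile = profile_from_alignment(alignment)
for position, scores in enumerate(profile, start=1):
    best = max(scores, key=scores.get)
    print(position, best, round(scores[best], 2))
```

In the toy alignment, the fully conserved first column gives methionine a high positive score and every other amino acid a negative one, which is the position-specific conservation signal that a single sequence cannot carry.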

HHpred/HHsearch represents query and database proteins by profile hidden Markov models (HMMs), an extension of sequence profiles that also records position-specific amino acid insertion and deletion frequencies. HHsearch searches a database of HMMs with a query HMM. Before searching the actual database of HMMs, HHsearch/HHpred builds a multiple sequence alignment of related sequences using CSI-BLAST, a context-specific version of PSI-BLAST. From this alignment, a profile HMM is calculated. The databases contain HMMs that are precalculated in the same fashion using PSI-BLAST. The output of HHpred and HHsearch is a ranked list of database matches (including E-values and probabilities for a true relationship) and the pairwise query-database sequence alignments. A search through the PDB database of proteins with solved 3D structure takes a few minutes. If a significant match with a protein of known structure (a "template") is found in the PDB database, HHpred allows the user to build a homology model with the MODELLER software, starting from the pairwise query-template alignment.
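
Comparing a query HMM with a database HMM comes down, at each pair of aligned positions, to scoring two amino acid distributions against each other. A log-sum-of-odds score is one way to do this, and the HMM-HMM comparison paper [1] describes a column score of this form. The sketch below is a simplified illustration only: the names `column_score` and `peaked_column` and the uniform background are assumptions made for the example, and it leaves out the insertion and deletion transitions and everything else a full HMM-HMM alignment involves.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_score(query_col, template_col, background):
    """Log-sum-of-odds score (in bits) between two profile columns.

    Each argument maps every amino acid to a probability. The score is
    positive when both columns favor the same residues more strongly
    than the background would predict.
    """
    odds = sum(
        query_col[aa] * template_col[aa] / background[aa] for aa in background
    )
    return math.log2(odds)


def peaked_column(residue, peak=0.81):
    """Hypothetical helper: a column dominated by a single residue."""
    rest = (1.0 - peak) / (len(AMINO_ACIDS) - 1)
    return {aa: (peak if aa == residue else rest) for aa in AMINO_ACIDS}


# Assumption: uniform background frequencies, for illustration only.
background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}

same = column_score(peaked_column("L"), peaked_column("L"), background)
different = column_score(peaked_column("L"), peaked_column("W"), background)
uninformative = column_score(peaked_column("L"), dict(background), background)

print(f"same residue conserved:      {same:.2f}")
print(f"different residue conserved: {different:.2f}")
print(f"against a flat column:       {uninformative:.2f}")
```

Two columns that conserve the same residue score well above zero, two columns that conserve different residues score below zero, and a comparison with an uninformative column scores approximately zero. Summing such scores along an alignment, together with gap penalties, is what lets profile-profile methods detect relationships that single-sequence comparisons miss.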

Applications of HHpred/HHsearch include protein structure prediction, function prediction, domain prediction, domain boundary prediction, and evolutionary classification of proteins. In the CASP7 benchmark experiment, HHpred5 was ranked 2nd out of 68 automatic structure prediction servers, while being more than 50 times faster than the best 20 servers.[6]






User notes

See Also

CASP website: http://predictioncenter.org/

Homology detection of outer membrane proteins: HHomp



  1. Söding, J (2005) Protein homology detection by HMM-HMM comparison. Bioinformatics 21 951-60
  2. Söding, J et al. (2005) The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res. 33 W244-8
  3. Jaroszewski, L et al. (2000) Improving the quality of twilight-zone alignments. Protein Sci. 9 1487-96
  4. Sadreyev, RI et al. (2003) Profile-profile comparisons by COMPASS predict intricate homologies between protein families. Protein Sci. 12 2262-72
  5. Dunbrack, RL Jr (2006) Sequence comparison and protein structure prediction. Curr. Opin. Struct. Biol. 16 374-84
  6. Battey, JN et al. (2007) Automated server predictions in CASP7. Proteins 69 Suppl 8 68-82