CollecTF

From EcoliWiki

<protect>

Link/URL:

http://collectf.umbc.edu

What:

CollecTF compiles data on experimentally validated, naturally occurring TF-binding sites across the Bacteria domain, placing a strong emphasis on the transparency of the curation process, the quality and availability of the stored data, and fully customizable access to its records.

Who:

CollecTF is developed and maintained by the Erill Lab at UMBC, in collaboration with the NCBI RefSeq database.

Updates:

CollecTF is under continuous revision, operating on a dual curation process that combines in-house curation by undergraduate students and direct submissions by authors.

Upcoming events:

CollecTF can be used for team-based, active-learning activities in Genetics and Biochemistry courses. Please email the CollecTF team for further information.

Web Services:

Data in CollecTF is periodically pushed to NCBI RefSeq complete genomes.

</protect>


About CollecTF

CollecTF compiles data on experimentally validated, naturally occurring TF-binding sites across the Bacteria domain, placing a strong emphasis on the transparency of the curation process, the quality and availability of the stored data, and fully customizable access to its records. CollecTF integrates multiple sources of data automatically and openly, allowing users to dynamically redefine binding motifs and their experimental support base. Most importantly, CollecTF entries are periodically submitted to NCBI for integration into RefSeq complete genome records as db_xref link-outs. Experimentally validated binding sites for transcription factors are therefore embedded in genome annotations with bound_moiety tags, providing information on the available experimental evidence, its published sources, and a link to the full curation record on CollecTF. This approach maximizes the visibility of transcriptional regulation data by enriching the annotation of RefSeq files with regulatory information, addressing a decades-old deficit in the NCBI annotation pipeline. Hence, CollecTF operates both as a standalone database and as an open portal for entering regulatory information into RefSeq genomes, generating a sustainable model that combines direct author submissions with in-house validation and curation of published literature.

Content

CollecTF relies exclusively on a single data source: published experimental evidence on TFBS. Sites may be supported by evidence of binding and/or evidence of their regulation of gene expression. In silico evidence is also collected, but only as a complement to experimental evidence. CollecTF defines three site types (motif-associated, variable motif-associated and non-motif-associated), based on the availability of known binding patterns associated with reported sites.
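As a rough illustration of the content model described above, a site record could be represented as follows. This is only a sketch: the class and field names (SiteRecord, binding_evidence and so on) are hypothetical and do not reflect CollecTF's actual schema, and the example sequence and technique names are illustrative values, not curated data.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class SiteType(Enum):
    """The three site types named in the text."""
    MOTIF_ASSOCIATED = "motif-associated"
    VARIABLE_MOTIF_ASSOCIATED = "variable motif-associated"
    NON_MOTIF_ASSOCIATED = "non-motif-associated"


@dataclass
class SiteRecord:
    """Hypothetical record for one reported TF-binding site."""
    tf: str
    species: str
    sequence: str
    site_type: SiteType
    binding_evidence: List[str] = field(default_factory=list)
    expression_evidence: List[str] = field(default_factory=list)
    in_silico_evidence: List[str] = field(default_factory=list)

    def has_experimental_support(self) -> bool:
        # In silico evidence alone does not count: it only
        # complements binding or expression evidence.
        return bool(self.binding_evidence or self.expression_evidence)


# Illustrative values only.
site = SiteRecord(
    tf="LexA",
    species="Escherichia coli",
    sequence="CTGTATATATATACAG",
    site_type=SiteType.MOTIF_ASSOCIATED,
    binding_evidence=["EMSA"],
    in_silico_evidence=["motif search"],
)
predicted_only = SiteRecord(
    tf="LexA",
    species="Escherichia coli",
    sequence="CTGTATATATATACAG",
    site_type=SiteType.MOTIF_ASSOCIATED,
    in_silico_evidence=["motif search"],
)
```

Here site.has_experimental_support() is true, while predicted_only would not qualify for inclusion under the rule stated above.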

Using CollecTF

Browsing

CollecTF can be browsed at three different levels:

NCBI Taxonomy - Browse the database using up-to-date NCBI taxonomy. Click on a taxonomic unit to expand it, see its associated information, and link out to species-specific reports.

TF family - Browse the database by transcription factor families. Click on each family to expand it and see its associated information, and link out to TF-family or TF-based reports.

Technique - Browse CollecTF by experimental evidence. Click on different groups of experimental techniques to expand them, see their associated information, and link out to technique-specific reports.

Searching

Access to TFBS data in CollecTF is fully customizable. Users can select the specific types of experimental support for reported TFBS, aggregate information for multiple clades or TFs and compare the dynamically computed TF-binding motifs using different statistics.
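To make the idea of a dynamically computed motif concrete, the sketch below builds a position frequency matrix from a set of aligned sites and scores it with one common statistic, information content. It assumes equal-length sites, a uniform background and no small-sample correction; the function names are hypothetical and this is not necessarily how CollecTF computes its motifs.

```python
import math
from collections import Counter

BASES = "ACGT"


def position_frequency_matrix(sites):
    """Return one {base: frequency} dict per aligned position."""
    sites = [s.upper() for s in sites]
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for column in zip(*sites):
        counts = Counter(column)
        matrix.append({b: counts[b] / len(sites) for b in BASES})
    return matrix


def information_content(pfm):
    """Total information content in bits (uniform background assumed)."""
    total = 0.0
    for freqs in pfm:
        entropy = -sum(p * math.log2(p) for p in freqs.values() if p > 0)
        total += 2.0 - entropy  # 2 bits is the maximum per DNA position
    return total


# Toy alignment: three conserved positions and one variable position.
toy_sites = ["ACGT", "ACGA", "ACGT", "ACGC"]
toy_pfm = position_frequency_matrix(toy_sites)
toy_ic = information_content(toy_pfm)
```

Recomputing the matrix from a different subset of sites (for example, only those backed by a given technique) is what makes the motif dynamic in this sense.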

To search, select a taxonomic unit (e.g. the Vibrio genus), a transcription factor family or instance (e.g. LexA) and the set of experimental techniques that reported sites should be backed by, then click Search.

Reports and exporting

Search results can be viewed as individual reports (one report per TF/species) or as ensemble reports (multiple TFs/species). Click Export to access the data in machine-readable format.
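The search and export steps described above can be mimicked on local data with a few lines of Python. This is a sketch under stated assumptions: the record layout, the search and export_csv function names and the sample records are all hypothetical, taxon matching is reduced to a species-name prefix test rather than a real taxonomy lookup, and CSV is just one possible machine-readable format.

```python
import csv
import io

# Made-up records for illustration; not curated CollecTF data.
RECORDS = [
    {"tf": "LexA", "species": "Vibrio cholerae",
     "sequence": "ACTGTATATACTCACAGT", "techniques": ["EMSA"]},
    {"tf": "LexA", "species": "Escherichia coli",
     "sequence": "TACTGTATATATATACAGTA", "techniques": ["EMSA"]},
    {"tf": "Fur", "species": "Vibrio cholerae",
     "sequence": "GATAATGATAATCATTATC", "techniques": ["DNase footprinting"]},
]


def search(records, taxon=None, tf=None, techniques=None):
    """Keep records matching the taxon prefix, TF and any listed technique."""
    wanted = set(techniques or [])
    hits = []
    for rec in records:
        if taxon and not rec["species"].startswith(taxon):
            continue
        if tf and rec["tf"] != tf:
            continue
        if wanted and not wanted & set(rec["techniques"]):
            continue
        hits.append(rec)
    return hits


def export_csv(records):
    """Serialize records to CSV text, joining techniques with ';'."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["tf", "species", "sequence", "techniques"],
        extrasaction="ignore",
        lineterminator="\n",
    )
    writer.writeheader()
    for rec in records:
        writer.writerow(dict(rec, techniques=";".join(rec["techniques"])))
    return buffer.getvalue()


hits = search(RECORDS, taxon="Vibrio", tf="LexA", techniques=["EMSA"])
csv_text = export_csv(hits)
```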

Usage examples


Other sites with related content

Technology

Accurate curation of TFBS data is accomplished by using a well-defined submission pipeline in which submitters detail relevant data on the TF and the reported TFBS. Mapping between reported data and a reference genome in the NCBI RefSeq database is established and used to validate the precise location of reported sites and their regulatory effects on genes.
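The mapping step can be pictured as locating each reported site sequence in the reference genome on either strand. The sketch below does this with exact string matching only; it is an assumption about the general approach, not CollecTF's actual pipeline, and the function names and the toy genome string are hypothetical. Coordinates are 0-based and half-open.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def locate_site(genome, site):
    """Return sorted (start, end, strand) tuples for exact matches."""
    genome = genome.upper()
    site = site.upper()
    queries = {site: 1}
    rc = reverse_complement(site)
    if rc != site:  # palindromic sites are reported once, on strand +1
        queries[rc] = -1
    hits = []
    for query, strand in queries.items():
        start = genome.find(query)
        while start != -1:
            hits.append((start, start + len(query), strand))
            start = genome.find(query, start + 1)
    return sorted(hits)


# Toy genome containing the site once on each strand.
toy_genome = "AAACTGTATGGGCATACAGTTT"
toy_hits = locate_site(toy_genome, "CTGTATG")
```

A reported site with no hit, or with several, would need manual review before its location and nearby genes could be recorded.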

Data in CollecTF is periodically submitted to the NCBI RefSeq database, where it is integrated into the annotation of complete genome sequences. CollecTF entries are annotated as protein_bind features within GenBank format files. These features contain a link to the full CollecTF record, as well as direct links to the PubMed articles providing experimental support for the TFBS.
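For readers who want to pull these annotations out of a GenBank flat file, the following sketch extracts protein_bind features and their qualifiers using only the standard library. It is deliberately minimal: it assumes single-line locations and qualifier values, and the sample feature table (including the coordinates and the db_xref identifier) is made up for illustration. For real files, an established GenBank parser is a safer choice.

```python
import re

# Feature keys start in column 6; qualifiers start in column 22.
FEATURE_LINE = re.compile(r"^ {5}(\S+)\s+(\S+)$")
QUALIFIER_LINE = re.compile(r'^ {21}/(\w+)="?([^"]*)"?$')


def protein_bind_features(genbank_text):
    """Return protein_bind features as dicts with location and qualifiers."""
    features = []
    current = None
    for line in genbank_text.splitlines():
        feature = FEATURE_LINE.match(line)
        if feature:
            current = {"key": feature.group(1),
                       "location": feature.group(2),
                       "qualifiers": {}}
            features.append(current)
            continue
        qualifier = QUALIFIER_LINE.match(line)
        if qualifier and current is not None:
            name, value = qualifier.groups()
            current["qualifiers"].setdefault(name, []).append(value)
    return [f for f in features if f["key"] == "protein_bind"]


def _feature(key, location):
    return " " * 5 + key.ljust(16) + location


def _qualifier(text):
    return " " * 21 + text


# Hypothetical feature table fragment; all values are illustrative.
SAMPLE = "\n".join([
    _feature("protein_bind", "complement(1200..1219)"),
    _qualifier('/bound_moiety="LexA"'),
    _qualifier('/db_xref="CollecTF:EXAMPLE_0001"'),
    _feature("gene", "1300..2100"),
    _qualifier('/gene="exampleA"'),
])

sites = protein_bind_features(SAMPLE)
```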

Web Services/API

Discussion

References

[1] Wikipedia

  1. Kiliç, S. et al. (2014) CollecTF: a database of experimentally validated transcription factor-binding sites in Bacteria. Nucleic Acids Res. 42:D156-D160.

External Links

CollecTF: http://collectf.umbc.edu

Discussion of CollecTF on other websites