COUGER currently supports four file formats for uploading the DNA sequences bound by the considered transcription factors (TF1 and TF2): ENCODE narrowPeak, ENCODE broadPeak, BED and FASTA.
See the documentation page to get an idea of input that works well for COUGER and to see sample sequences for each format.
Please use only uncompressed files!
If the selected format is BED-like (narrowPeak, broadPeak or BED), a reference genome must be selected. This online version of COUGER provides genomes of three organisms: human, mouse and Drosophila melanogaster, each with several assemblies.
In the case of the FASTA format, this option is inactive.
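As a rough illustration of how the four formats differ, the sketch below guesses the format from the first data record of a file and checks for gzip compression. It is not part of COUGER; the function names are hypothetical. It assumes tab-separated records and relies on the standard column counts (10 for narrowPeak, 9 for broadPeak, at least 3 for BED), so a plain BED file that happens to have 9 or 10 columns would be misclassified.

```python
# Hypothetical helpers, not part of COUGER: a heuristic pre-upload check.

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip file


def looks_gzip_compressed(head: bytes) -> bool:
    """Return True if the leading bytes of a file look like gzip data."""
    return head.startswith(GZIP_MAGIC)


def guess_format(first_record: str) -> str:
    """Guess the input format from the first data record (not a header line).

    Assumption: fields are tab-separated and the column count alone is
    enough to tell the BED-like formats apart.
    """
    line = first_record.rstrip("\r\n")
    if line.startswith(">"):
        return "FASTA"
    n_cols = len(line.split("\t"))
    if n_cols == 10:
        return "narrowPeak"
    if n_cols == 9:
        return "broadPeak"
    if n_cols >= 3:
        return "BED"
    return "unknown"
```

For example, reading the first two bytes of a file in binary mode and passing them to `looks_gzip_compressed` indicates whether the file should be decompressed before upload.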
COUGER uses DNA sequences specific to each TF:
1) bound by TF1 but not TF2;
2) bound by TF2 but not TF1.
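The two TF-specific sets above can be sketched as follows. This is a minimal illustration, not COUGER's own procedure: it assumes that a region counts as "not bound" by the other TF when it overlaps none of that TF's regions, and it uses half-open (chrom, start, end) intervals as in BED. The names `Peak`, `overlaps` and `specific_peaks` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: (chromosome, start, end), half-open as in BED.
Peak = Tuple[str, int, int]


def overlaps(a: Peak, b: Peak) -> bool:
    """Return True if two regions share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def specific_peaks(own: List[Peak], other: List[Peak]) -> List[Peak]:
    """Keep the regions in `own` that overlap no region in `other`.

    Assumption: "bound by one TF but not the other" is approximated by
    the absence of any overlap. The quadratic scan is fine for small inputs.
    """
    return [p for p in own if not any(overlaps(p, q) for q in other)]


tf1 = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
tf2 = [("chr1", 150, 250), ("chr2", 300, 400)]

tf1_only = specific_peaks(tf1, tf2)  # bound by TF1 but not TF2
tf2_only = specific_peaks(tf2, tf1)  # bound by TF2 but not TF1
```

Here the first TF1 region is dropped because it overlaps a TF2 region on chr1, and the first TF2 region is dropped for the same reason.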
The maximum number of TF-specific sequences is limited because of time and resource constraints. This limit can be set to a value between 300 and 1000.
Please note that the running time may increase considerably as the number of input sequences grows.
Name of the first transcription factor considered. This will be used throughout the execution of COUGER, including in the name of result files. Default is 'TF1'.
Name of the second transcription factor considered. This will be used throughout the execution of COUGER, including in the name of result files. Default is 'TF2'.
For classification, COUGER computes features that reflect the DNA binding specificities of putative co-factors. These features are generated from one of two types of data sets: either PBM data (data from protein binding microarray experiments), or PWM data (position weight matrices).
This online version provides multiple feature sets: PBM or PWM data from the UniPROBE database (429 TFs), 1226 PWMs from the TRANSFAC database, 239 PWMs derived from HT-SELEX data by Jolma et al., Cell 2013, 205 PWMs from JASPAR CORE vertebrata, and 131 PWMs from the JASPAR CORE insecta database.
If the FASTA format is used, all the options are available. For the other formats, some options are removed, depending on the selected organism.
The e-mail address is used to notify you when your results are ready. Make sure that it is a valid e-mail address.
Note: The e-mail address will never be used for any purpose other than sending you the notification message containing a link to your results.