YCRD







  • Motivation of YCRD

    Transcriptional regulation is one of the major mechanisms for cells to control the timing, location, and amount of gene expression. The precise transcriptional control of gene expression is typically achieved through combinatorial regulation by cooperative transcription factors (TFs). Therefore, to understand how a gene of interest is transcriptionally regulated, it is crucial to know the cooperative TFs which function together to regulate the gene.

    The YEASTRACT database provides up-to-date information on experimentally validated regulatory associations between a TF and its target genes. By querying YEASTRACT, users can find the TFs which regulate a specific gene. However, one key piece of information is missing: users cannot tell whether these TFs function cooperatively or independently in regulating the expression of that gene. It would therefore be helpful to have a database which provides regulatory associations between cooperative TFs and their target genes. Because no such database exists in the public domain, we constructed the Yeast Combinatorial Regulation Database (YCRD).





    What is YCRD?

    In YCRD, we collected more than 2500 cooperative TF pairs predicted by 17 existing algorithms in the literature. As far as we know, this is the most comprehensive collection of predicted cooperative TF pairs in yeast. The target genes of a cooperative TF pair (e.g. TF1-TF2) are defined as the common target genes of TF1 and TF2, where the experimentally validated target genes of each TF were retrieved from the YEASTRACT database. The regulatory associations between cooperative TF pairs and their target genes are therefore biologically meaningful, because they rest on experimentally validated evidence. In YCRD, users can
    (i) search the target genes of a cooperative TF pair of interest,
    (ii) search the cooperative TF pairs which regulate a gene of interest, and
    (iii) identify important cooperative TF pairs which regulate a given set of genes.
    We believe that YCRD will be a valuable resource for yeast biologists to study combinatorial regulation of gene expression.






    The comprehensive collection of predicted cooperative TF pairs in the literature

    Many algorithms have been developed to predict cooperative TF pairs in yeast. We collected more than 2500 unique cooperative TF pairs from 17 existing algorithms (see the following table). As far as we know, this is the most comprehensive collection of predicted cooperative TF pairs in the literature.


    Computational study          # of predicted cooperative TF pairs
    Banerjee and Zhang (2003)    31
    Harbison et al. (2004)       94
    Nagamine et al. (2005)       24
    Tsai et al. (2005)           18
    Balaji et al. (2006)         3,459
    Chang et al. (2006)          55
    He et al. (2006)             30
    Wang (2006)                  14
    Yu et al. (2006)             300
    Elati et al. (2007)          20
    Datta and Zhao (2008)        25
    Chuang et al. (2009)         13
    Wang et al. (2009)           159
    Yang et al. (2010)           186
    Chen et al. (2012)           221
    Lai et al. (2014)            27
    Wu and Lai (2015)            50

    For each predicted cooperative TF pair, we also provide two additional pieces of information to help users judge its biological plausibility. The first is the list of all algorithms which predict this cooperative TF pair (e.g. TF1-TF2); the second is whether TF1 and TF2 have physical or genetic interactions recorded in the BioGRID database.
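
    The merging of predictions from several studies into a set of unique pairs, each annotated with the algorithms that predict it, can be sketched as follows. This is only an illustration, not the code behind YCRD: the function name `merge_predictions`, the study labels and the TF names are hypothetical placeholders, and the sketch assumes that a cooperative pair is unordered (TF1-TF2 is the same pair as TF2-TF1).

```python
from collections import defaultdict


def merge_predictions(predictions_by_study):
    """Map each unordered TF pair to the sorted list of studies predicting it."""
    support = defaultdict(set)
    for study, pairs in predictions_by_study.items():
        for tf_a, tf_b in pairs:
            # Assumption: pairs are unordered, so sort the two names to get
            # one canonical key for TF1-TF2 and TF2-TF1.
            key = tuple(sorted((tf_a, tf_b)))
            support[key].add(study)
    return {pair: sorted(studies) for pair, studies in support.items()}


# Placeholder input (not real predictions): two studies, one shared pair.
predictions = {
    "study_A": [("TF1", "TF2"), ("TF3", "TF4")],
    "study_B": [("TF2", "TF1")],
}
merged = merge_predictions(predictions)
for pair, studies in sorted(merged.items()):
    print(pair, studies)
```

    The number of keys in `merged` is the number of unique pairs, and the list stored under each key plays the role of the algorithm evidence for that pair.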






    Defining the target genes of a cooperative TF pair

    The YEASTRACT database uses three kinds of experimental evidence (TFB, TFR and TFB&TFR) to define the experimentally validated target genes of a TF. TFB means experimental evidence (from band-shift, foot-printing or ChIP assays) showing that a TF binds to the promoter of its target gene. TFR means experimental evidence (from detailed gene-by-gene analysis or genome-wide expression analysis) showing that a TF perturbation (knockout or over-expression) causes a significant change in the expression of its target gene. TFB&TFR means that both TFB and TFR evidence exists to support the regulatory association between a TF and its target gene. In YCRD, the target genes of a cooperative TF pair (e.g. TF1-TF2) are defined as the common target genes of TF1 and TF2, where the experimentally validated target genes of each TF were retrieved from the YEASTRACT database. The regulatory associations between cooperative TF pairs and their target genes are therefore biologically meaningful, because they rest on experimentally validated evidence.

    After collecting the cooperative TF pairs and defining the target genes of each pair, we constructed a web interface for users to query the regulatory associations (validated by TFB, TFR or TFB&TFR) between cooperative TF pairs and their target genes. The detailed statistics of YCRD can be found in the following table.
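
    The definition above amounts to a set intersection, which the following minimal sketch illustrates. It is not YCRD's implementation: the function `pair_targets`, the nested-dictionary layout and all TF and gene names are hypothetical, and the sketch assumes that the common targets are taken separately within each evidence type.

```python
def pair_targets(tf_targets, tf_a, tf_b, evidence):
    """Common target genes of tf_a and tf_b under one evidence type."""
    targets_a = tf_targets.get(tf_a, {}).get(evidence, set())
    targets_b = tf_targets.get(tf_b, {}).get(evidence, set())
    return targets_a & targets_b


# Placeholder data (not taken from YEASTRACT): targets per TF and evidence type.
tf_targets = {
    "TF1": {"TFB": {"GENE1", "GENE2", "GENE3"}, "TFR": {"GENE2", "GENE4"}},
    "TF2": {"TFB": {"GENE2", "GENE3", "GENE5"}, "TFR": {"GENE2", "GENE4"}},
}
# TFB&TFR targets of a TF are the genes supported by both kinds of evidence.
for evidence_sets in tf_targets.values():
    evidence_sets["TFB&TFR"] = evidence_sets["TFB"] & evidence_sets["TFR"]

for evidence in ("TFB", "TFR", "TFB&TFR"):
    print(evidence, sorted(pair_targets(tf_targets, "TF1", "TF2", evidence)))
```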


    Evidence validating the regulatory association    # of regulatory associations    # of cooperative TF pairs which can be queried    # of genes which can be queried
    TFB                                               63,418                          2,441                                             4,119
    TFR                                               364,360                         2,519                                             6,165
    TFB&TFR                                           6,419                           1,013                                             1,380





    Identifying important cooperative TF pairs which regulate a given set of genes

    When researchers have a set of genes (e.g. genes upregulated under a specific biological condition), they often want to know which cooperative TF pairs play important roles in regulating these genes. To meet this need, YCRD provides a tool for identifying important cooperative TF pairs which regulate a given set of genes. In YCRD, a cooperative TF pair is regarded as an important regulator if its target genes are enriched in the given set of genes. The hypergeometric distribution is used to test the statistical significance of the enrichment. The procedure for checking whether a cooperative TF pair is an important regulator for a given set of genes is as follows:

    1. Calculate the p-value for rejecting the null hypothesis ($H_0$: the cooperative TF pair’s target genes are not enriched in the given set of genes)


    $P_{value} = P(x \geq |T|) = \sum_{x \geq |T|} \frac{\binom{|S|}{x}\binom{|F|-|S|}{|G|-x}}{\binom{|F|}{|G|}}$    (1)

    S: the set of target genes of a cooperative TF pair of interest

    G: the given set of genes

    $T = S \cap G$: the set of the cooperative TF pair’s target genes which are also in the given set of genes

    F: the set of all genes in the yeast genome

    $|G|$: the number of genes in set G (and likewise $|S|$, $|T|$ and $|F|$ for sets S, T and F)

    2. Adjust this p-value with the Bonferroni correction to account for multiple hypothesis testing.

    3. A cooperative TF pair of interest is called an important regulator for the given set of genes if the Bonferroni-corrected p-value is less than the threshold determined by the user.
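
    The three steps above can be sketched in a few lines of Python using only the standard library. This is an illustrative sketch rather than YCRD's own code: the function names, gene labels and default threshold are hypothetical, and the Bonferroni factor is assumed here to be the number of cooperative TF pairs tested, which the text does not specify.

```python
from math import comb


def enrichment_p_value(n_targets, n_given, n_overlap, n_genome):
    """Upper-tail hypergeometric probability P(x >= |T|) of Equation (1).

    n_targets = |S|, n_given = |G|, n_overlap = |T|, n_genome = |F|.
    """
    upper = min(n_targets, n_given)  # x cannot exceed |S| or |G|
    tail = sum(
        comb(n_targets, x) * comb(n_genome - n_targets, n_given - x)
        for x in range(n_overlap, upper + 1)
    )
    return tail / comb(n_genome, n_given)


def important_pairs(targets_by_pair, given_genes, n_genome, threshold=0.01):
    """Return (pair, corrected p-value) for pairs enriched in given_genes."""
    n_tests = len(targets_by_pair)  # assumption: one test per TF pair
    hits = []
    for pair, targets in targets_by_pair.items():
        overlap = len(targets & given_genes)
        p = enrichment_p_value(len(targets), len(given_genes), overlap, n_genome)
        corrected = min(1.0, p * n_tests)  # Bonferroni correction, capped at 1
        if corrected < threshold:
            hits.append((pair, corrected))
    return sorted(hits, key=lambda item: item[1])


# Small hand-checkable case: |F| = 10, |S| = 4, |G| = 3, |T| = 2.
# P = (C(4,2)*C(6,1) + C(4,3)*C(6,0)) / C(10,3) = 40/120 = 1/3
p_small = enrichment_p_value(4, 3, 2, 10)

# Placeholder data: a 100-gene genome and two hypothetical TF pairs.
targets_by_pair = {
    ("TF1", "TF2"): {f"G{i}" for i in range(5)},
    ("TF3", "TF4"): {f"G{i}" for i in range(50, 60)},
}
given_genes = {f"G{i}" for i in range(5)}
hits = important_pairs(targets_by_pair, given_genes, n_genome=100)
```

    `math.comb` works with exact integers, so the sum stays exact until the final division; for large-scale use a statistics library would be a reasonable alternative.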






    Database interface

    YCRD provides both a search mode and a browse mode.

    Search mode:
    In the search mode, users have two possible ways to search YCRD.

    (1)

    Users can (i) select the experimental evidence (TFB, TFR or TFB&TFR) of the regulatory associations and (ii) select a cooperative TF pair of interest.


    Then YCRD returns the target genes of the selected cooperative TF pair shown in a table and a figure.


    The publications of the experimental evidence of the regulatory associations are also provided.


    (2)

    Users can (i) select the experimental evidence (TFB, TFR or TFB&TFR) and (ii) type in the name of a gene of interest.


    Then YCRD returns the cooperative TF pairs which regulate the input gene shown in a table and a figure.


    The publications of the experimental evidence are also provided.


    Moreover, three types of validation (Algorithm Evidence, Physical Interaction Evidence and Genetic Interaction Evidence) for each cooperative TF pair are provided.




    Browse Mode:
    In the browse mode, users have two possible ways to browse the regulatory associations (validated by TFB, TFR or TFB&TFR) between cooperative TF pairs and their target genes.
    (1)

    When users browse YCRD by the name of a cooperative TF pair, they are given the target genes of this cooperative TF pair.


    (2)

    When users browse YCRD by a gene name, users will be given the cooperative TF pairs that regulate this gene.
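
    The two browsing directions correspond to two views of the same associations: pair to target genes, and gene to regulating pairs. A minimal sketch of deriving the second view from the first is given below; the function `index_by_gene` and the data are hypothetical placeholders and say nothing about how YCRD actually stores its data.

```python
from collections import defaultdict


def index_by_gene(targets_by_pair):
    """Invert a pair -> target genes mapping into gene -> regulating pairs."""
    pairs_by_gene = defaultdict(set)
    for pair, genes in targets_by_pair.items():
        for gene in genes:
            pairs_by_gene[gene].add(pair)
    return dict(pairs_by_gene)


# Placeholder associations (not real YCRD records).
targets_by_pair = {
    ("TF1", "TF2"): {"GENE1", "GENE2"},
    ("TF1", "TF3"): {"GENE2"},
}
pairs_by_gene = index_by_gene(targets_by_pair)
print(sorted(pairs_by_gene["GENE2"]))
```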


    YCRD also provides a tool for identifying important cooperative TF pairs which regulate a given set of genes. To use this tool, users have to (i) input a set of genes of interest, (ii) choose the experimental evidence (TFB, TFR or TFB&TFR) of the regulatory associations, and (iii) set the threshold for the Bonferroni-corrected p-value, where the p-value is calculated using Equation (1).


    YCRD then returns the important cooperative TF pairs which regulate the set of genes of interest.