Compare Quantitative Features Between Two Yeast Gene Lists
Step 1
- Users need to input two yeast gene lists to be compared.
- Standard names, systematic names, or aliases are all acceptable.
- If users have only one input gene list, they can use our pre-compiled gene lists (e.g. 6604 ORF genes, 299 tRNA genes, 27 rRNA genes, etc.) to generate the second input gene list, which is the union of all the selected gene lists.
Step 2
- Users need to define the sets of genes (in the yeast genome) whose promoters/coding regions contain specific histone modifications by setting the thresholds.
- For example, by setting log2(H3K9ac/H3)≥1 (meaning the two-fold enrichment over the background) in the promoters, a set of 2129 yeast genes whose promoters contain H3K9ac could be defined.
- Then the expected ratio of promoters having H3K9ac in the yeast genome is equal to 0.32 (2129/6572).
- Further, by intersecting the input list of N genes and the set of 2129 genes, the number (denoted as M) of input genes whose promoters have H3K9ac can be calculated.
- Then the observed ratio of promoters having H3K9ac in the input list of genes is equal to M/N.
- Finally, the input list of N genes is said to be enriched with H3K9ac in the promoters if the observed ratio (M/N) is much larger than the expected ratio (2129/6572).
- The statistical significance is calculated using the hypergeometric test.
- H3K14ac [H2O2]: The yeast cells are grown in the rich medium adding H2O2.
- log2(H2AK5ac / Input): "Input" means the control experiment, which is the ChIP-chip/ChIP-seq experiment without using any anti-histone modification (e.g. anti-H3K79me2) antibody.
- MAT score (H3K79me2 / Input): MAT stands for Model-based Analysis of Tiling-arrays, which is an algorithm for reliably detecting enriched regions. The higher the MAT score, the higher the enrichment.
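The enrichment calculation described in this step can be sketched in a few lines of Python. This is only a minimal illustration, not the tool's own code: the function name and the example values N = 100 and M = 50 are hypothetical, while 2129 and 6572 are the numbers quoted above. The p-value is taken here as the upper tail of the hypergeometric distribution, i.e. the probability of seeing M or more marked genes when N genes are drawn at random from the genome.

```python
from fractions import Fraction
from math import comb

def hypergeom_enrichment_pvalue(genome_size, marked_in_genome, list_size, marked_in_list):
    """Upper-tail hypergeometric p-value: P(X >= marked_in_list)."""
    upper = min(marked_in_genome, list_size)
    total = comb(genome_size, list_size)
    tail = sum(
        comb(marked_in_genome, k) * comb(genome_size - marked_in_genome, list_size - k)
        for k in range(marked_in_list, upper + 1)
    )
    # Exact integer arithmetic first, then a single conversion to float.
    return float(Fraction(tail, total))

# Numbers from the text: 2129 of 6572 promoters carry H3K9ac.
expected_ratio = 2129 / 6572

# Hypothetical input list: N = 100 genes, M = 50 of them carry the mark.
N, M = 100, 50
observed_ratio = M / N
p_value = hypergeom_enrichment_pvalue(6572, 2129, N, M)
print(round(expected_ratio, 2), observed_ratio, p_value)
```

An observed ratio well above the expected ratio, together with a small p-value, is what the text calls enrichment.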
Step 3
- Since YQFC tests many quantitative features (i.e. multiple hypothesis testing), users have to select a statistical method (Bonferroni correction or FDR) for multiple hypothesis correction and set the p-value threshold.
- Bonferroni correction is more conservative than FDR. That is, Bonferroni correction yields a smaller type I error rate than FDR does, at the cost of lower statistical power.
- The p-value threshold determines how statistically significant the difference in a quantitative feature between the two input gene lists must be.
- The more stringent the p-value threshold, the higher the statistical significance of the identified distinct quantitative feature.
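The difference between the two correction methods can be illustrated with a small Python sketch. It assumes that "FDR" refers to the Benjamini-Hochberg procedure, which the page does not state explicitly; the function names and the p-values below are made up for illustration.

```python
def bonferroni_significant(p_values, alpha):
    """Flag p-values that pass the Bonferroni-corrected threshold alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def benjamini_hochberg_significant(p_values, alpha):
    """Flag p-values that pass the Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value lies under its own threshold.
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant

# Hypothetical p-values, one per tested quantitative feature.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
n_bonferroni = sum(bonferroni_significant(p_values, 0.05))
n_fdr = sum(benjamini_hochberg_significant(p_values, 0.05))
print(n_bonferroni, n_fdr)  # 1 2
```

With the same threshold, the Bonferroni correction reports one feature and the FDR procedure reports two, which matches the statement that Bonferroni is the more conservative choice.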
GI (Genetic Interaction) network
- Two genes have a link if they have genetic interaction.
- (From BioGRID)
CC network
- Inferred links by co-citation of two genes across 46,111 PubMed Medline article abstracts for yeast biology
- (From YeastNet)
CX network
- Inferred links by co-expression pattern of two genes (based on high-dimensional gene expression data)
- (From YeastNet)
DC network
- Inferred links by co-occurrence of protein domains between two coding genes
- (From YeastNet)
GN network
- Inferred links by similar genomic context of bacterial orthologs of two yeast genes
- (From YeastNet)
TS network
- Inferred links by 3-D protein structure of interacting orthologous proteins between two yeast proteins
- (From YeastNet)
mRNA level (3 datasets)
- The data of mRNA expression level were retrieved from Table S4 of Nagalakshmi (2008)
- The data of transcription level and transcriptional frequency were retrieved from Holstege (1998)
Transcriptional plasticity
- The capacity for a gene to change its transcriptional level under different conditions
- (From Lin 2010)
Translational efficiency
- The rate of mRNA translation into proteins within cells
- (From Wikipedia)
- To know the details of each dataset, please check Csárdi (2015)
Codon bias
- Codon Bias Index (CBI) is a measure of directional codon bias; it measures the extent to which a gene uses a subset of optimal codons.
- In a gene with extreme codon bias, CBI equals 1.0; in a gene with random codon usage, CBI equals 0.0.
- Note that the number of optimal codons can be lower than expected by random chance.
- This results in a negative value for CBI.
- (From CodonW)
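The three cases above (1.0, 0.0 and negative values) follow directly from the usual CBI formula, CBI = (N_opt - N_rand) / (N_tot - N_rand). The Python sketch below assumes this formula and takes the three counts as given; the function name and the example counts are hypothetical.

```python
def codon_bias_index(n_optimal, n_random, n_total):
    """CBI = (N_opt - N_rand) / (N_tot - N_rand).

    n_optimal: optimal codons observed in the gene
    n_random:  optimal codons expected under random codon usage
    n_total:   all codons of the amino acids considered
    """
    return (n_optimal - n_random) / (n_total - n_random)

print(codon_bias_index(100, 30, 100))  # only optimal codons used
print(codon_bias_index(30, 30, 100))   # as many optimal codons as expected by chance
print(codon_bias_index(20, 30, 100))   # fewer optimal codons than expected by chance
```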
Codon adaptation index
- The Codon Adaptation Index (CAI) is the most widespread technique for analyzing codon usage bias.
- CAI measures the deviation of a given protein coding gene sequence with respect to a reference set of genes.
- CAI is used as a quantitative method of predicting the level of expression of a gene based on its codon sequence.
- (From Wikipedia)
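As a rough illustration, CAI is commonly computed as the geometric mean of per-codon weights, where each weight is the codon's count in the reference set divided by the count of the most frequent synonymous codon. The sketch below assumes this definition. The reference counts are toy values, only two amino acids are covered, and codons with zero weight are simply skipped, which is a simplification.

```python
import math

def relative_adaptiveness(reference_counts, synonymous_families):
    """Weight of each codon: its reference count over the family maximum."""
    weights = {}
    for codons in synonymous_families.values():
        top = max(reference_counts.get(c, 0) for c in codons)
        if top > 0:
            for c in codons:
                weights[c] = reference_counts.get(c, 0) / top
    return weights

def codon_adaptation_index(codons, weights):
    """Geometric mean of the weights of the gene's codons."""
    logs = [math.log(weights[c]) for c in codons if weights.get(c, 0) > 0]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")

# Toy reference set (hypothetical counts) for lysine and glutamate codons.
families = {"Lys": ["AAA", "AAG"], "Glu": ["GAA", "GAG"]}
reference = {"AAA": 20, "AAG": 80, "GAA": 90, "GAG": 10}
weights = relative_adaptiveness(reference, families)

cai_preferred = codon_adaptation_index(["AAG", "GAA", "AAG"], weights)
cai_mixed = codon_adaptation_index(["AAA", "GAA"], weights)
print(cai_preferred, cai_mixed)
```

A gene that uses only the codons preferred in the reference set scores close to 1, and the score drops as rarer codons are used.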
Frequency of optimal codons
- This index is the ratio of optimal codons to synonymous codons (genetic code dependent).
- (From CodonW)
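A minimal Python sketch of this ratio follows. It assumes that Met (ATG), Trp (TGG) and stop codons are excluded because they have no synonymous alternatives; the set of optimal codons and the example sequence are placeholders chosen for illustration only.

```python
# Codons without synonymous alternatives, plus stop codons (assumed excluded).
EXCLUDED = {"ATG", "TGG", "TAA", "TAG", "TGA"}

def frequency_of_optimal_codons(cds, optimal_codons):
    """Optimal codons divided by all codons that have synonymous alternatives."""
    seq = cds.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    counted = [c for c in codons if c not in EXCLUDED]
    if not counted:
        return float("nan")
    return sum(c in optimal_codons for c in counted) / len(counted)

# Placeholder optimal set; a real analysis would use a species-specific table.
toy_optimal = {"AAG", "GAA"}
fop = frequency_of_optimal_codons("ATGAAGGAAAAATAA", toy_optimal)
print(fop)
```

In the example, ATG and TAA are ignored, and two of the three remaining codons are in the optimal set.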
Hydropathicity of protein
- Hydrophobicity scales are values that define the relative hydrophobicity or hydrophilicity of amino acid residues.
- The more positive the value, the more hydrophobic are the amino acids located in that region of the protein.
- These scales are commonly used to predict the transmembrane alpha-helices of membrane proteins.
- When consecutively measuring amino acids of a protein, changes in value indicate attraction of specific protein regions towards the hydrophobic region inside lipid bilayer.
- (From Wikipedia)
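One common way to turn a hydrophobicity scale into a single score per protein is the average of the per-residue values (often called GRAVY). The sketch below assumes the Kyte-Doolittle scale; the page does not say which scale the tool uses, and the function name is hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def average_hydropathy(protein):
    """Mean Kyte-Doolittle value over the standard residues of the sequence."""
    values = [KYTE_DOOLITTLE[aa] for aa in protein.upper() if aa in KYTE_DOOLITTLE]
    if not values:
        raise ValueError("sequence contains no standard amino acids")
    return sum(values) / len(values)

print(average_hydropathy("IVLLA"))  # hydrophobic stretch, positive score
print(average_hydropathy("KRDEN"))  # charged and polar stretch, negative score
```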
Aromaticity score
- The frequency of aromatic amino acids (Phe, Tyr, Trp) in the hypothetical translated gene product.
- The hydropathicity and aromaticity protein scores are indices of amino acid usage.
- (From CodonW)
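The aromaticity score as defined above is a simple residue frequency. A minimal sketch, with a hypothetical function name:

```python
def aromaticity(protein):
    """Fraction of residues that are Phe (F), Tyr (Y) or Trp (W)."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(seq.count(aa) for aa in "FYW") / len(seq)

print(aromaticity("FWYA"))  # three aromatic residues out of four
```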
Protein abundance (in the normal growth condition, 23 datasets)
- These 23 datasets were retrieved from Table S4 of Ho (2018)
Protein abundance (in various stress conditions, 11 kinds)
- The protein abundance data in 11 kinds of stress conditions were retrieved from Table S8 of Ho (2018)
Extinction coefficient at 280nm
- The extinction coefficient at 280nm indicates how much light a protein absorbs at the wavelength of 280nm.
- An estimate of this coefficient is useful for following a protein with a spectrophotometer during purification.
- Two values are provided, both for proteins measured in water at 280 nm.
- The first one shows the computed value based on the assumption that all cysteine residues appear as half cystines (i.e. all pairs of Cys residues form cystines), and the second one assuming that no cysteine appears as half cystine (i.e. assuming all Cys residues are reduced).
- Note: Cystine is the amino acid formed when a pair of cysteine molecules is joined by a disulfide bond.
- (From ExPASy)
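The two values can be reproduced from the amino acid composition alone. The sketch below assumes the molar extinction coefficients documented for ExPASy ProtParam (Tyr 1490, Trp 5500, cystine 125, in M^-1 cm^-1, measured in water at 280 nm); the function name is hypothetical.

```python
def extinction_coefficients_280nm(protein):
    """Return (all Cys pairs as cystines, all Cys reduced) in M^-1 cm^-1."""
    seq = protein.upper()
    n_tyr, n_trp, n_cys = seq.count("Y"), seq.count("W"), seq.count("C")
    reduced = n_tyr * 1490 + n_trp * 5500
    # Every pair of Cys residues is counted as one cystine.
    with_cystines = reduced + (n_cys // 2) * 125
    return with_cystines, reduced

print(extinction_coefficients_280nm("WYCC"))  # (7115, 6990)
```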
Isoelectric point
- The isoelectric point (pI, pH(I), IEP) is the pH at which a molecule carries no net electrical charge or is electrically neutral in the statistical mean.
- (From Wikipedia)
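For a protein, the pI can be estimated by searching for the pH at which the summed charges of the ionizable groups cancel. The sketch below does this by bisection with the Henderson-Hasselbalch equation. The pKa values are approximate textbook values chosen for illustration and are an assumption; they are not necessarily the ones used to produce the data in this tool, so the resulting numbers may differ.

```python
# Approximate textbook pKa values (assumed, for illustration only).
POSITIVE_PKA = {"K": 10.53, "R": 12.48, "H": 6.00}
NEGATIVE_PKA = {"D": 3.65, "E": 4.25, "C": 8.18, "Y": 10.07}
N_TERM_PKA, C_TERM_PKA = 9.69, 2.34

def net_charge(protein, ph):
    """Net charge of the sequence at a given pH."""
    seq = protein.upper()
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))
    for aa, pka in POSITIVE_PKA.items():
        charge += seq.count(aa) / (1.0 + 10 ** (ph - pka))
    for aa, pka in NEGATIVE_PKA.items():
        charge -= seq.count(aa) / (1.0 + 10 ** (pka - ph))
    return charge

def isoelectric_point(protein, tolerance=1e-6):
    """Bisection on pH in [0, 14]; the net charge decreases as pH rises."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2
        if net_charge(protein, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2

print(isoelectric_point("KKKK"))  # basic peptide, high pI
print(isoelectric_point("DDDD"))  # acidic peptide, low pI
```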
Aliphatic index
- The aliphatic index of a protein is defined as the relative volume occupied by aliphatic side chains (alanine, valine, isoleucine, and leucine).
- It may be regarded as a positive factor for the increase of thermostability of globular proteins.
- (From ExPASy)
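The aliphatic index is usually computed as X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu)), where X is the mole percent of each residue. The sketch below assumes this formula and these coefficients; the function name is hypothetical.

```python
def aliphatic_index(protein):
    """X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu)), X in mole percent."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty sequence")

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))

print(aliphatic_index("AVIL"))
```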
Instability index
- The instability index provides an estimate of the stability of a protein in a test tube.
- (From ExPASy)