Learn how to perform differential expression analysis in SeqGeq.

Differential Expression

Differential Expression analysis identifies genes that are significantly upregulated or downregulated within a population of interest relative to a comparator population. This type of analysis can identify a signature (aka “Hallmark”) Geneset for a disease state, cell type, or even an individual subject.

Setup

To begin differential expression analysis in SeqGeq, open a Gene View Graph Window with your test population on the y-axis and the control (“comparator”) population on the x-axis.

In the case of clusters defined by K-Means, a natural comparator population might be the Boolean NOT gate of a given cluster of interest. In section 3 of this tutorial we added the Boolean ribbon to the Workspace tab of SeqGeq’s workspace. To create a NOT gate for one of the K-Means clusters, select that cluster within the workspace, visit the Boolean ribbon, and select NOT. This creates a NOT gate of the cluster within the workspace, as a sibling population of the selected cluster, denoted with a minus sign:
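Conceptually, a NOT gate is the set complement of the cluster within its parent population. The short Python sketch below illustrates that idea only; it is not SeqGeq code, and the cell identifiers and the `not_gate` helper are hypothetical names introduced for the example.

```python
def not_gate(parent_cells, cluster_cells):
    """Return the cells of the parent population that are NOT in the cluster."""
    cluster = set(cluster_cells)
    # Keep the parent's ordering; drop every cell that belongs to the cluster.
    return [cell for cell in parent_cells if cell not in cluster]


# Hypothetical cell identifiers, for illustration only.
parent = ["cell_1", "cell_2", "cell_3", "cell_4", "cell_5"]
cluster_a = ["cell_2", "cell_4"]

not_cluster_a = not_gate(parent, cluster_a)
print(not_cluster_a)
```

Together, the cluster and its NOT gate account for every cell in the parent population, which is why the NOT gate works as a comparator.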

Try creating a NOT gate from one of the clusters within your workspace. Then open a Gene View Graph Window of the parent population, and place your population of interest (the “test”) on the y-axis and your control NOT gate on the x-axis:

Volcano Plots

Now that you’ve set up the Gene View Graph Window properly, you can define statistically significant upregulated and downregulated genes for the populations being compared by opening the Volcano Plotting tool within SeqGeq. To do so, click the Volcano Plot icon at the top of the Gene View Graph Window:

This creates a new Gene View Graph Window illustrating a pair of Derived Observations of the Genes (DOGs for short). The DOG parameters in the volcano plot are two statistics computed between the populations set within the initial Gene View Graph Window: the Fold Change between the two populations, and an adjusted p-Value (i.e., a q-Value).

A Fold Change is simply the ratio of expression within the test population over the control population, calculated for each gene. The q-Value is a Mann-Whitney U test p-Value to which a correction for multiple comparisons has been applied. By default this correction uses the Bonferroni method, but it can be switched to the False Discovery Rate (FDR) method, or turned off entirely, within the Graphs section of SeqGeq’s preferences.
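To make these two statistics concrete, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not SeqGeq's implementation: expression is summarized by the mean, the Mann-Whitney U p-value uses the two-sided normal approximation with tie and continuity corrections, FDR is taken to mean the Benjamini-Hochberg procedure, and the gene names and values are made up.

```python
import math


def fold_change(test, control):
    """Mean expression in the test population over the control mean.

    Assumption: expression is summarized by the mean. A control mean of
    zero would need special handling (for example, a pseudocount).
    """
    return (sum(test) / len(test)) / (sum(control) / len(control))


def mann_whitney_p(test, control):
    """Two-sided Mann-Whitney U p-value (normal approximation)."""
    n1, n2 = len(test), len(control)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in test] + [(v, 1) for v in control])
    rank_sum_test, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2          # average of ranks i+1 .. j
        t = j - i                          # size of this tie group
        tie_term += t ** 3 - t
        rank_sum_test += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_test - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:                         # all values identical
        return 1.0
    z = max((abs(u - mean_u) - 0.5) / math.sqrt(var_u), 0.0)
    return math.erfc(z / math.sqrt(2))


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted, running_min = [0.0] * m, 1.0
    for rank in range(m, 0, -1):           # walk from largest p to smallest
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical expression values: (test population, control population).
genes = {
    "GENE_A": ([5, 6, 7, 8, 9, 10], [1, 2, 1, 2, 1, 2]),
    "GENE_B": ([3, 4, 3, 4, 3, 4], [3, 4, 3, 4, 3, 4]),
}
names = list(genes)
folds = [fold_change(*genes[g]) for g in names]
p_vals = [mann_whitney_p(*genes[g]) for g in names]
q_bonf = bonferroni(p_vals)
q_fdr = benjamini_hochberg(p_vals)

for name, fc, q in zip(names, folds, q_bonf):
    print(name, "fold change:", round(fc, 2), "Bonferroni q:", round(q, 4))
```

Bonferroni is the more conservative of the two corrections: it multiplies every p-value by the number of genes tested, whereas Benjamini-Hochberg scales each p-value according to its rank.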

Try gating genes upregulated and downregulated in your cluster of interest using the Volcano Plot:

Note: You will likely want to repeat this entire process of differential expression analysis (from setup to volcano plot filtering) for each population of interest.
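Gating on the volcano plot amounts to selecting genes that pass both a Fold Change threshold and a q-Value cutoff. The Python sketch below shows that logic outside of SeqGeq. The thresholds (a 2-fold change and q below 0.05), the `gate_genes` helper, and the gene statistics are hypothetical choices for illustration, not SeqGeq defaults.

```python
def gate_genes(stats, min_fold=2.0, max_q=0.05):
    """Split genes into upregulated and downregulated sets.

    stats maps gene name -> (fold_change, q_value). A gene is kept only if
    its q-value is below max_q; it is 'up' if the fold change is at least
    min_fold and 'down' if it is at most 1 / min_fold.
    """
    up, down = [], []
    for gene, (fc, q) in stats.items():
        if q >= max_q:
            continue                      # not statistically significant
        if fc >= min_fold:
            up.append(gene)
        elif fc <= 1.0 / min_fold:
            down.append(gene)
    return up, down


# Hypothetical per-gene statistics: (fold change, q-value).
volcano_stats = {
    "GENE_A": (5.0, 0.009),   # large increase, significant
    "GENE_B": (1.0, 1.0),     # no change
    "GENE_C": (0.25, 0.001),  # large decrease, significant
    "GENE_D": (3.0, 0.2),     # large increase, not significant
}

upregulated, downregulated = gate_genes(volcano_stats)
print("Up:", upregulated)
print("Down:", downregulated)
```

Running the same function over the statistics for each cluster and its NOT gate would yield one upregulated and one downregulated gene list per population of interest.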


Link to SeqGeq Basic Tutorial