Clean up your data with the FlowClean plugin

Do you analyze a lot of samples? If so, data quality control may be challenging, especially when a large number of parameters is measured. In particular, fluorescence measurements for a sample over the collection time may not remain stable due to fluctuations in fluid dynamics. As many as 13.7% of publicly available FCS files have been shown to have this problem. But don’t worry, we are here to help! 

FlowJo now has several quality control plugins to help address these irregularities. The FlowAI, PeacoQC, FlowCut and FlowClean algorithms are all available as plugins that can assess the quality of the data and flag potentially troublesome events. The most basic of these algorithms is FlowClean, which automatically identifies and flags fluorescence anomalies in your FCS files by tracking cell populations in the centered log ratio space. This has been shown to provide a sensitive and consistent method of quality control. Do you want to give it a try?
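For readers unfamiliar with the centered log ratio, here is a minimal Python sketch of the transform itself. It is not the plugin's implementation: the function name, the pseudocount and the example frequencies are illustrative assumptions, and we assume the values being tracked are population frequencies within a slice of collection time. The transform takes the log of each component and subtracts the mean of the logs, so the transformed values always sum to zero.

```python
import math


def centered_log_ratio(proportions, pseudocount=1e-6):
    """Return the centered log ratio (CLR) of a composition.

    Each component is log-transformed, then the mean of the logs (the log of
    the geometric mean) is subtracted. The pseudocount is an illustrative
    guard against log(0) and is not taken from the flowClean package.
    """
    logs = [math.log(p + pseudocount) for p in proportions]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical population frequencies in two time slices of one sample.
stable_bin = [0.50, 0.30, 0.20]
shifted_bin = [0.20, 0.30, 0.50]

clr_stable = centered_log_ratio(stable_bin)
clr_shifted = centered_log_ratio(shifted_bin)

print(clr_stable)
print(clr_shifted)
```

A population whose frequency drops between time slices shows a lower CLR value in the later slice, which is the kind of shift a time-based quality check can look for.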

Installing the FlowClean plugin

How do you get started? First, make sure you have the latest version of FlowJo. Second, because the plugin relies on functionality developed in R as part of the flowClean Bioconductor package, you need to install a recent version of R. R is freely available for Windows and Mac OS. Please review our tips on installing plugins.

Finally, you will need the FlowClean FlowJo plugin; this is a FlowClean.jar file that you can download from GitHub and save to the plugins directory of your FlowJo installation. Once you have done that, the plugin will be detected the next time you restart FlowJo.

Cleaning up your data with FlowClean

With one or more samples selected, navigate to the FlowClean menu item located under Plugins on the Workspace ribbon, as shown in the figure below.

Figure 1. The FlowJo Plugins menu, showing FlowClean installed.


This will open a dialog with FlowClean input parameters where you should select FCS channels that you want FlowClean to inspect.

Figure 2. FlowClean parameters.

Selecting all fluorescence channels provides good results in most cases. If you see both compensated and uncompensated versions of the fluorescence channels, we recommend selecting the uncompensated data. The other input parameters have reasonable default values, which you don’t need to alter unless you are not satisfied with FlowClean’s results on your sample. Details about how these values affect the results are included in the flowClean package manual, with additional explanation in Fletez-Brant et al., Cytometry A, 2016. Also, since FlowClean works by reviewing fluorescence expression values over the sample collection time, the Time channel needs to be present in the input FCS file. This channel will be used automatically even if you do not select it.

Depending on the size of your FCS file, the plugin should take from a few seconds to a few minutes to complete the sample evaluation. Results will appear as one or two clearly distinct cell populations. Specifically, a single population will be returned if no problems were identified, and two populations will be returned if FlowClean identifies fluorescence anomalies in your data. The population labeled “Good Events” contains the cells that were consistent across the time of analysis, whereas the population labeled “Bad Events” contains cells that were flagged as abnormal due to an irregularity in one or more of their parameters.

Figure 3. FlowClean results, showing the two populations.

To manually review the results of the plugin, we suggest plotting time vs. fluorescence as shown in the example below. Although FlowClean with the default parameter settings seems to have worked well for most researchers, you can use these types of plots to determine optimal settings for your specific dataset.

Figure 4. FlowClean results, showing the ungated population of all events on the left, and the automatically gated “clean” population on the right. 

Normally, for each fluorescence channel, one would expect the fluorescence measurements for a sample to remain stable over the data collection time. However, fluctuations in fluid dynamics and other factors may lead to instabilities and to the emergence of false populations. When FlowClean identifies such anomalies, there should be one or more fluorescence channels showing unstable behavior when plotted against time. For demonstration purposes, we picked an example where this is clearly apparent. Please note that fluctuations in your data may not be as clearly visible as in our example.
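If you export time and fluorescence values and want a rough numeric companion to the time vs. fluorescence plot, a sketch along the following lines may help. This is a simple illustration of the idea of stability over time, not FlowClean's algorithm: the function names, the number of bins and the 20% tolerance are all assumptions chosen for the example, and the data are synthetic.

```python
import statistics


def binned_medians(times, values, n_bins=10):
    """Split events into equal-width time bins and return the median value per bin."""
    t_min, t_max = min(times), max(times)
    # Fall back to a width of 1.0 if every event has the same time stamp.
    width = (t_max - t_min) / n_bins or 1.0
    bins = [[] for _ in range(n_bins)]
    for t, v in zip(times, values):
        index = min(int((t - t_min) / width), n_bins - 1)
        bins[index].append(v)
    return [statistics.median(b) if b else None for b in bins]


def unstable_bins(medians, tolerance=0.2):
    """Return indices of bins whose median differs from the median of all
    bin medians by more than `tolerance`, expressed as a fraction."""
    present = [m for m in medians if m is not None]
    reference = statistics.median(present)
    return [
        i
        for i, m in enumerate(medians)
        if m is not None and abs(m - reference) > tolerance * abs(reference)
    ]


# Synthetic channel: stable signal that jumps for the last 20% of the run.
times = list(range(100))
values = [1000.0 if t < 80 else 1500.0 for t in times]

medians = binned_medians(times, values, n_bins=10)
flagged = unstable_bins(medians)
print(flagged)  # prints [8, 9]
```

Bins flagged this way are only candidates for closer inspection; the plots described above remain the best way to judge whether the plugin's gating matches what you see in your data.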