AgilentQC

From Genome Technology Core (GTC) wiki - Sequencing and Microarray

Here we describe some of the information generated with each Agilent microarray in the form of a QC report, and how to interpret it. Please note that some of the information depends on the protocol and array type and may not be present in your QC report. For instance, the "Plot of LogRatio vs. LogProcessed Signal" will be missing for all single-channel experiments.

Spot Finding of Four Corners

By viewing the features in the four corners of the microarray, you can note if the spot centroids have been located properly. If their locations are off-center in one or more corners, you may have to run the extraction again with a new grid.

Outlier Stats

If the QC Report shows a large number of non-uniform or population outliers, you may want to check your hybridization/wash step. Also, check the visual results (.shp file) to see if the spot centroids are off-center. If the grid was not placed correctly, a new grid is required.

Spatial Distribution of All Outliers

The QC report shows two plots of all the outliers, both population and nonuniformity outliers, whose positions are distributed across the microarray. One plot is for the green channel, and the other, for the red channel. To distinguish the background population and nonuniform outliers from one another, view the color coding at the bottom of the two plots.

A spatial pattern of outliers may indicate a technical artifact and may require further investigation.

Net Signal Statistics

Net signal statistics are an indication of the dynamic range of the signal on a microarray for both non-control probes and spike-in probes (only non-control probes for the CGH QC report). The QC Report uses the range from the 1st percentile to the 99th percentile as an indicator of dynamic range for that microarray.
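The percentile range described above can be sketched in a few lines of Python. This is only an illustration: the function names `percentile` and `dynamic_range` are hypothetical, and the linear-interpolation percentile used here is an assumption, since the report does not state which percentile method the software applies.

```python
import math

def percentile(values, pct):
    """Linear-interpolation percentile (pct in 0..100) of a non-empty list.

    Assumption: the QC software may use a different percentile definition.
    """
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac

def dynamic_range(net_signals):
    """Return the (1st percentile, 99th percentile) pair of the net signals."""
    return percentile(net_signals, 1), percentile(net_signals, 99)

if __name__ == "__main__":
    # Made-up net signals, for illustration only.
    low, high = dynamic_range(list(range(1, 102)))
    print(low, high)
```

A wide gap between the two values suggests a large dynamic range on that microarray; a narrow gap suggests compressed signal.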

Negative Control Stats

The Negative Control Stats table includes the average and standard deviation of the net signals (mean signal minus scanner offset) and of the background-subtracted signals for both the red and green channels in the negative controls. These statistics exclude saturated features as well as feature non-uniformity and population outliers, and give a rough estimate of the background noise on the microarray.
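As a rough sketch of how such a table entry could be derived for one channel, the following Python computes the mean and standard deviation of the net signal over negative-control features after dropping flagged ones. The dictionary keys (`mean_signal`, `saturated`, `nonuniform_outlier`, `population_outlier`) and the function name are hypothetical and do not correspond to actual column names in the extraction output.

```python
import statistics

def negative_control_stats(features, scanner_offset):
    """Mean and sample SD of net signal for unflagged negative controls.

    features: list of dicts with hypothetical keys 'mean_signal',
    'saturated', 'nonuniform_outlier', 'population_outlier'.
    Net signal is mean signal minus the scanner offset.
    """
    net = [
        f["mean_signal"] - scanner_offset
        for f in features
        if not (f["saturated"] or f["nonuniform_outlier"] or f["population_outlier"])
    ]
    if len(net) < 2:
        raise ValueError("need at least two unflagged negative controls")
    return statistics.mean(net), statistics.stdev(net)

if __name__ == "__main__":
    # Made-up values, for illustration only.
    flags = {"saturated": False, "nonuniform_outlier": False, "population_outlier": False}
    controls = [dict(mean_signal=s, **flags) for s in (40, 42, 44)]
    controls.append(dict(mean_signal=500, saturated=True,
                         nonuniform_outlier=False, population_outlier=False))
    print(negative_control_stats(controls, scanner_offset=30))
```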

Plot of Background-Corrected Signals

This plot shows the log of the red background-corrected signal versus the log of the green background-corrected signal for non-control inlier features. The plot should be linear; curvature can indicate that the chosen background method is not appropriate.

The intersection of the red vertical and horizontal lines shows the location of the median signal. The numbers along the edge of the lines represent the location of the median signal on the plot.

The values below the plot indicate the number of non-control features that have a background-corrected signal less than zero.

Histogram of Signals Plot (1-color only)

The purpose of this histogram is to show the level of signal and the shape of the signal distribution. The histogram is a line plot of the number of points in the intensity bins vs. the log of the processed signal.
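The binning behind such a histogram can be sketched as follows. This is a minimal illustration, assuming base-10 logs and a fixed bin width; the function name `log_signal_histogram`, the default bin width and the handling of non-positive signals are all assumptions, not the software's actual settings.

```python
import math
from collections import Counter

def log_signal_histogram(processed_signals, bin_width=0.5):
    """Count features per log10(signal) bin.

    Returns a list of (bin lower edge, count) pairs sorted by bin.
    Non-positive signals have no logarithm and are skipped (assumption).
    """
    counts = Counter()
    for signal in processed_signals:
        if signal <= 0:
            continue
        counts[math.floor(math.log10(signal) / bin_width)] += 1
    return [(index * bin_width, counts[index]) for index in sorted(counts)]

if __name__ == "__main__":
    # Made-up processed signals, for illustration only.
    for edge, count in log_signal_histogram([15, 20, 150, 1500, -3], bin_width=1.0):
        print(edge, count)
```

Plotting the counts against the bin edges as a line gives the shape of the signal distribution described above.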

Local Background Inliers

With these numbers you can see the mean signal distribution for the local background regions after outliers have been removed. Because local background can be a component of noise in the low signal range, this information can help you detect hybridization/wash artifacts.

Multiplicative Surface Fit

This is the root mean square (RMS) of the surface fit for the data. The RMS × 100 is roughly the average % deviation from “flat” on the microarray. A multiplicative trend means that there are regions of the microarray that are brighter or dimmer than other regions. This trend is an effect that multiplies signals; that is, a brighter signal is more affected in absolute signal counts than a dimmer signal.
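A small worked example may help with the arithmetic. The sketch below converts an RMS value into an approximate percent deviation and shows why the same multiplicative trend shifts a bright signal by more counts than a dim one. The function names and the example numbers are made up for illustration.

```python
def percent_deviation(rms_fit):
    """Approximate average % deviation from flat: RMS x 100."""
    return rms_fit * 100

def absolute_shift(signal, rms_fit):
    """Counts by which a signal would change under a trend of size rms_fit."""
    return signal * rms_fit

if __name__ == "__main__":
    rms_fit = 0.05  # hypothetical RMS_Fit value
    print(percent_deviation(rms_fit))      # about 5 percent
    print(absolute_shift(10000, rms_fit))  # bright feature: about 500 counts
    print(absolute_shift(100, rms_fit))    # dim feature: about 5 counts
```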

This option is turned on in GE1, GE2 (v5) and CGH protocols, turned off in the miRNA protocol and is not available for non-Agilent protocols.

If the signal is not improved through a multiplicative surface fit, then the software turns the algorithm off, and the RMS_Fit shows up as 0.0, as in the figure below.

Spatial Distribution of Up-Regulated and Down-Regulated Features (Positive and Negative Log Ratios)

You can view the distribution of the significantly up- and down-regulated features on this plot (up–red; down–green). The software randomly selects 5000 data points. These points include the number of up-regulated features in the same proportion to the number of down-regulated features as they are found on the actual microarray.
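The proportional sampling described above can be sketched like this. It is only an illustration of the idea: the function name `sample_for_plot`, the rounding rule and the use of Python's `random` module are assumptions, and the software's actual selection procedure is not documented here.

```python
import random

def sample_for_plot(up, down, max_points=5000, seed=None):
    """Randomly pick up to max_points features, keeping the up/down ratio.

    up, down: lists of significantly up- and down-regulated features.
    If there are max_points or fewer features in total, all are returned.
    """
    total = len(up) + len(down)
    if total <= max_points:
        return list(up), list(down)
    rng = random.Random(seed)
    # Share of the plot budget given to up-regulated features.
    n_up = round(max_points * len(up) / total)
    n_down = max_points - n_up
    return rng.sample(up, n_up), rng.sample(down, n_down)

if __name__ == "__main__":
    # Made-up feature IDs: 8000 up-regulated, 2000 down-regulated.
    up_ids = list(range(8000))
    down_ids = list(range(8000, 10000))
    picked_up, picked_down = sample_for_plot(up_ids, down_ids, seed=1)
    print(len(picked_up), len(picked_down))
```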

Plot of LogRatio vs. LogProcessed Signal

This plot shows the log ratios of non-control inliers vs. the log of their red and green processed signals. The color coding signifies the degree to which features are significantly differentially expressed: those that are up-regulated (red), those that are down-regulated (green) and those that cannot confidently be said to show gene expression (light yellow). Features that were used for normalization are indicated in blue. Significance takes precedence over normalization for the color coding; that is, features that are both significantly differentially expressed and used for normalization will be color-coded either red or green.

Reproducibility Statistics (%CV Replicated Probes)

If a non-control probe has a minimum number of inliers, a %CV (percent coefficient of variation) of the background-corrected signal is calculated for each channel. This calculation is done for each replicated probe, and the median of those %CVs is reported in the table for each channel.

A lower median %CV value indicates better reproducibility of signal across the microarray than a higher value.
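The median %CV can be illustrated with a short Python sketch for a single channel. The function name, the input layout (a mapping from probe name to its replicate signals) and the `min_inliers` threshold are assumptions; the report does not state the minimum number of inliers the software requires.

```python
import statistics

def median_percent_cv(replicates_by_probe, min_inliers=2):
    """Median of per-probe %CV values for one channel.

    replicates_by_probe: dict mapping a probe name to the list of
    background-corrected inlier signals of its replicates.
    min_inliers is a hypothetical threshold (must be at least 2).
    Returns None if no probe qualifies.
    """
    cvs = []
    for signals in replicates_by_probe.values():
        if len(signals) < max(min_inliers, 2):
            continue
        mean = statistics.mean(signals)
        if mean <= 0:
            continue  # %CV is not meaningful for a non-positive mean
        cvs.append(100 * statistics.stdev(signals) / mean)
    return statistics.median(cvs) if cvs else None

if __name__ == "__main__":
    # Made-up replicate signals, for illustration only.
    data = {"probe_A": [90, 100, 110], "probe_B": [100, 100, 100], "probe_C": [50]}
    print(median_percent_cv(data))
```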

Spike-in Signal Statistics

These signal statistics and S/N values for spike-ins indicate accuracy and reproducibility of the signals of the microarray probes. The table shows the expected signal of the spike-in probe, the observed average signal, the SD of the observed signal and the S/N of the observed signal.
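One row of such a table could be derived roughly as follows. This sketch assumes that S/N is the observed average signal divided by its standard deviation, which is a common definition but is not confirmed by the report; the function name and the returned keys are hypothetical.

```python
import statistics

def spike_in_row(expected_signal, observed_signals):
    """Summarize one spike-in probe: expected, observed mean, SD and S/N.

    Assumption: S/N is taken as observed mean divided by sample SD.
    """
    mean = statistics.mean(observed_signals)
    sd = statistics.stdev(observed_signals)
    signal_to_noise = mean / sd if sd > 0 else float("inf")
    return {"expected": expected_signal, "observed_mean": mean,
            "sd": sd, "s_n": signal_to_noise}

if __name__ == "__main__":
    # Made-up replicate signals for one spike-in probe.
    print(spike_in_row(expected_signal=100, observed_signals=[95, 100, 105]))
```

An observed mean close to the expected signal points to accuracy, while a high S/N points to reproducibility.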