Understanding array quality control (QC) metrics for CytoSure arrays
One of the most important aspects of any microarray experiment is data quality. Before carrying out any in-depth analysis, it is vital to ensure that the data you are working with have not been compromised by technical issues. You need to be sure that the aberrations you are detecting are caused by biology, not by how the experiment was performed.
Using CytoSure Interpret Software, it is very easy to view the QC metrics and to keep track of how they vary over a particular experiment.
The QC metrics for a single array can be viewed by scrolling down to the bottom of the Sample Information window (Figure 1).
Figure 1: Screenshot of QC metrics displayed in CytoSure Interpret Software, with warning indicating poor DLR Spread and suggestion of possible cause.
To plot QC metrics across a number of samples, the data first needs to be submitted to the database. It is then possible to select only the samples included in a particular experiment by filtering on an appropriate field in the database (e.g. Sample Details). Once the filtering has been carried out, the QC Trends view will display only the metrics of the samples that pass the filter.
Figure 2: Screenshot of QC metric plot showing DLRSpread values for a set of samples.
Many factors can influence the expected range of these metrics, including the source and quality of the DNA, the protocols used, the scanner, and the image processing software. The values recommended here for good quality data are guidelines that have been shown to work well with the standard OGT and Agilent protocols.
X-separation
The X-separation is the mean normalised log ratio of the probes on the X chromosome. When sex-mismatched test and reference samples are used, the log ratio should be close to 1. For sex-matched test and reference samples, the log ratio should be close to 0.
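The X-separation check can be sketched in a few lines of Python. This is an illustration only, not CytoSure Interpret's implementation: the function names, the input layout (one log ratio and one chromosome label per probe) and the `tolerance` of 0.3 are all assumptions. The sketch also compares the magnitude of the value, on the assumption that the sign depends on which sample is the test and which is the reference.

```python
from statistics import mean


def x_separation(log_ratios, chromosomes):
    """Mean normalised log ratio of the probes located on chromosome X."""
    x_values = [lr for lr, chrom in zip(log_ratios, chromosomes) if chrom == "X"]
    if not x_values:
        raise ValueError("no X-chromosome probes found")
    return mean(x_values)


def check_x_separation(value, sex_mismatched, tolerance=0.3):
    """Return True if the value is near the expected target (1 or 0).

    `tolerance` is a hypothetical cut-off, not a published recommendation.
    The magnitude is compared (assumption: sign depends on test/reference order).
    """
    expected = 1.0 if sex_mismatched else 0.0
    return abs(abs(value) - expected) <= tolerance
```

For example, an array reporting a value near 0 for a supposedly sex-mismatched pair would fail this check, which may point to a sample mix-up worth investigating.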
DLRSpread
This is perhaps the most important QC metric; it measures the probe-to-probe log ratio noise of an array. A poor DLRSpread makes it more difficult to call amplifications and deletions accurately. The DLRS value should be <0.3. Higher values can indicate poor quality DNA. To detect very small aberrations, a DLRS value of <0.2 may be required. An excellent array would have a DLRS value of <0.1, although for some sample types (e.g., formalin-fixed paraffin-embedded) this may be difficult to achieve.
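To make "probe-to-probe log ratio noise" concrete, the sketch below uses one common robust formulation of a derivative log ratio spread: the interquartile range of the differences between adjacent probes, rescaled to a standard deviation. Whether CytoSure Interpret uses exactly this formula is not stated here, so treat it as an assumption; the function names are hypothetical, and only the 0.1, 0.2 and 0.3 thresholds come from the text above.

```python
from statistics import quantiles


def dlr_spread(log_ratios):
    """Robust spread of the differences between adjacent probes' log ratios.

    Assumes `log_ratios` are ordered by genomic position.
    """
    diffs = [b - a for a, b in zip(log_ratios, log_ratios[1:])]
    if len(diffs) < 2:
        raise ValueError("need at least three probes")
    q1, _, q3 = quantiles(diffs, n=4)
    # IQR / 1.349 estimates the standard deviation of the differences for
    # normally distributed noise; dividing by sqrt(2) converts the noise of
    # a difference back to per-probe noise.
    return (q3 - q1) / 1.349 / 2 ** 0.5


def grade_dlrs(value):
    """Map a DLRS value onto the guideline bands quoted in the text."""
    if value < 0.1:
        return "excellent"
    if value < 0.2:
        return "suitable for very small aberrations"
    if value < 0.3:
        return "acceptable"
    return "poor"
```

Because the estimate is based on quartiles of adjacent differences, a genuine copy number change (a single step in the log ratio) barely affects it, whereas random scatter between neighbouring probes raises it.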
Signal Intensity
This value is the average raw signal intensity for each channel. There is no upper limit on this value; however, for a CGH array it is very unusual to see saturated spots. For a typical array, minimum values of 350 counts and 250 counts are expected for the red and green channels respectively. This metric is highly dependent on the labelling method used, so the best way to use it is to compare values across a set of experiments in which the same labelling method has been used. This allows any samples with an unusually low value to be identified. Low signal intensity can be caused by a problem during the set-up of the labelling reaction, by poor DNA quality, or by an incorrect amount of DNA being added. Ideally, these problems should be identified before the array is set up, by using a Nanodrop™ to measure the DNA concentration and the level of dye incorporation accurately. Most labelling kits give expected values for the amount of dye incorporation and the concentration of DNA. If your sample meets these recommendations but the signal intensities are poor, a problem with the clean-up step is likely.
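Both checks described above (the absolute minimum per channel and the comparison across a batch labelled with the same method) can be sketched as follows. The 350 and 250 count minimums come from the text; the function names and the `fraction` of the batch median used to define "unusually low" are hypothetical choices for illustration.

```python
from statistics import median

MIN_RED, MIN_GREEN = 350, 250  # minimum counts quoted in the text


def below_minimum(red, green):
    """Return the channels whose average raw signal is under the quoted minimum."""
    low = []
    if red < MIN_RED:
        low.append("red")
    if green < MIN_GREEN:
        low.append("green")
    return low


def unusually_low(batch, fraction=0.5):
    """Names of samples whose signal is below `fraction` x the batch median.

    `batch` maps sample name -> signal intensity for one channel.
    `fraction` is a hypothetical cut-off, not a published recommendation.
    """
    cutoff = fraction * median(batch.values())
    return sorted(name for name, value in batch.items() if value < cutoff)
```

Comparing against the batch median rather than a fixed number reflects the point that expected intensities depend heavily on the labelling method.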
Background Noise
This is calculated from the negative control spots on the array. Any outlier spots are first rejected, and the standard deviation of the remaining spots is then calculated. The background noise value should be between 5 and 10; a value below 5 is excellent. If the value is high (>10), the TIFF image should be examined. If areas of high background noise are evident, the array processing procedure should be investigated. High background values typically arise during the washing steps. During washing, ensure that the stirrer speed is correct (a vortex should be visible when the dish is empty) and that the dishes are always properly cleaned. If high background values persist, the dishes should be washed with acetonitrile to remove any build-up of unincorporated dye molecules.
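The two-stage calculation (reject outliers, then take the standard deviation of what remains) might look like the sketch below. The text does not say how outliers are rejected, so the interquartile-range fence with `k=1.5` is purely an assumption, as are the function names; the 5 and 10 thresholds are the ones quoted above.

```python
from statistics import quantiles, stdev


def background_noise(negative_control_signals, k=1.5):
    """Standard deviation of negative control spots after outlier rejection.

    Outliers are rejected with an IQR fence (an assumed method; the
    software's actual rejection rule is not described in the text).
    """
    q1, _, q3 = quantiles(negative_control_signals, n=4)
    spread = q3 - q1
    kept = [s for s in negative_control_signals
            if q1 - k * spread <= s <= q3 + k * spread]
    if len(kept) < 2:
        raise ValueError("too few spots left after outlier rejection")
    return stdev(kept)


def grade_background(noise):
    """Map a background noise value onto the guideline bands in the text."""
    if noise < 5:
        return "excellent"
    if noise <= 10:
        return "acceptable"
    return "high: examine the image and the washing steps"
```

Rejecting outliers first means a single dust speck over one negative control spot does not inflate the noise estimate for the whole array.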
Signal to Noise Ratio
This value is calculated by dividing the signal intensity by the background noise, and it indicates how clearly the spots can be detected above the background level. The metric depends on how well the sample labelling and washing steps worked. It is often easier to look at this metric first and then, if it does not pass, identify where the problem occurred by looking at the background noise and the signal intensity. A signal to noise value above 100 is excellent, a value between 30 and 100 is good, and a value below 30 is poor. It is difficult to detect aberrations reliably on arrays where the signal to noise is <30.
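The calculation and the "check this first, then drill down" workflow can be sketched as follows. The 30 and 100 bands are from the text; the function names are hypothetical, and the defaults in `triage` reuse the red-channel minimum (350 counts) and the background noise limit (10) quoted elsewhere in this document, which is an assumption about how one might combine them.

```python
def signal_to_noise(signal_intensity, background_noise):
    """Signal intensity divided by background noise."""
    if background_noise <= 0:
        raise ValueError("background noise must be positive")
    return signal_intensity / background_noise


def grade_snr(snr):
    """Map a signal to noise value onto the bands quoted in the text."""
    if snr > 100:
        return "excellent"
    if snr >= 30:
        return "good"
    return "poor"


def triage(signal_intensity, background_noise, min_signal=350, max_noise=10):
    """If the signal to noise is poor, report which component looks responsible."""
    causes = []
    if grade_snr(signal_to_noise(signal_intensity, background_noise)) != "poor":
        return causes
    if signal_intensity < min_signal:
        causes.append("low signal intensity")
    if background_noise > max_noise:
        causes.append("high background noise")
    return causes
```

An array with a signal of 200 counts and a noise of 8 would be reported as poor because of low signal, pointing towards labelling rather than washing.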
Signal Intensity Ratio
This value is the ratio between the red and green signal intensities and is calculated from the processed red and green signals in the feature extracted text file. These data have been adjusted for background, had outliers removed and been corrected for dye bias. Ideally this value should be close to 1.0. Deviation from this value could indicate a problem with the normalisation method. Alternatively, it could indicate that significantly different amounts of the two samples were hybridised to the array.
Negative Controls
The negative controls on the array are used to measure the non-specific hybridisation that can occur during the processing steps. A high negative control value can indicate that, at some stage during the process, a buffer has become contaminated. Buffer contamination most frequently occurs during the washing step. Normally the negative controls are between 50 and 70 counts. Negative control counts over 100 should be investigated.
Outliers
This is an important metric because it reflects spot quality. The pixel intensities within a good quality spot follow a normal distribution. If, for example, there is contamination by dust or a small scratch over a spot, this distribution will be skewed. A spot that is flagged as an outlier is not removed from the results, but it can be filtered out using the filtering tools so that it does not participate in the detection of aberrations. If over 1% of the total number of features are flagged as outliers, the image should be examined to identify the cause of the problem.
Saturation
A feature is flagged as saturated if more than 50% of the pixels within the spot are saturated. It is unusual for more than 0.1% of spots on a CGH array to be identified as saturated. If significant saturation is apparent, the most likely cause is a technical problem: either too much DNA was added to the hybridisation or something went wrong during the processing steps. If this occurs, the image should be inspected and the labelling metrics examined.
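The saturation rule is a two-level test: a per-feature test on pixels, then a per-array test on the fraction of flagged features. A minimal sketch is shown below. The 50% and 0.1% figures come from the text; the function names are hypothetical, and the `saturation_level` of 65535 assumes a 16-bit image, which may not match every scanner.

```python
def is_saturated(pixels, saturation_level=65535):
    """True if more than 50% of a feature's pixels are saturated.

    `saturation_level` assumes a 16-bit image (an assumption).
    """
    saturated = sum(1 for p in pixels if p >= saturation_level)
    return saturated > 0.5 * len(pixels)


def saturated_fraction(features, saturation_level=65535):
    """Fraction of features on the array that are flagged as saturated."""
    if not features:
        raise ValueError("no features supplied")
    flagged = sum(is_saturated(f, saturation_level) for f in features)
    return flagged / len(features)


def saturation_needs_review(fraction, limit=0.001):
    """True if more than 0.1% of features are saturated."""
    return fraction > limit
```

Note that a feature with exactly half of its pixels saturated is not flagged, since the rule requires more than 50%.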
If you have any questions about the QC metrics or need help with troubleshooting, please don’t hesitate to contact us at firstname.lastname@example.org.
CytoSure™ products and Genefficiency™ NGS browser are for research use only; not for use in diagnostic procedures.