The impact of microarray probe design on detecting copy number variants at exon-resolution

Friday 13 October 2017

CytoSure™ arrays offer robust copy-number detection at high resolution – detecting CNVs of a few hundred base pairs at the single-exon level. However, several factors need to be taken into account when designing microarrays to ensure robust performance across the targeted regions. This application note describes the design process we use to make our arrays perform as well as possible.

Introduction

The analysis of structural variants, such as copy number variants (CNVs), is an important aspect of clinical genetics research. Whilst many technologies are used for determining CNVs in human DNA, array comparative genomic hybridisation (aCGH) is now established as the gold standard for detection of CNVs across the entire genome, and is used not just in research but also in clinical applications [1].

As microarray technology has evolved, the resolution at which CNVs can be detected has steadily increased. Even so, the design strategy behind many microarrays has made it difficult or impossible to find aberrations smaller than ~30 kb [2], although smaller aberrations have been demonstrated to be relevant in Mendelian disorders [3]. To tackle this problem, sophisticated probe design approaches have been developed over the last few years. These have increased the resolution of arrays much further by targeting important genetic loci in such a way that CNVs can be found even at the exon level [4].

However, in order to develop arrays that can robustly analyse CNVs at this size, a number of factors need to be taken into account during the design process.

Probe targeting

Because the number of probes on any array is fixed, an important aspect of development is to use them as effectively as possible. The ability of an array to detect CNVs in a given region is determined by the number of probes targeting that region. Regions may be high-priority, needing densely clustered probes to deliver the desired resolution, or lower-priority “backbone” regions covered by more sparsely spaced probes. It is therefore important to strike the right balance between backbone probes and highly targeted probes to suit the microarray’s purpose. To do this, OGT’s design algorithms work to find the best possible genomic targets for each probe in order to achieve the desired resolution, without compromising on overall content.
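
The trade-off can be illustrated with a minimal sketch. It assumes a simple rule in which targeted regions are tiled at a fixed spacing first and the remaining probes are spread evenly across the backbone. The function name `allocate_probes` and all the numbers are hypothetical and chosen only for illustration; this is not OGT’s design algorithm.

```python
def allocate_probes(total_probes, targeted_bp, backbone_bp, targeted_spacing_bp):
    """Split a fixed probe budget between targeted and backbone regions.

    Hypothetical illustration only: targeted regions are tiled at a fixed
    spacing first, and whatever is left is spread evenly over the backbone.
    """
    # Probes needed to tile the targeted regions, capped by the array capacity.
    targeted = min(total_probes, targeted_bp // targeted_spacing_bp)
    # Everything that remains goes to the backbone.
    backbone = total_probes - targeted
    # Average distance between neighbouring backbone probes.
    backbone_spacing = backbone_bp / backbone if backbone else float("inf")
    return targeted, backbone, backbone_spacing


# Illustrative figures only (not taken from any real array design).
targeted, backbone, spacing = allocate_probes(
    total_probes=60_000,
    targeted_bp=5_000_000,
    backbone_bp=3_000_000_000,
    targeted_spacing_bp=100,
)
print(targeted, backbone, spacing)
```

In this toy case, tightening the targeted spacing directly widens the backbone spacing, which is the balance described above.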

Figure 1: Schematic of microarray probe density across different genetic elements. Probes can be focused across exons to give ultra-high resolution at the exon-level, enabling single exon CNV discovery. High probe density can also be targeted at non-genic regions of importance. Backbone probes, dispersed throughout the genome, allow for discovery of larger variants at standard resolution in other areas.

Factors affecting probe performance

The key to producing a microarray that can robustly detect CNVs at any desired resolution lies in the design of the probes themselves and the way they function. To understand what makes a good probe, we need to consider the following:

Probes need to target loci in a specific manner (i.e. should not provide false positives). This can be affected by the following factors:

  • Cross-hybridisation to other targets – particularly common for genes located in segmental duplications and for pseudogenes
  • Non-specific binding (e.g. poly-G stretches or repeats)
  • GC content – the higher the GC content, the stickier the probe, which tends to cause non-specific binding and non-informative signals (probe fluorescence remains high regardless of copy number change)

Probes need to target loci in a sensitive manner (i.e. should not provide false negatives). This can be affected by the following factors:

  • Secondary structures in the probe – can make the active sequence of the probe unavailable for hybridisation
  • Secondary structures in the target – folding of the target fragments can hinder hybridisation
  • GC content – the lower the GC content, the less sticky the probe, which tends to cause a lack of binding and non-informative signals (probe fluorescence remains low regardless of copy number change)
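
Two of the sequence-level factors listed above, GC content and poly-G stretches, are straightforward to screen for in silico. The sketch below is a minimal illustration. The function names, the GC window of 35–55% and the run length of five are assumptions made for the example, not values used by OGT.

```python
import re


def gc_fraction(seq):
    """Fraction of bases in the probe that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_homopolymer(seq, base="G", min_run=5):
    """True if the probe contains a run of `base` at least `min_run` long."""
    return re.search(f"{base}{{{min_run},}}", seq.upper()) is not None


def passes_sequence_filters(seq, gc_min=0.35, gc_max=0.55):
    """Reject probes that are too GC-rich, too GC-poor, or carry a poly-G run.

    The thresholds are illustrative assumptions, not published design rules.
    """
    return gc_min <= gc_fraction(seq) <= gc_max and not has_homopolymer(seq)


for probe in ["ATGCATGCAT", "GGGGGCCCCC", "ATATATATAT"]:
    print(probe, round(gc_fraction(probe), 2), passes_sequence_filters(probe))
```

A probe that is too GC-rich fails for the specificity reasons given earlier, and one that is too GC-poor fails for the sensitivity reasons, so a single window covers both lists.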

Probes need to function under relatively similar isothermal conditions:

  • Similar hybridisation characteristics – probes with similar characteristics perform more uniformly under any given set of hybridisation conditions
  • Uniform performance – the more similar the hybridisation performance across probes, the less noise there is in the final datasets
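
One way to gauge how uniformly a set of probes is likely to behave is to compare their estimated melting temperatures (Tm). The sketch below uses the basic GC-based approximation for longer oligonucleotides, Tm = 64.9 + 41 × (GC count − 16.4) / length. It ignores salt concentration and nearest-neighbour effects, so it is only a rough proxy. The function names are hypothetical, and the choice of Tm spread as a uniformity measure is an assumption for illustration.

```python
from statistics import pstdev


def basic_tm(seq):
    """Approximate Tm in degrees C from GC count and length.

    Basic GC-based approximation; it ignores salt concentration and
    nearest-neighbour effects, so treat the result as a rough guide only.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41 * (gc - 16.4) / len(seq)


def tm_spread(probes):
    """Population standard deviation of Tm across a probe set."""
    return pstdev(basic_tm(p) for p in probes)


uniform_set = ["GCGCGCATAT" * 2, "ATATGCGCGC" * 2]   # same GC content
mixed_set = ["GCGCGCGCGC" * 2, "ATATATATAT" * 2]     # very different GC content
print(tm_spread(uniform_set), tm_spread(mixed_set))
```

A smaller spread suggests that a single set of hybridisation conditions will suit more of the probes.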

Probe optimisation

With all of these factors in mind, a sophisticated approach to probe design needs to be adopted to ensure probes provide the most informative signals and consequently the best possible resolution across the regions of interest. OGT’s design workflow includes all of the following in silico steps to design and identify the best possible probes to target any given region:

  1. Analyse target sequences, including gathering sequence metadata (e.g. intronic vs exonic) 
  2. Identify repetitive and homologous regions 
  3. Generate all possible probes 
  4. Analyse the physicochemical properties of the probes 
  5. Rank probes depending on most desirable properties 
  6. Choose most suitable probes for target regions
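
Steps 3 to 6 can be sketched as a simple tile, score and rank loop. The example is deliberately minimal. It scores candidates on GC content alone, whereas a real workflow would also use the repeat and homology information from steps 1 and 2 and many more physicochemical properties. The function names, the ideal GC value of 45% and the default probe length are assumptions for illustration, not OGT’s implementation.

```python
def tile_probes(target, probe_len=60):
    """Step 3: generate every possible probe as (start, sequence)."""
    return [
        (start, target[start:start + probe_len])
        for start in range(len(target) - probe_len + 1)
    ]


def score(seq, ideal_gc=0.45):
    """Step 4: a toy property score; closer to the ideal GC is better.

    A single-property score is an assumption made to keep the sketch short.
    """
    gc = (seq.count("G") + seq.count("C")) / len(seq)
    return -abs(gc - ideal_gc)


def best_probes(target, n=2, probe_len=60):
    """Steps 5 and 6: rank all candidates and keep the top n."""
    ranked = sorted(
        tile_probes(target, probe_len),
        key=lambda probe: score(probe[1]),
        reverse=True,
    )
    return ranked[:n]


# Short toy target and probe length so the result is easy to inspect.
print(best_probes("AAAAGCGCAAAA", n=2, probe_len=4))
```

Generating every candidate first and filtering afterwards keeps the ranking step independent of how the candidates were produced, which makes it easy to add further scoring criteria.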

In addition to the in silico steps taken during probe design, all OGT catalogue microarrays also undergo an empirical optimisation process to ensure that all probes are working at peak performance. At times it is necessary to select the best possible probes from a number of potential options, especially for regions that are difficult to target due to the nature of the sequence. By testing a number of different probes across a large number of repeats, the best possible probes for any desired locus can be selected (Figure 2). All of this design work leads to better probe specificity, reduced noise and, importantly, more accurate results (Figure 3).

Figure 2: Left – data from a single experiment with a deletion present on chromosome 1, shown in CytoSure™ Interpret Software. Right – average probe performance across 4000 repeats. “Non-performing probes” (blue) are defined as those which gave an incorrect result more often than a correct result; “good probes” (pink) are defined as those which gave a correct result more often than an incorrect result.
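
The definitions in the Figure 2 caption amount to a majority rule over repeated experiments, which can be sketched as follows. The function name and the handling of an exact tie are assumptions, as the caption does not say how a probe with equal numbers of correct and incorrect results is treated.

```python
def classify_probe(correct_calls, total_repeats):
    """Label a probe from its results across repeated experiments.

    "good": correct more often than incorrect.
    "non-performing": incorrect more often than correct.
    "tie": equal counts (an assumption; the source does not define this case).
    """
    incorrect_calls = total_repeats - correct_calls
    if correct_calls > incorrect_calls:
        return "good"
    if incorrect_calls > correct_calls:
        return "non-performing"
    return "tie"


# Hypothetical counts out of 4000 repeats, for illustration only.
for correct in (3000, 1000, 2000):
    print(correct, classify_probe(correct, 4000))
```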

Figure 3: Separate experiments on the same sample with a confirmed ~1600 bp deletion. Left – experiment run using a non-optimised array, with a ~400 bp deletion found. Right – experiment run using an optimised array, with a ~1600 bp deletion found.

To aid the design process, OGT has developed a database of probes called the Oligome™, which includes probes across the entire genome that have been optimised in silico or empirically, allowing for the rapid development of microarray designs. Many hundreds of design projects have helped to continuously improve the database, which now includes over 20 million probes, with the data from these projects improving the design process and algorithms in an iterative fashion.

Conclusion

A number of different factors must be taken into account when designing arrays, especially those that offer exon resolution. OGT is able to offer high-resolution arrays due to the design process developed as a result of many years of experience designing bespoke microarrays.

Moreover, the entire CytoSure workflow is optimised for robust analysis of CNVs. To complement the excellent performance of the microarrays, CytoSure Labelling kits offer very low DLRS (derivative log ratio spread – a measure of noise in array data), and CytoSure Interpret software offers a robust, feature-rich and user-friendly platform for the analysis of microarray data.
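
For readers unfamiliar with DLRS, the sketch below shows one commonly used robust estimate: the interquartile range of the differences between log ratios of consecutive probes, scaled by 1.349 × √2. Whether this matches the exact calculation used in CytoSure Interpret software is an assumption. The function name is hypothetical, and the input is assumed to be ordered by genomic position.

```python
from math import sqrt
from statistics import quantiles


def dlrs(log_ratios):
    """Robust spread of differences between consecutive probe log ratios.

    Assumes `log_ratios` is ordered by genomic position. This is one common
    IQR-based formulation, not necessarily the one used by any given software.
    """
    # Differences between neighbouring probes cancel out slow trends such as
    # genuine copy number segments, leaving mostly probe-to-probe noise.
    diffs = [b - a for a, b in zip(log_ratios, log_ratios[1:])]
    q1, _, q3 = quantiles(diffs, n=4, method="inclusive")
    # IQR / 1.349 estimates the standard deviation of normally distributed
    # differences; dividing by sqrt(2) converts the spread of a difference
    # between two probes back to per-probe noise.
    return (q3 - q1) / (1.349 * sqrt(2))


quiet = [0.0, 0.1] * 5   # small probe-to-probe fluctuation
noisy = [0.0, 0.5] * 5   # large probe-to-probe fluctuation
print(dlrs(quiet), dlrs(noisy))
```

Lower values indicate less probe-to-probe noise, which makes small CNVs easier to distinguish from background variation.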

References

  1. Kearney H.M. et al. (2011) American College of Medical Genetics recommendations for the design and performance expectations for clinical genomic copy number microarrays intended for use in the postnatal setting for detection of constitutional abnormalities. Genetics in Medicine 13, 676–685
  2. Poultney C.S. et al. (2013) Identification of Small Exonic CNV from Whole-Exome Sequence Data and Application to Autism Spectrum Disorder. American Journal of Human Genetics 93(4), 607–619
  3. Aradhya S. et al. (2012) Exon-level array CGH in a large clinical cohort demonstrates increased sensitivity of diagnostic testing for Mendelian disorders. Genetics in Medicine 14, 594–603
  4. Askree S.H. et al. (2013) Detection limit of intragenic deletions with targeted array comparative genomic hybridization. BMC Genetics 14:116

CytoSure: For research use only; not for use in diagnostic procedures
