Comprehensive Genome Sequence Analysis of a Breast Cancer Amplicon

  1. Colin Collins1,6,
  2. Stanislav Volik1,
  3. David Kowbel1,
  4. David Ginzinger1,
  5. Bauke Ylstra1,
  6. Thomas Cloutier2,
  7. Trevor Hawkins3,
  8. Paul Predki3,
  9. Christopher Martin4,
  10. Meredith Wernick1,
  11. Wen-Lin Kuo1,
  12. Arthur Alberts5, and
  13. Joe W. Gray1
  1 University of California San Francisco Cancer Center, San Francisco, California 94143-0808, USA
  2 Lawrence Berkeley National Laboratory, Berkeley, California 94143, USA
  3 Department of Energy Joint Genome Institute, Walnut Creek, California 94958, USA
  4 Novartis Agricultural Discovery Institute, San Diego, California 92121, USA
  5 Van Andel Institute, Grand Rapids, Michigan 49503, USA

Abstract

Gene amplification occurs in most solid tumors and is associated with poor prognosis. Amplification of 20q13.2 is common to several tumor types, including breast cancer. The 1 Mb of sequence spanning the 20q13.2 breast cancer amplicon is one of the most exhaustively studied segments of the human genome. These studies have included amplicon mapping by comparative genomic hybridization (CGH), fluorescence in situ hybridization (FISH), array-CGH, and quantitative microsatellite analysis (QUMA), as well as functional genomic studies. Together, these studies revealed a complex amplicon structure suggesting the presence of at least two driver genes in some tumors. One of these, ZNF217, is capable of immortalizing human mammary epithelial cells (HMEC) when overexpressed. In addition, we now report the sequencing of this region in human and mouse, together with quantitative expression studies in tumors. Amplicon localization is now straightforward, and the availability of human and mouse genomic sequence facilitates functional analysis of amplicons. However, comprehensive annotation of megabase-scale regions requires integration of vast amounts of information. We present a system for integrative analysis and demonstrate its utility on 1.2 Mb of sequence spanning the 20q13.2 breast cancer amplicon and 865 kb of syntenic murine sequence. We integrate tumor genome copy number measurements with exhaustive genome landscape mapping, showing that amplicon boundaries are associated with maxima in repetitive element density and a region of evolutionary instability. This integration of comprehensive sequence annotation, quantitative expression analysis, and tumor amplicon boundaries provides evidence for an additional driver gene, prefoldin 4 (PFDN4), for coregulated genes, and for conserved noncoding regions, and it associates repetitive elements with regions of genomic instability at this locus.

Footnotes

  • 6 Corresponding author.

  • E-MAIL collins{at}cc.ucsf.edu; FAX (415) 476-8218.

  • Article published on-line before print: Genome Res., 10.1101/gr.174301.

  • Article and publication are at www.genome.org/cgi/doi/10.1101/gr.174301.

    • Received December 10, 2000.
    • Accepted March 2, 2001.