Integration of human immunodeficiency virus type 1 in untreated infection occurs preferentially within genes.
ABSTRACT Previous analyses of human immunodeficiency virus type 1 (HIV-1) integration sites generated in infections in vitro or in patients in whom viral replication was repressed by antiviral therapy have demonstrated a preference for integration within protein-coding genes. We analyzed integration sites in peripheral blood mononuclear cells (PBMCs), spleen, lymph node, and cerebral cortex from patients with untreated HIV-1 infections. The great majority of integration sites in each tissue were within genes. Statistical analyses of the frequencies of integration in genes in PBMCs and lymph tissue demonstrated a strong preference for integration within genes. Although the sample size for brain tissue was too small to demonstrate a clear statistical preference for integration in genes, four of the five integration sites identified in brain were within genes. Taken together, our data indicate that HIV-1 preferentially integrates within genes during untreated infection.
JOURNAL OF VIROLOGY, Aug. 2006, p. 7765–7768
Copyright © 2006, American Society for Microbiology. All Rights Reserved.
Vol. 80, No. 15
Hongbing Liu,1 Eugene C. Dow,1 Reetakshi Arora,1 Jason T. Kimata,1
Lara M. Bull,2 Roberto C. Arduino,2 and Andrew P. Rice1*
Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, Texas,1 and Section of Infectious Diseases,
Department of Medicine, University of Texas Health Science Center at Houston School of Medicine, Houston, Texas2
Received 15 March 2006/Accepted 11 May 2006
A defining step in the replication cycle of retroviruses is
integration of the cDNA copy of the viral genome into the host
cell chromosome. Several studies have used the recently com-
pleted sequence of the human genome to identify and charac-
terize retroviral integration sites in infections carried out in
vitro. These studies have demonstrated a preference for
retroviral integration into actively transcribed regions of the
genome (reviewed in references 3, 6, and 7). Integration by
murine leukemia virus (MLV) occurs preferentially near tran-
scriptional start sites, whereas that of human immunodefi-
ciency virus type 1 (HIV-1) and simian immunodeficiency virus
occurs preferentially anywhere within transcription units (9,
12, 14, 17, 20). Integration by avian sarcoma-leukosis virus
(ASLV) shows a weak preference for integration into actively
transcribed genes, with no preference for transcriptional start
sites (14, 15). Analyses of nucleotide sequences surrounding
retroviral integration sites revealed symmetry in base prefer-
ences for integration by HIV-1 and ASLV, but not MLV,
indicating a distinct mechanism in the recognition of host cell
DNA by different viral preintegration complexes (PICs) (10,
12). The implication of these studies is that host cell factors
that associate with actively transcribed genes may modify chro-
matin structure or interact with PICs and facilitate integration
nearby, and the precise molecular mechanisms of integration
are likely to differ between retroviruses. In the case of HIV-1,
an association of the viral integrase protein with the transcrip-
tion factor LEDGF/p75 appears to be capable of directing
integration into actively transcribed genes (5).
A potential limitation of the studies summarized above is
that the integration sites examined were from in vitro infec-
tions. In the case of HIV-1, this is a significant issue, as the
physiological state of CD4+ T lymphocytes that are productively
infected can differ markedly between in vivo and in vitro
conditions. Productive infection in vitro requires T-cell activation,
and a number of impediments to the infectious cycle in
resting CD4+ T cells, including blocks to reverse transcription
and integration, have been documented (18, 21). In contrast,
immunohistochemical and in situ hybridization analyses have
revealed that considerable amounts of HIV-1 replication occur
in vivo in CD4+ T lymphocytes that lack activation markers
and therefore appear to be in a resting state (13). Thus, it is
possible that the chromatin environment for HIV-1 PICs in
vivo may differ from that found in vitro, and this may affect
integration site selection.
A recent study examined HIV-1 integration in vivo in in-
fected individuals in whom viremia was suppressed by highly
active antiretroviral therapy (HAART) (8). Resting CD4+ T
cells harboring transcriptionally silent proviruses were isolated
from these patients, and identification of integrants revealed a
strong preference for integration within transcription units of
protein-coding genes. However, these integration sites were
identified in individuals on HAART, and it is not clear how the
pattern of integration under conditions of repressed viral rep-
lication relates to that generated during untreated infection.
To extend the analysis of HIV-1 integration sites generated in
vivo, we examined integrants in tissues from individuals with
untreated infections. We examined peripheral blood mononuclear
cells (PBMCs) from a set of six individuals prior to antiviral
therapy, as well as lymph nodes, spleens, and cerebral cortices
of the brains of two deceased patients with no history of
antiviral therapy.
PBMCs were obtained from HIV-1-infected patients prior
to the initiation of HAART. Informed consent was obtained
from these individuals in accordance with Baylor College of
Medicine and UT-Houston institutional review boards. Brain
cortices, lymph nodes, and spleens from HIV-1-infected do-
nors with no history of antiviral treatment were obtained from
the National Disease Research Interchange (Philadelphia, PA).
Genomic DNA was isolated from PBMCs with a QIAamp
DNA blood minikit (QIAGEN); genomic DNA was isolated
from brains, lymph nodes, and spleens with a genomic DNA
purification kit (Gentra Systems) according to the manufacturer’s
instructions.
* Corresponding author. Mailing address: Baylor College of Medicine,
Department of Molecular Virology and Microbiology, One Baylor Plaza,
Houston, TX 77030. Phone: (713) 798-5774. Fax: (713) 798-3490. E-mail:
Our procedure to identify integration sites is a modification
of that previously described (8). Genomic DNA preparations
were digested with PstI or combinations of SpeI, XbaI, and
NheI (SpeI, XbaI, and NheI have compatible ends for ligation).
The use of PstI and combinations of SpeI, XbaI, and
NheI to clone integration sites reduces the bias that would
be introduced with the use of a single restriction enzyme to
cleave the flanking cellular DNA. Digested DNAs were serially
diluted and self-ligated to generate templates for PCR.
Circularized DNA was amplified with the primers LTR-outer
(5′-TAACCAGAGAGACCCAGTACAGGC-3′) and Gag-outer,
followed by nested PCR with LTR-inner
(5′-TGGTACTAGCTTGAAGCACCATCCA-3′) and Gag-inner
(5′-TGTTAAAAGAGACCATCAATGAGGAAG-3′). Reactions used the
Advantage 2 PCR system (Clontech), Advantage GC PCR system
(Clontech), and Elongase mix (Invitrogen) and were performed at
94°C for 30 s, 55°C for 45 s, and 68°C for 150 s for 35 cycles.
PCR products were examined by Southern blot hybridizations
to confirm positive amplifications of HIV-1 sequences. Positive
PCR products were ligated to the TA vector (Promega), and
bacterial transformations were performed. Bacterial colonies
were screened for HIV-1+ plasmid inserts by colony hybridization
using a 32P probe to the viral long terminal repeat
(LTR). Plasmids containing HIV-1 inserts were sequenced,
and the junction of cellular and viral DNA was identified; all
junctions analyzed contained cellular sequences precisely
joined to the 5′ end of the viral 5′ LTR sequence (5′-TGGAA-3′).
Cellular sequences were identified as unique sites by analysis
at http://www.ncbi.nlm.nih.gov/BLAST/.
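The junction check described above can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the function name `host_flank` is hypothetical, and matching only the five-nucleotide LTR start (TGGAA) is a simplification, since a real pipeline would presumably match a longer stretch of LTR sequence before submitting the cellular flank to BLAST. The example strings are junctional sequences that appear in Table 1.

```python
# First five nucleotides of the HIV-1 5' LTR, as given in the text.
LTR_5PRIME_START = "TGGAA"


def host_flank(junction: str, ltr_start: str = LTR_5PRIME_START) -> str:
    """Return the host-cell bases that precede the LTR start in a junction read.

    Hypothetical helper: raises ValueError when the read does not end with
    the expected LTR start, i.e. when the junction is not a precise join.
    """
    seq = junction.upper()
    if not seq.endswith(ltr_start):
        raise ValueError(f"{junction!r} does not end with LTR start {ltr_start!r}")
    return seq[: -len(ltr_start)]


if __name__ == "__main__":
    # Junctional sequences listed in Table 1 (host bases followed by TGGAA).
    for junction in ("ATGAGTGGAA", "TATACTGGAA", "CCTGGTGGAA"):
        print(junction, "-> host flank:", host_flank(junction))
```

In practice the returned flank would be far longer than five bases; the short strings here simply mirror what the tables report.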
Integration sites in PBMCs. We identified 23 HIV-1 integration
sites in PBMCs isolated from six individuals prior to
initiation of antiviral therapy. We were able to map 21 of these
23 integrants to unique sites in the human genome (Table 1).
One integrant not mapped was located within a previously
identified human BAC clone (CIT987SK-582J2) that we could
not locate in the genome. One integrant cannot be mapped to
a unique site in the genome because it is located within a
conserved exon that is found in each of four related genes in a
cluster on chromosome 18 that may have arisen by gene duplication
(TCEB3C, TCEB3B, LOC653415, and LOC653420);
nevertheless, this integrant was within a protein-coding gene. If
we assume that the integrant not mapped to any site in the
genome is located in an intergenic region, then 20 of the 23
integrants in PBMCs were within genes. The majority of integrants
identified are likely to have been present in infected
CD4+ T lymphocytes rather than monocytes or other cell
types, as the majority of infected PBMCs are known to be
CD4+ T cells (1, 4, 16). In a statistical analysis of the data in
Table 1, we treated each integration event as a Bernoulli trial,
assumed that integrations are independent events, and assumed
that one-third of the human genome consists of protein-coding
genes (11, 19). Using the exact method for the one-proportion
binomial test (by STATA), the locations of integration
sites in PBMCs indicate that there is a highly significant
preference for integration within genes (P value < 0.001;
power, 1.000). There is no apparent preference for the orientation
of integration, as 12 integrants are oriented in the same
transcriptional direction as the host gene, while eight are oriented
in the opposite direction. Examination of distance from
the transcriptional start site of these genes revealed no positional
bias for integration within the 5′ region of genes, such as
occurs for MLV infections in vitro.
TABLE 1. HIV integration sites in infected PBMCs
Integrant  Chromosome  Junctional sequence(a)  Host gene  Description  Orientation(b)
Chloride intracellular channel 4
Hypothetical protein FLJ13150
Between TCEB3, polypeptide 3,
Rho family guanine nucleotide
Ubiquitin-activating enzyme E1C
Nuclear respiratory factor 1
SNARE protein Ykt6
Between RPL8 and zinc finger
Similar to 34G6.1
Suppressor of fused homolog
Opioid binding protein/cell adhesion molecule like
E74-like factor 1
Small nuclear ribonucleoprotein
Nuclear protein localization 4
Transcription elongation factor B polypeptide 3C
Hypothetical protein FLJ22349
Chromosome Y ORF(d) 15A
4  3  ATGAGTGGAA  3q27.1  Intron 3  KIAA0861
11  TATACTGGAA  8q23  Intron 1  LRP12
17  17  CCTGGTGGAA  17q24.2  Intron 8  WIPI49  Repeat protein interacting with phosphoinositides of 49 kDa
(a) Junction between host cell DNA and the 5′ HIV-1 LTR (the first five nucleotides of the LTR are TGGAA).
(b) + and −, the host gene and the HIV-1 insert have the same or opposite transcriptional orientation, respectively.
(c) Cannot distinguish between exons located in the TCEB3C, TCEB3B, LOC653415, and LOC653420 genes in the duplicated gene cluster on chromosome 18 (see the text).
(d) ORF, open reading frame.
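The one-proportion exact binomial test applied to the PBMC data can be sketched with the Python standard library. This is a minimal illustration under stated assumptions rather than a reproduction of the STATA analysis: the function name `binomial_upper_tail` is hypothetical, the test is taken to be one-sided (the text does not say which form was used), and the null proportion of one-third is the assumption stated in the text.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p0: float) -> float:
    """Exact P(X >= k) for X ~ Binomial(n, p0)."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # PBMC data from the text: 20 of 23 integrants were within genes.
    # Null hypothesis: each integration lands in a gene with probability 1/3.
    p_value = binomial_upper_tail(20, 23, 1 / 3)
    print(f"PBMCs, one-sided exact P value: {p_value:.2e}")
```

Under these assumptions the tail probability is on the order of 10^-7, consistent with the reported P value below 0.001.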
Integration sites in spleen and lymph nodes. We examined
integration sites in solid lymphoid tissues (lymph nodes and
spleens) from two deceased HIV-infected patients with no history
of antiretroviral therapy (Table 2). In tissues from patient
A, we were able to identify and map 10 integration sites in
infected lymph nodes and two in infected spleen tissue. Nine of
the 10 integrants in lymph node were in genes, while both
integrants in the spleen were in genes. In tissues from patient
B, we identified and mapped three integration sites in lymph
nodes, all of which were located in genes. A statistical analysis
carried out as described above for PBMCs indicates that for
patient A, there was a highly significant preference for integration
within genes in lymphoid tissue (lymph node plus
spleen), with a P value of <0.001 (power, 0.9998). If integrants
in lymph nodes for patients A and B are considered together,
there is also a highly significant preference for integration
within genes, with a P value of <0.001 (power, 1.000). As with
PBMCs, there is no preference in lymphoid tissues for orientation
of integration, as integrants were split equally relative to
the direction of transcription of cellular genes.
TABLE 2. HIV-1 integration sites in infected lymphoid tissues
Chromosome  Junction sequence  Host gene  Description  Orientation(a)
11  AAAACTGGAA  1q32  Intergenic  Upstream of ATPase Ca++
Karyopherin alpha 6
Similar to F10G7.10.P
Hypothetical protein FLJ34969
Human HIV-1 enhancer binding protein 1
SFRS protein kinase 1
Splicing factor, arginine/serine-rich 4, isoform C
plasma membrane 4
12  20  CTTCCTGGAA  20q13  Intron 1  PPP4R1L  Protein phosphatase 4 regulatory subunit 1 like
13  6  CCTCCTGGAA  6q21  5′ noncoding  MHC I A  Major histocompatibility complex I A precursor
14  17  GCCATTGGAA  17q11  Intron 2  SDF2  Stromal cell-derived factor 2
15  22  GTATGTGGAA  22q12  Intron 8  FLJ22582  Hypothetical protein FLJ22582
(a) + and −, the host gene and the HIV-1 insert have the same or opposite transcriptional orientation, respectively.
(b) LN, lymph node; SP, spleen.
TABLE 3. HIV-1 integration sites in infected brains
Ankyrin repeat domain 11
Fc fragment of IgG binding protein
3  1  GTAGATGGAA  1p36.1-p36.2  Intron 5  RERE  Arginine-glutamic acid dipeptide
Between NFKBL1 and LTA
Similar to KIAA0033
Intron 7  LOC440026
(a) + and −, the host gene and the HIV-1 insert have the same or opposite transcriptional orientation, respectively.
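For the lymphoid tissue from patient A (11 of 12 mapped sites within genes: 9 of 10 in lymph node plus 2 of 2 in spleen), the exact binomial tail under the one-third assumption can be written out by hand. This is a worked sketch assuming a one-sided test; the text does not state which form of the test was used.

```latex
P\!\left(X \ge 11 \,\middle|\, n = 12,\ p_0 = \tfrac{1}{3}\right)
  = \binom{12}{11}\left(\tfrac{1}{3}\right)^{11}\left(\tfrac{2}{3}\right)
    + \left(\tfrac{1}{3}\right)^{12}
  = \frac{24 + 1}{3^{12}}
  = \frac{25}{531441}
  \approx 4.7 \times 10^{-5}
```

The result is well below 0.001, in line with the P value reported for patient A.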
Integration sites in brain cerebral cortex. We identified and
mapped five integration sites in infected cerebral cortex from
patients A and B (Table 3). Four of the five integrants in
infected brain tissue were within genes. We carried out a sta-
tistical analysis as described above for these grouped integra-
tion sites, and the data suggest a preference for integration
within genes, with a P value of 0.004 (power, 0.737). This
statistical power is below the 0.8 threshold due to the small
sample size, and therefore our data for brain tissue can only
suggest a preference for integration within genes. This con-
trasts with our data for PBMCs and lymph tissues, which dem-
onstrate a clear statistical preference for integration within
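The effect of sample size on statistical power can be illustrated with a short exact calculation. This is a sketch under assumptions that the text does not spell out: a one-sided exact binomial test at a significance level of 0.05, a null proportion of one-third, and an alternative proportion equal to the observed fraction (4 of 5). The function names are hypothetical.

```python
from math import comb


def upper_tail(k: int, n: int, p: float) -> float:
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def critical_count(n: int, p0: float, alpha: float = 0.05) -> int:
    """Smallest count k whose upper tail under p0 is <= alpha (n + 1 if none)."""
    for k in range(n + 1):
        if upper_tail(k, n, p0) <= alpha:
            return k
    return n + 1


def exact_power(n: int, p0: float, p1: float, alpha: float = 0.05) -> float:
    """Probability of reaching the critical count when the true proportion is p1."""
    return upper_tail(critical_count(n, p0, alpha), n, p1)


if __name__ == "__main__":
    # Brain data from the text: five mapped sites, four within genes.
    print("critical count:", critical_count(5, 1 / 3))
    print("power:", round(exact_power(5, 1 / 3, 4 / 5), 3))
```

With five sites the null hypothesis can only be rejected when at least four fall within genes, and under these assumptions the computed power is about 0.737, which agrees with the power reported for brain tissue and sits below the 0.8 threshold.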
Although only approximately one-third of the human ge-
nome contains protein-coding genes (11, 19), we observed a
strong preference for HIV-1 integration sites in untreated in-
fections in this coding portion of the genome in PBMCs and
lymphoid tissues. Our data for infected brain tissue, although
from a limited data set, also suggest a preference for integra-
tion within genes. Taken together, our data indicate a clear
preference for HIV-1 integration within genes in untreated
infections. This finding agrees with previous studies that exam-
ined infections in vitro and infections in patients undergoing
antiviral therapy (8, 14, 17). Transcriptional profiles performed
for in vitro infections have demonstrated that integration strongly
favors actively transcribed genes (14, 17). We do not have
direct information about the transcriptional activity of the in-
tegration sites identified in this study, but it is likely that the
genes in which integrants were found were actively expressed
in the cells infected in vivo. Although we do not know whether the
integration sites identified here are those of replication-competent
viruses, there is evidence that most integrated proviruses in
circulating CD4+ T lymphocytes are replication competent
(2). In the case of PBMCs examined in this study, it is
therefore likely that the majority of integration sites are those
of replication-competent viruses. Finally, the demonstration
that HIV-1 integration preferentially targets protein-coding
genes suggests that the use of HIV vectors for gene therapy
purposes must be viewed with caution, as insertion within
genes whose disruption may result in pathology, such as tumor
suppressors, appears likely.
We thank Claudia Kozinetz and Hung-Wen Yeh of the Design and
Analysis Core of the Baylor–UT-Houston CFAR for statistical analysis.
This work was supported by NIH grants AI35381 (to A.P.R.),
AI47725 (to J.T.K.), and P30AI036211 (Baylor–UT-Houston CFAR).
1. Blankson, J. N., D. Persaud, and R. F. Siliciano. 2002. The challenge of viral
reservoirs in HIV-1 infection. Annu. Rev. Med. 53:557–593.
2. Brinchmann, J. E., J. Albert, and F. Vartdal. 1991. Few infected CD4+ T
cells but a high proportion of replication-competent provirus copies in
asymptomatic human immunodeficiency virus type 1 infection. J. Virol. 65:
3. Bushman, F., M. Lewinski, A. Ciuffi, S. Barr, J. Leipzig, S. Hannenhalli, and
C. Hoffmann. 2005. Genome-wide analysis of retroviral DNA integration.
Nat. Rev. Microbiol. 3:848–858.
4. Chun, T. W., L. Carruth, D. Finzi, X. Shen, J. A. DiGiuseppe, H. Taylor,
M. Hermankova, K. Chadwick, J. Margolick, T. C. Quinn, Y. H. Kuo, R.
Brookmeyer, M. A. Zeiger, P. Barditch-Crovo, and R. F. Siliciano. 1997.
Quantification of latent tissue reservoirs and total body viral load in HIV-1
infection. Nature 387:183–188.
5. Ciuffi, A., M. Llano, E. Poeschla, C. Hoffmann, J. Leipzig, P. Shinn, J. R.
Ecker, and F. Bushman. 2005. A role for LEDGF/p75 in targeting HIV
DNA integration. Nat. Med. 11:1287–1289.
6. Engelman, A. 2005. The ups and downs of gene expression and retroviral
DNA integration. Proc. Natl. Acad. Sci. USA 102:1275–1276.
7. Grandgenett, D. P. 2005. Symmetrical recognition of cellular DNA target
sequences during retroviral integration. Proc. Natl. Acad. Sci. USA 102:
8. Han, Y. F., K. Lassen, D. Monie, A. R. Sedaghat, S. Shimoji, X. Liu, T. C.
Pierson, J. B. Margolick, R. F. Siliciano, and J. D. Siliciano. 2004. Resting
CD4+ T cells from human immunodeficiency virus type 1 (HIV-1)-infected
individuals carry integrated HIV-1 genomes within actively transcribed host
genes. J. Virol. 78:6122–6133.
9. Hematti, P., B. K. Hong, C. Ferguson, R. Adler, H. Hanawa, S. Sellers, I. E.
Holt, C. E. Eckfeldt, Y. Sharma, M. Schmidt, C. von Kalle, D. A. Persons,
E. M. Billings, C. M. Verfaillie, A. W. Nienhuis, T. G. Wolfsberg, C. E.
Dunbar, and B. Calmels. 2004. Distinct genomic integration of MLV and
SIV vectors in primate hematopoietic stem and progenitor cells. PLoS Biol.
10. Holman, A. G., and J. M. Coffin. 2005. Symmetrical base preferences sur-
rounding HIV-1 and avian sarcoma/leukosis virus but not murine leukemia
virus integration sites. Proc. Natl. Acad. Sci. USA 102:6103–6107.
11. Lander, E. S., L. M. Linton, B. Birren, C. Nusbaum, M. C. Zody, J. Baldwin,
et al. 2001. Initial sequencing and analysis of the human genome. Nature
12. Lewinski, M. K., D. Bisgrove, P. Shinn, H. Chen, C. Hoffmann, S. Hannenhalli,
E. Verdin, C. C. Berry, J. R. Ecker, and F. D. Bushman. 2005. Genome-wide
analysis of chromosomal features repressing human immunodeficiency virus
transcription. J. Virol. 79:6610–6619.
13. Li, Q. S., L. J. Duan, J. D. Estes, Z. M. Ma, T. Rourke, Y. C. Wang, C. Reilly,
J. Carlis, C. J. Miller, and A. T. Haase. 2005. Peak SIV replication in resting
memory CD4+ T cells depletes gut lamina propria CD4+ T cells. Nature
14. Mitchell, R. S., B. F. Beitzel, A. R. W. Schroder, P. Shinn, H. M. Chen, C. C.
Berry, J. R. Ecker, and F. D. Bushman. 2004. Retroviral DNA integration:
ASLV, HIV, and MLV show distinct target site preferences. PLoS Biol.
15. Narezkina, A., K. D. Taganov, S. Litwin, R. Stoyanova, J. Hayashi, C. Seeger,
A. M. Skalka, and R. A. Katz. 2004. Genome-wide analyses of avian sarcoma
virus integration sites. J. Virol. 78:11656–11663.
16. Schnittman, S. M., M. C. Psallidopoulos, H. C. Lane, L. Thompson, M.
Baseler, F. Massari, C. H. Fox, N. P. Salzman, and A. S. Fauci. 1989. The
reservoir for HIV-1 in human peripheral blood is a T cell that maintains
expression of CD4. Science 245:305–308.
17. Schroder, A. R., P. Shinn, H. Chen, C. Berry, J. R. Ecker, and F. Bushman.
2002. HIV-1 integration in the human genome favors active genes and local
hotspots. Cell 110:521–529.
18. Stevenson, M., T. L. Stanwick, M. P. Dempsey, and C. A. Lamonica. 1990.
HIV-1 replication is controlled at the level of T cell activation and proviral
integration. EMBO J. 9:1551–1560.
19. Venter, J. C., M. D. Adams, E. W. Myers, P. W. Li, R. J. Mural, G. G. Sutton,
et al. 2001. The sequence of the human genome. Science 291:1304–1351.
20. Wu, X. L., Y. Li, B. Crise, and S. M. Burgess. 2003. Transcription start
regions in the human genome are favored targets for MLV integration.
21. Zack, J. A., S. J. Arrigo, S. R. Weitsman, A. S. Go, A. Haislip, and I. S. Chen.
1990. HIV-1 entry into quiescent primary lymphocytes: molecular analysis
reveals a labile, latent viral structure. Cell 61:213–222.