JOURNAL OF CLINICAL MICROBIOLOGY, Aug. 2011, p. 2859–2867
Copyright © 2011, American Society for Microbiology. All Rights Reserved.
Vol. 49, No. 8
Identification of HIV Superinfection in Seroconcordant Couples in
Rakai, Uganda, by Use of Next-Generation Deep Sequencing▿†
Andrew D. Redd,1 Aleisha Collinson-Streng,1 Craig Martens,2 Stacy Ricklefs,2 Caroline E. Mullis,3
Jordyn Manucci,3 Aaron A. R. Tobian,3 Ethan J. Selig,2 Oliver Laeyendecker,1,3
Nelson Sewankambo,4,6 Ronald H. Gray,5 David Serwadda,4,7 Maria J. Wawer,5
Stephen F. Porcella,2 and Thomas C. Quinn1,3* on behalf of the
Rakai Health Sciences Program
Laboratory of Immunoregulation, DIR, NIAID, NIH, Baltimore, Maryland1; Genomics Unit, Research Technologies Section,
Rocky Mountain Laboratories, DIR, NIAID, NIH, Hamilton, Montana2; Johns Hopkins Medical Institute, Johns Hopkins University,
Baltimore, Maryland3; Rakai Health Sciences Program, Kalisizo, Uganda4; Bloomberg School of Public Health,
Johns Hopkins University, Baltimore, Maryland5; School of Medicine, Makerere University,
Kampala, Uganda6; and School of Public Health, Makerere University, Kampala, Uganda7
Received 18 April 2011/Returned for modification 31 May 2011/Accepted 13 June 2011
HIV superinfection, which occurs when a previously infected individual acquires a new distinct HIV strain,
has been described in a number of populations. Previous methods to detect superinfection have involved a
combination of labor-intensive assays with various rates of success. We designed and tested a next-generation
sequencing (NGS) protocol to identify HIV superinfection by targeting two regions of the HIV viral genome,
p24 and gp41. The method was validated by mixing control samples infected with HIV subtype A or D at
different ratios to determine the inter- and intrasubtype sensitivity by NGS. This amplicon-based NGS protocol
was able to consistently identify distinct intersubtype strains at ratios of 1% and intrasubtype variants at ratios
of 5%. By using stored samples from the Rakai Community Cohort Study (RCCS) in Uganda, 11 individuals
who were HIV seroconcordant but virally unlinked from their spouses were then tested by this method to detect
superinfection between 2002 and 2005. Two female cases of HIV intersubtype superinfection (18.2%) were
identified. These results are consistent with other African studies and support the hypothesis that HIV
superinfection occurs at a relatively high rate. Our results indicate that NGS can be used for detection of HIV
superinfection within large cohorts, which could assist in determining the incidence and the epidemiologic,
virologic, and immunological correlates of this phenomenon.
HIV superinfection occurs when a known HIV-infected in-
dividual is subsequently infected with a new phylogenetically
distinct viral strain or strains. The first documented cases of
HIV superinfection were found in individuals with various
modes of transmission and included inter- and intrasubtype
cases (1, 9, 17). Subsequently, multiple studies have documented
superinfection in small populations of high-risk individuals
(2, 3, 7, 8, 11, 13, 17, 20, 22, 23, 26, 29). Superinfection in these
high-risk groups was relatively frequent, occurring at a rate
comparable to the HIV incidence rate in similar populations
from the same regions, especially if multiple viral genes
were examined (3, 13, 14, 21, 24). In contrast, other researchers
have found no evidence of superinfection in large-scale
population studies (6, 15). One possible reason for this discrepancy
is the difference in the techniques and criteria used to
identify superinfection (16). Initial studies designed to examine
the frequency of superinfection utilized heteroduplex mobility
assays (HMAs) or multiregion hybridization assays (MHAs)
followed by selective clonal analysis of those samples that dem-
onstrated the presence of new viral variants (3, 11, 15). MHA
screening is limited in that it can only identify intersubtype
superinfection, while possibly missing intrasubtype superinfection.
Although HMA is sensitive enough to detect samples with
≥1.5% differences in pairwise distance, it is susceptible to
false positives due to the presence of insertions or deletions
(16). Additionally, both the HMA and MHA methods re-
quire verification using in-depth cloning and Sanger se-
quencing (13, 16). The sensitivity of these screening/cloning
techniques is dependent on the number of clones amplified
and the number of genes examined (12–14). To detect a
minor variant approaching 1%, over 100 clones would need
to be examined per sample, preferably from multiple PCRs
to increase the amount and diversity of viral strains se-
quenced (14, 16). With the need to examine multiple regions
of the viral genome to ensure accurate phylotyping and
identification of superinfecting strains, in-depth cloning and
Sanger sequencing are prohibitively labor-intensive for
large-scale studies (12, 14, 16).
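The clone-count argument above can be made concrete with a simple sampling model. The sketch below is illustrative only: it assumes each sequenced clone is an independent, unbiased draw from the viral population (an idealization that the source does not state), and the function names are hypothetical.

```python
import math

def prob_detect(freq: float, n_clones: int) -> float:
    """Probability that at least one of n_clones carries a variant present at `freq`.

    Assumes independent, unbiased sampling of clones (an idealization).
    """
    return 1.0 - (1.0 - freq) ** n_clones

def clones_needed(freq: float, confidence: float) -> int:
    """Smallest clone count giving at least `confidence` chance of one or more hits."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - freq))

if __name__ == "__main__":
    # 100 clones screened for a variant at 1% of the population
    print(round(prob_detect(0.01, 100), 3))
    # clones required for a 95% chance of sampling that variant at least once
    print(clones_needed(0.01, 0.95))
```

Under this model, 100 clones give only about a 63% chance of sampling a 1% variant even once, and close to 300 clones would be needed for a 95% chance, which is consistent with the text's point that clonal analysis scales poorly.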
Newly developed next-generation sequencing (NGS) tech-
niques provide unprecedented sequencing depth, offer the
ability to multiplex samples, and are quicker, more cost-effec-
tive, and less labor-intensive than cloning and Sanger-based
sequencing (12). Using several genomic targets and high sequence
volume, NGS should be able to distinguish minor variants,
whether they arose spontaneously through recombination or
within-host viral evolution or came from newly introduced strains or
subtypes (18, 30).
* Corresponding author. Mailing address: Johns Hopkins University
School of Medicine, Rangos Building, Room 530, 855 N. Wolfe St.,
Baltimore, MD 21205. Phone: (410) 955-7635. Fax: (410) 614-9775.
† Supplemental material for this article may be found at http://jcm
▿ Published ahead of print on 22 June 2011.
We designed and tested an NGS protocol and sequence
analysis pipeline that focuses on amplification and sequencing
of the p24 region of the viral capsid and the gp41 region of the
viral envelope. These genomic regions were chosen for exam-
ination because they are relatively genetically stable and of
sufficient length, are suitable for phylotyping, were previously
used in PCR cloning and Sanger sequencing studies, and are
not high in polymeric regions. We further tested the protocol
with 11 individuals from virally unlinked HIV seroconcordant
couples from Rakai District, Uganda, to detect the occurrence
of HIV superinfection.
MATERIALS AND METHODS
Ethics statement. All subjects provided written informed consent for their
samples to be stored and used for future unspecified HIV-related research. The
study was approved by the Science and Ethics Committee of the Uganda Virus
Research Institute, the Western Institutional Review Board, and the Committee
on Human Research at the Johns Hopkins Bloomberg School of Public Health.
Study population and subjects. Serum samples were retrospectively selected
from individuals in the Rakai Community Cohort Study (RCCS), a rural, com-
munity-based open cohort consisting of persons aged 15 to 49 years in Rakai
District, southwestern Uganda (27). Since 1994, interviews and venous blood
samples have been obtained annually from approximately 14,000 consenting
adults living in 50 villages. As part of the routine interview, consenting individ-
uals in stable sexual partnerships are linked as couples.
Control serum samples were selected from HIV-infected individuals who were
previously identified as being infected with either subtype A (n = 4) or D (n = 6)
in the 2002 community survey. Identification of subtypes was performed by
Sanger sequencing of cloned PCR products of the p24 and gp41 target regions.
Using stored sera from 2002, we identified 18 HIV-infected individuals in 9
HIV-seroconcordant couples whose viruses were phylogenetically unlinked to
their partner’s virus, as determined by previous Sanger sequencing for either the
gp41 or p24 regions (4). Each individual’s samples were labeled with gender,
couple number, and year of sample draw (e.g., female_1_C1_2002). Of the 18
individuals, 11 had serum samples available in 2005, and these were examined for
HIV superinfection in this population. Four of the 11 individuals were from two
couples (couples 1 and 2) of which both members had serum samples available
from 2002 and 2005; however, for this analysis, each individual was analyzed
independently. The remaining seven individuals only had serum samples avail-
able in 2002 but were included in this study to search for the source of any new
superinfecting HIV strains found in their partner’s 2005 samples.
Viral RNA extraction, cDNA synthesis, and PCR target amplification. Viral
RNA was extracted from 140 μl of serum using a QIAamp viral RNA minikit
(Qiagen, Valencia, CA) and eluted into 50 μl of Qiagen buffer AVE. For each
genomic target region (p24 and gp41), two 50-μl reverse transcription-PCRs
(RT-PCRs) were performed simultaneously to maximize the amount and diversity
of viral RNA genomes amplified per sample. For the gp41 region, each 50-μl
RT-PCR was performed using a 40-μl master mix composed of 20 μl of
double-distilled water (ddH2O), 10 μl of 5× buffer, 3 μl of deoxynucleoside
triphosphates (dNTPs), and 2 μl of enzyme mixture from the Qiagen OneStep RT-PCR
kit. One microliter of RNase inhibitor was also added, along with 2 μl of 20 μM
dilutions of both the forward primer (GP50F1-HXB2 nt 7691→7720) and the
reverse primer (GP41R1-HXB2 nt 8347←8374) (see the supplemental material).
This master mix was combined with 10 μl of purified viral RNA and incubated
for 30 min at 50°C and 15 min at 94°C for RT extension. PCR was then
performed for 35 cycles of 30 s at 94°C, 35 s at 53.5°C, and 90 s at 72°C, followed
by 72°C for 10 min. For the p24 region, the 50-μl RT-PCRs were carried out
using the same master mix as described above, with one exception: forward and
reverse primers specific for the p24 target were used and designated G00 (HXB2
nt 764→782) and G01 (HXB2 nt 2264←2281), respectively (see the supplemental
material). For two samples that did not amplify the p24 region during the
initial PCR, a reformulated 40-μl master mix containing 20 μl of ddH2O, 10 μl
of 5× buffer, 3 μl of dNTPs, and 2 μl of enzyme mix from the Qiagen OneStep
RT-PCR kit, as well as 2 μl of MgCl2, 1 μl of RNase inhibitor, and 1.5 μl of 20
μM dilutions of both the forward and reverse primers, was used. The two
samples were pooled to maximize the depth of detection, and 10 μl of this pool
was used in a nested 100-μl PCR using primer sets for gp41 (E55 primer set with
14 454-bar-coded variations [MID1 to MID14]) or p24 (G100 primer set with 14
454-bar-coded variations [MID1 to MID14]) (Roche, Inc., Branford, CT) (see
the supplemental material). Briefly, each nested-PCR mixture for gp41 or p24
contained 90 μl of master mix composed of 50.4 μl ddH2O, 10 μl 10× reaction
buffer, 20 μl MgCl2, 3 μl dNTPs, and 0.6 μl HotStarTaq DNA polymerase
(Qiagen, Valencia, CA) as well as 3 μl of the forward and reverse E55 primers
or 3 μl of the forward and reverse G100 primer set both at a 20 μM final
concentration for both regions (see the supplemental material). The PCR
amplification conditions for the 100-μl nested reactions were identical to the
first-round PCR conditions described above. Successful single-band amplification of
gp41 or p24 target products was verified by agarose gel electrophoresis.
TABLE 1. Sequence read totals and consensus distribution for pure subtype samples and mixture analysis
(consensus sequences and minor variants detected at ≥10 reads; NA, not applicable)
Serum HIV-1 RNA concentrations (viral loads) were determined with the
Amplicor v1.5 assay (Roche Diagnostics, Basel, Switzerland).
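The reaction volumes given above for the first-round gp41 RT-PCR and for the nested PCR can be checked with a few lines of arithmetic. The sketch below is a bookkeeping aid only, not part of the published protocol; the class and function names are hypothetical, and the component labels simply restate the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Component:
    name: str
    volume_ul: float

# First-round gp41 RT-PCR master mix, volumes as listed in the text.
FIRST_ROUND_GP41 = [
    Component("ddH2O", 20.0),
    Component("5x buffer", 10.0),
    Component("dNTPs", 3.0),
    Component("OneStep enzyme mix", 2.0),
    Component("RNase inhibitor", 1.0),
    Component("forward primer, 20 uM", 2.0),
    Component("reverse primer, 20 uM", 2.0),
]

# Nested PCR master mix, volumes as listed in the text.
NESTED = [
    Component("ddH2O", 50.4),
    Component("10x reaction buffer", 10.0),
    Component("MgCl2", 20.0),
    Component("dNTPs", 3.0),
    Component("HotStarTaq", 0.6),
    Component("forward primer", 3.0),
    Component("reverse primer", 3.0),
]

def total_volume(mix):
    """Sum of component volumes in microliters, rounded to avoid float noise."""
    return round(sum(c.volume_ul for c in mix), 3)

def final_conc(stock_um, added_ul, reaction_ul):
    """Final concentration after simple dilution (C1 * V1 = C2 * V2)."""
    return stock_um * added_ul / reaction_ul

if __name__ == "__main__":
    print(total_volume(FIRST_ROUND_GP41))   # master mix before adding 10 ul of RNA
    print(total_volume(NESTED))             # master mix before adding 10 ul of pooled product
    print(final_conc(20.0, 2.0, 50.0))      # first-round primer concentration in uM
```

Both mixes sum to the stated master-mix volumes (40 μl and 90 μl), and 2 μl of a 20 μM primer dilution in a 50-μl reaction corresponds to 0.8 μM.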
Generation of control samples for inter- and intrasubtype NGS threshold
detection. Control serum samples from HIV-infected individuals, previously
identified via Sanger sequencing of PCR fragments as being infected with either
subtype A (n = 4) or D (n = 6), were used to determine the assay’s limit of
detection. Phylogenetically unlinked viral isolates were mixed in inter- and
intrasubtype experiments for each viral target region.
For the p24 region, viral extracts from two HIV subtype A-infected control
individuals (A1 and A2) and four subtype D-infected individuals (D1 to D4) were
amplified separately in the first-round PCR. Aliquots were collected and set
aside for pure sample analysis, while aliquots of each control sample were also
mixed at a variety of ratios. The following ratios were tested for the p24 target
region: 50:50 A2-D1, 95:5 A2-D1, 99:1 A2-D1, 99.9:0.1 D3-A2, 95:5 A1-A2, 95:5
D1-D2, and 95:5 D3-D4. Nested PCRs were performed with these samples as
described above.
For the gp41 region, viral extracts from two HIV subtype A-infected individuals
(A3 and A4) and two subtype D-infected control individuals (D5 and D6)
were amplified separately in the first-round PCR. Aliquots of the first-round
PCR were collected and set aside for pure sample analysis, while aliquots of each
were mixed at a variety of ratios. The following ratios were tested for the gp41
target region: 50:50 A4-D5, 95:5 A4-D5, 99:1 A4-D5, 99.9:0.1 A4-D5, 95:5 A3-A4,
and 95:5 D5-D6. Nested PCRs were performed with these samples as described
above.
PCR product purification. The amplicon library preparation method was performed
as recommended by the manufacturer (Roche, Branford, CT), and all
PCR products were purified with the following minor alterations. In an effort to
eliminate excess primers, the bead/target ratio was reduced by incubation of 30
μl of AMPure XP beads (Agencourt, Beckman Coulter Genomics, Danvers,
MA) with 25 μl of PCR product diluted in 25 μl of water. Purified PCR products
were quantified using PicoGreen (Invitrogen, Carlsbad, CA), and each template
was diluted to a 1 × 10^9 molecules/μl stock. The amplicon pools were made by
combining 5 μl of each diluted bar-coded template to make a final 1 × 10^9
molecules/μl stock containing 14 bar-coded amplicons.
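Converting a fluorometric DNA concentration into molecules per microliter, as needed for the 1 × 10^9 molecules/μl stock, is a short calculation. The sketch below assumes an average mass of 660 g/mol per double-stranded base pair (a common approximation; the source does not say which constant was used) and a hypothetical 400-bp amplicon, since the amplicon lengths are not given in this section.

```python
AVOGADRO = 6.022e23        # molecules per mole
G_PER_MOL_PER_BP = 660.0   # assumed average mass of one double-stranded base pair

def molecules_per_ul(conc_ng_per_ul, amplicon_bp):
    """Convert a dsDNA concentration (ng/ul) into molecules per microliter."""
    grams_per_ul = conc_ng_per_ul * 1e-9
    moles_per_ul = grams_per_ul / (amplicon_bp * G_PER_MOL_PER_BP)
    return moles_per_ul * AVOGADRO

def dilution_factor(current_molecules_per_ul, target_molecules_per_ul=1e9):
    """Fold dilution needed to reach the target stock concentration."""
    return current_molecules_per_ul / target_molecules_per_ul

if __name__ == "__main__":
    # Hypothetical example: 1.0 ng/ul of a 400-bp amplicon
    stock = molecules_per_ul(1.0, 400)
    print(f"{stock:.3e} molecules/ul, dilute {dilution_factor(stock):.2f}-fold")
```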
DNA sequencing. Preparation of templated beads for NGS followed the
emPCR Method Manual—Lib-L-MV (17a). The library pools containing 1 × 10^9
molecules/μl were diluted to 1 × 10^5 molecules/μl for a target addition of 0.175
copies per bead to the DNA capture beads. The live amplification mixture was
based on the reagent volumes for paired-end libraries to reduce the amount of
amplification primer in the reactions and thereby reduce the bead signal intensity
during sequencing. Enriched DNA capture beads were sequenced on the Roche
454 system (Roche, Branford, CT) per the manufacturer’s instructions, using a
four-region gasket when indicated.
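The dilution and copies-per-bead figures above reduce to two small calculations. In the sketch below, the bead count is a hypothetical placeholder (the source does not state how many capture beads were used per emulsion), so the resulting volume is illustrative rather than a protocol value.

```python
def fold_dilution(stock_conc, working_conc):
    """Fold dilution required to go from the stock to the working concentration."""
    return stock_conc / working_conc

def library_volume_ul(copies_per_bead, n_beads, working_conc):
    """Volume of working library (ul) that delivers the target copies per bead."""
    return copies_per_bead * n_beads / working_conc

if __name__ == "__main__":
    # 1e9 -> 1e5 molecules/ul, as stated in the text
    print(fold_dilution(1e9, 1e5))
    # 0.175 copies per bead from the text; 2.4e6 beads is a hypothetical bead count
    print(round(library_volume_ul(0.175, 2.4e6, 1e5), 2))
```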
Sequence segregation. Sequencing results were analyzed using the GS Amplicon
Variant Analyzer, version 2.5 (Roche, Branford, CT). All sequence reads were
compared, and similar sequences were combined into a single consensus
sequence. Generated consensus sequences that were within 10 bases of both
ends of the amplicon and composed of a cluster of 10 or more individual, nearly identical
sequences were identified using the Roche Amplicon software and
were classified as being consensus sequences of HIV variants. These consensus
sequences were used for subsequent phylogenetic analysis.
FIG. 1. Mixture analysis of intersubtype detection by NGS. Neighbor-joining trees of p24 next-generation consensus sequences (≥10 identical
reads) of control samples A2 (A; blue) and D1 (B; green), a mixture of A2 and D1 at a 50:50 ratio (C; red), and a merged tree of all three
sample runs (D) are shown. The trees were constructed with a selection of subtype reference sequences and random sequences from individuals
in Rakai shown in black. Brackets demonstrate the source of different clades within the merged trees. Bootstrap values higher than 80% are shown
for nonmerged trees (1,000 replicates).
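The two acceptance criteria for a consensus sequence (coverage to within 10 bases of both amplicon ends and support from at least 10 reads) can be expressed as a simple filter. This is a minimal sketch and not the Roche software's implementation: the `Consensus` record, its coordinate convention, and the 400-base amplicon length in the example are all assumptions made for illustration.

```python
from dataclasses import dataclass

MIN_READS = 10        # cluster size cutoff stated in the text
MAX_END_OFFSET = 10   # consensus must reach within 10 bases of both amplicon ends

@dataclass
class Consensus:
    name: str
    start: int    # 0-based first covered position on the amplicon (assumed convention)
    end: int      # 0-based last covered position, inclusive (assumed convention)
    n_reads: int  # number of nearly identical reads in the cluster

def passes_filter(c: Consensus, amplicon_len: int) -> bool:
    """Apply the end-coverage and read-count criteria to one consensus sequence."""
    near_start = c.start <= MAX_END_OFFSET
    near_end = (amplicon_len - 1 - c.end) <= MAX_END_OFFSET
    return near_start and near_end and c.n_reads >= MIN_READS

if __name__ == "__main__":
    amplicon_len = 400  # hypothetical amplicon length
    candidates = [
        Consensus("full_length_major", 0, 399, 850),
        Consensus("too_few_reads", 2, 397, 4),
        Consensus("truncated", 0, 300, 120),
    ]
    kept = [c.name for c in candidates if passes_filter(c, amplicon_len)]
    print(kept)
```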
Phylogenetic analysis. Consensus sequences, subtype reference sequences,
and a selection of subtype reference sequences collected from Rakai (see the
supplemental material) were aligned using ClustalW (25). Phylogenetic trees
were generated by the neighbor-joining method (19). Statistical support for a
specific clade in each phylogeny was obtained by bootstrapping (1,000 replicates).
The NGS consensus sequences for gp41 and p24 have been submitted to
GenBank (see below) and are also available upon request (email@example.com).
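Two ingredients of the analysis described above, a pairwise distance between aligned sequences and bootstrap resampling of alignment columns, can be illustrated without any external library. The sketch below is not the authors' pipeline and does not build a neighbor-joining tree; it uses the uncorrected p-distance (an assumption, since the distance model is not stated) and toy sequences invented for the example.

```python
import random

def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences, skipping gap columns."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)

def bootstrap_columns(alignment: dict, rng: random.Random) -> dict:
    """Resample alignment columns with replacement to build one bootstrap pseudo-replicate."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}

if __name__ == "__main__":
    # Toy alignment, invented for illustration only
    alignment = {
        "ref_A": "ACGTACGTAC",
        "cons_1": "ACGTACGTAT",
        "cons_2": "ACGAACG-AC",
    }
    print(p_distance(alignment["ref_A"], alignment["cons_1"]))
    replicate = bootstrap_columns(alignment, random.Random(0))
    print({name: len(seq) for name, seq in replicate.items()})
```

In a full analysis, a tree would be rebuilt from each of the 1,000 pseudo-replicates, and the support for a clade would be the fraction of replicate trees in which that clade appears.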
HIV superinfection definition and analysis. HIV superinfection was defined in
an individual whose 2005 serum sample demonstrated two or more distinct
consensus sequences forming a monophyletic cluster that was phylogenetically
unlinked from all of the individual’s consensus sequences in the 2002 sample. In
order to be considered a superinfection, the genetic distance of the new
monophyletic cluster from the closest related viral sequences found at the earlier time
point had to be either ≥0.55% per year for the p24 region, ≥0.98% per year for
the gp41 region for subtype D and ≥0.59% per year for the p24 region, or
≥0.72% per year for the gp41 region for subtype D, which is equal to the mean
plus twice the standard deviation of the intraperson viral divergence or
evolutionary rate of each HIV-1 subtype in Rakai, Uganda (data not shown). All newly
identified consensus sequences were phylogenetically compared to the most
prominent strains of the other bar-coded samples within NGS runs to search for
microcontamination, misclassification, or sequencing errors. If instances of these
errors were found, these consensus sequences were eliminated. For further
verification, newly identified superinfecting viral strain sequences were translated
and analyzed in order to check that a functional protein sequence was encoded
in the sequence. Newly discovered superinfecting consensus sequences within an
individual were compared phylogenetically to their partner’s consensus viral
sequences in order to determine if the partner was the source of the new
strain.
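The numeric part of the definition above can be sketched as a threshold test. The code below covers only the consensus-count and genetic-distance criteria; the monophyly and phylogenetic-linkage criteria require a tree and are not modeled. The function names are hypothetical, the inclusive comparison is an assumption, and the caller supplies the per-year threshold for the relevant region and subtype.

```python
def exceeds_threshold(distance_pct, years, threshold_pct_per_year):
    """True if the observed distance is at least the divergence expected at the threshold rate."""
    return distance_pct >= threshold_pct_per_year * years

def is_candidate_superinfection(n_new_consensus, distance_pct, years, threshold_pct_per_year):
    """Numeric screen only: two or more new consensus sequences plus sufficient distance."""
    return n_new_consensus >= 2 and exceeds_threshold(
        distance_pct, years, threshold_pct_per_year
    )

if __name__ == "__main__":
    years = 2005 - 2002      # sampling interval used in the study
    p24_threshold = 0.55     # one of the p24 per-year thresholds quoted in the text
    # Hypothetical distances, for illustration only
    print(is_candidate_superinfection(3, 8.0, years, p24_threshold))
    print(is_candidate_superinfection(1, 8.0, years, p24_threshold))
    print(is_candidate_superinfection(3, 1.0, years, p24_threshold))
```

With a 3-year interval, a 0.55% per year threshold corresponds to a total distance of 1.65% between the new cluster and the closest 2002 sequence.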
Nucleotide sequence accession numbers. The nucleotide consensus sequences
for the gp41 region have been deposited in GenBank under accession no.
JN153104 to JN155099, and the nucleotide consensus sequences for the p24
region have been deposited in GenBank under accession no. JN155100 to
JN157600. The sequences are also available on request.
RESULTS
Genomic target regions, sequencing depth, and consensus
sequence analysis. The p24 and gp41 regions of the viral genome
were chosen for NGS because they are located at opposing
ends of the HIV genome and are two of the more
conserved areas of the genome. Previous research has indicated
that the sensitivity of NGS for HIV quasispecies detection
is 0.1% (30). Therefore, estimating an approximate read
volume of 10,000 reads per sample, a cutoff of 10 similar reads,
as determined by the Roche segregation software, was selected
to qualify as a consensus sequence for further analysis. A cutoff
of five sequences was also examined and found not to affect the
findings or the overall sensitivity of the assay (12). However,
when the consensus cutoff was dropped to two similar sequences,
small amounts of microcontaminating sequences reflecting
the inherent error rate for the technology were discovered.
Therefore, for the purposes of this study, 10 reads or
more was the threshold for quality consensus viral sequences
(see Fig. S1 in the supplemental material).
FIG. 2. Inter- and intrasubtype detection of the p24 region by NGS. Neighbor-joining trees of p24 next-generation consensus sequences (≥10
identical reads) of intersubtype mixtures of A2 and D1 (red) at 95:5 (A) and 99:1 (B) ratios are shown. Intrasubtype mixtures of A2 and A1 (C;
blue) and D3 and D4 (D; green) at the ratio of 95:5 are shown. The trees were constructed with a selection of subtype reference sequences and
random sequences from individuals in Rakai shown in black. Brackets demonstrate the source of different clades within the merged trees.
Bootstrap values higher than 80% are shown (1,000 replicates).
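The relationship between read depth, the 10-read cutoff, and detectable variant frequency can be worked through numerically. The sketch below treats reads as independent binomial draws from the amplified population, which is a simplifying assumption of this example (it ignores PCR bias, primer mismatch, and clustering of near-identical reads) and not a model stated in the source.

```python
import math

def cutoff_fraction(min_reads: int, total_reads: int) -> float:
    """Variant frequency at which the expected read count equals the cutoff."""
    return min_reads / total_reads

def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), computed as 1 minus the lower tail."""
    lower_tail = sum(
        math.comb(n, i) * (p ** i) * ((1.0 - p) ** (n - i)) for i in range(k)
    )
    return 1.0 - lower_tail

if __name__ == "__main__":
    reads = 10_000
    print(cutoff_fraction(10, reads))             # frequency matching the 10-read cutoff
    print(round(prob_at_least(10, reads, 0.001), 3))  # variant at 0.1%
    print(round(prob_at_least(10, reads, 0.01), 3))   # variant at 1%
```

Under this idealized model, a variant at exactly 0.1% reaches 10 reads only slightly more often than not, whereas a variant at 1% is expected to contribute about 100 reads and clears the cutoff almost always. This is in line with the mixture results, where 1% intersubtype variants were detected consistently and 0.1% variants were not.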
p24 inter- and intrasubtype analysis. Previous Sanger sequencing
of PCR fragments of the p24 region identified two
subtype A (A1 and A2) and four subtype D (D1, D2, D3, and
D4) samples used in this analysis (Table 1) (5). In order to test
the intra- and intersubtype viral population sensitivities of our
NGS protocol, first-round PCR products targeting the p24
region from these subtype A and subtype D samples were
mixed in various ratios, amplified, and sequenced on the
Roche 454 system as described above (Table 1 and Fig. 1 and
2A to D; see Fig. S2A to C in the supplemental material). In
order to exclude cross contamination or poor-quality reads,
consensus read data sets for all mixtures were merged, and the
resulting trees were constructed (Fig. 1D). These data demonstrate
that reads specific for the mixed-ratio samples segregate
properly to their respective branch locations for the
components of the mixture and that the NGS protocol provides
good depth and quality sequence sorting during phylogenetic
analysis (Fig. 1). The ratios of A2 to D1 of 95:5 and 99:1
were examined to determine if NGS would provide adequate
depth and representation of the subtypes at these ratios (Fig.
2A and B). The minor variant (D1 in both cases) was adequately
represented in both trees despite its lower frequency, although
with a slight decrease in the number of consensus reads in the
99:1 ratio (Fig. 2B).
FIG. 3. Intersubtype detection of the gp41 region by NGS. Neighbor-joining trees of gp41 next-generation consensus sequences (≥10 identical
reads) of intersubtype mixtures of A4 and D5 (red) at 50:50 (A), 95:5 (B), 99:1 (C), and 99.9:0.1 (D) ratios are shown. The trees were constructed
with a selection of subtype reference sequences and random sequences from individuals in Rakai shown in black. Brackets demonstrate the source
of different clades within the merged trees. Bootstrap values higher than 80% are shown (1,000 replicates).
To further test the sensitivity of this assay, we analyzed a
mixture of D3 to A2 at a ratio of 99.9:0.1. When we merged
these ratio data with the control data sets (D3 and A2), the
minor variant (A2) did not appear in the data (see Fig. S2C in
the supplemental material). These results suggest that for the
p24 target, an intersubtype ratio of ≤0.1% cannot be reliably
identified by this NGS protocol.
In order to test the protocol for its ability to adequately
sequence and separate related subtypes, the following ratios
were tested: 95:5 A1-A2, 95:5 D1-D2, and 95:5 D3-D4 (Table 1
and Fig. 2C and D; see Fig. S2A and B in the supplemental
material). The minor viral variant population in the 95:5 A1/A2
ratio (A2) was identified as 14.5% of the total number of
consensus sequences (Table 1 and Fig. 2C). The 95:5 D1/D2
ratio sample did not appear to adequately amplify the minor
variant (D2) when the data were merged with the data sets for
D1 and D2 (Table 1; see Fig. S2B in the supplemental material).
This suggests a lower limit for D1- versus D2-related
intrasubtype identification for the p24 target. To determine if
this lack of detection or amplification of D2 was unique to the
D1/D2 ratio of 95:5, this test was repeated using the ratio of
95:5 D3-D4. In this test, the minor variant (D4) was identified
in 25% of the total number of consensus sequences (Table 1
and Fig. 2D). It was found that the consensus sequences that
were expanded from the minor variant (D3) corresponded to
the most prominent subtype sequences present in the pure
sample for D3 (see Fig. S2A in the supplemental material).
gp41 inter- and intrasubtype analysis. Due to limited
amounts of viral RNA available for samples A1, A2, and D1 to
D4, different control samples were used to test the minor intra-
and intersubtype viral population sensitivities of our NGS
protocol of the gp41 region (A3, A4, D5, and D6) (Table 1 and Fig.
3). The majority of the p24 NGS reactions were performed on
a full 454 slide with 14 different bar-coded samples, whereas
the gp41 test samples were run on a slide that had been divided
into four quadrants. The reason for this change was to increase
the sample throughput per run, resulting in a lower read
volume per bar-coded sample (Table 1).
NGS analysis of all four intersubtype mixtures (A versus D)
for the gp41 region demonstrated detectable consensus se-
quences of the minor variant (Table 1 and Fig. 3A to D).
However, in the case of the 99.9:0.1 mixture, only one consen-
sus sequence from the minority variant subtype was amplified
(Table 1 and Fig. 3D). While the sensitivity for minor viral
variants was increased for gp41 relative to the results for p24,
the lack of two or more distinct consensus sequences means
that this would not qualify as a superinfecting viral species
according to the parameters described above.
TABLE 2. Subject viral loads, sequence read totals, and consensus subtype distribution. Results for samples with
superinfecting strains are shown in bold. Corresponding partner read totals and consensus sequence are indicated for the individual’s
samples, which are labeled with their gender, couple number, and year of sample draw.
NGS analysis of the two intrasubtype comparisons (A3 versus
A4 or D5 versus D6) at a 95:5 ratio demonstrated that in a
merged data format, the minor variants (A4 and D5) were
detected (Table 1; see Fig. S3A and B in the supplemental
material). These data also demonstrated that the A3 individual,
who previously was identified by PCR cloning and Sanger
sequencing analysis as being infected with only subtype A, was
in fact infected with two distinct variants which coincided with
both subtypes A and D (see Fig. S3A in the supplemental
material).
HIV superinfection in Rakai, Uganda. Eleven HIV-infected
individuals from whom serum samples were collected in
2002 and 2005 were evaluated at both p24 and gp41 for
evidence of HIV superinfection (Table 2). In addition, for
each individual, their partner’s sample from 2002, or in the
case of two couples (C_1 and C_2), the samples from 2002
and 2005, were amplified and sequenced by NGS to examine
whether superinfecting strains discovered in 2005 originated from
their partner (Table 2). Serum HIV loads were determined
for each sample tested (Table 2). Each member was treated
independently in this analysis.
Using NGS, two of the 11 individuals (18.2%) had evidence
of HIV superinfection in their 2005 sera (Table 2 and Fig. 4
and 5). The first case of superinfection was documented in
female_C1, who was infected in 2002 with a viral population
that grouped with subtype D in the p24 region and with sub-
types D and C in the gp41 region (Table 2 and Fig. 4A; see Fig.
S4 in the supplemental material). In 2005, she had multiple
consensus sequences in the p24 target region which grouped
with subtype A, indicating superinfection with a new HIV
species (Fig. 4B). NGS analysis of her male partner (male_C1)
demonstrated that he was infected with an apparent D/C re-
combinant strain that was linked with his female partner’s viral
strains in both regions in 2002 and 2005 when examined in a
merged phylogenetic tree (merged data not shown), indicating
that she was superinfected by another source (Table 2).
The second case of superinfection was observed in female_C3,
who was initially infected with HIV subtype D in both genomic
regions (Table 2 and Fig. 5A; see Fig. S5 in the supplemental
material). In her 2005 sample, she had acquired a new viral
strain in the p24 region with multiple consensus sequences that
clustered with subtype A (Fig. 5B). Her partner, male_C3, was
infected in 2002 with a dual population of viruses that clustered
with subtypes D and C in the gp41 region and subtype D in the
p24 region (Table 2). Merged phylogenetic tree analysis
demonstrated that her superinfecting strain was not found in her
partner, suggesting she was superinfected by another source
(merged data not shown). No other cases of superinfection
were observed in the remaining nine individuals during merged
and unmerged phylogenetic tree analysis (Table 2).
FIG. 4. Detection of HIV superinfection in the p24 region. Neighbor-joining trees of HIV p24 next-generation consensus sequences (≥10
identical reads) from female_C1 (green) in 2002 (A) and 2005 (B) are shown. The trees were constructed with a selection of subtype reference
sequences and random sequences from individuals in Rakai shown in black. Superinfecting strains are shown with a circle. Brackets demonstrate
the individual’s HIV subtypes within the trees. Bootstrap values higher than 80% are shown (1,000 replicates).
DISCUSSION
Identification of HIV superinfection in the past has been
accomplished using a variety of screening techniques in con-
junction with labor-intensive cloning or single-genome ampli-
fication (3, 6, 11–13, 21). This has led to a significant amount
of variability in the estimated rates of HIV superinfection (3, 6,
8, 21). The data presented here describe a new NGS protocol
to identify HIV superinfection with relatively high inter- and
intrasubtype sensitivities. The consensus of 10 repeated se-
quences was chosen since it was approximately 1/1,000 of the
estimated total reads and appeared to be an appropriate cutoff
to identify inter- and intrasubtype minor variants while avoid-
ing data artifacts. Using mixtures of HIV-infected samples
containing subtypes A and D, the predominant viral species
found in Uganda, the assay’s intersubtype sensitivity in both
the p24 and gp41 target regions was determined to be at least
1%. Minor viral strains were found at lower levels (0.1%) in
the gp41 region, but not consistently or at high enough con-
sensus counts to lower the threshold of detection for the pro-
tocol. Intrasubtype sensitivity was approximately 5%, although
intrasubtype detection within the subtype A mixtures seemed
more robust than that for the subtype D samples. We hypothesize
that primer specificity and target sequence variation may
be driving some of these differences, which is a limitation of our
protocol.
The NGS protocol identified two cases of HIV
superinfection, both in women, among 11 individuals who were
members of virally unlinked, concordantly infected couples. In both
cases, the superinfecting strain was HIV subtype A, which has
been shown to be more infectious than subtype D (10). In
addition, both women’s viral loads increased during the period.
None of the superinfecting strains were detected in the wom-
en’s male partners, suggesting that the superinfecting strain
was acquired from another source. It is possible that the new
strains found in these two individuals were present in the
earlier time points at levels that were too low to be detected in
our assay. However, according to the data from our mixture
analysis, the levels in the first time point would most likely be
less than 1%, and therefore we feel these events should be
classified as superinfections. The relatively high proportion of
superinfected individuals in our population agrees with other
studies of high-risk individuals in Africa (13, 14). However,
given the small number of individuals examined, further investigation
is needed to estimate the rate and correlates of superinfection
in the Rakai population.

FIG. 5. Detection of HIV superinfection in the p24 region. Neighbor-joining trees of HIV p24 next-generation consensus sequences (≥10
identical reads) from female_C3 (blue) in 2002 (A) and 2005 (B) are shown. The trees were constructed with a selection of subtype reference
sequences and random sequences from individuals in Rakai shown in black. Superinfecting strains are shown with a circle. Brackets demonstrate
the individual’s HIV subtypes within the trees. Bootstrap values higher than 80% are shown (1,000 replicates).

In addition, the individuals in this study were selected based upon a
high likelihood of superinfection since they were initially virally
unlinked from their partners and therefore may not represent the natural rate
of superinfection in the larger HIV-infected population. NGS
is substantially easier and more cost-effective than previous
methods used to detect superinfection, particularly for screen-
ing large numbers of subjects (12, 28). It should be noted that
NGS protocols like ours require specialized equipment that
somewhat limits their utility in resource-poor settings. The
data presented here demonstrate that HIV superinfection can
be detected in an accurate and sensitive manner, in a high-
throughput environment, and suggest that future studies ex-
amining HIV superinfection rates in large cohorts should uti-
lize these types of deep sequencing techniques. The ability to
rapidly determine the nature and extent of HIV superinfection
could have a profound influence on studies of HIV disease,
therapeutic interventions, transmission of potential drug resis-
tance, and viral evolution in the population.
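
As a rough illustration of the detection-limit reasoning above, the following Python sketch computes the chance that a minority variant contributes at least a given number of reads under a simple binomial sampling model. It is not the analysis pipeline used in this study. The function name, the example read depths, and the assumption that reads are independent draws unaffected by primer bias, PCR resampling, or sequencing error are all illustrative assumptions. The default threshold of 10 reads loosely echoes the 10-identical-read cutoff given in the Fig. 5 legend.

```python
from math import comb


def prob_at_least_k(n_reads: int, freq: float, k: int = 10) -> float:
    """Return P(X >= k) for X ~ Binomial(n_reads, freq).

    n_reads: total reads covering the amplicon (hypothetical depth).
    freq:    true fraction of the minority variant in the sample.
    k:       minimum number of variant reads required to call it present.
    """
    if n_reads < 0 or k < 0:
        raise ValueError("n_reads and k must be non-negative")
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must lie in [0, 1]")
    # Sum the probabilities of seeing fewer than k variant reads.
    # Suitable for small k; very large k could overflow the float conversion.
    p_below = sum(
        comb(n_reads, i) * freq**i * (1.0 - freq) ** (n_reads - i)
        for i in range(min(k, n_reads + 1))
    )
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 1.0 - p_below)


if __name__ == "__main__":
    # Hypothetical read depths; variant frequencies of 0.1%, 1%, and 5%.
    for depth in (1_000, 5_000, 20_000):
        for freq in (0.001, 0.01, 0.05):
            p = prob_at_least_k(depth, freq)
            print(f"depth={depth:>6}  freq={freq:.3f}  P(>=10 reads)={p:.3f}")
```

Under these assumptions, the probability of seeing a variant depends on both its frequency and the read depth, so a variant well below 1% can go unobserved at modest depth while being readily seen at greater depth. Real detection limits, such as those from the mixture analysis, also reflect amplification and sequencing effects that this model ignores.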

ACKNOWLEDGMENTS

We thank all the participants of the Rakai cohort and the staff of the
Rakai Health Sciences Program. We especially thank Susanna Lamers
for assistance with sequence submission.
All subjects provided written informed consent for their samples to
be stored and used for future HIV-related research. The study was
approved by the Science and Ethics Committee of the Uganda Virus
Research Institute, the Western Institutional Review Board, and the
Committee on Human Research at Johns Hopkins Bloomberg School
of Public Health. There are no conflicts of interest for any of the study
authors. The funders had no role in study design, data collection and
analysis, decision to publish, or preparation of the manuscript.
This study was supported in part by funding from the Division of
Intramural Research, NIAID, NIH, NIAID grants R01 AI34826 and
R01 AI34265, NICHD grant 5P30HD06826, the World Bank STI
Project, Uganda, the Henry M. Jackson Foundation, the Fogarty
Foundation (grant 5D43TW00010), and the Bill and Melinda Gates
Institute for Population and Reproductive Health at JHU.

REFERENCES

1. Altfeld, M., et al. 2002. HIV-1 superinfection despite broad CD8+ T-cell
responses containing replication of the primary virus. Nature 420:434–439.
2. Braibant, M., et al. 2010. Disease progression due to dual infection in an
HLA-B57-positive asymptomatic long-term nonprogressor infected with a
nef-defective HIV-1 strain. Virology 405:81–92.
3. Chohan, B., L. Lavreys, S. M. Rainwater, and J. Overbaugh. 2005. Evidence
for frequent reinfection with human immunodeficiency virus type 1 of a
different subtype. J. Virol. 79:10701–10708.
4. Collinson-Streng, A. N., et al. 2009. Geographic HIV type 1 subtype distri-
bution in Rakai District, Uganda. AIDS Res. Hum. Retroviruses 25:1045–
5. Conroy, S. A., et al. 2010. Changes in the distribution of HIV-1 subtypes D
and A in Rakai District, Uganda between 1994 and 2002. AIDS Res. Hum.
6. Gonzales, M. J., et al. 2003. Lack of detectable human immunodeficiency
virus type 1 superinfection during 1072 person-years of observation. J. Infect.
7. Gunthard, H. F., et al. 2009. HIV-1 superinfection in an HIV-2-infected
woman with subsequent control of HIV-1 plasma viremia. Clin. Infect. Dis.
8. Herbinger, K. H., et al. 2006. Frequency of HIV type 1 dual infection and
HIV diversity: analysis of low- and high-risk populations in Mbeya Region,
Tanzania. AIDS Res. Hum. Retroviruses 22:599–606.
9. Jost, S., et al. 2002. A patient with HIV-1 superinfection. N. Engl. J. Med.
10. Kiwanuka, N., et al. 2009. HIV-1 subtypes and differences in heterosexual
transmission of HIV among HIV-1 discordant couples in Rakai, Uganda.
11. McCutchan, F. E., et al. 2005. In-depth analysis of a heterosexually acquired
human immunodeficiency virus type 1 superinfection: evolution, temporal
fluctuation, and intercompartment dynamics from the seronegative window
period through 30 months postinfection. J. Virol. 79:11693–11704.
12. Pacold, M., et al. 2010. Comparison of methods to detect HIV dual infection.
AIDS Res. Hum. Retroviruses 26:1291–1296.
13. Piantadosi, A., B. Chohan, V. Chohan, R. S. McClelland, and J. Overbaugh.
2007. Chronic HIV-1 infection frequently fails to protect against superinfec-
tion. PLoS Pathog. 3:e177.
14. Piantadosi, A., M. O. Ngayo, B. Chohan, and J. Overbaugh. 2008. Exami-
nation of a second region of the HIV type 1 genome reveals additional cases
of superinfection. AIDS Res. Hum. Retroviruses 24:1221.
15. Rachinger, A., et al. 2010. Absence of HIV-1 superinfection 1 year after
infection between 1985 and 1997 coincides with a reduction in sexual risk
behavior in the seroincident Amsterdam cohort of homosexual men. Clin.
Infect. Dis. 50:1309–1315.
16. Rachinger, A., T. D. van de Ven, J. A. Burger, H. Schuitemaker, and A. B.
van’t Wout. 2010. Evaluation of pre-screening methods for the identification
of HIV-1 superinfection. J. Virol. Methods 165:311–317.
17. Ramos, A., et al. 2002. Intersubtype human immunodeficiency virus type 1
superinfection following seroconversion to primary infection in two injection
drug users. J. Virol. 76:7444–7452.
17a. Roche. 2009. emPCR method manual—Lib-L-MV. Roche, Branford, CT.
18. Rozera, G., et al. 2009. Massively parallel pyrosequencing highlights minority
variants in the HIV-1 env quasispecies deriving from lymphomonocyte sub-
populations. Retrovirology 6:15.
19. Saitou, N., and M. Nei. 1987. The neighbor-joining method: a new method
for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406–425.
20. Smith, D. M., et al. 2006. Lack of neutralizing antibody response to HIV-1
predisposes to superinfection. Virology 355:1–5.
21. Smith, D. M., et al. 2004. Incidence of HIV superinfection following primary
infection. JAMA 292:1177–1178.
22. Smith, D. M., et al. 2005. HIV drug resistance acquired through superinfec-
tion. AIDS 19:1251–1256.
23. Streeck, H., et al. 2008. Immune-driven recombination and loss of control
after HIV superinfection. J. Exp. Med. 205:1789–1796.
24. Taylor, J. E., and B. T. Korber. 2005. HIV-1 intra-subtype superinfection
rates: estimates using a structured coalescent with recombination. Infect.
Genet. Evol. 5:85–95.
25. Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL W:
improving the sensitivity of progressive multiple sequence alignment through
sequence weighting, position-specific gap penalties and weight matrix choice.
Nucleic Acids Res. 22:4673–4680.
26. van der Kuyl, A. C., et al. 2010. Analysis of infectious virus clones from two
HIV-1 superinfection cases suggests that the primary strains have lower
fitness. Retrovirology 7:60.
27. Wawer, M. J., et al. 1998. A randomized, community trial of intensive
sexually transmitted disease control for AIDS prevention, Rakai, Uganda.
28. Willerth, S. M., et al. 2010. Development of a low bias method for charac-
terizing viral populations using next generation sequencing technology. PLoS
29. Yang, O. O., et al. 2005. Human immunodeficiency virus type 1 clade B
superinfection: evidence for differential immune containment of distinct
clade B strains. J. Virol. 79:860–868.
30. Zagordi, O., R. Klein, M. Daumer, and N. Beerenwinkel. 2010. Error cor-
rection of next-generation sequencing data and reliable estimation of HIV
quasispecies. Nucleic Acids Res. 38:7400–7409.