
Variation in HIV-1 set-point viral load: Epidemiological

analysis and an evolutionary hypothesis

Christophe Fraser†‡, T. Déirdre Hollingsworth†, Ruth Chapman†§, Frank de Wolf†¶, and William P. Hanage†

†Department of Infectious Disease Epidemiology, Faculty of Medicine, Imperial College London, London W2 1PG, United Kingdom; §Infectious Diseases

Epidemiology Unit, London School of Hygiene and Tropical Medicine, London WC1E 7HT, United Kingdom; and ¶HIV Monitoring Foundation, Academic

Medical Centre of the University of Amsterdam, 1105 AZ, Amsterdam, The Netherlands

Communicated by Stanley Falkow, Stanford University, Stanford, CA, September 10, 2007 (received for review February 9, 2007)

The natural course of HIV-1 infection is characterized by a high

degree of heterogeneity in viral load, not just within patients over

time, but also between patients, especially during the asymptom-

atic stage of infection. Asymptomatic, or set-point, viral load has

been shown to correlate with both decreased time to AIDS and

increased infectiousness. The aim of this study is to characterize the

epidemiological impact of heterogeneity in set-point viral load. By

analyzing two cohorts of untreated patients, we quantify the

relationships between both viral load and infectiousness and the

duration of the asymptomatic infectious period. Because both the duration of infection and infectiousness determine the opportunities for the virus to be transmitted, these relationships suggest a trade-off between these contributions to the overall

transmission potential. Some public health implications of varia-

tion in set-point viral load are discussed. We observe that set-point

viral loads are clustered around those that maximize the transmis-

sion potential, and this leads us to hypothesize that HIV-1 could

have evolved to optimize its transmissibility, a form of adaptation

to the human host population. We discuss how this evolutionary

hypothesis can be tested, review the evidence available to date,

and highlight directions for future research.

cohort studies | life-history | mathematical model | trade-off | virulence

Viral load, a measure of the density of virus particles in peripheral blood, is an imperfect but important measure of the severity of HIV-1 infection. Although its relationship to virus density in other body compartments and to viral replicative capacity is unclear (1–3), it has a proven track record in the prognosis of patients (4, 5) and has more recently been shown to predict the probability of transmission between discordant couples (6, 7). Viral load is heterogeneous both within patients over time and between patients. During the long asymptomatic period of infection, viral loads fluctuate around a steady set-point value, which varies up to 1,000-fold between patients (4, 5).

Although much work has considered the significance of primary infection in HIV-1 transmission (e.g., refs. 8 and 9), it is not known which set-point viral loads have the greatest epidemiological impact, in terms of leading to the greatest number of infections over the lifetime of the host. Such information is important to determine the public health consequences of targeting prevention efforts at patients with certain subsets of viral loads: high, low, or intermediate. Although it is clear that patients with higher viral loads will be more infectious (6, 7), it is also well known that, untreated, these individuals have a poorer prognosis and, hence, will have fewer lifetime opportunities for transmission (4, 5). The epidemiological impact of different viral loads will hence be determined by the interplay between these two antagonistic processes.

The aim of this study is to use available data on infectiousness (6, 7, 9) and duration of infection (5, 10) to determine, in general terms, the epidemiological impact of cross-sectional variation in viral load. We estimate the product of infectiousness and duration of infection, which we term the transmission potential. This is the mean number of persons one index case can potentially infect over their whole asymptomatic period, estimated as a

function of set-point viral load.

We quantify the transmission potential as a function of

set-point viral load and find that it is maximized for intermediate

viral loads, which we observe are also the most common among

untreated patients. Although individuals with high viral loads

are the most infectious in the short term, the total contribution

to infection of those with intermediate viral loads is found to be

larger because of the longer duration of asymptomatic infection.

The consequences for public health and an evolutionary hypoth-

esis arising from this observation are discussed.

Results

Set-Point Viral Load and the Duration of Asymptomatic Infection.

HIV-1 viral load follows a characteristic U-shape during the

course of untreated infection, highest at the start (primary stage)

and end (late stage) of infection, whereas lower and relatively

steady levels are maintained for a variable number of years

during asymptomatic infection. Levels measured in peripheral

blood during the asymptomatic period are very variable, ranging

from 1,000 to 1 million viral copies per milliliter, and this

quantity is positively correlated with viral levels in other body

compartments (1–5). We aim to study its relation to the duration

of infection using a flexible parametric model in a robust

inference framework.

To provide good quantitative detail, we focus on the Amster-

dam seroconverters cohort, where homosexual men were re-

cruited prospectively to study the incidence and natural history

of HIV-1 infection from 1982 onwards and followed for many

years; the cohort has been described elsewhere (5). We censor

all observations after 22 November 1993, the date when the first

protease inhibitor was used in this cohort, to avoid biases caused

by the availability of effective treatment and ignore all treatment

effects before this date; our sample size is 123 men, followed for

504 person-years. We determine set-point viral loads as the

geometric mean viral load in the interval between the end of

primary infection (defined as 6 months after first seropositive

sample) and the first AIDS-defining event (CDC type C) or

censoring, whichever occurs first. The distribution of these

set-points is plotted in Fig. 1, as well as the distribution of viral

loads collected from a Zambian cohort (7), referred to in more

detail below.
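The set-point definition above can be sketched in a few lines of Python. This is a minimal illustration, not the cohort's actual processing code: the function name `set_point_log10`, the approximation of 6 months as 183 days, and the patient record are all assumptions made for the example.

```python
import math
from datetime import date, timedelta

def set_point_log10(measurements, first_seropositive, end_of_follow_up):
    """Return the set-point as mean log10 viral load (geometric mean on the raw scale).

    measurements: list of (sample_date, copies_per_ml) pairs.
    end_of_follow_up: date of the first AIDS-defining event or of censoring,
    whichever occurs first.
    """
    # "6 months" is approximated as 183 days (an assumption of this sketch).
    start = first_seropositive + timedelta(days=183)
    logs = [math.log10(v) for d, v in measurements
            if start <= d <= end_of_follow_up and v > 0]
    return sum(logs) / len(logs) if logs else None

# Hypothetical patient, invented for illustration.
samples = [
    (date(1990, 1, 15), 250_000),  # within primary infection: excluded
    (date(1990, 9, 1), 10_000),
    (date(1991, 3, 1), 100_000),
    (date(1994, 1, 1), 5_000),     # after the censoring date: excluded
]
sp = set_point_log10(samples, date(1990, 1, 1), date(1993, 11, 22))
print(sp, 10 ** sp)
```

For this invented record only the two middle samples fall in the asymptomatic window, giving a set-point of 4.5 log10 copies per milliliter (about 31,600 copies). Averaging on the log scale is what makes this a geometric rather than arithmetic mean.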

Author contributions: C.F., F.d.W., and W.P.H. designed research; C.F., T.D.H., and R.C. performed research; C.F. contributed new reagents/analytic tools; C.F., T.D.H., and R.C. analyzed data; and C.F., T.D.H., R.C., and W.P.H. wrote the paper.

The authors declare no conflict of interest.

Freely available online through the PNAS open access option.

‡To whom correspondence should be addressed at: Department of Infectious Disease Epidemiology, Faculty of Medicine, Imperial College London, St Mary’s Campus, Norfolk Place, London W2 1PG, United Kingdom. E-mail: c.fraser@imperial.ac.uk.

This article contains supporting information online at www.pnas.org/cgi/content/full/0708559104/DC1.

© 2007 by The National Academy of Sciences of the USA

www.pnas.org/cgi/doi/10.1073/pnas.0708559104 | PNAS | October 30, 2007 | vol. 104 | no. 44 | 17441–17446 | EVOLUTION

We describe the duration of the asymptomatic stage of infection by a flexible parametric model described in Methods.

The model describes the change in the mean duration as a

function of viral load and also allows for variability in the

duration, given a value of the set-point viral load. The best-fit

model is shown in Fig. 2. This demonstrates a pattern of decline

in the duration of asymptomatic infection with increasing viral

load with, as expected, more uncertainty in the estimates for

atypically high or low viral loads. Estimates of the mean duration

of asymptomatic infection range from 15.6 years [95% confi-

dence interval (c.i.), 9.4–31.3] for a set-point viral load of 1,000

viral copies per milliliter, through 9.7 years (95% c.i., 7.7–12.9)

for 10,000 copies, 4.9 years (95% c.i., 4.1–6) for 100,000 copies,

to 2.1 years (95% c.i., 0.9–3.7) for 1 million copies. Because these

estimates rely on the use of parametric forms to extrapolate to

extreme viral loads, we tested the use of a very general survival

function, the generalized gamma distribution, but this did not fit

the data significantly better (P = 0.95). We also considered allowing for the possibility that the mean duration could plateau to a low nonzero value at high viral loads, but this also failed to significantly improve the fit (P = 0.18) [see Methods and Methods

in supporting information (SI) Text].

Infectiousness and Viral Load. Several studies have empirically

estimated the rates of HIV transmission in stable heterosexual

partnerships as a function of HIV-1 load (6, 7). Because these

studies have focused on demonstrating the significance of the

association rather than the functional relationship, we reana-

lyzed the data from the Zambian study (7), using a flexible

parametric function to describe the dependence of the annual

transmission rate within a partnership on viral load. The best-fit

model, along with 95% confidence intervals, is plotted in Fig. 3A.

Notably, we find that the transmission rate tends to reach a

plateau at high viral loads. To confirm that this was not an

artifact of our parametric assumptions, we plot the directly

inferred transmission rate for eight groups of subjects classed in

increasing octiles of viral load in Fig. 3B, where it is clearly seen

that there is no trend for increasing transmission between the top five octiles. Estimates of the annualized transmission rate within a stable long-term discordant partnership range from 0.02 year⁻¹ (95% c.i., 0.001–0.084) for a set-point viral load of 1,000 viral copies per milliliter, through 0.132 year⁻¹ (95% c.i., 0.08–0.223) for 10,000 copies, 0.279 year⁻¹ (95% c.i., 0.223–0.343) for 100,000 copies, to 0.313 year⁻¹ (95% c.i., 0.233–0.471) for 1

million copies. Fig. 3A also shows the transmission rate inferred from a cohort of HIV serodiscordant subjects in Rakai, Uganda (6), which is consistent with these estimates. Both these studies (6, 7) involved a degree of counseling to reduce unprotected sex, so transmission rates within uncounseled partnerships could be somewhat higher, although the dependence on viral load would be similar. These two studies of transmission also recorded very different rates of unprotected sex despite observing similar transmission rates, an observation that motivated our choice of focusing on the transmission rate per unit time rather than the more conventional but apparently less reliable choice of reporting the transmission probability per unprotected sex act.

Fig. 1. The distribution of set-point viral loads. The distribution of viral loads (copies per milliliter of peripheral blood) is plotted for untreated individuals in the Amsterdam Seroconverters Cohort (black bars) and the Zambian Transmission Study (7) (gray bars). The bars represent bins 0.5 log10 wide and are labeled by their midpoint viral load.

Fig. 2. Set-point viral load and duration of asymptomatic infection. The mean duration, in years, of the asymptomatic stage of infection is estimated as a function of viral load. (A) Best-fit and 95% confidence interval estimates are shown. (B) To ascertain the goodness of fit, Kaplan–Meier survival plots for the asymptomatic period are shown for patients grouped into quartiles of viral load (jagged lines, with crosses showing censored patients), along with model predictions (smooth lines).

Fig. 3. Set-point viral load and infectiousness. The transmission rate per year within a stable discordant partnership is estimated as a function of viral load. (A) Best-fit and 95% confidence interval estimates of the transmission rate based on a parametric model fitted to the data of ref. 7. The data from the Rakai study (6) are shown for comparison (dashed line). (B) We also plot the transmission rate as a function of the geometric mean viral load for subjects grouped into ascending octiles of viral load for these data. This shows strong evidence for saturation of the transmission rate at high viral loads.

Transmission Potential. One way of summarizing the epidemio-

logical contribution of individuals with different set-point viral

loads is to estimate the expected number of people infected over

their entire infectious lifespan. Many factors can affect this,

including host behavior, coinfections, and the state of the epidemic

itself, because opportunities for transmission are reduced when

prevalence is already high. We define a quantity, which we call

the transmission potential, as the average number of people

potentially infected over the duration of the whole infectious

period, in circumstances where most people are uninfected, for

an infected individual with a particular viral load; where the rate

of partner change is sufficiently high that it does not limit

transmission, and where the transmission rate within partner-

ships is similar to that reported by the two cohorts studied here.

We relax the second, simplifying but not crucial assumption

regarding the partner change rate in Methods in SI Text. We

average over all cofactors affecting transmission apart from

set-point viral load. The transmission potential relates to the

better known basic reproduction number R0 discussed below,

which is obtained by specifying a sexual mixing model and

averaging over the distribution of set-point viral loads.

A particular concern at this stage is that we have analyzed data

on the duration of asymptomatic infection as a function of viral

load from a population of Dutch homosexual men infected with

HIV-1 subtype B, whereas we have considered data on infec-

tiousness from two populations of Zambian and Ugandan het-

erosexual individuals infected with mixed subtypes of virus.

However, there is some evidence that the relation between viral

load and duration of asymptomatic infection is relatively inde-

pendent of subtype, population, or setting. One study of sero-

converting women in Uganda found that the time to AIDS

(WHO stage 4) and survival was similar to that seen in developed

country cohorts (11), whereas the prevalence of general symp-

toms that may or may not be attributable to HIV infection

(defined as WHO stage 2 and 3 events) is much higher (12). In

SI Fig. 6, we show a direct side-by-side comparison of survival

rates between the Amsterdam seroconverters analyzed here and

a cohort of untreated female commercial sex workers in Nairobi,

Kenya, followed since seroconversion (10); these show similar

survival rates for individuals in similar viral load classes. In

situations where this equivalence between populations and set-

tings holds, our method provides a good estimate of the trans-

mission potential for untreated infection in a heterosexual

African population (the dominant infected population, globally).

The contribution of the asymptomatic stage of

infection to the transmission potential depends on set-point viral

load and is the product of the transmission rate and the duration

of infection, plotted in Fig. 4. At low viral loads, the transmission

potential is limited by low infectiousness, whereas at high viral

loads where infectiousness is maximized, the transmission po-

tential is limited by the short duration of the infectious period.

Key to this shape is the observation that infectiousness does not

appear to increase rapidly for viral loads over ≈100,000 copies

per milliliter. The inference is robust to variation in the choice

of parametric model used for estimation (SI Fig. 7).
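As a rough check on the shape described above, the transmission potential can be computed directly from the point estimates quoted earlier in Results. The sketch below simply multiplies the reported annual transmission rates by the reported mean durations at four viral loads; it ignores the confidence intervals and is not the continuous curve of Fig. 4.

```python
# Point estimates reported in Results, keyed by set-point viral load (copies/ml).
duration_years = {1e3: 15.6, 1e4: 9.7, 1e5: 4.9, 1e6: 2.1}
rate_per_year = {1e3: 0.02, 1e4: 0.132, 1e5: 0.279, 1e6: 0.313}

# Transmission potential = transmission rate x mean asymptomatic duration.
transmission_potential = {
    v: rate_per_year[v] * duration_years[v] for v in duration_years
}

for v, tp in transmission_potential.items():
    print(f"{v:>9,.0f} copies/ml: {tp:.2f}")

# Viral load (among the four tabulated) with the largest transmission potential.
best = max(transmission_potential, key=transmission_potential.get)
```

The products are roughly 0.3, 1.3, 1.4, and 0.7 for 1,000, 10,000, 100,000, and 1 million copies per milliliter, so both extremes fall well below the two intermediate values.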

The periods of highest viral load are found at the start of

infection, during a brief period of uncontrolled replication

before the host immune system gains temporary control and,

later, during the development of AIDS. These are also the

periods at which the host will be most infectious (9). However,

these periods are very short, and, within them, viral loads differ

little between hosts, whereas there is great heterogeneity in

set-point viral load. Therefore whatever the contribution of

primary and end-stage infection to transmission potential (explored in detail in T.D.H., R. M. Anderson, and C.F., unpublished work), it is a baseline on top of which the major differences

in transmission potential among individual hosts are determined

by set-point viral load. The transmission potential for these

stages is 0.67 (95% c.i., 0.32–1.23) for primary infection and 0.50 (95% c.i., 0.31–0.96) for pre-AIDS/AIDS. However, the assumption that partner change is frequent enough for it not to be a

limiting factor is not likely to remain valid during these short

periods of high infectiousness, and thus the transmission poten-

tial of these stages is less likely to be realized than the trans-

mission potential of asymptomatic infection. Estimates for a

variety of parameters in a simple ‘‘serial monogamy’’ scenario

are explored in SI Figs. 9A and 10A.

Discussion

Strengths and Frailties of the Transmission Potential Analysis. We

analyzed large, well studied cohorts using robust statistical

methods. The models used for inference are sufficiently flexible

that analyses are unlikely to be too dependent on the precise

parametric forms chosen (see Methods in SI Text for some

sensitivity analysis). The main assumption, that both infectious-

ness and duration of infection are, respectively, increasing and

decreasing monotonic functions of viral load, is well supported

and biologically plausible.

Fig. 4. Transmission potential. The transmission potential is defined as the expected number of people one case could infect over the whole course of asymptomatic infection, based on random contacts with susceptible individuals. It is the product of the transmission rate and the mean duration of the asymptomatic period (Figs. 2A and 3B) and is shown plotted with 95% confidence intervals as a function of viral load.

A significant limitation of our analysis that could be addressed in future work is that we had only data available from different sources for estimating the different parameters. We noted that the relation between the duration of asymptomatic infection and set-point viral load is similar in some different settings (10–12), but this may not be universal. For example, one study reported rapid disease progression in subtype D infections not explained by higher than expected viral loads.‖ A further concern is the possible covariance between duration of infection and infectiousness, which would imply that the transmission potential, defined as the average of the product of duration and infectiousness, might not be well approximated by the product of the averages estimated here. Such a situation could arise because of joint dependence on cofactors other than viral load. Dependence on factors that independently affect survival or infectiousness, such as the dependence of infectiousness on sexual risk behavior, would not be problematic. Resolution of these concerns could be addressed by direct estimation of the transmission potential within a single patient cohort.

‖Laeyendecker, O., Li, X., Arroyo, M., McCutchan, F., Gray, R., Wawer, M., Serwadda, D., Nalugoda, F., Kigozi, G., Quinn, T., et al. (2006) The Effect of HIV Subtype on Rapid Disease Progression in Rakai, Uganda, Abstract 44LB, 13th Conference on Retroviruses and Opportunistic Infections, February 5–8, 2006, Denver, CO, www.retroconference.org/2006, accessed April 27, 2007.
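The covariance concern can be stated as a standard identity. Writing β for infectiousness and D for the duration of asymptomatic infection among individuals who share a given set-point viral load (notation assumed here for illustration), the average of the product differs from the product of the averages by exactly their covariance:

```latex
\mathbb{E}[\beta D] \;=\; \mathbb{E}[\beta]\,\mathbb{E}[D] \;+\; \operatorname{Cov}(\beta, D)
```

The product of separately estimated means therefore equals the transmission potential only when the covariance term is zero at each viral load. A cofactor that raises infectiousness while shortening survival would make the covariance negative, and the product of means would then overstate the transmission potential.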

Consequences of Variation in Set-Point Viral Load for Public Health.

Comparison of Figs. 1 and 4 shows that individuals with com-

mon, intermediate viral loads have the largest transmission

potential. For current public health initiatives based on the mass

deployment of antiretroviral therapy, we suggest that attempting to maximize indirect population benefits by singling out those with the highest viral loads for treatment would be misguided (a

strategy explored but not advocated in ref. 13), because it is

actually the majority of patients with intermediate viral loads

who ultimately cause the most infections. Although indirect

population benefits of mass therapy are possible and desirable,

treatment protocols in areas of limited resource should use other

inclusion criteria, such as clinical need, likelihood of treatment

adherence, or sexual behavior.

As for future mass interventions based on, for example,

imperfect vaccines, immunotherapy, or microbicides, this frame-

work offers a simple tool for predicting so-called ‘‘perverse

outcomes’’ (14, 15). If the intervention reduces patients’ viral

load in such a way as to increase their transmission potential on

average, then incidence will increase, not decrease. An inter-

vention that reduces viral loads from high to intermediate levels

and is therefore beneficial to the individual may nevertheless

increase overall incidence and thus cause more overall harm than

benefit. The ultimate outcome of any intervention that changes

the distribution of viral loads can be predicted by calculating the

change in the mean transmission potential. These conclusions

are based on the epidemiological analysis of variation in set-

point viral load and are independent of the evolutionary discus-

sion that follows.
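The calculation of a change in mean transmission potential can be illustrated with a small sketch. The transmission potentials below are products of the point estimates quoted in Results; the population shares and both interventions are hypothetical and were invented purely to show how a "perverse outcome" would appear in this framework.

```python
# Transmission potential at four set-point viral loads (keyed by log10 copies/ml),
# from the Results point estimates: annual transmission rate x mean duration.
tp = {3: 0.02 * 15.6, 4: 0.132 * 9.7, 5: 0.279 * 4.9, 6: 0.313 * 2.1}

def mean_tp(distribution):
    """Mean transmission potential for a {log10 viral load: population share} mapping."""
    total = sum(distribution.values())
    return sum(share * tp[v] for v, share in distribution.items()) / total

# Hypothetical population shares, NOT taken from either cohort.
before = {3: 0.1, 4: 0.4, 5: 0.4, 6: 0.1}
# Hypothetical intervention A: everyone at 10^6 is brought down to 10^5.
after_a = {3: 0.1, 4: 0.4, 5: 0.5, 6: 0.0}
# Hypothetical intervention B: every viral load is lowered by one log10 (floor at 10^3).
after_b = {3: 0.5, 4: 0.4, 5: 0.1, 6: 0.0}

for label, dist in (("before", before), ("A", after_a), ("B", after_b)):
    print(f"{label:>6}: mean transmission potential = {mean_tp(dist):.3f}")
```

Under these assumed shares, intervention A raises the mean transmission potential even though it lowers every treated individual's viral load, whereas the larger reduction in intervention B lowers it.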

A Hypothesis: The Evolution of HIV-1 Virulence. The viral load that

maximizes the transmission potential is 4.52 log10 copies per

milliliter (Fig. 4), close to the observed means of 4.36 and 4.74

for the Dutch and Zambian cohorts, respectively (Fig. 1). Viral

loads during the asymptomatic period are clustered around

values that maximize the transmission potential of the virus. Is

it possible that this is not coincidence but, rather, an outcome of

natural selection acting on HIV-1 to maximize opportunities for

onwards transmission? This would suggest that HIV-1 conforms

to the classical adaptive virulence model: Seen from the per-

spective of the virus, a negative correlation between infectious-

ness and duration of infection could be interpreted as a trade-off

between two viral life-history traits, with natural selection

leading to an optimal balance in this trade-off (16, 17).

This adaptive virulence model for HIV-1 results in a number of

clear predictions that could be regarded as tests of the hypothesis.

First, for the hypothesis to be true, the observed distribution of viral

loads needs to be consistent with an evolutionary interpretation of

the life-history tradeoff in the transmission potential. Our analysis

supports this. Second, the hypothesis predicts that set-point viral

loads in transmitter and recipient will be correlated. If a trait is

heritable, the conclusion that natural selection can act on it follows

automatically. Conversely, if this is not the case, it is impossible for

natural selection to act on a trait, no matter what its relationship to

fitness might be.

No studies have satisfactorily addressed the question of her-

itability in viral load to date. What evidence there is, direct and

indirect in terms of other indicators of heritability in viral traits,

is reviewed in Discussion in SI Text. We leave this as an open,

testable prediction generated by our study.

Although there is no consensus on the dominant mechanisms

of HIV pathogenesis, attention has shifted from within-host viral

evolution (e.g., ref. 18) to pathological host immune activation

(e.g., refs. 19 and 20). Viral load has been implicated as a

measure of viral replication, which itself may regulate the rate of

cell destruction, and mathematical models can capture the

relation between set-point load and duration of the incubation

period of AIDS (e.g., ref. 21). The adaptive-virulence model

does not negate the role of host immune or environmental

interactions in driving progression to disease but, rather, predicts

that viral genetic factors modulate this progression, leading to

marginally more or less severe distributions of outcomes in

populations.

More specifically, the model predicts that the viral population

will eventually become dominated by the ‘‘strain’’ with the

largest basic reproduction number R0, defined as the number of

individuals infected by a typical index case in a totally susceptible

population (16, 17). However, in an emerging epidemic, some

strains could initially spread faster before being replaced by

others with higher R0 [so-called r-selection (22)]. We attempt to

disentangle these processes as follows.

Consider a transmission model with multiple hypothetical

viral strains. Each strain is characterized by a distribution of

set-point viral loads (representing the effects of host, environ-

mental and chance variability), but some strains have an inher-

ited tendency to produce slightly higher or lower viral loads, on

average. We use an ‘‘age-of-infection’’ framework (23) to cal-

culate the basic reproduction number, R0, and the initial expo-

nential rate of spread, r0, of strains characterized by their mean

set-point viral load. The details of the calculation, of assumed

sexual mixing, and sensitivity to the parameters are presented in

Methods in SI Text. R0 is maximized for a mean set-point viral load of 4.34 log10 copies per milliliter, whereas the exponential growth rate r0 is maximized for a mean set-point viral load of 4.83 log10 copies per milliliter. These predicted “optimal” values are

close to the mean viral loads observed in the cohorts (Fig. 5).
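Why the growth rate should peak at a higher viral load than the reproduction number can be seen in a deliberately simplified sketch. It is not the calculation of SI Text: it considers the asymptomatic stage only, gives every host of a strain the same set-point, and ignores sexual mixing, so its optima need not match the values quoted above. The duration parameters are the best-fit values from Methods; the infectiousness parameters are assumed values, not given in this section, picked to roughly match the rates quoted in Results. With a constant transmission rate β and a gamma-distributed duration (mean D, shape ρ), the Euler–Lotka equation reduces to β[1 − (1 + rD/ρ)^(−ρ)]/r = 1.

```python
import math

# Duration model: best-fit values from Methods.
D_MAX, D_50, D_K, RHO = 25.4, 3058.0, 0.41, 3.46
# Infectiousness model: ASSUMED values for illustration (not stated in this section).
B_MAX, B_50, B_K = 0.317, 13938.0, 1.02

def duration(v):
    """Mean asymptomatic duration in years (decreasing Hill function)."""
    return D_MAX * D_50**D_K / (v**D_K + D_50**D_K)

def beta(v):
    """Annual transmission rate (increasing Hill function)."""
    return B_MAX * v**B_K / (v**B_K + B_50**B_K)

def r0_number(v):
    """Reproduction number for the asymptomatic stage alone: rate x mean duration."""
    return beta(v) * duration(v)

def growth_rate(v):
    """Positive root r of beta * (1 - (1 + r*D/rho)**-rho) = r, found by bisection.

    Returns 0.0 when the reproduction number is <= 1 (no positive root exists).
    """
    b, d = beta(v), duration(v)
    if b * d <= 1.0:
        return 0.0
    f = lambda r: b * (1.0 - (1.0 + r * d / RHO) ** (-RHO)) - r
    lo, hi = 1e-9, b  # f > 0 just above zero when b*d > 1, and f(b) < 0 always
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

grid = [3.0 + 0.05 * i for i in range(61)]  # log10 viral loads from 3 to 6
best_r0 = max(grid, key=lambda x: r0_number(10**x))
best_r = max(grid, key=lambda x: growth_rate(10**x))
print(f"R0 peaks near {best_r0:.2f} log10, r peaks near {best_r:.2f} log10")
```

In this toy version the growth rate is maximized at a higher viral load than the reproduction number, because early transmissions count for more when an epidemic is growing exponentially. This is the same qualitative ordering as the values reported above.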

Because both the negative relation between viral load and duration of infection and the positive relation between viral load and infectiousness can be understood in simple biological terms, the tradeoff between these in producing the peaked transmission potential curve does not, per se, suggest viral adaptation. However, viral adaptation does provide a natural explanation for the relatively good agreement between the calculated optimal and observed distribution of viral loads.

Fig. 5. Optimal viral loads in a simple transmission model. The basic reproduction number R0 (solid line) and the initial epidemic exponential growth rate (dashed line) are estimated as functions of the mean set-point viral load of a hypothetical viral “strain.” The viral load values that maximize these quantities are shown as arrows, whereas the observed mean viral loads for the cohorts are shown as circles (open for Zambia and filled for Amsterdam Seroconverters).

An area requiring further development is the study of inte-

grated models for exploring multilevel selection for understand-

ing the differential roles of selection for viral replication at the

cell-cell level (within the host) and host-host level (involving

transmission). Although it may be thought that high viral

turnover and mutation rate would favor within-host adaptation

[so-called short-sighted evolution (24, 25)], these factors do not

seem, in practice, to lead to any erosion of infectiousness during

HIV-1 infection (9). There is also a need to explain the diversity

of virus–host patterns for lentiviruses, to identify the determi-

nants of virulence in related lentiviruses (such as HIV-2 and the

simian (SIV) ancestors of HIV-1 and HIV-2 in chimpanzees and

sooty mangabeys, respectively), and their relation to infectious-

ness and survival. A more detailed discussion of these challenges

is included in Discussion in SI Text.

Conclusions

To summarize, we have quantified the transmission potential of

HIV-1 as a function of set-point viral load and have found that

the most common set-point viral loads result in nearly optimal

transmissibility over the lifetime of the host. Crucial to these

analyses were the availability of good long-term longitudinal

data and the use of robust statistical methods to parameterize

the dependence of infectiousness and duration of infection on

viral load. The analyses should be repeated within a single cohort

and in different settings. We have hypothesized that this situa-

tion could have arisen because of adaptive evolution of HIV-1

to maximize transmission between humans, although the agree-

ment between observed viral loads and the maximum of the

transmission potential could also, of course, be an interesting

coincidence. The phenomenon of adaptive virulence, if verified,

would have practical consequences in terms of the potential for

public health interventions to impact on virulence (26). There

may be as yet unidentified viral genetic factors that modulate the

severity of infection.

We have explored this evolutionary hypothesis, developed

testable predictions, and highlighted conceptual challenges.

Testing for the existence of differences in viral load or virulence

between populations and testing whether viral load is a trait

heritable from one infection to the next are questions that could

be answered with simple study designs. More detailed predic-

tions and tests could be devised with dynamical epidemic models

of HIV evolution. A specific challenge is predicting the time

scale and outcome of natural selection acting in conflicting

directions for within- and between-host viral replication. The

identification of human genetic factors that determine the

severity of HIV-1 infection has caused much excitement, and, to date, human genes have been shown to account for ≈10% of

variability in disease progression rates (27). This leaves consid-

erable scope for identifying other sources of variation, of which

viral genetic factors have been underexplored.

Methods

The Amsterdam Seroconverters Cohort. Patients were recruited

from 11 January 1982 onwards and followed at quarterly inter-

vals thereafter (5). To minimize biases associated with the use of

treatment, we included only data collected before 22 November

1993, when protease inhibitors were first introduced, resulting in

a sample size n = 123.

Viral load was measured by quantitative PCR from frozen sera.

To avoid samples collected during primary infection, measurements

taken for the first 6 months after first seropositive sample were

excluded, as were viral load measurements taken after the first

AIDS-defining event (CDC type C). Set-point viral load was

determined as the geometric mean of these measurements.

Asymptomatic Duration as a Function of Viral Load. We start by

proposing the following decreasing Hill function for the duration

of the asymptomatic period D(V) as a function of the set-point

viral load V, such that $D(V) = D_{\max} D_{50}^{D_k} / (V^{D_k} + D_{50}^{D_k})$, where $D_{\max}$ is the maximum duration in years, $D_{50}$ is the viral load at which the duration is half its maximum, and $D_k$ is the

steepness of the decrease in duration as a function of viral load

(Hill coefficient).
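The decreasing Hill function is easy to evaluate directly. The sketch below uses the best-fit parameter values reported later in this section as defaults; the function name `duration_years` is chosen here for illustration.

```python
def duration_years(v, d_max=25.4, d_50=3058.0, d_k=0.41):
    """Mean asymptomatic duration D(V) in years for set-point viral load v (copies/ml).

    Defaults are the best-fit values reported in Methods.
    """
    return d_max * d_50**d_k / (v**d_k + d_50**d_k)

for v in (1e3, 1e4, 1e5, 1e6):
    print(f"{v:>9,.0f} copies/ml -> {duration_years(v):.1f} years")
```

The values at 1,000, 10,000, and 100,000 copies per milliliter agree with the estimates quoted in Results (15.6, 9.7, and 4.9 years). At 1 million copies the rounded parameters give a value slightly above the reported 2.1 years, presumably because the published parameters are themselves rounded.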

To estimate the full profile of durations, we proposed that the

probability a person is still asymptomatic at time T after primary

infection is given by a gamma distribution with mean D(V) and

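shape parameter $\rho$. The cumulative “survival” probability is $S(V, T) = \Gamma(\rho, \rho T / D(V)) / \Gamma(\rho)$, where $\Gamma(\cdot)$ denotes the standard gamma function and $\Gamma(a, x) = \int_x^\infty s^{a-1} e^{-s}\,ds$ the upper incomplete gamma function, so that $S$ falls from 1 at $T = 0$ toward 0 as $T$ increases.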

Our data consist of set-point viral loads $v_i$ for patients who progressed to AIDS or were censored after a time $t_i$, taken to start 6 months after the first positive test, to exclude primary infection. An indicator variable $I_i$ is defined such that $I_i = 0$ if the patient was censored and $I_i = 1$ if the patient developed AIDS. The log-likelihood for this survival analysis is $\sum_i [I_i \ln S(v_i, t_i) + (1 - I_i) \ln(-S'(v_i, t_i))]$, where $S'(V, T)$ is the probability density function corresponding to $S$.
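A gamma survival likelihood of this general kind can be sketched as follows. The sketch rests on several assumptions that do not come from the text. It uses the standard right-censoring convention, in which a patient who developed AIDS at time t contributes the gamma density at t and a censored patient contributes the probability of remaining asymptomatic beyond t. Survival is computed as one minus the regularized lower incomplete gamma function, evaluated with a simple power series rather than a library routine. All function names and the example records are hypothetical.

```python
import math

def reg_lower_gamma(a, x, terms=500):
    """Regularized lower incomplete gamma function P(a, x), by power series."""
    if x <= 0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, terms):
        term *= x / (a + n)
        total += term
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))

def duration(v, d_max, d_50, d_k):
    """Decreasing Hill function for the mean asymptomatic duration (years)."""
    return d_max * d_50 ** d_k / (v ** d_k + d_50 ** d_k)

def log_likelihood(data, d_max, d_50, d_k, shape):
    """data: iterable of (viral_load, time_years, developed_aids) records."""
    total = 0.0
    for v, t, developed_aids in data:
        rate = shape / duration(v, d_max, d_50, d_k)  # gamma mean = shape / rate
        x = rate * t
        if developed_aids:
            # Log of the gamma density at the observed time of AIDS
            total += (shape * math.log(rate) + (shape - 1.0) * math.log(t)
                      - x - math.lgamma(shape))
        else:
            # Log probability of still being asymptomatic at the censoring time
            survival = max(1.0 - reg_lower_gamma(shape, x), 1e-300)
            total += math.log(survival)
    return total

# Hypothetical records: (set-point viral load, years observed, developed AIDS?)
records = [(2.0e4, 6.5, True), (1.5e3, 9.0, False), (8.0e4, 3.2, True)]
print(log_likelihood(records, d_max=25.4, d_50=3058.0, d_k=0.41, shape=3.46))
```

Maximizing this function over the four parameters, for example with a general-purpose numerical optimizer, would give estimates of the kind reported in this section.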

The parameter values that maximize this likelihood are $D_{\max} = 25.4$ years, $D_{50} = 3{,}058$ copies per milliliter of peripheral blood, $D_k = 0.41$, and $\rho = 3.46$. Confidence intervals for the duration were estimated at a specific viral load, $V^*$ say, by treating $D(V^*)$ as a parameter, recasting $D_{\max}$ as a function of this, and using the likelihood ratio method to determine 95% confidence intervals for $D(V^*)$. The procedure was iterated for values of $V^*$ over a range, as shown in Fig. 2A. We consider the effect of using other parametric models in SI Fig. 7.

Infectiousness as a Function of Viral Load. Because of substantial inconsistencies in the reported frequency of unprotected sex acts (6, 7, 28) despite consistent seroconversion rates (Fig. 3A), we decided to formulate our model of infectiousness in terms of an infection hazard (probability per unit of time or rate) rather than as a probability per unprotected sex act. We thus introduced an increasing Hill function for infectiousness $\beta(V)$ as a function of viral load $V$, $\beta(V) = \beta_{\max} V^{\beta_k} / [V^{\beta_k} + \beta_{50}^{\beta_k}]$, where $\beta_{\max}$ is the maximum infection rate per annum, $\beta_{50}$ is the viral load at which infectiousness is half its maximum, and $\beta_k$ is the steepness of the increase in infectiousness as a function of viral load. The probability $p(T, V)$ that a person is infected after a time $T$ of exposure to an infected person is $p(T, V) = 1 - \exp(-\beta(V) T)$.
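The infection hazard and the resulting probability of infection can be sketched numerically as follows. This is a minimal illustration, not the authors' code. The function and argument names are hypothetical, and the parameter values are the maximum-likelihood estimates reported later in this section.

```python
import math

def infectiousness(v, rate_max, v_50, k):
    """Increasing Hill function: infection hazard per year at viral load v."""
    return rate_max * v ** k / (v ** k + v_50 ** k)

def infection_probability(t, v, rate_max, v_50, k):
    """Probability that a partner is infected after t years of exposure."""
    hazard = infectiousness(v, rate_max, v_50, k)
    # -expm1(-x) equals 1 - exp(-x) but is more accurate for small x
    return -math.expm1(-hazard * t)

# Estimates reported in this section: 0.317 per year, 13,938 copies per mL, 1.02
RATE_MAX, V_50, K = 0.317, 13938.0, 1.02

for v in (1.0e3, 13938.0, 1.0e6):
    print(v, infectiousness(v, RATE_MAX, V_50, K),
          infection_probability(1.0, v, RATE_MAX, V_50, K))
```

Working with a hazard rather than a per-act probability means that only the exposure time enters the calculation, so no assumption about the frequency of sex acts is needed.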

Our data (from the Zambian transmission study) consist of set-point viral loads $v_i$ for index cases in couples observed for a mean duration $\tau$, and an indicator variable $I_i$ defined such that $I_i = 0$ if the partner was not infected and $I_i = 1$ if the partner was infected. The log-likelihood is then $\sum_i [I_i \ln p(\tau, v_i) + (1 - I_i) \ln(1 - p(\tau, v_i))]$. Ideally we would have the duration of observation of each couple, but these data were not made available to us. We verified by simulation that this was unlikely to introduce systematic biases in our estimate.
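The binary-outcome log-likelihood described above can be sketched as follows. This is an illustration under stated assumptions. The function names, the example couples, and the observation time of 1 year are hypothetical and are not taken from the Zambian study. The parameter values are the estimates reported in this section.

```python
import math

def infectiousness(v, rate_max, v_50, k):
    """Increasing Hill function: infection hazard per year at viral load v."""
    return rate_max * v ** k / (v ** k + v_50 ** k)

def log_likelihood(couples, mean_years, rate_max, v_50, k):
    """couples: iterable of (index_viral_load, partner_infected) pairs."""
    total = 0.0
    for v, partner_infected in couples:
        exposure = infectiousness(v, rate_max, v_50, k) * mean_years
        if partner_infected:
            # ln p, with p = 1 - exp(-hazard * time)
            total += math.log(-math.expm1(-exposure))
        else:
            # ln(1 - p) simplifies to -hazard * time
            total -= exposure
    return total

# Hypothetical couples: (set-point viral load of index case, partner infected?)
couples = [(5.0e3, False), (4.0e4, True), (2.0e5, True), (1.2e4, False)]
print(log_likelihood(couples, mean_years=1.0,
                     rate_max=0.317, v_50=13938.0, k=1.02))
```

Because every couple is assigned the same mean observation time, the outcome for each couple depends on the data only through the viral load of the index case.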

The parameter values that maximize this likelihood are $\beta_{\max} = 0.317$ per year, $\beta_{50} = 13{,}938$ copies per milliliter of peripheral blood, and $\beta_k = 1.02$. Confidence intervals were estimated as for $D(V)$ above.

We also considered a more general formula allowing for a minimum infection rate $\beta_{\min}$, i.e., $\beta(V) = \beta_{\min} + (\beta_{\max} - \beta_{\min}) V^{\beta_k} / [V^{\beta_k} + \beta_{50}^{\beta_k}]$, but this did not improve the quality of fit (P = 0.5 by one-sided likelihood ratio; the best-fit value was $\beta_{\min} = 0$).

Fideli et al. (7) separate the Zambian data between male and female index cases, and we repeated our analysis allowing for separate parameters for male-to-female and female-to-male transmission. Although the best-fit curves looked different (with a higher Hill coefficient for female-to-male transmission), the model did not fit significantly better (P = 0.67 based on likelihood ratio).

Transmission Potential. The transmission potential $TP(V)$ is defined in the main text as the product of the infection rate during asymptomatic infection and the duration of asymptomatic infection, i.e., $TP(V) = \beta(V) D(V)$. Confidence intervals for the transmission potential were estimated as above.
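The trade-off between infectiousness and duration can be explored numerically with the sketch below, which multiplies the two fitted Hill functions and locates the viral load that maximizes their product. The parameter values are the maximum-likelihood estimates reported in this section. The function names and the grid search over log10 viral load are illustrative choices, not the authors' procedure.

```python
# Maximum-likelihood estimates reported in this section
D_MAX, D_50, D_K = 25.4, 3058.0, 0.41
RATE_MAX, V_50, K = 0.317, 13938.0, 1.02

def duration(v):
    """Mean asymptomatic duration in years (decreasing Hill function)."""
    return D_MAX * D_50 ** D_K / (v ** D_K + D_50 ** D_K)

def infectiousness(v):
    """Infection hazard per year (increasing Hill function)."""
    return RATE_MAX * v ** K / (v ** K + V_50 ** K)

def transmission_potential(v):
    """Product of infection rate and asymptomatic duration."""
    return infectiousness(v) * duration(v)

# Grid of log10 viral loads from 2.00 to 7.00 in steps of 0.01
log10_grid = [2.0 + 0.01 * i for i in range(501)]
peak_log10_v = max(log10_grid, key=lambda x: transmission_potential(10.0 ** x))

print("log10 viral load at the maximum:", round(peak_log10_v, 2))
print("transmission potential there:", transmission_potential(10.0 ** peak_log10_v))
```

The product has an interior maximum because infectiousness saturates at high viral loads while duration continues to fall, so raising the viral load beyond an intermediate value shortens infection more than it increases the hazard.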

We thank R. M. Anderson, N. M. Ferguson, N. C. Grassly, B. G. Spratt, and V. Müller for useful discussions. We gratefully acknowledge funding from the Wellcome Trust (R.C., T.D.H., F.d.W., and W.P.H.) and the Royal Society (C.F.) and the assistance provided by the Amsterdam Cohort Studies on HIV/AIDS.

1. Hockett R, Kilby J, Derdeyn C, Saag M, Sillers M, Squires K, Chiz S, Nowak M, Shaw G, Bucy R (1999) J Exp Med 189:1545–1554.
2. Coombs RW, Reichelderfer PS, Landay AL (2003) AIDS 17:455–480.
3. Ball SC, Abraha A, Collins KR, Marozsan AJ, Baird H, Quinones-Mateu ME, Penn-Nicholson A, Murray M, Richard N, Lobritz M, et al. (2003) J Virol 77:1021–1038.
4. Mellors J, Rinaldo C, Gupta P, White R, Todd J, Kingsley L (1996) Science 272:1167–1170.
5. de Wolf F, Spijkerman I, Schellekens PT, Langendam M, Kuiken C, Bakker M, Roos M, Coutinho R, Miedema F, Goudsmit J (1997) AIDS 11:1799–1806.
6. Quinn TC, Wawer M, Sewankambo N, Serwadda D, Li C, Wabwire-Mangen F, Meehan M, Lutalo T, Gray R (2000) N Engl J Med 342:921–929.
7. Fideli OS, Allen SA, Musonda R, Trask S, Hahn BH, Weiss H, Mulenga J, Kasolo F, Vermund SH, Aldrovandi GM (2001) AIDS Res Hum Retroviruses 17:901–910.
8. Jacquez JA, Koopman JS, Simon CP, Longini IM, Jr (1994) J Acquir Immune Defic Syndr 7:1169–1184.
9. Wawer MJ, Gray RH, Sewankambo NK, Serwadda D, Li X, Laeyendecker O, Kiwanuka N, Kigozi G, Kiddugavu M, Lutalo T, et al. (2005) J Infect Dis 191:1403–1409.
10. Lavreys L, Baeten JM, Chohan V, McClelland RS, Hassan WM, Richardson BA, Mandaliya K, Ndinya-Achola JO, Overbaugh J (2006) Clin Infect Dis 42:1333–1339.
11. Morgan D, Mahe C, Mayanja B, Okongo JM, Lubega R, Whitworth JA (2002) AIDS 16:597–603.
12. Morgan D, Mahe C, Mayanja B, Whitworth JAG (2002) Br Med J 324:193–196.
13. Gray RH, Li XB, Wawer MJ, Gange SJ, Serwadda D, Sewankambo NK, Moore R, Wabwire-Mangen F, Lutalo T, Quinn TC (2003) AIDS 17:1941–1951.
14. Anderson RM, Hanson M (2005) J Infect Dis 191:S85–96.
15. Anderson RM, Gupta S, May RM (1991) Nature 350:356–359.
16. Anderson RM, May RM (1982) Parasitology 85:411–426.
17. Levin S, Pimentel D (1981) Am Nat 117:308–315.
18. Nowak MA, Anderson RM, McLean AR, Wolfs TF, Goudsmit J, May RM (1991) Science 254:963–969.
19. Brenchley JM, Price DA, Schacker TW, Asher TE, Silvestri G, Rao S, Kazzaz Z, Bornstein E, Lambotte O, Altmann D, et al. (2006) Nat Med 12:1365–1371.
20. Silvestri G, Sodora DL, Koup RA, Paiardini M, O’Neil SP, McClure HM, Staprans SI, Feinberg MB (2003) Immunity 18:441–452.
21. Fraser C, Ferguson NM, de Wolf F, Anderson RM (2001) Proc R Soc Lond Ser B 268:2085–2095.
22. Pianka ER (1970) Am Nat 104:592–597.
23. Levin BR, Bull JJ, Stewart FM (1996) Math Biosci 132:69–96.
24. Levin BR, Bull JJ (1994) Trends Microbiol 2:76–81.
25. Bonhoeffer S, Nowak MA (1994) Proc Natl Acad Sci USA 91:8062–8066.
26. Gandon S, Mackinnon MJ, Nee S, Read AF (2001) Nature 414:751–756.
27. O’Brien SJ, Nelson GW (2004) Nat Genet 36:565–574.
28. Gray RH, Wawer MJ, Brookmeyer R, Sewankambo NK, Serwadda D, Wabwire-Mangen F, Lutalo T, Li XB, vanCott T, Quinn TC (2001) Lancet 357:1149–1153.

17446 | www.pnas.org/cgi/doi/10.1073/pnas.0708559104 | Fraser et al.