All content in this area was uploaded by Luis M. Lopez-Ramos on Mar 07, 2017

JOINT SENSING AND RESOURCE ALLOCATION FOR UNDERLAY COGNITIVE RADIOS

Luis M. Lopez-Ramos, Antonio G. Marques, and Javier Ramos

King Juan Carlos University (Madrid, Spain), Dept. of Signal Theory and Communications

ABSTRACT

Effective operation of cognitive radios (CRs) requires sensing the

spectrum and dynamic adaptation of the available resources accord-

ing to the sensed information. Although sensing and resource al-

location are coupled, most existing designs optimize each of the

tasks separately. This work optimizes them jointly for an under-

lay CR paradigm. The formulation considers that secondary users

adapt their power and rate based on the available imperfect channel
state information, while taking into account the cost associated
with acquiring such information. The objective of the optimization
is twofold: maximize the (sum-rate) performance of the CR and

protect the primary users through an average interference constraint.

Designing the sensing in our underlay paradigm amounts to deciding
what channel/frequency slots are sensed at every time instant.
Partial observability of the channel state (due to noisy and outdated
information) calls for (Bayesian) sequential estimators to keep track

of the interference channel gains, as well as for dynamic program-

ming tools to design the optimal schemes. Together with the optimal

schemes, a simple approximate solution is also developed.

Index Terms—Cognitive radio, underlay paradigm, sensing,

dual decomposition, sequential estimation, dynamic programming.

1. INTRODUCTION

Cognitive radios (CRs) are a key technology to alleviate spectrum

scarcity. When CRs are deployed, secondary users (SUs) have to

sense their radio environment to optimize their communication per-

formance while controlling the interference to primary users (PUs)

[1]. In underlay CRs, sensing amounts to acquiring the channel state

information (CSI) needed to limit the power transmitted by the SU,

so that the interference inflicted on the PU is kept under a prescribed

limit. This typically amounts to acquiring the (fading) gains of the

SU-to-PU channels (also known as interference channels). Due to

lack of collaboration mechanisms between PU and CR systems, ac-

curately estimating the SU-to-PU channels requires considerable ef-

fort [2]. However, the cost of acquiring the CSI is often not taken

into account in the modeling; see [3, 4, 5, 6, 7] for some excep-

tions. As a result, a careful design of the sensing policy is critical

to guarantee an efﬁcient operation of the CR. This paper optimizes

the sensing and resource allocation (RA) tasks for an underlay CR

model jointly. Uncertainties on the sensed CSI and sensing cost will

be taken into account during the RA, while the actual beneﬁt of the

CSI for the SUs will be taken into account during the sensing phase.

This work was partially supported by the Spanish Ministry of Science
(FPU Grant AP2010-1050) and the EU-FP7 (ICT-2011-9-TUCAN3G).

Some important challenges to optimize the sensing and RA
jointly are: (C1) the need of the RA algorithms to deal with
imperfect CSI (noisy and outdated) that renders the exact interference
caused to PUs uncertain; (C2) the inability to estimate the
totality of the PU-SU-channel-time grid, due to the scarcity of
resources (power, time or hardware availability); and (C3) the
coupling between sensing and transmission resources. To deal with

(C1), several works aim to control the average interference using a

probabilistic representation of the state information of the primary

network (SIPN) [2, 7]. Adaptive stochastic algorithms provide ro-

bustness to non-stationarities and lack of knowledge of the channel

distributions. See also [8] for a different approach to cope with

estimation errors. To deal with (C2), advanced sensing schemes

aim at optimally selecting the subset of sensed channels [9, 10, 7].

Moreover, when the SIPN exhibits time correlation, the information

acquired can be reused at later time instants (accounting for the fact that

information gets outdated). These schemes are usually designed

using dynamic programming (DP) tools such as partially observable

Markov decision processes (POMDPs) [7, 9]. Regarding (C3), RA

in underlay CRs has been extensively investigated. In [11] an RA

framework that considers both interference constraints for PUs and

QoS constraints for SUs is presented; power is optimized jointly

with admission control. The optimal RA strategies to achieve the

ergodic and outage capacity of the SU fading channel are studied in

[12] under different types of power constraints and fading channel

models. See, e.g., [13, 8, 14, 2, 15] for other relevant setups. All

those works consider that the sensing is given and, at best, account

for the SIPN uncertainties (quantized, noisy, outdated) by making

the RA aware of such imperfections. The number of works that

aim at a globally optimal RA and sensing design by implementing a joint

optimization is much smaller; see, e.g., [9, 16, 10, 6, 7, 17], all

for interweave setups. When a joint design is implemented, the

decision of what time instants/users/channels to sense has to take

into account what the RA is going to do with such information, as

well as the impact on CR performance for current and future time

instants. As a result, the analytical complexity of the problem and

the computational burden to obtain the optimal schemes increase

considerably.

To the best of our knowledge, no previous work has addressed

the joint design of sensing and RA for underlay CRs. The un-

derlay setup is more challenging than the interweave setup, which

only requires knowing whether a frequency band is occupied or not.
Not only are the variables to be estimated continuous in the underlay

setup, but their number is much higher (all SU-to-PU pairs in all fre-

quency bands). Uncertainty and time correlation in the SIPN call

for (Bayesian) sequential estimators to keep track of the interference

channel and DP/POMDP tools to design the optimal schemes.

Our design approach is similar to the one followed in our previous

work [7, 17] for interweave CRs. We ﬁrst design the RA for any

sensing scheme and, then, design the optimal sensing taking into ac-

count the optimal RA. Since the modiﬁcations in the RA to account

for the sensing cost are relatively simple, the main novelty is on the

design of the sensing schemes. The main contributions of this pa-

per are: the formulation of a joint optimization of RA and sensing

for an underlay CR; the design of an algorithm that, leveraging dual

decomposition and DP/POMDP tools, solves the joint optimization;

and the design of a low-complexity algorithm that, using a greedy

(myopic) approach, approximates the optimal solution. Our paper

must be viewed as a ﬁrst step to developing low-complexity approx-

imations to the optimal solution.

The paper is organized as follows. Section 2 presents the system

setup, SIPN and state information of the secondary network (SISN)

models, design variables, and the constraints to be satisﬁed. The

problem is formulated in Section 3. Section 4 solves the problem

and analyzes its complexity. Section 5 presents a simple suboptimal

solution. Preliminary numerical simulations evaluating the perfor-

mance of the algorithms are provided in Section 6.

2. SYSTEM MODEL

A CR with $M$ SUs (indexed by $m$) is considered. The frequency
band used by the CR is divided into $K$ frequency-flat orthogonal

subchannels (indexed by k), so that if a SU is transmitting, no other

SU can be active in the same subchannel. No constraints are im-

posed on the number of channels that can be accessed by a user.

For simplicity, we assume that there is always exactly one active

PU per channel. Extensions to scenarios where these assumptions

do not hold can be handled with a moderate increase in complex-

ity. Each SU can obtain (imperfect) measurements of the channel

gain between itself and the PUs. More precisely, at every time slot

(indexed by n) the following three tasks are run sequentially by the

CR: T1) the SISN is acquired; T2) based on the output of T1 (and

previous measurements) a set of users is selected to measure their

interference links; T3) the outputs of T1 and T2 are used to ﬁnd the

optimal RA for instant n. This section describes the model for the

SISN and SIPN; the variables to be designed; and the constraints that

such variables need to satisfy.

Starting with the SISN, the instantaneous fading coefficient of
the channel between the $m$th secondary transmitter-receiver pair in
the $k$th channel at time $n$ is denoted as $h^m_{k,2}[n]$. This variable is
normalized with respect to noise and PU interference. Regarding
the SIPN, the noise-normalized instantaneous fading coefficient of
the interference channel between the $m$th SU and the $k$th PU is
denoted as $h^m_{k,1}[n]$. Every time that the $m$th SU is required to obtain
measurements from its interference channels, it has to pay a power
cost denoted by $q_m$ (other sensing costs can also be accommodated
into our formulation [7]). The instantaneous value of $h^m_{k,1}[n]$ will
not be assumed perfectly known because of: i) outdated information
(to save power, the interference channels are not sensed at every $n$);
and ii) errors due to noisy measurements. As a consequence, instead
of the true value of the channel gain (perfect SIPN), only statistical
information about it is available (probabilistic SIPN).

Let $\tilde{h}^m_{k,1}[n]$ denote the observation (output of the sensing task,
possibly corrupted by noise) of $h^m_{k,1}[n]$. The CR relies on the
dynamics of $h^m_{k,1}[n]$ to track the SIPN. Let us define the Boolean
variable $s_m[n]$, which is 1 if at time $n$ the $m$th SU takes measurements
$\tilde{h}^m_{k,1}[n]$, and 0 otherwise. While $s_m[n]$ will depend on measurements
acquired in the past [cf. T1], the power allocation and transmission
scheduling will also leverage the newly available measurements [cf.
T3]. Let $f^m_k(h^m_{k,1}[n] \mid n-1)$ denote the information about $h^m_{k,1}$ before
sensing (prediction density). Similarly, let $f^m_k(h^m_{k,1}[n] \mid n)$ denote
the information about $h^m_{k,1}$ after sensing (filtering density).$^1$ To use
a compact notation, these densities will be assumed to belong to the
same family and will be represented by their parameters. Let $\hat{F}^m_k[n]$
be the parameter vector of the prediction density at time $n$, and
$F^m_k[n]$ that of the filtering density. This way, $f^m_k(h^m_{k,1}[n] \mid n-1) :=
f(h^m_{k,1}[n]; \hat{F}^m_k[n])$ and $f^m_k(h^m_{k,1}[n] \mid n) := f(h^m_{k,1}[n]; F^m_k[n])$. The
stochastic filter that tracks the SIPN works as follows [19]. The
prediction density parameters at time $n$ are deterministically computed
at the prediction step from the previous filtering density parameters:

$$\hat{F}^m_k[n] := P(F^m_k[n-1]). \tag{1}$$

The filtering density at time $n$ will depend on $s_m[n]$. If $s_m[n] = 0$,
then $F^m_k[n] = \hat{F}^m_k[n]$; if $s_m[n] = 1$, then

$$F^m_k[n] := U(\hat{F}^m_k[n], \tilde{h}^m_{k,1}[n]). \tag{2}$$

$^1$ The prediction and filtering densities play the role of the pre-decision
(prior) and post-decision (posterior) beliefs, respectively [17, 7, 18, 19].

There exist different alternatives to model the stochastic process
$h^m_{k,1}$. Here, the time dynamics of the complex-valued
secondary-primary channel gain $h^m_{k,1}[n]$ are described by an auto-regressive
(AR) model with circularly-symmetric complex normal (CSCN)
innovations and CSCN noise. As a result, the parameter vectors
correspond to the mean and variance of the densities, and the prediction
and correction steps of the channel estimation can be effected by a
standard Kalman filter [19]. Since the time variability of the SISN is
considered faster than that of the SIPN, $h^m_{k,2}[n]$ will be considered
i.i.d. across time. As stated in [15], such a heterogeneous system
information model is well suited for scenarios where the mobility of
the PUs is low and sensing the SIPN is more difficult than sensing
the SISN.
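As an illustration of the prediction and correction steps in (1) and (2), the following minimal sketch tracks one interference channel with a scalar Kalman filter. It assumes a first-order AR model $h[n] = a\,h[n-1] + v[n]$ with observation $\tilde{h}[n] = h[n] + w[n]$; the names `Belief`, `predict`, `update`, `filter_step`, and the parameters `a`, `innov_var`, `noise_var` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Belief:
    """Parameters of a CSCN density: mean and variance (assumed parametrization)."""
    mean: complex
    var: float


def predict(belief: Belief, a: complex, innov_var: float) -> Belief:
    """Prediction step, playing the role of P(.) in (1) for an AR(1) model."""
    return Belief(a * belief.mean, abs(a) ** 2 * belief.var + innov_var)


def update(pred: Belief, obs: complex, noise_var: float) -> Belief:
    """Correction step, playing the role of U(.,.) in (2) for a noisy observation."""
    gain = pred.var / (pred.var + noise_var)  # scalar Kalman gain
    return Belief(pred.mean + gain * (obs - pred.mean), (1.0 - gain) * pred.var)


def filter_step(prev: Belief, a: complex, innov_var: float, noise_var: float,
                sensed: bool, obs: Optional[complex] = None) -> Belief:
    """One time slot: predict, then correct only if the SU sensed (s_m[n] = 1)."""
    pred = predict(prev, a, innov_var)
    return update(pred, obs, noise_var) if sensed else pred


def expected_power_gain(belief: Belief) -> float:
    """E|h|^2 under the belief: squared magnitude of the mean plus the variance."""
    return abs(belief.mean) ** 2 + belief.var
```

When the channel is not sensed, the filtering belief equals the prediction belief, so the variance grows over time until a new measurement is taken; this is how outdated information is accounted for in this sketch.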

Next, we introduce the design variables $w^m_k[n]$ (scheduling
coefficients), $p^m_k[n]$ (transmit power), and $s_m[n]$ (sensing decision,
already described). Coefficients $w^m_k[n]$ effect the orthogonal access
among SUs. Specifically, $w^m_k[n]$ is 1 if the $m$th SU is scheduled to
transmit into the $k$th band at time $n$ and 0 otherwise. Moreover, if
$w^m_k[n] = 1$, $p^m_k[n]$ denotes the instantaneous nominal power
transmitted over the $k$th band by the $m$th SU. This means that power
$p^m_k[n]$ is consumed when $w^m_k[n] = 1$. Under bit error rate or capacity
constraints, instantaneous rate and power variables are coupled.
This rate-power coupling will be represented by the non-decreasing
function $C^m_k(h^m_{k,2}[n], p^m_k[n])$, and $\beta_m$ will denote the benefit (price)
associated with the rate.

The last step is to describe the constraints that the aforementioned
variables need to satisfy. The sensing decision variable is
binary, so that $s_m[n] \in \{0,1\}$. Powers are non-negative, so that
$p^m_k[n] \ge 0$. Moreover, orthogonal access requires

$$w^m_k[n] \in \{0,1\} \quad \text{and} \quad \sum_m w^m_k[n] \le 1. \tag{3}$$

The average (long-term) power the $m$th SU can consume (including
the power devoted to transmit and the power devoted to estimate the
interference channel gains) is upper bounded, that is, $\forall m$

$$\lim_{N\to\infty} \sum_{n=1}^{N} \gamma^n\, \mathbb{E}\Big[ q_m s_m[n] + \sum_k w^m_k[n]\, p^m_k[n] \Big] \le \lim_{N\to\infty} \sum_{n=1}^{N} \gamma^n\, \check{p}_m, \tag{4}$$

where $0 < \gamma < 1$ is a discount factor that is typically included in
infinite horizon formulations to facilitate the design of the optimal
schemes and accommodate potential non-stationarities [18]. Note
also that the right-hand side of (4) is the geometric sum
$\gamma\check{p}_m/(1-\gamma)$ (it equals $\check{p}_m/(1-\gamma)$ when the sum starts at $n=0$). The allocated
power will generate interference to PUs. Since an underlay setup is
considered, each time a SU transmits in channel $k$, the interference
generated at the PU receiver is $h^m_{k,1}[n]\, p^m_k[n]$. To protect the PUs, a
limit on the average (long-term) interference at each PU is enforced.
This amounts to requiring, for all $k$,

$$\lim_{N\to\infty} \sum_{n=1}^{N} \gamma^n\, \mathbb{E}\Big[ \sum_m w^m_k[n]\, h^m_{k,1}[n]\, p^m_k[n] \Big] \le \lim_{N\to\infty} \sum_{n=1}^{N} \gamma^n\, \check{o}_k. \tag{5}$$

Here, we control interference by limiting the average interfering

power at the PU receiver [11]. This keeps the modeling simple and

leads to a convex constraint. Alternative metrics can be used to con-

trol interference (e.g. outage probability [15]), provided that the in-

crease in computational complexity can be afforded.

3. PROBLEM FORMULATION

The last step to formulate the optimization problem is to identify
the metric to be maximized. In this work, the average sum rate
achieved by the secondary network will be maximized. With $\mathcal{X} :=
\{s_m[n], w^m_k[n], p^m_k[n] \mid \forall m, k, n\}$, the optimal joint design is then

$$\max_{\mathcal{X}} \ \lim_{N\to\infty} \sum_{n=1}^{N} \gamma^n\, \mathbb{E}\Big[ \sum_{k,m} \beta_m w^m_k[n]\, C^m_k(h^m_{k,2}[n], p^m_k[n]) \Big] \tag{6a}$$

$$\text{s. to:} \ (3), (4), (5), \ p^m_k[n] \ge 0, \ s_m[n] \in \{0,1\}. \tag{6b}$$

The two main issues that render this problem challenging to
solve are: i) the design variables $w^m_k[n]$ and $s_m[n]$ are binary, so
that the complexity to optimize over them is combinatorial; and
ii) the value of some design variables at time $n$ has an impact on
the state variables at instants $n' \ge n$ (specifically, $s_m[n]$ has an
impact on future beliefs through $F^m_k[n]$); as a consequence, solving
(6) optimally requires using DP tools.

Regarding the first challenge, the combinatorial complexity
associated with optimizing over $w^m_k[n]$ can be bypassed by relaxing
the binary constraint to its convex counterpart $w^m_k[n] \in [0,1]$. Such
a relaxation can be shown optimal because $\{w^m_k[n]\}$ are present only
in linear terms and because $\{w^m_k[n]\}$ do not have an impact on the
future state variables; see, e.g., [15] for details. Unfortunately, that
is not true for $s_m[n]$ and, hence, the associated complexity remains
combinatorial. The optimal solution is presented in the next section,
while Section 5 presents a low-complexity approximation.

4. OPTIMAL SOLUTION

After dualizing the long-term constraints (4) and (5), the
optimization of $\{w^m_k[n]\}$ and $\{p^m_k[n]\}$ can be separated across time
and channels. This fact, together with other properties of (6),
will be leveraged to decrease the computational complexity
required to solve the DP. The critical step is to tackle the
optimization in two stages: i) finding the optimal $\{w^m_k[n]\}$ and
$\{p^m_k[n]\}$ for any sensing policy; and ii) substituting the output
of (i) into (6) and solving for the optimal $\{s_m[n]\}$. Note that
this does not entail a loss of optimality because the solution in
(i) is a function of the sensing policy, which is later optimized in
(ii). Mathematically, for a generic function $f(x, y)$, the approach
amounts to finding $(x^*, y^*) = \arg\min_{x,y} f(x, y)$ as follows: i)
$x^*(y) := \arg\min_x f(x, y)$ and ii) $y^* = \arg\min_y f(x^*(y), y)$,
so that $x^* = x^*(y^*)$.
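The two-stage argument can be checked numerically on a toy problem. The sketch below compares a joint grid search with the nested (inner-then-outer) search; the function names and the example cost are hypothetical and serve only to illustrate that the nested search reaches the joint optimum.

```python
import itertools


def joint_min(f, xs, ys):
    """Exhaustive joint search for the pair (x, y) minimizing f(x, y)."""
    return min(itertools.product(list(xs), list(ys)), key=lambda xy: f(*xy))


def two_stage_min(f, xs, ys):
    """Stage i): best x for every y. Stage ii): best y given x*(y)."""
    xs, ys = list(xs), list(ys)
    best_x = {y: min(xs, key=lambda x: f(x, y)) for y in ys}
    y_star = min(ys, key=lambda y: f(best_x[y], y))
    return best_x[y_star], y_star


def toy_cost(x, y):
    """Hypothetical cost with a unique minimizer at (x, y) = (2, 1)."""
    return (x - 2 * y) ** 2 + (y - 1) ** 2
```

Both searches return the same minimizer for `toy_cost` on a grid that contains the point (2, 1); the nested version mirrors how the RA is solved for any sensing decision before the sensing decision itself is optimized.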

4.1. Optimal RA

The optimization carried out in the first step yields a problem of
the same form as that solved in [7]; for this reason, the optimal
solution is given here directly. The optimal solution to the problem
at hand consists in defining a link quality indicator (LQI) $\varphi^m_k(p)$,
optimizing it with respect to the power for every user-channel pair,
and selecting for transmission the SU with the highest LQI in each
channel. The LQI for the problem at hand is:

$$\varphi^m_k(p) := \beta_m C(h^m_{k,2}[n], p) - (\pi_m + \theta_k \mu^m_k[n])\, p \tag{7}$$

where $\mu^m_k[n] := \mathbb{E}\big[ |h^m_{k,1}[n]|^2 \,\big|\, F^m_k[n] \big]$ is the expected power gain
of the interference channel (according to the post-decision
belief); and $\pi_m$ and $\theta_k$ are the Lagrange multipliers associated with
constraints (4) and (5), respectively.$^2$ To optimize the RA for
instant $n$, select $p^{m,\star}_k[n] := \arg\max_p \varphi^m_k(p)$, and $w^{m,\star}_k[n] :=
\mathbb{1}\{\varphi^m_k(p^{m,\star}_k[n]) = \max_{q,p} \varphi^q_k(p)\}$, where $\mathbb{1}\{\cdot\}$ is the indicator function. Note
that the maximizer of (7) can be expressed in closed form for several choices of $C(\cdot)$.
For example, if $C(\cdot)$ is Shannon's capacity, then $p^{m,\star}_k$ takes the form
of the water-filling solution [20, 12].
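A minimal sketch of the RA rule follows, assuming the rate function $C(h, p) = \log_2(1 + hp)$ with $h$ a non-negative normalized power gain. Under that assumption, setting the derivative of (7) to zero gives $p^\star = \max\{0,\ \beta_m / ((\pi_m + \theta_k\mu^m_k)\ln 2) - 1/h\}$. The function names and the nested-list layout (`h2[m][k]`, `mu[m][k]`) are hypothetical choices made for this example.

```python
import math


def optimal_power(beta, h2, price):
    """Water-filling maximizer of beta*log2(1 + h2*p) - price*p over p >= 0."""
    if price <= 0.0:
        raise ValueError("price must be positive for a bounded solution")
    if h2 <= 0.0:
        return 0.0  # no rate can be obtained, so no power is spent
    return max(0.0, beta / (price * math.log(2.0)) - 1.0 / h2)


def lqi(beta, h2, price, p):
    """Link quality indicator (7) under the assumed Shannon rate function."""
    return beta * math.log2(1.0 + h2 * p) - price * p


def allocate(beta, h2, pi, theta, mu):
    """For each channel k, return (winner m, power, LQI) of the best user."""
    schedule = []
    for k in range(len(theta)):
        best = None
        for m in range(len(beta)):
            price = pi[m] + theta[k] * mu[m][k]  # pi_m + theta_k * mu_k^m
            p = optimal_power(beta[m], h2[m][k], price)
            value = lqi(beta[m], h2[m][k], price, p)
            if best is None or value > best[2]:  # ties go to the lowest index
                best = (m, p, value)
        schedule.append(best)
    return schedule
```

In this sketch the scheduling coefficient $w^m_k[n]$ is 1 for the returned winner of channel $k$ and 0 for every other user; a larger expected interference gain `mu[m][k]` raises the effective price of power and therefore lowers both the allocated power and the LQI.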

4.2. Optimal sensing

Leveraging the expressions for the optimal RA, we now solve for
the optimal $s_m[n]$. First, we define the instantaneous reward $R[n]$,
which accounts for the terms at time $n$ that depend on $s_m[n]$:

$$R[n] := \sum_k \max_m \varphi^{m,\star}_k(\mu^m_k[n]) - \sum_m \pi_m q_m s_m[n], \tag{8}$$

where $\varphi^{m,\star}_k(\mu^m_k[n])$ is the optimal value of $\varphi^m_k$ for a given
$\mu^m_k[n]$. Note that $\varphi^{m,\star}_k(\mu^m_k[n])$ depends on $\mathbf{s}[n]$ because the SIPN
$\mu^m_k[n]$ depends on $\mathbf{s}[n]$ [cf. (2)].

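After substituting the optimal RA and (8) into the Lagrangian of
(6), the maximization boils down to

$$\max_{\{\mathbf{s}[n] \in \mathbb{M}\}_{\forall n}} \ \lim_{N\to\infty} \sum_{n=1}^{N} \gamma^n\, \mathbb{E}\big[ R[n] \,\big|\, \mathbf{s}[n] \big]. \tag{9}$$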

To stress that the sensing decisions of all users have to be jointly
optimized, the notation $s_m[n] \in \{0,1\}\ \forall m$ has been replaced with
$\mathbf{s}[n] \in \mathbb{M}$. The coupling exists because the sensing decision for
user $m$ affects its probability (and hence, also the other users'
probabilities) of being scheduled.

The main differences between (9) and the original formulation in
(6) are that now: i) as a result of the Lagrangian relaxation of the DP,
the objective has been augmented with the terms accounting for the
dualized constraints; ii) the only remaining optimization variables
are $\mathbf{s}[n]$; and iii) because the optimal RA fulfills the constraints
(3)-(5) and $p^m_k[n] \ge 0$, the only remaining constraint is $\mathbf{s}[n] \in \mathbb{M}$
(standard DP algorithms usually assume countable action spaces).

The problem falls into the class of POMDP because state tran-

sitions and average rewards only depend on the current state-action

pair, and the system state is not known perfectly. Only an
observation (affected by noise or missing data) of the state is available
instead [21]. In this model, the partially observable variable is $h^m_{k,1}[n]$.
The belief variable required to solve this POMDP is constituted by
the prediction and filtering densities associated with $h^m_{k,1}[n]$.

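To solve for $\mathbf{s}^*[n]$, we derive the Bellman equations [18]
associated with (9). The objective is split into present and future rewards,
yielding

$$\mathbf{s}^*[n] = \arg\max_{\mathbf{s} \in \mathbb{M}} \Big\{ R[n]\big|_{\mathbf{s}[n]=\mathbf{s}} + \sum_{t=n+1}^{\infty} \gamma^{t-n} R[t]\big|_{\mathbf{s}[n]=\mathbf{s}} \Big\}. \tag{10}$$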

To account for the effect of current actions in future instants, the
value function $V(h_2[n], \hat{F}[n])$ is introduced (where $\hat{F}[n]$ collects
$\hat{F}^m_k[n]\ \forall (k, m)$ and $h_2[n]$ collects $h^m_{k,2}[n]\ \forall (k, m)$). It quantifies the
expected sum reward for all future instants. Since (9) is an
infinite horizon DP with $\gamma < 1$, the optimal value function is
stationary and its existence is guaranteed [18]. Similarly to [7], since
the $h^m_{k,2}[n]$ are considered i.i.d. across time and independent of
$s_m[n]$, the Bellman equations that drive the optimal sensing can be
expressed in terms of $\bar{V}(\hat{F}[n]) := \mathbb{E}_{h_2}[V(h_2[n], \hat{F}[n])]$:

$$\mathbf{s}^*[n] = \arg\max_{\mathbf{s} \in \mathbb{M}} \Big\{ \mathbb{E}_{\tilde{h}}\big[ R[n] + \gamma \bar{V}(\hat{F}[n+1]) \,\big|\, \mathbf{s}[n] = \mathbf{s} \big] \Big\} \tag{11}$$

$$\bar{V}(\hat{F}[n]) = \mathbb{E}_{h_2}\Big[ \max_{\mathbf{s} \in \mathbb{M}} \Big\{ \mathbb{E}_{\tilde{h}}\big[ R[n] + \gamma \bar{V}(\hat{F}[n+1]) \,\big|\, \mathbf{s}[n] = \mathbf{s} \big] \Big\} \Big] \tag{12}$$

where $\mathbb{E}_{\tilde{h}}$ is the expectation over the distribution of $\{\tilde{h}^m_{k,1}\}\ \forall (m, k)$.

$^2$ Since, after relaxation, the RA problem has zero duality gap, there exists
a constant (stationary) optimal value for each multiplier [7]. In this work, the
optimal values of $\{\pi_m, \theta_k\}$ are assumed to be known.

The only remaining step to design the sensing scheme is to design
an algorithm to compute $\bar{V}(\hat{F}[n])$. There exist different alternatives
that exploit the recursive definition in (12) to accomplish this task
[18]. Space limitations prevent us from delving into the details of such
algorithms, but it is important to stress that (even after leveraging the
problem structure) their computational complexity is very large.

5. APPROXIMATE SOLUTION

The two main sources of complexity to find $\mathbf{s}^*[n]$ are: i) during the
initialization phase, the multidimensional function $\bar{V}(\cdot)$ needs to be
estimated iteratively using a Monte Carlo approach and ii) at every
time instant, an exhaustive search over $\mathbb{M}$ needs to be implemented.
Since (i) is run off line only once, we focus on reducing the online
complexity in (ii). In particular, we use a greedy approach under
which users are sequentially selected to measure the channel. We
start by supposing that no user senses the channel and sequentially
set $s_m[n] = 1$ for the SU that yields the highest (positive) expected
reward. The algorithm stops either when none of the remaining SUs
yields a positive reward, or when all users are scheduled to sense
the channel. The approximation is well justified because channels
across SUs are not correlated. Algorithm 1 lists the main steps of the
algorithm, with $\mathbf{0}$ and $\mathbf{1}$ denoting the all-zeros and all-ones vectors
and $\mathbf{e}_m$ the $m$th canonical $M \times 1$ vector.

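Algorithm 1 Greedy approximation to the optimal sensing policy.

1: $\tilde{\mathbf{s}} \leftarrow \mathbf{0}$ and $\mathcal{M} \leftarrow \{1, \ldots, M\}$
2: repeat
3: $m^\star \leftarrow \arg\max_{m \in \mathcal{M}} \mathbb{E}[R[n] + \gamma \bar{V}(\hat{F}[n+1]) \mid \mathbf{s}[n] = \tilde{\mathbf{s}} + \mathbf{e}_m]$
4: $\Delta R \leftarrow \mathbb{E}[R[n] + \gamma \bar{V}(\hat{F}[n+1]) \mid \mathbf{s}[n] = \tilde{\mathbf{s}} + \mathbf{e}_{m^\star}] - \mathbb{E}[R[n] + \gamma \bar{V}(\hat{F}[n+1]) \mid \mathbf{s}[n] = \tilde{\mathbf{s}}]$
5: if $\Delta R > 0$, then $\tilde{\mathbf{s}} \leftarrow \tilde{\mathbf{s}} + \mathbf{e}_{m^\star}$ and $\mathcal{M} \leftarrow \mathcal{M} \setminus \{m^\star\}$
6: until $\Delta R \le 0$ or $\tilde{\mathbf{s}} = \mathbf{1}$
7: return $\mathbf{s}[n] := \tilde{\mathbf{s}}$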

The expectations in line 3 (which are taken over $\tilde{h}$) can be computed
efficiently using a Monte Carlo method. Since the imperfections in
$h^m_{k,1}[n]$ are independent across $(m, k)$, each of the $M$ expectations
in line 3 can be implemented with complexity $O(MKN)$, where
$N$ is the number of random realizations per $\tilde{h}^m_{k,1}[n]$. Then, the
online complexity of the overall algorithm is $O(M^3KN)$, because the
repeat loop is executed at most $M$ times. Note that Algorithm 1
assumes that $\bar{V}(\cdot)$ has been computed off line. If the associated burden
cannot be afforded, further simplified algorithms can be developed
either by approximating $\bar{V}(\cdot)$, or just by dropping it (myopic policy).$^3$

$^3$ A myopic policy ignores the impact of the sensing decision on future
time instants, focusing only on maximizing the instantaneous reward.
Mathematically, this amounts to setting $\bar{V}(\cdot) = 0$.

[Fig. 1.1. Results for Test Case 1. Plot of average spectral efficiency (bps/Hz) vs. average interference power gain. Curves: Exhaustive, Algorithm 1, Round-Robin, Always Sense, Never Sense, Random, Upper Bound.]

[Fig. 1.2. Results for Test Case 2. Plot of average spectral efficiency (bps/Hz) vs. number of channels.]

6. NUMERICAL RESULTS

A CR with $M = 4$ SUs is simulated. The SISN follows a Rayleigh
fading model with an average SNR of -5 dB. All users have the same
priority, so that $\beta_m = 1\ \forall m$. The SIPN follows an AR-1 model
with a coefficient of 0.95. The observed SIPN is corrupted by
additive Gaussian noise with an SNR of 3 dB. The power constraint at
the SU transmitters is set to $[\check{p}_1, \ldots, \check{p}_4] = [6.0, 7.2, 9.0, 12.0]$. The
interference power constraint at the PU receivers is $\check{o}_k = 2.0\ \forall k$.
The sensing cost parameter [cf. (4)] is $q_m = 5\ \forall m$. The Lagrange
multipliers [cf. (7)] are computed using the method in [15].

Since we focus on the sensing policy, all tested schemes
implement the optimal RA policy in Section 4.1. We are interested in
comparing the performance of the myopic policy using the
following schemes: i) an exhaustive search over $\mathbf{s}[n]$ (combinatorial
complexity); ii) Algorithm 1 (proposed, polynomial complexity); iii) a
round-robin scheme that sequentially selects a single different user
at each $n$; iv) a scheme that randomly selects $s_m[n]$ mimicking the
distribution of $s_m[n]$ in (ii); deterministic schemes that v) always
sense, vi) never sense; and vii) an upper bound on the system
performance (using the algorithm in (v) and setting $q_m = 0\ \forall m$).

Two test cases are run: TC1) the average power gain of the
interference is fixed to -3 dB and the spectral efficiency is plotted vs.
$K$ in Fig. 1.1; TC2) $K = 4$ and the spectral efficiency is plotted vs.
the average power gain of the interference in Fig. 1.2. The average
power and interference constraints are tightly satisfied in all cases.

Results show close performance of Algorithm 1 and exhaustive
search for the simulated test cases. This suggests that Algorithm 1
can be a good option when $M$ is large. Further, this motivates using
the greedy approach to compute a suboptimal estimate of $\bar{V}(\cdot)$.
Such schemes will be addressed in future work.

7. REFERENCES

[1] S. Haykin, “Cognitive radio: brain-empowered wireless com-

munications,” IEEE J. Sel. Areas Commun., vol. 23, no. 2, pp.

201–220, Feb 2005.

[2] E. Dall’Anese, S.-J. Kim, G. Giannakis, and S. Pupolin,

“Power control for cognitive radio networks under channel
uncertainty,” Wireless Communications, IEEE Transactions on,
no. 99, pp. 1–11, 2011.

[3] Y.-C. Liang, Y. Zeng, E.C.Y. Peh, and A.T. Hoang, “Sensing-

throughput tradeoff for cognitive radio networks,” IEEE Trans.

Wireless Commun., vol. 7, no. 4, pp. 1326–1337, 2008.

[4] G. Xiong, S. Kishore, and A. Yener, “Cost constrained spec-

trum sensing in cognitive radio networks,” in 44th Conf. on

Information Sciences and Systems (CISS), Princeton, NJ, Mar.

17–19, 2010.

[5] D. Xu and X. Liu, “Opportunistic spectrum access in cognitive

radio networks: When to turn off the spectrum sensors,” in

4th Intl. Wireless Internet Conf. (WiCON), Maui, HI, Nov.

17–19, 2008.

[6] S.-J. Kim and G. Giannakis, “Sequential and cooperative sens-

ing for multi-channel cognitive radios,” IEEE Trans. Signal

Process., vol. 58, no. 8, pp. 4239–4253, Aug. 2010.

[7] L. M. Lopez-Ramos, A. G. Marques, and J. Ramos, “Jointly

optimal sensing and resource allocation for multiuser overlay

cognitive radios,” CoRR, vol. abs/1211.0954, 2012.

[8] Y. Chen, G. Yu, Z. Zhang, H.-H. Chen, and P. Qiu, “On cogni-

tive radio networks with opportunistic power control strategies

in fading channels,” Wireless Communications, IEEE Trans-

actions on, vol. 7, no. 7, pp. 2752–2761, 2008.

[9] Q. Zhao, L. Tong, A. Swami, and Y. Chen, “Decentralized

cognitive MAC for opportunistic spectrum access in ad hoc

networks: A POMDP framework,” Selected Areas in Com-

munications, IEEE Journal on, vol. 25, no. 3, pp. 589–600,

2007.

[10] X. Wang, “Joint sensing-channel selection and power control

for cognitive radios,” IEEE Trans. Wireless Commun., vol. 10,

no. 3, pp. 958–967, Mar. 2011.

[11] L. B. Le and E. Hossain, “Resource allocation for spectrum

underlay in cognitive radio networks,” Wireless Communica-

tions, IEEE Transactions on, vol. 7, no. 12, pp. 5306–5315,

2008.

[12] X. Kang, Y.-C. Liang, A. Nallanathan, H.K. Garg, and

R. Zhang, “Optimal power allocation for fading channels in

cognitive radio networks: Ergodic capacity and outage capac-

ity,” Wireless Communications, IEEE Transactions on, vol. 8,

no. 2, pp. 940–950, 2009.

[13] X. Gong, S. Vorobyov, and C. Tellambura, “Optimal band-

width and power allocation for sum ergodic capacity under fad-

ing channels in cognitive radio networks,” IEEE Trans. Signal

Process., vol. 59, no. 4, pp. 1814–1826, Apr. 2011.

[14] Y. Y. He and S. Dey, “Power allocation in spectrum sharing

cognitive radio networks with quantized channel information,”

IEEE Trans. Commun., vol. 59, no. 6, pp. 1644–1656, Jun.

2011.

[15] A.G. Marques, L.M. Lopez-Ramos, G.B. Giannakis, and

J. Ramos, “Resource allocation for interweave and underlay

CRs under probability-of-interference constraints,” Selected Areas
in Communications, IEEE Journal on, vol. 30, no. 10, pp.

1922–1933, 2012.

[16] Y. Chen, Q. Zhao, and A. Swami, “Joint design and separation
principle for opportunistic spectrum access in the presence of
sensing errors,” IEEE Trans. Inf. Theory, vol. 54, no. 5,
pp. 2053–2071, May 2008.

[17] L.M. Lopez-Ramos, A.G. Marques, and J. Ramos, “Soft-

decision sequential sensing for optimization of interweave cog-

nitive radio networks,” in Signal Processing Advances in Wire-

less Communications (SPAWC), 2013 IEEE 14th Workshop

on, 2013, pp. 235–239.

[18] W.B. Powell, Approximate Dynamic Programming: Solving

the Curses of Dimensionality, Wiley Series in Probability and

Statistics. Wiley, 2011.

[19] J. V. Candy, Bayesian Signal Processing: Classical, Modern

and Particle Filtering Methods, Wiley-Interscience, 2009.

[20] A. Goldsmith, Wireless Communications, Cambridge Univer-

sity Press, 2005.

[21] D. Braziunas, “POMDP solution methods,” University of

Toronto, 2003.