
A New SoC Test Scheduling Algorithm using Random Insertion

Jung-Been Im Sunghoon Chun Geunbae Kim Jin-Ho Ahn Sungho Kang

Department of Electrical and Electronic Engineering

Yonsei University

134, Shinchon-Dong Seodaemoon-Gu, Seoul, Korea

Tel: +82-2-2123-2775, Fax: +82-2-313-8053

<joazoa, shchun, kgb9572, sominaby>@soc.yonsei.ac.kr

shkang@yonsei.ac.kr

Abstract

This paper presents a new SoC (System-on-Chip) test scheduling algorithm. Reducing the test application time is an important issue in core-based SoC testing. In this paper, each core is represented by a rectangle whose height is equal to the TAM width and whose width is equal to the test time. The ‘one-element-exchange’ algorithm is used to optimize the test time of each core, and the ‘RAIN’ algorithm is used to optimize the test time of the SoC. The RAIN algorithm uses the sequence pair data structure to represent the placement of rectangles, and obtains optimized results by inserting each core at a random position in the sequence pair. Experiments conducted on the ITC ’02 SoC benchmarks show that the proposed algorithm gives shorter test application times than earlier work in most cases.

I. Introduction

The number of cores embedded in a SoC is increasing rapidly, and cores are being embedded more deeply in the chip. Therefore, testing the cores by direct access through the chip’s I/O pins is almost impossible. To solve this problem, methods such as the IEEE P1500 test wrapper [1] and the TAM (Test Access Mechanism) were proposed. A test wrapper isolates a core from the surrounding logic and provides the interface between the core and the TAM. The TAM is the mechanism that transfers test data from the SoC’s I/O pins to the wrappers of the cores.

However, accessibility is not the only problem in SoC testing. Test cost reduction is another, and much research is being done to reduce the test cost. SoC test scheduling is one such approach. SoC test scheduling is the process of minimizing the test application time of all cores in a SoC under given constraints such as TAM bandwidth and power. It includes the optimization of the test wrapper design, the assignment of a TAM width to each core, and the determination of the test start and finish times for each core.

Several prior works address the SoC test scheduling problem. In [2], mixed integer linear programming (MILP) was used to solve test scheduling. In [5], the SoC test scheduling problem was formulated as a 2-dimensional bin packing problem, and each core was represented by a rectangle whose width was the number of SoC pins allocated and whose height was the core test time given that number of SoC pins. The rectangle representation was also used in [3], where a technique based on rectangle packing was used for wrapper/TAM co-optimization and test scheduling for SoCs.

Several recent papers started to use the sequence pair

representation for test scheduling problems [6] [7] [11].

Since [6] and [7] give the lowest test time results for most

of the cases, their results are mainly presented in the

experimental result section for comparison with ours.

In this paper, we use the sequence pair representation and propose the RAIN (RAndom INsertion) algorithm to minimize the test application time of all cores in a SoC. The name reflects the major operation of the algorithm, which is insertion at a random position of the sequence pair.

This paper is organized as follows. In Section 2, we briefly describe the sequence pair representation. Wrapper design optimization and a new ‘one-element-exchange’ algorithm are presented in Section 3, and our SoC test scheduling algorithm is presented in Section 4. Section 5 contains our experimental results for four ITC ’02 benchmarks. Section 6 concludes the paper.


II. Sequence Pair Representation

When a core is represented by a rectangle, the height of

the rectangle is the TAM width assigned to the core, and

the width of the rectangle is the test application time for

that TAM width. To schedule the cores represented by

rectangles, we use the sequence pair representation.

The sequence pair representation was first introduced in [10] for VLSI placement. In SoC test scheduling, it was used in [11], [6] and [7]. The sequence pair representation uses two permutations (Γ+,Γ-) to describe the placement of rectangles. The placement of rectangles is determined by the following rules.

(Γ+,Γ-) = (••• a ••• b •••, ••• a ••• b •••)

: a is placed to the left of b

(Γ+,Γ-) = (••• b ••• a •••, ••• a ••• b •••)

: a is placed below b

To know the placement of rectangles from a sequence

pair, one can construct a 45 degree oblique grid as shown in

Figure 1 (a).

(a) Oblique Grid (b) Rectangle Placement

Figure 1. Sequence pair (Γ+,Γ-) = (abdc, dabc)

Two vertex-weighted directed acyclic graphs can be constructed from the sequence pair. One is the horizontal directed graph GH, and the other is the vertical directed graph GV. The weight of each vertex in GH is the width of the rectangle, that is, the test application time of the core. The weight of each vertex in GV is the height of the rectangle, that is, the TAM width assigned to the core. Applying the well-known longest path algorithm to GH (GV) gives the overall SoC test time (maximum TAM width).
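As a rough illustration of how the longest paths in GH and GV yield the schedule, the following Python sketch evaluates a sequence pair directly. The function name `place` and the example test times and TAM widths are assumptions made for illustration, not values from this paper. Instead of building the two graphs explicitly, the sketch scans Γ- in order, which visits every rectangle after all of its left and lower neighbours and therefore gives the same longest-path values.

```python
def place(gamma_pos, gamma_neg, time, tam):
    """Evaluate a sequence pair (hypothetical helper, not from the paper).

    time[c] is the rectangle width (test time) and tam[c] the rectangle
    height (TAM width) of core c. Returns start times, lowest TAM lines,
    the overall test time and the maximum TAM width.
    """
    pos = {c: i for i, c in enumerate(gamma_pos)}
    x, y = {}, {}
    # Every core a that is left of or below b appears before b in gamma_neg,
    # so scanning gamma_neg is a valid topological order for GH and GV.
    for b in gamma_neg:
        # a before b in both sequences: a is placed to the left of b.
        x_b = max((x[a] + time[a] for a in x if pos[a] < pos[b]), default=0)
        # a after b in gamma_pos but before b in gamma_neg: a is below b.
        y_b = max((y[a] + tam[a] for a in y if pos[a] > pos[b]), default=0)
        x[b], y[b] = x_b, y_b
    total_time = max(x[c] + time[c] for c in x)
    max_tam = max(y[c] + tam[c] for c in y)
    return x, y, total_time, max_tam


# Sequence pair of Figure 1 with made-up rectangle sizes.
time = {"a": 4, "b": 3, "c": 2, "d": 5}
tam = {"a": 2, "b": 2, "c": 3, "d": 1}
x, y, total_time, max_tam = place("abdc", "dabc", time, tam)
print(total_time, max_tam)
```

With these sizes, d lies below a and b, while c is to the right of all other rectangles, in agreement with the two placement rules.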

III. Wrapper Design Optimization

1. Wrapper Design

A test wrapper is DFT logic for testing cores embedded in a SoC. When fewer wrapper pins are assigned than the core has I/O ports, it is important to optimize the test wrapper of the core to minimize the test time. To calculate the test application time T, we use the well-known formula given in [8]:

T = { 1 + max (Si, So) } • P + min (Si, So)

where P denotes the number of test patterns, and Si (So) is the length of the longest wrapper scan-in (scan-out) chain. Since the number of test patterns is fixed, reducing the length of the longest wrapper scan-in and scan-out chains is necessary to minimize the test application time. To achieve this, scan chains must be assigned to the wrapper chains so that the lengths of the wrapper scan chains are as equal as possible. Functional input, output and bidirectional ports must be assigned in the same way as the scan chains. This core wrapper design problem is known to be NP-hard [9].

The first step of core wrapper design is the partitioning of the scan chains in a core. The PSC (Partitioning of Scan Chains) problem is equivalent to the well-known multi-processor scheduling problem, and several algorithms are used to solve it [8]. One widely used algorithm is based on the Best Fit Decreasing (BFD) heuristic [9]. In this paper, we use the Largest Processing Time (LPT) algorithm [8]. Given a set {T1, T2, •••, Tn} of tasks, where each task Ti has execution length l(Ti), the algorithm first sorts the tasks such that l(T1) ≥ l(T2) ≥ ••• ≥ l(Tn) and then assigns the tasks in succession to the minimally loaded processor. In the PSC problem, a task is a scan chain, l(Ti) is the length of the scan chain, and the minimally loaded processor is the shortest wrapper scan chain. We use the LPT algorithm instead of the widely used BFD algorithm because it gives better results in more cases after the ‘one-element-exchange’ algorithm explained in Section 3.2 is applied.
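A minimal Python sketch of the LPT assignment described above is given here. The function name `lpt_partition` and the example scan chain lengths are assumptions for illustration; a heap is used to find the currently shortest wrapper scan chain.

```python
import heapq


def lpt_partition(lengths, num_chains):
    """Assign scan chains, longest first, to the shortest wrapper scan chain."""
    # Heap entries are (current wrapper chain length, wrapper chain index).
    heap = [(0, k) for k in range(num_chains)]
    heapq.heapify(heap)
    chains = [[] for _ in range(num_chains)]
    for length in sorted(lengths, reverse=True):
        load, k = heapq.heappop(heap)      # minimally loaded wrapper chain
        chains[k].append(length)
        heapq.heappush(heap, (load + length, k))
    return chains


# Six scan chains (made-up lengths) partitioned into two wrapper scan chains.
print(lpt_partition([7, 5, 4, 3, 3, 2], 2))  # [[7, 3, 2], [5, 4, 3]]
```

In this small example both wrapper scan chains end up with length 12, but LPT does not guarantee a balanced result in general, which motivates the exchange step of the next subsection.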

The next step of core wrapper design is the assignment of the functional I/O ports of a core. Functional I/O ports are assigned to wrapper scan chains using the same algorithm. When there are no scan chains in a core, an unbalanced design can be used [6]. The unbalanced design assigns different numbers of wrapper ports to scan-in and scan-out. Though it can make the longest wrapper scan chain shorter than the balanced design does, it requires that the TAM support bidirectional I/O of test data. In this paper, the unbalanced design is used for comparison with the results of [6] and [7].


2. One-Element-Exchange

The results of an algorithm like LPT are not always optimal. Since the PSC problem does not require a real-time solution, we apply an additional ‘one-element-exchange’ algorithm to the results of the LPT algorithm. The ‘one-element-exchange’ algorithm first finds the longest wrapper scan chain among the current wrapper scan chains, and then searches the other wrapper scan chains for a scan chain element whose exchange with a scan chain element of the current longest wrapper scan chain reduces the length of the longest wrapper scan chain. If such elements are found, they are exchanged. This procedure is repeated until there are no exchangeable elements in the wrapper scan chains. The pseudo-code of the ‘one-element-exchange’ algorithm is given in Figure 2.

1.  Apply the LPT algorithm to the PSC problem of n wrapper scan chains;
2.  for i = 1 to n-1
3.      Sort the wrapper scan chains in decreasing order;
4.      WCL = the longest wrapper scan chain;
5.      l(WCL) = length of the longest wrapper scan chain;
6.      WCS = the ith shortest wrapper scan chain;
7.      l(WCS) = length of the ith shortest wrapper scan chain;
8.      Find two scan chains SCL and SCS in WCL and WCS, respectively, such that
            { l(SCL) - l(SCS) } is maximized and
            { 0 < { l(SCL) - l(SCS) } <= { l(WCL) - l(WCS) } / 2 }
        is satisfied;
9.      if such scan chains exist then
            Exchange them;
            i = 0;
10. end for;

Figure 2 ‘One-element-exchange’ algorithm
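The pseudo-code of Figure 2 can be sketched in Python roughly as follows. This is one possible reading, not the authors' implementation: the function name, the list-of-lists data layout and the example lengths are assumptions. The restart expressed by `i = 0` in Figure 2 is modelled by re-sorting and starting again from the shortest wrapper scan chain after every exchange.

```python
def one_element_exchange(chains):
    """Balance wrapper scan chains by exchanging single scan chains.

    chains is a list of wrapper scan chains, each a list of scan chain
    lengths (for example the output of an LPT partitioning).
    """
    chains = [list(c) for c in chains]
    exchanged = True
    while exchanged:
        exchanged = False
        chains.sort(key=sum, reverse=True)
        longest = chains[0]
        # Try partners starting from the shortest wrapper scan chain.
        for other in reversed(chains[1:]):
            gap = sum(longest) - sum(other)
            best = None  # (difference, index in longest, index in other)
            for i, a in enumerate(longest):
                for j, b in enumerate(other):
                    diff = a - b
                    # Condition of step 8: 0 < diff <= gap / 2, diff maximal.
                    if 0 < diff <= gap / 2 and (best is None or diff > best[0]):
                        best = (diff, i, j)
            if best is not None:
                _, i, j = best
                longest[i], other[j] = other[j], longest[i]
                exchanged = True
                break  # restart with freshly sorted wrapper scan chains
    return chains


# Made-up example: the longest wrapper scan chain shrinks from 18 to 16.
print(one_element_exchange([[8, 7, 3], [6, 5, 2]]))
```

Each exchange moves a difference of at most half the gap from the longest wrapper scan chain to a shorter one, so the shorter chain never overtakes the old maximum and the loop terminates for integer lengths.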

(a) Result of applying LPT algorithm only: the five wrapper scan chains (TAM 1 to TAM 5) have lengths 4662, 4662, 4662, 5142 and 4661
(b) Result of applying ‘one-element-exchange’ algorithm to (a): the five wrapper scan chains (TAM 1 to TAM 5) have lengths 4682, 4682, 4683, 5060 and 4682
Figure 3. Before and after applying ‘one-element-exchange’ algorithm for core 6 in p93791

By using this algorithm, it is possible to reduce the length of the longest wrapper scan chain, and therefore the test application time of the core is also reduced. The run time of this algorithm is less than 1 second for all cores in any benchmark. Figure 3 illustrates an example for core 6 of the p93791 benchmark. With only the LPT algorithm and a TAM width of 5, the length of the longest wrapper scan chain is 5142 bits. After applying the ‘one-element-exchange’ algorithm, it is 5060 bits. The test application time is also reduced by 17958 clock cycles, from 1126316 clock cycles to 1108358 clock cycles.

If an algorithm based on BFD is used instead of LPT before applying the ‘one-element-exchange’ algorithm, better results can be obtained in some cases. If the minimum among the results of these algorithms is selected, a more optimized wrapper design will be achieved. However, only the LPT algorithm is used for the experiments in this paper.

IV. RAIN Scheduling Algorithm

1. Excess-Area

The relation between TAM width and test time can be obtained from the equation presented in Section 3.1 after the core wrapper design of Section 3. The test time varies with TAM width as a staircase function [9], and only the pareto-optimal points [9] are considered for SoC test scheduling.
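Selecting the pareto-optimal points of the staircase can be sketched as follows. The function name and the example test times are hypothetical; a point is kept only if its test time is strictly lower than that of every smaller TAM width.

```python
def pareto_points(time_by_width):
    """Return the (TAM width, test time) pairs that are pareto-optimal.

    time_by_width maps each TAM width to the test time of the best wrapper
    design found for that width.
    """
    points = []
    best_time = None
    for width in sorted(time_by_width):
        t = time_by_width[width]
        if best_time is None or t < best_time:
            points.append((width, t))
            best_time = t
    return points


# Made-up staircase: widths 3 and 5 bring no improvement and are dropped.
print(pareto_points({1: 100, 2: 60, 3: 60, 4: 45, 5: 50}))
```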

We add a new factor, the ‘excess-area’, to the relation between TAM width and test time. The ‘excess-area’ is obtained as follows. First, the area of the rectangle is calculated as the product of TAM width and test time for each TAM width, and then the ‘excess-area’ is calculated by subtracting the smallest area from each area. The smallest area generally occurs at a TAM width of 1, but not always, because of the unbalanced design. An example is presented in Table 1 for some pareto-optimal points of core 6 of p93791. Using the ‘excess-area’ information, we select the elements whose ‘excess-area’ exceeds the designated limit and exclude them from the candidate set for wrapper design selection, since they increase both the probability of obtaining a less optimized scheduling result and the run time of the algorithm. We set the limit of the ‘excess-area’ as shown in Figure 4. Cores are divided into two groups (large and small), and the limit values of the ‘excess-area’ differ between the two groups, since the value for the large group is too large for the small group. Though we set the limit from the results of the d695 benchmark, it applies well to the other benchmarks.

Table 1. ‘Excess-area’ calculation for the pareto-optimal points of core 6 of p93791

TAM Width   Longest Wrapper     Longest Wrapper      Test Time (4)            Area (5)     Excess-Area
(1)         Scan-In Chain (2)   Scan-Out Chain (3)   = ((2)+1) x 218 + (3)    = (4)x(1)    = (5) - 5317007
1           24278               24185                5317007                  5317007      0
2           12139               12093                2658613                  5317226      219
3           8180                8180                 1791638                  5374914      57907
4           6180                6180                 1353638                  5414552      97545
5           5060                5060                 1108358                  5541790      224783
13          2060                2060                 451358                   5867654      550647
15          2000                2000                 438218                   6573270      1256263
23          1056                1052                 231478                   5323994      6987
24          1040                1040                 227978                   5471472      154465
43          1000                1000                 219218                   9426374      4109367
46          528                 526                  115848                   5329008      12001

1. Obtain average_area from TAM 1 areas of each core;
2. Divide cores into two groups:
       for each core
           if (TAM 1 area >= 80% of average_area) then
               core is member of Group_L (Large);
           else
               core is member of Group_S (Small);
3. Find the smallest excess-area for cores in Group_L (SEAL) such that at least one
   pareto-optimal point whose time and TAM width are less than or equal to the target
   ones exists;
4. Limitation_of_Group_L = SEAL • 1.4;
5. Find the smallest excess-area for cores in Group_S (SEAS) in the same way as for
   Group_L;
6. if (SEAS < SEAL / 2) then
       Limitation_of_Group_S = Limitation_of_Group_L / 2;
   else
       Limitation_of_Group_S = SEAS;

Figure 4. Setting the limit of the ‘excess-area’
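The steps of Figure 4 might be sketched in Python as below. Step 3 leaves some room for interpretation; this sketch assumes that SEAL (and likewise SEAS) is the smallest limit that still leaves every core of the group at least one pareto-optimal point within the target time and target TAM width, that is, the maximum over the cores of each core's smallest feasible excess-area. That reading, the function name and the example data are assumptions, not statements from the paper. The sketch also assumes that every core has a point for TAM width 1 and at least one feasible point.

```python
def excess_area_limits(cores, target_width, target_time):
    """Return (Group_L, Group_S, limit for Group_L, limit for Group_S).

    cores maps a core name to its pareto-optimal points, given as a list
    of (TAM width, test time) pairs.
    """
    # Step 1: the TAM 1 area is 1 * test time at TAM width 1.
    tam1_area = {name: dict(points)[1] for name, points in cores.items()}
    average_area = sum(tam1_area.values()) / len(tam1_area)

    # Step 2: split the cores into a large and a small group.
    group_l = [n for n in cores if tam1_area[n] >= 0.8 * average_area]
    group_s = [n for n in cores if n not in group_l]

    def smallest_feasible_excess(name):
        areas = [(w, t, w * t) for w, t in cores[name]]
        smallest = min(area for _, _, area in areas)
        return min(area - smallest for w, t, area in areas
                   if w <= target_width and t <= target_time)

    # Steps 3 and 5 (assumed reading: every core keeps one feasible point).
    seal = max((smallest_feasible_excess(n) for n in group_l), default=0)
    seas = max((smallest_feasible_excess(n) for n in group_s), default=0)

    # Steps 4 and 6.
    limit_l = seal * 1.4
    limit_s = limit_l / 2 if seas < seal / 2 else seas
    return group_l, group_s, limit_l, limit_s


# Two made-up cores with target TAM width 4 and target test time 600.
cores = {"big": [(1, 1000), (2, 520), (4, 300)],
         "small": [(1, 100), (2, 55), (4, 40)]}
print(excess_area_limits(cores, 4, 600))
```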

2. Basic Idea of RAIN Algorithm

The basic idea of the RAIN algorithm comes from the following characteristic of the sequence pair. As shown in Figure 5, the relative placement of the existing rectangles (Figure 5(a)) is not changed by inserting a new element at an arbitrary position of the sequence pair. Therefore, it is possible to schedule a new core without breaking the schedule obtained so far. It is only required to check whether all constraints are satisfied after each insertion.

(a) (Γ+,Γ-) = (abc, abc)     (b) (Γ+,Γ-) = (abdc, abdc)
(c) (Γ+,Γ-) = (abcd, adbc)   (d) (Γ+,Γ-) = (abcd, dabc)
Figure 5. Insertion of a new element, d, into arbitrary positions of the sequence pair
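The insertion property can be checked experimentally with a small Python sketch. The helper names `relation` and `random_insert` are hypothetical; the check only uses the two placement rules of Section 2.

```python
import random
from itertools import combinations


def relation(gp, gn, a, b):
    """Relative placement of a with respect to b in the sequence pair (gp, gn)."""
    before_pos = gp.index(a) < gp.index(b)
    before_neg = gn.index(a) < gn.index(b)
    if before_pos and before_neg:
        return "left"
    if not before_pos and not before_neg:
        return "right"
    if not before_pos and before_neg:
        return "below"
    return "above"


def random_insert(gp, gn, new, rng):
    """Insert a new element at random positions of both sequences."""
    i = rng.randrange(len(gp) + 1)
    j = rng.randrange(len(gn) + 1)
    return gp[:i] + [new] + gp[i:], gn[:j] + [new] + gn[j:]


# Start from Figure 5(a) and insert d many times at random positions.
rng = random.Random(1)
gp, gn = list("abc"), list("abc")
for _ in range(100):
    gp2, gn2 = random_insert(gp, gn, "d", rng)
    for a, b in combinations("abc", 2):
        assert relation(gp2, gn2, a, b) == relation(gp, gn, a, b)
print("relative placement of a, b and c is unchanged")
```

The relation between two existing elements depends only on their order in Γ+ and Γ-, and an insertion does not change that order, so the assertions hold for every insertion position.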

3. RAIN Algorithm

The algorithm used in this paper is named ‘RAIN’ (RAndom INsertion) because, when scheduling a new core, it only uses the insertion of the core at an arbitrary position in a sequence pair. A brief description of the algorithm is shown in Figure 6.

After the target TAM width and initial test time are determined, wrapper designs are selected under the constraints, and the core order for scheduling is determined (lines 2 to 10 in Figure 6). Then an initial solution set S_4 that satisfies the constraints is made for the first 4 cores by random selection and swapping in sequence pairs (line 11). The 5th core is inserted into the elements of the current solution set, S_4, and the solution set S_5 is generated (line 13). This insertion procedure is repeated until the last core is inserted (line 14). If the scheduling is successful, the target test time is decreased and scheduling is run again. If not, the target test time is increased and scheduling is run again (loop of lines 5 to 17).


1.  N = Number of Cores in a SoC
2.  Design wrapper;
3.  TW = Target TAM width;
4.  Select wrapper designs under constraints (TW);
5.  do
6.      TT = Set target test time;
7.      Select wrapper designs under constraints (TW, TT);
8.      EA = Set excess-area;
9.      Select wrapper designs under constraints (TW, TT, EA);
10.     Determine core order for scheduling;
11.     S_4 = Make initial solution set for first 4 cores;
12.     do
13.         S_5 = Make solution set by inserting 5th core into S_4;
14.         for i = 6 to N
                S_i = Make solution set by inserting ith core into S_(i-1);
15.         if (Scheduling succeeds) then
                break;
16.     while (Different element in S_5 can be generated);
17. while (Target test time can be adjusted);

Figure 6. RAIN scheduling algorithm

Designing the wrapper (line 2 in Figure 6) and setting the excess-area (line 8) are explained in Sections 3 and 4.1, respectively. Setting the target test time (line 6) increases or decreases the target test time of scheduling between the initial and minimum time until the amount of time change falls below the designated limit. The amount of time change is divided by 2 at every setting. The core order for scheduling (line 10) is determined by sorting the test times of the cores for TAM 1 in decreasing order so as to schedule the largest core first. If the number of pareto-optimal points for a core in the current wrapper design set is 5 or fewer, the core’s test time used for ordering is multiplied by 2. Since inserting a core that has a small number of pareto-optimal points late is not good for successful scheduling, such a core needs to be scheduled early. A solution set S_i (lines 13 to 14) is made by inserting the ith core into the elements of S_(i-1). The ith core with a randomly selected wrapper design is inserted at a randomly selected position in the sequence pair of an element of S_(i-1). All constraints (target time and TAM width) are checked at every insertion, and if they are satisfied, the generated element is included in S_i. This procedure is repeated until the number of elements in S_i or the number of insertion failures reaches the designated number.
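The construction of S_i from S_(i-1) could look roughly like the following Python sketch. It is a simplified illustration under stated assumptions, not the authors' implementation: the names `evaluate` and `insert_core`, the tuple layout of a solution element, the default failure limit of 200 and the example cores are all hypothetical, and the excess-area check is omitted.

```python
import random


def evaluate(gp, gn, design):
    """Return (overall test time, maximum TAM width) of a sequence pair.

    design[c] = (TAM width, test time) is the wrapper design chosen for c.
    """
    pos = {c: i for i, c in enumerate(gp)}
    end_time, top_line = {}, {}
    for b in gn:
        width_b, time_b = design[b]
        start = max((end_time[a] for a in end_time if pos[a] < pos[b]), default=0)
        base = max((top_line[a] for a in top_line if pos[a] > pos[b]), default=0)
        end_time[b] = start + time_b
        top_line[b] = base + width_b
    return max(end_time.values()), max(top_line.values())


def insert_core(prev_set, core, candidates, target_time, target_width,
                set_size=20, max_failures=200, rng=None):
    """Build S_i by random insertion of `core` into elements of S_(i-1)."""
    if rng is None:
        rng = random.Random()
    new_set, failures = [], 0
    while len(new_set) < set_size and failures < max_failures:
        gp, gn, design = rng.choice(prev_set)
        i = rng.randrange(len(gp) + 1)      # random position in gamma+
        j = rng.randrange(len(gn) + 1)      # random position in gamma-
        element = (gp[:i] + [core] + gp[i:],
                   gn[:j] + [core] + gn[j:],
                   {**design, core: rng.choice(candidates)})
        time, width = evaluate(*element)
        if time <= target_time and width <= target_width:
            new_set.append(element)
        else:
            failures += 1
    return new_set


# Made-up example: insert core c into a schedule of two cores.
s2 = [(["a", "b"], ["a", "b"], {"a": (2, 5), "b": (2, 4)})]
s3 = insert_core(s2, "c", [(1, 6), (2, 3)], target_time=12, target_width=4,
                 rng=random.Random(0))
print(len(s3))
```

The loop stops as soon as either the solution set is full or the failure count reaches its limit, mirroring the two stopping conditions described above.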

Since the RAIN algorithm randomly selects the position and the wrapper design to be inserted, its results can differ from run to run. To reduce the randomness, as many elements as possible are generated for the initial solution set, S_4 (line 11 in Figure 6), and S_5 is generated for all cases (loop of lines 12 to 16). However, the run time of the algorithm increases as a result of these adjustments. One way to limit this increase is to check the sum of the minimum areas and excess-areas at every insertion. Since an element whose sum of excess-areas is large has a low probability of being scheduled within the target test time, it is not selected for the next solution set.

V. Experimental Results

The experimental results on the ITC ’02 benchmarks [4] are presented in Table 2. We set the number of elements of the solution set S_i to 20, except for S_4. The experimental results were obtained on a 1.2 GHz Sun Blade 2000 workstation. All experiments were repeated 10 times, and the minimum and maximum test times are presented. The average run times of scheduling are presented in Table 3.

As seen in the results, our test times are the lowest in most cases. However, the run times of the algorithm vary from less than 5 seconds to more than 20 minutes. The long run time is caused by the repeated runs used to reduce the influence of randomness. Another cause of long run time is that too much time is spent determining that the target time is too small for successful scheduling. Table 4 shows such a case. As seen in Table 4, scheduling fails for target times less than 422345. However, the time required to determine that scheduling within a time less than 422345 is impossible (663) is more than twice the total of all successful scheduling times (245). This time must be reduced for the scheduling algorithm to run more quickly.
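The adjustment of the target test time recorded in Table 4 behaves like a search with a halving step, which the following Python sketch imitates. The function name, the `try_schedule` callback, the `first_step` parameter and the exact stopping rule are assumptions; the paper does not state how the first step is chosen. The sketch tracks only target times, not the result times actually achieved by a schedule.

```python
def search_target_time(initial_target, first_step, min_step, try_schedule):
    """Lower the target after a success, raise it after a failure.

    The step is halved (integer division) after every attempt, and the
    search stops once the step has fallen below min_step.
    Returns the smallest successful target and the list of attempts.
    """
    target, step = initial_target, first_step
    best, history = None, []
    while True:
        ok = try_schedule(target)
        history.append((target, ok))
        if ok and (best is None or target < best):
            best = target
        if step < min_step:
            break
        target = target - step if ok else target + step
        step //= 2
    return best, history


# Stand-in for the scheduler: pretend that exactly the targets of at least
# 422345 can be scheduled, as in the Table 4 run (p22810, 16 TAM wires).
best, history = search_target_time(523322, 51080, 6, lambda t: t >= 422345)
print(best)
print([t for t, _ in history])
```

With this stand-in the first fourteen targets coincide with the Target Time column of Table 4. Most of the later attempts are failures, which is where the run time of the real scheduler is spent.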

To run more quickly, the initial value of the target test time can be set to the result of a previous paper. This reduces the number of times the target test time is reset and scheduling is run again. Another way to run more quickly is to set a large lower limit on the time change (6 in our experiments). Though this can reduce the number of scheduling failures after the final result time is found, it can also be an obstacle to obtaining a more optimized result time.

VI. Conclusion

In this paper, we present a new algorithm to minimize the test application time of a SoC. First, the ‘one-element-exchange’ algorithm is used to optimize the wrapper design of the cores. Then the ‘RAIN’ algorithm, which is motivated by the insertion characteristic of the sequence pair, is applied to the optimized wrapper designs in order to minimize the test application time of the SoC. Experiments were conducted on the ITC ’02 benchmarks, and the results are presented. Though our algorithm requires a large run time and its result is not fixed because of the randomness, it gives the best results for most of the benchmarks.

References

[1] IEEE P1500 Website. http://grouper.ieee.org/groups/1500/.
[2] K. Chakrabarty, "Test Scheduling for Core-Based Systems Using Mixed-Integer Linear Programming," IEEE TCAD, pp. 1163-1174, 2000.
[3] V. Iyengar, K. Chakrabarty and E. J. Marinissen, "On Using Rectangle Packing for SOC Wrapper/TAM Co-optimization," VTS, pp. 253-258, 2002.
[4] E. J. Marinissen, V. Iyengar and K. Chakrabarty, ITC'02 SoC Test Benchmarks, http://www.extra.research.philips.com/itc02socbench/.
[5] Y. Huang, W.-T. Cheng, C.-C. Tsai, N. Mukherjee, O. Samman, Y. Zaidan and S. M. Reddy, "Resource Allocation and Test Scheduling for Concurrent Test of Core-Based SOC Design," ATS, pp. 265-270, 2001.
[6] W. Zou, S. M. Reddy, I. Pomeranz and Y. Huang, "SOC Test Scheduling Using Simulated Annealing," VTS, 2003.
[7] Y. Xia, M. Chrzanowska-Jeske, B. Wang and M. Jeske, "Using a Distributed Rectangle Bin-Packing Approach for Core-based SoC Test Scheduling with Power Constraints," ICCAD, pp. 100-105, 2003.
[8] E. J. Marinissen, S. K. Goel and M. Lousberg, "Wrapper Design for Embedded Core Test," ITC, pp. 911-920, 2000.
[9] V. Iyengar, K. Chakrabarty and E. J. Marinissen, "Test Wrapper and Test Access Mechanism Co-Optimization for System-on-Chip," ITC, pp. 1023-1032, 2001.
[10] H. Murata, K. Fujiyoshi, S. Nakatake and Y. Kajitani, "VLSI Module Placement Based on Rectangle-Packing by the Sequence-Pair," IEEE TCAD, pp. 1518-1524, 1996.
[11] S. Koranne and V. Iyengar, "On the Use of k-tuples for SoC Test Schedule Representation," ITC, pp. 539-548, 2002.
[12] S. K. Goel and E. J. Marinissen, "Effective and Efficient Test Architecture Design for SoCs," ITC, pp. 529-538, 2002.

Table 2. Test Application Times for ITC ’02 Benchmarks with Unbalanced Design

                               Number of TAM Wires
Benchmark    Algorithm    16        24        32       40       48       56       64
d695         RAIN min     41442     27725     20948    16852    14182    12084    10628
10 cores     RAIN max     41519     27767     20962    16853    14182    12092    10711
             EA C [7]     41553     27982     21014    16908    14240    11988    10571
             EA nC [7]    41809     27989     21142    17015    14236    12134    10788
             SA [6]       41604     28064     21161    16993    14182    12085    10723
p22810       RAIN min     422345    284632    214354   173637   145781   126548   112620
30 cores     RAIN max     422771    285120    215005   174154   145781   127421   113620
             EA C [7]     438783    292824    226545   167792   153260   133094   117638
             EA nC [7]    438619    289237    228732   183133   153525   130949   116625
             SA [6]       438619    289287    218855   175946   147944   126947   109591
p34392       RAIN min     939855    625543    544579   544579   544579   544579   544579
21 cores     RAIN max     939855    627577    545432   544579   546938   544579   546650
             EA C [7]     939855    641514    544579   544579   544579   544579   544579
             EA nC [7]    939855    637263    544579   544579   544579   544579   544579
             SA [6]       944768    628602    544579   544579   544579   544579   544579
p93791       RAIN min     1742995   1157974   870059   701204   587907   500976   441786
32 cores     RAIN max     1743426   1162500   877354   708035   595174   506389   454229
             EA C [7]     1754980   1171190   886038   706820   600986   501057   445748
             EA nC [7]    1754980   1184630   900388   724758   611029   520868   458389
             SA [6]       1757452   1169945   878493   718005   594575   509041   447974

Table 3. Run times for ITC ’02 benchmarks with unbalanced design (seconds)

                        Number of TAM Wires
Benchmark              16      24      32      40      48      56      64
d695         mean      95      226     190     236     262     260     132
             stddev    19.8    24.5    11.3    12.8    28.6    11.3    25.7
p22810       mean      782     1537    1457    963     82      312     131
             stddev    227.5   407.5   278.0   207.6   33.4    112.3   40.8
p34392       mean      214     377     19      2       3       2       1
             stddev    16.7    61.2    1.1     0.5     0.6     0.6     0.5
p93791       mean      177     1077    655     414     288     543     116
             stddev    33.7    182.0   84.0    121.2   64.4    127.5   27.7

Table 4. Run time of scheduling p22810 benchmark using 16 TAM wires

Target Time   Target Time Change   Result Time   Run time (success)   Run time (failure)
523322        •                    519366        2                    •
472242        -51080               470830        2                    •
446702        -25540               446211        4                    •
433932        -12770               433124        6                    •
427547        -6385                426847        9                    •
424355        -3192                424239        22                   •
422759        -1596                422742        52                   •
421961        -798                 Failed        •                    69
422360        +399                 422345        148                  •
422161        -199                 Failed        •                    128
422260        +99                  Failed        •                    146
422309        +49                  Failed        •                    139
422333        +24                  Failed        •                    181
422345        +12                  •             •                    •
Sum                                              245                  663