NAOSITE: Nagasaki University's Academic Output SITE
FPGA implementation of a data-driven stochastic biochemical
simulator with the next reaction method
Yoshimi, Masato; Iwaoka, Yow; Nishikawa, Yuri; Kojima,
Toshinori; Osana, Yasunori; Funahashi, Akira; Hiroi, Noriko;
Shibata, Yuichiro; Iwanaga, Naoki; Yamada, Hideki; Kitano,
Hiroaki; Amano, Hideharu
2007 International Conference on Field Programmable Logic
and Applications, pp.254-259
2007 International Conference on Field Programmable Logic and
Applications : Amsterdam, Netherlands, 2007.08.27-2007.08.29
(c)2007 IEEE. Personal use of this material is permitted. However,
permission to reprint/republish this material for advertising or promotional
purposes or for creating new collective works for resale or redistribution to
servers or lists, or to reuse any copyrighted component of this work in other
works must be obtained from the IEEE.
[Figure 7 plot. Horizontal axis: Model Size [N], with ticks at 64, 128,
256, 512, and 1000. Vertical axis: Clock cycles per reaction.]
Fig. 7. Average clock cycles per reaction cycle versus number of TPU
Based on the analysis above, an improved interconnection
is expected to suppress the degradation of the operating fre-
quency. One idea is to replace the multiplexers with a hier-
archical bus structure.
affect the throughput in such a structure; therefore, we also
evaluated the average number of clock cycles per reaction
cycle for different model sizes N with T = 16 and T = 30. The
result is shown in Fig. 7. The figure shows that the increase
in clock cycles is small and follows O(log N). In the D4S
model, the number of calculations and heap tree updates is
the same for all model sizes. Thus, for the same T and
different N, the differences in Fig. 7 correspond to the time
required to update the heap tree; for the same N and
different T, they correspond to the waiting time at the
multiplexer. The clock cycle count is larger for T = 30 than
for T = 16 in every case, but the difference is as small as
5 clock cycles. This is because data is not frequently sent
over the interconnection, and each transfer is at most 2 flits
long. Thus, replacing the multiplexer interconnect with a
hierarchical bus structure could keep the throughput
degradation due to increased waiting time in data transfer
to a minimum.
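The O(log N) behaviour of the heap tree update can be illustrated in software. The following is a minimal sketch, not the circuit described in this paper: it assumes an array-based binary min-heap keyed by putative reaction times, and it counts swaps per update rather than clock cycles. The names `IndexedMinHeap` and `worst_swaps` are hypothetical; the model sizes are the ones on the horizontal axis of Fig. 7.

```python
import math
import random


class IndexedMinHeap:
    """Binary min-heap over reaction ids, ordered by putative time."""

    def __init__(self, times):
        self.key = list(times)                # key[r]: time of reaction r
        self.heap = list(range(len(times)))   # heap[i]: reaction id at node i
        self.pos = list(range(len(times)))    # pos[r]: node index of reaction r
        for i in reversed(range(len(self.heap) // 2)):
            self._sift_down(i)

    def _swap(self, i, j):
        h = self.heap
        h[i], h[j] = h[j], h[i]
        self.pos[h[i]] = i
        self.pos[h[j]] = j

    def _sift_up(self, i):
        swaps = 0
        while i > 0:
            parent = (i - 1) // 2
            if self.key[self.heap[i]] >= self.key[self.heap[parent]]:
                break
            self._swap(i, parent)
            i = parent
            swaps += 1
        return swaps

    def _sift_down(self, i):
        swaps = 0
        n = len(self.heap)
        while True:
            left, right, smallest = 2 * i + 1, 2 * i + 2, i
            if left < n and self.key[self.heap[left]] < self.key[self.heap[smallest]]:
                smallest = left
            if right < n and self.key[self.heap[right]] < self.key[self.heap[smallest]]:
                smallest = right
            if smallest == i:
                return swaps
            self._swap(i, smallest)
            i = smallest
            swaps += 1

    def update(self, reaction, new_time):
        """Change one key, restore heap order, and return the swap count."""
        self.key[reaction] = new_time
        swaps = self._sift_up(self.pos[reaction])
        return swaps + self._sift_down(self.pos[reaction])

    def min(self):
        """Return (reaction id, time) of the earliest reaction."""
        return self.heap[0], self.key[self.heap[0]]

    def is_valid(self):
        """Check that every node is no earlier than its parent."""
        return all(
            self.key[self.heap[i]] >= self.key[self.heap[(i - 1) // 2]]
            for i in range(1, len(self.heap))
        )


def worst_swaps(n, updates=2000, seed=0):
    """Largest swap count seen while repeatedly rescheduling the root."""
    rng = random.Random(seed)
    heap = IndexedMinHeap([rng.random() for _ in range(n)])
    worst = 0
    for _ in range(updates):
        reaction, t = heap.min()
        # Assumption: the fired reaction is given a later, random time.
        worst = max(worst, heap.update(reaction, t + rng.expovariate(1.0)))
    return worst


if __name__ == "__main__":
    for n in (64, 128, 256, 512, 1000):
        print(n, worst_swaps(n), int(math.log2(n)))
```

In this sketch a single update never needs more than floor(log2 N) swaps, because an entry moves along one root-to-leaf path. That bound grows by only a few steps between N = 64 and N = 1000, which is consistent with a small increase across model sizes.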
These evaluation results indicate that the current NRM circuit
design scales with increasing model size, especially in
contrast with execution on a Xeon processor. They also suggest
that higher throughput is feasible by modifying the structure
of the interconnection network.
6. CONCLUSION AND FUTURE WORK
This paper described an FPGA-based design of a biochemi-
cal simulation circuit for performing a stochastic simulation
based on the Next Reaction Method, and evaluated its pro-
totype implementation. The circuit was designed to achieve
ning in parallel. Every module in the circuit is categorized
into one of two groups: modules that must be provided for each
simulation thread, and modules that are shared among multiple
threads. Currently, their interconnection is designed with a
multiplexer, and approximately 5.2 to 8.4 times higher
throughput was obtained compared to execution on a 2.80 GHz
Xeon processor. Some investigation results were also given,
suggesting the feasibility of a higher-throughput design by
selecting appropriate inter-
As future work, we are planning to improve throughput
based on the current structure. The data transfer method
will also be changed from the current ping-pong transmission
to a mechanism that tolerates continuous requests.
Furthermore, we will analyze the utilization of each
arithmetic unit and the data transfer rate with several
biochemical models, and investigate further a suitable design
for an interconnection network with higher throughput and
scalability.
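The planned move from ping-pong transmission to continuous requests can be illustrated with a simple cycle-count model. The sketch below is not the circuit's protocol. It assumes that ping-pong means each request waits for the reply to the previous one, that a one-way transfer takes a fixed `latency` in clock cycles, and that a sender tolerating continuous requests can issue one every `issue_interval` cycles. All function names and parameters are hypothetical.

```python
def ping_pong_cycles(n_requests, latency):
    """Cycles when every request waits for the previous reply.

    Assumption: each request costs one full round trip (2 * latency).
    """
    return n_requests * 2 * latency


def pipelined_cycles(n_requests, latency, issue_interval=1):
    """Cycles when requests are issued back to back.

    Assumption: replies overlap with later requests, so only the first
    request pays the full round trip.
    """
    if n_requests == 0:
        return 0
    return 2 * latency + (n_requests - 1) * issue_interval


if __name__ == "__main__":
    for n in (1, 4, 16):
        print(n, ping_pong_cycles(n, latency=3), pipelined_cycles(n, latency=3))
```

Under these assumptions the two schemes cost the same for a single request, and the gap widens as more requests are outstanding, which is where a mechanism that tolerates continuous requests would be expected to help.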
This work is supported by the VLSI Design and Education
Center (VDEC), the University of Tokyo, in collaboration
with Cadence Corporation.