Publications (30) · 27.28 Total impact

Article: Application of GRAPE9-MPX for High Precision Calculation in Particle Physics and Performance Results
ABSTRACT: There are scientific applications which require calculations with high precision, such as Feynman loop integrals and orbital integrations. These calculations also need to be accelerated. We have been developing dedicated accelerator systems which consist of processing elements (PE) for high precision arithmetic operations and a programming interface. GRAPE9-MPX is our latest system with multiple Field Programmable Gate Array (FPGA) boards on which our developed PEs are implemented. We present the performance results for GRAPE9-MPX extended to have up to 16 FPGA boards for quadruple, hexuple, and octuple precision calculation. The achieved performance for a Feynman loop integral with 12 FPGA boards is 26.5 Gflops for quadruple precision, 13.2 Gflops for hexuple precision, and 6.36 Gflops for octuple precision. We show that our hardware implementation is 80 to 200 times faster than software implementations. We also give an analysis of the performance results.
Procedia Computer Science 01/2015; 51:1323-1332. DOI:10.1016/j.procs.2015.05.317
ABSTRACT: We analyse the effect of financial development and globalization (i.e. the reduction of trade costs) on income distribution when a financial institution is imperfect. Financial imperfection creates income inequality by benefiting borrowers (entrepreneurs) and harming lenders through its effect of lowering the capital rental rate. We show that globalization changes income for both borrowers and lenders in the same direction. However, the poorer the financial institution, the greater its effect for entrepreneurs and the smaller for lenders. We also examine the effect of financial development and globalization in an enriched model where individuals differ in their abilities as well as their capital endowments. We show that financial development mitigates capital misallocation, while the reduction of trade costs will not improve efficiency.
Pacific Economic Review 12/2014; 19(5). DOI:10.1111/1468-0106.12086 · 0.56 Impact Factor
ABSTRACT: Higher order corrections in perturbative quantum field theory are required for precise theoretical analysis to investigate new physics beyond the Standard Model. This means that we need to evaluate Feynman loop diagrams with multi-loop integrals, which may require multi-precision calculation. We developed a dedicated accelerator system for multi-precision calculation (GRAPE9-MPX). We present performance results of our system for the case of Feynman two-loop box and three-loop self-energy diagrams with multi-precision.
Journal of Physics Conference Series 10/2014; 608. DOI:10.1088/1742-6596/608/1/012011
ABSTRACT: We examine the gravitational capture probability of colliding particles in circumplanetary particle disks and accretion rates of small particles onto an embedded moonlet, using analytic calculation, three-body orbital integrations, and N-body simulations. Expanding our previous work, we take into account the Rayleigh distribution of particles' orbital eccentricities and inclinations in our analytic calculation and orbital integration and confirm agreement between them when the particle velocity dispersion is comparable to or larger than their mutual escape velocity and the ratio of the sum of the physical radii of colliding particles to their mutual Hill radius () is much smaller than unity. As shown by our previous work, the capture probability decreases significantly when the velocity dispersion is larger than the escape velocity and/or . Rough surfaces of particles can enhance the capture probability. We compare the results of three-body calculations with N-body simulations for accretion of small particles by an embedded moonlet and find agreement at the initial stage of accretion. However, when particles forming an aggregate on the moonlet surface nearly fill the Hill sphere, the aggregate reaches a quasi-steady state with a nearly constant number of particles covering the moonlet, and the accretion rate is significantly reduced compared to the three-body results.
The Astronomical Journal 06/2013; 146(2):25. DOI:10.1088/0004-6256/146/2/25 · 4.05 Impact Factor
ABSTRACT: Using local N-body simulations, we investigate the process of accretion of small ring particles onto a larger moonlet in Saturn's rings.
ABSTRACT: Viscosity in planetary rings arises from collisional and gravitational interactions between constituent particles. Angular momentum is transferred mainly through mutual collisions and gravitational encounters when the optical depth of the ring is low, while the formation of gravitational wakes significantly enhances the viscosity in dense rings (Daisaka, Tanaka, Ida 2001, Icarus 154, 296). On the basis of the formulation derived by Tanaka, Ohtsuki, and Daisaka (2003, Icarus 161, 144) and using analytic calculation, three-body orbital integration and N-body simulation, we investigate viscosity in self-gravitating planetary rings, both with and without the effect of particle spins. In the case of rings with low optical depth, we confirmed agreement between results of three-body calculation and N-body simulation. When the optical depth is low and the effect of particles' mutual gravity is weak, particles' surface friction with a reasonable range of a friction parameter tends to reduce their random velocity and, consequently, viscosity as well. However, in dense rings with gravitational wakes, the effect of self-gravity is dominant. We obtain ring viscosity for a wide range of parameter values, and derive an approximate expression which reproduces our numerical results. This work was supported by NASA's Outer Planets Research Program and the Cassini Project.
The Astronomical Journal 05/2012; 143(5). DOI:10.1088/0004-6256/143/5/110 · 4.05 Impact Factor
ABSTRACT: Using local N-body simulation, we examine viscosity of planetary rings consisting of spinning, self-gravitating particles for a wide range of parameters, including the cases of dense rings with temporary aggregate formation.
Conference Paper: GRAPE-8: An accelerator for gravitational N-body simulation with 20.5 Gflops/W performance
ABSTRACT: In this paper, we describe the design and performance of the GRAPE-8 accelerator processor for gravitational N-body simulations. It is designed to evaluate gravitational interaction with cutoff between particles. The cutoff function is useful for schemes like TreePM or Particle-Particle Particle-Tree, in which gravitational force is divided into short-range and long-range components. A single GRAPE-8 processor chip integrates 48 pipeline processors. The effective number of floating-point operations per interaction is around 40. Thus the peak performance of a single GRAPE-8 processor chip is 480 Gflops. A GRAPE-8 processor card houses two GRAPE-8 chips and one FPGA chip for the PCI Express interface. The total power consumption of the board is 46 W. Thus, theoretical peak performance per watt is 20.5 Gflops/W. The effective performance of the total system, including the host computer, is around 5 Gflops/W. This is more than a factor of two higher than the highest number in the current Green500 list.
High Performance Computing, Networking, Storage and Analysis (SC), 2012 International Conference for; 01/2012
Conference Paper: GRAPE-MPs: Implementation of an SIMD for Quadruple/Hexuple/Octuple-Precision Arithmetic Operation on a Structured ASIC and an FPGA
ABSTRACT: We describe the design and performance of the GRAPE-MPs, a series of SIMD accelerator boards for quadruple/hexuple/octuple-precision arithmetic operations. The basic design of GRAPE-MPs consists of a number of processing elements (PE) and memory components which handle data with quadruple/hexuple/octuple precision. A GRAPE-MPs processor is implemented on a structured ASIC chip and an FPGA chip. GRAPE-MP (quadruple-precision) uses a structured ASIC chip from eASIC corp., which has 6 PEs and operates with a 100 MHz clock. The theoretical peak quadruple-precision performance of the single board is 1.2 Gflops and the achieved performance for the Feynman loop integrals is about 0.5 Gflops. GRAPE-MP4/6/8 (quadruple/hexuple/octuple-precision) uses an FPGA chip from Altera Corporation. For example, in the current implementation, MP8 has 10 PEs with a 70 MHz operation clock. We also present the performance results with multiple GRAPE-MPs boards. The achieved performance of four MP8 boards is about 1.6 Gflops. It is roughly 90 times faster than the performance of a single core of a CPU with comparable precision. We show that our hardware-based approach to evaluating the Feynman loop integrals in high precision arithmetic operations is highly effective.
Embedded Multicore Socs (MCSoC), 2012 IEEE 6th International Symposium on; 01/2012
ABSTRACT: Ultraluminous infrared galaxies (ULIRGs) with multiple ($\ge 3$) nuclei are frequently observed. It has been suggested that these nuclei are produced by multiple major mergers of galaxies. The expected rate of such mergers is, however, too low to reproduce the observed number of ULIRGs with multiple nuclei. We have performed high-resolution simulations of the merging of two gas-rich disk galaxies. We found that extremely massive and compact star clusters form from the strongly disturbed gas disks after the first or second encounter between the galaxies. The mass of such clusters reaches $\sim 10^8 M_{\odot}$, and their half-mass radii are $20$ to $30 \rm{pc}$. Since these clusters consist of young stars, they appear as several bright cores in the galactic central region ($\sim \rm{kpc}$). The peak luminosity of these clusters reaches $\sim 10\%$ of the total luminosity of the merging galaxy. These massive and compact clusters are consistent with the characteristics of the observed multiple nuclei in ULIRGs. Multiple mergers are not necessary to explain multiple nuclei in ULIRGs.
The Astrophysical Journal 11/2011; 746(1). DOI:10.1088/0004-637X/746/1/26 · 6.28 Impact Factor
ABSTRACT: Saturn's rings are composed of many icy particles, and angular momentum is transported through collisions and gravitational interactions between these particles. Viscosity in the rings arising from such interactions between particles governs the rate of dynamical evolution and structure formation in the rings. We examine the effect of surface friction on the viscosity and the dependence of the viscosity on optical depth and distance from Saturn, using local N-body simulation. We derive a semi-analytic formula that approximately reproduces our numerical results.
ABSTRACT: To investigate dynamic paths of network formation of free trade agreements (FTAs), we conduct simulations of Goyal and Joshi's (2006) model of the FTA network formation game with many countries. We compare the results between two protocols regarding the choice of the pair that decides whether to form (or sever) an FTA link in each round of the network evolution. The first protocol is the random protocol, in which a pair of countries is randomly chosen in each period. The other protocol is the maximum protocol, in which the country that has the largest incentive can propose to form or sever an FTA link. We find that with the random protocol, FTA evolution processes reach the complete FTA network in some cases, but this becomes less likely as the number of countries grows. With the maximum protocol, the network evolution always ends with a unique final FTA network for each case of n ∈ {3, 4, ..., 100} countries. The final FTA network may or may not be the complete network.
Review of International Economics 07/2011; 22(3). DOI:10.1111/roie.12126 · 0.63 Impact Factor
Article: Angular Momentum Transport in Planetary Rings: Effects of Self-Gravity and Spins of Particles
ABSTRACT: Using local N-body simulation for planetary rings consisting of self-gravitating particles with surface friction, we examine the dependence of viscosity on various parameters such as optical depth and normal and tangential restitution coefficients.
ABSTRACT: As of April 2010, 48 TNO (trans-Neptunian Object) binaries have been found. This is about 6% of known TNOs. However, in previous theoretical studies of planetary formation in the TNO region, the effect of binary formation has been neglected. TNO binaries can be formed through a variety of mechanisms, such as the three-body process, dynamical friction on two massive bodies, inelastic collisions between two bodies, etc. Most of these mechanisms become more effective as the distance from the Sun increases. In this paper, we studied the three-body process using direct N-body simulations. We systematically changed the distance from the Sun, the number density of planetesimals, and the radius of the planetesimals and studied the effect of the binaries on the collision rate of planetesimals. In the TNO region, binaries are involved in 1/3 to 1/2 of collisions, and the collision rate is increased by a factor of a few compared to the theoretical estimate for direct two-body collisions. Thus, it is possible that the binaries formed through the three-body process significantly enhance the collision rate and reduce the growth time scale. In the terrestrial planet region, binaries are less important, because the ratio between the Hill radius and physical size of the planetesimals is relatively small. Although the time scale of our simulations is short, they clearly demonstrated that the accretion process in the TNO region is quite different from that in the terrestrial planet region. Simulations which cover a longer time scale are required to obtain a more accurate estimate of the accretion enhancement.
Publications Astronomical Society of Japan 02/2011; DOI:10.1093/pasj/63.6.1331 · 2.01 Impact Factor
ABSTRACT: We studied the formation process of star clusters using high-resolution N-body/smoothed particle hydrodynamics simulations of colliding galaxies. The total number of particles is 1.2 × 10^8 for our high resolution run. The gravitational softening is 5 pc and we allow gas to cool down to ~10 K. During the first encounter of the collision, a giant filament consisting of cold and dense gas formed between the progenitors by shock compression. A vigorous starburst took place in the filament, resulting in the formation of star clusters. The mass of these star clusters ranges from 10^5 to 10^8 Msun. These star clusters formed hierarchically: at first small star clusters formed, and then they merged via gravity, resulting in larger star clusters.
Proceedings of the International Astronomical Union 01/2011; 270. DOI:10.1017/S1743921311000846
ABSTRACT: We describe the design and performance of the GRAPE-MP board, an SIMD accelerator board for quadruple-precision arithmetic operations. A GRAPE-MP board houses one GRAPE-MP processor chip and an FPGA chip which handles the communication with the host computer. A GRAPE-MP chip has 6 processing elements (PE) and operates with a 100 MHz clock. Each PE can perform one addition and one multiplication in every clock cycle. The architecture of the GRAPE-MP is similar to that of the GRAPE-DR. It is implemented using the structured ASIC chip from eASIC corp. A GRAPE-MP processor board has a theoretical peak quadruple-precision performance of 1.2 Gflops. As a preliminary result, we present the performance of the GRAPE-MP board for two target applications. The performance of the numerical integration of Feynman loop integrals is 0.53 Gflops. The performance of an N-body simulation with the second-order leapfrog scheme is 0.505 Gflops for N = 1984, which is more than 10 times faster than the performance of the host computer.
Procedia Computer Science 01/2011; 4:878-887. DOI:10.1016/j.procs.2011.04.093
ABSTRACT: We describe the implementation and performance of dense matrix multiplication and LU decomposition on the GRAPE-DR SIMD accelerator board. A GRAPE-DR card, with 4 GRAPE-DR chips, has a theoretical peak DP performance of 819 Gflops. Each GRAPE-DR chip has 512 processing elements and operates with a 400 MHz clock. Each PE can perform one addition and one multiplication in every two clock cycles. The measured performance of matrix multiplication is 730 Gflops for the multiplication of matrices with size 51200 by 2048 and 2048 by 51200. The performance of LU decomposition is 480 Gflops for the problem size of 51200.
Procedia Computer Science 01/2011; 4:888-897. DOI:10.1016/j.procs.2011.04.094
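The peak figure quoted in the GRAPE-DR abstract above can be cross-checked from the numbers it gives. The following is only a minimal sketch: the function name `peak_gflops` and its parameter names are illustrative assumptions, not anything defined in the paper, and it assumes that "one addition and one multiplication in every two clock cycles" counts as 2 floating-point operations per 2 cycles per PE.

```python
def peak_gflops(chips, pes_per_chip, clock_hz, flops_per_pe_per_cycle):
    """Theoretical peak in Gflops: chips x PEs x clock x flops per PE per cycle.

    Hypothetical helper for illustration; all inputs come from the abstract.
    """
    return chips * pes_per_chip * clock_hz * flops_per_pe_per_cycle / 1e9


# GRAPE-DR card: 4 chips, 512 PEs per chip, 400 MHz clock,
# one add + one multiply (2 flops) every 2 clock cycles per PE.
grape_dr_card = peak_gflops(
    chips=4,
    pes_per_chip=512,
    clock_hz=400e6,
    flops_per_pe_per_cycle=2 / 2,
)
print(f"{grape_dr_card:.1f} Gflops")
```

Under these assumptions the result is 819.2 Gflops, consistent with the quoted 819 Gflops, and the measured 730 Gflops for matrix multiplication corresponds to roughly 89% of that peak.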
ABSTRACT: We performed high resolution simulations of galaxy-galaxy merging in order to investigate the mechanism of a starburst. The multiphase nature of the interstellar medium is correctly taken into account in our model, which allows the use of a realistic star formation model. In these simulations, we found for the first time that a shock-induced starburst involving star cluster (SC) formation takes place during the first encounter. Detailed analyses show that the SC formation is mainly driven by hierarchical mergers of proto (smaller) SCs. Our result implies that SCs can become much more massive than the local Jeans mass of the SC forming region.
06/2010
ABSTRACT: We investigated the evolution of interacting disk galaxies using high-resolution $N$-body/SPH simulations, taking into account the multiphase nature of the interstellar medium (ISM). In our high-resolution simulations, a large-scale starburst occurred naturally at the collision interface between two gas disks at the first encounter, resulting in the formation of star clusters. This is consistent with observations of interacting galaxies. The probability distribution function (PDF) of gas density showed a clear change during the galaxy-galaxy encounter. The compression of gas at the collision interface between the gas disks first appears as an excess at $n_{\rm H} \sim 10\,{\rm cm^{-3}}$ in the PDF, and then the excess moves to higher densities ($n_{\rm H} \gtrsim 100\,{\rm cm^{-3}}$) in a few times $10^7$ years, where the starburst takes place. After the starburst, the PDF goes back to the quasi-steady state. These results give a simple picture of starburst phenomena in galaxy-galaxy encounters.
Publications Astronomical Society of Japan 06/2008; DOI:10.1093/pasj/61.3.481 · 2.01 Impact Factor
ABSTRACT: We performed three-dimensional N-body/SPH simulations to study how mass resolution and other model parameters, such as the star formation efficiency parameter, C*, and the threshold density, nth, affect structures of the galactic gaseous/stellar disk in a static galactic potential. We employ 10^6 to 10^7 particles to resolve a cold and dense (T < 100 K & n_H > 100 cm^{-3}) phase. We found that structures of the ISM and the distribution of young stars are sensitive to the assumed nth. High-nth models with nth = 100 cm^{-3} yield clumpy multiphase features in the ISM. Young stars are distributed in a thin disk whose half-mass scale height is 10 to 30 pc. In low-nth models with nth = 0.1 cm^{-3}, the stellar disk is found to be several times thicker, and the gas disk appears smoother than in the high-nth models. A high-resolution simulation with high nth is necessary to reproduce the complex structure of the gas disk. The global properties of the model galaxies in low-nth models, such as star formation histories, are similar to those in the high-nth models when we tune the value of C* so that they reproduce the observed relation between surface gas density and surface star formation rate density. We emphasize, however, that high-nth models automatically reproduce the relation, regardless of the values of C*. The ISM structure, phase distribution, and distributions of young star-forming regions are quite similar between two runs with values of C* which differ by a factor of 15. We also found that the timescale of the flow from n_H ~1 cm^{-3} to n_H > 100 cm^{-3} is about 5 times as long as the local dynamical time and is independent of the value of C*. The use of a high-nth criterion for star formation in high-resolution simulations makes numerical models fairly insensitive to the modelling of star formation. (Abridged)
Publications Astronomical Society of Japan 03/2008; DOI:10.1093/pasj/60.4.667 · 2.01 Impact Factor
Publication Stats
164 Citations
27.28 Total Impact Points
Institutions

2008–2014

Hitotsubashi University
Edo, Tōkyō, Japan


1999–2001

Tokyo Institute of Technology
 Earth and Planetary Sciences Department
Edo, Tōkyō, Japan
