Tamiya Onodera
IBM · IBM Research Tokyo

Doctor of Information Science

About

69
Publications
26,505
Reads
903
Citations

Publications (69)
Article
Many quantum algorithms contain an important subroutine—the quantum amplitude estimation. As the name implies, this is essentially the parameter estimation problem and thus can be handled via the established statistical estimation theory. However, this problem has an intrinsic difficulty that the system, i.e., the real quantum computing device, ine...
Preprint
Many quantum algorithms contain an important subroutine, the quantum amplitude estimation. As the name implies, this is essentially the parameter estimation problem and thus can be handled via the established statistical estimation theory. However, this problem has an intrinsic difficulty that the system, i.e., the real quantum computing device, in...
Article
Full-text available
Several candidate quantum algorithms that may be implementable on near-term devices have recently been found for estimating the amplitude of a given quantum state, a core subroutine in various computing tasks such as Monte Carlo methods. One of those algorithms is based on the maximum likelihood estimate with parallelized quantum circuits....
Article
Full-text available
In this paper, we propose a quantum amplitude estimation method that uses a modified Grover operator and quadratically improves the estimation accuracy in the ideal case, as in the conventional one using the standard Grover operator. Under the depolarizing noise, the proposed method can outperform the conventional one in the sense that it can in pr...
Preprint
Efficient methods for loading given classical data into quantum circuits are essential for various quantum algorithms. In this paper, we propose an algorithm that can effectively load all the components of a given real-valued data vector into the amplitudes of a quantum state, while the previous proposal can only load the absolute values of tho...
Preprint
In this paper, we propose a quantum amplitude estimation method that uses a modified Grover operator and quadratically improves the estimation accuracy in the ideal case, as in the conventional one using the standard Grover operator. Under the depolarizing noise, the proposed method can outperform the conventional one in the sense that it can in pr...
Preprint
Several candidate quantum algorithms that may be implementable on near-term devices have recently been found for estimating the amplitude of a given quantum state, a core subroutine in various computing tasks such as Monte Carlo methods. One of those algorithms is based on the maximum likelihood estimate with parallelized quantum circuits;...
Article
Full-text available
This paper focuses on the quantum amplitude estimation algorithm, which is a core subroutine in quantum computation for various applications. The conventional approach for amplitude estimation is to use the phase estimation algorithm, which consists of many controlled amplification operations followed by a quantum Fourier transform. However, the wh...
Preprint
This paper focuses on the quantum amplitude estimation algorithm, which is a core subroutine in quantum computation for various applications. The conventional approach for amplitude estimation is to use the phase estimation algorithm which consists of many controlled amplification operations followed by the quantum Fourier transform. The whole proc...
Conference Paper
Work-stealing is promising for scheduling and balancing parallel workloads. It has a wide range of applicability on middleware, libraries, and runtime systems of programming languages. OpenJDK uses work-stealing for copying garbage collection (GC) to balance copying tasks among GC threads. Each thread has its own queue to store tasks. When a thread...
Article
Work-stealing is promising for scheduling and balancing parallel workloads. It has a wide range of applicability on middleware, libraries, and runtime systems of programming languages. OpenJDK uses work-stealing for copying garbage collection (GC) to balance copying tasks among GC threads. Each thread has its own queue to store tasks. When a thread...
Patent
Full-text available
A system including multiple application servers for accessing shared data and a centralized control unit for centrally controlling a lock applied to the shared data by each of the application servers. Each application server includes a distributed control unit for controlling a lock applied to the shared data by the application server and a selecti...
Patent
Full-text available
A method, system, and program for recording an object allocation site. In the structure of an object, a pointer to a class of an object is replaced by a pointer to an allocation site descriptor which is unique to each object allocation site, a common allocation site descriptor is used for objects created at the same allocation site, and the class o...
Patent
Full-text available
A computer implemented control method, article of manufacture, and computer implemented system for determining whether stack allocation is possible. The method includes: allocating an object created by a method frame to a stack. The allocation is performed in response to: calling a first and second instruction in the method frame; the first instruc...
Patent
Full-text available
A computer implemented control method, article of manufacture, and computer implemented system for determining whether stack allocation is possible. The method includes: allocating an object created by a method frame to a stack. The allocation is performed in response to: calling a first and second instruction in the method frame; the first instruc...
Patent
Full-text available
A technique for comprehensively acquiring calling-context information at a low cost. Call site IDs are held for each thread as a call history and used as context information. At the time of calling a method, the call history existing in a current frame is shifted left, and stacked in a new frame, with the call site ID of the call site put in the lo...
Conference Paper
Modern storage systems employing quorum replication are often configured to use partial, non-strict quorums to prioritize performance over consistency. These systems return the most recently changed data item only from a set of replicas to respond more quickly to a read request without guaranteeing that the data item is the most recently changed fo...
Conference Paper
Increasing memory efficiency in physical servers is a significant concern for increasing the number of virtual machines (VMs) they can host. When similar web application services run in each guest VM, many strings with the same values are created in every guest VM. These duplications of string data are redundant from the viewpoint of memory effi...
Patent
Full-text available
In a multiprocessor computer system, a lock operation is maintained with a thread using non-atomic instructions. Identifiers are assigned to each thread. Flags in conjunction with the thread identifiers are used to determine the continuity of the lock with a thread. However, in the event continuity of the lock with the thread ceases, a compare-and-...
Patent
The specification of a string within source code written in a programming language is received. The source code is processed for ultimate execution of a computer program encompassing the source code, by at least performing the following. It is determined whether the string specified is a short string or a long string. The string is processed in acc...
Conference Paper
Improving memory utilization is important for improving the efficiency of a cloud datacenter by increasing the number of usable VMs. Memory over-commitment is a common technique for this purpose. Transparent Page Sharing (TPS) is a technique to improve the utilization by sharing identical memory pages to reduce the total memory consumption. For a c...
Patent
Full-text available
A high-speed web server that generates an HTML file upon receipt of an HTTP request is described. The server includes an application executor device and an HTTP server device that receives the HTTP request and sends an HTTP response to the HTTP request. A method for sending an HTTP response in a server that generates an HTML file upon receipt of an...
Conference Paper
OpenJPA is an implementation of the Java Persistence API (JPA) for Apache, with a caching layer for database queries. However, the caching performance is poor when an application includes write transactions, because the OpenJPA cache-invalidation mechanism is coarse-grained, which results in a low cache hit rate. In this research, we implemented a...
Conference Paper
Full-text available
Optimizing city transportation for smarter cities can have a major impact on the quality of life in urban areas in terms of economic merits and low environmental load. In many cities of the world, transport authorities are facing common challenges such as worsening congestion, insufficient transport infrastructure, increasing carbon emissions, and...
Article
Two new techniques for improving the performance of reified generics without specializing types are presented. With these techniques, the cost of method dispatch is reduced by 95% from the regular self-dispatching-based implementation, and the cost of returning a primitive value is reduced by 15% from the regular boxing-based implementation. These techni...
Article
X10 is a programming language that incorporates distributed processing functions. The execution model of X10 is called "APGAS", where each object belongs to a specific place (an abstraction of a shared-memory computer), but can be remotely referenced from other places using a mechanism named GlobalRef. This means that a remotely-referenced object m...
Article
X10 is a new programming language for improving the software productivity in the multicore era by making parallel/distributed programming easier. X10 programs are compiled into C++ or Java source code, but X10 supports various features not supported directly in Java. To implement them efficiently in Java, new compilation techniques are needed. This...
Conference Paper
Full-text available
A Java application sometimes raises an out-of-memory exception. This is usually because it has exhausted the Java heap. However, a Java application can raise an out-of-memory exception when it exhausts the memory used by Java that is not in the Java heap. We call this area non-Java memory. For example, an out-of-memory exception in the non-Java me...
Conference Paper
Full-text available
Tracking the allocation site of every object at runtime is useful for reliable, optimized Java. To be used in production environments, the tracking must be accurate with minimal speed loss. Previous approaches suffer from performance degradation due to the additional field added to each object or track the allocation sites only probabilistically. W...
Conference Paper
Full-text available
Programmers who develop Web applications often use dynamic scripting languages such as Perl, PHP, Python, and Ruby. For general purpose scripting language usage, interpreter-based implementations are efficient and popular but the server-side usage for Web application development implies an opportunity to significantly enhance Web server throughput....
Article
Programmers who develop Web applications often use dynamic scripting languages such as Perl, PHP, Python, and Ruby. For general purpose scripting language usage, interpreter-based implementations are efficient and popular but the server-side usage for Web application development implies an opportunity to significantly enhance Web server throughput....
Conference Paper
Full-text available
OpenJPA is an implementation of the Java Persistence API (JPA) for Apache, with a caching layer for database queries to share cached objects among multiple client sessions. This is a critical component for high performance, since the caching layer can handle many database requests. However, the performance is limited when an application includes wr...
Conference Paper
ETL (Extract-Transform-Load) processing is filling an increasingly critical role in analyzing business data and in taking appropriate business actions based on the results. As the volume of business data to be analyzed increases and quick responses are more critical for business success, there are strong demands for scalable high-performance ETL pr...
Conference Paper
Full-text available
PHP is a popular language for server-side applications. In PHP, assignment to variables copies the assigned values, according to its so-called copy-on-assignment semantics. In contrast, a typical PHP implementation uses a copy-on-write scheme to reduce the copy overhead by delaying copies as much as possible. This leads us to ask if the semantics a...
Conference Paper
Full-text available
The performance of server-side applications is becoming increasingly important as more applications exploit the Web application model. Extensive work has been done to improve the performance of individual software components such as Web servers and programming language runtimes. This paper describes a novel approach to boost Web application perform...
Conference Paper
Full-text available
This paper describes a novel approach to reduce the memory consumption of Java programs, by focusing on their "string memory inefficiencies". In recent Java applications, string data occupies a large amount of the heap area. For example, about 40% of the live heap area is used for string data when a production J2EE application server is running. By...
Conference Paper
Full-text available
PHP is well known as a programming language in the Web 2.0 era enabling agile server-side software development. It has officially supported SOAP messaging since version 5 through a C-based built-in library. In this paper we perform a thorough study of the capability of PHP as a web service engine in both qualitative and quantitative aspects while c...
Conference Paper
Full-text available
In this paper, we describe static analysis techniques for finding bugs in programs using the Java Native Interface (JNI). The JNI is both tedious and error-prone because there are many JNI-specific mistakes that are not caught by a native compiler. This paper is focused on four kinds of common mistakes. First, explicit statements to handle a possib...
Conference Paper
There is a growing need to translate large-scale legacy mainframe applications from COBOL to Java. This is to transform the applications into modern Web-based services, without sacrificing the original programming investments. Most often, COBOL-to-Java translators are used first for the base program transformations, and then corrections and fine tu...
Conference Paper
Full-text available
Java has been successful particularly for writing applications in the server environment. However, isolation of multiple applications has not been efficiently achieved in Java. Many customers require that their applications are guarded by independent OS processes, but starting a Java application with a new process results in a long sequence of init...
Conference Paper
The performance of Java has been tremendously improved by advances in Just-in-Time (JIT) compilation technologies. However, debugging such a dynamic compiler is much harder than debugging a static compiler. Recompiling the problematic method to produce a diagnostic output does not necessarily work as expected, because the compilation of a method depends o...
Conference Paper
Full-text available
The performance of Java has been tremendously improved by advances in Just-in-Time (JIT) compilation technologies. However, debugging such a dynamic compiler is much harder than debugging a static compiler. Recompiling the problematic method to produce a diagnostic output does not necessarily work as expected, because the compilation of a method depends o...
Article
Full-text available
The performance of Java has been tremendously improved by advances in compilation technology. However, debugging a dynamic compiler is much harder than debugging a static compiler. Recompiling the problematic method to produce a diagnostic output does not necessarily work because the compilation of a method depends on the runtime information at...
Article
Full-text available
Java™ has gained widespread popularity in the industry, and an efficient Java virtual machine (JVM™) and just-in-time (JIT) compiler are crucial in providing high performance for Java applications. This paper describes the design and implementation of our JIT compiler for IA-32 platforms by focusing on the recent advances achieved in the past sever...
Conference Paper
Lock reservation, a powerful optimization for Java locks, is based on the observation that, in Java, each lock tends to be dominantly acquired and released by a specific thread. Reserving a lock for such a dominant thread allows the owner thread of the lock to acquire and release the lock without any atomic read-modify-write instructions. A recentl...
Conference Paper
Full-text available
This paper describes the system overview of our Java Just-In-Time (JIT) compiler, which is the basis for the latest production version of the IBM Java JIT compiler that supports a diversity of processor architectures including both 32-bit and 64-bit modes, CISC, RISC, and VLIW architectures. In particular, we focus on the design and evaluation of the c...
Conference Paper
Software prefetching is a promising technique to hide cache miss latencies, but it remains challenging to effectively prefetch pointer-based data structures because obtaining the memory address to be prefetched requires pointer dereferences. The recently proposed stride prefetching overcomes this problem, but it only exploits inter-iteration stride...
Conference Paper
Because of the built-in support for multi-threaded programming, Java programs perform many lock operations. Although the overhead has been significantly reduced in recent virtual machines, one or more atomic operations are required for acquiring and releasing an object's lock even in the fastest cases. This paper presents a novel algorithm calle...
Conference Paper
Full-text available
Because of the built-in support for multi-threaded programming, Java programs perform many lock operations. Although the overhead has been significantly reduced in recent virtual machines, one or more atomic operations are required for acquiring and releasing an object's lock even in the fastest cases. This paper presents a novel algorithm calle...
Article
The Java language incurs a runtime overhead for exception checks and object accesses, which are executed without an interior pointer in order to ensure safety. It also requires type inclusion test, dynamic class loading, and dynamic method calls in order to ensure flexibility. A "Just-In-Time" (JIT) compiler generates native code from Java byte cod...
Conference Paper
Full-text available
Object locking can be efficiently implemented by bimodal use of a field reserved in an object. The field is used as a lightweight lock in one mode, while it holds a reference to a heavyweight lock in the other mode. A bimodal locking algorithm recently proposed for Java achieves the highest performance in the absence of contention, and is still...
Article
Object locking can be efficiently implemented by bimodal use of a field reserved in an object. The field is used as a lightweight lock in one mode, while it holds a reference to a heavyweight lock in the other mode. A bimodal locking algorithm recently proposed for Java achieves the highest performance in the absence of contention, and is still fas...
Conference Paper
Full-text available
Selector code indexing is a simple and effective way of optimizing method lookups. However, it has not been considered practically applicable in Smalltalk, because the space overhead is prohibitive. We propose a new technique called “dispatch caches indexed by selector codes” (CISCO), which maintains a small number of dispatch tables indexed by a s...
Conference Paper
Two major issues related to storing program information in an OODB are sharing and clustering. The former is important since it prevents the database from consuming excessive disk space, while the latter is crucial, since it keeps clients running without thrashing. In our database, objects are shared across multiple programs' translation units, and...
Article
A copying collector has two excellent properties: it compacts the heap, and the execution time depends solely on the number of live objects. Use of a copying collector is thought by some to be a more efficient way of managing the heap than explicit freeing of objects. This paper describes a high-performance copying collector for a hybrid object-ori...
Article
In language systems that support separate compilation, we often observe that header files are internalized over and over again when the source files that depend on them are compiled. Making a compiler a long-lived server eliminates such redundant processing of header files, thus reducing the compilation time. The paper first describes compilation s...
Article
A formalization of graphical processes in computer graphics systems is presented in terms of functions and their system of axioms. The concept of the viewing pipeline is formalized as an operation sequence, which is a sequential composition of elementary graphical operations. The formalization includes two kinds of operation sequences which are used as...
Conference Paper
Full-text available
The dynamic scripting language PHP has become enormously popular for implementing lightweight web applications, and is widely used as a server-side scripting language for web servers. To contrast the performance of PHP and JSP for this purpose, we used the SPECweb2005 benchmark, which provides three application scenarios implemented in both PHP and...
Article
Full-text available
This paper describes a novel approach to reduce the memory consumption of Java programs, by reducing the string memory waste in the runtime. In recent Java applications, string data occupies a large amount of the heap area. For example, more than 30% of the live heap area is used for string data when WebSphere Application Server with Trade6 is run...
