-
IEICE Transactions. 01/2006; 89-D:789-797.
-
IEICE Transactions. 01/2004; 87-D:1721-1728.
-
ABSTRACT: In this paper we propose a distributed algorithm to construct a scatternet for multi-hop ad hoc networks of Bluetooth devices. The algorithm is fully distributed and does not require all nodes to be within range of one another (i.e., some pairs of nodes in the network may be unable to communicate with each other directly). The role-selection process in existing scatternet formation schemes mostly relies on exchanging messages and comparing weights such as IDs or power strength. This results in a large number of control messages and a longer scatternet formation time. In our algorithm, the role-selection procedure is simple: nodes decide their role using a randomly generated counter rather than their 'weights'. With the proposed approach, a node can determine whether it is a master or a slave of a piconet without knowing its neighbors' 'weights'. The algorithm shortens formation time and markedly reduces the number of control messages sent during the role-selection process. In this paper, we also define 2-hop and 3-hop gateways for evaluating the distance between two piconets.
Parallel Processing Workshops, 2003. Proceedings. 2003 International Conference on; 11/2003
-
ABSTRACT: With the development of mobile devices, people can run many applications on their personal devices. However, the limitations of mobile devices (e.g., CPU, memory, and power supply) prevent them from matching desktop PCs. In this paper we present an integrated management architecture for thin clients called the Agent and Profile Management System (APMS). Users only need to download and install a simple service agent on their mobile device; the service agent then connects to the corresponding service provider and sends a request to the back-end server. After receiving the request, the back-end server executes the appropriate processes and returns a response to the mobile device. Most of the procedures are carried out on the server side. Therefore, the workload of mobile devices is relatively low and their cost can be effectively reduced.
Advanced Information Networking and Applications, 2003. AINA 2003. 17th International Conference on; 04/2003
-
Parallel and Distributed Processing and Applications, International Symposium, ISPA 2003, Aizu, Japan, July 2-4, 2003, Proceedings; 01/2003
-
ABSTRACT: In many parallel programs, run-time data redistribution is usually required to enhance data locality and reduce remote memory access on distributed memory multicomputers. Research on data redistribution algorithms has recently become very mature, and the time required to generate data sets and processor sets is much less than before. As a result, packing/unpacking has become a relatively heavy cost in redistribution. In this paper we present methods to perform BLOCK-CYCLIC(s) to BLOCK-CYCLIC(t) redistribution using MPI user-defined types. This approach reduces the memory buffers required and avoids unnecessary data movement. Theoretical models are presented to determine the best method for redistribution. To evaluate the performance of the proposed methods, we have implemented them on an IBM SP2 parallel machine. The experimental results show that this approach clearly improves redistribution performance in most cases.
Cyber Worlds, 2002. Proceedings. First International Symposium on; 02/2002
-
ABSTRACT: In many scientific applications, dynamic array redistribution is
usually required to enhance the performance of an algorithm. In this
paper, we present a generalized basic-cycle calculation (GBCC) method to
efficiently perform a BLOCK-CYCLIC(s) over P processors to
BLOCK-CYCLIC(t) over Q processors array redistribution. In the GBCC
method, a processor first computes the source/destination processor/data
sets of array elements in the first generalized basic-cycle of the local
array it owns. A generalized basic-cycle is defined as lcm(sP,
tQ)/(gcd(s,t)×P) in the source distribution and lcm(sP,
tQ)/(gcd(s,t)×Q) in the destination distribution. From the
source/destination processor/data sets of array elements in the first
generalized basic-cycle, we can construct packing/unpacking pattern
tables to minimize the data-movement operations. Since each generalized
basic-cycle has the same communication pattern, based on the
packing/unpacking pattern tables, a processor can pack/unpack array
elements efficiently. To evaluate the performance of the GBCC method, we
have implemented this method on an IBM SP2 parallel machine, along with
the PITFALLS method and the ScaLAPACK method. The cost models for these
three methods are also presented. The experimental results show that the
GBCC method outperforms the PITFALLS method and the ScaLAPACK method for
all test samples. A brief description of the extension of the GBCC
method to multidimensional array redistributions is also presented.
IEEE Transactions on Parallel and Distributed Systems 01/2001; · 1.40 Impact Factor
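As a small worked illustration of the generalized basic-cycle formula quoted in the abstract above, the following Python sketch evaluates lcm(sP, tQ)/(gcd(s,t)×P) and lcm(sP, tQ)/(gcd(s,t)×Q) for given parameters. The function name `generalized_basic_cycle` is hypothetical and not taken from the paper; the sketch only computes the two quantities and does not implement the GBCC method itself.

```python
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def generalized_basic_cycle(s, t, P, Q):
    """Evaluate the two expressions from the abstract (hypothetical helper).

    Returns (source_value, destination_value), where
      source_value      = lcm(s*P, t*Q) / (gcd(s, t) * P)
      destination_value = lcm(s*P, t*Q) / (gcd(s, t) * Q)
    Both divisions are exact because gcd(s, t) * P divides s * P and
    gcd(s, t) * Q divides t * Q.
    """
    span = lcm(s * P, t * Q)
    g = gcd(s, t)
    return span // (g * P), span // (g * Q)


# Example: BLOCK-CYCLIC(2) over 2 processors to BLOCK-CYCLIC(3) over 3 processors
print(generalized_basic_cycle(2, 3, 2, 3))
```

For s=2, t=3, P=2, Q=3 the call evaluates lcm(4, 9) = 36 and returns the pair (18, 12).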
-
IEEE Trans. Parallel Distrib. Syst. 01/2000; 11:1201-1216.
-
ABSTRACT: In many scientific applications, dynamic array redistribution is usually required to enhance the performance of an algorithm. We present a generalized basic cycle calculation (GBCC) method to efficiently perform a BLOCK-CYCLIC(s) over P processors to BLOCK-CYCLIC(t) over Q processors array redistribution. In the GBCC method, a processor first computes the source/destination processor/data sets of array elements in the first generalized basic cycle of the local array it owns. A generalized basic cycle is defined as lcm(sP,tQ)/(gcd(s,t)×P) in the source distribution and lcm(sP,tQ)/(gcd(s,t)×Q) in the destination distribution. From the source/destination processor/data sets of array elements in the first generalized basic cycle, we can construct packing/unpacking pattern tables. Based on the packing/unpacking pattern tables, a processor can pack/unpack array elements efficiently. To evaluate the performance of the GBCC method, we have implemented this method on an IBM SP2 parallel machine, along with the PITFALLS method and the ScaLAPACK method. The cost models for these three methods are also presented. The experimental results show that the GBCC method outperforms the PITFALLS method and the ScaLAPACK method for all test samples. A brief description of the extension of the GBCC method to multidimensional array redistributions is also presented.
Parallel and Distributed Systems, 1998. Proceedings., 1998 International Conference on; 01/1999
-
ABSTRACT: Array redistribution is usually required to enhance algorithm
performance in many parallel programs on distributed memory
multicomputers. Since it is performed at run-time, there is a
performance trade-off between the efficiency of the new data
decomposition for a subsequent phase of an algorithm and the cost of
redistributing data among processors. In this paper, we present a
basic-cycle calculation technique to efficiently perform BLOCK-CYCLIC(s)
to BLOCK-CYCLIC(t) redistribution. The main idea of the basic-cycle
calculation technique is, first, to develop closed forms for computing
source/destination processors of some specific array elements in a
basic-cycle, which is defined as lcm(s,t)/gcd(s,t). These closed forms
are then used to efficiently determine the communication sets of a
basic-cycle. From the source/destination processor/data sets of a
basic-cycle, we can efficiently perform a BLOCK-CYCLIC(s) to
BLOCK-CYCLIC(t) redistribution. To evaluate the performance of the
basic-cycle calculation technique, we have implemented this technique on
an IBM SP2 parallel machine, along with the PITFALLS method and the
multiphase method. The cost models for these three methods are also
presented. The experimental results show that the basic-cycle
calculation technique outperforms the PITFALLS method and the multiphase
method for most test samples.
IEEE Transactions on Parallel and Distributed Systems 05/1998; · 1.40 Impact Factor
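To make the quantities in the abstract above concrete, the following Python sketch computes the basic-cycle value lcm(s,t)/gcd(s,t) and lists, by brute force, the source and destination processor of each array element under a BLOCK-CYCLIC(s) to BLOCK-CYCLIC(t) redistribution. It assumes the standard block-cyclic mapping, in which element i belongs to processor (i // block) mod nprocs, and the same number of processors on both sides; the abstract states neither detail. The helper names are hypothetical, and the sketch does not reproduce the paper's closed forms.

```python
from math import gcd


def basic_cycle(s, t):
    """Value of lcm(s, t) / gcd(s, t), as defined in the abstract."""
    g = gcd(s, t)
    return (s * t // g) // g


def owner(i, block, nprocs):
    """Processor owning global element i under the standard
    BLOCK-CYCLIC(block) mapping over nprocs processors (assumed here)."""
    return (i // block) % nprocs


def send_pairs(n, s, t, nprocs):
    """Brute-force (source, destination) processor pair for each of the
    first n elements; hypothetical helper for illustration only."""
    return [(owner(i, s, nprocs), owner(i, t, nprocs)) for i in range(n)]


# Example: BLOCK-CYCLIC(2) to BLOCK-CYCLIC(3) over 2 processors
print(basic_cycle(2, 3))
print(send_pairs(6, 2, 3, 2))
```

Enumerating every element this way costs time proportional to the array size. The abstract's closed forms are meant to avoid that by working on a single basic-cycle.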
-
01/1998