Conference Paper

Characterizing parallel workloads to reduce multiple writer overhead in shared virtual memory systems

Departamento de Informatica de Sistemas y Computadores, Univ. Politecnica de Valencia
DOI: 10.1109/EMPDP.2002.994285
Conference: Proceedings of the 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing, 2002
Source: IEEE Xplore

ABSTRACT Shared virtual memory (SVM) systems, because of their software
implementation, enable shared-memory programming at a low design and
maintenance cost. Nevertheless, as hardware implementations become
faster, the performance of SVM systems remains far from that achieved by
distributed shared memory (DSM) systems. Nowadays, SVM systems use relaxed memory
consistency models and multiple writer protocols as techniques to reduce
latencies and false sharing, respectively. However, these techniques
induce additional overhead that decreases system performance. We
performed a study of workload behaviour aimed at improving the design of
SVM protocols. The work focused on identifying the types of
shared data patterns that can appear in accesses to
sections protected by semaphores. Most coherence actions in SVM systems are
performed as a consequence of the write operations executed in critical
sections, so we pay special attention to the write operations performed
when multiple writers are allowed. As these write operations may present
spatial locality, we also study the write patterns on shared pages with
similar behaviour. Different software filters are applied to the
selected instrumented parallel workloads to capture and classify the
most common sharing patterns. This enables the recognition of those
patterns in which coherence overhead can be reduced by modifying the
coherence actions performed by the protocol. Despite the fact that the
performance evaluation of new coherence solutions is not our main goal,
the ideas presented to improve the behaviour of SVM systems can be
implemented at a reasonable hardware/software cost.
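
The abstract does not say how the multiple writer protocol is implemented. As a minimal sketch, assuming the twin-and-diff scheme commonly used by software SVM systems, the following shows two writers updating disjoint bytes of the same page and merging their changes. All names (`make_twin`, `make_diff`, `apply_diff`, `PAGE_SIZE`) are hypothetical and are not taken from the paper.

```python
PAGE_SIZE = 8  # hypothetical, tiny page size chosen only for illustration


def make_twin(page):
    """Take an immutable copy of the page before the first local write."""
    return bytes(page)


def make_diff(twin, page):
    """Return (offset, new_value) pairs for every byte that differs from the twin."""
    return [(i, page[i]) for i in range(len(page)) if page[i] != twin[i]]


def apply_diff(page, diff):
    """Merge one writer's modifications into another copy of the page."""
    for offset, value in diff:
        page[offset] = value


# The home copy of a shared page, and two nodes holding writable copies of it.
home = bytearray(PAGE_SIZE)
copy_a = bytearray(home)
copy_b = bytearray(home)
twin_a = make_twin(copy_a)
twin_b = make_twin(copy_b)

# The writers touch different bytes of the same page (false sharing).
copy_a[0] = 1
copy_b[7] = 2

# At a synchronization point each writer sends only its diff.
diff_a = make_diff(twin_a, copy_a)
diff_b = make_diff(twin_b, copy_b)
apply_diff(home, diff_a)
apply_diff(home, diff_b)

print(diff_a, diff_b, list(home))
```

In this sketch the twin copy and the full-page comparison are paid on every written page, which is one plausible reading of the additional overhead the abstract attributes to multiple writer protocols.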

  • Source
    ABSTRACT: The data vortex photonic interconnection network is studied for application to clustering and hierarchical layering of nodes. Performance is examined for varying cluster counts and under loads of varying network locality. In today's technology, similar performance is attained at high network communication locality loads 2/3, and a 19% latency reduction is obtained at the highest locality loads 95% for current optical switching technology. For projected future technology, the clustered system is shown to yield up to a 55% reduction in latency for applications with 2/3 or better locality. © 2007 Optical Society of America OCIS codes: 060.0060, 060.2310, 060.4250, 200.4650.
    Journal of Optical Networking 08/2007; 6(9). · 1.08 Impact Factor
  • Source
    ABSTRACT: Reducing communication latency in multiprocessor interconnection networks can increase system performance on a broad range of applications. The data vortex photonic network reduces message latency by utilizing all-optical end-to-end transparent links and deflection routing. Cylinders replace node storage for buffering messages. The cylinder circumference (measured as number of angles) has a significant impact on the message acceptance rate and average message latency. A new symmetric mode of usage for the data vortex is discussed in which a fraction of the angles is used for input/output (I/O), and the remainder is used for "virtual buffering" of messages. For single-angle injection, six total angles provide the best performance. Likewise, the same ratio of 5:1 purely routing nodes versus I/O nodes is shown to produce greater than 99% acceptance, under normal loading conditions for all other network sizes studied. It is shown that for a given network I/O size, a shorter height and wider circumference data vortex organization provides acceptable latency with fewer total nodes than a taller but narrower data vortex. The performance versus system cost is discussed and evaluated, and the 5:1 noninjection-to-injection angle ratio is shown to be cost effective when constructing a system in current optical technology.
    Journal of Lightwave Technology 10/2006; · 2.56 Impact Factor
  • Source
    ABSTRACT: The definition of the data vortex architecture leaves broad room for decisions regarding the exact design point required for achieving a desired performance level. A detailed simulation-based study of various parameters that affect a data vortex interconnection network's performance is reported. Three implementations are compared by acceptance rate, latency, and cost.
    Journal of Optical Networking 04/2007; 6(4):369-374. · 1.08 Impact Factor
