The Kendall Square Research KSR1 MPP system has a shared address space, which spreads over physically distributed memory modules. Thus, memory access time can vary over a wide range even when accessing the same variable, depending on how this variable is being referenced and updated by the various processors. Since the processor stalls during this access time, the KSR1 performance depends considerably on the program's locality of reference. The KSR1 provides two novel features to reduce such long memory latencies: prefetch and post-store instructions. This paper analyzes the various memory latency factors which stalls the processor during program execution. A suitable model for evaluating these factors is developed for the execution of FORTRAN DO-loops parallelized with the Tile construct using the Slice strategy. The DO-loops used in the benchmark program perform sparse matrix-vector multiply, vector-vector dot product, and vector-vector addition, which are typically executed in an it...