Experience with Fine-grain Synchronization in MIMD Machines for Preconditioned Conjugate Gradient

Yeung, Donald; Agarwal, Anant

dc.contributor.author	Yeung, Donald	en_US
dc.contributor.author	Agarwal, Anant	en_US
dc.date.accessioned	2023-03-29T14:36:34Z
dc.date.available	2023-03-29T14:36:34Z
dc.date.issued	1992-10
dc.identifier.uri	https://hdl.handle.net/1721.1/149203
dc.description.abstract	This paper discusses our experience with fine-grain synchronization for the preconditioned conjugate gradient method using the modified incomplete Cholesky factorization of the coefficient matrix as a preconditioner. This algorithm represents a large class of algorithms that are widely used but have traditionally been difficult to implement efficiently on vector and parallel machines. Through a series of experiments conducted using a simulator of a distributed shared-memory multiprocessor, this paper addresses two major questions related to fine-grain synchronization in the context of this application. First, what is the overall impact of fine-grain synchronization on performance? Second, what are the individual contributions of the following three mechanisms typically provided to support fine-grain synchronization: language-level support, full-empty bits for compact storage and communication of synchronization state, and efficient processor operations on the state bits? The experiments indicate that fine-grain synchronization improves overall performance by a factor of 3.7 on 16 processors using the largest problem size we could simulate; the paper also projects that a significant performance advantage will be sustained for larger problem sizes. Preliminary experience shows that the bulk of the performance advantage for this application can be attributed to exposing increased parallelism through language-level expression of fine-grain synchronization. A smaller fraction relies on a compact implementation of synchronization state, while an even smaller fraction results from efficient full-empty bit operations. The paper also shows that the last two components are likely to have a greater impact on performance as mechanisms for latency tolerance are employed.	en_US
dc.relation.ispartofseries	MIT-LCS-TM-479
dc.title	Experience with Fine-grain Synchronization in MIMD Machines for Preconditioned Conjugate Gradient	en_US
dc.identifier.oclc	27929962

Files in this item

Name: MIT-LCS-TM-479.pdf
Size: 332.3Kb
Format: PDF

This item appears in the following Collection(s)

LCS Technical Memos (1974 - 2003)
