Browsing LCS Publications by Author "Agarwal, Anant"
Analyzing Multiprocessor Cache Behavior Through Data Reference Modeling
Tsai, Jory; Agarwal, Anant (1993-02) This paper develops a data reference modeling technique to estimate with high accuracy the cache miss ratio in cache-coherent multiprocessors. The technique involves analyzing the dynamic data referencing behavior of ...
APRIL: A Processor Architecture for Multiprocessing
Agarwal, Anant; Lim, Beng-Hong; Kranz, David; Kubiatowicz, John (1991) Processors in large-scale multiprocessors must be able to tolerate large communication latencies and synchronization delays. This paper describes the architecture of a rapid-context-switching processor called APRIL with ...
Automatic Partitioning of Parallel Loops and Data Arrays for Distributed Shared-memory Multiprocessors
Agarwal, Anant; Kranz, David A.; Natarajan, Venkat (1995-09) This paper presents a theoretical framework for automatically partitioning parallel loops to minimize cache coherency traffic on shared-memory multiprocessors. While several previous papers have looked at hyperplane ...
Automatic Partitioning of Parallel Loops for Cache-coherent Multiprocessors
Agarwal, Anant; Kranz, David; Natarajan, Venkat (1992-12) This paper presents a theoretical framework for automatically partitioning parallel loops to minimize cache coherency traffic on shared-memory multiprocessors. The framework introduces the notion of uniformly intersecting ...
Column-associative Caches: A Technique for Reducing the Miss Rate of Direct-mapped Caches
Agarwal, Anant; Pudar, Steven D. (1993-11) Direct-mapped caches are a popular design choice for high-performance processors; unfortunately, direct-mapped caches suffer systematic interference misses when more than one address maps into the same cache set. This paper ...
Communication-Minimal Partitioning of Parallel Loops and Data Arrays for Cache-Coherent Distributed-Memory Multiprocess
Barua, Rajeev; Kranz, David; Agarwal, Anant (1995-01) Harnessing the full performance potential of cache-coherent distributed shared memory multiprocessors without inordinate user effort requires a compilation technology that can automatically manage multiple levels of memory ...
Exploring Optimal Cost-Performance Designs for RAW processors
Moritz, Csaba Andras; Yeung, Donald; Agarwal, Anant (1998-06) The semiconductor industry roadmap projects that advances in VLSI technology will permit more than one billion transistors on a chip by the year 2010. The MIT Raw microprocessor is a proposed architecture that strives to ...
FUGU: Implementing Translation and Protection in a Multiuser, Multimodel Multiprocessor
Mackenzie, Kenneth; Kubiatowicz, John; Agarwal, Anant; Kaashoek, M. Frans (1994-10) Multimodel multiprocessors provide both shared memory and message passing primitives to the user for efficient communication. In a multiuser machine, translation permits machine resources to be virtualized and protection ...
Hierarchical Compilation of Macro Dataflow Graphs for Multiprocessors with Local Memory
Prasanna, G.N. Srinivasa; Agarwal, Anant; Musicus, Bruce R. (1992-10) This paper presents a hierarchical approach for compiling macro dataflow graphs for multiprocessors with local memory. Macro dataflow graphs comprise several nodes (or macro operations) that must be executed subject to ...
Hierarchical Compilation of Macro Dataflow Graphs for Multiprocessors with Local Memory
Prasanna, G.N. Srinivasa; Agarwal, Anant; Musicus, Bruce R. (1992-12) This paper presents a hierarchical approach for compiling macro dataflow graphs for multiprocessors with local memory. Macro dataflow graphs comprise several nodes (or macro operations) that must be executed subject to ...
Integrating Message-passing and Shared-memory: Early Experience
Kranz, David; Johnson, Kirk; Agarwal, Anant; Kubiatowicz, John; Lim, Beng-Hong (1992-10) This paper discusses some of the issues involved in implementing a shared-address space programming model on large-scale, distributed-memory multiprocessors. Because message-passing mechanisms are much more efficient than ...
Low-cost Support for Fine-grain Synchronization in Multiprocessors
Kranz, David; Lim, Beng-Hong; Agarwal, Anant (1992-06) As multiprocessors scale beyond the limits of a few tens of processors, they must look beyond traditional methods of synchronization to minimize serialization and achieve the high degrees of parallelism required to utilize ...
Maps: a Compiler-Managed Memory System for RAW Machines
Barua, Rajeev; Lee, Walter; Amarasinghe, Saman; Agarwal, Anant (1998-07) Microprocessors of the next decade and beyond will be built using VLSI chips employing billions of transistors. In this generation of microprocessors, achieving a high level of parallelism at a reasonable clock speed will ...
Performance Tradeoffs in Multithreaded Processors
Agarwal, Anant (1991-04) High network latencies in large-scale multiprocessors can cause a significant drop in processor utilization. By maintaining multiple process contexts in hardware and switching among them in a few cycles, multithreaded ...
The Sensitivity of Communication Mechanisms to Bandwidth and Latency
Chong, Frederic T.; Barua, Rajeev; Dahlgren, Fredrik; Kubiatowicz, John D.; Agarwal, Anant The goal of this paper is to gain insight into the relative performance of communication mechanisms as bisection bandwidth and network latency vary. We compare shared memory with and without prefetching, message passing ...
Shared Memory Versus Message Passing for Iterative Solution of Sparse, Irregular Problems
Chong, Frederic T.; Agarwal, Anant (1996-10) The benefits of hardware support for shared memory versus those for message passing are difficult to evaluate without an in-depth study of real applications on a common platform. We evaluate the communication mechanisms of ...
Stream Algorithms and Architecture
Henry, Hoffman; Strumpen, Volker; Agarwal, Anant (2003-03) Wire-exposed, programmable microarchitectures including Trips [11], Smart Memories [8], and Raw [13] offer an opportunity to schedule instruction execution and data movement explicitly. This paper proposes stream algorithms, ...
Virtual Wires: Overcoming Pin Limitations in FPGA-based Logic Emulators
Babb, Jonathan; Tessier, Russell; Agarwal, Anant (1992-11) Existing FPGA-based logic emulators suffer from limited inter-chip communication bandwidth, resulting in low gate utilization (10-20 percent). This resource imbalance increases the number of chips needed to emulate a ...