University of Saskatchewan HARVEST

      Future effectiveness of centralized and distributed memory architectures for parallel computing systems

      View/Open
      NQ37877.pdf (7.983 MB)
      Date
      1999-04-01
      Author
      Chinthamani, Meemakshisundaram Ramanthan
      Type
      Thesis
      Degree Level
      Doctoral
      Abstract
      This work is concerned with the question of how current parallel systems would need to evolve in terms of hardware capabilities, architectures, and software control policies if they are to continue to be useful computing platforms in the future. This question is motivated by the past and continuing increases in processor speeds, which substantially exceed the increases in performance of other hardware resources, including, for example, the networks needed for communication between processors in parallel systems. The hardware capabilities considered include processor speed, inter-processor communication network latency, inter-processor communication network bandwidth, and cache sizes. The candidate parallel system architectures considered include centralized memory architectures, in which all main memory accesses must traverse an interconnection network to access a shared, centralized memory, and distributed memory architectures, in which the total system main memory is distributed among the processors, such that only accesses to a part of the main memory allocated to another processor need traverse a global interconnection network. The software control policies considered include affinity scheduling policies, which assign computations to processors based on the likely contents of their caches and local memories, and latency-tolerant scheduling policies, which permit aggregation of computation and communication into larger units.

      Results of scalability analyses are presented in two forms: asymptotic results and transient results. Asymptotic results are developed by studying the execution characteristics of a number of data parallel numeric and scientific applications as the system parameters (processor speed, cache sizes, and inter-processor communication network bandwidth and latency) and the application parameters are scaled under a time-constrained scaling model, with the number of processors held fixed. Asymptotic results address the question of whether an application will eventually become communication bound, making the machine unsuitable as a parallel computing platform for that application; transient results, in contrast, are concerned with the rate at which the asymptotic behaviour takes hold.

      Based on the asymptotic results for centralized memory architectures, the set of applications considered in this work is classified into four types, termed Type I, Type II, Type III, and Type IV. Affinity scheduling techniques have no impact on the asymptotic results for Type I or Type III applications. For Type II and Type IV applications, however, they have a significant impact: these techniques significantly reduce the bus bandwidth scaling factor required for such applications to become computation bound asymptotically, provided cache sizes scale quickly enough to eventually contain the entire application data sets.

      For the class of data parallel near-neighbour computations, a new latency-tolerant scheduling policy is proposed. It is shown to substantially alleviate the potential "latency bottleneck" in high-latency parallel computing environments, yet, for one- and two-dimensional near-neighbour computations, it increases the total communication volume by at most a constant factor compared to the conventional scheduling policy. It is also shown that, for arbitrary d-dimensional near-neighbour computations (d >= 1), the asymptotic bandwidth scaling requirements are at the same level as for conventional scheduling. The benefits of the proposed latency-tolerant scheduling policy are also demonstrated through an experimental study.
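
      To make the trade-off concrete, the following is a minimal back-of-the-envelope sketch in Python, not drawn from the thesis itself, that compares per-step halo exchange with an aggregated B-step exchange for a one-dimensional 3-point near-neighbour computation. The step count T, the block size B, and the per-processor cost model are illustrative assumptions only.

      # Hypothetical cost model (not the thesis's actual policy): per-processor
      # message and communication-volume counts for a 1-D near-neighbour
      # (3-point stencil) computation with two neighbours per processor.

      def per_step_costs(T):
          # Conventional scheduling: exchange a 1-element halo with each
          # neighbour on every one of the T time steps.
          messages = 2 * T
          volume = 2 * T * 1
          return messages, volume

      def aggregated_costs(T, B):
          # Latency-tolerant variant: exchange a B-element-wide halo once per
          # block of B steps, then compute the whole block without communicating.
          blocks = T // B
          messages = 2 * blocks
          volume = 2 * blocks * B
          return messages, volume

      if __name__ == "__main__":
          T, B = 1000, 10   # step count and block size: illustrative values only
          m1, v1 = per_step_costs(T)
          m2, v2 = aggregated_costs(T, B)
          print(f"per-step  : {m1} messages, volume {v1}")
          print(f"aggregated: {m2} messages, volume {v2}")

      In this simple model the number of messages, and hence the number of network latencies paid, drops by a factor of B, while the total communication volume stays within a constant factor of the conventional policy, which is the trade-off the abstract describes for one-dimensional near-neighbour computations.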
      Degree
      Doctor of Philosophy (Ph.D.)
      Department
      Computer Science
      Program
      Computer Science
      Committee
      Eager, Derek L.
      Copyright Date
      April 1999
      URI
      http://hdl.handle.net/10388/etd-10212004-001331
      Subject
      computer science
      Collections
      • Graduate Theses and Dissertations