Publication date: 1999-9  Publisher: China Machine Press (机械工业出版社)  Author: David E. Culler  Pages: 1025
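Summary
The most exciting development in parallel computer architecture today is the convergence of traditionally disparate approaches to parallel implementation. Against this technical backdrop, the book uses numerous examples, precise data, and the author's deep understanding of parallel architecture to reveal the power inherent in parallel architectures, and it offers, for the first time, a thorough quantitative evaluation of design trade-offs. Drawing on the latest hardware and software technology, it gives a comprehensive, in-depth treatment of the major issues in parallel architecture design. The book distills the knowledge and experience of many experts. It is an authoritative text for students, researchers, and engineers, and a classic in the field of parallel architecture.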
Table of Contents
Contents
Foreword
Preface
1 Introduction
  1.1 Why Parallel Architecture
  1.2 Convergence of Parallel Architectures
  1.3 Fundamental Design Issues
  1.4 Concluding Remarks
  1.5 Historical References
  1.6 Exercises
2 Parallel Programs
  2.1 Parallel Application Case Studies
  2.2 The Parallelization Process
  2.3 Parallelization of an Example Program
  2.4 Concluding Remarks
  2.5 Exercises
3 Programming for Performance
  3.1 Partitioning for Performance
  3.2 Data Access and Communication in a Multimemory System
  3.3 Orchestration for Performance
  3.4 Performance Factors from the Processor's Perspective
  3.5 The Parallel Application Case Studies: An In-Depth Look
  3.6 Implications for Programming Models
  3.7 Concluding Remarks
  3.8 Exercises
4 Workload-Driven Evaluation
  4.1 Scaling Workloads and Machines
  4.2 Evaluating a Real Machine
  4.3 Evaluating an Architectural Idea or Trade-off
  4.4 Illustrating Workload Characterization
  4.5 Concluding Remarks
  4.6 Exercises
5 Shared Memory Multiprocessors
  5.1 Cache Coherence
  5.2 Memory Consistency
  5.3 Design Space for Snooping Protocols
  5.4 Assessing Protocol Design Trade-offs
  5.5 Synchronization
  5.6 Implications for Software
  5.7 Concluding Remarks
  5.8 Exercises
6 Snoop-Based Multiprocessor Design
  6.1 Correctness Requirements
  6.2 Base Design: Single-Level Caches with an Atomic Bus
  6.3 Multilevel Cache Hierarchies
  6.4 Split-Transaction Bus
  6.5 Case Studies: SGI Challenge and Sun Enterprise
  6.6 Extending Cache Coherence
  6.7 Concluding Remarks
  6.8 Exercises
7 Scalable Multiprocessors
  7.1 Scalability
  7.2 Realizing Programming Models
  7.3 Physical DMA
  7.4 User-Level Access
  7.5 Dedicated Message Processing
  7.6 Shared Physical Address Space
  7.7 Clusters and Networks of Workstations
  7.8 Implications for Parallel Software
  7.9 Synchronization
  7.10 Concluding Remarks
  7.11 Exercises
8 Directory-Based Cache Coherence
  8.1 Scalable Cache Coherence
  8.2 Overview of Directory-Based Approaches
  8.3 Assessing Directory Protocols and Trade-Offs
  8.4 Design Challenges for Directory Protocols
  8.5 Memory-Based Directory Protocols: The SGI Origin System
  8.6 Cache-Based Directory Protocols: The Sequent NUMA-Q
  8.7 Performance Parameters and Protocol Performance
  8.8 Synchronization
  8.9 Implications for Parallel Software
  8.10 Advanced Topics
  8.11 Concluding Remarks
  8.12 Exercises
9 Hardware/Software Trade-Offs
  9.1 Relaxed Memory Consistency Models
  9.2 Overcoming Capacity Limitations
  9.3 Reducing Hardware Cost
  9.4 Putting It All Together: A Taxonomy and Simple COMA
  9.5 Implications for Parallel Software
  9.6 Advanced Topics
  9.7 Concluding Remarks
  9.8 Exercises
10 Interconnection Network Design
  10.1 Basic Definitions
  10.2 Basic Communication Performance
  10.3 Organizational Structure
  10.4 Interconnection Topologies
  10.5 Evaluating Design Trade-Offs in Network Topology
  10.6 Routing
  10.7 Switch Design
  10.8 Flow Control
  10.9 Case Studies
  10.10 Concluding Remarks
  10.11 Exercises
11 Latency Tolerance
  11.1 Overview of Latency Tolerance
  11.2 Latency Tolerance in Explicit Message Passing
  11.3 Latency Tolerance in a Shared Address Space
  11.4 Block Data Transfer in a Shared Address Space
  11.5 Proceeding Past Long-Latency Events
  11.6 Precommunication in a Shared Address Space
  11.7 Multithreading in a Shared Address Space
  11.8 Lockup-Free Cache Design
  11.9 Concluding Remarks
  11.10 Exercises
12 Future Directions
  12.1 Technology and Architecture
  12.2 Applications and System Software
Appendix: Parallel Benchmark Suites