The goal of the workshop is to provide a forum for researchers and practitioners to exchange ideas and discuss the latest advances in the field of computer architecture modeling and simulation. The focus on modeling and simulation techniques is of vital importance to the ongoing advancements in microarchitecture, as these methods are essential tools for improving system performance, efficiency, and reliability.
The workshop will cover various aspects of computer architecture modeling and simulation, including but not limited to:
All times are in Central Standard Time (UTC-6).
Time | Event |
8:00 - 8:10 | Opening Remarks |
8:10 - 9:00 | Keynote |
9:00 - 10:00 | Paper Talks |
9:00 - 9:15 | [Paper] Demystifying Platform Requirements for Diverse LLM Inference Use Cases
Abhimanyu Bambhaniya, Ritik Raj, Geonhwa Jeong (Georgia Institute of Technology), Souvik Kundu (Intel Labs), Sudarshan Srinivasan, Midhilesh Elavazhagan, Madhu Kumar (Intel) and Tushar Krishna (Georgia Institute of Technology) Large language models (LLMs) have shown remarkable performance across a wide range of applications, often outperforming human experts. However, deploying these parameter-heavy models efficiently for diverse inference use cases requires carefully designed hardware platforms with ample computing, memory, and network resources. With LLM deployment scenarios and models evolving at breakneck speed, the hardware requirements to meet Service Level Objectives (SLOs) remain an open research question. |
9:15 - 9:30 | [Paper] BottleneckAI: Harnessing Machine Learning and Knowledge Transfer for Detecting Architectural Bottlenecks
Jihyun Ryoo, Gulsum Gudukbay Akbulut, Huaipan Jiang, Xulong Tang, Suat Akbulut, Jack Sampson, Vijaykrishnan Narayanan and Mahmut Taylan Kandemir (The Pennsylvania State University) The architectural analysis tools that output bottleneck information do not allow knowledge transfer to other applications or architectures. So, we propose a novel tool that can predict a known application's bottlenecks for previously unseen architectures or an unknown application's bottlenecks for known architectures. We (i) identify the bottleneck characteristics of 44 applications and use this as the dataset for our ML/DL model; (ii) identify the correlations between metrics and bottlenecks to create our tool's initial feature list; (iii) propose an architectural bottleneck analysis model - BottleneckAI - that employs random forest regression (RFR) and multi-layer perceptron (MLP) regression; (iv) present results that indicate the BottleneckAI tool can achieve 0.70 (RFR) and 0.72 (MLP) R^2 inference accuracy in predicting bottlenecks; (v) present five versions of BottleneckAI, four of which are trained with single-architecture data, and one of which is trained with multiple-architecture data, to predict bottlenecks for new architectures. |
9:30 - 9:45 | [Paper] How Accurate is Accurate Enough for Simulators? A Review of Simulation Validation
Shiyuan Li (Oregon State University) and Yifan Sun (The College of William and Mary) Simulators are vital tools for evaluating the performance of innovative architectural designs. To ensure accurate simulation results, researchers must validate these simulators. However, even validated simulators can exhibit unreliability when facing new workloads or modified architectural designs. This paper seeks to enhance simulator trustworthiness by refining the validation process. Through a comprehensive review of existing literature, the nuances of simulator accuracy and reliability are examined from a broader perspective on simulation error that goes beyond simple accuracy validation. Our proposals for improving simulator trustworthiness include selecting a representative benchmark set and expanding the configuration set during validation. Additionally, we aim to predict errors associated with new workloads by leveraging the error profiles obtained from the validation process. To further enhance overall simulator trustworthiness, we suggest incorporating error tolerance in the simulator calibration process. Ultimately, we propose additional validation with new benchmarks and minimal calibration, as this approach closely mimics real-world usage environments. |
9:45 - 10:00 | [Paper] Parallelizing a Modern GPU Simulator
Rodrigo Huerta and Antonio Gonzalez (Universitat Politècnica de Catalunya) Simulators are a primary tool in computer architecture research but are extremely computationally intensive. Simulating modern architectures with increased core counts and recent workloads can be challenging, even on modern hardware. This paper demonstrates that simulating some GPGPU workloads in a single-threaded state-of-the-art simulator such as Accel-sim can take more than five days. In this paper we present a simple approach to parallelize this simulator with minimal code changes by using OpenMP. Moreover, our parallelization technique is deterministic, so the simulator provides the same results for single-threaded and multi-threaded simulations. Compared to previous works, we achieve a higher speed-up, and, more importantly, the parallel simulation does not incur any inaccuracies. When we run the simulator with 16 threads, we achieve an average speed-up of 5.8x and reach 14x in some workloads. This allows researchers to simulate applications that take five days in less than 12 hours. By speeding up simulations, researchers can model larger systems, simulate bigger workloads, add more detail to the model, increase the efficiency of the hardware platform where the simulator is run, and obtain results sooner. |
10:00 - 10:30 | Coffee break |
10:30 - 12:00 | Simulator Release Talks |
10:30 - 10:50 | What's new in gem5 24.0 (Jason Lowe-Power)
In this talk, we will explore the significant advancements and new features introduced in gem5 v24.0 over the past five years. We will discuss the development of a robust and inclusive community. Key updates include the introduction of a standard library for simplified simulation setup, the implementation of the CHI coherence protocol for enhanced cache hierarchy configurability, and support for full system machine learning stacks using unmodified ML frameworks like PyTorch and TensorFlow. |
10:50 - 11:10 | Release of Sniper v8.1 and Guide on Common Simulation Practices
(Alen Sabu, Trevor E. Carlson)
In this talk, we will introduce the latest release of Sniper, version 8.1. This Sniper release includes support for Pac-Sim, a sampled simulation technique suitable for dynamically scheduled multi-threaded workloads. Pac-Sim eliminates the need for upfront profiling, allowing users to simulate large multi-threaded workloads more efficiently. Further, we release a document that assists computer architects and practitioners with selecting the right tools for their performance evaluation studies. We hope the document will be the starting point for any simulation-based research in computer architecture. |
11:10 - 11:30 | User-Friendly Tools in Akita (Yifan Sun)
In this talk, we will present AkitaRTM, the real-time monitoring tool for Akita, and Daisen, Akita's default trace visualization tool. |
11:30 - 11:50 | SST 14.1 Highlights (Patrick Lavin)
In this talk, we will cover the improvements made to the Structural Simulation Toolkit over the past several years. We will look at improvements made to the parallel core, most notably checkpoint/restart, as well as additions to the included simulation components such as Merlin, a network simulator, and Mercury, a large-scale application model. We will also share work done to help new users, including a new documentation website and an interactive utility for learning about simulation components. |
11:50 - 12:00 | Closing Remarks |
Abstract: The breakdown in Moore’s Law and Dennard Scaling is leading to drastic changes
in the makeup and constitution of computing systems. For example, a single
chip integrates 10-100s of cores and has a heterogeneous mix of general-purpose
compute engines and highly specialized accelerators. Traditionally, computer
architects have relied on tools such as architectural simulators to perform
accurate early-stage prototyping and optimization of proposed designs.
However, as systems become increasingly complex and heterogeneous, architectural
tools are straining to keep up. In particular, publicly available architectural
simulators are often not representative of the industry parts they intend
to model. This leads to a mismatch in expectations: when prototyping new
optimizations, researchers may draw the wrong conclusions about their efficacy
if the tool's models do not provide high fidelity.
Moreover, modeling and simulation tools are also struggling to keep pace with
increasingly large, complex workloads from domains such as machine learning (ML).
In this talk, I will discuss our work on improving the open source, publicly
available GPU models in the widely used gem5 simulator. gem5 can run entire
systems, including CPUs, GPUs, and accelerators, as well as the operating system,
runtime, network, and other related components. Thus, gem5 has the potential to
allow users to study the behavior of entire heterogeneous systems.
Unfortunately, some of gem5's publicly available models do not always provide
high accuracy relative to their "real" counterparts, especially for the memory
subsystem. I will discuss my group's efforts to overcome these challenges and
improve the fidelity of gem5's GPU models, as well as our ongoing efforts to
scalably run modern ML and HPC workloads in frameworks such as PyTorch and
TensorFlow in gem5. Collectively, this work significantly enhances the
state-of-the-art and enables more widespread adoption of gem5 as an accurate
platform for heterogeneous architecture research.
Bio: I am an Assistant Professor in the Computer Sciences Department at the University of Wisconsin-Madison. I am also an Affiliate Faculty in the ECE Department and Teaching Academy at UW-Madison. My research primarily focuses on how to design, program, and optimize future heterogeneous systems. I also design the tools for future heterogeneous systems, including serving on the gem5 Project Management Committee and the MLCommons Power and HPC Working Groups. I am a recipient of the NSF CAREER award, and my work has been funded by the DOE, Google, NSF, and SRC. My research has also been recognized several times, including an ACM Doctoral Dissertation Award nomination, a Qualcomm Innovation Fellowship, the David J. Kuck Outstanding PhD Thesis Award, and an ACM SIGARCH - IEEE Computer Society TCCA Outstanding Dissertation Award Honorable Mention. I am also the current steward for the ISCA Hall of Fame.
The workshop invites submissions of original work in the form of full papers (up to 6 pages, references not included) covering all aspects of computer architecture modeling and simulation. Submissions will be peer-reviewed, and accepted papers will be included in the workshop proceedings.
August 16, 2024
August 30, 2024 (Anywhere on Earth)
September 15, 2024
September 23, 2024
Full paper submissions must be in PDF format for US letter-size or A4 paper. They must not exceed 6 pages (excluding references, which are unlimited) in the standard ACM two-column conference format (review mode, with page numbers; either 9pt or 10pt fonts may be used). Shorter papers that express their ideas clearly are also welcome. Authors may choose whether to reveal their identities in the submission. Templates for the ACM format are available for Microsoft Word and LaTeX at https://www.acm.org/publications/proceedings-template
Papers will not be included in the ACM or IEEE digital libraries. Therefore, papers submitted to this event may be submitted to other venues without restriction.
At least one author of each accepted paper is expected to present in person during the event. We understand the travel difficulties of the post-pandemic era; in exceptional cases, we will allow remote or pre-recorded presentations.
Submission Site: https://easychair.org/conferences/?conf=cams2024
Yifan Sun | Trevor E. Carlson | Sabila Al Jannat |
Chair | Chair | Web Chair |
William & Mary | National University of Singapore | William & Mary |
In this workshop, we are experimenting with a program committee (PC) led by PhD students and practitioners. We believe that PhD students and practitioners are the end users of simulation and performance modeling tools and, hence, know the tools best. We will report on our experience during the workshop event.