


ACM Transactions on Architecture and Code Optimization, Volume 18
Volume 18, Number 1, January 2021
- Ari Rasch, Richard Schulze, Michel Steuwer, Sergei Gorlatch:
Efficient Auto-Tuning of Parallel Programs with Interdependent Tuning Parameters via Auto-Tuning Framework (ATF). 1:1-1:26
- Syed Mohammad Asad Hassan Jafri, Hasan Hassan, Ahmed Hemani, Onur Mutlu:
Refresh Triggered Computation: Improving the Energy Efficiency of Convolutional Neural Network Accelerators. 2:1-2:29
- Solomon Abera, M. Balakrishnan, Anshul Kumar:
Performance-Energy Trade-off in Modern CMPs. 3:1-3:26
- Atefeh Mehrabi, Aninda Manocha, Benjamin C. Lee, Daniel J. Sorin:
Bayesian Optimization for Efficient Accelerator Synthesis. 4:1-4:25
- Minsu Kim, Jeong-Keun Park, Soo-Mook Moon:
Irregular Register Allocation for Translation of Test-pattern Programs. 5:1-5:23
- Negin Nematollahi, Mohammad Sadrosadati, Hajar Falahati, Marzieh Barkhordar, Mario Paulo Drumond, Hamid Sarbazi-Azad, Babak Falsafi:
Efficient Nearest-Neighbor Data Sharing in GPUs. 6:1-6:26
- Lorenz Braun, Sotirios Nikas, Chen Song, Vincent Heuveline, Holger Fröning:
A Simple Model for Portable and Fast Prediction of Execution Time and Power Consumption of GPU Kernels. 7:1-7:25
- Marcel Mettler, Daniel Mueller-Gritschneder, Ulf Schlichtmann:
A Distributed Hardware Monitoring System for Runtime Verification on Multi-Tile MPSoCs. 8:1-8:25
- Yu Emma Wang, Carole-Jean Wu, Xiaodong Wang, Kim M. Hazelwood, David Brooks:
Exploiting Parallelism Opportunities with Deep Learning Frameworks. 9:1-9:23
- Wenjie Liu, Shoaib Akram, Jennifer B. Sartor, Lieven Eeckhout:
Reliability-aware Garbage Collection for Hybrid HBM-DRAM Memories. 10:1-10:25
- Sanket Tavarageri, Alexander Heinecke, Sasikanth Avancha, Bharat Kaul, Gagandeep Goyal, Ramakrishna Upadrasta:
PolyDL: Polyhedral Optimizations for Creation of High-performance DL Primitives. 11:1-11:27
- Sujay Yadalam, Vinod Ganapathy, Arkaprava Basu:
SGXL: Security and Performance for Enclaves Using Large Pages. 12:1-12:25
- Kleovoulos Kalaitzidis, André Seznec:
Leveraging Value Equality Prediction for Value Speculation. 13:1-13:20
- Abhishek Singh, Shail Dave, Pantea Zardoshti, Robert Brotzman, Chao Zhang, Xiaochen Guo, Aviral Shrivastava, Gang Tan, Michael F. Spear:
SPX64: A Scratchpad Memory for General-purpose Microprocessors. 14:1-14:26
- Sooraj Puthoor, Mikko H. Lipasti:
Systems-on-Chip with Strong Ordering. 15:1-15:27
- Paolo Sylos Labini, Marco Cianfriglia, Damiano Perri, Osvaldo Gervasi, Grigori Fursin, Anton Lokhmotov, Cedric Nugteren, Bruno Carpentieri, Fabiana Zollo, Flavio Vella:
On the Anatomy of Predictive Models for Accelerating GPU Convolution Kernels and Beyond. 16:1-16:24
Volume 18, Number 2, March 2021
- Nils Voss, Bastiaan Kwaadgras, Oskar Mencer, Wayne Luk, Georgi Gaydadjiev:
On Predictable Reconfigurable System Design. 17:1-17:28
- Anirudh Mohan Kaushik, Gennady Pekhimenko, Hiren D. Patel:
Gretch: A Hardware Prefetcher for Graph Analytics. 18:1-18:25
- Nhut-Minh Ho, Himeshi De Silva, Weng-Fai Wong:
GRAM: A Framework for Dynamically Mixing Precisions in GPU Applications. 19:1-19:24
- Arnab Kumar Biswas:
Cryptographic Software IP Protection without Compromising Performance or Timing Side-channel Leakage. 20:1-20:20
- Maxime France-Pillois, Jérôme Martin, Frédéric Rousseau:
A Non-Intrusive Tool Chain to Optimize MPSoC End-to-End Systems. 21:1-21:22
- Pengyu Wang, Jing Wang, Chao Li, Jianzong Wang, Haojin Zhu, Minyi Guo:
Grus: Toward Unified-memory-efficient High-performance Graph Processing on GPU. 22:1-22:25
- Ramin Izadpanah, Christina L. Peterson, Yan Solihin, Damian Dechev:
PETRA: Persistent Transactional Non-blocking Linked Data Structures. 23:1-23:26
- Muhammad Hassan, Chang Hyun Park, David Black-Schaffer:
A Reusable Characterization of the Memory System Behavior of SPEC2017 and SPEC2006. 24:1-24:20
Volume 18, Number 3, June 2021
- Sugandha Tiwari, Neel Gala, Chester Rebeiro, V. Kamakoti:
PERI: A Configurable Posit Enabled RISC-V Core. 25:1-25:26
- George Charitopoulos, Dionisios N. Pnevmatikatos, Georgi Gaydadjiev:
MC-DeF: Creating Customized CGRAs for Dataflow Applications. 26:1-26:25
- Jose M. Rodriguez Borbon, Junjie Huang, Bryan M. Wong, Walid A. Najjar:
Acceleration of Parallel-Blocked QR Decomposition of Tall-and-Skinny Matrices on FPGAs. 27:1-27:25
- Michael Stokes, David B. Whalley, Soner Önder:
Decreasing the Miss Rate and Eliminating the Performance Penalty of a Data Filter Cache. 28:1-28:22
- Shoaib Akram:
Performance Evaluation of Intel Optane Memory for Managed Workloads. 29:1-29:26
- Ya-Shuai Lü, Hui Guo, Libo Huang, Qi Yu, Li Shen, Nong Xiao, Zhiying Wang:
GraphPEG: Accelerating Graph Processing on GPUs. 30:1-30:24
- Hamza Omar, Omer Khan:
PRISM: Strong Hardware Isolation-based Soft-Error Resilient Multicore Architecture with High Performance and Availability at Low Hardware Overheads. 31:1-31:25
- Devashree Tripathy, AmirAli Abdolrashidi, Laxmi Narayan Bhuyan, Liang Zhou, Daniel Wong:
PAVER: Locality Graph-Based Thread Block Scheduling for GPUs. 32:1-32:26
- Wim Heirman, Stijn Eyerman, Kristof Du Bois, Ibrahim Hur:
Automatic Sublining for Efficient Sparse Memory Accesses. 33:1-33:23
- Mustafa Cavus, Mohammed Shatnawi, Resit Sendag, Augustus K. Uht:
Fast Key-Value Lookups with Node Tracker. 34:1-34:26
- Weijia Song, Christina Delimitrou, Zhiming Shen, Robbert van Renesse, Hakim Weatherspoon, Lotfi Benmohamed, Frederic J. de Vaulx, Charif Mahmoudi:
CacheInspector: Reverse Engineering Cache Resources in Public Clouds. 35:1-35:25
- Daniel Rodrigues Carvalho, André Seznec:
Understanding Cache Compression. 36:1-36:27
- Daniel Thuerck, Nicolas Weber, Roberto Bifulco:
Flynn's Reconciliation: Automating the Register Cache Idiom for Cross-accelerator Programming. 37:1-37:26
- João P. L. de Carvalho, Braedy Kuzma, Ivan Korostelev, José Nelson Amaral, Christopher Barton, José E. Moreira, Guido Araujo:
KernelFaRer: Replacing Native-Code Idioms with High-Performance Library Calls. 38:1-38:22
- Ricardo Alves, Stefanos Kaxiras, David Black-Schaffer:
Early Address Prediction: Efficient Pipeline Prefetch and Reuse. 39:1-39:22
Volume 18, Number 4, December 2021
- Kaustav Goswami, Dip Sankar Banerjee, Shirshendu Das:
Towards Enhanced System Efficiency while Mitigating Row Hammer. 40:1-40:26
- Jerzy Proficz:
All-gather Algorithms Resilient to Imbalanced Process Arrival Patterns. 41:1-41:22
- Rui Xu, Sheng Ma, Yaohua Wang, Xinhai Chen, Yang Guo:
Configurable Multi-directional Systolic Array Architecture for Convolutional Neural Networks. 42:1-42:24
- Wonik Seo, Sanghoon Cha, Yeonjae Kim, Jaehyuk Huh, Jongse Park:
SLO-Aware Inference Scheduler for Heterogeneous Processors in Edge Platforms. 43:1-43:26
- Yasir Mahmood Qureshi, William Andrew Simon, Marina Zapater, Katzalin Olcoz, David Atienza:
Gem5-X: A Many-core Heterogeneous Simulation Platform for Architectural Exploration and Optimization. 44:1-44:27
- Tina Jung, Fabian Ritter, Sebastian Hack:
PICO: A Presburger In-bounds Check Optimization for Compiler-based Memory Safety Instrumentations. 45:1-45:27
- Zhibing Sha, Jun Li, Lihao Song, Jiewen Tang, Min Huang, Zhigang Cai, Lianju Qian, Jianwei Liao, Zhiming Liu:
Low I/O Intensity-aware Partial GC Scheduling to Reduce Long-tail Latency in SSDs. 46:1-46:25
- Syed Asad Alam, James Garland, David Gregg:
Low-precision Logarithmic Number Systems: Beyond Base-2. 47:1-47:25
- Candace Walden, Devesh Singh, Meenatchi Jagasivamani, Shang Li, Luyi Kang, Mehdi Asnaashari, Sylvain Dubois, Bruce L. Jacob, Donald Yeung:
Monolithically Integrating Non-Volatile Main Memory over the Last-Level Cache. 48:1-48:26
- Matthew Tomei, Shomit Das, Mohammad Seyedzadeh, Philip Bedoukian, Bradford M. Beckmann, Rakesh Kumar, David A. Wood:
Byte-Select Compression. 49:1-49:27
- Cunlu Li, Dezun Dong, Shazhou Yang, Xiangke Liao, Guangyu Sun, Yongheng Liu:
CIB-HIER: Centralized Input Buffer Design in Hierarchical High-radix Routers. 50:1-50:21
- Tobias Gysi, Christoph Müller, Oleksandr Zinenko, Stephan Herhut, Eddie Davis, Tobias Wicky, Oliver Fuhrer, Torsten Hoefler, Tobias Grosser:
Domain-Specific Multi-Level IR Rewriting for GPU: The Open Earth Compiler for GPU-accelerated Climate Simulation. 51:1-51:23
- An Zou, Huifeng Zhu, Jingwen Leng, Xin He, Vijay Janapa Reddi, Christopher D. Gill, Xuan Zhang:
System-level Early-stage Modeling and Evaluation of IVR-assisted Processor Power Delivery System. 52:1-52:27
- Aninda Manocha, Tyler Sorensen, Esin Tureci, Opeoluwa Matthews, Juan L. Aragón, Margaret Martonosi:
GraphAttack: Optimizing Data Supply for Graph Applications on In-Order Multicore Architectures. 53:1-53:26
- Joscha Benz, Oliver Bringmann:
Scenario-Aware Program Specialization for Timing Predictability. 54:1-54:26
- Shounak Chakraborty, Magnus Själander:
WaFFLe: Gated Cache-Ways with Per-Core Fine-Grained DVFS for Reduced On-Chip Temperature and Leakage Consumption. 55:1-55:25
- Sriseshan Srikanth, Anirudh Jain, Thomas M. Conte, Erik P. DeBenedictis, Jeanine E. Cook:
SortCache: Intelligent Cache Management for Accelerating Sparse Data Workloads. 56:1-56:24
- Paul Metzger, Volker Seeker, Christian Fensch, Murray Cole:
Device Hopping: Transparent Mid-Kernel Runtime Switching for Heterogeneous Systems. 57:1-57:25
- Yu Zhang, Da Peng, Xiaofei Liao, Hai Jin, Haikun Liu, Lin Gu, Bingsheng He:
LargeGraph: An Efficient Dependency-Aware GPU-Accelerated Large-Scale Graph Processing. 58:1-58:24
- M. Hüsrev Cilasun, Salonik Resch, Zamshed I. Chowdhury, Erin Olson, Masoud Zabihi, Zhengyang Zhao, Thomas Peterson, Keshab K. Parhi, Jianping Wang, Sachin S. Sapatnekar, Ulya R. Karpuzcu:
Spiking Neural Networks in Spintronic Computational RAM. 59:1-59:21















