


ACM Transactions on Architecture and Code Optimization, Volume 11
Volume 11, Number 1, February 2014
- Neeraj Goel, Anshul Kumar, Preeti Ranjan Panda:
Shared-port register file architecture for low-energy VLIW processors. 1:1-1:32
- Zheng Wang, Georgios Tournavitis, Björn Franke, Michael F. P. O'Boyle:
Integrating profile-driven parallelism detection and machine-learning-based mapping. 2:1-2:26
- Mehrzad Samadi, Amir Hormati, Janghaeng Lee, Scott A. Mahlke:
Leveraging GPUs using cooperative loop speculation. 3:1-3:26
- Jue Wang, Xiangyu Dong, Yuan Xie, Norman P. Jouppi:
Endurance-aware cache line management for non-volatile caches. 4:1-4:25
- Lei Liu, Zehan Cui, Yong Li, Yungang Bao, Mingyu Chen, Chengyong Wu:
BPM/BPM+: Software-based dynamic memory partitioning mechanisms for mitigating DRAM bank-/channel-level interferences in multicore systems. 5:1-5:28
- Christian Häubl, Christian Wimmer, Hanspeter Mössenböck:
Trace transitioning and exception handling in a trace-based JIT compiler for java. 6:1-6:26
- Yongbing Huang, Licheng Chen, Zehan Cui, Yuan Ruan, Yungang Bao, Mingyu Chen, Ninghui Sun:
HMTT: A hybrid hardware/software tracing system for bridging the DRAM access trace's semantic gap. 7:1-7:25
- Quan Chen, Minyi Guo:
Adaptive workload-aware task scheduling for single-ISA asymmetric multicore architectures. 8:1-8:25
- Gülfem Savrun-Yeniçeri, Wei Zhang, Huahan Zhang, Eric Seckler, Chen Li, Stefan Brunthaler, Per Larsen, Michael Franz:
Efficient hosted interpreters on the JVM. 9:1-9:24
- Prashant J. Nair, Chia-Chen Chou, Moinuddin K. Qureshi:
Refresh pausing in DRAM memory systems. 10:1-10:26
- Komal Jothi, Haitham Akkary:
Tuning the continual flow pipeline architecture with virtual register renaming. 11:1-11:27
- Thomas Carle, Dumitru Potop-Butucaru:
Predicate-aware, makespan-preserving software pipelining of scheduling tables. 12:1-12:26
- Angeliki Kritikakou, Francky Catthoor, Vasilios I. Kelefouras, Costas E. Goutis:
A scalable and near-optimal representation of access schemes for memory management. 13:1-13:25
- Hugh Leather, Edwin V. Bonilla, Michael F. P. O'Boyle:
Automatic feature generation for machine learning-based optimising compilation. 14:1-14:32
Volume 11, Number 2, June 2014
- Theo Kluter, Samuel Burri, Philip Brisk, Edoardo Charbon, Paolo Ienne:
Virtual Ways: Low-Cost Coherence for Instruction Set Extensions with Architecturally Visible Storage. 15:1-15:26
- Bin Ren, Todd Mytkowicz, Gagan Agrawal:
A Portable Optimization Engine for Accelerating Irregular Data-Traversal Applications on SIMD Architectures. 16:1-16:31
- Zhengwei Qi, Jianguo Yao, Chao Zhang, Miao Yu, Zhizhou Yang, Haibing Guan:
VGRIS: Virtualized GPU Resource Isolation and Scheduling in Cloud Gaming. 17:1-17:25
- Bor-Yeh Shen, Wei-Chung Hsu, Wuu Yang:
A Retargetable Static Binary Translator for the ARM Architecture. 18:1-18:25
- Darío Suárez Gracia, Alexandra Ferrerón-Labari, Luis Montesano Del Campo, Teresa Monreal Arnal, Víctor Viñals Yúfera:
Revisiting LP-NUCA Energy Consumption: Cache Access Policies and Adaptive Block Dropping. 19:1-19:26
- Zhibin Liang, Wei Zhang, Yung-Cheng Ma:
Deadline-Constrained Clustered Scheduling for VLIW Architectures using Power-Gated Register Files. 20:1-20:26
- Shuangde Fang, Zidong Du, Yuntan Fang, Yuanjie Huang, Yang Chen, Lieven Eeckhout, Olivier Temam, Huawei Li, Yunji Chen, Chengyong Wu:
Performance Portability Across Heterogeneous SoCs Using a Generalized Library-Based Approach. 21:1-21:25
- Abdul Rahman Kaitoua, Hazem M. Hajj, Mazen A. R. Saghir, Hassan Artail, Haitham Akkary, Mariette Awad, Mageda Sharafeddine, Khaleel W. Mershad:
Hadoop Extensions for Distributed Computing on Reconfigurable Active SSD Clusters. 22:1-22:26
Volume 11, Number 3, July/August 2014
- Jue Wang, Xiangyu Dong, Yuan Xie:
Preventing STT-RAM Last-Level Caches from Port Obstruction. 23:1-23:19
- Miguel A. Gonzalez-Mesa, Eladio Gutiérrez, Emilio L. Zapata, Oscar G. Plata:
Effective Transactional Memory Execution Management for Improved Concurrency. 24:1-24:27
- Rakesh Kumar, Alejandro Martínez, Antonio González:
Efficient Power Gating of SIMD Accelerators Through Dynamic Selective Devectorization in an HW/SW Codesigned Environment. 25:1-25:23
- Stefano Di Carlo, Salvatore Galfano, Marco Indaco, Paolo Prinetto, Davide Bertozzi, Piero Olivo, Cristian Zambelli:
FLARES: An Aging Aware Algorithm to Autonomously Adapt the Error Correction Capability in NAND Flash Memories. 26:1-26:25
- Davide B. Bartolini, Filippo Sironi, Donatella Sciuto, Marco D. Santambrogio:
Automated Fine-Grained CPU Provisioning for Virtual Machines. 27:1-27:25
- Trevor E. Carlson, Wim Heirman, Stijn Eyerman, Ibrahim Hur, Lieven Eeckhout:
An Evaluation of High-Level Mechanistic Core Models. 28:1-28:25
- Farrukh Hijaz, Omer Khan:
NUCA-L1: A Non-Uniform Access Latency Level-1 Cache Architecture for Multicores Operating at Near-Threshold Voltages. 29:1-29:28
- Andi Drebes, Karine Heydemann, Nathalie Drach, Antoniu Pop, Albert Cohen:
Topology-Aware and Dependence-Aware Scheduling and Memory Allocation for Task-Parallel Languages. 30:1-30:25
- Venkata Kalyan Tavva, Ravi Kasha, Madhu Mutyam:
EFGR: An Enhanced Fine Granularity Refresh Feature for High-Performance DDR4 DRAM Devices. 31:1-31:26
- Gulay Yalcin, Oguz Ergin, Emrah Islek, Osman Sabri Unsal, Adrián Cristal:
Exploiting Existing Comparators for Fine-Grained Low-Cost Error Detection. 32:1-32:24
- Pradeep Ramachandran, Siva Kumar Sastry Hari, Man-Lap Li, Sarita V. Adve:
Hardware Fault Recovery for I/O Intensive Applications. 33:1-33:25
- Stijn Eyerman, Pierre Michaud, Wouter Rogiest:
Multiprogram Throughput Metrics: A Systematic Approach. 34:1-34:26
Volume 11, Number 4, December 2014
- Cedric Nugteren, Henk Corporaal:
Bones: An Automatic Skeleton-Based C-to-CUDA Compiler for GPUs. 35:1-35:25
- Jue Wang, Xiangyu Dong, Yuan Xie:
Building and Optimizing MRAM-Based Commodity Memories. 36:1-36:22
- Rakesh Komuravelli, Sarita V. Adve, Ching-Tsun Chou:
Revisiting the Complexity of Hardware Cache Coherence and Some Implications. 37:1-37:22
- Gabriel Rodríguez, Juan Touriño, Mahmut T. Kandemir:
Volatile STT-RAM Scratchpad Design and Data Allocation for Low Energy. 38:1-38:26
- Cristobal Camarero, Enrique Vallejo, Ramón Beivide:
Topological Characterization of Hamming and Dragonfly Networks and Its Implications on Routing. 39:1-39:25
- HanBin Yoon, Justin Meza, Naveen Muralimanohar, Norman P. Jouppi, Onur Mutlu:
Efficient Data Mapping and Buffering Techniques for Multilevel Cell Phase-Change Memories. 40:1-40:25
- Nathanaël Prémillieu, André Seznec:
Efficient Out-of-Order Execution of Guarded ISAs. 41:1-41:21
- Zheng Wang, Dominik Grewe, Michael F. P. O'Boyle:
Automatic and Portable Mapping of Data Parallel Programs to OpenCL for GPU-Based Heterogeneous Systems. 42:1-42:26
- Dan He, Fang Wang, Hong Jiang, Dan Feng, Jingning Liu, Wei Tong, Zheng Zhang:
Improving Hybrid FTL by Fully Exploiting Internal SSD Parallelism with Virtual Blocks. 43:1-43:19
- Eri Rubin, Ely Levy, Amnon Barak, Tal Ben-Nun:
MAPS: Optimizing Massively Parallel Applications Using Device-Level Memory Abstraction. 44:1-44:22
Volume 11, Number 4, January 2015
- Alessandro Cilardo, Luca Gallo:
Improving Multibank Memory Access Parallelism with Lattice-Based Partitioning. 45:1-45:25
- Jan Kasper Martinsen, Håkan Grahn, Anders Isberg:
The Effects of Parameter Tuning in Software Thread-Level Speculation in JavaScript Engines. 46:1-46:25
- Quentin Colombet, Florian Brandner, Alain Darte:
Studying Optimal Spilling in the Light of SSA. 47:1-47:26
- Jawad Haj-Yihia, Yosi Ben-Asher, Efraim Rotem, Ahmad Yasin, Ran Ginosar:
Compiler-Directed Power Management for Superscalars. 48:1-48:21
- Hong-Phuc Trinh, Marc Duranton, Michel Paindavoine:
Efficient Data Encoding for Convolutional Neural Network application. 49:1-49:21
- Maximilien Breughe, Stijn Eyerman, Lieven Eeckhout:
Mechanistic Analytical Modeling of Superscalar In-Order Processor Performance. 50:1-50:26
- Vivek Seshadri, Samihan Yedkar, Hongyi Xin, Onur Mutlu, Phillip B. Gibbons, Michael A. Kozuch, Todd C. Mowry:
Mitigating Prefetcher-Caused Pollution Using Informed Caching Policies for Prefetched Blocks. 51:1-51:22
- George Matheou, Paraskevas Evripidou:
Architectural Support for Data-Driven Execution. 52:1-52:25
- Amir Morad, Leonid Yavits, Ran Ginosar:
GP-SIMD Processing-in-Memory. 53:1-53:26
- Thomas Schaub, Simon Moll, Ralf Karrenberg, Sebastian Hack:
The Impact of the SIMD Width on Control-Flow and Memory Divergence. 54:1-54:25
- Zhenman Fang, Sanyam Mehta, Pen-Chung Yew, Antonia Zhai, James B. S. G. Greensky, Gautham Beeraka, Binyu Zang:
Measuring Microarchitectural Details of Multi- and Many-Core Memory Systems through Microbenchmarking. 55:1-55:26
- Chi Ching Chi, Mauricio Alvarez-Mesa, Ben H. H. Juurlink:
Low-Power High-Efficiency Video Decoding using General-Purpose Processors. 56:1-56:25
- Fabio Luporini, Ana Lucia Varbanescu, Florian Rathgeber, Gheorghe-Teodor Bercea, J. Ramanujam, David A. Ham, Paul H. J. Kelly:
Cross-Loop Optimization of Arithmetic Intensity for Finite Element Local Assembly. 57:1-57:25
- Xing Zhou, María Jesús Garzarán, David A. Padua:
Optimal Parallelogram Selection for Hierarchical Tiling. 58:1-58:23
- Leo Porter, Michael A. Laurenzano, Ananta Tiwari, Adam Jundt, William A. Ward Jr., Roy L. Campbell, Laura Carrington:
Making the Most of SMT in HPC: System- and Application-Level Perspectives. 59:1-59:26
- Xin Tong, Toshihiko Koju, Motohiro Kawahito, Andreas Moshovos:
Optimizing Memory Translation Emulation in Full System Emulators. 60:1-60:24
- Martin Kong, Antoniu Pop, Louis-Noël Pouchet, R. Govindarajan, Albert Cohen, P. Sadayappan:
Compiler/Runtime Framework for Dynamic Dataflow Parallelization of Tiled Programs. 61:1-61:30
- Nicolas Melot, Christoph W. Keßler, Jörg Keller, Patrick Eitschberger:
Fast Crown Scheduling Heuristics for Energy-Efficient Mapping and Scaling of Moldable Streaming Tasks on Manycore Systems. 62:1-62:24
- Wenjia Ruan, Yujie Liu, Michael F. Spear:
Transactional Read-Modify-Write Without Aborts. 63:1-63:24
- Zia Ul Huda, Ali Jannesari, Felix Wolf:
Using Template Matching to Infer Parallel Design Patterns. 64:1-64:21
- Heiner Litz, Ricardo J. Dias, David R. Cheriton:
Efficient Correction of Anomalies in Snapshot Isolation Transactions. 65:1-65:24
- Helge Bahmann, Nico Reissmann, Magnus Jahre, Jan Christian Meyer:
Perfect Reconstructability of Control Flow from Demand Dependence Graphs. 66:1-66:25
- Venmugil Elango, Naser Sedaghati, Fabrice Rastello, Louis-Noël Pouchet, J. Ramanujam, Radu Teodorescu, P. Sadayappan:
On Using the Roofline Model with Lower Bounds on Data Movement. 67:1-67:23
