22nd HPCA 2016: Barcelona, Spain
- 2016 IEEE International Symposium on High Performance Computer Architecture, HPCA 2016, Barcelona, Spain, March 12-16, 2016. IEEE Computer Society 2016, ISBN 978-1-4673-9211-2
Session 1A - Hardware Accelerators
- Mahdi Nazm Bojnordi, Engin Ipek:
Memristive Boltzmann machine: A hardware accelerator for combinatorial optimization and deep learning. 1-13
- Divya Mahajan, Jongse Park, Emmanuel Amaro, Hardik Sharma, Amir Yazdanbakhsh, Joon Kyung Kim, Hadi Esmaeilzadeh:
TABLA: A unified template-based framework for accelerating statistical machine learning. 14-26
- Tony Nowatzki, Vinay Gangadhar, Karthikeyan Sankaralingam, Greg Wright:
Pushing the limits of accelerator efficiency while retaining programmability. 27-39
Session 1B - Mobile/IoT
- Yajing Chen, Shengshuo Lu, Hun-Seok Kim, David T. Blaauw, Ronald G. Dreslinski, Trevor N. Mudge:
A low power software-defined-radio baseband processor for the Internet of Things. 40-51
- Benjamin Gaudette, Carole-Jean Wu, Sarma B. K. Vrudhula:
Improving smartphone user experience by balancing performance and energy with probabilistic QoS guarantee. 52-63
- Matthew Halpern, Yuhao Zhu, Vijay Janapa Reddi:
Mobile CPU's rise to power: Quantifying the impact of generational mobile CPU design trends on performance, energy, and user satisfaction. 64-76
Session 2A - Non-volatile Memories
- Kshitij A. Doshi, Ellis Giles, Peter J. Varman:
Atomic persistence for SCM with a non-intrusive backend controller. 77-89
- Poovaiah M. Palangappa, Kartik Mohanram:
CompEx: Compression-expansion coding for energy, latency, and lifetime improvements in MLC/TLC NVM. 90-101
- Miguel Angel Lastras-Montaño, Amirali Ghofrani, Kwang-Ting Cheng:
A low-power hybrid reconfigurable architecture for resistive random-access memories. 102-113
Session 2B - Reconfigurable Architectures
- Ze-ke Wang, Bingsheng He, Wei Zhang, Shunning Jiang:
A performance analysis framework for optimizing OpenCL applications on FPGAs. 114-125
- Mingyu Gao, Christos Kozyrakis:
HRL: Efficient and flexible reconfigurable logic for near-data processing. 126-137
- Matthew A. Watkins, Tony Nowatzki, Anthony Carno:
Software transparent dynamic binary translation for coarse-grain reconfigurable architectures. 138-150
Session 3A - GPUs
- Renji Thomas, Kristin Barber, Naser Sedaghati, Li Zhou, Radu Teodorescu:
Core tunneling: Variation-aware voltage noise mitigation in GPUs. 151-162
- Keunsoo Kim, Sangpil Lee, Myung Kuk Yoon, Gunjae Koo, Won Woo Ro, Murali Annavaram:
Warped-preexecution: A GPU pre-execution approach for improving latency hiding. 163-175
- Daniel Wong, Nam Sung Kim, Murali Annavaram:
Approximating warps with intra-warp operand value similarity. 176-187
- Gennady Pekhimenko, Evgeny Bolotin, Nandita Vijaykumar, Onur Mutlu, Todd C. Mowry, Stephen W. Keckler:
A case for toggle-aware compression for GPU systems. 188-200
Session 3B - Caches
- Elvira Teran, Yingying Tian, Zhe Wang, Daniel A. Jiménez:
Minimal disturbance placement and promotion. 201-211
- Hongil Yoon, Gurindar S. Sohi:
Revisiting virtual L1 caches: A practical design using dynamic synonym remapping. 212-224
- Nathan Beckmann, Daniel Sánchez:
Modeling cache performance beyond LRU. 225-236
- Hakbeom Jang, Yongjun Lee, Jongwon Kim, Youngsok Kim, Jangwoo Kim, Jinkyu Jeong, Jae W. Lee:
Efficient footprint caching for Tagless DRAM Caches. 237-248
Session 4A - Coherence and Consistency
- Yuelu Duan, David A. Koufaty, Josep Torrellas:
SCsafe: Logging sequential consistency violations continuously and precisely. 249-260
- Liang Luo, Akshitha Sriraman, Brooke Fugate, Shiliang Hu, Gilles Pokam, Chris J. Newburn, Joseph Devietti:
LASER: Light, Accurate Sharing dEtection and Repair. 261-273
- Sui Chen, Lu Peng:
Efficient GPU hardware transactional memory through early conflict resolution. 274-284
- Sunjae Park, Milos Prvulovic, Christopher J. Hughes:
PleaseTM: Enabling transaction conflict management in requester-wins hardware transactional memory. 285-296
Session 4B - Interconnects
- Jieming Yin, Onur Kayiran, Matthew Poremba, Natalie D. Enright Jerger, Gabriel H. Loh:
Efficient synthetic traffic models for large, complex SoCs. 297-308
- Yuan Yao, Zhonghai Lu:
DVFS for NoCs in CMPs: A thread voting approach. 309-320
- Yigit Demir, Nikos Hardavellas:
SLaC: Stage laser control for a flattened butterfly network. 321-332
- Zimo Li, Joshua San Miguel, Natalie D. Enright Jerger:
The runahead network-on-chip. 333-344
Session 5A - GPGPUs
- Tianhao Zheng, David W. Nellans, Arslan Zulfiqar, Mark Stephenson, Stephen W. Keckler:
Towards high performance paged memory for GPUs. 345-357
- Zhenning Wang, Jun Yang, Rami G. Melhem, Bruce R. Childers, Youtao Zhang, Minyi Guo:
Simultaneous Multikernel GPU: Multi-tasking throughput processors via fine-grained sharing. 358-369
- Minseok Lee, Gwangsun Kim, John Kim, Woong Seo, Yeon-Gon Cho, Soojung Ryu:
iPAWS: Instruction-issue pattern-based adaptive warp scheduling for GPGPUs. 370-381
Session 5B - Security
- Andrew Ferraiuolo, Yao Wang, Danfeng Zhang, Andrew C. Myers, G. Edward Suh:
Lattice priority scheduling: Low-overhead timing-channel protection for a shared memory controller. 382-393
- Zhen Hang Jiang, Yunsi Fei, David R. Kaeli:
A complete key recovery timing attack on a GPU. 394-405
- Fangfei Liu, Qian Ge, Yuval Yarom, Frank McKeen, Carlos V. Rozas, Gernot Heiser, Ruby B. Lee:
CATalyst: Defeating last-level cache side channel attacks in cloud computing. 406-418
Session 6A - Large-Scale Systems
- Wei Wang, Jack W. Davidson, Mary Lou Soffa:
Predicting the memory bandwidth and optimal core allocations for multi-threaded applications on large-scale NUMA machines. 419-431
- Mohammad A. Islam, Xiaoqi Ren, Shaolei Ren, Adam Wierman, Xiaorui Wang:
A market approach for handling power emergencies in multi-tenant data center. 432-443
- Yang Li, Di Wang, Saugata Ghose, Jie Liu, Sriram Govindan, Sean James, Eric Peterson, John Siegler, Rachata Ausavarungnirun, Onur Mutlu:
SizeCap: Efficiently handling power surges in fuel cell powered data centers. 444-456
Session 6B - Potpourri
- Donglin Wang, Xueliang Du, Leizu Yin, Chen Lin, Hong Ma, Weili Ren, Huijuan Wang, Xingang Wang, Shaolin Xie, Lei Wang, Zijun Liu, Tao Wang, Zhonghua Pu, Guangxin Ding, Mengchen Zhu, Lipeng Yang, Ruoshan Guo, Zhiwei Zhang, Xiao Lin, Jie Hao, Yongyong Yang, Wenqin Sun, Fabiao Zhou, NuoZhou Xiao, Qian Cui, Xiaoqin Wang:
MaPU: A novel mathematical computing architecture. 457-468
- Pierre Michaud:
Best-offset hardware prefetching. 469-480
- Hao Wang, Jie Zhang, Sharmila Shridhar, Gieseo Park, Myoungsoo Jung, Nam Sung Kim:
DUANG: Fast and lightweight page migration in asymmetric memory systems. 481-493
Session 7A - Industry Session
- Neha Agarwal, David W. Nellans, Eiman Ebrahimi, Thomas F. Wenisch, John Danskin, Stephen W. Keckler:
Selective GPU caches to eliminate CPU-GPU HW cache coherence. 494-506
- Jianbo Dong, Rui Hou, Michael C. Huang, Tao Jiang, Boyan Zhao, Sally A. McKee, Haibin Wang, Xiaosong Cui, Lixin Zhang:
Venice: Exploring server architectures for effective resource sharing. 507-518
- Bin Nie, Devesh Tiwari, Saurabh Gupta, Evgenia Smirni, James H. Rogers:
A large-scale study of soft-errors on GPUs in the field. 519-530
- Sungyong Seo, Youngjin Cho, Youngkwang Yoo, Otae Bae, Jaegeun Park, Heehyun Nam, Sunmi Lee, Yongmyung Lee, Seungdo Chae, Moonsang Kwon, Jin-Hyeok Choi, Sangyeun Cho, Jaeheon Jeong, Duckhyun Chang:
Design and implementation of a mobile storage leveraging the DRAM interface. 531-542
Session 7B - Memory Technology
- XianWei Zhang, Youtao Zhang, Bruce R. Childers, Jun Yang:
Restore truncation for performance improvement in future DRAM systems. 543-554
- Xun Jian, Vilas Sridharan, Rakesh Kumar:
Parity Helix: Efficient protection for single-dimensional faults in multi-dimensional memory systems. 555-567
- Kevin K. Chang, Prashant J. Nair, Donghyuk Lee, Saugata Ghose, Moinuddin K. Qureshi, Onur Mutlu:
Low-Cost Inter-Linked Subarrays (LISA): Enabling fast inter-subarray data movement in DRAM. 568-580
- Hasan Hassan, Gennady Pekhimenko, Nandita Vijaykumar, Vivek Seshadri, Donghyuk Lee, Oguz Ergin, Onur Mutlu:
ChargeCache: Reducing DRAM latency by exploiting row access locality. 581-593
Session 8 - Modeling and Testing
- William J. Song, Saibal Mukhopadhyay, Sudhakar Yalamanchili:
Amdahl's law for lifetime reliability scaling in heterogeneous multicore processors. 594-605
- Sina Hassani, Gabriel Southern, Jose Renau:
LiveSim: Going live with microarchitecture simulation. 606-617
- Marco Elver, Vijay Nagarajan:
McVerSi: A test generation framework for fast memory consistency verification in simulation. 618-630
Session 9A - Caches and TLB
- Vasileios Karakostas, Jayneel Gandhi, Adrián Cristal, Mark D. Hill, Kathryn S. McKinley, Mario Nemirovsky, Michael M. Swift, Osman S. Unsal:
Energy-efficient address translation. 631-643
- Madhavan Manivannan, Vassilis Papaefstathiou, Miquel Pericàs, Per Stenström:
RADAR: Runtime-assisted dead region management for last-level caches. 644-656
- Andrew Herdrich, Edwin Verplanke, Priya Autee, Ramesh Illikkal, Chris Gianos, Ronak Singhal, Ravi R. Iyer:
Cache QoS: From concept to reality in the Intel® Xeon® processor E5-2600 v3 product family. 657-668
Session 9B - Microarchitecture
- Josué Feliu, Stijn Eyerman, Julio Sahuquillo, Salvador Petit:
Symbiotic job scheduling on the IBM POWER8. 669-680
- Bhargava Gopireddy, Choungki Song, Josep Torrellas, Nam Sung Kim, Aditya Agrawal, Asit K. Mishra:
ScalCore: Designing a core for voltage scalability. 681-693
- Arthur Perais, André Seznec:
Cost effective physical register sharing. 694-706