CGO 2015: San Francisco, CA, USA
- Kunle Olukotun, Aaron Smith, Robert Hundt, Jason Mars:
Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization, CGO 2015, San Francisco, CA, USA, February 07 - 11, 2015. IEEE Computer Society 2015, ISBN 978-1-4799-8161-8
GPU optimization
- Qing Jiao, Mian Lu, Huynh Phung Huynh, Tulika Mitra:
Improving GPGPU energy-efficiency through concurrent kernel execution and DVFS. 1-11
- Naznin Fauzia, Louis-Noël Pouchet, P. Sadayappan:
Characterizing and enhancing global memory data coalescing on GPUs. 12-22
- Chao Li, Yi Yang, Zhen Lin, Huiyang Zhou:
Automatic data placement into GPU on-chip memory resources. 23-33
Tools and debugging
- Kyle Dewey, Vineeth Kashyap, Ben Hardekopf:
A parallel abstract interpreter for JavaScript. 34-45
- Evgeniy Stepanov, Konstantin Serebryany:
MemorySanitizer: fast detector of uninitialized memory use in C++. 46-55
- Long Zheng, Xiaofei Liao, Bingsheng He, Song Wu, Hai Jin:
On performance debugging of unnecessary lock contentions on multicore processors: a replay-based approach. 56-67
Runtime optimization and techniques
- Byron Hawkins, Brian Demsky, Derek Bruening, Qin Zhao:
Optimizing binary translation of dynamically generated code. 68-78
- William Arthur, Ben Mehne, Reetuparna Das, Todd M. Austin:
Getting in control of your control flow with control-data isolation. 79-90
- Jithendra Srinivas, Wei Ding, Mahmut T. Kandemir:
Reactive tiling. 91-102
Microarchitecture
- Erven Rohou, Bharath Narasimha Swamy, André Seznec:
Branch prediction and the performance of interpreters: don't trust folklore. 103-114
- James Pallister, Kerstin Eder, Simon J. Hollis:
Optimizing the flash-RAM energy trade-off in deeply embedded systems. 115-124
- Lawrence C. McAfee, Kunle Olukotun:
EMEURO: a framework for generating multi-purpose accelerators via deep learning. 125-135
Parallelism and concurrency
- Wai Teng Tang, Ruizhe Zhao, Mian Lu, Yun Liang, Huynh Phung Huynh, Xibai Li, Rick Siow Mong Goh:
Optimizing and auto-tuning scale-free sparse matrix-vector multiplication on Intel Xeon Phi. 136-145
- Brandon Lucia, Luis Ceze:
Data provenance tracking for concurrent programs. 146-156
- Sunil Shrestha, Guang R. Gao, Joseph B. Manzano, Andrès Márquez, John Feo:
Locality aware concurrent start for stencil applications. 157-166
Code generation and optimization
- Niranjan Hasabnis, Rui Qiao, R. Sekar:
Checking correctness of code generator architecture specifications. 167-178
- JinSeok Oh, Soo-Mook Moon:
Snapshot-based loading-time acceleration for web applications. 179-189
Static program analysis and optimization
- Vasileios Porpodas, Alberto Magni, Timothy M. Jones:
PSLP: padded SLP automatic vectorization. 190-201
- Roland Leißa, Marcel Köster, Sebastian Hack:
A graph-based higher-order intermediate representation. 202-212
- Cosmin E. Oancea, Lawrence Rauchwerger:
Scalable conditional induction variables (CIV) analysis. 213-224
Best paper session
- Vaivaswatha Nagaraj, R. Govindarajan:
Approximating flow-sensitive pointer analysis using frequent itemset mining. 225-234
- Simone Campanoni, Glenn H. Holloway, Gu-Yeon Wei, David M. Brooks:
HELIX-UP: relaxing program semantics to unleash parallelization. 235-245
- Xiaochun Zhang, Qi Guo, Yunji Chen, Tianshi Chen, Weiwu Hu:
HERMES: a fast cross-ISA binary translator with post-optimization. 246-256
- Hee-Seok Kim, Izzat El Hajj, John A. Stratton, Steven S. Lumetta, Wen-mei W. Hwu:
Locality-centric thread scheduling for bulk-synchronous programming models on CPU architectures. 257-268