


Остановите войну!
for scientists:


default search action
Andy Zou
Person information

Refine list

refinements active!
zoomed in on ?? of ?? records
view refined list in
export refined list as
2020 – today
- 2023
- [c8]Alexander Pan, Jun Shern Chan, Andy Zou, Nathaniel Li, Steven Basart, Thomas Woodside, Hanlin Zhang, Scott Emmons, Dan Hendrycks:
Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the Machiavelli Benchmark. ICML 2023: 26837-26867 - [i11]Kai Hu, Andy Zou, Zifan Wang, Klas Leino, Matt Fredrikson:
Scaling in Depth: Unlocking Robustness Certification on ImageNet. CoRR abs/2301.12549 (2023) - [i10]Ram M. Kripa
, Andy Zou, Ryan Jia, Kenny Huang:
Papaya: Federated Learning, but Fully Decentralized. CoRR abs/2303.06189 (2023) - [i9]Alexander Pan, Jun Shern Chan, Andy Zou, Nathaniel Li, Steven Basart, Thomas Woodside, Jonathan Ng, Hanlin Zhang, Scott Emmons, Dan Hendrycks:
Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark. CoRR abs/2304.03279 (2023) - [i8]Andy Zou, Zifan Wang, J. Zico Kolter, Matt Fredrikson:
Universal and Transferable Adversarial Attacks on Aligned Language Models. CoRR abs/2307.15043 (2023) - [i7]Andy Zou, Long Phan, Sarah Chen, James Campbell, Phillip Guo, Richard Ren, Alexander Pan, Xuwang Yin, Mantas Mazeika, Ann-Kathrin Dombrowski, Shashwat Goel, Nathaniel Li, Michael J. Byun, Zifan Wang, Alex Mallen, Steven Basart, Sanmi Koyejo, Dawn Song, Matt Fredrikson, J. Zico Kolter, Dan Hendrycks:
Representation Engineering: A Top-Down Approach to AI Transparency. CoRR abs/2310.01405 (2023) - 2022
- [c7]Dan Hendrycks, Andy Zou, Mantas Mazeika, Leonard Tang, Bo Li, Dawn Song, Jacob Steinhardt:
PixMix: Dreamlike Pictures Comprehensively Improve Safety Measures. CVPR 2022: 16762-16771 - [c6]Dan Hendrycks, Steven Basart, Mantas Mazeika, Andy Zou, Joseph Kwon, Mohammadreza Mostajabi, Jacob Steinhardt, Dawn Song:
Scaling Out-of-Distribution Detection for Real-World Settings. ICML 2022: 8759-8773 - [c5]Mantas Mazeika, Eric Tang, Andy Zou, Steven Basart, Jun Shern Chan, Dawn Song, David A. Forsyth, Jacob Steinhardt, Dan Hendrycks:
How Would The Viewer Feel? Estimating Wellbeing From Video Scenarios. NeurIPS 2022 - [c4]Andy Zou, Tristan Xiao, Ryan Jia, Joe Kwon, Mantas Mazeika, Richard Li, Dawn Song, Jacob Steinhardt, Owain Evans, Dan Hendrycks:
Forecasting Future World Events With Neural Networks. NeurIPS 2022 - [i6]Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power
, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, Ameet Rahane, Anantharaman S. Iyer, Anders Andreassen, Andrea Santilli, Andreas Stuhlmüller, Andrew M. Dai, Andrew La, Andrew K. Lampinen, Andy Zou, Angela Jiang, Angelica Chen, Anh Vuong, Animesh Gupta, Anna Gottardi, Antonio Norelli, Anu Venkatesh, Arash Gholamidavoodi, Arfa Tabassum, Arul Menezes, Arun Kirubarajan, Asher Mullokandov, Ashish Sabharwal, Austin Herrick, Avia Efrat, Aykut Erdem, Ayla Karakas, et al.:
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models. CoRR abs/2206.04615 (2022) - [i5]Andy Zou, Tristan Xiao, Ryan Jia, Joe Kwon, Mantas Mazeika, Richard Li, Dawn Song, Jacob Steinhardt, Owain Evans, Dan Hendrycks:
Forecasting Future World Events with Neural Networks. CoRR abs/2206.15474 (2022) - [i4]Mantas Mazeika, Eric Tang, Andy Zou, Steven Basart, Jun Shern Chan, Dawn Song, David A. Forsyth, Jacob Steinhardt, Dan Hendrycks:
How Would The Viewer Feel? Estimating Wellbeing From Video Scenarios. CoRR abs/2210.10039 (2022) - 2021
- [c3]Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt:
Measuring Massive Multitask Language Understanding. ICLR 2021 - [c2]Dan Hendrycks, Mantas Mazeika, Andy Zou, Sahil Patel, Christine Zhu, Jesus Navarro, Dawn Song, Bo Li, Jacob Steinhardt:
What Would Jiminy Cricket Do? Towards Agents That Behave Morally. NeurIPS Datasets and Benchmarks 2021 - [c1]Mantas Mazeika, Dan Hendrycks, Huichen Li, Xiaojun Xu, Sidney Q. Hough, Andy Zou, Arezoo Rajabi, Qi Yao, Zihao Wang, Jian Tian, Yao Tang, Di Tang, Roman Smirnov, Pavel Pleskov, Nikita Benkovich, Dawn Song, Radha Poovendran, Bo Li, David A. Forsyth:
The Trojan Detection Challenge. NeurIPS (Competition and Demos) 2021: 279-291 - [i3]Dan Hendrycks, Mantas Mazeika, Andy Zou, Sahil Patel, Christine Zhu, Jesus Navarro, Dawn Song, Bo Li, Jacob Steinhardt:
What Would Jiminy Cricket Do? Towards Agents That Behave Morally. CoRR abs/2110.13136 (2021) - [i2]Dan Hendrycks, Andy Zou, Mantas Mazeika, Leonard Tang, Bo Li, Dawn Song, Jacob Steinhardt:
PixMix: Dreamlike Pictures Comprehensively Improve Safety Measures. CoRR abs/2112.05135 (2021) - 2020
- [i1]Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt:
Measuring Massive Multitask Language Understanding. CoRR abs/2009.03300 (2020)
Coauthor Index

manage site settings
To protect your privacy, all features that rely on external API calls from your browser are turned off by default. You need to opt-in for them to become active. All settings here will be stored as cookies with your web browser. For more information see our F.A.Q.
Unpaywalled article links
Add open access links from to the list of external document links (if available).
Privacy notice: By enabling the option above, your browser will contact the API of unpaywall.org to load hyperlinks to open access articles. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Unpaywall privacy policy.
Archived links via Wayback Machine
For web page which are no longer available, try to retrieve content from the of the Internet Archive (if available).
Privacy notice: By enabling the option above, your browser will contact the API of archive.org to check for archived content of web pages that are no longer available. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Internet Archive privacy policy.
Reference lists
Add a list of references from ,
, and
to record detail pages.
load references from crossref.org and opencitations.net
Privacy notice: By enabling the option above, your browser will contact the APIs of crossref.org, opencitations.net, and semanticscholar.org to load article reference information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Crossref privacy policy and the OpenCitations privacy policy, as well as the AI2 Privacy Policy covering Semantic Scholar.
Citation data
Add a list of citing articles from and
to record detail pages.
load citations from opencitations.net
Privacy notice: By enabling the option above, your browser will contact the API of opencitations.net and semanticscholar.org to load citation information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the OpenCitations privacy policy as well as the AI2 Privacy Policy covering Semantic Scholar.
OpenAlex data
Load additional information about publications from .
Privacy notice: By enabling the option above, your browser will contact the API of openalex.org to load additional information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the information given by OpenAlex.
last updated on 2023-12-02 23:08 CET by the dblp team
all metadata released as open data under CC0 1.0 license
see also: Terms of Use | Privacy Policy | Imprint