


Ronghang Hu
2020 – today
- 2024
- [i25] Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu, Ronghang Hu, Chaitanya Ryali, Tengyu Ma, Haitham Khedr, Roman Rädle, Chloé Rolland, Laura Gustafson, Eric Mintun, Junting Pan, Kalyan Vasudev Alwala, Nicolas Carion, Chao-Yuan Wu, Ross B. Girshick, Piotr Dollár, Christoph Feichtenhofer: SAM 2: Segment Anything in Images and Videos. CoRR abs/2408.00714 (2024)
- 2023
- [c22] Sanghyun Woo, Shoubhik Debnath, Ronghang Hu, Xinlei Chen, Zhuang Liu, In So Kweon, Saining Xie: ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders. CVPR 2023: 16133-16142
- [c21] Yanghao Li, Haoqi Fan, Ronghang Hu, Christoph Feichtenhofer, Kaiming He: Scaling Language-Image Pre-Training via Masking. CVPR 2023: 23390-23400
- [c20] Dave Zhenyu Chen, Ronghang Hu, Xinlei Chen, Matthias Nießner, Angel X. Chang: UniT3D: A Unified Transformer for 3D Dense Captioning and Visual Grounding. ICCV 2023: 18063-18073
- [i24] Sanghyun Woo, Shoubhik Debnath, Ronghang Hu, Xinlei Chen, Zhuang Liu, In So Kweon, Saining Xie: ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders. CoRR abs/2301.00808 (2023)
- 2022
- [c19] Amanpreet Singh, Ronghang Hu, Vedanuj Goswami, Guillaume Couairon, Wojciech Galuba, Marcus Rohrbach, Douwe Kiela: FLAVA: A Foundational Language And Vision Alignment Model. CVPR 2022: 15617-15629
- [i23] Ronghang Hu, Shoubhik Debnath, Saining Xie, Xinlei Chen: Exploring Long-Sequence Masked Autoencoders. CoRR abs/2210.07224 (2022)
- [i22] Yanghao Li, Haoqi Fan, Ronghang Hu, Christoph Feichtenhofer, Kaiming He: Scaling Language-Image Pre-training via Masking. CoRR abs/2212.00794 (2022)
- [i21] Dave Zhenyu Chen, Ronghang Hu, Xinlei Chen, Matthias Nießner, Angel X. Chang: UniT3D: A Unified Transformer for 3D Dense Captioning and Visual Grounding. CoRR abs/2212.00836 (2022)
- 2021
- [c18] Ronghang Hu, Amanpreet Singh: UniT: Multimodal Multitask Learning with a Unified Transformer. ICCV 2021: 1419-1429
- [c17] Ronghang Hu, Nikhila Ravi, Alexander C. Berg, Deepak Pathak: Worldsheet: Wrapping the World in a 3D Sheet for View Synthesis from a Single Image. ICCV 2021: 12508-12517
- [i20] Ronghang Hu, Amanpreet Singh: Transformer is All You Need: Multimodal Multitask Learning with a Unified Transformer. CoRR abs/2102.10772 (2021)
- [i19] Amanpreet Singh, Ronghang Hu, Vedanuj Goswami, Guillaume Couairon, Wojciech Galuba, Marcus Rohrbach, Douwe Kiela: FLAVA: A Foundational Language And Vision Alignment Model. CoRR abs/2112.04482 (2021)
- 2020
- [b1] Ronghang Hu: Structured Models for Vision-and-Language Reasoning. University of California, Berkeley, USA, 2020
- [c16] Ronghang Hu, Amanpreet Singh, Trevor Darrell, Marcus Rohrbach: Iterative Answer Prediction With Pointer-Augmented Multimodal Transformers for TextVQA. CVPR 2020: 9989-9999
- [c15] Oleksii Sidorov, Ronghang Hu, Marcus Rohrbach, Amanpreet Singh: TextCaps: A Dataset for Image Captioning with Reading Comprehension. ECCV (2) 2020: 742-758
- [i18] Oleksii Sidorov, Ronghang Hu, Marcus Rohrbach, Amanpreet Singh: TextCaps: a Dataset for Image Captioning with Reading Comprehension. CoRR abs/2003.12462 (2020)
- [i17] Ronghang Hu, Deepak Pathak: Worldsheet: Wrapping the World in a 3D Sheet for View Synthesis from a Single Image. CoRR abs/2012.09854 (2020)
2010 – 2019
- 2019
- [c14] Ronghang Hu, Daniel Fried, Anna Rohrbach, Dan Klein, Trevor Darrell, Kate Saenko: Are You Looking? Grounding to Multiple Modalities in Vision-and-Language Navigation. ACL (1) 2019: 6551-6557
- [c13] Ronghang Hu, Anna Rohrbach, Trevor Darrell, Kate Saenko: Language-Conditioned Graph Networks for Relational Reasoning. ICCV 2019: 10293-10302
- [i16] Ronghang Hu, Anna Rohrbach, Trevor Darrell, Kate Saenko: Language-Conditioned Graph Networks for Relational Reasoning. CoRR abs/1905.04405 (2019)
- [i15] Ronghang Hu, Daniel Fried, Anna Rohrbach, Dan Klein, Trevor Darrell, Kate Saenko: Are You Looking? Grounding to Multiple Modalities in Vision-and-Language Navigation. CoRR abs/1906.00347 (2019)
- [i14] Ronghang Hu, Amanpreet Singh, Trevor Darrell, Marcus Rohrbach: Iterative Answer Prediction with Pointer-Augmented Multimodal Transformers for TextVQA. CoRR abs/1911.06258 (2019)
- 2018
- [c12] Ronghang Hu, Piotr Dollár, Kaiming He, Trevor Darrell, Ross B. Girshick: Learning to Segment Every Thing. CVPR 2018: 4233-4241
- [c11] Ronghang Hu, Jacob Andreas, Trevor Darrell, Kate Saenko: Explainable Neural Computation via Stack Neural Module Networks. ECCV (7) 2018: 55-71
- [c10] Lisa Anne Hendricks, Ronghang Hu, Trevor Darrell, Zeynep Akata: Grounding Visual Explanations. ECCV (2) 2018: 269-286
- [c9] Daniel Fried, Ronghang Hu, Volkan Cirik, Anna Rohrbach, Jacob Andreas, Louis-Philippe Morency, Taylor Berg-Kirkpatrick, Kate Saenko, Dan Klein, Trevor Darrell: Speaker-Follower Models for Vision-and-Language Navigation. NeurIPS 2018: 3318-3329
- [i13] Daniel Fried, Ronghang Hu, Volkan Cirik, Anna Rohrbach, Jacob Andreas, Louis-Philippe Morency, Taylor Berg-Kirkpatrick, Kate Saenko, Dan Klein, Trevor Darrell: Speaker-Follower Models for Vision-and-Language Navigation. CoRR abs/1806.02724 (2018)
- [i12] Lisa Anne Hendricks, Ronghang Hu, Trevor Darrell, Zeynep Akata: Generating Counterfactual Explanations with Natural Language. CoRR abs/1806.09809 (2018)
- [i11] Ronghang Hu, Jacob Andreas, Trevor Darrell, Kate Saenko: Explainable Neural Computation via Stack Neural Module Networks. CoRR abs/1807.08556 (2018)
- [i10] Lisa Anne Hendricks, Ronghang Hu, Trevor Darrell, Zeynep Akata: Grounding Visual Explanations. CoRR abs/1807.09685 (2018)
- 2017
- [c8] Ronghang Hu, Marcus Rohrbach, Jacob Andreas, Trevor Darrell, Kate Saenko: Modeling Relationships in Referential Expressions with Compositional Modular Networks. CVPR 2017: 4418-4427
- [c7] Ronghang Hu, Jacob Andreas, Marcus Rohrbach, Trevor Darrell, Kate Saenko: Learning to Reason: End-to-End Module Networks for Visual Question Answering. ICCV 2017: 804-813
- [i9] Ronghang Hu, Jacob Andreas, Marcus Rohrbach, Trevor Darrell, Kate Saenko: Learning to Reason: End-to-End Module Networks for Visual Question Answering. CoRR abs/1704.05526 (2017)
- [i8] Lisa Anne Hendricks, Ronghang Hu, Trevor Darrell, Zeynep Akata: Grounding Visual Explanations (Extended Abstract). CoRR abs/1711.06465 (2017)
- [i7] Ronghang Hu, Piotr Dollár, Kaiming He, Trevor Darrell, Ross B. Girshick: Learning to Segment Every Thing. CoRR abs/1711.10370 (2017)
- 2016
- [c6] Ronghang Hu, Huazhe Xu, Marcus Rohrbach, Jiashi Feng, Kate Saenko, Trevor Darrell: Natural Language Object Retrieval. CVPR 2016: 4555-4564
- [c5] Ronghang Hu, Marcus Rohrbach, Trevor Darrell: Segmentation from Natural Language Expressions. ECCV (1) 2016: 108-124
- [c4] Anna Rohrbach, Marcus Rohrbach, Ronghang Hu, Trevor Darrell, Bernt Schiele: Grounding of Textual Phrases in Images by Reconstruction. ECCV (1) 2016: 817-834
- [i6] Ronghang Hu, Marcus Rohrbach, Trevor Darrell: Segmentation from Natural Language Expressions. CoRR abs/1603.06180 (2016)
- [i5] Ronghang Hu, Marcus Rohrbach, Subhashini Venugopalan, Trevor Darrell: Utilizing Large Scale Vision and Text Datasets for Image Segmentation from Referring Expressions. CoRR abs/1608.08305 (2016)
- [i4] Ronghang Hu, Marcus Rohrbach, Jacob Andreas, Trevor Darrell, Kate Saenko: Modeling Relationships in Referential Expressions with Compositional Modular Networks. CoRR abs/1611.09978 (2016)
- 2015
- [c3] Damian Mrowca, Marcus Rohrbach, Judy Hoffman, Ronghang Hu, Kate Saenko, Trevor Darrell: Spatial Semantic Regularisation for Large Scale Object Detection. ICCV 2015: 2003-2011
- [i3] Damian Mrowca, Marcus Rohrbach, Judy Hoffman, Ronghang Hu, Kate Saenko, Trevor Darrell: Spatial Semantic Regularisation for Large Scale Object Detection. CoRR abs/1510.02949 (2015)
- [i2] Anna Rohrbach, Marcus Rohrbach, Ronghang Hu, Trevor Darrell, Bernt Schiele: Grounding of Textual Phrases in Images by Reconstruction. CoRR abs/1511.03745 (2015)
- [i1] Ronghang Hu, Huazhe Xu, Marcus Rohrbach, Jiashi Feng, Kate Saenko, Trevor Darrell: Natural Language Object Retrieval. CoRR abs/1511.04164 (2015)
- 2014
- [c2] Ronghang Hu, Ruiping Wang, Shiguang Shan, Xilin Chen: Robust Head-Shoulder Detection Using a Two-Stage Cascade Framework. ICPR 2014: 2796-2801
- [c1] Judy Hoffman, Sergio Guadarrama, Eric Tzeng, Ronghang Hu, Jeff Donahue, Ross B. Girshick, Trevor Darrell, Kate Saenko: LSDA: Large Scale Detection through Adaptation. NIPS 2014: 3536-3544
last updated on 2024-09-13 00:44 CEST by the dblp team
all metadata released as open data under CC0 1.0 license