


Qi Wu 0001
吴琦
Person information
- unicode name: 吴琦
- affiliation: University of Adelaide, School of Computer Science, Australian Centre for Robotic Vision, Adelaide, Australia
- affiliation (PhD 2015): University of Bath, UK
Other persons with the same name
- Qi Wu — disambiguation page
- Qi Wu 0002 — Chinese Academy of Sciences, Institute of Computing Technology, Beijing, China
- Qi Wu 0003 (aka: Edmond Qi Wu) — Shanghai Jiao Tong University, School of Electronic, Information and Electrical Engineering, China (and 2 more)
- Qi Wu 0004 — Technical University of Hamburg-Harburg, Germany (and 2 more)
- Qi Wu 0005 — Beihang University, School of Instrumentation and Optoelectronic Engineering, Beijing, China
- Qi Wu 0006 — ScaleFlux Inc., San Jose, CA, USA (and 2 more)
- Qi Wu 0007 — Shanghai Jiao Tong University, School of Electronic Information and Electrical Engineering, Shanghai Key Laboratory of Navigation and Location-based Services, Shanghai, China
- Qi Wu 0008 — Jiangxi University of Finance and Economics, School of Information Technology, Nanchang, China
- Qi Wu 0009 — City University of Hong Kong, School of Data Science, Hong Kong (and 2 more)
- Qi Wu 0010 — Chongqing University of Posts and Telecommunications, School of Communication and Information Engineering, Chongqing, China
- Qi Wu 0011 — Huazhong University of Science and Technology, School of Electrical and Electronic Engineering, State Key Laboratory of Advanced Electromagnetic Engineering and Technology, Wuhan, China
- Qi Wu 0012 — Huazhong University of Science and Technology, School of Mechanical Science and Engineering, National NC System Engineering Research Center, Wuhan, China
- Qi Wu 0013 — Guangdong Police College, Department of Computer Science, Guangzhou, China
- Qi Wu 0014 (aka: Tony Qi Wu) — Carnegie Mellon University, Electrical and Computer Engineering Department, Pittsburgh, PA, USA (and 1 more)
- Qi Wu 0015 — University of California, Davis, CA, USA (and 1 more)
- Qi Wu 0016 — Xiangtan University, School of Mathematics and Computational Science, Xiangtan, China (and 1 more)
- Qi Wu 0017 — Megvii Technology, Beijing, China
- Qi Wu 0018 — Zhejiang University of Technology, College of Information Engineering, Hangzhou, China
- Qi Wu 0019 — State Grid Changzhou Power Supply Company, 500 kV Substation Operation and Overhaul Center, Changzhou, China (and 1 more)
- Qi Wu 0020 — Wuhan University of Science and Technology, School of Automobile and Traffic Engineering, China
2020 – today
- 2024
- [j37] Mengyang Sun, Wei Suo, Peng Wang, Kai Niu, Le Liu, Guosheng Lin, Yanning Zhang, Qi Wu:
An Adaptive Correlation Filtering Method for Text-Based Person Search. Int. J. Comput. Vis. 132(10): 4440-4455 (2024)
- [j36] Yutong Xie, Lin Gu, Tatsuya Harada, Jianpeng Zhang, Yong Xia, Qi Wu:
Rethinking masked image modelling for medical image representation. Medical Image Anal. 98: 103304 (2024)
- [j35] Ning Ding, Chaorui Deng, Mingkui Tan, Qing Du, Zhiwei Ge, Qi Wu:
Image Captioning With Controllable and Adaptive Length Levels. IEEE Trans. Pattern Anal. Mach. Intell. 46(2): 764-779 (2024)
- [j34] Chen Gao, Si Liu, Jinyu Chen, Luting Wang, Qi Wu, Bo Li, Qi Tian:
Room-Object Entity Prompting and Reasoning for Embodied Referring Expression. IEEE Trans. Pattern Anal. Mach. Intell. 46(2): 994-1010 (2024)
- [j33] Yutong Xie, Jianpeng Zhang, Yong Xia, Qi Wu:
UniMiSS+: Universal Medical Self-Supervised Learning From Cross-Dimensional Unpaired Data. IEEE Trans. Pattern Anal. Mach. Intell. 46(12): 10021-10035 (2024)
- [j32] Zhiquan Wen, Shuaicheng Niu, Ge Li, Qingyao Wu, Mingkui Tan, Qi Wu:
Test-Time Model Adaptation for Visual Question Answering With Debiased Self-Supervisions. IEEE Trans. Multim. 26: 2137-2147 (2024)
- [j31] Mingkui Tan, Zhiquan Wen, Leyuan Fang, Qi Wu:
Transformer-Based Relational Inference Network for Complex Visual Relational Reasoning. ACM Trans. Multim. Comput. Commun. Appl. 20(1): 10:1-10:23 (2024)
- [c117] Qi Chen, Dileepa Pitawela, Chongyang Zhao, Gengze Zhou, Hsiang-Ting Chen, Qi Wu:
WebVLN: Vision-and-Language Navigation on Websites. AAAI 2024: 1165-1173
- [c116] Bahram Mohammadi, Yicong Hong, Yuankai Qi, Qi Wu, Shirui Pan, Javen Qinfeng Shi:
Augmented Commonsense Knowledge for Remote Object Grounding. AAAI 2024: 4269-4277
- [c115] Yuanmin Tang, Jing Yu, Keke Gai, Jiamin Zhuang, Gang Xiong, Yue Hu, Qi Wu:
Context-I2W: Mapping Images to Context-Dependent Words for Accurate Zero-Shot Composed Image Retrieval. AAAI 2024: 5180-5188
- [c114] Gengze Zhou, Yicong Hong, Qi Wu:
NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models. AAAI 2024: 7641-7649
- [c113] Qi Chen, Yutong Xie, Biao Wu, Xiaomin Chen, James Ang, Minh-Son To, Xiaojun Chang, Qi Wu:
Act Like a Radiologist: Radiology Report Generation Across Anatomical Regions. ACCV (6) 2024: 36-52
- [c112] Zixiong Huang, Qi Chen, Libo Sun, Yifan Yang, Naizhou Wang, Qi Wu, Mingkui Tan:
G-NeRF: Geometry-enhanced Novel View Synthesis from Single-View Images. CVPR 2024: 10117-10126
- [c111] Yiwen Ye, Yutong Xie, Jianpeng Zhang, Ziyang Chen, Qi Wu, Yong Xia:
Continual Self-Supervised Learning: Towards Universal Multi-Modal Medical Data Representation Learning. CVPR 2024: 11114-11124
- [c110] Vu Minh Hieu Phan, Yutong Xie, Yuankai Qi, Lingqiao Liu, Liyang Liu, Bowen Zhang, Zhibin Liao, Qi Wu, Minh-Son To, Johan W. Verjans:
Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Pre-Training Framework. CVPR 2024: 11492-11501
- [c109] Yutong Xie, Qi Chen, Sinuo Wang, Minh-Son To, Iris Lee, Ee Win Khoo, Kerolos Hendy, Daniel Koh, Yong Xia, Qi Wu:
PairAug: What Can Augmented Image-Text Pairs Do for Radiology? CVPR 2024: 11652-11661
- [c108] Xinyu Wang, Bohan Zhuang, Qi Wu:
ModaVerse: Efficiently Transforming Modalities with LLMs. CVPR 2024: 26596-26606
- [c107] Gengze Zhou, Yicong Hong, Zun Wang, Xin Eric Wang, Qi Wu:
NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models. ECCV (7) 2024: 260-278
- [c106] Yanyuan Qiao, Qianyi Liu, Jiajun Liu, Jing Liu, Qi Wu:
LLM as Copilot for Coarse-Grained Vision-and-Language Navigation. ECCV (5) 2024: 459-476
- [c105] Haodong Hong, Sen Wang, Zi Huang, Qi Wu, Jiajun Liu:
Why Only Text: Empowering Vision-and-Language Navigation with Multi-modal Prompts. IJCAI 2024: 839-847
- [c104] Zilin Lu, Yutong Xie, Qingjie Zeng, Mengkang Lu, Qi Wu, Yong Xia:
Spot the Difference: Difference Visual Question Answering with Residual Alignment. MICCAI (5) 2024: 649-658
- [c103] Yili Li, Jing Yu, Keke Gai, Bang Liu, Gang Xiong, Qi Wu:
T2VIndexer: A Generative Video Indexer for Efficient Text-Video Retrieval. ACM Multimedia 2024: 3955-3963
- [c102] Xiangyan Qu, Jing Yu, Keke Gai, Jiamin Zhuang, Yuanmin Tang, Gang Xiong, Gaopeng Gou, Qi Wu:
Visual-Semantic Decomposition and Partial Alignment for Document-based Zero-Shot Learning. ACM Multimedia 2024: 4581-4590
- [c101] Haodong Hong, Sen Wang, Zi Huang, Qi Wu, Jiajun Liu:
Navigating Beyond Instructions: Vision-and-Language Navigation in Obstructed Environments. ACM Multimedia 2024: 7639-7648
- [c100] Yicheng Wu, Yutong Xie, Xiangde Luo, Qi Wu, Jianfei Cai:
Dataset, Challenge, and Evaluation for Tumor Segmentation Variability. ACM Multimedia 2024: 11302-11303
- [c99] Keji He, Kehan Chen, Jiawang Bai, Yan Huang, Qi Wu, Shu-Tao Xia, Liang Wang:
Everyday Object Meets Vision-and-Language Navigation Agent via Backdoor. NeurIPS 2024
- [c98] Jiazhao Zhang, Kunyu Wang, Rongtao Xu, Gengze Zhou, Yicong Hong, Xiaomeng Fang, Qi Wu, Zhizheng Zhang, He Wang:
NaVid: Video-based VLM Plans the Next Step for Vision-and-Language Navigation. Robotics: Science and Systems 2024
- [i118] Xinyu Wang, Bohan Zhuang, Qi Wu:
ModaVerse: Efficiently Transforming Modalities with LLMs. CoRR abs/2401.06395 (2024)
- [i117] Jiazhao Zhang, Kunyu Wang, Rongtao Xu, Gengze Zhou, Yicong Hong, Xiaomeng Fang, Qi Wu, Zhizheng Zhang, He Wang:
NaVid: Video-based VLM Plans the Next Step for Vision-and-Language Navigation. CoRR abs/2402.15852 (2024)
- [i116] Vu Minh Hieu Phan, Yutong Xie, Yuankai Qi, Lingqiao Liu, Liyang Liu, Bowen Zhang, Zhibin Liao, Qi Wu, Minh-Son To, Johan W. Verjans:
Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Pre-training Framework. CoRR abs/2403.07636 (2024)
- [i115] Yanyuan Qiao, Zheng Yu, Longteng Guo, Sihan Chen, Zijia Zhao, Mingzhen Sun, Qi Wu, Jing Liu:
VL-Mamba: Exploring State Space Models for Multimodal Learning. CoRR abs/2403.13600 (2024)
- [i114] Yutong Xie, Qi Chen, Sinuo Wang, Minh-Son To, Iris Lee, Ee Win Khoo, Kerolos Hendy, Daniel Koh, Yong Xia, Qi Wu:
PairAug: What Can Augmented Image-Text Pairs Do for Radiology? CoRR abs/2404.04960 (2024)
- [i113] Zixiong Huang, Qi Chen, Libo Sun, Yifan Yang, Naizhou Wang, Mingkui Tan, Qi Wu:
G-NeRF: Geometry-enhanced Novel View Synthesis from Single-View Images. CoRR abs/2404.07474 (2024)
- [i112] Feng Chen, Zhen Yang, Bohan Zhuang, Qi Wu:
Streaming Video Diffusion: Online Video Editing with Diffusion Models. CoRR abs/2405.19726 (2024)
- [i111] Bahram Mohammadi, Yicong Hong, Yuankai Qi, Qi Wu, Shirui Pan, Javen Qinfeng Shi:
Augmented Commonsense Knowledge for Remote Object Grounding. CoRR abs/2406.01256 (2024)
- [i110] Haodong Hong, Sen Wang, Zi Huang, Qi Wu, Jiajun Liu:
Why Only Text: Empowering Vision-and-Language Navigation with Multi-modal Prompts. CoRR abs/2406.02208 (2024)
- [i109] Yue Zhang, Ziqiao Ma, Jialu Li, Yanyuan Qiao, Zun Wang, Joyce Chai, Qi Wu, Mohit Bansal, Parisa Kordjamshidi:
Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models. CoRR abs/2407.07035 (2024)
- [i108] Gengze Zhou, Yicong Hong, Zun Wang, Xin Eric Wang, Qi Wu:
NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models. CoRR abs/2407.12366 (2024)
- [i107] Xiangyan Qu, Jing Yu, Keke Gai, Jiamin Zhuang, Yuanmin Tang, Gang Xiong, Gaopeng Gou, Qi Wu:
Visual-Semantic Decomposition and Partial Alignment for Document-based Zero-Shot Learning. CoRR abs/2407.15613 (2024)
- [i106] Biao Wu, Yutong Xie, Zeyu Zhang, Minh Hieu Phan, Qi Chen, Ling Chen, Qi Wu:
XLIP: Cross-modal Attention Masked Modelling for Medical Language-Image Pre-Training. CoRR abs/2407.19546 (2024)
- [i105] Haodong Hong, Sen Wang, Zi Huang, Qi Wu, Jiajun Liu:
Navigating Beyond Instructions: Vision-and-Language Navigation in Obstructed Environments. CoRR abs/2407.21452 (2024)
- [i104] Yili Li, Jing Yu, Keke Gai, Bang Liu, Gang Xiong, Qi Wu:
T2VIndexer: A Generative Video Indexer for Efficient Text-Video Retrieval. CoRR abs/2408.11432 (2024)
- [i103] Yanyuan Qiao, Wenqi Lyu, Hui Wang, Zixu Wang, Zerui Li, Yuan Zhang, Mingkui Tan, Qi Wu:
Open-Nav: Exploring Zero-Shot Vision-and-Language Navigation in Continuous Environment with Open-Source LLMs. CoRR abs/2409.18794 (2024)
- [i102] Yuanmin Tang, Jing Yu, Keke Gai, Jiamin Zhuang, Gaopeng Gou, Gang Xiong, Qi Wu:
Denoise-I2W: Mapping Images to Denoising Words for Accurate Zero-Shot Composed Image Retrieval. CoRR abs/2410.17393 (2024)
- [i101] Ruoxi Sun, Jiamin Chang, Hammond Pearce, Chaowei Xiao, Bo Li, Qi Wu, Surya Nepal, Minhui Xue:
SoK: Unifying Cybersecurity and Cybersafety of Multimodal Foundation Models with an Information Theory Approach. CoRR abs/2411.11195 (2024)
- [i100] Liangqi Lei, Keke Gai, Jing Yu, Liehuang Zhu, Qi Wu:
Conceptwm: A Diffusion Model Watermark for Concept Protection. CoRR abs/2411.11688 (2024)
- [i99] Qi Chen, Ruoshan Zhao, Sinuo Wang, Vu Minh Hieu Phan, Anton van den Hengel, Johan Verjans, Zhibin Liao, Minh-Son To, Yong Xia, Jian Chen, Yutong Xie, Qi Wu:
A Survey of Medical Vision-and-Language Applications and Their Techniques. CoRR abs/2411.12195 (2024)
- [i98] Feng Chen, Chenhui Gou, Jing Liu, Yang Yang, Zhaoyang Li, Jiyuan Zhang, Zhenbang Sun, Bohan Zhuang, Qi Wu:
Evaluating and Advancing Multimodal Large Language Models in Ability Lens. CoRR abs/2411.14725 (2024)
- [i97] Gengze Zhou, Yicong Hong, Zun Wang, Chongyang Zhao, Mohit Bansal, Qi Wu:
SAME: Learning Generic Language-Guided Visual Navigation with State-Adaptive Mixture of Experts. CoRR abs/2412.05552 (2024)
- [i96] Yuanmin Tang, Xiaoting Qin, Jue Zhang, Jing Yu, Gaopeng Gou, Gang Xiong, Qingwei Ling, Saravan Rajmohan, Dongmei Zhang, Qi Wu:
Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval. CoRR abs/2412.11077 (2024)
- 2023
- [j30] Zhihong Lin, Donghao Zhang, Qingyi Tao, Danli Shi, Gholamreza Haffari, Qi Wu, Mingguang He, Zongyuan Ge:
Medical visual question answering: A survey. Artif. Intell. Medicine 143: 102611 (2023)
- [j29] Yanyuan Qiao, Yuankai Qi, Yicong Hong, Zheng Yu, Peng Wang, Qi Wu:
HOP+: History-Enhanced and Order-Aware Pre-Training for Vision-and-Language Navigation. IEEE Trans. Pattern Anal. Mach. Intell. 45(7): 8524-8537 (2023)
- [j28] Zihan Wang, Olivia Byrnes, Hu Wang, Ruoxi Sun, Congbo Ma, Huaming Chen, Qi Wu, Minhui Xue:
Data Hiding With Deep Learning: A Survey Unifying Digital Watermarking and Steganography. IEEE Trans. Comput. Soc. Syst. 10(6): 2985-2999 (2023)
- [j27] Mengge He, Wenjing Du, Zhiquan Wen, Qing Du, Yutong Xie, Qi Wu:
Multi-Granularity Aggregation Transformer for Joint Video-Audio-Text Representation Learning. IEEE Trans. Circuits Syst. Video Technol. 33(6): 2990-3002 (2023)
- [j26] Wei Suo, Mengyang Sun, Peng Wang, Yanning Zhang, Qi Wu:
Rethinking and Improving Feature Pyramids for One-Stage Referring Expression Comprehension. IEEE Trans. Image Process. 32: 854-864 (2023)
- [j25] Hao Li, Jinfa Huang, Peng Jin, Guoli Song, Qi Wu, Jie Chen:
Weakly-Supervised 3D Spatial Reasoning for Text-Based Visual Question Answering. IEEE Trans. Image Process. 32: 3367-3382 (2023)
- [j24] Mengyang Sun, Wei Suo, Peng Wang, Yanning Zhang, Qi Wu:
A Proposal-Free One-Stage Framework for Referring Expression Comprehension and Generation via Dense Cross-Attention. IEEE Trans. Multim. 25: 2446-2458 (2023)
- [c97] Zhiquan Wen, Yaowei Wang, Mingkui Tan, Qingyao Wu, Qi Wu:
Digging out Discrimination Information from Generated Samples for Robust Visual Question Answering. ACL (Findings) 2023: 6910-6928
- [c96] Wei Suo, Mengyang Sun, Weisong Liu, Yiqi Gao, Peng Wang, Yanning Zhang, Qi Wu:
S3C: Semi-Supervised VQA Natural Language Explanation via Self-Critical Learning. CVPR 2023: 2646-2656
- [c95] Gaoxiang Cong, Liang Li, Yuankai Qi, Zheng-Jun Zha, Qi Wu, Wenyu Wang, Bin Jiang, Ming-Hsuan Yang, Qingming Huang:
Learning to Dub Movies via Hierarchical Prosody Models. CVPR 2023: 14687-14697
- [c94] Cristian Rodriguez Opazo, Edison Marrese-Taylor, Basura Fernando, Hiroya Takamura, Qi Wu:
Memory-efficient Temporal Moment Localization in Long Videos. EACL 2023: 1901-1916
- [c93] Xi Tian, Yong-Liang Yang, Qi Wu:
ShapeScaffolder: Structure-Aware 3D Shape Generation from Text. ICCV 2023: 2715-2724
- [c92] Zun Wang, Jialu Li, Yicong Hong, Yi Wang, Qi Wu, Mohit Bansal, Stephen Gould, Hao Tan, Yu Qiao:
Scaling Data Generation in Vision-and-Language Navigation. ICCV 2023: 11975-11986
- [c91] Chaorui Deng, Da Chen, Qi Wu:
Identity-Consistent Aggregation for Video Object Detection. ICCV 2023: 13388-13398
- [c90] Shubo Liu, Hongsheng Zhang, Yuankai Qi, Peng Wang, Yanning Zhang, Qi Wu:
AerialVLN: Vision-and-Language Navigation for UAVs. ICCV 2023: 15338-15348
- [c89] Yanyuan Qiao, Zheng Yu, Qi Wu:
VLN-PETL: Parameter-Efficient Transfer Learning for Vision-and-Language Navigation. ICCV 2023: 15397-15406
- [c88] Chaorui Deng, Qi Chen, Pengda Qin, Da Chen, Qi Wu:
Prompt Switch: Efficient CLIP Adaptation for Text-Video Retrieval. ICCV 2023: 15602-15612
- [c87] Yanyuan Qiao, Yuankai Qi, Zheng Yu, Jing Liu, Qi Wu:
March in Chat: Interactive Prompting for Remote Embodied Referring Expression. ICCV 2023: 15712-15721
- [c86] Yutong Xie, Lin Gu, Tatsuya Harada, Jianpeng Zhang, Yong Xia, Qi Wu:
MedIM: Boost Medical Image Representation via Radiology Report-Guided Masking. MICCAI (1) 2023: 13-23
- [c85] Zheng Yu, Yutong Xie, Yong Xia, Qi Wu:
PLMVQA: Applying Pseudo Labels for Medical Visual Question Answering with Limited Data. MTSAIL/LEAF/AI4Treat/MMMI/REMIA@MICCAI 2023: 357-367
- [c84] Qingbiao Guan, Yutong Xie, Bing Yang, Jianpeng Zhang, Zhibin Liao, Qi Wu, Yong Xia:
Unpaired Cross-Modal Interaction Learning for COVID-19 Segmentation on Limited CT Images. MICCAI (3) 2023: 603-613
- [c83] Biao Wu, Yutong Xie, Zeyu Zhang, Jinchao Ge, Kaspar Yaxley, Suzan Bahadir, Qi Wu, Yifan Liu, Minh-Son To:
BHSD: A 3D Multi-class Brain Hemorrhage Segmentation Dataset. MLMI@MICCAI (1) 2023: 147-156
- [c82] Zheng Yu, Yanyuan Qiao, Yutong Xie, Qi Wu:
Multi-modal Adapter for Medical Vision-and-Language Learning. MLMI@MICCAI (1) 2023: 393-402
- [c81] Chongyang Zhao, Yuankai Qi, Qi Wu:
Mind the Gap: Improving Success Rate of Vision-and-Language Navigation by Revisiting Oracle Success Routes. ACM Multimedia 2023: 4349-4358
- [c80] Jingying Gao, Qi Wu, Alan Blair, Maurice Pagnucco:
LoRA: A Logical Reasoning Augmented Dataset for Visual Question Answering. NeurIPS 2023
- [i95] Anthony Manchin, Jamie Sherrah, Qi Wu, Anton van den Hengel:
Program Generation from Diverse Video Demonstrations. CoRR abs/2302.00178 (2023)
- [i94] Qi Chen, Yutong Xie, Biao Wu, Minh-Son To, James Ang, Qi Wu:
S4M: Generating Radiology Reports by A Single Model for Multiple Body Parts. CoRR abs/2305.16685 (2023)
- [i93] Gengze Zhou, Yicong Hong, Qi Wu:
NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models. CoRR abs/2305.16986 (2023)
- [i92] Yutong Xie, Bing Yang, Qingbiao Guan, Jianpeng Zhang, Qi Wu, Yong Xia:
Attention Mechanisms in Medical Image Segmentation: A Survey. CoRR abs/2305.17937 (2023)
- [i91] Zun Wang, Jialu Li, Yicong Hong, Yi Wang, Qi Wu, Mohit Bansal, Stephen Gould, Hao Tan, Yu Qiao:
Scaling Data Generation in Vision-and-Language Navigation. CoRR abs/2307.15644 (2023)
- [i90] Chongyang Zhao, Yuankai Qi, Qi Wu:
Mind the Gap: Improving Success Rate of Vision-and-Language Navigation by Revisiting Oracle Success Routes. CoRR abs/2308.03244 (2023)
- [i89] Shubo Liu, Hongsheng Zhang, Yuankai Qi, Peng Wang, Yaning Zhang, Qi Wu:
AerialVLN: Vision-and-Language Navigation for UAVs. CoRR abs/2308.06735 (2023)
- [i88] Chaorui Deng, Qi Chen, Pengda Qin, Da Chen, Qi Wu:
Prompt Switch: Efficient CLIP Adaptation for Text-Video Retrieval. CoRR abs/2308.07648 (2023)
- [i87] Chaorui Deng, Da Chen, Qi Wu:
Identity-Consistent Aggregation for Video Object Detection. CoRR abs/2308.07737 (2023)
- [i86] Qi Chen, Chaorui Deng, Zixiong Huang, Bowen Zhang, Mingkui Tan, Qi Wu:
Likelihood-Based Text-to-Image Evaluation with Patch-Level Perceptual and Semantic Credit Assignment. CoRR abs/2308.08525 (2023)
- [i85] Yanyuan Qiao, Yuankai Qi, Zheng Yu, Jing Liu, Qi Wu:
March in Chat: Interactive Prompting for Remote Embodied Referring Expression. CoRR abs/2308.10141 (2023)
- [i84] Yanyuan Qiao, Zheng Yu, Qi Wu:
VLN-PETL: Parameter-Efficient Transfer Learning for Vision-and-Language Navigation. CoRR abs/2308.10172 (2023)
- [i83] Biao Wu, Yutong Xie, Zeyu Zhang, Jinchao Ge, Kaspar Yaxley, Suzan Bahadir, Qi Wu, Yifan Liu, Minh-Son To:
BHSD: A 3D Multi-Class Brain Hemorrhage Segmentation Dataset. CoRR abs/2308.11298 (2023)
- [i82] Wei Suo, Mengyang Sun, Weisong Liu, Yiqi Gao, Peng Wang, Yanning Zhang, Qi Wu:
S3C: Semi-Supervised VQA Natural Language Explanation via Self-Critical Learning. CoRR abs/2309.02155 (2023)
- [i81] Xinyu Wang, Bohan Zhuang, Qi Wu:
SwitchGPT: Adapting Large Language Models for Non-Text Outputs. CoRR abs/2309.07623 (2023)
- [i80] Yuanmin Tang, Jing Yu, Keke Gai, Jiamin Zhuang, Gang Xiong, Yue Hu, Qi Wu:
Context-I2W: Mapping Images to Context-dependent Words for Accurate Zero-Shot Composed Image Retrieval. CoRR abs/2309.16137 (2023)
- [i79] Yuanmin Tang, Jing Yu, Keke Gai, Yujing Wang, Yue Hu, Gang Xiong, Qi Wu:
Align before Search: Aligning Ads Image to Text for Accurate Cross-Modal Sponsored Search. CoRR abs/2309.16141 (2023)
- [i78] Xiangyu Shi, Yanyuan Qiao, Qi Wu, Lingqiao Liu, Feras Dayoub:
Improving Online Source-free Domain Adaptation for Object Detection by Unsupervised Data Acquisition. CoRR abs/2310.19258 (2023)
- [i77] Yuanmin Tang, Jing Yu, Keke Gai, Xiangyan Qu, Yue Hu, Gang Xiong, Qi Wu:
Watermarking Vision-Language Pre-trained Models for Multi-modal Embedding as a Service. CoRR abs/2311.05863 (2023)
- [i76] Yiwen Ye, Yutong Xie, Jianpeng Zhang, Ziyang Chen, Qi Wu, Yong Xia:
Continual Self-supervised Learning: Towards Universal Multi-modal Medical Data Representation Learning. CoRR abs/2311.17597 (2023)
- [i75] Yunchuan Ma, Chang Teng, Yuankai Qi, Guorong Li, Laiyun Qing, Qi Wu, Qingming Huang:
Subject-Oriented Video Captioning. CoRR abs/2312.13330 (2023)
- [i74] Qi Chen, Dileepa Pitawela, Chongyang Zhao, Gengze Zhou, Hsiang-Ting Chen, Qi Wu:
WebVLN: Vision-and-Language Navigation on Websites. CoRR abs/2312.15820 (2023)
- 2022
- [b2] Qi Wu, Peng Wang, Xin Wang, Xiaodong He, Wenwu Zhu:
Visual Question Answering - From Theory to Application. Advances in Computer Vision and Pattern Recognition, Springer 2022, ISBN 978-981-19-0963-4, pp. 1-236
- [j23] Chaorui Deng, Qi Wu, Qingyao Wu, Fuyuan Hu, Fan Lyu, Mingkui Tan:
Visual Grounding Via Accumulated Attention. IEEE Trans. Pattern Anal. Mach. Intell. 44(3): 1670-1684 (2022)
- [j22] Chenyu Gao, Qi Zhu, Peng Wang, Hui Li, Yuliang Liu, Anton van den Hengel, Qi Wu:
Structured Multimodal Attentions for TextVQA. IEEE Trans. Pattern Anal. Mach. Intell. 44(12): 9603-9614 (2022)
- [j21] Zeren Sun, Huafeng Liu, Qiong Wang, Tianfei Zhou, Qi Wu, Zhenmin Tang:
Co-LDL: A Co-Training-Based Label Distribution Learning Method for Tackling Label Noise. IEEE Trans. Multim. 24: 1093-1104 (2022)
- [j20] Chuanyi Zhang, Qiong Wang, Guo-Sen Xie, Qi Wu, Fumin Shen, Zhenmin Tang:
Robust Learning From Noisy Web Images Via Data Purification for Fine-Grained Recognition. IEEE Trans. Multim. 24: 1198-1209 (2022)
- [j19] Amin Parvaneh, Ehsan Abbasnejad, Qi Wu, Qinfeng (Javen) Shi, Anton van den Hengel:
Show, Price and Negotiate: A Negotiator With Online Value Look-Ahead. IEEE Trans. Multim. 24: 1426-1434 (2022)
- [c79] Chenchen Jing, Yunde Jia, Yuwei Wu, Chuanhao Li, Qi Wu:
Learning the Dynamics of Visual Relational Reasoning via Reinforced Path Routing. AAAI 2022: 1122-1130
- [c78] Jing Gu, Eliana Stefani, Qi Wu, Jesse Thomason, Xin Wang:
Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions. ACL (1) 2022: 7606-7623
- [c77] Xi Tian, Yongliang Yang, Qi Wu:
Enhancing Person Synthesis in Complex Scenes via Intrinsic and Contextual Structure Modeling. BMVC 2022: 491
- [c76] Anthony Manchin, Jamie Sherrah, Qi Wu, Anton van den Hengel:
Program Generation from Diverse Video Demonstrations. BMVC 2022: 1039
- [c75] Yang Ding, Jing Yu, Bang Liu, Yue Hu, Mingxin Cui, Qi Wu:
MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering. CVPR 2022: 5079-5088
- [c74] Chenchen Jing, Yunde Jia, Yuwei Wu, Xinyu Liu, Qi Wu:
Maintaining Reasoning Consistency in Compositional Visual Question Answering. CVPR 2022: 5089-5098
- [c73] Yanyuan Qiao, Yuankai Qi, Yicong Hong, Zheng Yu, Peng Wang, Qi Wu:
HOP: History-and-Order Aware Pretraining for Vision-and-Language Navigation. CVPR 2022: 15397-15406
- [c72] Yicong Hong, Zun Wang, Qi Wu, Stephen Gould:
Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation. CVPR 2022: 15418-15428
- [c71] Qi Chen, Mingkui Tan, Yuankai Qi, Jiaqiu Zhou, Yuanqing Li