Shih-Fu Chang
Person information
- affiliation: Columbia University, New York City, USA
2020 – today
- 2024
- [c389] Hammad A. Ayyubi, Christopher Thomas, Lovish Chum, Rahul Lokesh, Long Chen, Yulei Niu, Xudong Lin, Xuande Feng, Jaywon Koo, Sounak Ray, Shih-Fu Chang: Beyond Grounding: Extracting Fine-Grained Event Hierarchies across Modalities. AAAI 2024: 17664-17672
- [c388] Kung-Hsiang Huang, Mingyang Zhou, Hou Pong Chan, Yi Fung, Zhenhailong Wang, Lingyu Zhang, Shih-Fu Chang, Heng Ji: Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning. ACL (Findings) 2024: 730-749
- [c387] Brian Chen, Nina Shvetsova, Andrew Rouditchenko, Daniel Kondermann, Samuel Thomas, Shih-Fu Chang, Rogério Feris, James R. Glass, Hilde Kuehne: What, When, and Where? Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions. CVPR 2024: 18419-18429
- [c386] Jiawei Ma, Po-Yao Huang, Saining Xie, Shang-Wen Li, Luke Zettlemoyer, Shih-Fu Chang, Wen-Tau Yih, Hu Xu: MoDE: CLIP Data Experts via Clustering. CVPR 2024: 26344-26353
- [c385] Yulei Niu, Wenliang Guo, Long Chen, Xudong Lin, Shih-Fu Chang: SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos. ICLR 2024
- [c384] Haoxuan You, Haotian Zhang, Zhe Gan, Xianzhi Du, Bowen Zhang, Zirui Wang, Liangliang Cao, Shih-Fu Chang, Yinfei Yang: Ferret: Refer and Ground Anything Anywhere at Any Granularity. ICLR 2024
- [i129] Yulei Niu, Wenliang Guo, Long Chen, Xudong Lin, Shih-Fu Chang: SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos. CoRR abs/2403.01599 (2024)
- [i128] Kung-Hsiang Huang, Hou Pong Chan, Yi Ren Fung, Haoyi Qiu, Mingyang Zhou, Shafiq Joty, Shih-Fu Chang, Heng Ji: From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Models. CoRR abs/2403.12027 (2024)
- [i127] Ali Zare, Yulei Niu, Hammad A. Ayyubi, Shih-Fu Chang: RAP: Retrieval-Augmented Planner for Adaptive Procedure Planning in Instructional Videos. CoRR abs/2403.18600 (2024)
- [i126] Haotian Zhang, Haoxuan You, Philipp Dufter, Bowen Zhang, Chen Chen, Hong-You Chen, Tsu-Jui Fu, William Yang Wang, Shih-Fu Chang, Zhe Gan, Yinfei Yang: Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models. CoRR abs/2404.07973 (2024)
- [i125] Jiawei Ma, Po-Yao Huang, Saining Xie, Shang-Wen Li, Luke Zettlemoyer, Shih-Fu Chang, Wen-Tau Yih, Hu Xu: MoDE: CLIP Data Experts via Clustering. CoRR abs/2404.16030 (2024)
- [i124] Junzhang Liu, Zhecan Wang, Hammad A. Ayyubi, Haoxuan You, Christopher Thomas, Rui Sun, Shih-Fu Chang, Kai-Wei Chang: Detecting Multimodal Situations with Insufficient Context and Abstaining from Baseless Predictions. CoRR abs/2405.11145 (2024)
- [i123] Jiawei Ma, Yulei Niu, Shiyuan Huang, Guangxing Han, Shih-Fu Chang: WIDIn: Wording Image for Domain-Invariant Representation in Single-Source Domain Generalization. CoRR abs/2405.18405 (2024)
- 2023
- [c383] Guang Yang, Manling Li, Jiajie Zhang, Xudong Lin, Heng Ji, Shih-Fu Chang: Video Event Extraction via Tracking Visual States of Arguments. AAAI 2023: 3136-3144
- [c382] Rui Sun, Zhecan Wang, Haoxuan You, Noel Codella, Kai-Wei Chang, Shih-Fu Chang: UniFine: A Unified and Fine-grained Approach for Zero-shot Vision-Language Understanding. ACL (Findings) 2023: 778-793
- [c381] Mingyang Zhou, Yi Ren Fung, Long Chen, Christopher Thomas, Heng Ji, Shih-Fu Chang: Enhanced Chart Understanding via Visual Language Pre-training on Plot Table Pairs. ACL (Findings) 2023: 1314-1326
- [c380] Yu Zhou, Sha Li, Manling Li, Xudong Lin, Shih-Fu Chang, Mohit Bansal, Heng Ji: Non-Sequential Graph Script Induction via Multimedia Grounding. ACL (1) 2023: 5529-5545
- [c379] Hammad A. Ayyubi, Rahul Lokesh, Alireza Zareian, Bo Wu, Shih-Fu Chang: Learning from Children: Improving Image-Caption Pretraining via Curriculum. ACL (Findings) 2023: 13378-13386
- [c378] Jiawei Ma, Yulei Niu, Jincheng Xu, Shiyuan Huang, Guangxing Han, Shih-Fu Chang: DiGeo: Discriminative Geometry-Aware Learning for Generalized Few-Shot Object Detection. CVPR 2023: 3208-3218
- [c377] Andrew Lu, Xudong Lin, Yulei Niu, Shih-Fu Chang: In Defense of Structural Symbolic Representation for Video Event-Relation Prediction. CVPR Workshops 2023: 4940-4950
- [c376] Hung-Ting Su, Yulei Niu, Xudong Lin, Winston H. Hsu, Shih-Fu Chang: Language Models are Causal Knowledge Extractors for Zero-shot Video Question Answering. CVPR Workshops 2023: 4951-4960
- [c375] Xudong Lin, Simran Tiwari, Shiyuan Huang, Manling Li, Mike Zheng Shou, Heng Ji, Shih-Fu Chang: Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval. CVPR 2023: 14846-14855
- [c374] Han Lin, Guangxing Han, Jiawei Ma, Shiyuan Huang, Xudong Lin, Shih-Fu Chang: Supervised Masked Knowledge Distillation for Few-Shot Transformers. CVPR 2023: 19649-19659
- [c373] Zhecan Wang, Long Chen, Haoxuan You, Keyang Xu, Yicheng He, Wenhao Li, Noel Codella, Kai-Wei Chang, Shih-Fu Chang: Dataset Bias Mitigation in Multiple-Choice Visual Question Answering and Beyond. EMNLP (Findings) 2023: 8598-8617
- [c372] Haoxuan You, Rui Sun, Zhecan Wang, Long Chen, Gengyu Wang, Hammad A. Ayyubi, Kai-Wei Chang, Shih-Fu Chang: IdealGPT: Iteratively Decomposing Vision and Language Reasoning via Large Language Models. EMNLP (Findings) 2023: 11289-11303
- [c371] Yuncong Yang, Jiawei Ma, Shiyuan Huang, Long Chen, Xudong Lin, Guangxing Han, Shih-Fu Chang: TempCLR: Temporal Alignment Representation with Contrastive Learning. ICLR 2023
- [c370] Brian Chen, Ramprasaath R. Selvaraju, Shih-Fu Chang, Juan Carlos Niebles, Nikhil Naik: PreViTS: Contrastive Pretraining with Video Tracking Supervision. WACV 2023: 1560-1570
- [i122] Andrew Lu, Xudong Lin, Yulei Niu, Shih-Fu Chang: In Defense of Structural Symbolic Representation for Video Event-Relation Prediction. CoRR abs/2301.03410 (2023)
- [i121] Jiawei Ma, Yulei Niu, Jincheng Xu, Shiyuan Huang, Guangxing Han, Shih-Fu Chang: DiGeo: Discriminative Geometry-Aware Learning for Generalized Few-Shot Object Detection. CoRR abs/2303.09674 (2023)
- [i120] Han Lin, Guangxing Han, Jiawei Ma, Shiyuan Huang, Xudong Lin, Shih-Fu Chang: Supervised Masked Knowledge Distillation for Few-Shot Transformers. CoRR abs/2303.15466 (2023)
- [i119] Brian Chen, Nina Shvetsova, Andrew Rouditchenko, Daniel Kondermann, Samuel Thomas, Shih-Fu Chang, Rogério Feris, James R. Glass, Hilde Kuehne: What, when, and where? - Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions. CoRR abs/2303.16990 (2023)
- [i118] Hung-Ting Su, Yulei Niu, Xudong Lin, Winston H. Hsu, Shih-Fu Chang: Language Models are Causal Knowledge Extractors for Zero-shot Video Question Answering. CoRR abs/2304.03754 (2023)
- [i117] Haoxuan You, Rui Sun, Zhecan Wang, Long Chen, Gengyu Wang, Hammad A. Ayyubi, Kai-Wei Chang, Shih-Fu Chang: IdealGPT: Iteratively Decomposing Vision and Language Reasoning via Large Language Models. CoRR abs/2305.14985 (2023)
- [i116] Hammad A. Ayyubi, Rahul Lokesh, Alireza Zareian, Bo Wu, Shih-Fu Chang: Learning from Children: Improving Image-Caption Pretraining via Curriculum. CoRR abs/2305.17540 (2023)
- [i115] Yu Zhou, Sha Li, Manling Li, Xudong Lin, Shih-Fu Chang, Mohit Bansal, Heng Ji: Non-Sequential Graph Script Induction via Multimedia Grounding. CoRR abs/2305.17542 (2023)
- [i114] Mingyang Zhou, Yi R. Fung, Long Chen, Christopher Thomas, Heng Ji, Shih-Fu Chang: Enhanced Chart Understanding in Vision and Language Task via Cross-modal Pre-training on Plot Table Pairs. CoRR abs/2305.18641 (2023)
- [i113] Rui Sun, Zhecan Wang, Haoxuan You, Noel Codella, Kai-Wei Chang, Shih-Fu Chang: UniFine: A Unified and Fine-grained Approach for Zero-shot Vision-Language Understanding. CoRR abs/2307.00862 (2023)
- [i112] Haoxuan You, Haotian Zhang, Zhe Gan, Xianzhi Du, Bowen Zhang, Zirui Wang, Liangliang Cao, Shih-Fu Chang, Yinfei Yang: Ferret: Refer and Ground Anything Anywhere at Any Granularity. CoRR abs/2310.07704 (2023)
- [i111] Zhecan Wang, Long Chen, Haoxuan You, Keyang Xu, Yicheng He, Wenhao Li, Noel Codella, Kai-Wei Chang, Shih-Fu Chang: Dataset Bias Mitigation in Multiple-Choice Visual Question Answering and Beyond. CoRR abs/2310.14670 (2023)
- [i110] Shiyuan Huang, Robinson Piramuthu, Vicente Ordonez, Shih-Fu Chang, Gunnar A. Sigurdsson: Characterizing Video Question Answering with Sparsified Inputs. CoRR abs/2311.16311 (2023)
- [i109] Hammad A. Ayyubi, Tianqi Liu, Arsha Nagrani, Xudong Lin, Mingda Zhang, Anurag Arnab, Feng Han, Yukun Zhu, Jialu Liu, Shih-Fu Chang: Video Summarization: Towards Entity-Aware Captions. CoRR abs/2312.02188 (2023)
- [i108] Kung-Hsiang Huang, Mingyang Zhou, Hou Pong Chan, Yi R. Fung, Zhenhailong Wang, Lingyu Zhang, Shih-Fu Chang, Heng Ji: Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning. CoRR abs/2312.10160 (2023)
- 2022
- [j115] Mang Ye, Jianbing Shen, Xu Zhang, Pong C. Yuen, Shih-Fu Chang: Augmentation Invariant and Instance Spreading Feature for Softmax Embedding. IEEE Trans. Pattern Anal. Mach. Intell. 44(2): 924-939 (2022)
- [j114] Zhizheng Zhang, Cuiling Lan, Wenjun Zeng, Zhibo Chen, Shih-Fu Chang: Beyond Triplet Loss: Meta Prototypical N-Tuple Loss for Person Re-identification. IEEE Trans. Multim. 24: 4158-4169 (2022)
- [c369] Guangxing Han, Shiyuan Huang, Jiawei Ma, Yicheng He, Shih-Fu Chang: Meta Faster R-CNN: Towards Accurate Few-Shot Object Detection with Attentive Feature Alignment. AAAI 2022: 780-789
- [c368] Zhecan Wang, Haoxuan You, Liunian Harold Li, Alireza Zareian, Suji Park, Yiqing Liang, Kai-Wei Chang, Shih-Fu Chang: SGEITL: Scene Graph Enhanced Image-Text Learning for Visual Commonsense Reasoning. AAAI 2022: 5914-5922
- [c367] Revanth Gangi Reddy, Xilin Rui, Manling Li, Xudong Lin, Haoyang Wen, Jaemin Cho, Lifu Huang, Mohit Bansal, Avirup Sil, Shih-Fu Chang, Alexander G. Schwing, Heng Ji: MuMuQA: Multimedia Multi-Hop News Question Answering via Cross-Media Knowledge Extraction and Grounding. AAAI 2022: 11200-11208
- [c366] Jianhang Chen, Xu Zhang, Yue Wu, Shalini Ghosh, Pradeep Natarajan, Shih-Fu Chang, Jan P. Allebach: One-Stage Object Referring with Gaze Estimation. CVPR Workshops 2022: 5017-5026
- [c365] Guangxing Han, Jiawei Ma, Shiyuan Huang, Long Chen, Shih-Fu Chang: Few-Shot Object Detection with Fully Cross-Transformer. CVPR 2022: 5311-5320
- [c364] Shiyuan Huang, Jiawei Ma, Guangxing Han, Shih-Fu Chang: Task-Adaptive Negative Envision for Few-Shot Open-Set Recognition. CVPR 2022: 7161-7170
- [c363] Xudong Lin, Fabio Petroni, Gedas Bertasius, Marcus Rohrbach, Shih-Fu Chang, Lorenzo Torresani: Learning To Recognize Procedural Activities with Distant Supervision. CVPR 2022: 13843-13853
- [c362] Manling Li, Ruochen Xu, Shuohang Wang, Luowei Zhou, Xudong Lin, Chenguang Zhu, Michael Zeng, Heng Ji, Shih-Fu Chang: CLIP-Event: Connecting Text and Images with Event Structures. CVPR 2022: 16399-16408
- [c361] Jiawei Ma, Guangxing Han, Shiyuan Huang, Yuncong Yang, Shih-Fu Chang: Few-Shot End-to-End Object Detection via Constantly Concentrated Encoding Across Heads. ECCV (26) 2022: 57-73
- [c360] Haoxuan You, Luowei Zhou, Bin Xiao, Noel Codella, Yu Cheng, Ruochen Xu, Shih-Fu Chang, Lu Yuan: Learning Visual Representation from Modality-Shared Contrastive Language-Image Pre-training. ECCV (27) 2022: 69-87
- [c359] Christopher Thomas, Yipeng Zhang, Shih-Fu Chang: Fine-Grained Visual Entailment. ECCV (36) 2022: 398-416
- [c358] Shiyuan Huang, Robinson Piramuthu, Shih-Fu Chang, Gunnar A. Sigurdsson: Video in 10 Bits: Few-Bit VideoQA for Efficiency and Privacy. ECCV Workshops (5) 2022: 738-754
- [c357] Haoxuan You, Rui Sun, Zhecan Wang, Kai-Wei Chang, Shih-Fu Chang: Find Someone Who: Visual Commonsense Understanding in Human-Centric Grounding. EMNLP (Findings) 2022: 5444-5454
- [c356] Zhecan Wang, Haoxuan You, Yicheng He, Wenhao Li, Kai-Wei Chang, Shih-Fu Chang: Understanding ME? Multimodal Evaluation for Fine-grained Visual Commonsense. EMNLP 2022: 9212-9224
- [c355] Long Chen, Yulei Niu, Brian Chen, Xudong Lin, Guangxing Han, Christopher Thomas, Hammad A. Ayyubi, Heng Ji, Shih-Fu Chang: Weakly-Supervised Temporal Article Grounding. EMNLP 2022: 9402-9413
- [c354] Gourav Datta, Tyler Etchart, Vivek Yadav, Varsha Hedau, Pradeep Natarajan, Shih-Fu Chang: ASD-Transformer: Efficient Active Speaker Detection Using Self and Multimodal Transformers. ICASSP 2022: 4568-4572
- [c353] Jiawei Ma, Xu Zhang, Yue Wu, Varsha Hedau, Shih-Fu Chang: Few-Shot Gaze Estimation with Model Offset Predictors. ICASSP 2022: 4893-4897
- [c352] Zhenhailong Wang, Manling Li, Ruochen Xu, Luowei Zhou, Jie Lei, Xudong Lin, Shuohang Wang, Ziyi Yang, Chenguang Zhu, Derek Hoiem, Shih-Fu Chang, Mohit Bansal, Heng Ji: Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners. NeurIPS 2022
- [i107] Manling Li, Ruochen Xu, Shuohang Wang, Luowei Zhou, Xudong Lin, Chenguang Zhu, Michael Zeng, Heng Ji, Shih-Fu Chang: CLIP-Event: Connecting Text and Images with Event Structures. CoRR abs/2201.05078 (2022)
- [i106] Zhecan Wang, Noel Codella, Yen-Chun Chen, Luowei Zhou, Jianwei Yang, Xiyang Dai, Bin Xiao, Haoxuan You, Shih-Fu Chang, Lu Yuan: CLIP-TD: CLIP Targeted Distillation for Vision-Language Tasks. CoRR abs/2201.05729 (2022)
- [i105] Xudong Lin, Fabio Petroni, Gedas Bertasius, Marcus Rohrbach, Shih-Fu Chang, Lorenzo Torresani: Learning To Recognize Procedural Activities with Distant Supervision. CoRR abs/2201.10990 (2022)
- [i104] Guangxing Han, Jiawei Ma, Shiyuan Huang, Long Chen, Shih-Fu Chang: Few-Shot Object Detection with Fully Cross-Transformer. CoRR abs/2203.15021 (2022)
- [i103] Christopher Thomas, Yipeng Zhang, Shih-Fu Chang: Fine-Grained Visual Entailment. CoRR abs/2203.15704 (2022)
- [i102] Guangxing Han, Jiawei Ma, Shiyuan Huang, Long Chen, Rama Chellappa, Shih-Fu Chang: Multimodal Few-Shot Object Detection with Meta-Learning Based Cross-Modal Prompting. CoRR abs/2204.07841 (2022)
- [i101] Zhecan Wang, Noel Codella, Yen-Chun Chen, Luowei Zhou, Xiyang Dai, Bin Xiao, Jianwei Yang, Haoxuan You, Kai-Wei Chang, Shih-Fu Chang, Lu Yuan: Multimodal Adaptive Distillation for Leveraging Unimodal Encoders for Vision-Language Tasks. CoRR abs/2204.10496 (2022)
- [i100] Zhenhailong Wang, Manling Li, Ruochen Xu, Luowei Zhou, Jie Lei, Xudong Lin, Shuohang Wang, Ziyi Yang, Chenguang Zhu, Derek Hoiem, Shih-Fu Chang, Mohit Bansal, Heng Ji: Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners. CoRR abs/2205.10747 (2022)
- [i99] Xudong Lin, Simran Tiwari, Shiyuan Huang, Manling Li, Mike Zheng Shou, Heng Ji, Shih-Fu Chang: Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval. CoRR abs/2206.02082 (2022)
- [i98] Hammad A. Ayyubi, Christopher Thomas, Lovish Chum, Rahul Lokesh, Yulei Niu, Xudong Lin, Long Chen, Jaywon Koo, Sounak Ray, Shih-Fu Chang: Multimodal Event Graphs: Towards Event Centric Understanding of Multimodal World. CoRR abs/2206.07207 (2022)
- [i97] Haoxuan You, Luowei Zhou, Bin Xiao, Noel Codella, Yu Cheng, Ruochen Xu, Shih-Fu Chang, Lu Yuan: Learning Visual Representation from Modality-Shared Contrastive Language-Image Pre-training. CoRR abs/2207.12661 (2022)
- [i96] Shiyuan Huang, Robinson Piramuthu, Shih-Fu Chang, Gunnar A. Sigurdsson: Video in 10 Bits: Few-Bit VideoQA for Efficiency and Privacy. CoRR abs/2210.08391 (2022)
- [i95] Long Chen, Yulei Niu, Brian Chen, Xudong Lin, Guangxing Han, Christopher Thomas, Hammad A. Ayyubi, Heng Ji, Shih-Fu Chang: Weakly-Supervised Temporal Article Grounding. CoRR abs/2210.12444 (2022)
- [i94] Guang Yang, Manling Li, Jiajie Zhang, Xudong Lin, Shih-Fu Chang, Heng Ji: Video Event Extraction via Tracking Visual States of Arguments. CoRR abs/2211.01781 (2022)
- [i93] Zhecan Wang, Haoxuan You, Yicheng He, Wenhao Li, Kai-Wei Chang, Shih-Fu Chang: Understanding ME? Multimodal Evaluation for Fine-grained Visual Commonsense. CoRR abs/2211.05895 (2022)
- [i92] Haoxuan You, Rui Sun, Zhecan Wang, Kai-Wei Chang, Shih-Fu Chang: Find Someone Who: Visual Commonsense Understanding in Human-Centric Grounding. CoRR abs/2212.06971 (2022)
- [i91] Yuncong Yang, Jiawei Ma, Shiyuan Huang, Long Chen, Xudong Lin, Guangxing Han, Shih-Fu Chang: TempCLR: Temporal Alignment Representation with Contrastive Learning. CoRR abs/2212.13738 (2022)
- 2021
- [j113] Yulei Niu, Hanwang Zhang, Zhiwu Lu, Shih-Fu Chang: Variational Context: Exploiting Visual and Textual Context for Grounding Referring Expressions. IEEE Trans. Pattern Anal. Mach. Intell. 43(1): 347-359 (2021)
- [c351] Long Chen, Wenbo Ma, Jun Xiao, Hanwang Zhang, Shih-Fu Chang: Ref-NMS: Breaking Proposal Bottlenecks in Two-Stage Referring Expression Grounding. AAAI 2021: 1036-1044
- [c350] Yi R. Fung, Christopher Thomas, Revanth Gangi Reddy, Sandeep Polisetty, Heng Ji, Shih-Fu Chang, Kathleen R. McKeown, Mohit Bansal, Avi Sil: InfoSurgeon: Cross-Media Fine-grained Information Consistency Checking for Fake News Detection. ACL/IJCNLP (1) 2021: 1683-1698
- [c349] Sijie Song, Xudong Lin, Jiaying Liu, Zongming Guo, Shih-Fu Chang: Co-Grounding Networks With Semantic Attention for Referring Expression Comprehension in Videos. CVPR 2021: 1346-1355
- [c348] Xudong Lin, Gedas Bertasius, Jue Wang, Shih-Fu Chang, Devi Parikh, Lorenzo Torresani: Vx2Text: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs. CVPR 2021: 7005-7015
- [c347] Alireza Zareian, Kevin Dela Rosa, Derek Hao Hu, Shih-Fu Chang: Open-Vocabulary Object Detection Using Captions. CVPR 2021: 14393-14402
- [c346] Brian Chen, Xudong Lin, Christopher Thomas, Manling Li, Shoya Yoshida, Lovish Chum, Heng Ji, Shih-Fu Chang: Joint Multimedia Event Extraction from Video and Article. EMNLP (Findings) 2021: 74-88
- [c345] Guangxing Han, Yicheng He, Shiyuan Huang, Jiawei Ma, Shih-Fu Chang: Query Adaptive Few-Shot Object Detection with Heterogeneous Graph Convolutional Networks. ICCV 2021: 3243-3252
- [c344] Brian Chen, Andrew Rouditchenko, Kevin Duarte, Hilde Kuehne, Samuel Thomas, Angie W. Boggust, Rameswar Panda, Brian Kingsbury, Rogério Feris, David Harwath, James R. Glass, Michael Picheny, Shih-Fu Chang: Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos. ICCV 2021: 7992-8001
- [c343] Jiawei Ma, Hanchen Xie, Guangxing Han, Shih-Fu Chang, Aram Galstyan, Wael Abd-Almageed: Partner-Assisted Learning for Few-Shot Image Classification. ICCV 2021: 10553-10562
- [c342] Zhizheng Zhang, Cuiling Lan, Wenjun Zeng, Zhibo Chen, Shih-Fu Chang: Uncertainty-Aware Few-Shot Image Classification. IJCAI 2021: 3420-3426
- [c341] Qingyun Wang, Manling Li, Xuan Wang, Nikolaus Nova Parulian, Guangxing Han, Jiawei Ma, Jingxuan Tu, Ying Lin, Haoran Zhang, Weili Liu, Aabhas Chauhan, Yingjun Guan, Bangzheng Li, Ruisong Li, Xiangchen Song, Yi R. Fung, Heng Ji, Jiawei Han, Shih-Fu Chang, James Pustejovsky, Jasmine Rah, David Liem, Ahmed Elsayed, Martha Palmer, Clare R. Voss, Cynthia Schneider, Boyan A. Onyshkevych: COVID-19 Literature Knowledge Graph Construction and Drug Repurposing Report Generation. NAACL-HLT (Demonstrations) 2021: 66-77
- [c340] Haoyang Wen, Ying Lin, Tuan Manh Lai, Xiaoman Pan, Sha Li, Xudong Lin, Ben Zhou, Manling Li, Haoyu Wang, Hongming Zhang, Xiaodong Yu, Alexander Dong, Zhenhailong Wang, Yi Ren Fung, Piyush Mishra, Qing Lyu, Dídac Surís, Brian Chen, Susan Windisch Brown, Martha Palmer, Chris Callison-Burch, Carl Vondrick, Jiawei Han, Dan Roth, Shih-Fu Chang, Heng Ji: RESIN: A Dockerized Schema-Guided Cross-document Cross-lingual Cross-media Information Extraction and Event Tracking System. NAACL-HLT (Demonstrations) 2021: 133-143
- [c339] Liunian Harold Li, Haoxuan You, Zhecan Wang, Alireza Zareian, Shih-Fu Chang, Kai-Wei Chang: Unsupervised Vision-and-Language Pre-training Without Parallel Images and Captions. NAACL-HLT 2021: 5339-5350
- [c338] Hassan Akbari, Liangzhe Yuan, Rui Qian, Wei-Hong Chuang, Shih-Fu Chang, Yin Cui, Boqing Gong: VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text. NeurIPS 2021: 24206-24221
- [i90] Xudong Lin, Gedas Bertasius, Jue Wang, Shih-Fu Chang, Devi Parikh, Lorenzo Torresani: VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs. CoRR abs/2101.12059 (2021)
- [i89] Sijie Song, Xudong Lin, Jiaying Liu, Zongming Guo, Shih-Fu Chang: Co-Grounding Networks with Semantic Attention for Referring Expression Comprehension in Videos. CoRR abs/2103.12346 (2021)
- [i88] Guangxing Han, Shiyuan Huang, Jiawei Ma, Yicheng He, Shih-Fu Chang: Meta Faster R-CNN: Towards Accurate Few-Shot Object Detection with Attentive Feature Alignment. CoRR abs/2104.07719 (2021)
- [i87] Hassan Akbari, Liangzhe Yuan, Rui Qian, Wei-Hong Chuang, Shih-Fu Chang, Yin Cui, Boqing Gong: VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text. CoRR abs/2104.11178 (2021)
- [i86] Brian Chen, Andrew Rouditchenko, Kevin Duarte, Hilde Kuehne, Samuel Thomas, Angie W. Boggust, Rameswar Panda, Brian Kingsbury, Rogério Schmidt Feris, David Harwath, James R. Glass, Michael Picheny, Shih-Fu Chang: Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos. CoRR abs/2104.12671 (2021)
- [i85] Jiawei Ma, Hanchen Xie, Guangxing Han, Shih-Fu Chang, Aram Galstyan, Wael Abd-Almageed: Partner-Assisted Learning for Few-Shot Image Classification. CoRR abs/2109.07607 (2021)
- [i84] Brian Chen, Xudong Lin, Christopher Thomas, Manling Li, Shoya Yoshida, Lovish Chum, Heng Ji, Shih-Fu Chang: Joint Multimedia Event Extraction from Video and Article. CoRR abs/2109.12776 (2021)
- [i83] Brian Chen, Ramprasaath R. Selvaraju, Shih-Fu Chang, Juan Carlos Niebles, Nikhil Naik: PreViTS: Contrastive Pretraining with Video Tracking Supervision. CoRR abs/2112.00804 (2021)
- [i82] Zhecan Wang, Haoxuan You, Liunian Harold Li, Alireza Zareian, Suji Park, Yiqing Liang, Kai-Wei Chang, Shih-Fu Chang: SGEITL: Scene Graph Enhanced Image-Text Learning for Visual Commonsense Reasoning. CoRR abs/2112.08587 (2021)
- [i81] Guangxing Han, Yicheng He, Shiyuan Huang, Jiawei Ma, Shih-Fu Chang: Query Adaptive Few-Shot Object Detection with Heterogeneous Graph Convolutional Networks. CoRR abs/2112.09791 (2021)
- [i80] Revanth Gangi Reddy, Xilin Rui, Manling Li, Xudong Lin, Haoyang Wen, Jaemin Cho, Lifu Huang, Mohit Bansal, Avirup Sil, Shih-Fu Chang, Alexander G. Schwing, Heng Ji: MuMuQA: Multimedia Multi-Hop News Question Answering via Cross-Media Knowledge Extraction and Grounding. CoRR abs/2112.10728 (2021)
- 2020
- [j112] Xu Zhang, Zhaohui H. Sun, Svebor Karaman, Shih-Fu Chang: Discovering Image Manipulation History by Pairwise Relation and Forensics Tools. IEEE J. Sel. Top. Signal Process. 14(5): 1012-1023 (2020)
- [c337] Brian Chen, Bo Wu, Alireza Zareian, Hanwang Zhang, Shih-Fu Chang: General Partial Label Learning via Dual Bipartite Graph Autoencoder. AAAI 2020: 10502-10509
- [c336] Manling Li, Alireza Zareian, Ying Lin, Xiaoman Pan, Spencer Whitehead, Brian Chen, Bo Wu, Heng Ji, Shih-Fu Chang, Clare R. Voss, Daniel Napierski, Marjorie Freedman: GAIA: A Fine-grained Multimedia Knowledge Extraction System. ACL (demo) 2020: 77-86
- [c335] Manling Li, Alireza Zareian, Qi Zeng, Spencer Whitehead, Di Lu, Heng Ji, Shih-Fu Chang: Cross-media Structured Common Space for Multimedia Event Extraction. ACL 2020: 2557-2568
- [c334] Alireza Zareian, Svebor Karaman, Shih-Fu Chang: Weakly Supervised Visual Semantic Parsing. CVPR 2020: 3733-3742
- [c333] Dídac Surís, Dave Epstein, Heng Ji, Shih-Fu Chang, Carl Vondrick: Learning to Learn Words from Visual Scenes. ECCV (29) 2020: 434-452
- [c332] Alireza Zareian, Svebor Karaman, Shih-Fu Chang: Bridging Knowledge Graphs to Generate Scene Graphs. ECCV (23) 2020: 606-623
- [c331]