For questions or support, please contact @Mohamed Elhoseiny.
Fifth workshop on Closing the Loop Between Vision and Language, hosted by #ICCV2023.
Paper ID | Paper Title | Author Names |
---|---|---|
3 | Sparse Linear Concept Discovery Models (oral) | Panousis, Konstantinos*; Ienco, Dino; Marcos, Diego |
7 | Compositional Image Search with Progressive Vision-language Alignment and Multimodal Fusion | Hu, Zhizhang*; Zhu, Xinliang; Tran, Son; Vidal, Rene; Dhua, Arnab |
8 | Vision-Language Models Performing Zero-Shot Tasks Exhibit Disparities Between Gender Groups | Hall, Melissa*; Gustafson, Laura; Adcock, Aaron; Misra, Ishan; Ross, Candace |
11 | BiLMa: Bidirectional Local-Matching for Text-based Person Re-identification | Fujii, Takuro*; Tarashima, Shuhei |
12 | Alignment and Generation Adapter for Efficient Video-text Understanding | Fang, Han*; Yang, Zhifei; Wei, Yuhan; Zang, Xianghao; Ban, Chao; Feng, Zerun; He, Zhongjiang; Li, Yongxiang; Sun, Hao |
13 | LLaViLo: Boosting Video Moment Retrieval via Adapter-Based Multimodal Modeling | Ma, Kaijing; Zang, Xianghao; Feng, Zerun; Fang, Han*; Ban, Chao; Wei, Yuhan; He, Zhongjiang; Li, Yongxiang; Sun, Hao |
16 | Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts (oral) | Engin, Deniz*; Avrithis, Yannis |
19 | ECO: Ensembling Context Optimization for Vision-Language Models | Agnolucci, Lorenzo*; Baldrati, Alberto; Todino, Francesco; Becattini, Federico; Bertini, Marco; Del Bimbo, Alberto |
20 | A Cross-Dataset Study on the Brazilian Sign Language Translation | Sarmento, Amanda H A*; Ponti, Moacir A |
22 | Context-VQA: Towards Context-Aware and Purposeful Visual Question Answering | Naik, Nandita S*; Potts, Christopher; Kreiss, Elisa |
25 | Explaining Vision and Language through Graphs of Events in Space and Time | Masala, Mihai LLP; Cudlenco, Nicolae; Rebedea, Traian; Leordeanu, Marius* |
27 | Mapping Memes to Words for Multimodal Hateful Meme Classification | Burbi, Giovanni; Baldrati, Alberto*; Agnolucci, Lorenzo; Bertini, Marco; Del Bimbo, Alberto |
28 | Cross-Modal Dense Passage Retrieval for Open Knowledge Visual Question Answering | Reichman, Benjamin*; Heck, Larry |
29 | PatFig: Generating Short and Long Captions for Patent Figures | Aubakirova, Dana*; Gerdes, Kim; Liu, Lufei |
30 | An empirical study of the effect of video encoders on Temporal Video Grounding | Meza, Ignacio A*; Rodriguez, Cristian; Marrese-Taylor, Edison; Bravo-Marquez, Felipe |
31 | Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP | Palit, Vedant; Pandey, Rohan*; Arora, Aryaman; Liang, Paul Pu |
32 | Multimodal Neurons in Pretrained Text-Only Transformers (oral) | Schwettmann, Sarah*; Chowdhury, Neil; Klein, Samuel J; Bau, David; Torralba, Antonio |
6 | VQA Therapy: Exploring Answer Differences by Visually Grounding Answers | Chen, Chongyan*; Anjum, Samreen; Gurari, Danna |
9 | HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models | Abdelrahman, Eslam Mohamed*; Sun, Pengzhan; Shen, Xiaoqian; Khan, Faizan Farooq; Li, Li Erran; Elhoseiny, Mohamed |
14 | In-Style: Bridging Text and Uncurated Videos with Style Transfer for Text-Video Retrieval | Shvetsova, Nina*; Kukleva, Anna; Schiele, Bernt; Kuehne, Hilde |
23 | Pretrained Language Models as Visual Planners for Human Assistance | Patel, Dhruvesh; Eghbalzadeh, Hamid; Chen, Brian; Kamra, Nitin; Iuzzolino, Michael; Jain, Unnat; Desai, Ruta P* |
33 | Painter: Teaching Auto-regressive Language Models to Draw Sketches | Pourreza, Reza*; Bhattacharyya, Apratim; Panchal, Sunny P; Lee, Mingu; Madan, Pulkit; Memisevic, Roland |
34 | LLMs as Zero-Shot Visual Reasoners: Bridging Visual Understanding and Commonsense Reasoning with Large Language Models | Zhou, Kaiwen*; Lee, Kwonjoon; Misu, Teruhisa; Wang, Xin Eric |
36 | Improving Commonsense in Vision-Language Models via Knowledge Graph Riddles | Ye, Shuquan*; Xie, Yujia; Chen, DongDong; Xu, Yichong; Yuan, Lu; Zhu, Chenguang; Liao, Jing |
37 | Zero-Shot Composed Image Retrieval with Textual Inversion | Baldrati, Alberto*; Agnolucci, Lorenzo; Bertini, Marco; Del Bimbo, Alberto |
38 | Coarse-to-Fine Contrastive Learning in Image-Text-Graph Space for Improved Vision-Language Compositionality (oral) | Singh, Harman*; Zhang, Pengchuan; Wang, Qifan; Wang, Mengjiao; Xiong, Wenhan; Du, Jingfei; Chen, Hugo |
39 | Simple Token-Level Confidence Improves Caption Correctness (oral) | Petryk, Suzanne*; Whitehead, Spencer; Gonzalez, Joseph; Darrell, Trevor; Rohrbach, Anna; Rohrbach, Marcus |
40 | DeViL: Decoding Vision features into Language | Dani, Meghal*; Rio-Torto, Isabel; Alaniz, Stephan; Akata, Zeynep |
41 | MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge | Lin, Wei*; Karlinsky, Leonid; Shvetsova, Nina; Possegger, Horst; Kozinski, Mateusz; Panda, Rameswar; Feris, Rogerio; Kuehne, Hilde; Bischof, Horst |
42 | Improved Probabilistic Image-Text Representations | Chun, Sanghyuk* |
44 | Look, Remember and Reason: Visual Reasoning with Grounded Rationales | Bhattacharyya, Apratim*; Panchal, Sunny P; Lee, Mingu; Pourreza, Reza; Madan, Pulkit; Memisevic, Roland |
45 | Visual Coherence Loss for Coherent and Visually Grounded Story Generation (oral) | Hong, Xudong*; Demberg, Vera; Sayeed, Asad B; Schiele, Bernt |
46 | Comics for Everyone: Generating Accessible Text Descriptions for Comic Strips | R, Reshma* |
47 | Instruction-tuned Self-Questioning Framework for Multimodal Reasoning | Jang, Youwon*; Heo, Yu-Jung; Kim, Jaeseok; Lee, Minsu; Chang, Du-Seong; Zhang, Byoung-Tak |
Event | Date |
---|---|
Paper submission deadline | July 25, 2023 - 11:59 PM PT |
Notification to authors | August 8, 2023 |
Camera-ready deadline | August 14, 2023 |
Workshop date | October 2, 2023 |
Event | Date |
---|---|
Paper submission deadline | August 15, 2023 |
Notification to authors | September 7, 2023 |
Camera-ready deadline | September 15, 2023 |
Workshop date | October 2, 2023 |
The workshop program is as follows: