**Preprints**

[New!] Location-Sensitive Visual Recognition with Cross-IOU Loss
Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian
( PDF | CODE )

Selected Conference and Journal Papers. The full list can be found on Google Scholar.

[New!] DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing
Yujun Shi, Chuhui Xue, Jiachun Pan, Wenqing Zhang, Vincent Y. F. Tan, Song Bai
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
( PDF | Project Page )

[New!] General Object Foundation Model for Images and Videos at Scale
Junfeng Wu, Yi Jiang, Qihao Liu, Zehuan Yuan, Xiang Bai, Song Bai
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024

[New!] DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data
Qihao Liu, Yi Zhang, Song Bai, Adam Kortylewski, Alan Yuille
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024

[New!] Discovering Failure Modes of Text-guided Diffusion Models via Adversarial Search
Qihao Liu, Adam Kortylewski, Yutong Bai, Song Bai, Alan Yuille
International Conference on Learning Representations (ICLR), 2024

Mixed Samples as Probes for Unsupervised Model Selection in Domain Adaptation
Dapeng Hu, Jian Liang, Jun Hao Liew, Chuhui Xue, Song Bai, Xinchao Wang
Neural Information Processing Systems (NeurIPS), 2023

CenterNet++ for Object Detection
Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang, Qi Tian
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023

Understanding and Mitigating Dimensional Collapse in Federated Learning
Yujun Shi, Jian Liang, Wenqing Zhang, Chuhui Xue, Vincent Y. F. Tan, Song Bai
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023

Holistically-attracted Wireframe Parsing: From Supervised to Self-supervised Learning
Nan Xue, Tianfu Wu, Song Bai, Fu-Dong Wang, Gui-Song Xia, Liangpei Zhang, Philip H.S. Torr
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023

MOSE: A New Dataset for Video Object Segmentation in Complex Scenes
Henghui Ding, Chang Liu, Shuting He, Xudong Jiang, Philip H.S. Torr, Song Bai
IEEE International Conference on Computer Vision (ICCV), 2023
( PDF )

SRFormer: Permuted Self-Attention for Single Image Super-Resolution
Yupeng Zhou, Zhen Li, Chun-Le Guo, Song Bai, Ming-Ming Cheng, Qibin Hou
IEEE International Conference on Computer Vision (ICCV), 2023
( PDF )

InstMove: Instance Motion for Object-centric Video Segmentation
Qihao Liu, Junfeng Wu, Yi Jiang, Xiang Bai, Alan Yuille, Song Bai
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023
( PDF )

PLA: Language-Driven Open-Vocabulary 3D Scene Understanding
Runyu Ding, Jihan Yang, Chuhui Xue, Wenqing Zhang, Song Bai, Xiaojuan Qi
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023
( PDF )

Towards Understanding and Mitigating Dimensional Collapse in Heterogeneous Federated Learning
Yujun Shi, Jian Liang, Wenqing Zhang, Vincent Y. F. Tan, Song Bai
International Conference on Learning Representations (ICLR), 2023
( PDF )

Is Synthetic Data from Generative Models Ready for Image Recognition?
Ruifei He, Shuyang Sun, Xin Yu, Chuhui Xue, Wenqing Zhang, Philip Torr, Song Bai, Xiaojuan Qi
International Conference on Learning Representations (ICLR), 2023
( PDF )

PV3D: A 3D Generative Model for Portrait Video Generation
Eric Zhongcong Xu, Jianfeng Zhang, Jun Hao Liew, Wenqing Zhang, Song Bai, Jiashi Feng, Mike Zheng Shou
International Conference on Learning Representations (ICLR), 2023
( PDF )

Image-to-Character-to-Word Transformers for Accurate Scene Text Recognition
Chuhui Xue, Jiaxing Huang, Wenqing Zhang, Shijian Lu, Changhu Wang, Song Bai
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
( PDF )

Language Matters: A Weakly Supervised Vision-Language Pre-training Approach for Scene Text Detection and Spotting
Chuhui Xue, Yu Hao, Shijian Lu, Philip Torr, Song Bai
European Conference on Computer Vision (ECCV), Oral, 2022
( PDF )

In Defense of Online Models for Video Instance Segmentation
Junfeng Wu, Qihao Liu, Yi Jiang, Song Bai, Alan Yuille, Xiang Bai
European Conference on Computer Vision (ECCV), Oral, 2022
( PDF )

Contextual Text Block Detection towards Scene Text Understanding
Chuhui Xue, Jiaxing Huang, Shijian Lu, Changhu Wang, Song Bai
European Conference on Computer Vision (ECCV), 2022
( PDF | DATASET )

SeqFormer: Sequential Transformer for Video Instance Segmentation
Junfeng Wu, Yi Jiang, Song Bai, Wenqing Zhang, Xiang Bai
European Conference on Computer Vision (ECCV), Oral, 2022
( PDF )

Explicit Occlusion Reasoning for Multi-person 3D Human Pose Estimation
Qihao Liu, Yi Zhang, Song Bai, Alan Yuille
European Conference on Computer Vision (ECCV), 2022
( PDF )

Occluded Video Instance Segmentation: A Benchmark
Jiyang Qi, Yan Gao, Yao Hu, Xinggang Wang, Xiaoyu Liu, Xiang Bai, Serge Belongie, Alan Yuille, Philip H.S. Torr, Song Bai
International Journal of Computer Vision (IJCV), 2022
( PDF | DATASET )

TransMix: Attend to Mix for Vision Transformers
Jie-Neng Chen, Shuyang Sun, Ju He, Philip H.S. Torr, Alan Yuille, Song Bai
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
( PDF )

Fourier Document Restoration for Robust Document Dewarping and Recognition
Chuhui Xue, Zichen Tian, Fangneng Zhan, Shijian Lu, Song Bai
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
( PDF | DATASET )

Knowledge Distillation as Efficient Pre-training: Faster Convergence, Higher Data-efficiency, and Better Transferability
Ruifei He, Shuyang Sun, Jihan Yang, Song Bai, Xiaojuan Qi
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
( PDF )

Mimicking the Oracle: An Initial Phase Decorrelation Approach for Class Incremental Learning
Yujun Shi, Kuangqi Zhou, Jian Liang, Zihang Jiang, Jiashi Feng, Philip H.S. Torr, Song Bai, Vincent Y. F. Tan
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
( PDF )

DanceTrack: Multi-Object Tracking in Uniform Appearance and Diverse Motion
Peize Sun, Jinkun Cao, Yi Jiang, Zehuan Yuan, Song Bai, Kris Kitani, Ping Luo
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
( PDF | DATASET )

An Empirical Study of End-to-End Temporal Action Detection
Xiaolong Liu, Song Bai, Xiang Bai
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
( PDF )

YouMVOS: An Actor-centric Multi-shot Video Object Segmentation Dataset
Donglai Wei, Siddhant Kharbanda, Sarthak Arora, Roshan Roy, Nishant Jain, Akash Palrecha, Tanav Shah, Shray Mathur, Ritik Mathur, Abhijay Kemka, Anirudh Chakravarthy, Zudi Lin, Won-Dong Jang, Yansong Tang, Song Bai, James Tompkin, Philip H.S. Torr, Hanspeter Pfister
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
( PDF | DATASET )

Autoscale: Learning to Scale for Crowd Counting
Chenfeng Xu, Dingkang Liang, Yongchao Xu, Song Bai, Wei Zhan, Xiang Bai, Masayoshi Tomizuka
International Journal of Computer Vision (IJCV), 2022
( PDF )

PlaneTR: Structure-Guided Transformers for 3D Plane Recovery
Bin Tan, Nan Xue, Song Bai, Tianfu Wu, Gui-Song Xia
IEEE International Conference on Computer Vision (ICCV), 2021
( PDF )

Occluded Video Instance Segmentation: Dataset and ICCV 2021 Challenge
Jiyang Qi, Yan Gao, Yao Hu, Xinggang Wang, Xiaoyu Liu, Xiang Bai, Serge Belongie, Alan Yuille, Philip Torr, Song Bai
NeurIPS 2021 Datasets and Benchmarks Track, 2021
( PDF )

Advancing COVID-19 Diagnosis with Privacy-Preserving Collaboration in Artificial Intelligence
Xiang Bai, Hanchen Wang, Liya Ma, Yongchao Xu, Jiefeng Gan, Ziwei Fan, Fan Yang, Ke Ma, Jiehua Yang, Song Bai, et al.
Nature Machine Intelligence, 2021
( PDF | Project Page )

Multi-shot Temporal Event Localization: a Benchmark
Xiaolong Liu, Yao Hu, Song Bai, Fei Ding, Xiang Bai, Philip H.S. Torr
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021
( PDF | CODE | DATASET )

Anchor-Free Person Search
Yichao Yan, Jingpeng Li, Jie Qin, Song Bai, Shengcai Liao, Li Liu, Fan Zhu, Ling Shao
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021
( PDF | CODE )

SwiftNet: Real-time Video Object Segmentation
Haochen Wang, Xiaolong Jiang, Haibing Ren, Yao Hu, Song Bai
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021
( PDF )

Deep Interactive Video Inpainting: An Invisibility Cloak for Harry Potter
Cheng Chen, Jiayin Cai, Yao Hu, Xu Tang, Xinggang Wang, Chun Yuan, Xiang Bai, Song Bai
ACM Multimedia, 2021
( PDF )

Hypergraph Convolution and Hypergraph Attention
Song Bai, Feihu Zhang, Philip H.S. Torr
Pattern Recognition (PR), 2021
( PDF | CODE )

XingGAN for Person Image Generation
Hao Tang, Song Bai, Li Zhang, Philip H.S. Torr, Nicu Sebe
European Conference on Computer Vision (ECCV), 2020
( PDF | CODE )

Regional Homogeneity: Towards Learning Transferable Universal Adversarial Perturbations Against Defenses
Yingwei Li, Song Bai, Cihang Xie, Zhenyu Liao, Xiaohui Shen, Alan Yuille
European Conference on Computer Vision (ECCV), 2020
( PDF | CODE )

Corner Proposal Network for Anchor-free, Two-stage Object Detection
Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian
European Conference on Computer Vision (ECCV), Spotlight, 2020
( PDF | CODE )

Neural Architecture Search for Lightweight Non-Local Networks
Yingwei Li, Xiaojie Jin, Jieru Mei, Xiaochen Lian, Linjie Yang, Cihang Xie, Qihang Yu, Yuyin Zhou, Song Bai, Alan Yuille
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020
( PDF | CODE )

Holistically-Attracted Wireframe Parsing
Nan Xue, Tianfu Wu, Song Bai, Fudong Wang, Gui-Song Xia, Liangpei Zhang, Philip H.S. Torr
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020
( PDF | CODE )

Learning Transferable Adversarial Examples via Ghost Networks
Yingwei Li, Song Bai, Yuyin Zhou, Cihang Xie, Zhishuai Zhang, Alan Yuille
AAAI Conference on Artificial Intelligence (AAAI), 2020
IEEE Conference on Computer Vision and Pattern Recognition Workshop, 2019
( PDF | BibTex | CODE )

Dual Attention GANs for Semantic Image Synthesis
Hao Tang, Song Bai, Nicu Sebe
ACM Multimedia, 2020
( PDF )

Instance Segmentation of LiDAR Point Clouds
Feihu Zhang, Chenye Guan, Jin Fang, Song Bai, Ruigang Yang, Philip H.S. Torr, Victor Prisacariu
IEEE International Conference on Robotics and Automation (ICRA), 2020
( PDF | CODE )

Adversarial Metric Attack and Defense for Person Re-identification
Song Bai, Yingwei Li, Yuyin Zhou, Qizhu Li, Philip H.S. Torr
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020
( PDF | CODE )

An Improved Multi-View Convolutional Neural Network for 3D Object Retrieval
Xinwei He, Song Bai, Jiajia Chu, Xiang Bai
IEEE Transactions on Image Processing (TIP), 2020
( PDF )

Anchor Diffusion for Unsupervised Video Object Segmentation
Zhao Yang, Qiang Wang, Luca Bertinetto, Weiming Hu, Song Bai, Philip H.S. Torr
IEEE International Conference on Computer Vision (ICCV), 2019
( PDF | BibTex )

Asymmetric Non-local Neural Networks for Semantic Segmentation
Zhen Zhu, Mengde Xu, Song Bai, Tengteng Huang, Xiang Bai
IEEE International Conference on Computer Vision (ICCV), 2019
( PDF | BibTex | CODE )

View N-Gram Network for 3D Object Retrieval
Xinwei He, Tengteng Huang, Song Bai, Xiang Bai
IEEE International Conference on Computer Vision (ICCV), 2019
( PDF | BibTex )

Prior-aware Neural Network for Partially-Supervised Multi-Organ Segmentation
Yuyin Zhou, Zhe Li, Song Bai, Chong Wang, Xinlei Chen, Mei Han, Elliot Fishman, Alan Yuille
IEEE International Conference on Computer Vision (ICCV), 2019
( PDF | BibTex )

CenterNet: Keypoint Triplets for Object Detection
Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang, Qi Tian
IEEE International Conference on Computer Vision (ICCV), 2019
( PDF | BibTex | CODE )

Symmetry-constrained Rectification Network for Scene Text Recognition
Mingkun Yang, Yushuo Guan, Minghui Liao, Xin He, Kaigui Bian, Song Bai, Cong Yao, Xiang Bai
IEEE International Conference on Computer Vision (ICCV), 2019
( PDF | BibTex )

Learn to Scale: Generating Multipolar Normalized Density Maps for Crowd Counting
Chenfeng Xu, Kai Qiu, Jianlong Fu, Song Bai, Yongchao Xu, Xiang Bai
IEEE International Conference on Computer Vision (ICCV), 2019
( PDF | BibTex )

Re-ranking via Metric Fusion for Object Retrieval and Person Re-identification
Song Bai, Peng Tang, Philip H.S. Torr, Longin Jan Latecki
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019
( PDF | BibTex )

Learning Attraction Field Representation for Robust Line Segment Detection
Nan Xue, Song Bai, Fudong Wang, Gui-Song Xia, Tianfu Wu, Liangpei Zhang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019
( PDF | BibTex | CODE )

Improving Transferability of Adversarial Examples with Input Diversity
Cihang Xie, Zhishuai Zhang, Yuyin Zhou, Song Bai, Jianyu Wang, Zhou Ren, Alan Yuille
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019
( PDF | BibTex | CODE )

Learning Regional Attraction for Line Segment Detection
Nan Xue, Song Bai, Fudong Wang, Gui-Song Xia, Tianfu Wu, Liangpei Zhang, Philip H.S. Torr
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019
( PDF | CODE )

Hard-Aware Point-to-Set Deep Metric for Person Re-identification
Rui Yu, Zhiyong Dou, Song Bai, Zhaoxiang Zhang, Yongchao Xu, Xiang Bai
European Conference on Computer Vision (ECCV), 2018
( PDF | BibTex )

Triplet-Center Loss for Multi-View 3D Object Retrieval
Xinwei He, Yang Zhou, Zhichao Zhou, Song Bai, Xiang Bai
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018
( PDF | BibTex )

Regularized Diffusion Process on Bidirectional Context for Object Retrieval
Song Bai, Xiang Bai, Qi Tian, Longin Jan Latecki
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2018
( PDF | BibTex | CODE )

PCL: Proposal Cluster Learning for Weakly Supervised Object Detection
Peng Tang, Xinggang Wang, Song Bai, Wei Shen, Xiang Bai, Wenyu Liu, Alan Yuille
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2018
( PDF | BibTex | CODE )

Automatic Ensemble Diffusion for 3D Shape and Image Retrieval
Song Bai, Zhichao Zhou, Jingdong Wang, Xiang Bai, Longin Jan Latecki, Qi Tian
IEEE Transactions on Image Processing (TIP), 2018
( PDF | BibTex | CODE )

Improving Context-sensitive Similarity via Smooth Neighborhood for Object Retrieval
Song Bai, Shaoyan Sun, Xiang Bai, Zhaoxiang Zhang, Qi Tian
Pattern Recognition (PR), 2018
( PDF | BibTex )

Ensemble Diffusion for Retrieval
Song Bai, Zhichao Zhou, Jingdong Wang, Xiang Bai, Longin Jan Latecki, Qi Tian
IEEE International Conference on Computer Vision (ICCV), Oral, 2017
( PDF | BibTex | CODE | Supp. Material )

Scalable Person Re-identification on Supervised Smoothed Manifold
Song Bai, Xiang Bai, Qi Tian
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Spotlight, 2017
( PDF | BibTex | CMC_Curves | Results on GRID and Market-1501 datasets | Supp. Material )

Regularized Diffusion Process for Visual Retrieval
Song Bai, Xiang Bai, Qi Tian, Longin Jan Latecki
AAAI Conference on Artificial Intelligence (AAAI), Oral, 2017
( PDF | BibTex | CODE )

Multidimensional Scaling on Multiple Input Distance Matrices
Song Bai, Xiang Bai, Longin Jan Latecki, Qi Tian
AAAI Conference on Artificial Intelligence (AAAI), 2017
( PDF | BibTex )

GIFT: Towards Scalable 3D Shape Retrieval
Song Bai, Xiang Bai, Zhichao Zhou, Zhaoxiang Zhang, Qi Tian, Longin Jan Latecki
IEEE Transactions on Multimedia (TMM), 2017
( PDF | BibTex | CODE | DATA )

Smooth Neighborhood Structure Mining on Multiple Affinity Graphs with Applications to Context-sensitive Similarity
Song Bai, Shaoyan Sun, Xiang Bai, Zhaoxiang Zhang, Qi Tian
European Conference on Computer Vision (ECCV), 2016
( PDF | BibTex )

GIFT: A Real-time and Scalable 3D Shape Search Engine
Song Bai, Xiang Bai, Zhichao Zhou, Zhaoxiang Zhang, Longin Jan Latecki
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016
( PDF | BibTex | CODE | DATA | Results on ShapeNet Core55-SHREC2016 and SHREC2017 )

Sparse Contextual Activation for Efficient Visual Re-ranking
Song Bai, Xiang Bai
IEEE Transactions on Image Processing (TIP), 2016
( PDF | BibTex | CODE )

Multiple Stage Residual Model for Image Classification and Vector Compression
Song Bai, Xiang Bai, Wenyu Liu
IEEE Transactions on Multimedia (TMM), 2016
( PDF | BibTex )

3D Shape Matching via Two Layer Coding
Xiang Bai, Song Bai, Zhuotun Zhu, Longin Jan Latecki
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2015
( PDF | BibTex )