Hi, I am a PhD student in the Bayesian and Neural Systems (BayesWatch) group at the University of Edinburgh. My principal supervisor is Dr Elliot J. Crowley. Previously, I received my MSc degree in computer science from Boston University in 2020 and my BEng degree in computer science from the University of Science and Technology of China in 2018.
My research focuses on computer vision and deep learning. Currently, I am interested in developing computer vision algorithms that work with limited human annotation, including unsupervised learning and semi-supervised learning.
Prediction-Guided Distillation for Dense Object Detection
Chenhongyi Yang, Mateusz Ochal, Amos Storkey, Elliot J. Crowley
PDF | Code
Overview: We propose PGD, a new knowledge distillation framework for dense object detectors. PGD distills every object in a few key predictive regions and uses an adaptive weighting scheme to weight the distillation loss in those regions. On COCO, it achieves between +3.1% and +4.6% AP improvement using ResNet-101 and ResNet-50 as the teacher and student backbones, respectively.
Contrastive Object-level Pre-training with Spatial Noise Curriculum Learning
Chenhongyi Yang, Lichao Huang, Elliot J. Crowley
PDF | Code
Overview: CCOP is an object-level self-supervised learning approach. It uses selective search to find rough object regions and uses them to build an inter-image object-level contrastive loss and an intra-image object-level discrimination loss, so that the model can learn detailed regional features. Moreover, a curriculum learning mechanism allows the model to consistently acquire a useful learning signal. Experiments show that our approach improves on the MoCo v2 baseline by a large margin on multiple object-level tasks when pre-training on multi-object scene image datasets.
QueryDet: Cascaded Sparse Query for Accelerating High-Resolution Small Object Detection
CVPR 2022, Oral
Chenhongyi Yang, Zehao Huang, Naiyan Wang
PDF | Code
Overview: QueryDet uses a novel query mechanism to accelerate the inference of feature-pyramid-based object detectors. The pipeline comprises two steps: it first predicts the coarse locations of small objects on low-resolution features and then computes the accurate detection results using high-resolution features, sparsely guided by those coarse positions. On the COCO dataset, QueryDet improves the detection mAP by 1.0 and mAP(small) by 2.0, and high-resolution inference is accelerated by 3.0x on average. On the VisDrone dataset, we achieve a new state of the art while gaining a 2.3x high-resolution acceleration on average.
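As a rough illustration of the two-step pipeline above, here is a minimal pure-Python sketch, not the released QueryDet code. The function names, the score threshold of 0.5, and the assumption that the high-resolution map is exactly 2x the size of the low-resolution one are all hypothetical choices made for this example.

```python
def query_positions(coarse_scores, threshold=0.5):
    """Step 1: pick low-resolution cells likely to contain small objects.

    `threshold` is an illustrative value, not one taken from the paper.
    """
    return [
        (y, x)
        for y, row in enumerate(coarse_scores)
        for x, score in enumerate(row)
        if score > threshold
    ]


def key_positions(queries, scale=2):
    """Map each coarse cell to the cells it covers on the finer map.

    Assumes the fine map is `scale` times larger along each axis.
    """
    keys = set()
    for y, x in queries:
        for dy in range(scale):
            for dx in range(scale):
                keys.add((y * scale + dy, x * scale + dx))
    return sorted(keys)


def sparse_apply(fine_features, keys, head):
    """Step 2: run the (hypothetical) detection head only at key positions."""
    return {(y, x): head(fine_features[y][x]) for y, x in keys}


# Toy example: a 2x2 coarse score map guiding a 4x4 fine feature map.
coarse_scores = [[0.1, 0.9],
                 [0.2, 0.3]]
fine_features = [[r * 4 + c for c in range(4)] for r in range(4)]

queries = query_positions(coarse_scores)
keys = key_positions(queries)
outputs = sparse_apply(fine_features, keys, head=lambda v: v * 10)
```

In this toy case the head runs on 4 of the 16 high-resolution cells, which is where the speed-up in such a scheme would come from.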
Disentangle Your Dense Object Detector
ACM Multimedia 2021, Oral
Chenhongyi Yang*, Zehui Chen*, Qiaofei Li, Feng Zhao, Zheng-Jun Zha, Feng Wu
PDF | Code
Overview: In this work, we investigate the conjunction problem in state-of-the-art dense object detectors. Based on our findings, we propose the Disentangled Dense Object Detector (DDOD), in which simple and effective disentanglement mechanisms are designed and integrated into dense object detectors. DDOD leads to 2.0 mAP, 2.4 mAP and 2.2 mAP absolute improvements on RetinaNet, FCOS, and ATSS baselines with negligible extra overhead. Notably, our best model reaches 55.0 mAP on the COCO test-dev set and 93.5 AP on the WIDER FACE dataset, achieving a new state of the art on these two competitive benchmarks.
Learning to Separate: Detecting Heavily-Occluded Objects in Urban Scenes
Chenhongyi Yang, Vitaly Ablavsky, Kaihong Wang, Qi Feng, Margrit Betke
PDF | Code | Website
Overview: SG-NMS is a new Non-Maximum-Suppression algorithm designed for detecting heavily-occluded objects. It is based on a novel embedding mechanism, in which the semantic and geometric features of the detected boxes are jointly exploited. The embedding makes it possible to determine whether two heavily-overlapping boxes belong to the same object in the physical world. We show the effectiveness of our approach by creating a model called SG-Det and testing it on two widely-adopted datasets, KITTI and CityPersons, on which it achieves state-of-the-art performance.
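The idea of using an embedding to decide whether two overlapping boxes belong to the same object can be sketched as below. This is a simplified illustration rather than the SG-NMS rule from the paper: it assumes each box already comes with an embedding vector, and it uses a fixed Euclidean distance threshold (`dist_thresh`, a hypothetical parameter) in place of the learned criterion.

```python
import math


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def embedding_aware_nms(boxes, scores, embeddings, iou_thresh=0.5, dist_thresh=1.0):
    """Greedy NMS that suppresses a box only if it overlaps a kept box
    AND its embedding is close to that box's embedding.

    Both thresholds are illustrative values chosen for this sketch.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        same_object = any(
            iou(boxes[i], boxes[j]) > iou_thresh
            and math.dist(embeddings[i], embeddings[j]) < dist_thresh
            for j in keep
        )
        if not same_object:
            keep.append(i)
    return keep


# Three heavily-overlapping boxes: the first two share an embedding
# (same object), the third has a distant embedding (an occluded neighbour).
boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (2, 0, 12, 10)]
scores = [0.9, 0.8, 0.7]
embeddings = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]

kept = embedding_aware_nms(boxes, scores, embeddings)
```

With plain IoU-based NMS at the same overlap threshold, the third box would be suppressed along with the second; here it survives because its embedding is far from that of the top-scoring box.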
Consistency Regularization with High-dimensional Non-adversarial Source-guided Perturbation for Unsupervised Domain Adaptation in Segmentation
Kaihong Wang, Chenhongyi Yang, Margrit Betke
PDF | Code
Overview: BiSIDA is a bidirectional style-induced domain adaptation method that employs consistency regularization to efficiently exploit information from the unlabeled target domain dataset, requiring only a simple neural style transfer model. BiSIDA aligns domains by transferring source images into the style of target images and transferring target images into the style of source images to perform high-dimensional perturbation on the unlabeled target images. It achieves a new state of the art on two commonly-used synthetic-to-real domain adaptation benchmarks: GTA5-to-CityScapes and SYNTHIA-to-CityScapes.