Class Semantics Modulation
for Open-Set Instance Segmentation

Yifei Yang, Zhongxiang Zhou, Jun Wu, Yue Wang, Rong Xiong
Zhejiang University

[Figure: Overview of SemSeg]

Abstract

This paper addresses open-set instance segmentation (OSIS), which segments both known objects and unknown objects not seen during training and is therefore essential for robots to work safely in the real world. Existing solutions adopt class-agnostic segmentation, in which all classes share the same mask output layer, leading to inferior performance. Motivated by the superiority of class-specific mask prediction in closed-set instance segmentation, we propose SemSeg, which combines class semantics extraction with mask prediction modulation to perform class-specific segmentation in OSIS. To extract class semantics for both known and unknown objects without supervision on unknown objects, we use contrastive learning to construct an embedding space in which objects from each known class cluster in their own territory and the complementary region of the known classes can accommodate unknown objects. To modulate the mask prediction, we convert the class semantic embedding into the convolutional parameters used to predict the mask. Class-semantics-modulated OSIS allows the mask output layer to be optimized for each class independently, without competition between classes. Class semantic information is also engaged directly in the segmentation process, so it can guide and facilitate segmentation, which particularly benefits unknown objects, where generalization is hardest. Experiments on the COCO and GraspNet-1Billion datasets demonstrate the merits of the proposed method, especially its strength in segmenting unknown object instances.
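
The modulation step described in the abstract, turning a class semantic embedding into the convolutional parameters that predict the mask, can be illustrated with a minimal sketch. It assumes the simplest possible design: one linear layer (the hypothetical `gen_weights` and `gen_bias`) maps the embedding to the weights and bias of a single 1x1 convolution, which is then applied to a toy feature map. The function names, layer sizes, and random inputs are illustrative assumptions; the actual SemSeg parameter generator and mask head are not specified on this page and may differ.

```python
import math
import random


def linear(weights, bias, x):
    """Affine map: weights is (out_dim x in_dim), x has in_dim entries."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def generate_mask_kernel(embedding, gen_weights, gen_bias):
    """Turn a class semantic embedding into the parameters of a 1x1 conv.

    Assumption: the first C outputs are per-channel kernel weights and the
    last output is the conv bias.
    """
    params = linear(gen_weights, gen_bias, embedding)
    return params[:-1], params[-1]


def predict_mask(features, kernel, bias):
    """Apply the generated 1x1 conv to a C x H x W feature map, then a sigmoid."""
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            logit = bias + sum(kernel[c] * features[c][y][x] for c in range(channels))
            row.append(1.0 / (1.0 + math.exp(-logit)))
        mask.append(row)
    return mask


random.seed(0)
EMBED_DIM, CHANNELS, H, W = 4, 3, 2, 2  # toy sizes, chosen only for illustration
gen_weights = [[random.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(CHANNELS + 1)]
gen_bias = [0.0] * (CHANNELS + 1)
features = [[[random.uniform(-1, 1) for _ in range(W)] for _ in range(H)] for _ in range(CHANNELS)]

embedding_a = [1.0, 0.0, 0.0, 0.0]  # stand-in embedding for one object
embedding_b = [0.0, 1.0, 0.0, 0.0]  # stand-in embedding for an object of another class

# The same feature map yields different masks because the conv parameters
# depend on the embedding rather than being shared across classes.
mask_a = predict_mask(features, *generate_mask_kernel(embedding_a, gen_weights, gen_bias))
mask_b = predict_mask(features, *generate_mask_kernel(embedding_b, gen_weights, gen_bias))
```

In this sketch the mask output parameters are a function of the embedding, so objects whose embeddings fall in different regions of the embedding space, including the region left for unknown objects, receive different mask predictors.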

Qualitative Results

[Figure: COCO qualitative results]

Qualitative comparisons between SemSeg and other methods on COCO.

[Figure: GraspNet qualitative results]

Qualitative comparisons between SemSeg and other methods on the GraspNet OSIS benchmark.

Quantitative Results

  • Comparison results on COCO.

    Method           AOSE  \(\text{AP}_\text{k}\)  \(\text{AR}_\text{unk}^{10}\)  \(\text{AR}_\text{unk}^{30}\)  \(\text{AR}_\text{unk}^{100}\)
    OLN-Mask+PROSER  2604  23.9                    6.8                            7.9                            8.0
    OpenDet+CA-Mask  3181  29.8                    5.9                            6.5                            6.5
    SemSeg           2536  30.1                    11.7                           16.7                           19.0

  • Comparison results on GraspNet-OSIS-T1.

    Test set         Method           AOSE    \(\text{AP}_\text{k}\)  \(\text{AP}_\text{unk}\)
    GraspNet-Test-1  OLN-Mask+PROSER  20961   59.6                    33.2
    GraspNet-Test-1  OpenDet+CA-Mask  100225  63.4                    21.4
    GraspNet-Test-1  SemSeg           17598   63.6                    37.2
    GraspNet-Test-2  OLN-Mask+PROSER  65491   55.9                    39.1
    GraspNet-Test-2  OpenDet+CA-Mask  237535  58.7                    31.9
    GraspNet-Test-2  SemSeg           58111   60.7                    41.7
    GraspNet-Test-3  OLN-Mask+PROSER  94131   54.9                    39.4
    GraspNet-Test-3  OpenDet+CA-Mask  331990  57.4                    32.0
    GraspNet-Test-3  SemSeg           80546   60.1                    42.2

  • Comparison results on GraspNet-OSIS-T2.

    Test set         Method           AOSE    \(\text{AP}_\text{k}\)  \(\text{AP}_\text{unk}\)
    GraspNet-Test-4  OLN-Mask+PROSER  7999    61.0                    31.4
    GraspNet-Test-4  OpenDet+CA-Mask  39265   63.9                    21.4
    GraspNet-Test-4  SemSeg           6971    64.3                    37.4
    GraspNet-Test-5  OLN-Mask+PROSER  36362   55.9                    40.6
    GraspNet-Test-5  OpenDet+CA-Mask  127570  57.1                    35.4
    GraspNet-Test-5  SemSeg           33213   59.9                    44.0
    GraspNet-Test-6  OLN-Mask+PROSER  56067   54.4                    39.5
    GraspNet-Test-6  OpenDet+CA-Mask  192734  54.9                    33.5
    GraspNet-Test-6  SemSeg           48885   59.1                    43.0
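
The tables report AOSE without defining it on this page. Assuming it denotes the absolute open-set error as commonly used in open-set detection, that is, the number of unknown objects that are predicted as one of the known classes (lower is better), the count can be sketched as follows. The function name, the `"unknown"` label, and the toy matches are hypothetical; the matching of predictions to ground truth (IoU and score thresholds) is assumed to have been done beforehand and is not taken from the paper.

```python
def absolute_open_set_error(matches, known_classes):
    """Count predictions matched to an unknown ground-truth object
    but labelled as one of the known classes."""
    return sum(
        1
        for gt_label, pred_label in matches
        if gt_label == "unknown" and pred_label in known_classes
    )


known = {"cup", "bottle"}  # toy known classes, for illustration only

# (ground-truth label, predicted label) pairs for already-matched predictions
matches = [
    ("cup", "cup"),          # known object, correct class
    ("unknown", "unknown"),  # unknown object, correctly flagged as unknown
    ("unknown", "bottle"),   # unknown object mistaken for a known class
    ("unknown", "cup"),      # unknown object mistaken for a known class
    ("bottle", "cup"),       # known-class confusion, not an open-set error
]

aose = absolute_open_set_error(matches, known)
print(aose)  # 2
```

Under this reading, confusions between known classes do not contribute to AOSE; only unknown objects absorbed into known classes are counted.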

BibTeX

@ARTICLE{10388394,
  author={Yang, Yifei and Zhou, ZhongXiang and Wu, Jun and Wang, Yue and Xiong, Rong},
  journal={IEEE Robotics and Automation Letters},
  title={Class Semantics Modulation for Open-Set Instance Segmentation},
  year={2024},
  volume={9},
  number={3},
  pages={2240-2247},
  doi={10.1109/LRA.2024.3353170}
}