Instance Segmentation Made Simple
Instance segmentation is a fundamental vision task. I will present several new, simple approaches to instance segmentation in images. Compared with many other dense prediction tasks, e.g., semantic segmentation, it is the arbitrary number of instances that makes instance segmentation much more challenging. To predict a mask for each instance, mainstream approaches either follow the 'detect-then-segment' strategy used by Mask R-CNN, or first predict category masks and then use clustering techniques to group pixels into individual instances. Recently, fully convolutional instance segmentation methods have drawn much attention, as they are often simpler and more efficient than two-stage approaches such as Mask R-CNN.
To date, almost all such approaches fall behind the two-stage Mask R-CNN method in mask precision when the models have similar computational complexity, leaving considerable room for improvement. First, I will present BlendMask, which improves mask prediction by effectively combining instance-level information with lower-level, fine-grained semantic information. Second, we view the task of instance segmentation from a completely new perspective by introducing the notion of "instance categories", which assigns a category to each pixel within an instance according to the instance's location and size, thus converting instance mask segmentation into a problem that can be solved by classification.
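As a rough illustration of the "instance categories" idea, the minimal Python sketch below maps a binary instance mask to a category derived from its location (the grid cell containing its centre) and its size (a coarse scale level). The function name `instance_category`, the grid size, and the size thresholds are assumptions made for illustration only; they are not taken from the methods presented in the talk.

```python
import numpy as np

def instance_category(mask, grid_size=5, size_thresholds=(0.1, 0.3)):
    """Map a binary instance mask to a hypothetical (level, cell) category.

    cell  : index of the grid cell (row-major) that contains the mask centre.
    level : coarse size bucket based on the square root of the area fraction.
    All constants here are illustrative assumptions.
    """
    h, w = mask.shape
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask contains no foreground pixels")

    # Location: centre of mass of the mask, quantised to a grid cell.
    cy, cx = ys.mean(), xs.mean()
    row = min(int(cy / h * grid_size), grid_size - 1)
    col = min(int(cx / w * grid_size), grid_size - 1)
    cell = row * grid_size + col

    # Size: relative scale of the instance, quantised to a level.
    scale = np.sqrt(ys.size / (h * w))
    level = int(np.searchsorted(size_thresholds, scale))
    return level, cell

# Usage: a 20x20 instance near the top right of a 100x100 image.
mask = np.zeros((100, 100), dtype=bool)
mask[10:30, 60:80] = True
level, cell = instance_category(mask)
```

Under this sketch, predicting which instance a pixel belongs to reduces to predicting one of a fixed set of (level, cell) labels, which is what makes the problem look like classification.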
Last, I will present a simple yet effective instance segmentation framework, termed CondInst (conditional convolutions for instance segmentation). CondInst solves instance segmentation from a new perspective. Instead of using instance-wise ROIs as inputs to a network of fixed weights, CondInst employs dynamic instance-aware networks, conditioned on instances, thus eliminating ROI operations. Experiments show that the proposed methods are promising.
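The contrast between a fixed-weight network and a dynamic, instance-conditioned one can be sketched in a few lines of Python. In the sketch below, a flat parameter vector `theta` (assumed to be produced per instance by some controller, which is not shown) is sliced into the weights of a tiny stack of 1x1 convolutions and applied to a shared feature map, so no ROI cropping is involved. The layer sizes, function names, and the use of NumPy are illustrative assumptions, not the actual CondInst implementation.

```python
import numpy as np

def num_params(channels):
    """Number of weights and biases in a stack of 1x1 conv layers."""
    return sum(c_in * c_out + c_out
               for c_in, c_out in zip(channels[:-1], channels[1:]))

def split_params(theta, channels):
    """Slice a flat parameter vector into (weight, bias) pairs per layer."""
    if theta.size != num_params(channels):
        raise ValueError("parameter vector has the wrong length")
    layers, offset = [], 0
    for c_in, c_out in zip(channels[:-1], channels[1:]):
        w = theta[offset:offset + c_in * c_out].reshape(c_out, c_in)
        offset += c_in * c_out
        b = theta[offset:offset + c_out]
        offset += c_out
        layers.append((w, b))
    return layers

def dynamic_mask_head(features, theta, channels=(8, 8, 1)):
    """Apply an instance-specific mask head to shared features (C, H, W)."""
    if features.shape[0] != channels[0]:
        raise ValueError("feature channels do not match the head input")
    x = features
    layers = split_params(theta, channels)
    for i, (w, b) in enumerate(layers):
        # A 1x1 convolution is a per-pixel linear map over channels.
        x = np.einsum("oc,chw->ohw", w, x) + b[:, None, None]
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)  # ReLU between layers
    return 1.0 / (1.0 + np.exp(-x[0]))  # sigmoid -> mask probabilities

# Usage: one shared feature map, two instances with different parameters.
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 4, 6))
theta_a = rng.normal(size=num_params((8, 8, 1)))
theta_b = rng.normal(size=num_params((8, 8, 1)))
mask_a = dynamic_mask_head(features, theta_a)
mask_b = dynamic_mask_head(features, theta_b)
```

The same full-image features yield a different mask for each parameter vector, which is the sense in which the network, rather than its input, is conditioned on the instance.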
Professor Chunhua Shen has been a Chair Professor of Computer Science at Zhejiang University since Dec. 2021. Prior to that, he was a Full Professor of Computer Science at The University of Adelaide and Founding Director of the Machine Learning Theory theme at the Australian Institute for Machine Learning. His research mainly focuses on Machine Learning and Computer Vision. He has demonstrated innovative approaches to translating research into economic and societal gain. A recent example is his work with a leading mobile phone company on the development of image parsing techniques for AI-driven photography, which was successfully deployed to tens of millions of mobile phones in 2018. His student alumni include two Australian Research Council DECRA fellows and additional graduates who are now in tenured or tenure-track roles at universities including the University of Adelaide, Sydney University, Monash University, Wollongong University, Nanyang Technological University Singapore, and a few top-tier universities in China. In addition, four of Professor Shen's PhD students have won the prestigious Google PhD Fellowship.