Understanding what is in an image or a video frame is crucial for many computer vision applications. Researchers from Google recently published a paper achieving state-of-the-art results in panoptic segmentation (see below for an intuitive explanation). They did it by combining information about depth and about existing and new objects across two consecutive frames.
– Read about the inverse projection problem here. ❗Try not to scroll too far; first, guess what's in the image on the right 🧐
– To understand the difference between semantic, instance, and panoptic segmentation, see this image. If you would like to play with code and test panoptic segmentation yourself, visit the full article here.
📰 Original article: Holistic Video Scene Understanding with ViP-DeepLab
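The difference between the three segmentation flavors can be sketched with a toy example. Note this is only an illustration; the `OFFSET` encoding below is a common convention and an assumption here, not the exact format used in the ViP-DeepLab paper:

```python
import numpy as np

# Toy 1-D "image" of 6 pixels: road with two cars side by side.

# Semantic segmentation: one class label per pixel.
# The two cars are indistinguishable from each other.
semantic = np.array([0, 1, 1, 1, 1, 0])  # 0 = road, 1 = car

# Instance segmentation: a separate id per countable object ("things").
# Background "stuff" like the road gets no instance id.
instance = np.array([0, 1, 1, 2, 2, 0])  # 0 = no instance

# Panoptic segmentation: every pixel gets class AND instance jointly.
# A common encoding is class_id * OFFSET + instance_id.
OFFSET = 1000  # assumption: just needs to exceed the max instance count
panoptic = semantic * OFFSET + instance

print(panoptic.tolist())  # road pixels share one segment, cars get two
```

Here the panoptic map separates the two cars (ids 1001 and 1002) while still labeling the road, which neither semantic nor instance segmentation does on its own.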