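Surface normal estimation predicts the normal vector of every surface visible in the image, typically as one unit vector per pixel. Note that this is closely linked to depth estimation: the normal is orthogonal to the surface's tangent plane, and that plane is determined by how depth changes locally (the depth gradient).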
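To make the link between depth and normals concrete, here is a minimal sketch that derives per-pixel normals from a depth map. It is an illustration, not a method from the source, and it rests on simplifying assumptions: the depth map is treated as a height field z = f(x, y) with unit pixel spacing (camera intrinsics and perspective are ignored), and normals are oriented to have a positive z component. The function name `normals_from_depth` is hypothetical.

```python
import numpy as np

def normals_from_depth(depth):
    """Approximate unit surface normals from a 2D depth map.

    Assumes the depth map is a height field z = f(x, y) with unit pixel
    spacing. The tangents are (1, 0, dz/dx) and (0, 1, dz/dy); their cross
    product, (-dz/dx, -dz/dy, 1), is orthogonal to both.
    """
    depth = np.asarray(depth, dtype=float)
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x).
    dz_dy, dz_dx = np.gradient(depth)
    normals = np.stack([-dz_dx, -dz_dy, np.ones_like(depth)], axis=-1)
    # Normalise each vector to unit length.
    normals /= np.linalg.norm(normals, axis=-1, keepdims=True)
    return normals

# Usage: a plane whose depth increases by 2 per pixel along x.
ramp = np.tile(2.0 * np.arange(5), (4, 1))
plane_normals = normals_from_depth(ramp)  # shape (4, 5, 3)
```

On this planar ramp every normal is orthogonal to the in-plane direction (1, 0, 2), which is the sense in which the normal is orthogonal to the change in depth.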
This model uses an encoder-decoder structure whose output is used to predict three closely related aspects of the scene: segmentation, depth, and surface normals. For images that have annotations for only a subset of these tasks, the authors use an expert model to provide preliminary annotations for the missing ones, so that the gradient is not biased toward whichever tasks happen to be annotated.
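The annotation-filling step can be sketched as follows. This is only an illustration of the idea, not the authors' implementation: the task names, the function `fill_missing_annotations`, and the assumption of one expert callable per task are all hypothetical.

```python
# Hypothetical task names for the three outputs described above.
TASKS = ("segmentation", "depth", "normals")

def fill_missing_annotations(image, annotations, experts):
    """Return a full label set for one image, plus flags marking pseudo-labels.

    annotations: dict mapping task name -> ground-truth label (may omit tasks
                 or map them to None when the image is not annotated for them).
    experts:     dict mapping task name -> callable(image) -> predicted label
                 (assumed here: one pretrained expert per task).
    """
    filled, is_pseudo = {}, {}
    for task in TASKS:
        label = annotations.get(task)
        if label is None:
            # No ground truth for this task: fall back to the expert's prediction.
            filled[task] = experts[task](image)
            is_pseudo[task] = True
        else:
            filled[task] = label
            is_pseudo[task] = False
    return filled, is_pseudo
```

With every task labelled on every image, a training step can sum the three task losses without any task silently dropping out of the gradient. The `is_pseudo` flags are kept so that pseudo-labels could be down-weighted if desired, which is a design choice of this sketch rather than something stated in the source.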