MEt3R: Measuring Multi-View Consistency in Generated Images
We introduce MEt3R, a metric for multi-view consistency in generated images. Large-scale generative models for multi-view image generation are rapidly advancing the field of 3D inference from sparse observations. However, due to the nature of generative modeling, traditional reconstruction metrics are not suitable to measure the quality of generated outputs and metrics that are independent of the sampling procedure are desperately needed. In this work, we specifically address the aspect of consistency between generated multi-view images, which can be evaluated independently of the specific scene. Our approach uses DUSt3R to obtain dense 3D reconstructions from image pairs in a feed-forward manner, which are used to warp image contents from one view into the other. Then, feature maps of these images are compared to obtain a similarity score that is invariant to view-dependent effects. Using MEt3R, we evaluate the consistency of a large set of previous methods for novel view and video generation, including our open, multi-view latent diffusion model.
💡 Research Summary
The paper introduces MEt3R, a metric for evaluating the multi‑view consistency of generated images without requiring camera poses or ground‑truth 3D data. The authors first obtain dense, pose‑free point clouds for a pair of images using DUSt3R, a recent stereo reconstruction network that regresses pixel‑aligned 3D points for both images directly in the coordinate frame of the first view. Next, they extract semantic features from the original images with DINO, upscale them with FeatUp to retain high‑frequency details, and project these features into the shared 3D space defined by the DUSt3R point clouds. Rasterizing the projected features back onto the first view’s image plane gives two geometrically aligned feature maps. Consistency is then measured as the cosine similarity of these maps over the overlapping region. Evaluating the pair in both orders yields two directional scores, S(I₁, I₂) and S(I₂, I₁). The final MEt3R value is defined as 1 – ½
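The comparison step in the summary above (cosine similarity of two aligned feature maps over their overlap) can be sketched in NumPy. This is only an illustration, not the authors' implementation. The function name `directional_score`, the `(H, W, C)` feature layout, the boolean overlap mask, the plain average over overlapping pixels, and the 0.0 returned for an empty overlap are all assumptions made for the example. The DUSt3R reconstruction, the DINO and FeatUp feature extraction, and the rasterization are not covered.

```python
import numpy as np


def directional_score(feat_a, feat_b, mask, eps=1e-8):
    """Mean cosine similarity of two aligned feature maps over an overlap mask.

    feat_a, feat_b: float arrays of shape (H, W, C), assumed already aligned
                    on the same image plane (hypothetical layout).
    mask:           boolean array of shape (H, W), True where both maps
                    have valid projected features.
    """
    # Per-pixel dot product and product of feature norms.
    dot = np.sum(feat_a * feat_b, axis=-1)
    norms = np.linalg.norm(feat_a, axis=-1) * np.linalg.norm(feat_b, axis=-1)
    # Guard against division by zero for all-zero feature vectors.
    cos = dot / np.maximum(norms, eps)
    if mask.sum() == 0:
        # Assumption: report 0.0 when the two views do not overlap at all.
        return 0.0
    # Average only over pixels inside the overlap.
    return float(cos[mask].mean())


# Toy usage with random features standing in for real feature maps.
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 5, 8))
overlap = np.zeros((4, 5), dtype=bool)
overlap[1:3, 1:4] = True

score_same = directional_score(feats, feats, overlap)
score_opposite = directional_score(feats, -feats, overlap)
score_empty = directional_score(feats, feats, np.zeros((4, 5), dtype=bool))
```

Identical feature maps score close to 1 and negated maps close to -1, so a higher directional score indicates better agreement between the two views inside the overlap.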