💤 Quiet · score 73.8 · May 15, 2026 · 2605.16258 · cs.CV · cs.AI · cs.RO

IVGT: Implicit Visual Geometry Transformer for Neural Scene Representation

Yuqi Wu, Tianyu Hu, Wenzhao Zheng, Yuanhui Huang, Haowen Sun, Jie Zhou, Jiwen Lu

Narrative

IVGT replaces explicit pointmap prediction (pixel-aligned 3D coordinate regression, as in DUSt3R/MASt3R) with a continuous implicit representation: a neural scene representation in a canonical coordinate system that can be queried at any 3D position to predict signed distance function (SDF) values and colors. A single model covers mesh and point cloud reconstruction, novel view synthesis, depth and surface normal estimation, and camera pose estimation, all from unposed multi-view images. It is trained jointly on multiple datasets with 2D supervision plus 3D geometric regularization. The abstract reports no quantitative comparisons, so the claimed advantage over explicit-geometry methods cannot be verified without reading the full paper.

No production traction yet. The GitHub references are all automated arXiv tracking pipelines with no meaningful implementation activity, citations are at zero, and no official code repository has surfaced. This is a very fresh preprint from Tsinghua (Zhou/Lu lab), which has a track record in this space, but IVGT is purely at the research stage right now.

Abstract

Reconstructing coherent 3D geometry and appearance from unposed multi-view images is a fundamental yet challenging problem in computer vision. Most existing visual geometry foundation models predict explicit geometry by regressing pixel-aligned pointmaps, often suffering from redundancy and limited geometric continuity. We propose IVGT, an Implicit Visual Geometry Transformer that implicitly models continuous and coherent geometry from pose-free multi-view images. This formulation learns a continuous neural scene representation in a canonical coordinate system and supports continuous spatial queries at any 3D positions, retrieving local features to predict signed distance (SDF) values and colors using lightweight decoders. It allows direct extraction of continuous and coherent surface geometry, enabling rendering of RGB images, depth maps, and surface normal maps from arbitrary viewpoints. We train IVGT via multi-dataset joint optimization with 2D supervision and 3D geometric regularization. IVGT demonstrates generalization across scenes and achieves strong performance on various tasks, including mesh and point cloud reconstruction, novel view synthesis, depth and surface normal estimation, and camera pose estimation.
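To make the idea of querying an SDF at arbitrary 3D positions concrete, here is a minimal Python sketch of how a depth value and a surface normal can be read off a signed distance function along one camera ray. It is an illustration under stated assumptions, not IVGT's implementation. The analytic `sphere_sdf` is a hypothetical stand-in for the learned SDF decoder. Sphere tracing and finite-difference normals are swapped in as a simple renderer, since the abstract does not say which rendering procedure the paper uses. All function names and parameters are invented for this example.

```python
import math

def sphere_sdf(p, center=(0.0, 0.0, 3.0), radius=1.0):
    """Hypothetical stand-in for a learned SDF decoder.

    Returns the signed distance from point p to a sphere:
    positive outside, zero on the surface, negative inside.
    """
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, center))) - radius

def sphere_trace(sdf, origin, direction, max_steps=64, eps=1e-5, far=100.0):
    """March along a unit-length ray, stepping by the SDF value each time.

    Returns the distance travelled to the surface (the depth along the ray),
    or None if the ray leaves the scene without hitting anything.
    """
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)
        if dist < eps:
            return t
        t += dist  # safe step: no surface can be closer than dist
        if t > far:
            break
    return None

def sdf_normal(sdf, p, h=1e-4):
    """Surface normal as the normalized central-difference gradient of the SDF."""
    grad = []
    for axis in range(3):
        hi, lo = list(p), list(p)
        hi[axis] += h
        lo[axis] -= h
        grad.append((sdf(tuple(hi)) - sdf(tuple(lo))) / (2 * h))
    norm = math.sqrt(sum(g * g for g in grad))
    return tuple(g / norm for g in grad)

# One camera ray looking down +z from the origin.
origin, direction = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
depth = sphere_trace(sphere_sdf, origin, direction)
hit = tuple(o + depth * d for o, d in zip(origin, direction))
normal = sdf_normal(sphere_sdf, hit)

# A ray that points away from the sphere never reaches the surface.
miss = sphere_trace(sphere_sdf, origin, (0.0, 1.0, 0.0))
```

Repeating this per pixel would give a depth map and a normal map from a chosen viewpoint. Because the surface is the zero level set of a continuous function, neither output is tied to the pixel grid of any input image.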

Citation timeline

Not enough citation snapshots yet to plot a timeline.

Signal

Stars: 95
Repos: 10
Citations: 0
Velocity: 0.00/d

GitHub repos (13)

CSQianDong/Awesome-arXiv-Daily-Reporter⭐ 47
“{'arxiv_id': 'arXiv:2605.15205', 'title': 'Does Theory of Mind Improvement Really Benefit Human-AI Interactions? Empirical Findings from Interactive Evaluations', 'authors': 'Nanxu Gong, Zixin Chen, Haotian Li, Zishu Zhao, Jianxun Lian, Huamin Qu, Yanjie Fu, Xing Xie', 'link': 'h”
wwd29/arxiv-daily⭐ 21
“<ul> <li><strong>Authors: </strong>Yuqi Wu, Tianyu Hu, Wenzhao Zheng, Yuanhui Huang, Haowen Sun, Jie Zhou, Jiwen Lu</a></li> <li><strong>Subjects: </strong>cs.CV, cs.AI, cs.RO</a></li> <li><strong>Abstract URL: </strong><a href="https://arxiv.org/abs/2605.16258">https://arxiv.org”
ehijano/rss_fetch⭐ 11
“ </item> <item> <title>IVGT: Implicit Visual Geometry Transformer for Neural Scene Representation</title> <link>https://arxiv.org/abs/2605.16258</link> <description>arXiv:2605.16258v1 Announce Type: cross Abstract: Reconstructing coherent 3D geometry and”
lonePatient/lonePatient.github.io⭐ 9
“{% note blue no-icon %} ID-0-IVGT: Implicit Visual Geometry Transformer for Neural Scene Representation {% endnote %} **Link**: https://arxiv.org/abs/2605.16258 **Authors**: Yuqi Wu,Tianyu Hu,Wenzhao Zheng,Yuanhui Huang,Haowen Sun,Jie Zhou,Jiwen Lu **Category**: Computer Vision and Pattern Re”
2shin0/arxiv-ai-mailing⭐ 6
“ ## 52. IVGT: Implicit Visual Geometry Transformer for Neural Scene Representation - **Authors**: Yuqi Wu , Tianyu Hu , Wenzhao Zheng , Yuanhui Huang , Haowen Sun , Jie Zhou , Jiwen Lu - **URL**: [https://arxiv.org/abs/2605.16258](https://arxiv.org/abs/2605.16258) - **Abstract**:”
ttmens/ai-radar-wiki⭐ 5
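“IVGT proposes an implicit visual geometry Transformer for reconstructing consistent 3D geometry and appearance from unposed multi-view images. Unlike existing methods that rely on explicit pixel-aligned pointmaps, the model uses an implicit representation to avoid redundancy and limitations, allowing it to learn scene structure more efficiently. The core innovation is combining the Transformer's global attention mechanism with implicit geometry learning, enabling robust reconstruction without accurate camera poses. The technique has significant commercial value for products such as AR/VR, 3D content generation, and autonomous driving perception, and can reduce data collection costs and improve generality. ## Links - 📄 arXiv: http://arxiv.org/abs/2605.16258v1 ## PM perspective > By Sta”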
NeoCodeSmith/NeoSignal⭐ 1
“ { "id": "628473ac9e0b", "title": "IVGT: Implicit Visual Geometry Transformer for Neural Scene Representation", "url": "https://arxiv.org/abs/2605.16258", "summary": "arXiv:2605.16258v1 Announce Type: cross Abstract: Reconstructing coherent 3D geometry ”
Zjj-Low-Key/Zjj-Low-Key.github.io⭐ 0
maedoc/tvb-wiki⭐ 0
“ **Source**: arxiv **ID**: 2605.16258 **URL**: https://arxiv.org/abs/2605.16258 **Date**: 2026-05-15 **Year**: 2026 **Authors**: Yuqi Wu, Tianyu Hu, Wenzhao Zheng, Yuanhui Huang, Haowen Sun, Jie Zhou, Jiwen Lu”
mickdur/tech-watch⭐ 0
“ "https://arxiv.org/abs/2605.16245": "2026-05-18T07:51:44.206446+00:00", "https://arxiv.org/abs/2605.16250": "2026-05-18T07:51:44.206446+00:00", "https://arxiv.org/abs/2605.16255": "2026-05-18T07:51:44.206446+00:00", "https://arxiv.org/abs/2605.16258": "2026-05-18T07:51:44”
mirae0708/steven⭐ 0
“ > **Source:** [arXiv](http://arxiv.org/abs/2605.16258v1) > **Category:** Artificial_Intelligence/LLM”
sirichen2/sirichen2.github.io⭐ 0
“ "Jie Zhou", "Jiwen Lu" ], "abs_url": "https://arxiv.org/abs/2605.16258v1", "pdf_url": "https://arxiv.org/pdf/2605.16258v1", "published": "2026-05-15T17:59:57+00:00", "updated": "2026-05-15T17:59:57+00:00",”
xiuguangli/DailyArxiv⭐ 0
“ "date": "2026-05-18", "date_url": "https://arxiv.org/catchup/cs.CV/2026-05-18?abs=True", "arxiv_id": "2605.16258", "abs_url": "https://arxiv.org/abs/2605.16258", "pdf_url": "https://arxiv.org/pdf/2605.16258", "title": "IVGT: Implici”