"Data-scene-intelligence" integration: Progress and trends in 3D modeling technology (invited)

曹杰; 孙亚楠; 陈泓霖; 张莉; 刘韬; 孙腾骞; 郝群

doi:10.3788/IRLA20250389

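Abstract: Against the backdrop of the rapid evolution of digital twins and the metaverse, the development of 3D modeling technology is of great significance for advancing the fusion of the virtual and the real. In recent years, with "data-scene-intelligence" integration as the main thread, LiDAR point clouds, oblique photogrammetry, multi-view stereo (MVS) matching, simultaneous localization and mapping (SLAM), and neural rendering have together formed the technical spectrum of 3D reconstruction. Studies further show that air-ground fusion, neural implicit field representations, and 3D Gaussian splatting (3DGS) offer stronger adaptability and breakthrough potential in reconstructing complex scenes. Meanwhile, deep learning has driven progress in model lightweighting, real-time dynamic reconstruction, and joint semantic-geometric optimization. However, existing techniques still face many challenges in balancing accuracy and efficiency, fusing multi-source data, and adapting to complex scenes. In the future, 3D modeling technology needs to further improve its lightweight and real-time performance to provide strong support for the digital transformation of various industries.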

Abstract:
Significance: Against the backdrop of rapidly developing digital twins and the metaverse, 3D modeling technology, as the core means of digitally representing the physical world, plays a crucial role in integrating the virtual and the real. It has become a research hotspot in computer vision, robot navigation, and geoinformation science. 3D modeling reconstructs the geometric structure and texture of a scene from sensor data, providing fundamental support for applications such as urban planning, digital preservation of cultural heritage, and intelligent industrial inspection. Driven by breakthroughs in multimodal sensing and deep learning, 3D modeling is advancing rapidly toward high precision, real-time performance, and intelligence, which makes a comprehensive review of its progress and trends timely.
Progress: This article takes the integration of "data-scene-intelligence" as its main thread and systematically reviews the latest progress along the major 3D modeling technology routes. First, for 3D modeling based on LiDAR point clouds, it traces the development from mechanically rotating to solid-state array LiDAR and surveys typical solutions to challenges such as dynamic scene blurring, road surface point cloud stratification, underground pipeline reconstruction, and indoor corridor scanning. For instance, a plane-based global registration (PGR) method proposed for the road surface stratification problem keeps the mean root mean square error (RMSE) below 0.05 m in every tested scenario. For 3D modeling based on oblique photogrammetry, it analyzes the technology's advantage in efficient texture acquisition for urban-scale modeling and examines automatic repair strategies for common defects such as stretching, deformation, and holes. Methods such as boundary contraction and curve contraction flow significantly improve model completeness; in experiments, the boundary contraction method achieved a planimetric (horizontal) error of 5.6 cm and an elevation error of 4.0 cm.
For 3D modeling based on multi-view stereo (MVS) matching, the article details the evolution of deep learning-driven methods. Ada-MVS and RayMVSNet++, for example, have achieved breakthroughs in handling inconsistent oblique aerial images and in reducing memory consumption, respectively. For 3D modeling based on multi-source data fusion, it explores complementary enhancement methods for complex scenes, such as air-ground image and point cloud fusion, LiDAR-MVS collaboration, the complementary use of SLAM and oblique photogrammetry, and the combination of neural implicit fields with 3D Gaussian splatting (3DGS). For instance, after drone oblique images are fused with close-range ground images, the horizontal RMSE drops to 0.002 m and the vertical RMSE to 0.001 m. The article also summarizes the performance characteristics of the mainstream technical routes (as shown in Tab.1) and identifies key development trends driven by deep learning, including model lightweighting, real-time dynamic reconstruction, and joint semantic-geometric optimization.
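The horizontal and vertical RMSE figures quoted above are typically computed from check points whose reference coordinates are known. The abstract does not state the exact definition used in the cited experiments, so the following is only a minimal sketch under a common convention: horizontal RMSE combines the x and y residuals, and vertical RMSE uses the z residual alone. The function name `rmse_components` and the check-point coordinates are hypothetical and invented for illustration; they are not data from the article.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in metres


def rmse_components(
    estimated: Sequence[Point], reference: Sequence[Point]
) -> Tuple[float, float]:
    """Return (horizontal_rmse, vertical_rmse) in the unit of the inputs.

    Assumed convention: horizontal RMSE is sqrt(mean(dx^2 + dy^2)),
    vertical RMSE is sqrt(mean(dz^2)).
    """
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("need two non-empty point lists of equal length")
    sum_h = 0.0
    sum_v = 0.0
    for (xe, ye, ze), (xr, yr, zr) in zip(estimated, reference):
        sum_h += (xe - xr) ** 2 + (ye - yr) ** 2
        sum_v += (ze - zr) ** 2
    n = len(estimated)
    return math.sqrt(sum_h / n), math.sqrt(sum_v / n)


# Hypothetical check points (metres), made up for this example only.
reference = [(0.0, 0.0, 0.0), (10.0, 0.0, 1.0), (10.0, 10.0, 2.0)]
estimated = [(0.003, 0.004, 0.0), (10.0, 0.0, 1.006), (10.0, 10.0, 2.0)]

h_rmse, v_rmse = rmse_components(estimated, reference)
print(f"horizontal RMSE = {h_rmse:.4f} m, vertical RMSE = {v_rmse:.4f} m")
```

Reporting the two components separately, as the fused air-ground result does, shows whether a method improves planimetric accuracy, elevation accuracy, or both.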
Despite this remarkable progress, 3D modeling technology still faces several challenges. Balancing accuracy and efficiency remains a constraint: deep learning networks improve detail reconstruction but increase memory usage and inference latency, which limits deployment on edge devices. Because sensors differ in data resolution, noise level, and geometric distortion, multi-source data fusion struggles with data alignment and cross-modal calibration. Moreover, complex scenarios such as weakly textured regions, highly reflective surfaces, and extreme geometric environments place higher demands on algorithm robustness.
Conclusions and Prospects: Deep learning-driven intelligent fusion and optimization will be the core direction for breaking through current technical bottlenecks and moving toward the next generation of intelligent 3D reconstruction. Future 3D modeling technologies need to become more lightweight and achieve better real-time performance. Addressing the above challenges will allow them to better meet the differing demands of different application fields, thereby providing stronger support for the digital transformation of various industries and promoting the wide application of digital twin and metaverse technologies.
