Mix Master Transformers

VGGT: Visual Geometry Grounded Transformer

Abstract: We present VGGT, a feed-forward neural network that directly infers all key 3D attributes of a scene, including camera parameters, point maps, depth maps, and 3D point tracks, from one, a ...

GitHub

YOLO-Master: MOE-Accelerated with Specialized Transformers for Enhanced Real-time Detection.

"Exploring the frontiers of Dynamic Intelligence in YOLO." This work represents our passionate exploration into the evolution of Real-Time Object Detection (RTOD). To the best of our knowledge, ...

GitHub

MitUNet: Enhancing Floor Plan Recognition

MitUNet is a hybrid deep learning architecture designed for high-precision semantic segmentation of walls in 2D floor plans. It addresses the challenge of vectorizing thin, complex structures by ...
