# g3splat

**Repository Path**: yejun668/g3splat

## Basic Information

- **Project Name**: g3splat
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-01-06
- **Last Updated**: 2026-01-06

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# G3Splat

**Geometrically Consistent Generalizable Gaussian Splatting**

Mehdi Hosseinzadeh · Shin-Fang Chng · Yi Xu · Simon Lucey · Ian Reid · Ravi Garg

Project Page | arXiv | GitHub | Hugging Face | Demo

G3Splat is a pose-free, self-supervised framework for generalizable Gaussian splatting that achieves state-of-the-art performance in geometry reconstruction, relative pose estimation, and novel-view synthesis.

Teaser

---

## ✨ Highlights

- 🎯 **Pose-Free**: No camera poses required at inference time
- 🔄 **Self-Supervised**: Trained without ground-truth depth or 3D supervision
- 🚀 **Feed-Forward**: Real-time inference with no per-scene optimization
- 📐 **Geometrically Consistent**: Alignment and orientation losses for accurate 3D reconstruction
- 🎨 **Flexible**: Supports both 3D Gaussian Splatting (3DGS) and 2D Gaussian Splatting (2DGS)

---

## 📋 Table of Contents

- [Installation](#installation)
- [Model Zoo](#model-zoo)
- [Demo](#demo)
- [Datasets](#datasets)
- [Evaluation](#evaluation)
  - [Depth Evaluation](#depth-evaluation)
  - [Pose Estimation](#pose-estimation)
  - [Novel View Synthesis](#novel-view-synthesis)
  - [Mesh Reconstruction](#mesh-reconstruction)
- [Training](#training)
- [Acknowledgements](#acknowledgements)
- [Citation](#citation)

---

## 🛠️ Installation

Our implementation requires **Python 3.10+** and has been tested with PyTorch 2.1.2 and CUDA 11.8/12.1.

### 1. Clone the Repository

```bash
git clone https://github.com/m80hz/g3splat
cd g3splat
```

### 2. Create Environment and Install Dependencies

```bash
conda create -y -n g3splat python=3.10
conda activate g3splat

pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu118
pip install "numpy<2"  # Required: PyTorch 2.1.2 is incompatible with NumPy 2.x
pip install -r requirements.txt

# Install CUDA rasterizers
pip install git+https://github.com/rmurai0610/diff-gaussian-rasterization-w-pose.git --no-build-isolation
pip install git+https://github.com/hbb1/diff-surfel-rasterization.git --no-build-isolation
```

### 3. (Optional) Compile CUDA Kernels for RoPE

For faster inference, compile the CUDA kernels for RoPE positional embeddings:

```bash
cd src/model/encoder/backbone/croco/curope/
python setup.py build_ext --inplace
cd ../../../../../..
```

---

## 🏆 Model Zoo

We provide pretrained checkpoints on [Hugging Face](https://huggingface.co/m80hz/g3splat) 🤗

### Available Models

| Model | Backbone | Gaussian Type | Training Data | Resolution | Download |
|:------|:--------:|:-------------:|:-------------:|:----------:|:--------:|
| G³Splat-3DGS | MASt3R | 3DGS | RealEstate10K | 256×256 | [📥 Download](https://huggingface.co/m80hz/g3splat/resolve/main/g3splat_mast3r_3dgs_align_orient_re10k.ckpt) |
| G³Splat-2DGS | MASt3R | 2DGS | RealEstate10K | 256×256 | [📥 Download](https://huggingface.co/m80hz/g3splat/resolve/main/g3splat_mast3r_2dgs_align_orient_re10k.ckpt) |

### 🔜 Coming Soon

| Model | Backbone | Gaussian Type | Status |
|:------|:--------:|:-------------:|:------:|
| G³Splat-VGGT-3DGS | VGGT | 3DGS | 🚧 Coming Soon |

> **Note**: The code and checkpoints for **G³Splat** with the [VGGT](https://github.com/facebookresearch/vggt) backbone will be released soon. Stay tuned for updates!

### Downloading Models

**Option 1: Direct Download**

Download from the links in the table above and place the files in `pretrained_weights/`.
**Option 2: Using Hugging Face Hub**

```bash
pip install huggingface_hub
```

```python
from huggingface_hub import hf_hub_download

# Download 3DGS model
hf_hub_download(
    repo_id="m80hz/g3splat",
    filename="g3splat_mast3r_3dgs_align_orient_re10k.ckpt",
    local_dir="pretrained_weights",
)

# Download 2DGS model
hf_hub_download(
    repo_id="m80hz/g3splat",
    filename="g3splat_mast3r_2dgs_align_orient_re10k.ckpt",
    local_dir="pretrained_weights",
)
```

**Option 3: Using Git LFS**

```bash
# Clone just the model files
git lfs install
git clone https://huggingface.co/m80hz/g3splat pretrained_weights
```

### Model Configuration

Expected directory structure:

```
pretrained_weights/
├── g3splat_mast3r_3dgs_align_orient_re10k.ckpt
└── g3splat_mast3r_2dgs_align_orient_re10k.ckpt
```

> ⚠️ **Important**: When using **2DGS** models, you must set `gaussian_type: 2d` in the config:
> ```yaml
> # config/model/encoder/<backbone>.yaml  (e.g., noposplat.yaml, etc.;
> # <backbone> is a placeholder for the encoder backbone config you are using)
> gaussian_adapter:
>   gaussian_type: 2d  # Use '3d' for 3DGS models (default)
> ```
> Or pass it via the command line: `model.encoder.gaussian_adapter.gaussian_type=2d`

---

## 🎮 Demo

We provide an interactive web demo powered by [Gradio](https://gradio.app/) for visualizing G³Splat outputs.

> **Note**: The demo is intended for **quick visualization** and verifying that the installation works correctly. To reproduce the quantitative results reported in the paper, please refer to the [Evaluation](#evaluation) section.
### Quick Start

```bash
python demo.py --checkpoint pretrained_weights/g3splat_mast3r_3dgs_align_orient_re10k.ckpt
```

Then open your browser at `http://localhost:7860`.

### Demo Features

- 📸 **Image Input**: Upload custom image pairs or use the provided examples
- 🎯 **Pose-Free Inference**: No camera poses required
- 🖼️ **Novel View Synthesis**: Visualize rendered novel views with adjustable interpolation based on estimated poses
- 📊 **Geometry Visualization**: View depth maps, surface normals, and Gaussian normals
- 🌐 **Interactive 3D**: Explore Gaussian splats in the browser
- 💾 **Export**: Download PLY files for external visualization

### Command Line Options

```bash
python demo.py \
  --checkpoint <path/to/checkpoint.ckpt> \
  --port 7860 \
  --share
# --port:  server port (default 7860)
# --share: create a public Gradio link
```

### Example Images

We provide example image pairs in `assets/examples/`, organized by dataset:

```
assets/examples/
├── re10k_001/      # RealEstate10K scene
│   ├── context_0.png
│   └── context_1.png
├── re10k_002/
│   ├── context_0.png
│   └── context_1.png
├── scannet_001/    # ScanNet scene
│   ├── context_0.png
│   └── context_1.png
└── ...
```

Scene folders are named with a **dataset prefix** (e.g., `re10k_`, `scannet_`) followed by a number. The demo automatically detects the dataset and uses appropriate camera intrinsics.

---

## 📦 Datasets

G³Splat is trained on **RealEstate10K** and evaluated **zero-shot** on multiple benchmarks.
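For illustration, the prefix-based dataset detection described above could look like the following sketch. This is hypothetical helper code, not the repo's actual demo implementation, and the intrinsics values are placeholders, not the real calibration:

```python
# Hypothetical sketch of prefix-based dataset detection; the intrinsics
# values below are illustrative placeholders, not real calibration data.
from pathlib import Path

DEFAULT_INTRINSICS = {
    "re10k":   {"fx": 0.5, "fy": 0.5, "cx": 0.5, "cy": 0.5},  # assumed values
    "scannet": {"fx": 0.6, "fy": 0.6, "cx": 0.5, "cy": 0.5},  # assumed values
}

def intrinsics_for_scene(scene_dir: str) -> dict:
    """Pick intrinsics from the folder's dataset prefix, e.g. 're10k_001'."""
    prefix = Path(scene_dir).name.rsplit("_", 1)[0]
    # Fall back to a default when the prefix is unknown.
    return DEFAULT_INTRINSICS.get(prefix, DEFAULT_INTRINSICS["re10k"])

print(intrinsics_for_scene("assets/examples/scannet_001"))
```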
### Dataset Overview

| Dataset | Usage | Task | Download |
|:--------|:------|:-----|:--------:|
| [RealEstate10K](https://google.github.io/realestate10k/) | Training and Testing | NVS, Pose | [📥 Instructions](#realestate10k) |
| [ACID](https://infinite-nature.github.io/) | Zero-shot | NVS, Pose | [📥 Instructions](#acid) |
| [ScanNet](http://www.scan-net.org/) | Zero-shot | NVS, Pose, Depth, Mesh | [📥 Instructions](#scannet) |
| [NYU Depth V2](https://cs.nyu.edu/~fergus/datasets/nyu_depth_v2.html) | Zero-shot | Single-View Depth | [📥 Instructions](#nyudv2) |

### Expected Directory Structure

```
datasets/
├── re10k/
│   ├── train/
│   │   ├── 000000.torch
│   │   ├── ...
│   │   └── index.json
│   └── test/
│       ├── 000000.torch
│       ├── ...
│       └── index.json
├── acid/
│   ├── train/
│   │   ├── 000000.torch
│   │   ├── ...
│   │   └── index.json
│   └── test/
│       ├── 000000.torch
│       ├── ...
│       └── index.json
├── scannetv1_test/
│   ├── scene0664_00/
│   │   ├── color/
│   │   │   ├── 0.png
│   │   │   └── ...
│   │   ├── depth/
│   │   │   ├── 0.png
│   │   │   └── ...
│   │   ├── intrinsic/
│   │   │   ├── intrinsic_color.txt
│   │   │   └── intrinsic_depth.txt
│   │   ├── pose/
│   │   │   ├── 0.txt
│   │   │   └── ...
│   │   └── mesh/
│   │       └── scene0664_00_vh_clean_2.ply
│   ├── ...
│   └── scannet_test_pairs.txt
└── nyud_test/
    ├── color/
    │   ├── 0001.png
    │   └── ...
    ├── depth/
    │   ├── 0001.png
    │   └── ...
    └── intrinsic_color.txt
```

> **Note**: By default, datasets are expected in `datasets/`. Override with:
> ```bash
> dataset.DATASET_NAME.roots=[/your/path]
> ```

### Dataset Preparation
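Before launching long training or evaluation runs, it can help to sanity-check that a dataset matches the layout above. The sketch below is not part of the repo; it only checks the RealEstate10K-style layout (`train/`/`test/` with `.torch` chunks and an `index.json`):

```python
# Minimal layout sanity check for a RealEstate10K-style dataset directory.
# Not part of the repo; adjust `root` to where your datasets actually live.
from pathlib import Path

def check_re10k_layout(root: str = "datasets/re10k") -> list[str]:
    """Return a list of problems found; an empty list means the layout looks OK."""
    problems = []
    for split in ("train", "test"):
        split_dir = Path(root) / split
        if not split_dir.is_dir():
            problems.append(f"missing directory: {split_dir}")
            continue
        if not (split_dir / "index.json").is_file():
            problems.append(f"missing index file: {split_dir / 'index.json'}")
        if not any(split_dir.glob("*.torch")):
            problems.append(f"no .torch chunks in: {split_dir}")
    return problems

print(check_re10k_layout())  # prints any problems found
```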
๐Ÿ“ RealEstate10K (Training) We follow [pixelSplat](https://github.com/dcharatan/pixelsplat)'s data processing pipeline. See the [pixelSplat dataset guide](https://github.com/dcharatan/pixelsplat?tab=readme-ov-file#acquiring-datasets) for instructions on downloading and processing the dataset (use the **360p version**, recommended for **256ร—256** training). You can also download the preprocessed dataset directly from the same page.
๐Ÿ“ ACID (Zero-shot Evaluation) Visit the [ACID Dataset Page](https://infinite-nature.github.io/) to download the raw data, then convert the dataset by following the instructions in the [pixelSplat dataset guide](https://github.com/dcharatan/pixelsplat?tab=readme-ov-file#acquiring-datasets). Alternatively, you can download the preprocessed version directly from the same guide.
๐Ÿ“ ScanNet (Zero-shot Evaluation) 1. **Request Access**: Visit the [ScanNet official page](http://www.scan-net.org/) and request access to the dataset. 2. **Download Data**: Once approved, download the **ScanNet v1 test set**, including color images, depth maps, camera poses, and mesh reconstructions. Use `scripts/download_scannet_v1_test_meshes.sh` to download the mesh files. 3. **Test Pairs**: Download the test split file [scannet_test_pairs.txt](https://drive.google.com/file/d/1uOkJZ6MfY8wrt6CzaNap9l-7UobWN2x-/view?usp=sharing), which defines the image pairs for test scenes and context views.
๐Ÿ“ NYU Depth V2 (Zero-shot Single-View Depth) 1. **Visit the official dataset page**: Go to the [NYU Depth V2 Dataset](https://cs.nyu.edu/~fergus/datasets/nyu_depth_v2.html) website. 2. **Download the data**: Follow the instructions on the page to obtain the dataset files you need. 3. **(Optional) Use the preprocessed test set**: Download the preprocessed test split here: [nyud_test](https://drive.google.com/file/d/1je9X8fU5Vq6GYcK8Y-St-g1mx6lIgGPf/view?usp=sharing).
---

## Evaluation

### Depth Evaluation

#### Multi-View Depth
**ScanNet (Zero-shot)**

```bash
python -m src.eval_depth +experiment=scannet_depth_align_orient +evaluation=eval_depth \
  checkpointing.load=pretrained_weights/g3splat_mast3r_3dgs_align_orient_re10k.ckpt
```
#### Single-View Depth
**NYU Depth V2 (Zero-shot)**

```bash
python -m src.eval_depth +experiment=nyud_depth_align_orient +evaluation=eval_depth \
  checkpointing.load=pretrained_weights/g3splat_mast3r_3dgs_align_orient_re10k.ckpt
```
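The reported depth metrics follow the standard definitions: AbsRel is the mean absolute relative error, and δ<τ is the fraction of pixels whose prediction/ground-truth ratio (in either direction) is below τ. A minimal sketch, not the repo's actual evaluation code:

```python
# Sketch of standard depth metrics (AbsRel, delta-threshold accuracy);
# not the repo's actual evaluation code.
import numpy as np

def depth_metrics(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> dict:
    """AbsRel (lower is better) and delta<1.10 / delta<1.25 (higher is better)."""
    valid = gt > eps                      # ignore invalid / zero-depth pixels
    pred, gt = pred[valid], gt[valid]
    abs_rel = np.mean(np.abs(pred - gt) / gt)
    ratio = np.maximum(pred / gt, gt / pred)
    return {
        "AbsRel": float(abs_rel),
        "delta<1.10": float(np.mean(ratio < 1.10)),
        "delta<1.25": float(np.mean(ratio < 1.25)),
    }

# A perfect prediction gives AbsRel = 0 and both accuracies = 1.
print(depth_metrics(np.ones((4, 4)), np.ones((4, 4))))
```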
> **💡 Tip**: Add `evaluation.use_pose_refinement=false` to disable test-time pose refinement.

**Metrics**: AbsRel ↓ | δ<1.10 ↑ | δ<1.25 ↑

---

### Pose Estimation

Evaluate relative camera pose estimation:
**RealEstate10K**

```bash
python -m src.eval_pose +experiment=re10k_align_orient +evaluation=eval_pose \
  dataset/view_sampler@dataset.re10k.view_sampler=evaluation \
  dataset.re10k.view_sampler.index_path=assets/evaluation_index_re10k.json \
  checkpointing.load=pretrained_weights/g3splat_mast3r_3dgs_align_orient_re10k.ckpt
```
**ACID (Zero-shot)**

```bash
python -m src.eval_pose +experiment=acid_align_orient +evaluation=eval_pose \
  dataset/view_sampler@dataset.re10k.view_sampler=evaluation \
  dataset.re10k.view_sampler.index_path=assets/evaluation_index_acid.json \
  checkpointing.load=pretrained_weights/g3splat_mast3r_3dgs_align_orient_re10k.ckpt
```
**ScanNet (Zero-shot)**

```bash
python -m src.eval_pose +experiment=scannet_pose_align_orient +evaluation=eval_pose \
  checkpointing.load=pretrained_weights/g3splat_mast3r_3dgs_align_orient_re10k.ckpt
```
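The reported pose metrics use the standard angular error definitions: rotation error is the geodesic angle between predicted and ground-truth rotations, and translation error is the angle between translation directions (scale is unobservable for relative pose). AUC@τ is typically the area under the cumulative error curve up to τ degrees. A sketch of the two error terms, not the repo's actual evaluation code:

```python
# Sketch of standard relative-pose angular errors; not the repo's actual code.
import numpy as np

def rotation_error_deg(R_pred: np.ndarray, R_gt: np.ndarray) -> float:
    """Geodesic angle (degrees) between two 3x3 rotation matrices."""
    cos = (np.trace(R_pred.T @ R_gt) - 1.0) / 2.0
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def translation_angle_deg(t_pred: np.ndarray, t_gt: np.ndarray) -> float:
    """Angle (degrees) between translation directions."""
    cos = np.dot(t_pred, t_gt) / (np.linalg.norm(t_pred) * np.linalg.norm(t_gt))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

# A 90-degree rotation about z vs. identity has a 90-degree geodesic error.
Rz = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
print(rotation_error_deg(Rz, np.eye(3)))                              # ~90.0
print(translation_angle_deg(np.array([1.0, 0.0, 0.0]),
                            np.array([0.0, 1.0, 0.0])))               # ~90.0
```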
> **💡 Tip**: Add `evaluation.use_pose_refinement=false` to disable test-time pose refinement.

**Metrics**: Rotation Error (°) ↓ | Translation Error (°) ↓ | AUC@5° ↑ | AUC@10° ↑ | AUC@20° ↑ | AUC@30° ↑

---

### Novel View Synthesis
**RealEstate10K**

```bash
python -m src.main +experiment=re10k_align_orient_1x8 mode=test \
  dataset/view_sampler@dataset.re10k.view_sampler=evaluation \
  dataset.re10k.view_sampler.index_path=assets/evaluation_index_re10k.json \
  checkpointing.load=pretrained_weights/g3splat_mast3r_3dgs_align_orient_re10k.ckpt
```
**ACID (Zero-shot)**

```bash
python -m src.main +experiment=acid_align_orient_1x8 mode=test \
  dataset/view_sampler@dataset.re10k.view_sampler=evaluation \
  dataset.re10k.view_sampler.index_path=assets/evaluation_index_acid.json \
  checkpointing.load=pretrained_weights/g3splat_mast3r_3dgs_align_orient_re10k.ckpt
```
**ScanNet (Zero-shot)**

```bash
python -m src.main +experiment=scannet_depth_align_orient mode=test \
  checkpointing.load=pretrained_weights/g3splat_mast3r_3dgs_align_orient_re10k.ckpt
```
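Of the NVS metrics reported for these runs, PSNR has a simple closed form; SSIM and LPIPS require their reference implementations (e.g. the `lpips` package), so only PSNR is sketched here. This is a generic definition, not the repo's actual evaluation code:

```python
# Sketch of PSNR for images in [0, max_val]; not the repo's actual code.
import numpy as np

def psnr(pred: np.ndarray, gt: np.ndarray, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB (higher is better)."""
    mse = np.mean((pred.astype(np.float64) - gt.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")               # identical images
    return float(10.0 * np.log10(max_val**2 / mse))

# A uniform error of 0.1 on a [0, 1] image gives a PSNR of about 20 dB.
print(psnr(np.full((8, 8, 3), 0.6), np.full((8, 8, 3), 0.5)))  # ~20.0
```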
> **💡 Tip**: Set `test.save_image=true` and/or `test.save_video=true` to save rendered images and videos to the directory specified by `test.output_path`.

**Metrics**: PSNR ↑ | SSIM ↑ | LPIPS ↓

---

### Mesh Evaluation

Evaluate 3D mesh reconstructions on ScanNet:

```bash
python -m src.eval_mesh +experiment=scannet_depth_align_orient +evaluation=eval_mesh \
  checkpointing.load=pretrained_weights/g3splat_mast3r_3dgs_align_orient_re10k.ckpt
```

**Metrics**: Accuracy ↓ | Completeness ↓ | Overall (Chamfer Distance) ↓

---

### Export Gaussian PLY

```bash
python -m src.main +experiment=re10k_align_orient_1x8 mode=test \
  dataset/view_sampler@dataset.re10k.view_sampler=evaluation \
  dataset.re10k.view_sampler.index_path=assets/evaluation_index_re10k.json \
  checkpointing.load=pretrained_weights/g3splat_mast3r_3dgs_align_orient_re10k.ckpt \
  test.save_gaussian=true
```

---

### Evaluation Quick Reference

| Task | Dataset | Script | Experiment Config |
|:-----|:--------|:-------|:------------------|
| Depth | ScanNet | `src.eval_depth` | `scannet_depth_align_orient` |
| Depth | NYU Depth V2 | `src.eval_depth` | `nyud_depth_align_orient` |
| Pose | RE10K | `src.eval_pose` | `re10k_align_orient` |
| Pose | ACID | `src.eval_pose` | `acid_align_orient` |
| Pose | ScanNet | `src.eval_pose` | `scannet_pose_align_orient` |
| NVS | RE10K | `src.main` | `re10k_align_orient_1x8` |
| NVS | ACID | `src.main` | `acid_align_orient_1x8` |
| NVS | ScanNet | `src.main` | `scannet_depth_align_orient` |
| Mesh | ScanNet | `src.eval_mesh` | `scannet_depth_align_orient` |

> **📜 Batch Evaluation**: See [`scripts/eval_checkpoint.sh`](scripts/eval_checkpoint.sh) and [`scripts/all_evals.sh`](scripts/all_evals.sh) for unified scripts that run multiple evaluations.
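The mesh metrics listed above have standard point-cloud definitions: accuracy is the mean predicted-to-ground-truth distance, completeness the reverse, and Chamfer distance their combination. A brute-force sketch over sampled points, not the repo's actual evaluation (which may differ in sampling density and thresholds):

```python
# Sketch of Accuracy / Completeness / Chamfer over sampled point clouds,
# using brute-force nearest neighbours; not the repo's actual evaluation code.
import numpy as np

def mesh_metrics(pred_pts: np.ndarray, gt_pts: np.ndarray) -> dict:
    """pred_pts: (N, 3), gt_pts: (M, 3) point samples from the two meshes."""
    # Pairwise distance matrix (N, M); fine for small N*M, too slow at scale.
    d = np.linalg.norm(pred_pts[:, None, :] - gt_pts[None, :, :], axis=-1)
    accuracy = d.min(axis=1).mean()       # pred -> gt distance
    completeness = d.min(axis=0).mean()   # gt -> pred distance
    return {
        "Accuracy": float(accuracy),
        "Completeness": float(completeness),
        "Chamfer": float((accuracy + completeness) / 2.0),
    }

print(mesh_metrics(np.zeros((4, 3)), np.ones((4, 3))))
```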
---

## Training

### Prerequisites

Download the [MASt3R](https://download.europe.naverlabs.com/ComputerVision/MASt3R/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth) pretrained weights:

```bash
mkdir -p pretrained_weights
wget -P pretrained_weights/ https://download.europe.naverlabs.com/ComputerVision/MASt3R/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth
```

### Training Commands
**Multi-GPU Training (Recommended)**

```bash
# 24× A100 GPUs (6 nodes × 4 GPUs), effective batch size 144
python -m src.main +experiment=re10k_align_orient \
  wandb.mode=online \
  wandb.name=g3splat_align_orient
```

**Training Time**: ~6 hours on 24× A100 (40GB)

**SLURM Cluster**: See [`slurm_train.sh`](slurm_train.sh) for an example job script.
**Single-GPU Training**

```bash
# Single A6000 (48GB), batch size 8
python -m src.main +experiment=re10k_align_orient_1x8 \
  wandb.mode=online \
  wandb.name=g3splat_align_orient_1x8
```

**Training Time**: ~120 hours on 1× A6000
**Training the 2DGS Variant**

```bash
python -m src.main +experiment=re10k_align_orient \
  model.encoder.gaussian_adapter.gaussian_type=2d \
  wandb.mode=online \
  wandb.name=g3splat_2dgs_align_orient
```
### Training Configurations

| Config | Hardware | Batch Size | Training Time |
|:-------|:---------|:----------:|:-------------:|
| `re10k_align_orient` | 24× A100 | 144 | ~6 hours |
| `re10k_align_orient_1x8` | 1× A6000 | 8 | ~120 hours |
| `re10k_align` | 24× A100 | 144 | ~6 hours |
| `re10k_orient` | 24× A100 | 144 | ~6 hours |

> **💡 Tip**: When changing the batch size, adjust the learning rate and the number of training steps proportionally for optimal convergence.

---

## Acknowledgements

This project builds on several repositories: [VGGT](https://github.com/facebookresearch/vggt), [NoPoSplat](https://github.com/cvg/NoPoSplat), [MASt3R](https://github.com/naver/mast3r), [DUSt3R](https://github.com/naver/dust3r), [pixelSplat](https://github.com/dcharatan/pixelsplat), and [CUT3R](https://github.com/CUT3R/CUT3R). We thank all the authors for their contributions to the community.

---

## Citation

If you find G³Splat useful in your research, please consider citing:

```bibtex
@inproceedings{g3splat,
  title     = {G3Splat: Geometrically Consistent Generalizable Gaussian Splatting},
  author    = {Hosseinzadeh, Mehdi and Chng, Shin-Fang and Xu, Yi and Lucey, Simon and Reid, Ian and Garg, Ravi},
  booktitle = {arXiv:2512.17547},
  year      = {2025},
  url       = {https://arxiv.org/abs/2512.17547}
}
```

---

โญ Star us on GitHub if you find this project useful! โญ

Questions? Feel free to open an issue or reach out!