feat: add point cloud classification using PTv3 #665

Open

jayakumarpujar wants to merge 2 commits into opengeos:main from jayakumarpujar:feature/point-cloud-classification

Conversation

@jayakumarpujar
Collaborator

Summary

  • Add geoai.pointcloud module for semantic segmentation of 3D point clouds (LAS/LAZ) using RandLA-Net via Open3D-ML
  • Support three pre-trained models: SemanticKITTI (outdoor driving), Toronto3D (urban mapping), S3DIS (indoor)
  • Provide PointCloudClassifier class with classify(), classify_batch(), train(), summary(), and visualize() methods
  • Include convenience function classify_point_cloud() for one-off inference
  • Add example notebook with Colab support and install instructions
  • Handle PyTorch/NumPy 2.x ABI incompatibility with runtime patches
  • Center point cloud coordinates before inference to handle large CRS offsets (e.g., State Plane/UTM)
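The coordinate-centering step can be sketched as follows (a minimal illustration with NumPy, not the module's actual implementation):

```python
import numpy as np

def center_coordinates(xyz: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Subtract the centroid so coordinates sit near the origin.

    Projected CRS coordinates (e.g. State Plane / UTM) can be in the
    millions of meters; float32 model inputs lose precision at that
    scale, so the offset is removed before inference and kept so the
    original coordinates can be restored afterwards.
    """
    offset = xyz.mean(axis=0)
    return xyz - offset, offset

# Example: UTM-like coordinates with a large easting/northing offset
pts = np.array([[500000.0, 4000000.0, 120.0],
                [500010.0, 4000005.0, 122.0]])
centered, offset = center_coordinates(pts)
```

Adding `offset` back to `centered` recovers the original coordinates exactly, so the classified LAS can be written out in its source CRS.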

Key files

| File | Description |
| --- | --- |
| geoai/pointcloud.py | Core module (1002 lines) |
| tests/test_pointcloud.py | 40 unit tests |
| docs/examples/point_cloud_classification.ipynb | Example notebook |
| docs/pointcloud.md | API documentation page |

Test plan

  • All 40 unit tests pass (pytest tests/test_pointcloud.py)
  • End-to-end tested on Google Colab with madison.las (4M points, GPU)
  • SemanticKITTI model classifies into 5 distinct classes (coordinate centering fix verified)
  • Summary displays model class names correctly
  • Reviewer to verify notebook renders correctly on ReadTheDocs

Closes #51

@giswqs
Member

giswqs commented Mar 25, 2026

@jayakumarpujar This is great!

The tests failed. Need to add laspy as an optional dependency.

Contributor

Copilot AI left a comment


Pull request overview

Adds first-class point cloud semantic segmentation support to the GeoAI Python package via a new geoai.pointcloud module built around Open3D-ML’s RandLA-Net, along with docs, tests, and packaging/docs wiring.

Changes:

  • Introduces geoai.pointcloud with PointCloudClassifier, a model registry, checkpoint download, LAS I/O helpers, and convenience functions.
  • Adds a comprehensive unit test suite for the new module.
  • Wires the feature into packaging extras, top-level lazy exports, and MkDocs (API page + example notebook).

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.

Summary per file:

| File | Description |
| --- | --- |
| geoai/pointcloud.py | Core point cloud classification module (model registry, inference, training stub, visualization, summary). |
| tests/test_pointcloud.py | New unit tests covering API shape, I/O helpers, summary, and mocked inference/training/visualize. |
| geoai/__init__.py | Registers pointcloud in lazy submodules and exposes public symbols via the lazy symbol map. |
| pyproject.toml | Adds pointcloud optional dependency group (open3d, laspy[lazrs]). |
| mkdocs.yml | Adds the new example notebook and API reference page to the docs nav and execution-ignore list. |
| docs/pointcloud.md | Adds mkdocstrings stub page for the new module and a link to the example notebook. |
| docs/examples/point_cloud_classification.ipynb | New example notebook (Colab link + install instructions + end-to-end usage). |


Comment thread geoai/pointcloud.py Outdated
@jayakumarpujar
Collaborator Author

jayakumarpujar commented Mar 25, 2026

Thanks @giswqs! Fixed in the latest commit (43ee47c): the test file now uses pytest.importorskip("laspy"), so the entire test module is skipped gracefully when laspy isn't installed instead of failing at collection time.
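The skip pattern looks roughly like this (a stdlib module is substituted for laspy below so the snippet runs anywhere; the PR's actual test file targets "laspy"):

```python
import pytest

# pytest.importorskip imports the named module and returns it; if the
# import fails, it raises pytest's Skipped outcome so the whole test
# module is skipped at collection time rather than erroring with an
# ImportError. "json" stands in for "laspy" here so the example runs
# without optional dependencies installed.
laspy = pytest.importorskip("json")

def test_module_loaded():
    # The returned object is the imported module itself.
    assert laspy is not None
```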

@jayakumarpujar self-assigned this on Mar 25, 2026
@giswqs
Member

giswqs commented Mar 26, 2026

It seems none of the pre-trained models generate correct results.


RandLANet_SemanticKITTI

Total points: 4,068,294

Class distribution:
  car                           :       20 (0.0%)
  bicyclist                     :    1,066 (0.0%)
  road                          :      758 (0.0%)
  parking                       :        8 (0.0%)
  sidewalk                      :       11 (0.0%)
  other-ground                  :   40,354 (1.0%)
  building                      :   10,122 (0.2%)
  fence                         :       12 (0.0%)
  vegetation                    :    1,930 (0.1%)
  trunk                         :   28,153 (0.7%)
  terrain                       :    1,690 (0.0%)
  pole                          :   41,153 (1.0%)
  traffic-sign                  : 3,943,017 (96.9%)
test 0/1: 100%|██████████| 4068248/4068248 [03:00<00:00, 828.52it/s]

RandLANet_Toronto3D

Total points: 4,068,294

Class distribution:
  road                          : 2,773,950 (68.2%)
  road_marking                  :    4,925 (0.1%)
  utility_line                  :   73,848 (1.8%)
  pole                          :  923,070 (22.7%)
  car                           :   49,952 (1.2%)
  fence                         :  242,549 (6.0%)

@jayakumarpujar
Collaborator Author

Thanks for testing both models @giswqs. You're right, the results are not meaningful for aerial LiDAR.

Root cause: domain mismatch

The pre-trained models come from Open3D-ML's model zoo and were trained on fundamentally different data domains: SemanticKITTI (outdoor driving), Toronto3D (urban mapping), and S3DIS (indoor), none of which resemble aerial LiDAR.

There's also no label remapping; model output indices are written directly to the LAS classification field, so even semantically close predictions end up with wrong ASPRS codes.

The pipeline code itself works correctly: inference, coordinate centering, I/O, and visualization all function as intended. The problem is that there are no suitable pre-trained weights in Open3D-ML for aerial LiDAR.

Possible paths forward:

1. Find aerial LiDAR weights, or train and host our own checkpoints.
2. Complete the training pipeline: the train() method is currently experimental. Once finished, users could fine-tune on their own labeled aerial LAS files.
3. Add ASPRS label remapping: map model class names to ASPRS codes by semantic similarity. This helps interpretation but won't fix the geometric perspective mismatch.
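A label remapping along these lines could be sketched as follows (the model-side indices and names are illustrative assumptions; the ASPRS codes 1/2/5/6 are the standard Unclassified/Ground/High Vegetation/Building values from the LAS specification):

```python
import numpy as np

# Hypothetical mapping from a model's output indices to ASPRS LAS
# classification codes. The model index -> class-name pairing below is
# made up for illustration; a real table would be built per model.
MODEL_TO_ASPRS = {
    0: 2,  # e.g. "road"       -> 2 (Ground)
    1: 6,  # e.g. "building"   -> 6 (Building)
    2: 5,  # e.g. "vegetation" -> 5 (High Vegetation)
}
DEFAULT_ASPRS = 1  # 1 = Unclassified, for anything unmapped

def remap_labels(pred: np.ndarray) -> np.ndarray:
    """Convert raw model class indices to ASPRS classification codes."""
    out = np.full_like(pred, DEFAULT_ASPRS)
    for model_idx, asprs_code in MODEL_TO_ASPRS.items():
        out[pred == model_idx] = asprs_code
    return out

labels = remap_labels(np.array([0, 1, 2, 7]))  # 7 is unmapped
```

The remapped array could then be written to the LAS classification field instead of the raw indices.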

I think the right call is to hold off on merging until we have at least one model that produces useful results for aerial LiDAR.
What do you think is the better path?

@giswqs
Member

giswqs commented Mar 26, 2026

Makes sense.

Here are two alternatives. I have not tested them yet, so I'm not sure how good they are. Neither of them has a PyPI release, and their dependency lists seem very restrictive.


Add Point Transformer V3 (PTv3) training pipeline for the DALES aerial
LiDAR dataset with pre-trained checkpoint on HuggingFace for fine-tuning.

- scripts/train_ptv3_dales.py: standalone training script with DDP,
  mixed-precision, OneCycleLR, CE+Lovasz loss, checkpoint management,
  and --hf_pretrained flag for auto-downloading from HuggingFace
- scripts/preprocess_dales_ptv3.py: preprocesses DALES LAS tiles into
  Pointcept-compatible .npy blocks with spatial splitting
- requirements_ptv3.txt: minimal dependencies (numpy, laspy, tqdm)
- docs/examples/train_point_cloud_ptv3.ipynb: notebook example

Results on DALES test set (26.2M points):
  Overall accuracy: 95.9%, mIoU (8 classes): 69.3%

Pre-trained checkpoint: https://huggingface.co/jayakumarpujar/Ptv3
@jayakumarpujar force-pushed the feature/point-cloud-classification branch from a750479 to 3ea4cf9 on April 13, 2026
@jayakumarpujar changed the title from "feat: add point cloud classification using RandLA-Net" to "feat: add point cloud classification using PTv3" on April 13, 2026
@jayakumarpujar
Collaborator Author

PTv3 DALES Training Results

Trained Point Transformer V3 (m1_base, ~46M params) from scratch on DALES for 100 epochs on a single V100-32GB (~27.5 hours).

Test Set (26.2M points, 11 tiles)

| Class | IoU | Precision | Recall | Support |
| --- | --- | --- | --- | --- |
| Ground | 95.69% | 97.41% | 98.18% | 12,065,396 |
| Vegetation | 90.01% | 95.80% | 93.71% | 8,691,162 |
| Buildings | 93.73% | 97.57% | 95.98% | 4,880,561 |
| Power lines | 88.63% | 96.11% | 91.92% | 73,735 |
| Cars | 71.40% | 86.05% | 80.74% | 273,015 |
| Poles | 57.48% | 66.48% | 80.94% | 29,046 |
| Fences | 37.76% | 41.06% | 82.43% | 172,857 |
| Trucks | 19.60% | 28.54% | 38.49% | 38,279 |

| Metric | Score |
| --- | --- |
| Overall Accuracy | 95.88% |
| mIoU (8 classes) | 69.29% |
| Best epoch | 83 / 100 |

Validation Set (8.5M points, 4 tiles)

| Metric | Score |
| --- | --- |
| Overall Accuracy | 92.14% |
| mIoU (8 classes) | 63.46% |

Pre-trained Checkpoint

Available on HuggingFace for fine-tuning: https://huggingface.co/jayakumarpujar/Ptv3

# Fine-tune on custom aerial LiDAR data
python scripts/train_ptv3_dales.py \
    --data_root data/your_dataset \
    --hf_pretrained \
    --epochs 50 --lr 0.0001 --no_amp

Training Config

  • Loss: Cross-entropy + Lovasz softmax (inverse-frequency class weights)
  • Optimizer: AdamW, lr=0.0005, OneCycleLR (10% warmup)
  • Grad clip: max_norm=1.0
  • Grid: 0.15m voxel, max 40K points/sample
  • Precision: fp32 (fp16 AMP causes NaN on V100 attention softmax)
  • Effective batch: 16 (batch=1 × accum=16)
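For illustration, the 0.15 m voxel grid with a 40K-point cap from the config above can be sketched like this (an assumption about the sampling strategy, not the training script's actual code):

```python
import numpy as np

def voxel_subsample(xyz: np.ndarray, voxel: float = 0.15,
                    max_points: int = 40_000) -> np.ndarray:
    """Keep one point per voxel-sized cell, then cap the point count.

    Mirrors the "0.15m voxel, max 40K points/sample" setting above;
    the real pipeline may pick points within a cell differently.
    """
    # Integer cell index per point along each axis
    cells = np.floor(xyz / voxel).astype(np.int64)
    # Index of the first point seen in each occupied cell
    _, keep = np.unique(cells, axis=0, return_index=True)
    keep = np.sort(keep)
    # Randomly cap the sample if it still exceeds the budget
    if keep.size > max_points:
        keep = np.random.default_rng(0).choice(keep, max_points,
                                               replace=False)
    return xyz[keep]

# 1000 points in a unit cube, sampled with an oversized 0.5 m voxel:
pts = np.random.default_rng(1).uniform(0.0, 1.0, size=(1000, 3))
sub = voxel_subsample(pts, voxel=0.5)
```

With a 0.5 m voxel over a 1 m cube there are at most 8 occupied cells, so `sub` collapses to a handful of representative points.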

@giswqs
Member

giswqs commented Apr 13, 2026

@jayakumarpujar This is very impressive! Thank you very much for your efforts. I will test it soon.


Development

Successfully merging this pull request may close these issues.

Add support for point cloud classification
