
πŸ“ PH Training


**Automatic model training pipeline powered by Persistent Homology**

Know your dataset's difficulty, find the optimal learning rate, and detect overfitting – all from topological structure. No grid search. No guessing.

## Why PH Training?

| | Traditional | PH Training |
|---|---|---|
| Difficulty estimation | Train fully, then evaluate | 1 epoch – H0 predicts final accuracy (r > 0.9) |
| Learning rate | Grid search (expensive) | H0 CV minimum – automatic, topology-guided |
| Overfitting detection | Watch val loss diverge | H0 gap – catches it before accuracy drops (r = 0.998) |
| Confusion pairs | Train fully + confusion matrix | Epoch 1 – merge order = confusion pairs (r = -0.97) |

## Features

| Phase | What it does | Evidence |
|---|---|---|
| 1 | Difficulty prediction (1-epoch H0) | H0_ep1 vs final accuracy, r > 0.9 |
| 2 | Automatic LR search | H0 CV minimum = optimal LR |
| 3 | Real-time overfitting detection | H0_gap predicts overfitting, r = 0.998 |
| 4 | Confusion pair prediction | Merge order = confusion, r = -0.97 |
| 5 | Semantic hierarchy analysis | Dendrogram = semantic clusters |
| 6 | Adversarial vulnerability prediction | Merge distance vs FGSM, r = -0.71 |

## Installation

```bash
pip install -e .
```

### Dependencies

- Python >= 3.9
- PyTorch >= 2.0
- ripser >= 0.6.0
- scikit-learn >= 1.0
- scipy >= 1.7
- torchvision >= 0.15

## Quick Start

### CLI

```bash
# Basic usage (MNIST, auto LR search)
ph-train --dataset mnist

# Fashion-MNIST, 30 epochs
ph-train --dataset fashion --epochs 30

# CIFAR-10, manual LR
ph-train --dataset cifar --lr 0.001 --epochs 50

# Custom model size
ph-train --dataset mnist --hidden 256 --gap-threshold 0.05
```

### Python API

```python
from ph_training import PHTrainer

# Full automatic pipeline
trainer = PHTrainer(dataset='mnist', epochs=20)
result = trainer.run()

print(f"Difficulty: {result.difficulty}")
print(f"Best LR: {result.best_lr}")
print(f"Best accuracy: {result.best_acc:.1f}%")
print(f"Confusion pairs: {result.confusion_pairs[:3]}")
```

### Using PHMonitor in Your Own Training Loop

```python
from ph_training import PHMonitor

monitor = PHMonitor(n_classes=10, gap_threshold=0.08)

for epoch in range(max_epochs):
    train(model, train_loader)

    # Extract direction vectors from your model
    dirs_train, labels_train = extract_directions(model, train_loader)
    dirs_test, labels_test = extract_directions(model, test_loader)

    status = monitor.check(dirs_train, labels_train, dirs_test, labels_test)

    if status.alert:
        # Warn, but don't stop yet: should_stop waits for 3 alerts in a row
        print(f"Overfitting signal: H0_gap={status.h0_gap:.4f}")

    if monitor.should_stop:  # 3 consecutive alerts
        print("Early stopping recommended")
        break
```
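The `should_stop` condition ("3 consecutive alerts") can be sketched as a simple streak counter. This is an illustrative stand-in for the monitor's internal logic, not the package's actual implementation; `ConsecutiveAlertStopper` is a hypothetical name:

```python
class ConsecutiveAlertStopper:
    """Recommend stopping after `patience` alerts in a row.

    A simplified stand-in for PHMonitor's should_stop behavior,
    assuming the documented rule: stop after 3 consecutive alerts.
    """

    def __init__(self, patience=3):
        self.patience = patience
        self.streak = 0

    def update(self, alert):
        # A non-alert epoch resets the streak.
        self.streak = self.streak + 1 if alert else 0

    @property
    def should_stop(self):
        return self.streak >= self.patience


stopper = ConsecutiveAlertStopper(patience=3)
for alert in [False, True, True, False, True, True, True]:
    stopper.update(alert)
print(stopper.should_stop)  # True: the last three epochs all alerted
```

Resetting on a clean epoch is what distinguishes this from a total-alert count: a single noisy H0 spike will not end training.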

## Example Output

```text
======================================================================
  PH Auto-Training – FASHION
  epochs=20, hidden=128, seed=42
======================================================================

  Phase 1: Difficulty prediction (1-epoch H0)
    H0_ep1 = 2.3142, acc_ep1 = 82.1%
    Difficulty: medium

    Confusion pairs (confirmed at epoch 1):
       Shirt <-> Tshirt   dist=0.0312
        Coat <-> Pullvr   dist=0.0587
      Sneakr <-> Boot     dist=0.0891

  Phase 2: LR auto-search (H0 CV minimum)
    LR=3e-04: H0 CV=0.1234
    LR=1e-03: H0 CV=0.0891  <- optimal
    LR=3e-03: H0 CV=0.1567

  Phase 3: Training (LR=1e-03, overfitting detection ON)
   Ep  trn%   tst%    gap   H0_tr   H0_te   H0gap     ts   status
    1  83.2   82.1   +1.1  2.3142  2.2891  0.0251  1.000       OK
    2  86.1   85.3   +0.8  2.1234  2.0987  0.0247  1.023       OK
   ...
  14  92.8   89.1   +3.7  1.4521  1.3012  0.1509  1.187    ALERT
  Early stop (3 consecutive ALERTs, epoch 14)

  Phase 4: Analysis
    Dendrogram (semantic hierarchy):
      d=0.0312 -> [Shirt, Tshirt]
      d=0.0587 -> [Coat, Pullvr]
      d=0.0891 -> [Boot, Sneakr]
======================================================================
  SUMMARY – FASHION
======================================================================
  Difficulty:     medium (H0_ep1=2.3142)
  Best LR:        1e-03
  Best accuracy:  89.4% (epoch 11)
  Early stop:     yes
  Top confusion:  Shirt-Tshirt
======================================================================
```
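Phase 2's selection rule in the run above is simply "pick the LR whose H0 trace is most stable". A minimal sketch of that rule, assuming an H0 series has been collected per candidate LR (`select_lr` and the series below are illustrative, not part of the package):

```python
import statistics


def h0_cv(series):
    # Coefficient of variation = std / mean; lower means a more stable H0 trace.
    return statistics.pstdev(series) / statistics.mean(series)


def select_lr(h0_series_by_lr):
    # Pick the learning rate whose H0 trace has the smallest CV.
    return min(h0_series_by_lr, key=lambda lr: h0_cv(h0_series_by_lr[lr]))


h0_by_lr = {
    3e-4: [2.31, 2.05, 1.88],  # drifting down fast -> high CV
    1e-3: [2.30, 2.21, 2.25],  # stable            -> lowest CV
    3e-3: [2.40, 1.60, 2.90],  # oscillating       -> highest CV
}
print(select_lr(h0_by_lr))  # 0.001
```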

## Layer PH Monitor – For Any LLM Fine-tuning

Monitor training health by tracking the topological structure of per-layer signals. Works with any model, not just PureField: feed it gradient norms, activation norms, loss contributions, or any other per-layer scalar.

### Why

During healthy training, different layers develop distinct roles (different gradient magnitudes, activation patterns). When this diversity collapses and all layers converge to similar values, that signals overfitting or mode collapse. PH (H0) captures this: high H0 means diverse layer topology, low H0 means collapse.
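For scalar per-layer signals the idea reduces to something very concrete. In one dimension, the H0 bars of a Rips filtration die at the gaps between consecutive sorted values, so their total length is a direct diversity score. This is a deliberately simplified sketch (in 1-D the sum telescopes to max − min); the package computes H0 with ripser on full direction vectors:

```python
def h0_total_persistence_1d(layer_signals):
    """Total H0 persistence for 1-D per-layer signals.

    After sorting, each gap between neighbors is the death time of one
    H0 component. Summing the gaps gives a diversity score: large when
    layers play distinct roles, near zero when they collapse.
    """
    xs = sorted(layer_signals)
    return sum(b - a for a, b in zip(xs, xs[1:]))


healthy = [0.1, 0.5, 1.2, 2.0, 3.5]         # differentiated layers
collapsed = [0.90, 0.92, 0.91, 0.90, 0.93]  # diversity gone
print(h0_total_persistence_1d(healthy))     # ~3.4
print(h0_total_persistence_1d(collapsed))   # ~0.03
```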

### Quick Start

```python
from ph_training import LayerPHMonitor

monitor = LayerPHMonitor(n_layers=32, signal_name="grad_norm")

for step, batch in enumerate(dataloader):
    loss = model(batch).loss
    loss.backward()

    # Collect any per-layer signal
    grad_norms = []
    for name, param in model.named_parameters():
        if "layers" in name and "weight" in name and param.grad is not None:
            grad_norms.append(param.grad.norm().item())

    status = monitor.check(grad_norms)
    print(f"Step {step}: H0={status.h0:.4f} [{status.status}]")

    if monitor.should_stop:  # 3 consecutive COLLAPSE
        print("Layer topology collapsed – early stop")
        break

    optimizer.step()
    optimizer.zero_grad()

# Summary
print(monitor.summary())
# {'h0_trend': 'increasing', 'collapse_events': 0, 'final_status': 'HEALTHY'}
```

### Supported Signals

| Signal | How to collect | What it reveals |
|---|---|---|
| Gradient norm | `param.grad.norm()` | Layer learning dynamics |
| Activation norm | Forward hook, `output.norm()` | Layer contribution |
| Attention entropy | `-sum(attn * log(attn))` | Attention diversity |
| PureField tension | `layer.mlp.last_tension.mean()` | Engine A↔G disagreement |
| Per-layer loss | Custom loss decomposition | Layer-specific learning |

### Status Levels

- **HEALTHY**: H0 is at or near its peak – layers are differentiated
- **WATCH**: H0 has dropped below 60% of its peak – monitor closely
- **COLLAPSE**: H0 has dropped below 30% of its peak – early stop recommended
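The thresholds above can be expressed as a small helper relative to the running peak. This is a sketch of the documented behavior; `layer_status` is an illustrative name and the monitor's exact internals may differ:

```python
def layer_status(h0, peak_h0):
    # Classify current H0 against the running peak using the documented
    # 60% / 30% thresholds.
    if peak_h0 <= 0:
        return "HEALTHY"
    ratio = h0 / peak_h0
    if ratio < 0.30:
        return "COLLAPSE"
    if ratio < 0.60:
        return "WATCH"
    return "HEALTHY"


print(layer_status(2.0, 2.1))  # HEALTHY  (~95% of peak)
print(layer_status(1.0, 2.1))  # WATCH    (~48% of peak)
print(layer_status(0.5, 2.1))  # COLLAPSE (~24% of peak)
```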

## How It Works

### Architecture: PureFieldEngine

The model uses a dual-engine repulsion field architecture:

```text
  Input x
    |-- Engine A (logic) --+
    |                      +-- repulsion = A - G
    +-- Engine G (pattern) +
                               |
                    +----------+----------+
                    |          |          |
              magnitude    direction    tension
              = |A-G|     = norm(A-G)  = |A-G|^2
              (confidence)  (concept)   (reaction)
                    |          |
                    +--- output = scale * sqrt(tension) * direction
```
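The combination step in the diagram can be written out directly. Note that `sqrt(tension) * direction` algebraically recombines to `A - G`, so the output is just the scaled repulsion vector. A plain-Python sketch (the real engines are learned networks; `purefield_output` is an illustrative helper and assumes A != G):

```python
import math


def purefield_output(a, g, scale=1.0):
    # repulsion = A - G (assumes A != G so the norm is nonzero)
    repulsion = [ai - gi for ai, gi in zip(a, g)]
    magnitude = math.sqrt(sum(r * r for r in repulsion))  # |A-G|   (confidence)
    tension = magnitude ** 2                              # |A-G|^2 (reaction)
    direction = [r / magnitude for r in repulsion]        # unit vector (concept)
    # output = scale * sqrt(tension) * direction == scale * (A - G)
    return [scale * math.sqrt(tension) * d for d in direction]


print(purefield_output([1.0, 2.0], [0.0, 0.0]))  # ~[1.0, 2.0]
```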

### PH-Based Monitoring

At each epoch, the pipeline:

1. Extracts direction vectors from both engines
2. Computes per-class mean directions
3. Builds a cosine distance matrix between class centroids
4. Runs Persistent Homology (H0) on this distance matrix
5. Compares train vs. test H0 – the gap predicts overfitting
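Steps 3-5 can be sketched without ripser: H0 death times over a distance matrix equal the minimum-spanning-tree edge lengths (the single-linkage merge heights). A pure-Python illustration on toy class centroids (helper names are mine, not the package's):

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return 1.0 - dot / (math.sqrt(sum(a * a for a in u)) *
                        math.sqrt(sum(b * b for b in v)))


def h0_merge_distances(centroids):
    # Prim's MST over the cosine-distance matrix: each added edge length
    # is the death time of one H0 component (a single-linkage merge).
    n = len(centroids)
    dist = {frozenset(p): cosine_distance(centroids[p[0]], centroids[p[1]])
            for p in combinations(range(n), 2)}
    in_tree, merges = {0}, []
    while len(in_tree) < n:
        j, edge = min(((j, min(dist[frozenset((i, j))] for i in in_tree))
                       for j in range(n) if j not in in_tree),
                      key=lambda t: t[1])
        in_tree.add(j)
        merges.append(edge)
    return sorted(merges)  # smallest first: the most-confused class pair


# Toy centroids: classes 0 and 1 point in nearly the same direction.
centroids = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
merges = h0_merge_distances(centroids)
print(merges[0] < 0.05 < merges[1])  # True: classes 0 and 1 merge first
```

The sorted merge distances are exactly the dendrogram merge order used in Phases 1 and 4: the smallest merge identifies the class pair most likely to be confused.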

### Key Insight

The topological structure of class confusion is determined within the first epoch and remains stable regardless of model architecture (architecture invariance).

This means:

- 1 epoch is enough to predict final difficulty and confusion pairs
- The H0 gap between train and test catches overfitting before accuracy diverges
- Merge order in the dendrogram directly maps to which classes will be confused

## Benchmarks (verified)

| Dataset | Accuracy | Difficulty | H0_ep1 | Best LR | Early Stop | Top Confusion | Time |
|---|---|---|---|---|---|---|---|
| MNIST | 98.3% | easy | 4.38 | 1e-03 | no | 3-5 | 2.2 min |
| Fashion | 87.4% | medium | 2.35 | 3e-04 | no | Pullvr-Coat | 2.2 min |
| CIFAR-10 | 52.0% | medium | 2.02 | 1e-03 | yes (ep 6) | cat-dog | 1.4 min |

Key observations:

- Difficulty prediction works: H0_ep1 runs from 4.38 (easiest) down to 2.02 (hardest), matching the final-accuracy ranking
- Early stopping fires correctly: CIFAR's H0_gap exceeded the threshold at epoch 4 and training stopped at epoch 6
- Confusion pairs match intuition: cat-dog, Pullvr-Coat, digits 3-5 – all well-known hard pairs
- LR search finds different optima: Fashion chose 3e-04 (not 1e-03), confirming the value of per-dataset tuning

## Comparison: Overfitting Detection

```text
Epoch  4 ───────────────────────────────────────

  Accuracy gap     ░░░░░░░░ +2.1%   (looks fine)
  H0 gap           ████████████████ 0.15  ⚠️ ALERT

Epoch  6 ───────────────────────────────────────

  Accuracy gap     ████████████████ +5.3%  (too late)
  H0 gap           ████████████████████ 0.22  🛑 STOP
```

The H0 gap detects overfitting two epochs before the accuracy gap diverges.

## Related Papers

- P-002: Persistent Homology Reveals Universal Confusion Structure in Neural Networks
- P-003: Topological Generalization Gap: Detecting Overfitting via H0 Persistence (r = 0.998)

## Citation

```bibtex
@software{ph_training,
  title={PH Training: Automatic Model Training Pipeline Powered by Persistent Homology},
  author={need-singularity},
  url={https://github.com/need-singularity/ph-training},
  year={2026}
}
```

## License

MIT
