GBMAP

Python · JAX · RDKit

GBMAP is a supervised dimensionality reduction method that learns new features by stacking simple one-layer perceptrons through gradient boosting. The idea is that each weak learner captures a different projection of the data that matters for prediction, and together the learners form an embedding in which irrelevant features are filtered out.

How it works

The method trains perceptrons sequentially. Each one tries to correct the errors left by all previous ones. The output of each perceptron becomes one coordinate in the new embedding space. Because we use softplus activations (a smooth version of ReLU), each weak learner essentially splits the feature space with a hyperplane and responds differently on each side.

This gives us:

  • A set of new features (the outputs of each weak learner)
  • A distance measure between points that ignores directions not useful for prediction
  • A way to detect when new data points are far from the training distribution
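
To make the boosting loop concrete, here is a minimal sketch in plain NumPy. Everything below (the function names, the squared-error loss, the per-learner gradient descent) is illustrative and is not the actual GBMAP implementation or its API; see the repository and the paper for the real thing.

import numpy as np

def softplus(z):
    # Numerically stable log(1 + exp(z)), the smooth ReLU used as activation.
    return np.logaddexp(0.0, z)

def fit_perceptron(X, r, lr=0.1, steps=500, seed=0):
    # Fit one weak learner f(x) = a * softplus(w @ x + b) to the residuals r
    # by plain gradient descent on the mean squared error.
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.1, size=X.shape[1])
    b, a = 0.0, 1.0
    for _ in range(steps):
        z = X @ w + b
        h = softplus(z)
        err = a * h - r                      # pointwise error of this learner
        sig = 1.0 / (1.0 + np.exp(-z))       # derivative of softplus
        g_a = np.mean(err * h)
        g_w = (err * a * sig) @ X / len(r)
        g_b = np.mean(err * a * sig)
        a -= lr * g_a
        w -= lr * g_w
        b -= lr * g_b
    return w, b, a

def gbmap_fit(X, y, n_learners=5):
    # Boosting loop: each perceptron corrects what the previous ones left.
    residual = y.astype(float).copy()
    learners = []
    for _ in range(n_learners):
        w, b, a = fit_perceptron(X, residual)
        residual -= a * softplus(X @ w + b)
        learners.append((w, b, a))
    return learners

def gbmap_embed(X, learners):
    # Each weak learner's output becomes one coordinate of the embedding.
    return np.column_stack([a * softplus(X @ w + b) for w, b, a in learners])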

Results

On standard regression and classification benchmarks, GBMAP performs comparably to XGBoost. But the main point is that the embedding features let simple linear models close the gap with more complex ones. A linear regression trained on GBMAP features often matches what you would get from a black-box model.
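
As a rough illustration of that workflow, continuing from the toy NumPy sketch above (gbmap_fit and gbmap_embed are the hypothetical functions defined there, not the package's API), fitting a plain linear model on the embedding might look like this:

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.1 * rng.normal(size=1000)

learners = gbmap_fit(X, y, n_learners=8)   # toy fit from the sketch above
Z = gbmap_embed(X, learners)               # nonlinear GBMAP-style features

linear = LinearRegression().fit(Z, y)      # simple model on learned features
print("R^2 on GBMAP features:", linear.score(Z, y))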

The method is also fast. Training on a million points with 25 features takes around 15 seconds.

One useful side effect: the distance in embedding space correlates with prediction uncertainty. Points that are far from the training data in this space tend to have higher errors, which helps with detecting distribution drift without needing ground truth labels.
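
One simple way to act on this, again building on the toy sketch above and using an assumed 99th-percentile nearest-neighbor cutoff (a choice made here for illustration, not taken from the paper):

import numpy as np
from sklearn.neighbors import NearestNeighbors

# X and learners as in the examples above; the shifted X_new is synthetic,
# just to have something out-of-distribution to flag.
rng = np.random.default_rng(1)
X_new = rng.normal(loc=2.0, size=(200, 10))
Z_train = gbmap_embed(X, learners)
Z_new = gbmap_embed(X_new, learners)

# Distance to the nearest training point in embedding space.
nn = NearestNeighbors(n_neighbors=1).fit(Z_train)
d_train, _ = nn.kneighbors(Z_train, n_neighbors=2)   # column 1 skips self-match
threshold = np.quantile(d_train[:, 1], 0.99)         # assumed cutoff

d_new, _ = nn.kneighbors(Z_new)
drifted = d_new[:, 0] > threshold                    # likely far from training
print("flagged", drifted.sum(), "of", len(drifted), "points")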


If you find this useful, please cite:

@article{patron2024gbmap,
  title={Gradient Boosting Mapping for Dimensionality Reduction and Feature Extraction},
  author={Patron, Anri and Prasad, Ayush and Luu, Hoang Phuc Hau and Puolam{\"a}ki, Kai},
  journal={arXiv preprint arXiv:2405.08486},
  year={2024}
}