
HexFormer: Hyperbolic Vision Transformer with Exponential Map Aggregation
Researchers have developed HexFormer, a hyperbolic vision transformer for image classification that employs exponential map aggregation in its attention mechanism. The architecture includes both a fully hyperbolic variant and a hybrid version that combines a hyperbolic encoder with a Euclidean classification head. Experiments show HexFormer outperforms standard Euclidean models and previous hyperbolic transformers across various datasets, with the hybrid variant achieving the best results. The study also finds that hyperbolic models offer improved gradient stability and reduced sensitivity to training strategies, suggesting practical advantages to using hyperbolic geometry for vision tasks.
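The summary does not spell out how exponential map aggregation works, but a common recipe in hyperbolic attention is: pull value vectors into the tangent space at the origin with the logarithmic map, take the attention-weighted combination there (where Euclidean arithmetic is valid), and push the result back onto the manifold with the exponential map. The sketch below is a minimal illustration of that idea on the Poincaré ball with curvature magnitude `c`; the function names (`exp_map0`, `log_map0`, `hyperbolic_aggregate`) and the choice of aggregating at the origin are assumptions for illustration, not HexFormer's actual implementation.

```python
import numpy as np

C = 1.0  # assumed curvature magnitude of the Poincare ball

def exp_map0(v, c=C):
    # Exponential map at the origin: lifts a tangent (Euclidean)
    # vector onto the Poincare ball (open unit ball for c = 1).
    n = np.linalg.norm(v, axis=-1, keepdims=True).clip(1e-9)
    return np.tanh(np.sqrt(c) * n) * v / (np.sqrt(c) * n)

def log_map0(x, c=C):
    # Logarithmic map at the origin: inverse of exp_map0,
    # pulls a point on the ball back to the tangent space.
    n = np.linalg.norm(x, axis=-1, keepdims=True).clip(1e-9, 1 - 1e-6)
    return np.arctanh(np.sqrt(c) * n) * x / (np.sqrt(c) * n)

def hyperbolic_aggregate(weights, values, c=C):
    # weights: (num_queries, num_keys) attention weights, rows sum to 1
    # values:  (num_keys, dim) points on the Poincare ball
    tangent = log_map0(values, c)   # pull values to the tangent space at 0
    mixed = weights @ tangent       # ordinary convex combination there
    return exp_map0(mixed, c)       # push the aggregate back to the ball

rng = np.random.default_rng(0)
vals = exp_map0(rng.normal(size=(4, 8)))  # 4 value vectors on the ball
w = np.full((2, 4), 0.25)                 # uniform attention from 2 queries
out = hyperbolic_aggregate(w, vals)
# Aggregated outputs remain valid points inside the unit ball.
assert np.all(np.linalg.norm(out, axis=-1) < 1.0)
```

Aggregating in tangent space keeps the weighted sum well defined (a direct Euclidean average of ball points can leave the manifold's geometry inconsistent), which also relates to the gradient-stability benefits the study reports.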










