
Learning on the Manifold: Unlocking Standard Diffusion Transformers with Representation Encoders

Source: arXiv
Original Author: Amandeep Kumar et al.

A new approach called Riemannian Flow Matching with Jacobi Regularization (RJF) addresses the convergence issues that diffusion transformers face when generating high-fidelity outputs from representation encoders. By keeping the generative process on manifold geodesics and correcting curvature-induced errors, RJF allows the DiT-B architecture (131M parameters) to reach an FID of 3.37, outperforming previous methods. The authors have released their code on GitHub.

Unlocking Standard Diffusion Transformers with Riemannian Flow Matching

Riemannian Flow Matching with Jacobi Regularization (RJF) resolves the convergence issues that standard diffusion transformers encounter when paired with representation encoders. With RJF, a standard diffusion transformer converges without expensive modifications.

Previous research linked convergence failures to a capacity bottleneck, but this study identifies Geometric Interference as the primary cause. This occurs when standard flow matching directs probability paths through low-density regions instead of along the manifold surface where data points are concentrated.
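The effect described above can be illustrated with a toy example. The sketch below is not the authors' code; it assumes, purely for illustration, that the data manifold is the unit circle (the article does not specify the geometry), and the names `lerp`, `x0`, and `x1` are hypothetical. It shows that the midpoint of a straight-line path between two on-manifold points falls well inside the circle, away from where the data lives.

```python
import math

def norm(v):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))

def lerp(x0, x1, t):
    """Straight-line (Euclidean) interpolation between x0 and x1."""
    return [(1 - t) * a + t * b for a, b in zip(x0, x1)]

# Two toy points on the unit circle, 120 degrees apart (assumed data).
x0 = [1.0, 0.0]
x1 = [math.cos(2 * math.pi / 3), math.sin(2 * math.pi / 3)]

# Midpoint of the straight-line path between them.
mid = lerp(x0, x1, 0.5)
mid_norm = norm(mid)

# Both endpoints have norm 1, but the midpoint does not: the path
# cuts through the interior of the circle instead of following it.
print(f"endpoint norms: {norm(x0):.3f}, {norm(x1):.3f}")
print(f"midpoint norm:  {mid_norm:.3f}")
```

In this toy setting the midpoint sits at half the radius, so a model trained on such paths would spend part of its capacity on points that no real sample resembles.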

Introducing Riemannian Flow Matching

The RJF method constrains the generative process to follow manifold geodesics, reducing curvature-induced error propagation. This allows the DiT-B architecture, with 131 million parameters, to achieve a Fréchet Inception Distance (FID) of 3.37 (lower is better), outperforming previous methods.
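To make the idea of following a geodesic concrete, here is a minimal sketch under the same illustrative assumption that the manifold is a unit sphere, where geodesics are great-circle arcs. It uses standard spherical linear interpolation (slerp), which is a swapped-in textbook technique and not necessarily the formulation used in the paper; the Jacobi regularization term is not shown. The helper names `slerp` and `slerp_velocity` are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    return math.sqrt(dot(v, v))

def angle(x0, x1):
    """Angle between two unit vectors, clamped for numerical safety."""
    return math.acos(max(-1.0, min(1.0, dot(x0, x1))))

def slerp(x0, x1, t):
    """Point at time t on the great-circle arc from x0 to x1 (unit vectors)."""
    omega = angle(x0, x1)
    if omega < 1e-8:  # endpoints coincide; nothing to interpolate
        return list(x0)
    s = math.sin(omega)
    a = math.sin((1 - t) * omega) / s
    b = math.sin(t * omega) / s
    return [a * p + b * q for p, q in zip(x0, x1)]

def slerp_velocity(x0, x1, t):
    """Time derivative of slerp: a possible regression target for a flow model."""
    omega = angle(x0, x1)
    if omega < 1e-8:
        return [0.0 for _ in x0]
    s = math.sin(omega)
    a = -omega * math.cos((1 - t) * omega) / s
    b = omega * math.cos(t * omega) / s
    return [a * p + b * q for p, q in zip(x0, x1)]

# Same toy endpoints as a straight-line comparison would use.
x0 = [1.0, 0.0]
x1 = [math.cos(2 * math.pi / 3), math.sin(2 * math.pi / 3)]

xt = slerp(x0, x1, 0.5)
vt = slerp_velocity(x0, x1, 0.5)

print(f"norm of geodesic midpoint: {norm(xt):.3f}")    # stays on the sphere
print(f"radial component of velocity: {dot(xt, vt):.3e}")  # tangent to the sphere
```

Unlike a straight-line path, every intermediate point here stays at unit norm and the velocity is tangent to the sphere, so the path never leaves the assumed manifold.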

Implications for Generative Modeling

The introduction of RJF enhances the fidelity of generative outputs. The research team has made the implementation of RJF publicly available on GitHub.

Related Topics:

Diffusion Transformers, Representation Encoders, Geometric Interference, Riemannian Flow Matching, Jacobi Regularization

📰 Original Source: https://arxiv.org/abs/2602.10099v1

All rights and credit belong to the original publisher.