
Step-resolved data attribution for looped transformers

Source: arXiv
Original Author: Georgios Kaissis et al.


Researchers have developed a new method, Step-Decomposed Influence (SDI), for analyzing how individual training examples affect a looped transformer at each step of its recurrent computation. Unlike existing methods, which provide only a single influence score, SDI produces a detailed influence trajectory across iterations. A TensorSketch-based implementation avoids materializing per-example gradients, keeping the method scalable to transformer models. Experiments demonstrate that SDI aligns closely with traditional full-gradient methods while improving data attribution and interpretability on algorithmic reasoning tasks.
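As a rough illustration of the sketching idea mentioned above (not the paper's actual code), the snippet below approximates a gradient dot product, the core quantity in TracIn-style influence, using a random projection in the spirit of TensorSketch. All names, dimensions, and the Gaussian projection are illustrative assumptions; the point is only that sketched gradients can stand in for full per-example gradients.

```python
import numpy as np

# Hypothetical sketch: TracIn-style influence is a dot product between a
# training example's gradient and a test example's gradient. Projecting both
# gradients into a low-dimensional space preserves inner products in
# expectation, so influence can be approximated without storing full
# per-example gradients.

rng = np.random.default_rng(0)
d, k = 10_000, 1_024                 # parameter dimension, sketch dimension

# Random projection scaled so that E[(S a) @ (S b)] = a @ b.
S = rng.standard_normal((k, d)) / np.sqrt(k)

g_train = rng.standard_normal(d)
# Correlated with g_train, as gradients of related examples would be.
g_test = g_train + 0.5 * rng.standard_normal(d)

exact = float(g_train @ g_test)               # full-gradient influence
approx = float((S @ g_train) @ (S @ g_test))  # sketched influence
```

With a sketch dimension of about a thousand, the approximation here typically lands within a few percent of the exact dot product, while each stored gradient shrinks from 10,000 entries to 1,024.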

New Method Enhances Data Attribution in Looped Transformers

Researchers have developed a novel approach, Step-Decomposed Influence (SDI), to improve the understanding of how individual training examples affect computation within looped transformers. This advancement addresses a significant limitation of existing methods, which provide only a single scalar score aggregating influence across all iterations, obscuring when during the computation an example actually matters.

SDI decomposes the influence reported by existing estimators such as TracIn into a detailed influence trajectory spanning the recurrent iterations. By unrolling the recurrent computation graph, the method attributes influence to specific loop iterations, offering a clearer picture of the reasoning carried out by the model.
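To make the decomposition concrete, here is a minimal toy sketch. It assumes a tied-weight loop and one natural way to split a TracIn-style score per iteration: because the total gradient with respect to a shared weight is the sum of the gradients through each unrolled copy, the aggregate dot product splits exactly into per-iteration terms. This is an illustrative construction, not necessarily the paper's exact estimator.

```python
import numpy as np

def forward(W_steps, x, T):
    """Run T loop iterations; W_steps[t] is the tied weight copy used at step t."""
    h = x
    for t in range(T):
        h = np.tanh(W_steps[t] @ h)
    return h

def loss(W_steps, x, y, T):
    return 0.5 * np.sum((forward(W_steps, x, T) - y) ** 2)

def per_step_grads(W, x, y, T, eps=1e-6):
    """Numerical gradient of the loss w.r.t. each tied copy of W (central differences)."""
    grads = []
    for t in range(T):
        g = np.zeros_like(W)
        for idx in np.ndindex(*W.shape):
            Ws = [W.copy() for _ in range(T)]
            Ws[t][idx] += eps
            up = loss(Ws, x, y, T)
            Ws[t][idx] -= 2 * eps
            down = loss(Ws, x, y, T)
            g[idx] = (up - down) / (2 * eps)
        grads.append(g)
    return grads

rng = np.random.default_rng(0)
T, d = 4, 3                                   # loop iterations, hidden size
W = 0.5 * rng.standard_normal((d, d))
x_train, y_train = rng.standard_normal(d), rng.standard_normal(d)
x_test, y_test = rng.standard_normal(d), rng.standard_normal(d)

g_train = per_step_grads(W, x_train, y_train, T)
G_test = sum(per_step_grads(W, x_test, y_test, T))   # full test gradient (tied)

# SDI-style trajectory: one influence value per loop iteration.
trajectory = [float(np.sum(gt * G_test)) for gt in g_train]

# Aggregate (TracIn-style) score: full-gradient dot product. By linearity it
# equals the sum of the trajectory, so the decomposition loses nothing.
aggregate = float(np.sum(sum(g_train) * G_test))
```

The trajectory reveals which iterations carry the influence that the single aggregate score would otherwise collapse together.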

Experimental Validation

Extensive experiments were conducted using looped GPT-style models on a range of algorithmic reasoning tasks. Results indicate that SDI scales effectively and aligns closely with full-gradient baselines, with low approximation error. This performance demonstrates SDI's potential as a reliable tool for data attribution and interpretability in machine learning.

Related Topics:

Step-Decomposed Influence, looped transformers, recurrent computation, TensorSketch implementation, data attribution

📰 Original Source: https://arxiv.org/abs/2602.10097v1

All rights and credit belong to the original publisher.
