AutoCompress: Critical Layer Isolation for Efficient Transformer Compression
Abstract
AutoCompress uses Critical Layer Isolation to protect the most important layer in small transformers while compressing the remaining layers, achieving substantial parameter reduction while preserving performance.
We present AutoCompress, a transformer compression method motivated by an empirical finding: in small transformers, Layer 0 carries disproportionately high task-critical information, with an NTK-based importance score of 3.6 compared to a maximum of 0.054 for all other layers -- a gap of over 60x. Based on this finding, we propose Critical Layer Isolation (CLI), an architecture that protects Layer 0 at full dimensionality, compresses all intermediate layers through a learned bottleneck, and restores the full dimension at the final layer. Applied to GPT-2 Medium (354.8M parameters), CLI-GPT2 achieves 204.5 perplexity on WikiText-103 with only 143.8M parameters -- a 2.47x compression ratio and 59.5% parameter reduction. Crucially, an ablation study demonstrates that a uniform bottleneck baseline of comparable size achieves only 571.8 perplexity under identical training conditions, confirming that the architectural decision to protect Layer 0 -- rather than simply reducing model size -- is the primary driver of performance. Code and checkpoints are publicly available.
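The CLI architecture described above can be sketched as follows. This is a minimal illustration, assuming a PyTorch-style transformer stack; the widths, layer counts, and module names are hypothetical stand-ins, not the paper's actual GPT-2 Medium configuration, and the learned down/up projections are one plausible realization of the "learned bottleneck".

```python
import torch
import torch.nn as nn


class Block(nn.Module):
    """A toy pre-norm transformer block operating at width d."""

    def __init__(self, d: int, n_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d, n_heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(), nn.Linear(4 * d, d))
        self.ln1, self.ln2 = nn.LayerNorm(d), nn.LayerNorm(d)

    def forward(self, x):
        h = self.ln1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]
        return x + self.mlp(self.ln2(x))


class CLIStack(nn.Module):
    """Critical Layer Isolation sketch: Layer 0 is kept at full width,
    intermediate layers run at a bottleneck width, and the final layer
    restores the full dimension. All sizes here are illustrative."""

    def __init__(self, d_full=256, d_bottleneck=96, n_mid=4, n_heads=4):
        super().__init__()
        self.layer0 = Block(d_full, n_heads)         # protected critical layer
        self.down = nn.Linear(d_full, d_bottleneck)  # learned compression
        self.mid = nn.ModuleList(Block(d_bottleneck, n_heads) for _ in range(n_mid))
        self.up = nn.Linear(d_bottleneck, d_full)    # restore full width
        self.final = Block(d_full, n_heads)          # full-width final layer

    def forward(self, x):
        x = self.layer0(x)      # full dimensionality preserved here
        z = self.down(x)
        for blk in self.mid:    # compressed intermediate layers
            z = blk(z)
        x = self.up(z)
        return self.final(x)


model = CLIStack()
out = model(torch.randn(2, 10, 256))  # (batch, seq_len, d_full)
```

Because the intermediate blocks operate at the bottleneck width, their attention and MLP parameter counts shrink roughly quadratically in the width ratio, which is where the bulk of the parameter savings would come from in such a design.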