Hey all,
I’m exploring building a code inspector (Mythos at home) with Hugging Face models. I’m currently working with gemma4, and while I can load the smaller versions just fine, when I try to add a bunch of source code to a prompt I get errors saying I don’t have enough memory. One attempt tried to allocate ~1.7 TB!
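For scale, here’s a quick way to see how many tokens a source dump actually turns into before generating (a minimal sketch; the model ID and file path are just examples, not my actual setup):

```python
from transformers import AutoTokenizer

# Example checkpoint; substitute whatever model you're actually loading
tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-4b-it")

# Example path; any source file or concatenated dump works
with open("some_module.py") as f:
    source = f.read()

n_tokens = len(tokenizer(source)["input_ids"])
print(f"prompt is {n_tokens} tokens; tokenizer.model_max_length = {tokenizer.model_max_length}")
```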
I’ve written a function:

```python
def query_llm(system_message, user_message, assistant_message):
    messages = [
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_message},
        {"role": "assistant", "content": assistant_message},
    ]
    # Render the chat template to a plain string without tokenizing yet
    text = processor.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=False,
    )
    inputs = processor(text=text, return_tensors="pt").to(model.device)
    input_len = inputs["input_ids"].shape[-1]
    # Generate, then decode only the newly generated tokens
    outputs = model.generate(**inputs, max_new_tokens=1024)
    response = processor.decode(outputs[0][input_len:], skip_special_tokens=False)
    return response
```
And I’m passing the source code in as the assistant message. Is this just the wrong approach? Is there any wisdom/guidance on how to do local code analysis?
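For reference, this is roughly how I’m calling it (the file path and prompts are just examples):

```python
# Example invocation; src/parser.py is a placeholder path
with open("src/parser.py") as f:
    code = f.read()

review = query_llm(
    system_message="You are a code inspector. Report bugs and risky patterns.",
    user_message="Please analyze the following module.",
    assistant_message=code,  # the whole source file goes in as the assistant turn
)
print(review)
```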