In Part 2 of this series, we solved the context window problem by implementing a recursive Map-Reduce pipeline. The system could finally "read" massive documents without losing the plot.
However, as the pipeline moved closer to a production-ready state, a new challenge emerged: LLM Fragility. Even the best 1B or 3B parameter models occasionally hallucinate, return malformed JSON, or suggest categories that don't exist.
To fix this, I moved away from "one-shot" inference and implemented a layer of Structured Enforcement and an Agentic Correction Loop. Here is how it works.
1. The Problem: Hallucinations and Messy JSON
Small local models are fast, but they are prone to "chatter"—adding conversational text before or after a JSON block. Even with strict system prompts, you might get:
"Sure! Here is your JSON: { ... } Hope this helps!"
Using a standard json.Unmarshal on this would fail. Furthermore, the model might invent categories like "Taxes_2024" when your folder structure only has "Finance".
2. Enforcing Strict JSON Schemas
The first layer of defense was to move from naive unmarshaling to a strict decoding process. I implemented a parseAndValidate method that uses Go’s json.Decoder with DisallowUnknownFields().
func (e *MLXEngine) parseAndValidate(content string) (*AnalysisResult, error) {
content = cleanJSON(content) // Extracts only the first { ... } block
var result AnalysisResult
dec := json.NewDecoder(strings.NewReader(content))
dec.DisallowUnknownFields() // Reject any keys not in our struct
if err := dec.Decode(&result); err != nil {
return nil, fmt.Errorf("invalid JSON or unexpected fields: %w", err)
}
// Check for trailing garbage data
var dummy json.RawMessage
if err := dec.Decode(&dummy); err != io.EOF {
return nil, fmt.Errorf("trailing data after JSON object")
}
// ...
}

This ensures that the AI's output isn't just "parsed"—it's enforced. If the model tries to add an extra explanation field, the parser rejects it instantly.
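The `cleanJSON` helper itself isn't shown above; a minimal sketch of what such a helper might do is brace counting to extract the first balanced `{ ... }` block (this simplified version does not handle braces inside JSON string literals):

```go
package main

import (
	"fmt"
	"strings"
)

// cleanJSON extracts the first balanced {...} block from a chatty response.
// Caveat: this sketch counts raw braces and does not skip braces that
// appear inside JSON string literals.
func cleanJSON(s string) string {
	start := strings.Index(s, "{")
	if start < 0 {
		return s
	}
	depth := 0
	for i := start; i < len(s); i++ {
		switch s[i] {
		case '{':
			depth++
		case '}':
			depth--
			if depth == 0 {
				return s[start : i+1]
			}
		}
	}
	return s[start:] // unbalanced: return the rest and let validation fail
}

func main() {
	raw := `Sure! Here is your JSON: {"category": "Finance"} Hope this helps!`
	fmt.Println(cleanJSON(raw)) // {"category": "Finance"}
}
```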
3. The Agentic Correction Loop
Validation is only half the battle. When validation fails, a standard script would just log an error and skip the file. A true agent, however, should try to fix itself.
I implemented a Correction Loop that detects validation errors and injects them back into the next prompt as feedback.
for attempt := 0; attempt <= maxRetries; attempt++ {
// ... send request ...
result, err := e.executeCategorization(ctx, reqBody)
if err == nil {
return result, nil // Logic passed!
}
// Inject the error back for the next retry
lastErr = err
userFeedback := fmt.Sprintf("Your previous response was invalid: %v. Please correct it.", lastErr)
// Add to next message history...
}

By providing the model with the exact reason it failed (e.g., "missing required field: confidence_score"), the model can self-correct. This simple loop significantly improved the success rate for categorization.
4. Deep Discovery & Nested Paths
Initially, the categories were shallow and hardcoded (e.g., "Algorithms", "Systems"). But real-world organization is hierarchical. To support this, I updated the Pipeline to perform Recursive Auto-Discovery.
The engine now scans the destination directory up to a depth of 3, allowing the AI to choose from nested paths like Work/Project-A or Personal/Finance/2024.
func (p *Pipeline) discoverCategories() ([]string, error) {
	var categories []string
	err := filepath.WalkDir(p.DestDir, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err // propagate unreadable entries instead of panicking on a nil DirEntry
		}
		if d.IsDir() && !strings.HasPrefix(d.Name(), ".") && path != p.DestDir {
			rel, relErr := filepath.Rel(p.DestDir, path)
			if relErr != nil {
				return relErr
			}
			categories = append(categories, filepath.ToSlash(rel))
		}
		return nil
	})
	return categories, err
}

The system prompt is dynamically updated with this full hierarchy, enabling "Deep Organization" without any manual configuration. If you create a subfolder, the agent learns to use it instantly.
5. Metadata-First Signals
Sometimes, the first 10,000 characters of a document are full of boilerplate, making it hard for the AI to classify. However, most PDFs contain a "hidden" layer of high-quality signals: The PDF Metadata.
I updated the extractor to pull titles, authors, and subjects directly from the PDF trailer. This metadata is prepended to the text content in a [METADATA] block.
pInfo := r.Trailer().Key("Info")
if !pInfo.IsNull() {
metadata := fmt.Sprintf("Title: %s\nAuthor: %s",
pInfo.Key("Title").String(),
pInfo.Key("Author").String())
content.WriteString("[METADATA]\n" + metadata + "\n\n[CONTENT]\n")
}

By prioritizing the official document title from the metadata over the raw body text, we optimize the categorization process. Metadata often provides a condensed, high-quality semantic signal that is more reliable than inferring context purely from the document's flow.
6. Generalizing the Domain
The final polish was removing the "Computer Science" bias from the original prompts. By moving to a generalized "Intelligent File Assistant" persona and providing universal default categories like Finance, Legal, and Health, the tool transitioned from a developer utility to a personal productivity powerhouse.
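One way to wire in those universal defaults is a simple fallback when category discovery finds nothing; this sketch (the category names and function are illustrative assumptions, not the tool's actual code) shows the idea:

```go
package main

import "fmt"

// defaultCategories is an assumed set of universal fallbacks,
// per the generalized persona described in the article.
var defaultCategories = []string{"Finance", "Legal", "Health", "Work", "Personal"}

// categoriesOrDefault uses the discovered folder hierarchy when present,
// otherwise falls back to generic defaults so the prompt is never empty.
func categoriesOrDefault(discovered []string) []string {
	if len(discovered) > 0 {
		return discovered
	}
	return defaultCategories
}

func main() {
	fmt.Println(categoriesOrDefault(nil))                  // falls back to defaults
	fmt.Println(categoriesOrDefault([]string{"Receipts"})) // keeps discovered set
}
```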
Conclusion: The Engineering of Reliability
The biggest lesson from building the docs_organiser is that reliability is an architectural choice, not a model capability.
You cannot wait for models to become "perfect." Instead, you build guardrails—strict validation, feedback loops, and dynamic context—to ensure that even a small, erratic 1B parameter model behaves like a reliable production service.
The system is now robust, self-correcting, and adaptable to any workflow.
References
- Go: Strict JSON Decoding - Technical documentation on using DisallowUnknownFields to enforce schema rigidity.
- Go filepath Package: Recursive Discovery - Documentation for the recursive category discovery and filesystem traversal patterns.
- PDF Specification: Metadata Trailers - Background on how document titles and authors are stored in the Info dictionary.
- Agentic Patterns: Reflection - The industry-standard pattern for the "Self-Correction Loop" implemented in the agent.