Three thousand three hundred and sixteen comic book pages. That's how much source material one developer fed through Gemini's vision models to recreate Uno, a beloved AI character from PKNA, a 1990s Italian comic series.
The project is a detailed case study in a question every AI builder eventually faces: when you want an LLM to behave like a specific character, do you fine-tune a model or engineer the context around it?
Fine-Tuning Was a Dead End
The developer tried supervised fine-tuning first and ran into the problems that make it impractical for most indie projects. Every new model release means redoing the training. Fine-tuned models also suffer from "catastrophic forgetting", where learning new behavior erases old capabilities, like a student who forgets math after cramming for a history exam. Hand-written system prompts worked for famous characters like Batman or Sherlock Holmes (because the base model already knows them from its training data), but failed for niche characters the model had never seen.
Prompt optimization with DSPy, a framework that automatically tunes prompts for better results, showed diminishing returns with stronger models. The better the base model, the less prompt tricks helped.
What Actually Worked: A "Soul Document"
The winning approach used three components: a structured personality document in the system prompt (about 7,000 tokens, or roughly 15 pages of text), agentic search tools that let the AI look up specific scenes from the comic universe, and a capable foundation model like Gemini 2.5 Pro.
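To make the "search tool" component concrete, here is a minimal sketch of a scene lookup the model could call. It is not the developer's implementation: the `Scene` type, the `search_scenes` function, and the keyword-overlap ranking are all assumptions chosen for illustration, and a real system might well use embeddings or a proper search index instead.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    issue: str  # hypothetical identifier for where the scene appears
    text: str   # extracted dialogue and events for that scene


def search_scenes(query: str, scenes: list[Scene], top_k: int = 3) -> list[Scene]:
    """Rank scenes by how many query words they share (a deliberately naive scorer)."""
    terms = set(query.lower().split())
    scored = []
    for scene in scenes:
        overlap = len(terms & set(scene.text.lower().split()))
        if overlap:  # skip scenes with no shared words at all
            scored.append((overlap, scene))
    # Sort by overlap only; Python's stable sort keeps input order for ties.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [scene for _, scene in scored[:top_k]]
```

The point of exposing this as a tool rather than pasting every scene into the prompt is that the personality document stays small while the model can still pull in specific source material when a conversation calls for it.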
The personality document wasn't written by hand. The developer built an extraction pipeline that processed comic scans through vision models, pulled out dialogue and events, then generated "claims" about the character's personality - each backed by specific evidence from the source material. Claims were structured hierarchically and filtered by evidence thresholds, so weak characterizations got dropped automatically.
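The claim filtering described above can be sketched roughly as follows. The article does not give the actual data model or threshold, so the `Claim` class, the `prune` function, and the default of two pieces of evidence are hypothetical; the rule that a parent survives when any child survives is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Claim:
    text: str                                               # e.g. a personality trait
    evidence: list[str] = field(default_factory=list)       # page or scene references
    children: list["Claim"] = field(default_factory=list)   # more specific sub-claims


def prune(claim: Claim, min_evidence: int = 2) -> Optional[Claim]:
    """Drop claims that lack enough evidence and have no surviving sub-claims."""
    kept = []
    for child in claim.children:
        pruned_child = prune(child, min_evidence)
        if pruned_child is not None:
            kept.append(pruned_child)
    if len(claim.evidence) < min_evidence and not kept:
        return None  # weak characterization: not enough support anywhere below
    return Claim(claim.text, list(claim.evidence), kept)
```

Because every surviving claim still carries its evidence list, the resulting document can be traced back to the pages that justify each statement.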
A clever trick solved consistency across thousands of pages: instead of dumping everything into one massive prompt, the system processed pages sequentially, carrying forward a running summary. Each new page was analyzed alongside the summary produced up to the previous page, so context built up incrementally.
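The sequential pass with a carried-forward summary amounts to a simple fold over the pages. The sketch below assumes a hypothetical `analyze` callable that takes a page and the summary so far and returns that page's extraction plus an updated summary; in the real pipeline this would be a vision-model call, which is not shown here.

```python
from typing import Callable, Iterable

# analyze(page, summary_so_far) -> (extraction_for_page, updated_summary)
Analyzer = Callable[[str, str], tuple[str, str]]


def process_pages(pages: Iterable[str], analyze: Analyzer) -> tuple[list[str], str]:
    """Process pages in order, passing each call the summary built so far."""
    summary = ""  # nothing is known before the first page
    extractions = []
    for page in pages:
        extraction, summary = analyze(page, summary)
        extractions.append(extraction)
    return extractions, summary


def toy_analyze(page: str, summary: str) -> tuple[str, str]:
    """Placeholder standing in for a model call; it just concatenates text."""
    return f"events from {page}", (summary + " " + page).strip()
```

Each call sees only one page and a compact summary, so prompt size depends on the length of that summary, not on how many pages have already been processed.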
The Practical Takeaway
The total cost was under $56 - about $50 for vision processing and $6 for text analysis, using Gemini's pricing. The approach is portable across models since it relies on plain text and tool use rather than model-specific weights.
The biggest lesson is one that applies well beyond comic book characters: model capability matters more than prompt cleverness. No amount of context engineering compensates for a weak base model. But given a strong model, a well-structured personality document plus retrieval tools can produce convincing character behavior without any fine-tuning at all.
For anyone building AI assistants, chatbots, or character-based applications, the evidence-backed personality document approach is worth studying. It's cheaper, more portable, and more maintainable than fine-tuning, and because every claim is tied to evidence, you can actually audit why the AI behaves the way it does.