Seventeen chat threads, one day, and a working enterprise knowledge system at the end of it. That was the result when engineer Anthony Putignano decided to stop running toy demos and instead threw an AI coding agent at a real engineering problem.
The project: build a centralized, searchable knowledge platform that pulls data from Jira, Notion, and Readwise Reader. This was not a tutorial exercise but production-grade infrastructure: data ingestion via Airbyte, Docker containers, PostgreSQL with pgvector for semantic search (results are ranked by how close their embeddings are to the query's embedding, not by exact keyword matches), and a Flask API layer on top.
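To make the semantic search idea concrete, here is a minimal sketch in plain Python of the ranking that a vector store performs: each document has an embedding, and results are ordered by cosine distance to the query embedding. The document titles and three-dimensional vectors are invented for illustration and are not from Putignano's system; a real setup would get embeddings from a model and let PostgreSQL do the ranking.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical documents with made-up 3-dimensional embeddings.
DOCS = {
    "Jira: login page returns 500": [0.9, 0.1, 0.0],
    "Notion: onboarding checklist": [0.1, 0.9, 0.1],
    "Reader: article on auth failures": [0.8, 0.2, 0.1],
}


def search(query_vec, docs, k=2):
    """Return the k document titles closest to the query embedding."""
    # pgvector expresses the same ordering in SQL with its cosine
    # distance operator, e.g. ORDER BY embedding <=> query LIMIT k.
    ranked = sorted(docs, key=lambda title: cosine_distance(query_vec, docs[title]))
    return ranked[:k]


if __name__ == "__main__":
    print(search([1.0, 0.0, 0.0], DOCS))
```

The two documents about failures rank ahead of the onboarding checklist even though they share no keywords with each other, which is the behavior keyword search cannot provide.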
Putignano used OpenAI's Codex as his primary agent throughout. The system worked, but the interesting part is not that it got built. It is how it got built.
The Real Skill Is Debugging, Not Generating
The biggest takeaway cuts against the popular narrative that AI coding tools are about producing code faster. Putignano found that the actual value came from compressing the gap between "something is broken" and "I know what to do next." The agent was most useful when fed real evidence (logs, schema dumps, actual error output) and asked to diagnose problems rather than generate code from scratch.
This tracks with what many experienced developers report after months of daily use: AI agents are better debugging partners than they are code generators.
Six Practices That Actually Worked
Putignano distilled his approach into six patterns:
- Treat the agent as a collaborator, not a vending machine. Conversations, not one-shot prompts.
- Ground every request in evidence. Paste the actual log output. Show the real schema. Vague descriptions produce vague code.
- Work in short, testable loops. Build something small, verify it works, then move on. Long autonomous runs produce compounding errors.
- Feed repository-specific context. The agent does not know your codebase unless you show it.
- Codify discoveries immediately. When you and the agent figure something out, write it down as a reusable asset so you do not re-solve it.
- Refactor aggressively when patterns emerge. AI-generated code tends toward duplication. Catch it early.
None of this is surprising if you have spent real time with these tools, but it is useful to see it validated on a non-trivial project with actual infrastructure dependencies, not just a React todo app.
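The "ground every request in evidence" practice can be sketched as a small helper that assembles a diagnostic prompt from verbatim artifacts instead of a paraphrase. This is an illustrative assumption about how one might structure such a request, not Putignano's actual tooling; the function name, the parameters, and the sample inputs are all hypothetical.

```python
def build_diagnostic_prompt(symptom, log_excerpt, schema_dump, max_log_lines=40):
    """Assemble a prompt that asks for a diagnosis and carries raw evidence."""
    # Keep only the tail of the log, on the assumption that the most
    # recent lines are the ones that contain the failure.
    tail = "\n".join(log_excerpt.strip().splitlines()[-max_log_lines:])
    return "\n\n".join([
        f"Symptom: {symptom}",
        "Relevant log output (verbatim):\n" + tail,
        "Current schema (verbatim):\n" + schema_dump.strip(),
        "Diagnose the most likely cause before proposing any code change.",
    ])


if __name__ == "__main__":
    # Hypothetical inputs; in practice these would be pasted from the
    # real terminal output and the real database.
    print(build_diagnostic_prompt(
        symptom="ingest job fails after sync",
        log_excerpt="starting sync\nfetching records\nERROR: example failure (hypothetical)",
        schema_dump="CREATE TABLE documents (id int, body text);",
    ))
```

Ending the prompt with a request for a diagnosis rather than a fix mirrors the article's point: the agent is asked to explain the evidence first, and code changes follow in the next short loop.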
What This Tells Us About the Current State
The one-day timeline is impressive, but context matters. Putignano clearly knew what architecture he wanted before starting. The AI did not make design decisions. It executed them faster and helped troubleshoot when things broke. That is a meaningful productivity gain, but it is a different story than "AI replaces engineers."
For anyone using AI coding tools daily, the practical lesson is clear: invest your time in getting better at prompting with real context and working in tight feedback loops, rather than expecting longer autonomous runs to produce better results. The agents are good. They are not yet trustworthy enough to let run unsupervised on anything complex.