My Current AI Production Coding Workflow
In Early Lessons on AI Agent Productivity I talked about the early stages of my AI coding journey. After a couple of months of forcing myself to use coding agents, I was still only about a 1x developer. A few weeks off to experiment on side projects let me start finding patterns that might let me work faster. At this point, especially since the release of Claude Opus 4.5 (and later 4.6!), I’ve found an AI coding workflow that I’m very happy with for writing production-quality code with minimal human input. In this post, we will explore that workflow and how I built it.
Bootstrapping and Self-Improving Workflows
Before I get into the coding workflow, I really need to talk about how I made this workflow, and how I keep it up to date.
Today, every part of my workflow is solidified as a skill (e.g. Claude Skills, Kiro Prompts). For each step in my workflow, I started by carefully crafting a prompt describing what I wanted the agent to do. I’d revise it, tell it what I forgot to ask for, encourage follow-up questions, and keep going until the output was solid. Then I’d run my favorite skill: /capture-skill
The prompt for that skill goes roughly like this: “Please review this chat session, I want to capture what we did as a new skill called /whatever - feel free to ask me any questions where the chat history is unclear. If applicable please also take the output artifact and convert it to a template within the skill definition for reuse.”
That skill is one of the most magical things I’ve ever discovered with agentic coding, and if you take exactly one thing away from this post, let it be that. Of course, it also works for improving skills. /improve-skill goes roughly like this: “Please review this chat session where I used the /needs-improvement skill. From the follow-up discussion, suggest improvements to that skill and, if approved by the human user, make those changes. This skill invocation will be accompanied by an explanation of why the user was unsatisfied. Ask for that feedback if it is not provided.”
These two skills, plus a few well-crafted prompts, are your path to really tightening up your workflow for absolutely anything. In my opinion, this approach gets better results than having agents write a command from whole cloth. It’s how I developed every step of this workflow.
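To make this concrete, here is a rough sketch of the kind of file /capture-skill ends up writing. This is a hypothetical example, assuming the SKILL.md-with-YAML-frontmatter layout that Claude Skills use; the skill name, steps, and template headings are illustrative, not a copy of my actual skill file.

```markdown
---
name: requirements-doc
description: Turn a stream-of-consciousness feature description into a structured requirements document under docs/requirements/.
---
<!-- Hypothetical sketch: the steps and template below are illustrative,
     not my literal skill file. -->

When this skill is invoked:

1. Ask clarifying questions wherever the user’s description is ambiguous.
2. Write a Markdown requirements document to docs/requirements/<feature-name>.md.
3. Structure the document using the template captured from the original session:

   # <Feature Name> Requirements
   ## Problem Statement
   ## User Stories
   ## Functional Requirements
   ## Non-Functional Requirements
   ## Open Questions
```

The template piece matters: per the /capture-skill prompt above, the output artifact from the original session gets converted into a reusable template inside the skill definition.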
Coding Workflow
Planning Phase
When I came back to full-time work after my second parental leave, I decided to push myself to use the Kiro IDE (https://kiro.dev/). I soon found myself drifting back to a multi-CLI setup, but there’s one thing Kiro clearly gets right that I’ve captured into my workflow: the Requirements-Design-Tasks pipeline. Even after returning to CLI agents, those planning insights heavily inspired how I plan coding projects today.
- Requirements: I call a `/requirements-doc` skill and give it a full stream-of-consciousness description of the requirements of what I am building. It will produce a Markdown document under the `docs/requirements/` folder in my workspace.
- High Level Design: I call a `/design-high-level` skill, which expects a `docs/requirements/` input file (it is set to ask me if it is unsure which requirements doc to use), and I may add some commentary about other considerations or design decisions I’ve already made. In the `docs/designs/` folder it will produce a high level design document in Markdown, with Mermaid diagrams as appropriate. A high level design will cover architectural changes and decisions, major modules, data flow, how data is modeled, and security concerns.
- Low Level Design: I call a `/design-low-level` skill, which expects a high level design in `docs/designs/` to build on. Low level design focuses on detailed component-level logic, especially class diagrams and perhaps further refinements to how modules are separated (e.g. Rust Cargo package separation within a project). Component interactions and a high level view of component APIs are a major part of the design. The intention is that the low level design should make task graph generation straightforward and help ensure separation of concerns when we actually start coding.
- Task Plan: I call a `/plan-tasks` skill, which expects a low level design in `docs/designs/` and uses it to produce a list of discrete tasks (each scoped to be completed with high confidence in a single context window, most often 1-2 code files plus tests). It also creates a dependency graph of these tasks, so that we know both completion order and which tasks can be parallelized (see the sketch after this list).
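To give a feel for the /plan-tasks output, here is a hypothetical dependency graph for an invented set of tasks; the real artifact is a Markdown task list plus a Mermaid graph along these lines, generated from the low level design.

```mermaid
%% Hypothetical /plan-tasks dependency graph; the tasks are invented for illustration.
graph TD
    T1["T1: config module and data types"] --> T2["T2: storage layer behind a Store trait"]
    T1 --> T3["T3: HTTP handlers against the Store trait"]
    T2 --> T4["T4: wiring in main plus integration tests"]
    T3 --> T4
```

The graph gives completion order, and anything whose dependencies are already done (T2 and T3 once T1 lands, in this sketch) can be handed to parallel agent sessions.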
For each of these steps there is a matching review skill, like /review-requirements-doc, which I run as a subagent to consistently apply the most common standards for that step (and indeed, patterns I’ve captured when reviewing previous artifacts and fed into /improve-skill). The review-improve cycle is repeated until I’m happy with the output of each step.
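For a sense of what such a review skill encodes, a /review-requirements-doc might reduce to a checklist along these lines; the items below are purely illustrative of the kind of standards that accumulate through /improve-skill, not my actual skill.

```markdown
<!-- Hypothetical checklist inside a /review-requirements-doc skill; items are illustrative. -->
Review the given docs/requirements/ document and report findings as a list:

- Is every requirement testable, or is it vague ("fast", "intuitive") with no measurable target?
- Are failure modes and edge cases called out, not just the happy path?
- Are non-functional requirements (performance, security, operability) addressed?
- Are open questions listed explicitly rather than silently assumed?
- Does anything contradict another requirement in the same document?
```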
Once these steps are complete, we are ready to code, one task at a time.
Development Phase
In the development phase, we perform a loop for each task. If the tasks are set up so that some can be worked on in parallel (e.g. different Cargo packages that are not dependent on one another), then this loop runs in multiple terminals at once.
- Implement: I call an `/implement-task` skill. It first ensures we are on a non-`main` branch, then implements the task as called for in the task plan and low level design. The implementation is done using strict TDD Red/Green/Refactor loops, which I have anecdotally found to produce higher quality tests alongside high test line coverage.
- Review: I call a `/code-review` skill, which works on the uncommitted diff. This will actually spawn 3-5 subagents that review the code from different facets like maintainability, security, performance, and so on. The top-level agent synthesizes the subagent reviews into a single review document (which I tend to store in a `reviews/` folder that is in `.gitignore` and not committed to source, but that’s your call on how you like to run your workflow).
- Loop: I repeat the implement and review steps, having the agent address its own feedback and triggering new code reviews until the review agents are satisfied. At that point, I perform a human review before letting the agent commit the task changes to the branch and mark the task complete. Pre-commit hooks run before each commit, and if those hooks fail, we repeat the cycle (the loop is sketched below).
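Roughly, the per-task loop can be drawn as the diagram below. It is a simplified view of the process described above, not an artifact any of the skills produce.

```mermaid
%% Simplified sketch of the per-task development loop described above.
graph TD
    A["implement-task (strict TDD Red/Green/Refactor)"] --> B["code-review (3-5 subagent facets, synthesized)"]
    B --> C{"Review agents satisfied?"}
    C -->|no, address feedback| A
    C -->|yes| D["Human review"]
    D --> E["Commit (pre-commit hooks run)"]
    E -->|hooks fail| A
    E -->|hooks pass| F["Mark task complete"]
```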
What has strongly encouraged me is that this process produces great code most of the time: my human reviews are often just a look-over before saying “Yes” to my agent. Sometimes I have small feedback or stylistic preferences. Truly major issues are increasingly rare.
Once all tasks are done, a Pull Request is created and sent off for another pair of human eyes (and any review agents that run against the repo).
Conclusion
I feel like the events that led to this post mark the moment in the adoption curve where I crossed the chasm. AI coding is now a significant productivity boost for me on real-world, production problems. The fact that it’s also good at self-reinforcing loops tells me it’s only going to get better. Am I a 10x developer? I don’t feel that way, but the real lesson is that code execution just isn’t the bottleneck anymore. At the same time, I am certainly not a 1x developer anymore either!