From Low-Code to AI Engineering

Introduction

It’s been a while since I last wrote here, and a lot has changed in the meantime. Somewhere along the way, I moved out of the low-code world and into full-on AI engineering. Not in the “I let AI do random things for me” sense, but in the much more practical sense of building with AI, while also building the systems around it.

I’m also not a full-time vibe coder. I’m still a developer first — just one who uses AI heavily and spends a lot of time making AI systems cheaper, more reliable, and easier to control. That difference matters, especially now that more people are starting to experiment with agents without always seeing what’s happening under the hood.

Why I’m Writing This

I wanted to put together something useful for people who are either just starting with AI agents, or have been curious but haven’t really built them into their workflow yet. There’s an endless stream of new tools, plugins, wrappers, and extensions right now, and it’s easy to get distracted by novelty instead of focusing on what actually holds up in daily use.

So this is not meant to be a theory-heavy post. It’s a practical write-up of the stack and habits that helped me make coding agents faster, cheaper, and more predictable. The good part is that the overall approach is agent-agnostic, so the same ideas carry over whether you use Codex, Claude Code, Hermes, Pi, OpenCode, or something else.

Start with the System Prompt

If an agent behaves badly, a lot of the time the problem starts before the first tool call. One of the easiest improvements is adding a section to the system prompt that clearly describes the development environment, because agents make better tool decisions when they know what environment they are actually operating in.

This is especially useful in a native Windows sandbox. If the agent assumes Linux or Bash by default, it may keep making the wrong kind of command call, fail repeatedly, and burn tokens for no good reason. In practice, giving it explicit rules like “this is Windows-native,” “don’t use Bash,” and “retry in PowerShell format if needed” can remove a surprising amount of pointless failure.
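As a rough illustration, an environment section like this can be generated rather than hand-written. This is a minimal sketch, not any specific agent's configuration format: the function names and the exact wording of the rules are my own assumptions, and only `platform.system()` comes from the Python standard library.

```python
import platform


def environment_section(os_name, shell, forbidden_shells):
    """Render a plain-text environment block for a system prompt.

    The wording of each rule is illustrative, not a required format.
    """
    lines = [
        "Environment:",
        f"- Operating system: {os_name}",
        f"- Shell for all commands: {shell}",
    ]
    for name in forbidden_shells:
        lines.append(f"- Do not use {name} syntax.")
    lines.append(f"- If a command fails on syntax, retry it in {shell} format.")
    return "\n".join(lines)


def detect_environment_section():
    """Pick rules based on the host OS (hypothetical defaults)."""
    if platform.system() == "Windows":
        return environment_section("Windows (native)", "PowerShell", ["Bash"])
    return environment_section(platform.system(), "POSIX shell", [])


if __name__ == "__main__":
    print(detect_environment_section())
```

The resulting text can be appended to whatever system prompt the agent already uses. The point is that the rules are explicit and match the machine the agent is actually running on.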

Behavior Before Hype

If we’re building coding agents, I think Karpathy-style behavior rules should be treated as baseline discipline. The public skill writeups built around those ideas consistently emphasize explicit assumptions, simplicity, surgical edits, and goal-driven execution — which is a much healthier default than letting an agent improvise its way through a repository.

That matters a lot for beginners too. Many people start by adding a model and a few tools, then expect the setup to become useful on its own. Usually what’s missing is not another extension, but a better behavioral contract: when to inspect, when to search, when to ask, when to change code, and when to stop talking.

Context Is Where the Cost Goes

Most of the token burn comes from context. In my experience, agents often fail not because the model is weak, but because the context is bloated, repetitive, or full of irrelevant junk.

The three-layer approach I’ve been leaning on is simple:

RTK — runtime-based calls instead of stuffing everything into static prompt context.
FFF — targeted file search instead of dragging large parts of the repository into every task.
Headroom AI — a compression layer between the agent and the inference API that uses reversible compression, so the model can work on a smaller context while still preserving access to the original data.

That last part is especially interesting. Headroom publicly describes its CCR model as “Compress-Cache-Retrieve,” where compressed content stays recoverable and the model can retrieve the original when needed, while still achieving large token savings on many workloads.
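To make the compress-cache-retrieve idea concrete, here is a toy sketch. It is not Headroom's implementation or API. The class name, the `ref=` marker format, and the use of plain truncation as a stand-in for real compression are all assumptions made for illustration.

```python
import hashlib


class ReversibleContextStore:
    """Toy compress-cache-retrieve store.

    Long text is replaced by a short stub, and the original is kept in a
    cache so it can be fetched again by reference. Truncation stands in
    for a real compression step.
    """

    def __init__(self, preview_chars=80):
        self.preview_chars = preview_chars
        self._originals = {}

    def compress(self, text):
        """Return the text unchanged if short, else a stub with a reference."""
        if len(text) <= self.preview_chars:
            return text
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]
        self._originals[key] = text  # cache the original for later retrieval
        preview = text[: self.preview_chars]
        dropped = len(text) - self.preview_chars
        return f"{preview}... [truncated {dropped} chars, ref={key}]"

    def retrieve(self, key):
        """Return the original text for a reference emitted by compress()."""
        return self._originals[key]


if __name__ == "__main__":
    store = ReversibleContextStore(preview_chars=20)
    stub = store.compress("line of build output\n" * 50)
    print(stub)
    ref = stub.split("ref=")[1].rstrip("]")
    print(len(store.retrieve(ref)))
```

In a setup like this, the model sees only the stub by default and would call a retrieval tool with the reference when it needs the full content.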

Codebase Intelligence Matters

For codebase intelligence, CodeGraph has worked really well for me. It is described publicly as a CLI-oriented code intelligence tool that helps query code structure, relevant fragments, and dependency relationships without requiring another hosted provider or a heavy external workflow.

That kind of tooling matters more than it looks. A coding agent that can navigate structure and connected logic is much less likely to act like it only skimmed a few files and guessed the rest. Better code understanding usually means fewer wrong edits, fewer broad searches, and less wasted iteration.
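For a feel of what "dependency relationships" means in practice, here is a small sketch that builds an import graph with Python's standard `ast` module. It is not how CodeGraph works and does not use its CLI. The function names and the in-memory `sources` mapping are assumptions for illustration, and relative imports are ignored.

```python
import ast


def imported_modules(source):
    """Return the module names imported by a piece of Python source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module)
    return modules


def dependency_graph(sources):
    """Map each project module to the other project modules it imports.

    `sources` maps module name -> source text; anything outside that
    mapping (stdlib, third-party) is left out of the graph.
    """
    return {
        name: {m for m in imported_modules(text) if m in sources and m != name}
        for name, text in sources.items()
    }


def dependents_of(graph, target):
    """Return the modules that import `target` directly."""
    return {name for name, deps in graph.items() if target in deps}


if __name__ == "__main__":
    sources = {
        "app": "import db\nfrom utils import helper\nimport os\n",
        "db": "import utils\n",
        "utils": "",
    }
    graph = dependency_graph(sources)
    print(sorted(dependents_of(graph, "utils")))
```

An agent that can ask "what depends on `utils`?" before editing it has a much better chance of making a surgical change instead of a speculative one.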

Workflow and Guardrails

For new tasks, I like having a fixed base workflow. The exact naming can vary, but the pattern is usually the same: inspect first, gather context, plan briefly, execute, and verify. Community descriptions of skill-based agent workflows make the same case — agents are more reliable when they follow tested workflows instead of improvising from scratch every time.
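The fixed pattern can be sketched as an ordered list of stages that stops at the first failure. The stage names mirror the steps above. Everything else (the `run_workflow` function, the shared `state` dict, the boolean return convention) is a hypothetical design, not the format of any particular agent or skill system.

```python
# Fixed order: a stage never runs before the ones listed ahead of it.
STAGE_ORDER = ["inspect", "gather_context", "plan", "execute", "verify"]


def run_workflow(stages, state):
    """Run the stages in order and return the names that completed.

    `stages` maps a stage name to a callable taking the shared `state`
    dict and returning True on success. The run stops at the first
    stage that returns False.
    """
    missing = [name for name in STAGE_ORDER if name not in stages]
    if missing:
        raise ValueError(f"missing stages: {missing}")

    completed = []
    for name in STAGE_ORDER:
        if not stages[name](state):
            break  # do not execute or verify on top of a failed step
        completed.append(name)
    return completed


if __name__ == "__main__":
    def ok(state):
        return True

    def failing_execute(state):
        state["error"] = "tests failed"
        return False

    stages = {name: ok for name in STAGE_ORDER}
    stages["execute"] = failing_execute
    state = {}
    print(run_workflow(stages, state), state)
```

The value is less in the code than in the constraint: the agent cannot skip inspection or claim success without reaching the verify stage.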

On top of that, project-level constraints and workspace guardrails help a lot. If you don’t define boundaries, the agent will invent them, and that is rarely where you want creativity. Clear rules about what it can touch, what tools it should prefer, and how much it should explain make the whole setup easier to trust.
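One simple form of a workspace guardrail is a path check that runs before any file edit. This is a sketch under my own assumptions: the function name, the allowlist of directories, and the protected patterns are hypothetical, and paths are treated as workspace-relative with forward slashes.

```python
from pathlib import PurePosixPath


def is_allowed(path, allowed_dirs, protected):
    """Return True if a workspace-relative path may be edited.

    `allowed_dirs` lists directories the agent may touch; `protected`
    lists glob patterns (matched from the right) that are always off limits.
    """
    candidate = PurePosixPath(path)
    if candidate.is_absolute() or ".." in candidate.parts:
        return False  # reject anything that could escape the workspace
    if any(candidate.match(pattern) for pattern in protected):
        return False
    for directory in allowed_dirs:
        prefix = PurePosixPath(directory).parts
        if candidate.parts[: len(prefix)] == prefix:
            return True
    return False


if __name__ == "__main__":
    allowed = ["src", "tests"]
    protected = [".env", "*.lock"]
    for p in ["src/app.py", "src/.env", "../secrets.txt", "docs/readme.md"]:
        print(p, is_allowed(p, allowed, protected))
```

A check like this belongs in the tool layer, not only in the prompt, so a rule the model forgets is still enforced.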

Keep the Output Short

This is one of my strongest preferences: output should be short and useful. I don’t need a 15-line explanation for a 5-line code change, and I definitely don’t need a summary of the summary after that.

Short output is easier to review, cheaper to generate, and usually more honest. The more I use these systems, the more I feel that verbosity often gets mistaken for intelligence, when in reality it is just extra latency and token spend.
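If you want to enforce this instead of just asking for it, a crude check is enough to start with. The sketch below flags replies whose explanation is longer than the change it describes. The function name and the default ratio of one explanation line per changed line are arbitrary assumptions.

```python
def explanation_is_too_long(explanation, changed_lines, max_ratio=1.0):
    """Flag a reply whose explanation outweighs the code change.

    Counts non-empty explanation lines and compares them with the number
    of changed code lines, scaled by `max_ratio`.
    """
    explanation_lines = [line for line in explanation.splitlines() if line.strip()]
    return len(explanation_lines) > max_ratio * max(changed_lines, 1)


if __name__ == "__main__":
    long_reply = "\n".join(f"Point {i}" for i in range(15))
    print(explanation_is_too_long(long_reply, changed_lines=5))
    print(explanation_is_too_long("Renamed the helper.", changed_lines=5))
```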

What Changed for Me

The reason I care about all of this is simple: it had a measurable impact for me. Even before adding a more serious system prompt, just using runtime-oriented context handling was already reducing token usage noticeably. Later additions around behavior, retrieval, compression, and codebase intelligence pushed that further.

My rough takeaway after a week was straightforward: fewer wasted tokens, better throughput, and less time lost to failed tool behavior. That’s the part I think newcomers should pay attention to — AI agents become much more useful when you treat them like systems that need structure, not magic assistants that somehow figure everything out on their own.

These Tools Are Worth Trying

The best part is that a lot of this stack is open and accessible. Karpathy-inspired skills are shared openly, Headroom documents its compression model publicly, and CodeGraph is presented as a local-first CLI-oriented approach to code intelligence.

So if you’re just getting started with agents, my advice is simple: don’t obsess over hype first. Start with behavior, context, guardrails, and environment clarity. That foundation will usually matter more than chasing the newest tool of the week.

Question for You

I’m curious what others are using that has actually proven itself in production. Not in demos, not in benchmark screenshots, but in real day-to-day work.

Tools & Resources

Key Concepts Referenced

RTK (Runtime Knowledge): Runtime-based context handling instead of static prompt stuffing
FFF (Focused File Find): Targeted file search to keep repository context minimal
Headroom AI: Reversible compression layer (Compress-Cache-Retrieve model) for token savings
CodeGraph: CLI-oriented local-first code intelligence for querying code structure and dependencies
