And why it’s more like running a kitchen than writing a prompt

Prompt engineering gets a lot of attention in AI — but most of the content out there focuses on single-agent systems. A user types a query, a model replies. Tweak the prompt, adjust the output. That’s the loop.

But that’s not how real AI systems work anymore.

Most production use cases involve multiple agents working together — planners, coders, testers, reviewers, summarizers — all passing tasks and context between each other. And that changes everything.

In this article, I’ll explain what makes prompt design in multi-agent systems harder, more interesting, and more powerful — using a simple example that anyone can relate to.


A Simple Analogy: Running a Restaurant Kitchen

Think of each AI agent as a staff member in a busy restaurant kitchen.

  • The planner is the head chef.
  • The coder is the line cook.
  • The reviewer is quality control.
  • The summarizer is the person writing the menu for the customer.

You can’t just give everyone the same recipe and hope for the best.

The head chef doesn’t write, “Make pasta,” and expect the rest to improvise. Each instruction has to be tailored to the role, the step in the process, and the constraints of the kitchen.

That’s what prompt engineering looks like in multi-agent systems — not “get the model to do the right thing,” but “make sure every agent gets the right information at the right time in the right format.”


Why Prompting Gets More Complex with Multiple Agents

In a single-agent setup, a prompt is like a one-off instruction. In a multi-agent system, you’re designing a protocol — a chain of messages, roles, feedback, and handoffs.

You’re not just optimizing for correctness. You’re optimizing for coordination.

Some of the challenges include:

  • Making sure one agent’s output is cleanly structured for the next agent to use (see the handoff sketch after this list)
  • Keeping context persistent across handoffs
  • Avoiding redundant or conflicting decisions
  • Handling errors or rework when something goes off-track
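
To make that concrete, here’s a minimal sketch of what a structured handoff between agents can look like. The envelope fields (task_id, producer, payload, and so on) are illustrative assumptions, not a standard — the point is that every handoff carries the same envelope and gets checked at the boundary:

```python
# A minimal sketch of a structured handoff between agents.
# Field names (task_id, producer, payload, context) are illustrative,
# not a standard -- what matters is that every message has the same shape.
from dataclasses import dataclass, field


@dataclass
class Handoff:
    task_id: str            # stable ID so rework can reference the same task
    producer: str           # which agent produced this message
    consumer: str           # which agent should act on it next
    payload: dict           # the structured output (e.g. parsed YAML/JSON)
    context: dict = field(default_factory=dict)  # carried across steps


def validate(handoff: Handoff, required_keys: set[str]) -> None:
    """Fail fast if an upstream agent dropped a field the next agent needs."""
    missing = required_keys - handoff.payload.keys()
    if missing:
        raise ValueError(f"{handoff.producer} omitted required keys: {missing}")
```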


What Makes or Breaks a Multi-Agent System


1. Clear Roles and Responsibilities

Each agent should be prompted with a specific purpose and boundaries. Don’t just say “write code.” Say “implement Task 4 using only these modules and return it in this schema.”
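
As an illustration, a role prompt for a code generator agent might look something like this. The task ID, module list, and schema placeholders are hypothetical, not taken from a real system:

```python
# A hypothetical role prompt for a coder agent. The placeholders
# (task_id, allowed_modules, output_schema) are illustrative assumptions.
CODER_PROMPT = """\
You are the code generator in this pipeline.
Scope: implement exactly one task -- {task_id} -- and nothing else.
Constraints: import only from these modules: {allowed_modules}.
Output: return a single JSON object matching this schema: {output_schema}.
If the task depends on code you cannot see, stop and report the gap
instead of inventing it.
"""
```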

2. Format Discipline

Agents talk to each other. That means output from one is input for another. Prompting must enforce structure — YAML, JSON, or Markdown — so the system doesn’t fall apart from parsing errors.
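
In practice, that means validating at every boundary rather than trusting the model. Here’s a minimal sketch of one way to do it, assuming the upstream agent was asked for YAML (and that PyYAML is available):

```python
# A minimal sketch of format discipline at the boundary between two agents.
# Assumes PyYAML is installed; the error messages double as correction
# prompts that can be fed back to the producing agent.
import yaml


def parse_or_reject(raw_output: str) -> dict:
    """Parse an agent's YAML output, or raise with a message the caller
    can send back to the agent as a retry prompt."""
    try:
        parsed = yaml.safe_load(raw_output)
    except yaml.YAMLError as exc:
        raise ValueError(f"Output was not valid YAML: {exc}") from exc
    if not isinstance(parsed, dict):
        raise ValueError("Expected a YAML mapping at the top level.")
    return parsed
```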

3. Embedded Feedback Loops

If a reviewer finds an issue, the original generator needs a prompt that says not just “fix it,” but what was wrong, why it needs to be changed, and what not to change in the process.
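
One way to structure that is to assemble the rework prompt directly from the reviewer’s findings. The field names below are assumptions; the shape is what matters: what failed, why, and what must stay untouched:

```python
# A hypothetical rework prompt built from a reviewer's finding.
# The keys (task_id, issue, reason, keep) are illustrative assumptions.
def build_rework_prompt(finding: dict) -> str:
    return (
        f"Your previous output for task {finding['task_id']} was rejected.\n"
        f"What was wrong: {finding['issue']}\n"
        f"Why it matters: {finding['reason']}\n"
        f"Do NOT change: {', '.join(finding['keep'])}\n"
        "Return the corrected output in the same schema as before."
    )
```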

4. Error Handling

Some agents need instructions for what to do when they’re unsure. Prompting for escalation paths (“If you’re not confident, pass it back to the planner”) reduces silent failure.
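
A sketch of what that routing can look like. The confidence score and threshold here are illustrative; in practice the signal might be a self-reported rating or a verifier model’s judgment:

```python
# A sketch of an escalation path. The confidence field and 0.7 threshold
# are assumptions for illustration, not a recommended value.
ESCALATION_THRESHOLD = 0.7


def route(result: dict) -> str:
    """Decide whether an agent's result moves forward or goes back up."""
    if result.get("confidence", 0.0) < ESCALATION_THRESHOLD:
        return "planner"   # pass it back instead of failing silently
    return result["next_agent"]
```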


Real-World Example: AI-Generated Code from Jira Tickets

Here’s a simplified version of a multi-agent system we’ve built at Alto Apto:

  1. Planning Agent: breaks Jira stories into well-defined tasks and outputs them in YAML.
  2. Architecture Agent: takes those tasks and defines folder structures, module dependencies, and ordering.
  3. Code Generator Agent: writes each module with clear constraints and dependencies based on the plan.
  4. Reviewer Agent: checks for consistency and edge-case coverage, and adds line-level TODOs.
  5. PR Writer Agent: summarizes all changes and generates a clean pull request description.

This flow only works if each agent is properly prompted — not just for the task, but for the context, formatting, and downstream needs of the next step.
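
To give a feel for the control flow (a simplified sketch, not our actual implementation), here’s how such a chain can be threaded together. The agent names mirror the list above; run_agent() is a placeholder for whatever invokes each agent:

```python
# A simplified sketch of the pipeline's control flow. PIPELINE and
# run_agent() are illustrative placeholders, not a real implementation.
PIPELINE = ["planner", "architect", "code_generator", "reviewer", "pr_writer"]


def run_pipeline(jira_story: str, run_agent) -> dict:
    """Thread one story through every stage, handing each agent the
    accumulated context plus every previous stage's structured output."""
    context = {"story": jira_story}
    for agent in PIPELINE:
        output = run_agent(agent, context)   # returns a parsed dict
        context[agent] = output              # downstream agents can read it
    return context["pr_writer"]
```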

That’s not “prompt tuning.” That’s system design.


Why This Matters to Clients

A lot of companies experimenting with LLMs are still thinking in single-turn interactions. A chatbot. A summarizer. A quick call-and-response.

But the future of AI isn’t one model doing one thing. It’s distributed, composable systems made up of agents working together — with the right infrastructure and prompting logic to support them.

At Alto Apto, we’re helping teams move from disconnected demos to production-ready AI systems. That means:

  • Architecting multi-agent flows
  • Designing robust prompt chains
  • Handling ambiguity and failure
  • Building tooling that makes these systems maintainable

This isn’t research. It’s software delivery — and it needs to be designed like software.


Final Thought

Prompt engineering in multi-agent systems is less about clever phrasing and more about designing for collaboration. It’s about thinking like a system architect, not a writer.

If you’re building something in this space — or want to — we’re happy to talk.

Whether it’s mapping out a simple RAG pipeline or designing a full multi-agent product workflow, this is where we do our best work.