Make.com Forgets Everything. That's Why AI Agents Fail

By Lazar Milicevic · Published July 4, 2026 · 8 min read

An AI email agent replied to the same customer like a stranger — 47 times in a row, same thread, same person, zero memory. That's not a prompt problem. It's an architecture problem, and it's the one gap nobody in the Make vs n8n debate actually talks about.

The Failure: Stateless Architecture by Design

Make.com scenarios execute as isolated, start-to-finish runs with no native memory of prior executions. When a scenario finishes, everything that happened during that run evaporates. The next execution starts from absolute zero. This is stateless architecture, and it's the reason your AI agent feels dumber than a raw ChatGPT window.

Here's exactly what happened on a client project. Small services business, roughly 400 Gmail threads a day landing in support — quotes, follow-ups, invoice questions. Two to four hours a day gone to inbox triage. Classic solopreneur pain. The prototype took a weekend: Gmail trigger, GPT-4 module, Telegram output. Worked on the first email. Then the same customer replied to the agent's draft, and GPT-4 responded with "Hi, how can I help you today?" — mid-conversation.

The Make.com execution model treats every scenario run as disposable. There's no shared heap between runs, no conversation buffer, no thread state that persists across executions. Each run is a fresh process.

# Make.com scenario execution model (simplified)
execution:
  trigger: gmail.new_message
  steps:
    - openai.create_completion
    - gmail.send_reply
    - telegram.send_message
  on_complete: destroy_all_state

That on_complete: destroy_all_state is the core issue. The scenario did exactly what it was designed to do — it just wasn't designed to remember anything.
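
To make the difference concrete, here is a minimal sketch of a run that forgets versus a run that remembers. Nothing in it is a Make.com or n8n API: statelessRun, statefulRun, and threadStore are hypothetical names, and the Map is only a stand-in for an external database.

```javascript
// Hypothetical sketch: these functions are illustrations, not platform APIs.

// Stateless run: everything it knows arrives with the trigger payload.
function statelessRun(email) {
    const seenBefore = []; // rebuilt empty on every run, like a fresh process
    return { threadId: email.threadId, priorMessages: seenBefore.length };
}

// Stateful run: history lives in a store that outlives the run.
const threadStore = new Map(); // stand-in for an external database

function statefulRun(email) {
    const history = threadStore.get(email.threadId) ?? [];
    const result = { threadId: email.threadId, priorMessages: history.length };
    // Persist the new message so the next run can see it.
    threadStore.set(email.threadId, [...history, email.body]);
    return result;
}

const first = { threadId: 't-1', body: 'Can I get a quote?' };
const second = { threadId: 't-1', body: 'Thanks, what about delivery?' };

const statelessResults = [statelessRun(first), statelessRun(second)];
const statefulResults = [statefulRun(first), statefulRun(second)];

console.log(statelessResults.map(r => r.priorMessages));
console.log(statefulResults.map(r => r.priorMessages));
```

The stateless version sees zero prior messages on the second email in the same thread, which is the "Hi, how can I help you today?" failure in miniature.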

The Data Store Trap: Paying for Memory per Hit

Make.com does offer Data Stores — a key-value persistence layer you can read from and write to during scenario runs. The problem isn't that Data Stores exist. It's that every read and every write costs operations, and operations are the metered resource that determines your monthly bill.

We measured the actual cost on the 400-thread-a-day workload. Here's the operation count per incoming email:

Step	Operations
Gmail trigger + dedup check	1
Data Store: thread lookup	1
Data Store: history read	1
GPT-4 completion	1
Data Store: history write	1
Gmail send reply	1
Telegram conditional ping	1

That's 7 operations per thread, minimum. On 400 threads a day, you're burning 2,800 operations daily, or roughly 84,000 over a 30-day month. The Make.com Core plan at $9/month gives you 10,000 operations, exhausted by day four. The Pro plan at $16/month includes the same 10,000, with overages at $0.0023 per operation. Even the Teams plan at $29/month, with 40,000 operations, lasts only about fourteen days at this rate, so realistically you're paying overages that push you toward $99/month equivalent in consumed ops.
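
If you want to rerun this math for your own workload, a quick sketch like the one below does it. The per-email count, daily volume, and 10,000-operation quota come from the numbers above; the 30-day month and the constant names are my assumptions.

```javascript
// Inputs taken from the operation table and plan quota described above.
const OPS_PER_EMAIL = 7;
const EMAILS_PER_DAY = 400;
const INCLUDED_OPS = 10000;

// Assumption: a 30-day billing month.
const DAYS_IN_MONTH = 30;

const opsPerDay = OPS_PER_EMAIL * EMAILS_PER_DAY;
// The quota runs out partway through this day of the month.
const quotaExhaustedOnDay = Math.ceil(INCLUDED_OPS / opsPerDay);
const opsPerMonth = opsPerDay * DAYS_IN_MONTH;

console.log({ opsPerDay, quotaExhaustedOnDay, opsPerMonth });
```

Swap in your own volume and per-email operation count before you commit to a plan tier.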

And here's the kicker: after all that spend, you still don't have real memory. You have a flat key-value table you're paying per-hit to query. There's no relational lookup, no ORDER BY timestamp DESC LIMIT 10, no way to pull "the last ten messages in this thread ordered chronologically" in a single operation. You'd need to store the entire thread history as a JSON blob under one key, read it on every execution, append the new message, and write the whole blob back. One read, one write, two operations — and you're serialization-thrashing a growing JSON payload on every single email.
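
Here is roughly what that read-append-write cycle looks like. This is a sketch, not Make.com's Data Store API: MeteredStore is a hypothetical in-memory stand-in that simply counts one operation per read and one per write.

```javascript
// Hypothetical in-memory stand-in for a metered key-value store.
class MeteredStore {
    constructor() {
        this.data = new Map();
        this.operations = 0;
    }
    get(key) {
        this.operations += 1; // every read is billed
        return this.data.get(key);
    }
    set(key, value) {
        this.operations += 1; // every write is billed
        this.data.set(key, value);
    }
}

// Read the whole thread blob, append one message, write the whole blob back.
function appendToThread(store, threadId, message) {
    const raw = store.get(threadId);
    const history = raw ? JSON.parse(raw) : [];
    history.push(message);
    const blob = JSON.stringify(history);
    store.set(threadId, blob);
    return blob.length; // size of the payload rewritten on this email
}

const store = new MeteredStore();
const blobSizes = [];
for (let i = 1; i <= 3; i++) {
    const message = { role: 'customer', content: `message ${i}` };
    blobSizes.push(appendToThread(store, 't-1', message));
}

console.log(store.operations, blobSizes);
```

Three emails cost six operations here, and the blob being parsed and rewritten gets larger every time, even though only one message is new.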

The Real Question: Stateless vs Stateful, Not Beginner vs Developer

The Make versus n8n debate is almost always framed as ease-of-use versus developer control. That's the wrong axis. Here's the actual decision tree:

Does your automation need to remember anything between runs?

No — Form submission posts to Slack, new Stripe charge triggers a receipt email, Shopify order creates a Trello card. These are fire-and-forget automations. Make.com wins on speed to build. You can ship something working in an afternoon, and the visual builder genuinely helps non-technical operators understand the flow.
Yes — Customer history, conversation threads, which leads you already contacted, which invoices you sent, what stage a deal is in. The moment memory enters the picture, Make fights you and charges you per operation for the privilege.

This isn't about node count or logo aesthetics. It's about whether your platform's execution model assumes state persistence is a first-class concern or an afterthought bolted on via a metered add-on.

Rebuilding the Agent in n8n: 12 Nodes, $6/Month, Real Memory

Here's exactly how I built the same agent in n8n. The entire workflow is 12 nodes. Self-hosted on a small VPS, marginal cost per execution is zero because n8n doesn't meter operations when you self-host.

The architecture:

# n8n workflow: 12 nodes total
nodes:
  # 1-2: Trigger + Dedup
  - gmail_trigger          # Poll for new messages
  - postgres_dedup         # Check if message_id already processed

  # 3-5: Memory retrieval
  - postgres_thread_lookup # Find thread_id from sender + subject
  - postgres_history_pull  # Last 10 messages, ordered by timestamp
  - code_format_context    # Build Claude-ready context block

  # 6-8: AI processing
  - claude_draft_reply     # Generate reply with full thread context
  - code_confidence_check  # Parse model confidence + intent flags
  - gmail_send_reply       # Send if confidence above threshold

  # 9-10: State persistence
  - postgres_log_exchange  # Write the new exchange back to DB
  - postgres_update_thread # Update thread metadata + last_contact

  # 11-12: Human escalation
  - if_escalation_needed   # Check for cancellation/refund/legal keywords
  - telegram_notify_owner  # Ping owner only on edge cases

The critical piece is nodes 3-5. The agent pulls the last ten messages from the thread out of Postgres, formats them as a context block, and feeds them to Claude. The reply Claude generates actually sounds like a continuation of the conversation because Claude can see the recent history of the thread, not just the latest email.

The Postgres schema that powers the memory layer:

CREATE TABLE email_threads (
    id          SERIAL PRIMARY KEY,
    thread_id   VARCHAR(255) UNIQUE,
    sender      VARCHAR(255),
    subject     TEXT,
    created_at  TIMESTAMPTZ DEFAULT NOW(),
    updated_at  TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE email_messages (
    id          SERIAL PRIMARY KEY,
    thread_id   VARCHAR(255) REFERENCES email_threads(thread_id),
    role        VARCHAR(20),  -- 'customer' or 'agent'
    content     TEXT,
    message_id  VARCHAR(255) UNIQUE,
    created_at  TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX idx_messages_thread_time
    ON email_messages (thread_id, created_at DESC);

That descending index on (thread_id, created_at) means the "last ten messages" query is a fast index scan, not a full table scan. On 400 threads a day with an average of 5 messages per thread, this table grows by 2,000 rows daily. At that rate, you're well within Postgres's comfort zone — even a $6/month VPS handles this without breaking a sweat.
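
For reference, the "last ten messages" lookup against this schema could look something like the sketch below. The query text is my reconstruction from the schema, not a copy of the production node, and toChronological is a hypothetical helper. It is there because a DESC query returns the newest message first, while a prompt reads better oldest first.

```javascript
// Parameterized query against the email_messages table above.
// $1 is the thread_id; the placeholder syntax is PostgreSQL's.
const LAST_TEN_SQL = `
    SELECT role, content, created_at
    FROM email_messages
    WHERE thread_id = $1
    ORDER BY created_at DESC
    LIMIT 10`;

// The query returns rows newest-first; flip a copy to oldest-first
// so the transcript reads in conversation order.
function toChronological(rows) {
    return [...rows].reverse();
}

// Sample rows in the order the query would return them (made-up data).
const sampleRows = [
    { role: 'customer', content: 'Thursday works.', created_at: '2026-07-02T10:05:00Z' },
    { role: 'agent', content: 'Does Thursday suit you?', created_at: '2026-07-02T10:01:00Z' },
    { role: 'customer', content: 'Can I book a visit?', created_at: '2026-07-02T09:58:00Z' },
];

const ordered = toChronological(sampleRows);
console.log(ordered.map(r => r.content));
```

The WHERE plus ORDER BY shape matches the (thread_id, created_at DESC) index, which is what lets Postgres answer it without scanning the table.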

The context-building code node:

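// n8n Code node (Run Once for Each Item): format thread history for Claude.
// Assumes an upstream node has collected the thread's rows into $json.messages.
const history = $json.messages ?? [];

const context = [...history]
    // Sort oldest-first so the transcript reads chronologically,
    // even if the query returned rows newest-first.
    .sort((a, b) => new Date(a.created_at) - new Date(b.created_at))
    // Label each message with who wrote it.
    .map(m => `${m.role === 'customer' ? 'Customer' : 'Agent'}: ${m.content}`)
    .join('\n\n');

// Pass the formatted context along with the identifiers later nodes need.
return {
    json: {
        context: context,
        thread_id: $json.thread_id,
        sender: $json.sender
    }
};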

Claude receives this as system context, and the prompt template includes the instruction: "You are continuing an existing conversation. The thread history below shows prior exchanges. Write a reply that acknowledges what has already been discussed."
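
A minimal sketch of how that prompt could be assembled is below. The instruction text is the one quoted above; buildPrompt, the "Thread history:" label, and the system/user field names are my assumptions, not the exact template from the production workflow.

```javascript
// Instruction text quoted from the prompt template described above.
const CONTINUATION_INSTRUCTION =
    'You are continuing an existing conversation. ' +
    'The thread history below shows prior exchanges. ' +
    'Write a reply that acknowledges what has already been discussed.';

// Hypothetical helper: combine the instruction, the formatted history,
// and the customer's newest email into the two parts of a prompt.
function buildPrompt(context, latestMessage) {
    return {
        system: `${CONTINUATION_INSTRUCTION}\n\nThread history:\n${context}`,
        user: latestMessage,
    };
}

const prompt = buildPrompt(
    'Customer: Can I book a visit?\n\nAgent: Does Thursday suit you?',
    'Thursday works.'
);

console.log(prompt.system);
```

Keeping the history in the system part and the newest email in the user part is one reasonable split. The point is only that the model sees both.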

The escalation logic is a simple code node that checks the customer's message for trigger keywords — "cancel," "refund," "legal," "attorney," "complaint" — plus a confidence threshold on the model's output. If either fires, the Telegram node sends the owner a notification with a link to the thread. Otherwise, the agent handles it autonomously.
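
As a sketch, the check can be this small. The keyword list is the one above; the 0.8 threshold, the 0-to-1 confidence scale, and the needsEscalation name are assumptions for illustration, so tune them to however your model reports confidence.

```javascript
// Trigger keywords from the escalation rule described above.
const ESCALATION_KEYWORDS = ['cancel', 'refund', 'legal', 'attorney', 'complaint'];

// Assumption: confidence is a number from 0 to 1, and 0.8 is a placeholder cutoff.
const CONFIDENCE_THRESHOLD = 0.8;

// Escalate when the customer's message hits a keyword OR the model is unsure.
function needsEscalation(customerMessage, modelConfidence, threshold = CONFIDENCE_THRESHOLD) {
    const text = customerMessage.toLowerCase();
    // Substring match, so "cancellation" also triggers on "cancel".
    const keywordHit = ESCALATION_KEYWORDS.some(keyword => text.includes(keyword));
    const lowConfidence = modelConfidence < threshold;
    return keywordHit || lowConfidence;
}

console.log(needsEscalation('I want a Refund', 0.95));
console.log(needsEscalation('When can you deliver?', 0.95));
```

Substring matching is deliberately blunt: it will also fire on words like "illegal". For an escalation path that only pings a human, erring toward a false alarm seems like the safer trade.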

Real Numbers: The Side-by-Side Comparison

After 90 days of production traffic on this client:

Metric	Make.com (prototype)	n8n (production)
Monthly platform cost	$99 (ops tier needed)	$6 (VPS)
Operations metering	Yes, every read/write	No, self-hosted
Memory layer	Data Store (key-value)	Postgres (relational)
Context per reply	0 prior messages	Last 10 messages
Customer-facing errors	High (amnesia on every reply)	Near zero
Hours saved daily	N/A (unusable)	~3 hours
Payback period	N/A	Under 60 days

The three hours a day of reclaimed inbox time is the headline number. But the secondary win matters just as much: the error rate on customer-facing replies dropped to near zero because the agent stopped answering like it had never seen the customer before. Brand reputation repaired.

When to Use Each Tool: The Decision Framework

I use a simple rule when scoping client projects. It's not about which tool is "better." It's about which execution model fits the problem.

Prototype in Make.com when:

The client wants to see something working by Tuesday
The automation is genuinely fire-and-forget (trigger → transform → output, no memory needed)
The volume is low enough that operation costs stay trivial (under 1,000 ops/month)
The client's team needs to visually understand and modify the workflow

Ship in n8n (or a custom stack) when:

The agent needs to remember anything between runs
You're doing loops over large lists (Make charges per iteration)
You're making per-item AI calls where each call needs context
The volume is high enough that metered operations become a real cost center
You need relational queries on stored data (thread history, lead status, invoice state)

The simplest version of this test: ask whether your agent needs to remember anything. If yes, you already know which tool to pick.

How this fits into what we ship at bizflowai.io

At bizflowai.io, we build the stateful agent infrastructure that this post describes — the Postgres memory layers, the n8n workflows with real context retrieval, the escalation routing that knows when to pull a human in. Most of our clients come to us after hitting exactly this wall: a Make.com prototype that worked on the first email and fell apart on the second. We rebuild it with proper memory architecture so the agent remembers every customer, every thread, every prior exchange — without the per-operation tax.

Frequently asked questions

What is the key difference between Make.com and n8n for AI agent automation?

Make.com is stateless by design—every workflow execution starts from zero and memory evaporates when the scenario finishes. n8n is stateful, offering persistent data on every workflow plus a native Postgres node for building real memory layers. The distinction is not beginner versus developer; it is stateless versus stateful. If your agent needs to remember anything between runs, n8n is the correct choice. If your automation is fire-and-forget, Make wins on speed to build.

How do I decide when to use Make.com vs n8n?

Prototype in Make.com when you need something working fast and the automation is fire-and-forget—form submissions to Slack, Stripe charges to a channel. Ship in n8n the moment your workflow requires memory, loops over large lists, or per-item AI calls. Ask one question: does this agent need to remember anything between runs? If yes, use n8n. Make.com charges per operation for every data store read and write, which becomes expensive at scale and still does not provide true conversational memory.

Why does stateless architecture break AI email agents?

Stateless platforms like Make.com wipe all context after each workflow execution. When a customer replies to an AI-drafted email, the agent starts fresh with no memory of the prior exchange. It responds as if meeting the customer for the first time, even mid-conversation. This is not a prompt problem—it is an architecture problem. Without persistent memory storing thread history, the agent cannot reference what was already discussed, causing brand-damaging replies and high error rates on customer-facing communication.

How much does n8n cost compared to Make.com for high-volume automation?

Self-hosted n8n on a small VPS costs roughly six dollars per month with zero marginal cost per execution. Make.com at 400 Gmail threads daily with data store reads, writes, dedup, and thread lookups burns through 2,800 operations per day, maxing the 10,000-operation tier by day four. That puts Make at ninety-nine dollars monthly in operations fees alone, for a data store that still does not deliver real memory. Payback on the n8n build was under sixty days for the client.

How do you build a memory layer for an AI email agent in n8n?

Use n8n's native Postgres node to store and retrieve conversation history. The workflow pulls the last ten messages from a given thread, feeds them to an LLM like Claude as context, drafts a reply that continues the existing conversation, then logs the new exchange back to Postgres. You can also set a confidence threshold so the agent only pings a human via Telegram when uncertainty is high or when keywords like cancellation, refund, or legal appear. Rebuilt from Make, this came to twelve nodes.