Self-Hosting n8n on Microsoft Azure (Part 2): Performance Tuning, Backups, Memory Issues, and Scaling

In Part 1, I covered how I deployed a production-ready, self-hosted n8n instance on Microsoft Azure using Docker, Caddy, and DuckDNS.

Once the system was live and real workflows started running, a new phase began — operational reality.

This is the part most tutorials skip:

  • Workflows start failing randomly

  • Memory usage spikes

  • OAuth tokens expire

  • The VM freezes at 3 a.m.

  • You realize “working” is not the same as “stable”

This article documents what I learned while tuning performance, solving memory issues, implementing backups, and preparing for scale on Azure.


1. Understanding n8n Performance in Production

n8n performance depends on four main factors:

  1. VM resources (CPU, RAM, disk)

  2. Workflow design (loops, concurrency, API calls)

  3. Execution mode (regular vs. queue mode)

  4. External services (APIs, databases, webhooks)

Early on, I learned that most performance problems are not n8n bugs — they’re architecture problems.


2. Memory Issues: The Most Common Failure Mode

One of the most frequent errors I encountered was:

“Workflow did not finish, possible out of memory issue”

Why This Happens

Common causes:

  • Large JSON payloads held in memory

  • Long-running loops

  • Multiple workflows executing simultaneously

  • Running everything on a low-RAM VM (1–2 GB)

  • Default Node.js memory limits

n8n keeps execution data in memory during runtime, which makes RAM the first bottleneck.


3. Fixing Out-of-Memory (OOM) Problems

3.1 Upgrading the VM

The fastest fix was increasing VM resources.

Recommended minimum for production:

  • 4 GB RAM

  • 2 vCPUs

On Azure, resizing a VM is straightforward and often cheaper than debugging memory crashes for days.


3.2 Increasing Node.js Memory Limits

By default, Node.js caps the size of its heap. The cap can be raised through an environment variable:

NODE_OPTIONS=--max-old-space-size=4096

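This raises the heap ceiling so n8n can handle more complex workflows. The value is in MiB and should stay below the RAM actually available to the container: 4096 fits a VM with more than 4 GB, while on a 4 GB VM a lower value leaves room for the operating system and Docker.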
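
One way to pick the value is to derive it from the RAM the container can actually use. The sketch below is a rule of thumb, not an official n8n or Node.js recommendation: the 25% reserve is an assumption, and `suggest_heap_mb` and `node_options` are hypothetical helper names.

```python
def suggest_heap_mb(total_ram_mb: int, reserve_fraction: float = 0.25) -> int:
    """Suggest a --max-old-space-size value (in MiB) for a given amount of RAM.

    reserve_fraction is an assumed share of RAM left for the OS, Docker,
    and Node.js memory that lives outside the V8 old-generation heap.
    """
    if total_ram_mb <= 0:
        raise ValueError("total_ram_mb must be positive")
    if not 0 <= reserve_fraction < 1:
        raise ValueError("reserve_fraction must be in [0, 1)")
    return int(total_ram_mb * (1 - reserve_fraction))


def node_options(total_ram_mb: int) -> str:
    """Render the NODE_OPTIONS line for an env file."""
    return f"NODE_OPTIONS=--max-old-space-size={suggest_heap_mb(total_ram_mb)}"


if __name__ == "__main__":
    print(node_options(4096))  # NODE_OPTIONS=--max-old-space-size=3072
```

If the container still gets killed by the kernel rather than failing inside Node.js, the reserve is probably too small for that VM.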


3.3 Reducing Execution Data Size

In the n8n settings:

  • Disable saving full execution data unless needed

  • Set “Save successful executions” to “none” or “last”

This significantly reduces memory pressure and disk usage.
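
The same choices can also be applied instance-wide through environment variables instead of the UI. The sketch below compares a container's environment against a low-retention target. The variable names follow n8n's `EXECUTIONS_DATA_*` settings as I understand them, so confirm them against the docs for your version. The target values (including the 7-day retention) and the `audit_env` helper are assumptions for illustration.

```python
from typing import Dict, List

# Assumed low-retention targets; adjust to your own debugging needs.
RECOMMENDED: Dict[str, str] = {
    "EXECUTIONS_DATA_SAVE_ON_SUCCESS": "none",  # skip data for successful runs
    "EXECUTIONS_DATA_SAVE_ON_ERROR": "all",     # keep failures for debugging
    "EXECUTIONS_DATA_PRUNE": "true",            # delete old executions
    "EXECUTIONS_DATA_MAX_AGE": "168",           # hours to keep (assumed: 7 days)
}


def audit_env(current: Dict[str, str]) -> List[str]:
    """List every setting that is unset or differs from RECOMMENDED."""
    findings = []
    for name, wanted in RECOMMENDED.items():
        actual = current.get(name)
        if actual is None:
            findings.append(f"{name} is unset; suggested: {wanted}")
        elif actual != wanted:
            findings.append(f"{name}={actual}; suggested: {wanted}")
    return findings


if __name__ == "__main__":
    import os

    for line in audit_env(dict(os.environ)):
        print(line)
```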


3.4 Workflow Design Optimization

Bad workflow design can kill even a powerful VM.

Best practices:

  • Avoid huge loops where possible

  • Use pagination instead of fetching everything at once

  • Split large workflows into smaller ones

  • Use “Execute Workflow” nodes to modularize logic

  • Clear unnecessary fields using Set nodes

Small design changes often produce massive stability improvements.
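
To make the pagination and field-trimming advice concrete, here is a language-agnostic sketch written in Python. It is not n8n code: `fetch_page` stands in for whatever API call a workflow makes, and all function names are hypothetical. The point is that only one page, stripped to the needed fields, is in memory at a time.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Iterator, List

Page = List[Dict]


def iter_pages(fetch_page: Callable[[int, int], Page], page_size: int = 100) -> Iterator[Page]:
    """Yield one page at a time until the source returns an empty page."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            return
        yield page
        offset += len(page)


def keep_fields(item: Dict, wanted: FrozenSet[str]) -> Dict:
    """Drop everything except the fields later steps actually need."""
    return {key: value for key, value in item.items() if key in wanted}


def process(
    fetch_page: Callable[[int, int], Page],
    fields: Iterable[str],
    handle: Callable[[Dict], None],
    page_size: int = 100,
) -> int:
    """Handle items page by page and return how many were processed."""
    wanted = frozenset(fields)
    count = 0
    for page in iter_pages(fetch_page, page_size):
        for item in page:
            handle(keep_fields(item, wanted))
            count += 1
    return count


if __name__ == "__main__":
    # Fake data source standing in for a paginated API.
    records = [{"id": i, "email": f"u{i}@example.com", "blob": "x" * 1000} for i in range(250)]
    total = process(lambda offset, size: records[offset:offset + size], ["id", "email"], print)
    print(total)
```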


4. Performance Tuning at the Workflow Level

4.1 Controlling Concurrency

Running too many workflows simultaneously causes:

  • CPU spikes

  • Memory exhaustion

  • API rate limit issues

Control concurrency by:

  • Scheduling workflows intelligently

  • Avoiding heavy workflows running at the same time

  • Using queues (covered later)
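
A simple way to keep heavy workflows from overlapping is to give each one its own time slot. The sketch below generates staggered five-field cron expressions, which could then be pasted into each workflow's schedule trigger. The workflow names, the 02:00 start, and the 15-minute gap are assumptions, and `staggered_cron` is a hypothetical helper.

```python
from typing import Dict, List


def staggered_cron(workflows: List[str], start_hour: int = 2, gap_minutes: int = 15) -> Dict[str, str]:
    """Assign each workflow a daily cron slot, spaced gap_minutes apart."""
    if not 0 <= start_hour <= 23:
        raise ValueError("start_hour must be between 0 and 23")
    if gap_minutes <= 0:
        raise ValueError("gap_minutes must be positive")
    schedule = {}
    for index, name in enumerate(workflows):
        hour, minute = divmod(start_hour * 60 + index * gap_minutes, 60)
        if hour > 23:
            raise ValueError("too many workflows to fit before midnight")
        # Cron field order: minute hour day-of-month month day-of-week
        schedule[name] = f"{minute} {hour} * * *"
    return schedule


if __name__ == "__main__":
    for name, cron in staggered_cron(["crm-sync", "report-export", "cleanup"]).items():
        print(f"{name}: {cron}")
```

The gap should be longer than the slowest workflow's typical run time, otherwise the runs overlap anyway.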


4.2 Webhook vs Polling

Where possible:

  • Prefer webhooks over polling

  • Polling wastes CPU cycles and memory

  • Webhooks trigger workflows only when needed

This shift removes idle executions entirely: the workflow runs only when there is something to process.


5. Backups: The Thing You’ll Regret Not Doing

Sooner or later, something will break:

  • Accidental deletion

  • Disk failure

  • VM corruption

  • Bad updates

Backups are non-negotiable.


5.1 What Needs to Be Backed Up

At minimum:

  • n8n database (SQLite or PostgreSQL)

  • Encryption key

  • Docker volumes

  • Environment variables

  • Workflow data

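If you lose the encryption key, every credential stored in n8n becomes unrecoverable, because credentials are encrypted with that key before they are written to the database.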


5.2 Moving from SQLite to PostgreSQL

For production, SQLite is not ideal.

PostgreSQL offers:

  • Better performance

  • Safer concurrent access

  • Easier backups

  • Better scaling support

On Azure, PostgreSQL can be:

  • Self-hosted in Docker

  • Or managed using Azure Database for PostgreSQL

Managed PostgreSQL reduces operational overhead significantly.


5.3 Automated Backups

Best practices:

  • Daily database dumps

  • Store backups outside the VM

  • Use Azure Blob Storage or another cloud bucket

  • Automate via cron or workflows

Backups should be tested, not just created.
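
The practices above can be sketched as a small helper script for a PostgreSQL-backed instance. This is a minimal sketch, not the setup from this article: the file naming scheme, the retention count, and the function names are assumptions. Only standard `pg_dump` options are used (`--format=custom`, `--file`, `--dbname`).

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import List


def backup_name(now: datetime, prefix: str = "n8n") -> str:
    """Timestamped file name in UTC, so names sort chronologically."""
    return f"{prefix}-{now.astimezone(timezone.utc):%Y%m%dT%H%M%SZ}.dump"


def pg_dump_command(database_url: str, target: Path) -> List[str]:
    """Build the pg_dump call; custom format can be restored with pg_restore."""
    return ["pg_dump", "--format=custom", f"--file={target}", f"--dbname={database_url}"]


def expired(backups: List[str], keep: int) -> List[str]:
    """Return the names to delete so that only the `keep` newest remain."""
    if keep < 0:
        raise ValueError("keep must not be negative")
    ordered = sorted(backups)
    return ordered[: max(len(ordered) - keep, 0)]


if __name__ == "__main__":
    target = Path("/tmp") / backup_name(datetime.now(timezone.utc))
    # Placeholder connection string; supply the password via a pgpass file
    # or environment variable rather than on the command line.
    print(" ".join(pg_dump_command("postgresql://n8n@localhost:5432/n8n", target)))
```

In a cron job, the command list would be passed to `subprocess.run(..., check=True)`, the resulting file copied to storage outside the VM, and the names returned by `expired` deleted. Restoring a dump into a scratch database from time to time is what actually tests the backup.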


6. Scaling n8n on Azure

At some point, vertical scaling (bigger VM) is not enough.


6.1 Vertical Scaling (Simplest)

  • Increase RAM and CPU

  • Easiest option

  • Works well up to a point

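Azure allows VM resizing with minimal downtime: the VM restarts during the resize, and if the target size is not available on the hardware currently hosting it, the VM has to be deallocated first.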


6.2 Horizontal Scaling with Queue Mode

For serious workloads, n8n supports queue mode.

Queue mode:

  • Separates execution from the main instance

  • Uses Redis as a message broker

  • Allows multiple workers

  • Improves reliability and throughput

Architecture:

  • One main n8n instance (UI + API)

  • Multiple worker containers

  • Redis for job distribution

  • PostgreSQL as the database

This is where n8n becomes truly production-grade.
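
The architecture above mostly comes down to the main instance and every worker sharing the same configuration. The sketch below renders that shared environment. The variable names are the ones I believe n8n uses for queue mode and PostgreSQL, so check them against the docs for your version. Hostnames, ports, and the database name are placeholders, and database and Redis credentials are deliberately left out.

```python
from typing import Dict


def queue_mode_env(
    redis_host: str,
    postgres_host: str,
    encryption_key: str,
    redis_port: int = 6379,
    postgres_port: int = 5432,
) -> Dict[str, str]:
    """Environment shared by the main instance and all workers."""
    return {
        "EXECUTIONS_MODE": "queue",
        "QUEUE_BULL_REDIS_HOST": redis_host,
        "QUEUE_BULL_REDIS_PORT": str(redis_port),
        "DB_TYPE": "postgresdb",
        "DB_POSTGRESDB_HOST": postgres_host,
        "DB_POSTGRESDB_PORT": str(postgres_port),
        "DB_POSTGRESDB_DATABASE": "n8n",  # placeholder database name
        # Workers can only decrypt credentials if they use the same key.
        "N8N_ENCRYPTION_KEY": encryption_key,
    }


def render_env(env: Dict[str, str]) -> str:
    """Render KEY=value lines for an env file, sorted for stable diffs."""
    return "\n".join(f"{key}={value}" for key, value in sorted(env.items())) + "\n"


if __name__ == "__main__":
    print(render_env(queue_mode_env("redis.internal", "postgres.internal", "change-me")), end="")
```

The main container and the worker containers would load the same file; the difference is only the command they run (the main process versus `n8n worker`).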


6.3 Azure Considerations for Queue Mode

On Azure:

  • Use a VM Scale Set or multiple VMs

  • Use managed Redis (Azure Cache for Redis)

  • Use managed PostgreSQL

  • Separate compute from storage

This reduces single points of failure.


7. Monitoring and Stability

Production systems need visibility.

Key things to monitor:

  • RAM usage

  • CPU usage

  • Disk space

  • Docker container health

  • Workflow execution failures

Azure Monitor + basic Docker logs go a long way.
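
For the basics, a threshold check run from cron can complement Azure Monitor. The sketch below only covers disk usage directly and takes other metrics as plain numbers. The 85% and 90% limits are assumptions, and `check` and `disk_used_percent` are hypothetical helpers.

```python
import shutil
from typing import Dict, List


def disk_used_percent(path: str = "/") -> float:
    """Percentage of the filesystem at `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100 * usage.used / usage.total


def check(metrics: Dict[str, float], limits: Dict[str, float]) -> List[str]:
    """Return one alert line per metric that has reached its limit."""
    return [
        f"{name} at {value:.0f}% (limit {limits[name]:.0f}%)"
        for name, value in metrics.items()
        if name in limits and value >= limits[name]
    ]


if __name__ == "__main__":
    metrics = {"disk": disk_used_percent("/")}
    for alert in check(metrics, {"disk": 85, "ram": 90}):
        print(alert)
```

The alert lines could be mailed, posted to a chat webhook, or fed into an n8n workflow that runs on a separate host from the one being watched.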


8. Security and Stability Improvements

Additional production tips:

  • Restrict VM inbound ports (only 80/443/22)

  • Use SSH keys only

  • Rotate credentials regularly

  • Use environment variables, never hard-code secrets

  • Keep Docker images updated (Watchtower helps)


9. Real-World Lessons Learned

  • Most failures are architectural, not software bugs

  • Memory issues are normal when scaling

  • Backups are boring — until they save you

  • Vertical scaling is underrated

  • Queue mode is worth it when traffic grows

  • Azure is beginner-friendly if you respect its complexity


Final Thoughts

Running n8n in production teaches you far more than any managed automation platform ever will.

You stop thinking like:

“How do I make this workflow work?”

And start thinking like:

“How do I make this reliable at 2 a.m. when I’m asleep?”

That shift is the real value of self-hosting.