Self-Hosting n8n on Microsoft Azure (Part 2): Performance Tuning, Backups, Memory Issues, and Scaling

In Part 1, I covered how I deployed a production-ready, self-hosted n8n instance on Microsoft Azure using Docker, Caddy, and DuckDNS.

Once the system was live and real workflows started running, a new phase began — operational reality.

This is the part most tutorials skip:

Workflows start failing randomly
Memory usage spikes
OAuth tokens expire
The VM freezes at 3 a.m.
You realize “working” is not the same as “stable”

This article documents what I learned while tuning performance, solving memory issues, implementing backups, and preparing for scale on Azure.

1. Understanding n8n Performance in Production

n8n performance depends on four main factors:

VM resources (CPU, RAM, disk)
Workflow design (loops, concurrency, API calls)
Execution mode (regular vs queue mode)
External services (APIs, databases, webhooks)

Early on, I learned that most performance problems are not n8n bugs — they’re architecture problems.

2. Memory Issues: The Most Common Failure Mode

One of the most frequent errors I encountered was:

“Workflow did not finish, possible out of memory issue”

Why This Happens

Common causes:

Large JSON payloads held in memory
Long-running loops
Multiple workflows executing simultaneously
Running everything on a low-RAM VM (1–2 GB)
Default Node.js memory limits

n8n keeps execution data in memory during runtime, which makes RAM the first bottleneck.

3. Fixing Out-of-Memory (OOM) Problems

3.1 Upgrading the VM

The fastest fix was increasing VM resources.

Recommended minimum for production:

4 GB RAM
2 vCPUs

On Azure, resizing a VM is straightforward and often cheaper than debugging memory crashes for days.

3.2 Increasing Node.js Memory Limits

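By default, Node.js caps the size of its V8 heap. The cap can be raised through the NODE_OPTIONS environment variable; the value is in MiB:

NODE_OPTIONS=--max-old-space-size=4096

This lets n8n use more memory as workflows grow in complexity. Keep the value below the RAM the container can actually use, leaving headroom for the operating system and the other containers. A limit equal to the VM's total RAM only moves the crash from Node.js to the kernel's OOM killer.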
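One way to pick the number is to derive it from the VM's RAM rather than guess. The sketch below is a rule of thumb, not an n8n or Node.js recommendation: the 75% fraction, the 1024 MiB reserve, and the function name `suggest_heap_mib` are all assumptions chosen for illustration.

```python
def suggest_heap_mib(total_ram_mib: int, reserve_mib: int = 1024, fraction: float = 0.75) -> int:
    """Return a heap limit that leaves headroom for the OS and other containers.

    Both defaults are illustrative assumptions, not values from n8n or Node.js.
    """
    if total_ram_mib <= reserve_mib:
        raise ValueError("VM is too small to reserve the requested headroom")
    # Take the stricter of the two limits: a share of RAM, or RAM minus a fixed reserve.
    return min(int(total_ram_mib * fraction), total_ram_mib - reserve_mib)


for ram in (2048, 4096, 8192):
    print(f"{ram} MiB RAM -> NODE_OPTIONS=--max-old-space-size={suggest_heap_mib(ram)}")
```

On small VMs the fixed reserve dominates, and on larger ones the fraction does, which is why the function takes the minimum of the two.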

3.3 Reducing Execution Data Size

In the n8n settings:

Disable saving full execution data unless you need it for debugging
Set “Save successful executions” to none or last

This significantly reduces memory pressure and disk usage.
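The same choices can also be made with environment variables instead of the UI, which keeps them next to the rest of the Docker configuration. Below is a minimal sketch that prints an env-file fragment. The variable names are n8n's execution-data settings; the values (keep only failed runs, prune after 7 days) and the helper `render_env` are assumptions to adjust.

```python
# Execution-data settings expressed as environment variables.
# The values are illustrative assumptions; EXECUTIONS_DATA_MAX_AGE is in hours.
EXECUTION_DATA_ENV = {
    "EXECUTIONS_DATA_SAVE_ON_SUCCESS": "none",    # do not store successful runs
    "EXECUTIONS_DATA_SAVE_ON_ERROR": "all",       # keep failed runs for debugging
    "EXECUTIONS_DATA_SAVE_ON_PROGRESS": "false",  # no per-node progress snapshots
    "EXECUTIONS_DATA_PRUNE": "true",              # delete old execution records
    "EXECUTIONS_DATA_MAX_AGE": str(7 * 24),       # prune records older than 7 days
}


def render_env(settings: dict[str, str]) -> str:
    """Render settings as KEY=value lines suitable for a Docker env file."""
    return "".join(f"{key}={value}\n" for key, value in settings.items())


print(render_env(EXECUTION_DATA_ENV), end="")
```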

3.4 Workflow Design Optimization

Bad workflow design can kill even a powerful VM.

Best practices:

Avoid huge loops where possible
Use pagination instead of fetching everything at once
Split large workflows into smaller ones
Use “Execute Workflow” nodes to modularize logic, and return only a small result from each sub-workflow
Clear unnecessary fields using Set nodes

Small design changes often produce massive stability improvements.
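Two of these practices, pagination and dropping unneeded fields, are easier to see in code than in a node diagram. The sketch below shows the pattern in plain Python rather than in n8n nodes. Everything in it is hypothetical: `fake_fetch` stands in for a paginated API, and `iter_pages` and `keep_fields` are illustrative helper names.

```python
from typing import Callable, Iterator

Record = dict[str, object]


def iter_pages(fetch_page: Callable[[int, int], list[Record]], page_size: int) -> Iterator[list[Record]]:
    """Yield one page at a time so only page_size records are in memory at once."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            return
        yield page
        if len(page) < page_size:  # a short page means the source is exhausted
            return
        offset += len(page)


def keep_fields(record: Record, fields: tuple[str, ...]) -> Record:
    """Return a copy of the record containing only the fields that are needed."""
    return {key: record[key] for key in fields if key in record}


# Hypothetical data source: five records, each carrying a large unused payload.
DATA = [{"id": i, "email": f"user{i}@example.com", "raw": "x" * 1000} for i in range(5)]


def fake_fetch(offset: int, limit: int) -> list[Record]:
    return DATA[offset:offset + limit]


batch_sizes = []
for page in iter_pages(fake_fetch, page_size=2):
    slim = [keep_fields(record, ("id", "email")) for record in page]
    batch_sizes.append(len(slim))

print(batch_sizes)  # [2, 2, 1]
```

The memory saving comes from two places: each page is released before the next one is fetched, and the large `raw` field never travels past the first step.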

4. Performance Tuning at the Workflow Level

4.1 Controlling Concurrency

Running too many workflows simultaneously causes:

CPU spikes
Memory exhaustion
API rate limit issues

Control concurrency by:

Scheduling workflows intelligently
Avoiding heavy workflows running at the same time
Using queue mode (covered in section 6.2)
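Scheduling intelligently mostly means making sure heavy jobs do not start at the same minute. A small sketch of one way to compute staggered start times as standard five-field cron expressions follows. The workflow names, duration estimates, and the 5-minute gap are made up for illustration, and `stagger_daily` is a hypothetical helper, not part of n8n.

```python
def stagger_daily(jobs: list[tuple[str, int]], start_minute: int = 60, gap: int = 5) -> dict[str, str]:
    """Assign each job a daily cron slot that begins after the previous job should be done.

    jobs is a list of (name, estimated_duration_minutes); start_minute counts from midnight.
    """
    schedule: dict[str, str] = {}
    minute_of_day = start_minute
    for name, estimated_minutes in jobs:
        if minute_of_day >= 24 * 60:
            raise ValueError("jobs do not fit into a single day")
        hour, minute = divmod(minute_of_day, 60)
        schedule[name] = f"{minute} {hour} * * *"
        # The next job starts once this one is expected to finish, plus a safety gap.
        minute_of_day += estimated_minutes + gap
    return schedule


# Hypothetical workflows with rough duration estimates in minutes.
for name, cron in stagger_daily([("crm-sync", 20), ("report-export", 45), ("cleanup", 10)]).items():
    print(f"{name}: {cron}")
```

With the defaults, the three jobs land at 01:00, 01:25, and 02:15, so none of them should overlap as long as the duration estimates hold.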

4.2 Webhook vs Polling

Where possible:

Prefer webhooks over polling
Polling wastes CPU cycles and memory
Webhooks trigger workflows only when needed

For workflows that used to poll on a short interval, this shift removes most idle runs immediately.

5. Backups: The Thing You’ll Regret Not Doing

Sooner or later, something will break:

Accidental deletion
Disk failure
VM corruption
Bad updates

Backups are non-negotiable.

5.1 What Needs to Be Backed Up

At minimum:

n8n database (SQLite or PostgreSQL)
Encryption key
Docker volumes
Environment variables
Workflow data

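If you lose the encryption key, all stored credentials are unrecoverable, because n8n encrypts them with that key. It is either the value of N8N_ENCRYPTION_KEY or, if that variable was never set, the key n8n generated on first start and wrote to the config file in its .n8n data directory.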

5.2 Moving from SQLite to PostgreSQL

For production, SQLite is not ideal.

PostgreSQL offers:

Better performance
Safer concurrent access
Easier backups
Better scaling support

On Azure, PostgreSQL can be:

Self-hosted in Docker
Or managed using Azure Database for PostgreSQL

Managed PostgreSQL reduces operational overhead significantly.

5.3 Automated Backups

Best practices:

Daily database dumps
Store backups outside the VM
Use Azure Blob Storage or another cloud bucket
Automate via cron or workflows

Backups should be tested, not just created.
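A minimal sketch of such a backup job, assuming PostgreSQL, the `pg_dump` client, and the Azure CLI (`az`) already logged in on the VM. The function names, file naming scheme, and 7-dump retention are hypothetical choices. `run_backup` is defined but not called here, so running the snippet only prints the command it would execute.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def dump_path(backup_dir: Path, now: datetime) -> Path:
    """Build a timestamped file name; the format sorts chronologically by name."""
    return backup_dir / f"n8n-{now:%Y%m%dT%H%M%SZ}.dump"


def pg_dump_cmd(db_url: str, target: Path) -> list[str]:
    """pg_dump in custom format, which pg_restore can read back."""
    return ["pg_dump", "--format=custom", f"--file={target}", f"--dbname={db_url}"]


def upload_cmd(account: str, container: str, source: Path) -> list[str]:
    """Copy the dump off the VM into Azure Blob Storage via the Azure CLI."""
    return ["az", "storage", "blob", "upload",
            "--account-name", account, "--container-name", container,
            "--name", source.name, "--file", str(source), "--auth-mode", "login"]


def stale_dumps(backup_dir: Path, keep: int) -> list[Path]:
    """Return local dumps older than the newest `keep` files."""
    dumps = sorted(backup_dir.glob("n8n-*.dump"))
    return dumps[:-keep] if keep > 0 else dumps


def run_backup(db_url: str, backup_dir: Path, account: str, container: str, keep: int = 7) -> Path:
    """Dump, upload, then prune local copies. Raises if any external command fails."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = dump_path(backup_dir, datetime.now(timezone.utc))
    subprocess.run(pg_dump_cmd(db_url, target), check=True)
    subprocess.run(upload_cmd(account, container, target), check=True)
    for old in stale_dumps(backup_dir, keep):
        old.unlink()
    return target


# Show the dump command for a fixed timestamp without executing anything.
example = dump_path(Path("/var/backups/n8n"), datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc))
print(" ".join(pg_dump_cmd("postgresql://n8n@localhost/n8n", example)))
```

A daily cron entry can call `run_backup`. Supply the database password through a `.pgpass` file or the PGPASSWORD variable rather than in the URL, since command-line arguments are visible in the process list. Testing the backup means restoring a dump into a scratch database with `pg_restore` from time to time.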

6. Scaling n8n on Azure

At some point, vertical scaling (bigger VM) is not enough.

6.1 Vertical Scaling (Simplest)

Increase RAM and CPU
Easiest option
Works well up to a point

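Azure allows VM resizing with minimal downtime: the VM restarts during the resize, and it may need to be deallocated first if the target size is not available on its current hardware cluster.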

6.2 Horizontal Scaling with Queue Mode

For serious workloads, n8n supports queue mode.

Queue mode:

Separates execution from the main instance
Uses Redis as a message broker
Allows multiple workers
Improves reliability and throughput

Architecture:

One main n8n instance (UI, API, triggers, and webhooks)
Multiple worker containers
Redis for job distribution
PostgreSQL as the database

This is where n8n becomes truly production-grade.
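In configuration terms, the main instance and every worker share the same Redis, PostgreSQL, and encryption-key settings, and differ only in the command they run. The sketch below makes that explicit. The environment variable names and the `n8n worker` command are n8n's own. The host names, the database name, and the helpers `queue_mode_env` and `COMMANDS` are placeholders, and database credentials are left out on purpose.

```python
def queue_mode_env(redis_host: str, postgres_host: str, encryption_key: str) -> dict[str, str]:
    """Settings shared by the main instance and all workers in queue mode."""
    return {
        "EXECUTIONS_MODE": "queue",
        "QUEUE_BULL_REDIS_HOST": redis_host,
        "QUEUE_BULL_REDIS_PORT": "6379",
        "DB_TYPE": "postgresdb",
        "DB_POSTGRESDB_HOST": postgres_host,
        "DB_POSTGRESDB_PORT": "5432",
        "DB_POSTGRESDB_DATABASE": "n8n",  # placeholder database name
        # Workers can only decrypt credentials if they use the main instance's key.
        "N8N_ENCRYPTION_KEY": encryption_key,
        # DB_POSTGRESDB_USER / DB_POSTGRESDB_PASSWORD should come from a secret store.
    }


# The roles differ only in the n8n CLI command they start with.
COMMANDS = {"main": ["n8n", "start"], "worker": ["n8n", "worker"]}

# Placeholder hosts; the key is read from a secret in a real deployment.
shared = queue_mode_env("redis.internal", "postgres.internal", "<same-key-everywhere>")
for role, command in COMMANDS.items():
    print(f"[{role}] command: {' '.join(command)}")
    for key, value in shared.items():
        print(f"  {key}={value}")
```

Keeping the shared settings in one place avoids the most common queue-mode mistake, a worker started with a different encryption key than the main instance.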

6.3 Azure Considerations for Queue Mode

On Azure:

Use a Virtual Machine Scale Set or multiple VMs for the workers
Use managed Redis (Azure Cache for Redis)
Use managed PostgreSQL
Separate compute from storage

This reduces single points of failure.

7. Monitoring and Stability

Production systems need visibility.

Key things to monitor:

RAM usage
CPU usage
Disk space
Docker container health
Workflow execution failures

Azure Monitor + basic Docker logs go a long way.
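For the RAM and disk items, a small script run from cron can cover the gap until proper alerting is in place. This is a sketch under assumptions: it reads `/proc/meminfo`, so it only works on Linux, and the 85% and 80% thresholds and all function names are illustrative choices.

```python
import shutil
from pathlib import Path


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo content into {field: value in kB}."""
    values: dict[str, int] = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def memory_used_percent(meminfo: dict[str, int]) -> float:
    """Share of RAM in use, based on MemAvailable rather than MemFree."""
    total = meminfo["MemTotal"]
    return 100.0 * (total - meminfo["MemAvailable"]) / total


def disk_used_percent(path: str = "/") -> float:
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def check(mem_limit: float = 85.0, disk_limit: float = 80.0) -> list[str]:
    """Return warning messages for the local machine (Linux only)."""
    warnings = []
    mem = memory_used_percent(parse_meminfo(Path("/proc/meminfo").read_text()))
    if mem > mem_limit:
        warnings.append(f"memory at {mem:.0f}%")
    disk = disk_used_percent("/")
    if disk > disk_limit:
        warnings.append(f"disk at {disk:.0f}%")
    return warnings


# Platform-independent demo on sample input; call check() on the VM itself.
sample = "MemTotal:        4000000 kB\nMemFree:          200000 kB\nMemAvailable:    1000000 kB\n"
print(f"{memory_used_percent(parse_meminfo(sample)):.1f}% of RAM in use")
```

On the VM, a cron job can call `check()` and send any returned warnings to email or chat.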

8. Security and Stability Improvements

Additional production tips:

Restrict VM inbound ports (only 80/443/22), and limit 22 to known IP addresses where possible
Use SSH keys only, with password logins disabled
Rotate credentials regularly
Use environment variables, never hard-code secrets
Keep Docker images updated (Watchtower helps)

9. Real-World Lessons Learned

Most failures are architectural, not software bugs
Memory issues are normal when scaling
Backups are boring — until they save you
Vertical scaling is underrated
Queue mode is worth it when traffic grows
Azure is beginner-friendly if you respect its complexity

Final Thoughts

Running n8n in production teaches you far more than any managed automation platform ever will.

You stop thinking like:

“How do I make this workflow work?”

And start thinking like:

“How do I make this reliable at 2 a.m. when I’m asleep?”

That shift is the real value of self-hosting.
