Joonas Lahtinen
Production readiness

Error handling is what separates a prototype from production code

The script works. Then the API returns 429, the network drops mid-run, and someone tells you that the automation broke production.

This is the moment that separates a prototype from production code.

A simple distinction: prototypes handle the happy path. Production code handles everything else.

Three things that are missing nearly every time

Retry logic

If an external API does not respond or returns 429 (Too Many Requests), a script without retry logic crashes. Instead, it should wait a moment, retry, and report a failure only after several attempts have failed.

A simple example in PowerShell:

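# $url and $headers are assumed to be defined earlier in the script
$maxAttempts = 3
$attempts = 0
do {
    try {
        $result = Invoke-RestMethod -Uri $url -Headers $headers
        break   # success: leave the retry loop
    } catch {
        $attempts++
        Write-Warning "Attempt $attempts failed: $($_.Exception.Message)"
        if ($attempts -lt $maxAttempts) {
            # Exponential backoff: wait 2 s, then 4 s
            Start-Sleep -Seconds ([math]::Pow(2, $attempts))
        }
    }
} while ($attempts -lt $maxAttempts)

if ($attempts -ge $maxAttempts) {
    # Every attempt failed: tell a person instead of failing silently
    Send-MailMessage -To "admin@company.com" -Subject "Automation failed" ...
}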

Exponential backoff, where the wait doubles after each failed attempt, is the standard, because it gives the external service time to recover.

Handling partial runs

If a script processes 500 users and fails at number 312, what happens? Does it restart from the beginning? Does it process some users twice?

You need either a checkpoint, where you record how far you got and continue from there, or idempotency: ensuring that running the same operation several times leaves the same result as running it once. Idempotency is covered in a separate article.
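The checkpoint approach could be sketched roughly as follows. This is a minimal illustration, not a finished implementation: the function name Invoke-WithCheckpoint, its parameters, and the idea of storing the next index in a small text file are all assumptions made for the example.

```powershell
# Hypothetical helper: processes items in order and remembers how far it got.
function Invoke-WithCheckpoint {
    param(
        [Parameter(Mandatory)] [object[]] $Items,
        [Parameter(Mandatory)] [string] $CheckpointPath,
        [Parameter(Mandatory)] [scriptblock] $Action
    )

    # Resume from the recorded index, or from 0 on a fresh run.
    $start = 0
    if (Test-Path -LiteralPath $CheckpointPath) {
        $start = [int](Get-Content -LiteralPath $CheckpointPath -TotalCount 1)
    }

    for ($i = $start; $i -lt $Items.Count; $i++) {
        $item = $Items[$i]
        # If this throws, the checkpoint still points at the failed item.
        & $Action $item
        # Record the index of the next item only after this one succeeded.
        Set-Content -LiteralPath $CheckpointPath -Value ($i + 1)
    }

    # The whole batch finished: clear the checkpoint so the next run starts fresh.
    Remove-Item -LiteralPath $CheckpointPath -ErrorAction SilentlyContinue
}
```

If the run dies at user 312, the next run picks up at user 312 rather than at user 1. A crash between the action and the checkpoint write would still repeat one item, which is why a checkpoint works best when the action itself is idempotent.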

Alerts to the right people

A try-catch that writes an error to a log file in a folder nobody checks is not error handling. It is self-deception.

Errors need to reach someone. An email, a Teams message, a ticket in a tracking system. And the log needs enough information to diagnose the problem without guesswork.
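As a rough sketch of what "enough information" might look like, the example below gathers the details a responder would need and turns them into a message body. The helper name New-FailureAlert, the automation and step names, and the $webhookUrl variable are all hypothetical, and the commented-out call assumes a chat webhook that accepts a JSON payload with a "text" field, which depends on how your Teams or chat integration is set up.

```powershell
# Hypothetical helper: collects what a responder needs to diagnose a failure.
function New-FailureAlert {
    param(
        [Parameter(Mandatory)] [string] $Automation,
        [Parameter(Mandatory)] [string] $Step,
        [Parameter(Mandatory)] [System.Management.Automation.ErrorRecord] $ErrorRecord
    )

    [pscustomobject]@{
        Automation = $Automation
        Step       = $Step
        Time       = (Get-Date).ToUniversalTime().ToString('o')
        Machine    = [System.Environment]::MachineName
        Message    = $ErrorRecord.Exception.Message
        Location   = $ErrorRecord.InvocationInfo.PositionMessage
    }
}

try {
    # Stand-in for the real work; here it always fails.
    throw "Simulated failure while updating user 312"
} catch {
    $alert = New-FailureAlert -Automation 'user-sync' -Step 'Update user' -ErrorRecord $_
    $body = @{ text = ($alert | Format-List | Out-String).Trim() } | ConvertTo-Json

    # Assumption: $webhookUrl points at a webhook that accepts a JSON "text" field.
    # Invoke-RestMethod -Uri $webhookUrl -Method Post -ContentType 'application/json' -Body $body
}
```

The point of the design is that the alert names the automation, the step, the time, and the machine, so whoever receives it does not have to go looking for the log file first.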

Why this gets skipped

Because the prototype works. When everything goes smoothly in testing, error handling feels unnecessary.

But automation that fails silently is worse than automation that does not work at all. When it does not work at all, at least you know something needs to be fixed.

Production-ready automation reports clearly when something goes wrong, rather than leaving an error quietly in a log.

Want to review the production readiness of your automations? Get in touch.