
Glossary

Infrastructure terms, in plain English

The acronyms and jargon you will meet when running infrastructure — explained without the gatekeeping.

SLA
Service Level Agreement — a written commitment to response and uptime targets, often backed by service credits if they are missed.
Uptime
The percentage of time a service is available. "99.9%" allows roughly 8.8 hours of downtime a year; "99.99%" allows about 53 minutes.
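The arithmetic behind those figures is easy to check yourself. The short Python sketch below assumes a 365-day year and treats every hour as equally important; the function name `allowed_downtime_hours` is ours, for illustration only.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; assumption: leap years are ignored


def allowed_downtime_hours(uptime_percent: float) -> float:
    """Return the yearly downtime, in hours, that an uptime target permits."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return HOURS_PER_YEAR * (1 - uptime_percent / 100)


if __name__ == "__main__":
    for target in (99.9, 99.99):
        hours = allowed_downtime_hours(target)
        print(f"{target}% uptime: about {hours:.2f} hours ({hours * 60:.0f} minutes) a year")
```

Each extra "nine" cuts the allowance by a factor of ten, which is why higher targets cost noticeably more to meet.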
MTTR
Mean Time To Resolve — the average time taken to fix an incident. Lower is better.
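As a rough illustration, MTTR is the total time spent resolving incidents divided by the number of incidents. The sketch below assumes each incident is recorded as an (opened, resolved) pair of timestamps; the function name and the sample incidents are made up for the example.

```python
from datetime import datetime, timedelta


def mean_time_to_resolve(incidents: list[tuple[datetime, datetime]]) -> timedelta:
    """Average the time between 'opened' and 'resolved' across incidents."""
    if not incidents:
        raise ValueError("need at least one incident")
    # Sum the individual durations, starting from a zero-length timedelta.
    total = sum((resolved - opened for opened, resolved in incidents), timedelta())
    return total / len(incidents)


if __name__ == "__main__":
    # Hypothetical incidents lasting 30, 90 and 60 minutes.
    sample = [
        (datetime(2024, 1, 5, 9, 0), datetime(2024, 1, 5, 9, 30)),
        (datetime(2024, 2, 11, 14, 0), datetime(2024, 2, 11, 15, 30)),
        (datetime(2024, 3, 20, 22, 0), datetime(2024, 3, 20, 23, 0)),
    ]
    print(mean_time_to_resolve(sample))
```

Because it is an average, one very long incident can hide behind many short ones, so it is worth looking at the longest incidents as well.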
RTO
Recovery Time Objective — the maximum acceptable time to restore a service after a failure.
RPO
Recovery Point Objective — the maximum acceptable amount of data loss, measured in time (e.g. "up to 15 minutes").
SPF
Sender Policy Framework — a DNS record that lists which servers are allowed to send email for your domain.
DKIM
DomainKeys Identified Mail — a cryptographic signature that proves an email genuinely came from your domain and was not altered.
DMARC
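Domain-based Message Authentication, Reporting and Conformance — a DNS policy that tells receiving mail servers what to do with email that fails both SPF and DKIM checks for your domain, and asks them to send you reports on who is sending as you.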
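To make this concrete: a DMARC policy is published as a DNS TXT record at `_dmarc.` followed by your domain, written as a short list of `tag=value` pairs. The Python sketch below splits such a record into its tags. The record and the address `dmarc-reports@example.com` are placeholders, and the parser is deliberately simplified; it does not validate the record against the DMARC specification.

```python
def parse_dmarc(record: str) -> dict[str, str]:
    """Split a DMARC TXT record into a {tag: value} dictionary (no validation)."""
    tags: dict[str, str] = {}
    for part in record.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        name, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"not a tag=value pair: {part!r}")
        tags[name.strip().lower()] = value.strip()
    return tags


if __name__ == "__main__":
    # Placeholder record: quarantine failing mail and send aggregate reports.
    example = "v=DMARC1; p=quarantine; rua=mailto:dmarc-reports@example.com"
    policy = parse_dmarc(example)
    print("Policy for failing mail:", policy["p"])
    print("Reports go to:", policy["rua"])
```

Here `p` is the policy applied to failing mail (`none`, `quarantine` or `reject`) and `rua` is where receiving servers send aggregate reports.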
Hardening
Configuring a system to reduce its attack surface — closing unused services, enforcing good defaults, and applying security baselines.
Patching
Applying updates that fix security vulnerabilities and bugs. Unpatched systems are one of the most common ways attackers get in.
IaC
Infrastructure as Code — defining your infrastructure in version-controlled files so it is repeatable, reviewable and rebuildable.
FinOps
The practice of managing and optimising cloud spend continuously, so cost is owned rather than left to grow.
Observability
The ability to understand what a system is doing from its outputs — logs, metrics and traces — especially when something goes wrong.
Disaster recovery
The plan and capability to restore your whole service after a major failure, with defined recovery targets and tested failover.
P1–P4
Priority levels for incidents, from P1 (critical, service down) to P4 (low, a request or question). They determine response times.
On-call
An engineer available outside normal hours to respond to urgent incidents — so problems are handled when they happen, not the next morning.

Rather not have to learn all this?

That is the point of us. We handle the infrastructure so you can focus on your business.