ranjan@ranjan.info:~$ man services/infrastructure-audit

Infrastructure Audit

Know exactly what you're running — and what it would take to lose it

Most infrastructure grows by accretion: a server here, a panel there, backups someone configured years ago and nobody has tested since. An audit replaces assumptions with a document — what you have, where it's weak, what it costs, and what happens on your worst day. It's the cheapest engagement I offer and the one that prevents the expensive ones.

What is an infrastructure audit?

An infrastructure audit is a structured review of everything your systems run on — servers, cloud resources, security posture, performance headroom, backups, and disaster recovery — producing a written report of findings ranked by business risk, each with a concrete fix and effort estimate. Its defining test is simple: after the audit, you can answer "what happens if this fails?" for every component with a document instead of a guess.

Written by Ranjan Chatterjee, Infrastructure Consultant · Linux Server Specialist · 15+ years in production Linux · Last reviewed July 4, 2026

ranjan@ranjan.info:~$ dmesg | tail

Signs you need this now

Audits pay for themselves in the questions you can't currently answer. If any of these apply, that's the sign.

You inherited infrastructure the last person built — and left undocumented
Nobody has actually restored a backup in the past year
The cloud bill grows every month and no one can say exactly why
One specific person is the only one who knows how production works
You're about to scale, hire, or hand off — and need the map first
An incident happened and the fixes were all patches, never structure
A client, insurer, or auditor asked security questions you answered from memory
"It works, don't touch it" is the operating policy for a revenue system

ranjan@ranjan.info:~$ cat scope.txt

What this covers

Security audit: exposure, patching, access, and configuration review
Performance audit: bottlenecks, capacity, and headroom
Cost optimization: oversized instances, unused resources, better-fit providers
Cloud architecture review (AWS, DigitalOcean, Hetzner, hybrid)
Backup strategy review — including an actual restore test
Disaster recovery planning: RTO/RPO you can defend
Architecture review with scaling recommendations

ranjan@ranjan.info:~$ grep -i "oops" ~/incidents.log

Mistakes that audits keep finding

Every one of these comes from a real engagement — usually from before I was called.

Trusting backup job status instead of restores

The job says "success" while backing up the wrong paths, or to a disk that died quietly months ago. The only backup metric that matters is a timed, tested restore — which is why this audit performs one.
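A minimal sketch of what "timed, tested restore" can mean in practice, assuming the backup is a tar archive and that a manifest of SHA-256 checksums was recorded at backup time. The function names and the manifest format are illustrative assumptions, not the tooling used in the audit.

```python
import hashlib
import tarfile
import tempfile
import time
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_test(archive: Path, manifest: dict[str, str]) -> dict:
    """Restore `archive` into a scratch directory, time it, and verify it.

    `manifest` (hypothetical format) maps relative file paths to the
    SHA-256 recorded when the backup was taken. The archive is assumed
    to be a trusted backup of your own data.
    """
    started = time.monotonic()
    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(archive) as tar:
            tar.extractall(scratch)  # scratch directory only, never the live path
        elapsed = time.monotonic() - started

        missing, corrupt = [], []
        for rel_path, expected in manifest.items():
            restored = Path(scratch) / rel_path
            if not restored.is_file():
                missing.append(rel_path)
            elif sha256_of(restored) != expected:
                corrupt.append(rel_path)

    return {
        "seconds": elapsed,      # compare this against your RTO
        "missing": missing,      # paths the job never backed up
        "corrupt": corrupt,      # paths that restored with different content
        "ok": not missing and not corrupt,
    }
```

A "success" status from the backup job says nothing about the `missing` and `corrupt` lists; only a restore fills them in. The `seconds` value is the number to hold against your recovery target.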

Auditing security and ignoring recovery

A hardened server with untested backups is one disk failure from disaster. Risk lives in the whole chain — exposure, redundancy, backups, and the human runbook — not in any single layer.

Paying for peak capacity year-round

Instances sized for a traffic spike two years ago, storage nobody reclaimed, and duplicate services across providers — cost review routinely finds 30–50% of cloud spend doing nothing.
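As a rough illustration of the cost review, the sketch below flags instances whose measured peak CPU never approaches their capacity. The `Instance` fields, the 40% threshold, and the sample fleet are assumptions chosen for the example; a real review would also look at memory, disk, and network before recommending a smaller size.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    monthly_cost: float      # in your billing currency
    peak_cpu_percent: float  # highest CPU utilisation seen in the review window


def oversized(instances, peak_threshold=40.0):
    """Return instances whose measured peak stays under the threshold, costliest first."""
    flagged = [i for i in instances if i.peak_cpu_percent < peak_threshold]
    return sorted(flagged, key=lambda i: i.monthly_cost, reverse=True)


def flagged_share(instances, peak_threshold=40.0):
    """Fraction of total monthly spend attached to flagged instances."""
    total = sum(i.monthly_cost for i in instances)
    flagged = sum(i.monthly_cost for i in oversized(instances, peak_threshold))
    return flagged / total if total else 0.0


# Hypothetical fleet, for illustration only.
fleet = [
    Instance("web-1", 80.0, 22.0),
    Instance("db-1", 120.0, 75.0),
    Instance("batch-old", 200.0, 5.0),
]
candidates = oversized(fleet)
```

The flagged share is an upper bound on savings, not a promise: a flagged instance is a candidate for a closer look, and some will turn out to be sized correctly for reasons CPU alone does not show.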

Having no owner per system

When everything is everyone's job, patching and monitoring quietly become no one's. The audit maps each component to an accountable owner — often the cheapest fix on the list.

Filing the report and changing nothing

An audit is a to-do list with prices, not an achievement. The roadmap section exists so the top three risks get dates and owners the same week the report lands.

ranjan@ranjan.info:~$ diff --options

DIY, provider support, or a specialist?

An honest comparison — each option is right in some situations, including the free ones.

Option	The right choice when…	Limits & risks
Internal self-audit	A capable team with time, using a good framework (CIS, provider well-architected reviews). Free, and builds internal knowledge of your own systems.	The blind spots that caused the risks also grade the homework. Teams consistently under-rate what they built and skip what they fear — restore tests above all.
Provider health checks	Free or cheap reviews from your cloud or host — decent at flagging obvious misconfigurations inside their own platform.	Scope stops at their product line, findings funnel toward their upsells, and nobody tests your backups or reads your architecture as a business.
Independent audit	Revenue-bearing infrastructure, an upcoming decision (scale, hire, migrate), post-incident clarity, or answering security questionnaires with evidence. Vendor-neutral by design.	Costs real money and needs read access or a walkthrough. The report is only worth what you execute — pair it with owners and dates, whoever implements it.

What you get

A written audit: findings ranked by risk, each with a concrete fix and effort estimate
A tested answer to "can we actually restore?" — not a checkbox
A 6–12 month roadmap: what to fix now, what to plan, what to stop paying for
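One way to picture "findings ranked by risk, each with a fix and effort estimate" is the sketch below. It assumes a simple likelihood times impact score on 1 to 5 scales and breaks ties in favour of the cheaper fix; the scoring scheme, field names, and sample findings are illustrative assumptions, not the report's actual method.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    title: str
    likelihood: int    # 1 (rare) to 5 (expected), an assumed scale
    impact: int        # 1 (nuisance) to 5 (business-stopping), an assumed scale
    effort_days: float
    fix: str

    @property
    def risk(self) -> int:
        """Assumed score: likelihood multiplied by impact."""
        return self.likelihood * self.impact


def rank(findings):
    """Highest risk first; among equal risks, the cheaper fix comes first."""
    return sorted(findings, key=lambda f: (-f.risk, f.effort_days))


# Hypothetical findings, for illustration only.
report = rank([
    Finding("Admin panel reachable from the internet", 3, 4, 0.5, "Restrict by IP"),
    Finding("Backups never restored", 4, 5, 1.0, "Schedule a timed restore test"),
    Finding("Single DNS account, one password", 2, 5, 0.5, "Add a second owner and MFA"),
    Finding("Unused staging volumes", 4, 3, 2.0, "Snapshot and delete"),
])
```

Sorting this way puts the untested backups at the top of the list, and the first few entries are the ones that should get dates and owners first.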

Why work with me on this

15+ years inside production Linux — this exact work, done at fleet scale
Founder-operator of two hosting platforms: I've owned the uptime, not just the ticket
Every change documented and reversible — you keep a written trail, not a mystery
Plain-language updates and honest timelines you can plan a business around

ranjan@ranjan.info:~$ ./engage --how

How it runs

The same disciplined path on every engagement — scoped, planned, executed with checkpoints, handed off clean.

01 Scope

A short brief or call to understand your stack, the real problem, and what a good outcome looks like.

02 Plan

A clear architecture plan: steps, risks, rollback and timeline, agreed before anything touches production.

03 Execute

Hands-on work with checkpoints. You see progress; nothing changes on your servers silently.

04 Handoff

Documentation, access cleanup and a clear path for what comes next. No lock-in, no mystery.

ranjan@ranjan.info:~$ faq --service infrastructure-audit

Common questions

We're a small operation — is an audit overkill?

Small operations have the least slack when something fails, which makes the audit more valuable, not less. A two-server audit is quick, cheap, and usually finds at least one silent risk worth its whole price.

What exactly do I get at the end?

A written report with findings ranked by business risk — each with a concrete fix and an effort estimate — a tested answer to "can we actually restore?", and a 6–12 month roadmap separating fix-now from plan-for from stop-paying-for. It's written to be executable by any competent engineer, not just me.

How long does an audit take?

For most environments the report arrives within one week: a few days of review and testing, then the write-up. Larger or multi-provider estates take proportionally longer, and that is quoted up front. The cost in your team's time is small: access provisioning and an hour of walkthrough questions.

What does it cost?

A fixed price by environment size — servers, providers, and complexity — quoted before work starts. It's deliberately the least expensive engagement I offer, because it's the one that prevents the expensive ones.

Will you try to sell us a rebuild?

No — the report ranks findings by risk to your business, and "leave it alone, it's fine" is a finding I write regularly. The audit is deliberately decoupled from implementation so the advice stays honest.

What do you need access to?

Read-level access to servers and the provider console covers most of it. Where access is sensitive, I work from configuration exports and an engineer walkthrough instead.

Do you actually test our backups?

Yes — an actual timed restore of real data, not a glance at job logs. It's the single most valuable step in the audit: roughly half of first-time audits discover their backups were incomplete, unrestorable, or slower to restore than the business could survive.

Can you audit cloud and on-premises together?

Yes — most real estates are hybrid: some AWS or DigitalOcean, a dedicated box at Hetzner, an office NAS someone forgot. The audit treats it as one system, because that's how it fails.

How often should we repeat an audit?

Annually as a baseline, or after any structural change — a migration, a major growth step, a new compliance requirement, an incident. Repeat audits are faster and cheaper than the first, since the map already exists and only the diff needs review.

Is our information confidential?

Completely. An audit sees credentials, architecture, and costs — it's treated like the inside of your business, because it is. NDAs are welcome, findings are shared with no one, and case-study references are anonymized beyond recognition.

ranjan@ranjan.info:~$ man glossary

Terms you'll meet in the report

Plain-language definitions — so the report reads like information, not incantation.

RTO: Recovery Time Objective — how long the business can afford to be down. Every architecture decision is quietly a bet on this number.
RPO: Recovery Point Objective — how much data you can afford to lose, measured in time. Your real RPO equals the age of your last tested backup.
Single point of failure: Any component whose failure alone takes the service down — a server, a person, a provider, a DNS account with one password.
DR plan: Disaster recovery plan. The written, tested path from "everything is gone" back to serving customers. Untested plans are hypotheses.
Right-sizing: Matching resources to measured need — the polite name for "you're paying triple for idle capacity".
Attack surface: Everything externally reachable: open ports, exposed panels, forgotten subdomains. The audit maps it before someone else does.
Runbook: The document that lets a competent stranger operate your systems — the antidote to key-person risk.
Restore test: Actually recovering data from a backup, timed and verified. The single highest-value item in this audit, and the most commonly skipped everywhere else.
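The RTO and RPO definitions above can be turned into a simple check. The sketch below treats the age of the last tested backup as the effective RPO, as the glossary does, and compares a measured restore duration against the RTO. The function name and parameters are assumptions for illustration, and a measured restore is only part of real recovery time, which also includes detection, decisions, and DNS or traffic cutover.

```python
from datetime import datetime, timedelta


def recovery_gaps(last_tested_backup, measured_restore, rpo_target, rto_target, now=None):
    """Compare what has been proven against what the business says it needs.

    last_tested_backup: datetime of the newest backup that was actually restored
    measured_restore:   timedelta taken by that restore test
    rpo_target, rto_target: timedeltas agreed with the business
    """
    now = now or datetime.now()
    effective_rpo = now - last_tested_backup
    return {
        "effective_rpo": effective_rpo,
        "rpo_met": effective_rpo <= rpo_target,
        "rto_met": measured_restore <= rto_target,
    }


# Hypothetical numbers: a three-day-old tested backup and a five-hour restore,
# held against a 24-hour RPO and a 4-hour RTO.
gaps = recovery_gaps(
    last_tested_backup=datetime(2026, 7, 1, 12, 0),
    measured_restore=timedelta(hours=5),
    rpo_target=timedelta(hours=24),
    rto_target=timedelta(hours=4),
    now=datetime(2026, 7, 4, 12, 0),
)
```

In this example both targets are missed, which is the kind of result that only shows up once the restore has been run and timed.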

ranjan@ranjan.info:~$ ls ../related

Related services

Engagements that commonly pair with this one.

Linux Server Security Hardening

SSH, firewall, kernel, PHP, MySQL — locked down in layers, documented, auditable.

Server & Website Performance Optimization

Web server, PHP, MySQL, cache layers — tuned from measurements, not folklore.

Managed Server Administration

24×7 monitoring, patching, backups, and incident response on a flat monthly retainer.

ranjan@ranjan.info:~$ ssh [email protected]

Ready when you are

One paragraph is enough: your stack, the symptom, and when you need it solved. Emergencies are answered first.
