Services

Five kinds of work I take.

Every engagement is delivered by me, remotely, in your repos and your AWS accounts. You get working infrastructure and a written record of why it's shaped that way, so nothing depends on me after I leave.

platform

Cloud platform architecture

Project-based. Short assessment first, then a scoped build.

The situation

The AWS environment grew by accretion. The VPC layout predates half the team, security groups reference services that no longer exist, and the one engineer who understands the database failover left last spring. Every new workload makes it harder to change.

What I do

I rebuild environments to current standards while they keep serving traffic. At Postscript I re-architected the full AWS estate (VPC, ECS, EKS, MSK, RDS Aurora, Elasticache) under live load measured in tens of millions of requests a day. The method is the same at any size: inventory what exists, sequence changes so nothing user-facing blinks, and put every resource in Terraform as it moves.

What you get

An architecture and migration plan sequenced against your traffic, not a greenfield fantasy
Terraform for every resource touched, in your workspaces, reviewed through your PRs
Multi-region only where the failure math justifies it, with the cost modeled first
Written handover: diagrams, runbooks, and the decisions with their reasoning

kubernetes

Kubernetes and EKS operations

Project-based or a retained day rate for ongoing operations.

The situation

The EKS upgrade is two versions overdue because nobody wants to own the blast radius. ArgoCD is half-adopted. Node groups were sized once, in 2023, and the cluster bill has crept up ever since.

What I do

I run EKS with ArgoCD, Helm, Karpenter, and KEDA as the default toolchain, and I've operated it on-call at two companies. That includes the migrations nobody volunteers for: ECS to EKS (done for business-critical services at CVS Digital), version upgrades across large fleets, and moving node groups to Bottlerocket and Spot where the workload tolerates it.

What you get

Cluster upgrades executed and documented, with a repeatable path for the next one
GitOps delivery through ArgoCD your team can operate without me
Autoscaling (Karpenter, KEDA) tuned against real traffic, with the bill as a first-class metric
A written operating model: who owns what, what pages, what the escalation path is

delivery

Terraform and CI/CD at scale

Project-based. Fixed scope works well here.

The situation

Infrastructure lives in three places: some CloudFormation, some Terraform, some console changes nobody wrote down. CI takes forty minutes and still needs a stored AWS key that security has flagged twice.

What I do

I led a migration of hundreds of resources from CloudFormation to Terraform at Aetna Digital, and I run multi-region Terraform on Terraform Cloud today. Pipelines get the same treatment: GitHub Actions with OIDC role assumption instead of long-lived keys, plans posted on PRs, applies gated and auditable.

What you get

A migration path from CloudFormation or console-managed resources into Terraform, executed in slices
Terraform Cloud (or your backend) workspace design with state, permissions, and drift handled
CI/CD with OIDC auth, plan-on-PR, and no stored cloud credentials anywhere
Module patterns your engineers copy instead of reinventing

cost

Cloud cost engineering

Assessment first, with savings quantified before changes ship. For larger engagements, pricing on a share of verified savings is on the table.

The situation

The AWS bill grows faster than traffic and nobody can say which team, feature, or decision is responsible. Finance asks; engineering shrugs; the reserved-instance renewal is next quarter.

What I do

Cost work is where I have the longest receipts: an AWS bill cut from roughly $500K to $200K a year at MyRounding, $350K+ a year saved moving GoPro transcode fleets to Spot, S3 lifecycle policies across petabytes worth hundreds of thousands more, and ~$4M a year of spend at Postscript held flat while traffic grew. The pattern: attribute spend first, then take the wins in order of effort-to-savings, then build the controls that stop the regrowth.

What you get

Spend attribution by service, team, and workload, so every line item has an owner
Executed savings: Spot, right-sizing, storage lifecycle, purchase commitments
Budgets and alarms that page before finance notices
Support on the AWS contract itself; I have negotiated these on the customer side

ai ops

AI ops and readiness

Project-based buildout of the guardrails, or embedded for a fixed term to stand up the workflow and hand it off.

The situation

Your engineers are already pointing AI agents at infrastructure, or they want to and can't risk it. The worry is legitimate: one ungoverned apply in the wrong account is a bad afternoon at best. So the throughput either goes unused or gets used with nothing watching it.

What I do

I run Claude Code every day as a force multiplier on a live platform: parallel sub-agents inventory 50+ microservices across Terraform, ArgoCD, and AWS and drive changes from plan toward production. What makes that safe is the scaffolding around the model — an adversarial production-safety review plus an automated create-only / no-prod-reference gate that runs before any terraform apply, with a human approving every production-affecting change and nothing reaching prod until it proves out in dev. I build that scaffolding for teams that want the speed without betting the account on it.

What you get

Guardrails as code: a create-only / no-prod-reference gate enforced before any apply, plus a production-safety review the agent has to pass
A human-in-the-loop workflow where agents propose changes and a policy gate plus review decide whether they ship
Read-only agent workflows for the low-risk wins first: EKS health triage, Terraform and Helm PR review
Enablement so your team runs it after I leave: the runbook, and an honest map of where not to point an agent yet

Not sure which of these you need?

Describe the problem and I'll tell you what I'd do about it, including the case where the answer is "you don't need me for this."

Get in touch