Dezen Technology
Engineering · Apr 13, 2026 · 8 min read

CI/CD pipeline that doesn’t suck

Five stages, cached, parallel, preview-per-PR, gated to production. The metrics to watch (p50 duration, flakiness, MTTR) and the ‘roll forward, not back’ principle.

The CI/CD pipeline is the single highest-leverage system in your engineering setup — it dictates how fast you can ship, how confidently, and how often someone has to drop their tea to fix it on a Friday afternoon. Most pipelines we inherit from previous teams are either too slow (engineers batch their changes) or too lenient (broken code ships).

Here’s the shape of a pipeline we’d defend in a code review.

[Figure: CI/CD pipeline stages: commit, build, parallel tests, preview, gated production deploy]

The five-stage pipeline

Stage 1: Install + cache (~30s)

Cache aggressively. Node modules, pip packages, Docker layers, build artifacts. Anything that doesn’t change between commits should be retrieved from cache, not rebuilt. GitHub Actions, GitLab CI and CircleCI all have first-class cache primitives — use them.
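
As a rough illustration of what a good cache key looks like, the sketch below hashes the dependency lockfiles so the key changes only when the dependencies do. The function name, the prefix and the choice of lockfiles are assumptions made for this example; the cache primitives in GitHub Actions, GitLab CI and CircleCI do the equivalent work for you.

```python
import hashlib
from pathlib import Path


def cache_key(prefix: str, lockfiles: list[Path]) -> str:
    """Derive a cache key that changes only when a lockfile's name or bytes change."""
    digest = hashlib.sha256()
    # Sort so the key does not depend on the order the files were listed in.
    for path in sorted(lockfiles):
        digest.update(path.name.encode())
        digest.update(path.read_bytes())
    # A short hex prefix is enough to tell dependency sets apart (assumed length).
    return f"{prefix}-{digest.hexdigest()[:16]}"
```

Same lockfile, same key, cache hit. Any dependency change produces a new key and a fresh install.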

Stage 2: Build (~60s)

Type-check, compile, bundle. Whatever produces the artifact your tests run against and your deploy ships. Build once; reuse the artifact across all downstream stages. Don’t build twice (once for test, once for deploy) — that’s how “works on my CI” happens.
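
One way to make "build once" checkable is to record a digest of the artifact at build time and verify it before later stages use it. This is a minimal sketch under that assumption; the function names are hypothetical, and how the digest travels between stages is left to your CI system.

```python
import hashlib
from pathlib import Path


def artifact_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of the built artifact."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def verify_artifact(path: Path, expected_digest: str) -> None:
    """Fail loudly if a later stage is about to use a different artifact."""
    actual = artifact_digest(path)
    if actual != expected_digest:
        raise RuntimeError(f"artifact mismatch: expected {expected_digest}, got {actual}")
```

The build stage would call artifact_digest once. The test and deploy stages would call verify_artifact with that recorded value before touching the file.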

Stage 3: Test, in parallel (~90s)

Shard your test suite across N workers. A 6-minute serial suite takes roughly 90 seconds on 4 workers when the shards are evenly balanced. Vitest, Jest, pytest-xdist and Go’s parallel tests all support running tests in parallel. The marginal cost of more workers is dwarfed by the cost of engineers waiting.
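
The arithmetic and the split itself can be sketched in a few lines. This example assumes a simple round-robin split over sorted file names, which is only one of several possible strategies; the function names are made up for illustration, and real runs add per-worker startup time on top of the ideal figure.

```python
def files_for_shard(test_files: list[str], shard_index: int, total_shards: int) -> list[str]:
    """Return the test files one worker should run, using a round-robin split."""
    if not 0 <= shard_index < total_shards:
        raise ValueError("shard_index must be in [0, total_shards)")
    # Sorting first means every worker computes the same split independently.
    return sorted(test_files)[shard_index::total_shards]


def ideal_wall_clock_seconds(serial_seconds: float, workers: int) -> float:
    """Best-case duration if the work divides evenly across workers."""
    return serial_seconds / workers
```

With a 360-second serial suite and 4 workers, ideal_wall_clock_seconds returns 90.0. Each worker calls files_for_shard with its own index and runs only that slice.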

Inside this stage, run in parallel: unit tests, integration tests, lint, a11y tests, visual regression, plus the type check if you did not fold it into the build. They don’t depend on each other; let them race.
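
Most CI systems express this as separate jobs with no dependencies between them. To show the same idea on a single runner, here is a minimal sketch that starts every check at once and collects the exit codes. The function names are hypothetical, and the commands you pass in are whatever your project already uses.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_check(name: str, command: list[str]) -> tuple[str, int]:
    """Run one check as a subprocess and return its name and exit code."""
    completed = subprocess.run(command, capture_output=True)
    return name, completed.returncode


def run_checks_in_parallel(checks: dict[str, list[str]]) -> dict[str, int]:
    """Start every check concurrently and return each check's exit code."""
    with ThreadPoolExecutor(max_workers=max(1, len(checks))) as pool:
        futures = [pool.submit(run_check, name, cmd) for name, cmd in checks.items()]
        return dict(future.result() for future in futures)


def stage_passed(results: dict[str, int]) -> bool:
    """The stage passes only if every check exited with code 0."""
    return all(code == 0 for code in results.values())
```

The stage takes as long as its slowest check, not the sum of all of them.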

Stage 4: Preview deploy (~60s)

Every PR gets a real URL with the change deployed. Real env, real DB (a clone), real traffic-shape. Reviewers click the link; PMs click the link; designers click the link. The PR conversation is grounded in something tangible.

Stage 5: Production deploy (gated)

On merge to main, the same artifact rolls out to production. The rollout is gated either automatically (canary at 5% → wait for health → ramp) or manually (Slack approval). The gate is about confidence, not about meetings.
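
The automatic gate can be sketched as a small loop. Only the 5% starting point comes from the text above. The later ramp steps, the function name and the two callbacks are assumptions for illustration, and is_healthy is expected to block for whatever observation window you choose before answering.

```python
from typing import Callable, Sequence


def canary_rollout(
    set_traffic_percent: Callable[[int], None],
    is_healthy: Callable[[], bool],
    steps: Sequence[int] = (5, 25, 50, 100),  # steps after 5% are assumed values
) -> bool:
    """Ramp traffic step by step; drain the canary on the first failed health check."""
    for percent in steps:
        set_traffic_percent(percent)
        if not is_healthy():
            # Send all traffic back to the current stable version and stop.
            set_traffic_percent(0)
            return False
    return True
```

A failed check at any step leaves the new version with no traffic, so most users never see a bad release.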

Roll forward, not back

When prod breaks, your instinct will be to roll back to the previous version. Resist. Roll forward is faster, more honest, and forces you to fix the underlying problem instead of papering over it. Most modern deploy systems (Vercel, Fly, Render, AWS with proper canaries) make roll-forward as cheap as roll-back — without the “we lost the migration” risks.

Reserve rollback for the genuinely catastrophic: data corruption, security incident, complete outage. For everything else — fix forward.

Pipeline-as-code, version-controlled

  • Your pipeline lives in .github/workflows/ (or equivalent), not in a web UI. Reviewable. Diffable. Reverts cleanly.
  • Secrets in a real secrets manager (GitHub Secrets, AWS Secrets Manager). Never inline.
  • Environment-specific values driven by environment variables, not by hand-editing steps.
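
For the last point, a deploy script might read its settings like this. It is a sketch only: the variable names DEPLOY_ENV, API_BASE_URL and CANARY_PERCENT are invented for the example and are not a convention of any particular CI system.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class DeployConfig:
    environment: str
    api_base_url: str
    canary_percent: int


def load_config(env: Mapping[str, str] = os.environ) -> DeployConfig:
    """Build the deploy config from environment variables, failing fast on missing ones."""
    missing = [name for name in ("DEPLOY_ENV", "API_BASE_URL") if name not in env]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return DeployConfig(
        environment=env["DEPLOY_ENV"],
        api_base_url=env["API_BASE_URL"],
        # Optional value with a default (hypothetical) of 5 percent.
        canary_percent=int(env.get("CANARY_PERCENT", "5")),
    )
```

The same pipeline steps then run unchanged in preview and production. Only the injected variables differ.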

The metrics that matter

Watch these four numbers like a hawk:

  • P50 pipeline duration. Target: <5 min. If CI takes long enough that engineers context-switch to something else while they wait, batching starts.
  • Flakiness rate. A test that fails 1% of the time is a bug. Quarantine it the first time it’s flaky; fix or delete within a week. Flaky tests poison trust in the whole pipeline.
  • Mean time to recovery (MTTR). When prod breaks, how fast can you ship a fix or a revert? Target: <10 min.
  • Deploys per day. Health metric. If it’s zero on most days, you have a deploy pain that’s shaping engineering behavior.
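
All four numbers are cheap to compute once you export run data from your CI system. The sketch below assumes you already have that data as plain Python values. The input shapes and function names are illustrative, and "flaky" is taken here to mean a run that failed and then passed on retry with no code change.

```python
from datetime import datetime
from statistics import mean, median


def p50_duration_minutes(durations_minutes: list[float]) -> float:
    """Median pipeline duration."""
    return median(durations_minutes)


def flakiness_rate(runs: list[tuple[bool, bool]]) -> float:
    """Each run is (failed_first_attempt, passed_on_retry); returns the flaky fraction."""
    flaky = sum(1 for failed, passed_on_retry in runs if failed and passed_on_retry)
    return flaky / len(runs)


def mean_minutes_to_recover(incidents: list[tuple[datetime, datetime]]) -> float:
    """Each incident is (broke_at, recovered_at), where recovery is a shipped fix or revert."""
    return mean([(end - start).total_seconds() / 60 for start, end in incidents])


def deploys_per_day(deploy_count: int, days: int) -> float:
    """Average number of production deploys per day over the period."""
    return deploy_count / days
```

Track these per week rather than per run. The trend matters more than any single value.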

How we approach this

Every project we ship via Ongoing Maintenance and SaaS Product Development ships with this pipeline shape pre-built — cached, parallel, preview-per-PR, canary-gated to prod. We treat the pipeline as a product, not a one-time setup.

Takeaways

  • Five stages: install · build · test (parallel) · preview · prod (gated).
  • Cache everything reusable. Build once. Test in parallel.
  • Every PR gets a real preview URL.
  • Roll forward, not back, in most cases.
  • Watch P50 duration, flakiness, MTTR, deploys/day.