Designing a startup infra team
BLUF: On a team of around 12–20 engineers, I’m dedicating 2–3 of them solely to infrastructure—focusing on uptime, security, costs, and the developer experience.
At Graphite, we recently split our single engineering team into three smaller teams. This step was about setting ourselves up for scalability: as we approach 20+ engineers, we need structure that allows for more focused management and tightly scoped meetings.
One of these new groups is an infrastructure (infra) team. Forming it felt like a proud milestone—after all, I still carry a warm nostalgia for Airbnb’s infra teams and the often unsung, behind-the-scenes work they powered. But I also knew that establishing an infra team too early can be risky. Common pitfalls include:
Product teams racing ahead, stacking up shaky code paths, then tossing the maintenance and debugging over the wall to infra.
The infra team devolving into a catch-all sink for bug fixes.
Polishing low-impact corners of the codebase while product teams struggle under tight deadlines.
To avoid these traps, I’ve given our infra team a specific mandate. They own four key pillars:
Site uptime and stability
Security
Cloud cost containment
Developer experience (dev-x)
I’m starting the infra team at 2 engineers out of our current total of 12. When we hit around 20, I plan to bump that to 3. Here’s my reasoning:
No team of one: A single-person team is isolating and just feels grim. Teams need at least two people.
Will Larson’s 25% rule of thumb: Larson suggests that infra eventually accounts for about 25% of the engineering team. At 12 total engineers, that would be 3. But Graphite isn’t at the stage where we need full-blown “platform engineering” yet, so starting at 2 makes sense. We’ll prove the team’s value and grow incrementally.
Heirloom tomato model: There’s a case to be made for intentionally having teams of varied sizes—small, medium, and large. A 2-person team out of 12 hits that sweet spot for a “small team” archetype.
These infra concerns (uptime, security, cost optimization, and dev-x) often get crowded out by short-term product pushes because they’re classic “tragedy of the commons” issues. No immediate product deadline justifies investing in them, but they matter deeply in the long run. By staffing a dedicated infra team, we shield that work from day-to-day fire drills and ensure these critical areas get the attention they deserve.
Of course, right after setting up the infra team, requests started rolling in to co-opt their bandwidth for urgent product backends. For example, we have a syncing engine that’s cross-cutting and might, on paper, sound like “infrastructure.” But the request to work on it springs from product-driven bug fixes that no one got around to. In other words, it’s a reactive ask. The infra team must weigh these requests against their four pillars: will taking on this project yield a meaningful, long-term improvement in uptime, security, cost efficiency, or dev-x?
Their first major project, “Event loop blockers,” is a perfect example of what I want them working on. Our Node/Express servers often lock up due to excessive data serialization and deserialization. Because Node runs request-handling JavaScript on a single thread, a single edge-case database or API call that overfetches can hog the CPU and degrade performance for every user whose request hits that server at that moment. This has been a consistent source of recent Graphite incidents. Product teams aren’t incentivized to solve this systematically, but it’s an ideal cross-cutting, high-leverage, uptime-boosting initiative.
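To make the event loop problem concrete, here is a minimal sketch of one way to spot blockers: time a synchronous step, such as serializing a large payload, and flag it when it exceeds a budget. The names (timeSyncWork, BlockReport) and the budget values are hypothetical, chosen for illustration; this is not Graphite’s actual tooling.

```typescript
// Hypothetical helper for illustration only; names and thresholds are assumptions.
type BlockReport = {
  label: string;
  elapsedMs: number;
  overBudget: boolean;
};

// Runs `work` synchronously and reports how long it held the event loop.
// While `work` runs, no other request on this Node process can make progress.
function timeSyncWork<T>(
  label: string,
  budgetMs: number,
  work: () => T
): { result: T; report: BlockReport } {
  const start = Date.now();
  const result = work();
  const elapsedMs = Date.now() - start;
  return {
    result,
    report: { label, elapsedMs, overBudget: elapsedMs > budgetMs },
  };
}

// Example: wrap a serialization step and log it if it blocks for too long.
const rows = Array.from({ length: 1000 }, (_, i) => ({ id: i, title: `row ${i}` }));
const { result: body, report } = timeSyncWork("serialize-rows", 50, () =>
  JSON.stringify(rows)
);
if (report.overBudget) {
  console.warn(`${report.label} blocked the event loop for ${report.elapsedMs}ms`);
}
console.log(`serialized ${body.length} characters`);
```

A wrapper like this only measures; the actual fix is usually to fetch less, paginate, or move heavy work off the request path.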
Other projects in the pipeline include:
More frequent penetration tests and hardening.
Cutting AWS RDS costs and exploring how to partition our relational database for efficiency.
Better internal rate limits to ensure one feature or user doesn’t gobble up all our GitHub API tokens.
Improving local dev environments to make them more responsive and pleasant to work with.
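As a rough illustration of the internal rate limit idea in the list above, here is a minimal fixed-window sketch that gives each feature or user its own budget, so one caller cannot spend the whole shared GitHub API allowance. The class name (PerKeyBudget), the limits, and the keys are hypothetical, and a real implementation would likely need shared state across servers.

```typescript
// Hypothetical sketch: an in-memory, per-key fixed-window budget.
// Names and limits are assumptions, not Graphite's implementation.
class PerKeyBudget {
  private readonly maxPerKey: number;
  private readonly windowMs: number;
  private readonly now: () => number;
  private windowStart: number;
  private used = new Map<string, number>();

  constructor(maxPerKey: number, windowMs: number, now: () => number = () => Date.now()) {
    this.maxPerKey = maxPerKey;
    this.windowMs = windowMs;
    this.now = now;
    this.windowStart = now();
  }

  // Returns true if `key` (a feature name or user id) may make one more call
  // in the current window, and records the call; returns false otherwise.
  tryAcquire(key: string): boolean {
    const t = this.now();
    if (t - this.windowStart >= this.windowMs) {
      // A new window has started: reset every key's usage.
      this.used.clear();
      this.windowStart = t;
    }
    const count = this.used.get(key) ?? 0;
    if (count >= this.maxPerKey) {
      return false;
    }
    this.used.set(key, count + 1);
    return true;
  }
}

// Example: each key gets at most 100 upstream calls per minute.
const budget = new PerKeyBudget(100, 60_000);
if (budget.tryAcquire("feature:sync")) {
  console.log("ok to call the upstream API");
} else {
  console.log("over budget; defer or reject this call");
}
```

Keying the budget by feature or user is the point: one runaway caller exhausts only its own slice, not the shared pool.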
In summary, I’m excited that at 12+ engineers we’re now dedicating two full-time folks to infrastructure. They have a clear mandate, protected time, and a chance to make a long-term impact on Graphite’s foundations. As we grow, the infra team will grow, too—and I’m betting that the dividends they pay down the road will more than justify the early investment.