I get asked—a lot—to build a self-hosted or on-prem version of Graphite. I usually respond with a line I borrowed from Datadog’s philosophy: I’ll never say no; I’ll just quote an extremely high price. Why? I want to maximize our team’s iteration velocity. Over the years, I’ve seen how the on-prem conversation can shape product roadmaps and company trajectories. It’s rarely as simple as saying, “Sure, we’ll just let you host it yourself.”
It’s interesting how far the SaaS industry has come. In the early days of software, there was no concept of “the cloud”—everything was on-prem because that was the only way to ship code. If you look at the 1960s and ’70s, entire businesses were built around selling software licenses for mainframe computers. In the ’90s and 2000s, though, the internet and cheaper hardware flipped that model on its head. Companies like Concur (for expenses) and Salesforce (for CRM) opened a new world of web-based solutions, paving the way for SaaS as we know it.
These days, we think of cloud-hosted SaaS as the default. The typical benefits are pretty well-documented: no huge hardware investments, easy scalability, and remote accessibility. But that same rise of cloud-based software also created tension in industries where control, security, or compliance is paramount. While some companies (Datadog is a famous example) have maintained a hard line on staying fully cloud-hosted, many others end up building an on-prem offering as soon as a lucrative enterprise contract comes knocking.
I can relate. It stings to turn down seven-figure deals just because you don’t want to offer an on-prem deployment. But there’s a good reason so many teams hold out as long as possible: the additional complexity is enormous, and it directly impacts how quickly you can ship and iterate.
On-Prem Slows You Down
I believe that the most essential principle in software engineering is iteration speed: ship fast, learn, tweak, repeat. Software is uniquely “soft,” meaning you can push an update at almost any time—unlike physical products that are locked in once they leave the factory. On-prem, however, grinds that magical quality to a near halt. Let me explain why.
1. Observability
In a standard cloud setup, you can instrument your application to gather metrics, logs, and performance data right in your own environment. With on-prem, you’re effectively deploying your code into someone else’s data center. You have to build custom reporting frameworks that “phone home” with usage data—or you settle for flying blind. Either way, you lose the seamless real-time insight you’d otherwise have.
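To make that concrete, here’s a minimal sketch of the kind of “phone home” usage reporter an on-prem build might ship with. Everything specific here—the endpoint URL, the payload fields, the opt-out flag—is a hypothetical placeholder, not a description of any real product’s telemetry.

```python
# Minimal sketch of an on-prem "phone home" usage reporter.
# The endpoint URL, payload fields, and PHONE_HOME_ENABLED flag are
# hypothetical placeholders, not part of any real product.
import json
import os
import urllib.request

PHONE_HOME_ENABLED = os.getenv("PHONE_HOME_ENABLED", "true") == "true"
TELEMETRY_URL = "https://telemetry.example.com/v1/usage"  # hypothetical endpoint


def report_usage(instance_id: str, version: str, event_counts: dict) -> None:
    """Best-effort usage report; telemetry must never break the product."""
    if not PHONE_HOME_ENABLED:
        return  # many on-prem customers will turn this off entirely
    payload = json.dumps({
        "instance_id": instance_id,
        "version": version,
        "events": event_counts,
    }).encode("utf-8")
    req = urllib.request.Request(
        TELEMETRY_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    try:
        urllib.request.urlopen(req, timeout=5)
    except Exception:
        pass  # if the customer blocks outbound traffic, you're simply blind
```

Even in the best case, you only get whatever this reporter manages to send—and plenty of customers will disable it or firewall it off.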
2. Debugging
This is where it gets really messy. If a customer hits a weird bug in an on-prem environment, maybe it’s your application logic. Or maybe their load balancer is misconfigured, or their OS is on a strange version of Linux, or the database is at capacity. Reproducing that bug often means reconstructing their entire environment, which they may not want to share. Even if they do, it’s a time sink that quickly kills your iteration speed.
3. Shipping Updates
Let’s say you nail down the bug, fix it, and build a shiny new feature while you’re at it. How do you deliver those changes? Are you begging your customers to run manual upgrades? Automating behind the scenes? What if you need to perform a database migration—do you schedule downtime or risk doing it live? Suddenly, something that’s trivial in a SaaS world becomes a multi-step negotiation for each and every customer environment. The problem is uniquely painful enough to inspire startups dedicated to smoothing the process.
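Here’s a small sketch of why the migration piece in particular gets painful: an on-prem upgrade step has to check which schema version the customer is actually running before it touches anything. The table name, version numbers, and SQLite usage below are illustrative assumptions, not anyone’s real upgrade path.

```python
# Sketch of a guarded schema-migration step in an on-prem upgrade.
# The schema_version table, target version, and column change are
# illustrative assumptions, not a real product's upgrade path.
import sqlite3

TARGET_SCHEMA_VERSION = 42  # hypothetical


def upgrade(db_path: str) -> None:
    conn = sqlite3.connect(db_path)
    try:
        current = conn.execute(
            "SELECT MAX(version) FROM schema_version"
        ).fetchone()[0] or 0
        if current >= TARGET_SCHEMA_VERSION:
            return  # customer is already up to date; nothing to do
        if current < TARGET_SCHEMA_VERSION - 1:
            # In your own cloud, nothing is ever this far behind; on-prem,
            # you must handle every release a customer may have skipped.
            raise RuntimeError(
                f"cannot jump from schema v{current} to v{TARGET_SCHEMA_VERSION}; "
                "apply intermediate upgrades first"
            )
        conn.execute("ALTER TABLE projects ADD COLUMN archived INTEGER DEFAULT 0")
        conn.execute(
            "INSERT INTO schema_version (version) VALUES (?)",
            (TARGET_SCHEMA_VERSION,),
        )
        conn.commit()
    finally:
        conn.close()
```

Multiply that guard logic by every customer stuck on a different version, and a one-line fix turns into a support matrix.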
All this grinds velocity to a halt. On-prem can make product iteration painfully slow, draining resources you’d rather spend building features. For most early-stage companies—startups especially—that slow speed is lethal.
Potential Workarounds
That said, I understand that not all businesses have the luxury of turning down self-hosting requests. There are some creative ways to satisfy the control and compliance needs of certain customers without sacrificing too much velocity.
Single-Tenant Cloud
You can run a dedicated instance for each customer in your own cloud environment. Teleport calls this a “private cloud” model. This approach keeps your observability and debugging mostly intact and eliminates your customer’s hardware overhead. You still have multiple instances to manage, but you can generally push updates more frequently than you could with pure on-prem.
Customer VPC with Remote Ops
Sometimes you can deploy into the customer’s VPC and ask for IAM permissions so you can troubleshoot or run upgrades on their behalf. That approach offers them the comfort of hosting in their own environment while still giving you some level of access to debug.
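In the AWS case, the mechanics often amount to cross-account role assumption. Here’s a hedged sketch using boto3; the role ARN, external ID, and log group name are placeholders the customer would provision and share, not a prescription for how any particular vendor does it.

```python
# Sketch of remote ops into a customer's AWS account via a cross-account
# IAM role. The role ARN, external ID, and log group name are placeholders
# the customer would provision and share.
import boto3

CUSTOMER_ROLE_ARN = "arn:aws:iam::123456789012:role/VendorDebugAccess"  # placeholder
EXTERNAL_ID = "shared-secret-from-customer"  # placeholder


def customer_session() -> boto3.Session:
    """Assume the customer-provided role and return a scoped session."""
    sts = boto3.client("sts")
    creds = sts.assume_role(
        RoleArn=CUSTOMER_ROLE_ARN,
        RoleSessionName="vendor-debug",
        ExternalId=EXTERNAL_ID,
    )["Credentials"]
    return boto3.Session(
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )


def tail_recent_errors(log_group: str = "/vendor/app") -> list[str]:
    """Pull recent error lines from the deployment in the customer's VPC."""
    logs = customer_session().client("logs")
    resp = logs.filter_log_events(logGroupName=log_group, filterPattern="ERROR")
    return [event["message"] for event in resp.get("events", [])]
```

The catch is that you only have as much access as the customer’s security team is willing to grant—and they can revoke it at any time.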
Embedded Engineers
Palantir famously uses forward-deployed engineers who work directly inside customer organizations. This is overkill for most companies but can be worth it for a massive contract. Your team effectively becomes the ops team within the customer’s environment—solid for debugging, but not scalable for smaller deals.
Will On-Prem Ever Disappear?
I doubt it, though most SaaS founders wish otherwise. Life is easier when your code runs in one place (your own cloud infrastructure) and you can deploy updates at will. But certain industries—healthcare, military, finance, advanced AI—will always push for maximum secrecy. They’ll tell you, “you can’t see our data,” and that typically means putting the actual compute in their environment.
I’ve dreamed of building tools that operate on encrypted data without ever decrypting it, but it’s not generally viable for the average SaaS app right now. We do see some neat hybrid architectures emerging—like Tecton’s or Datadog’s—that split the difference by running data-processing components inside the customer’s VPC while hosting the rest of the logic in a central cloud. It’s a promising direction, but it doesn’t completely solve the fundamental tension between velocity and on-prem demands.
If you’re just getting started, my recommendation is pretty straightforward: either go all-in on on-prem as a key selling point (like Sourcegraph did) or avoid it altogether for as long as you can. The market of customers who don’t require on-prem is huge—often big enough to get you to product-market fit and beyond.
Introducing on-prem too early can slow you down and distract you from making your product great for the majority of your users. Eventually, you might decide it’s worth addressing specific high-paying segments, but only when you have the resources and stability to handle the complexity.
At the end of the day, iteration speed is the beating heart of a healthy software business. On-prem can be a serious drag on that heartbeat. Approach it with your eyes open, weigh the trade-offs, and decide if the revenue is truly worth the slowdown.