I hope this question isn’t off topic or too general because it isn’t specific to your product, but I just read your article about why you use terraform/docker instead of CM.
The whole immutable infrastructure thing works fine for your stateless app servers, not so much for data platforms like DBs. I can’t blue/green a multi-terabyte DB deployment. I can’t throw that away for each config change! Even backing up and restoring that volume of data just doesn’t make sense at scale.
Do you have a way of making this immutable? Or is the solution here to just pay AWS or a cloud vendor mucho $$ to handle this for you?
There are a few different ways to approach this issue:
Always run replication for your data store(s) so that you have at least two copies of the data, one primary and one standby, and do deployments by rolling out new code to the standby and routing all traffic to it once the new code is deployed. Maintaining an extra replica is useful anyway in case of failover, and this way, your backups are being constantly tested. Of course, there are availability & consistency trade-offs depending on whether it’s a synchronous or asynchronous replica, but this would be the “pure” immutable infrastructure approach.
Separate the data from the code. For example, with AWS, your database could store its data in an EBS volume. To do a deployment, you could deploy a new server with your new code, detach the EBS volume from the old server, and attach it to the new server. This requires no copying/replication of large data sets, but it does require a brief downtime (at least for writes) while you do the switchover.
Use a distributed data store. If you are using a data store that doesn’t live on a single system (e.g., a NoSQL DB instead of a relational DB), and all data is replicated across multiple servers, then you can do a zero-downtime, immutable, rolling deployment. For example, Kafka typically replicates each piece of data across 2 or 3 brokers. To do a deployment, you can take 1 broker out of rotation, deploy a new server to replace that broker, wait for the new server to either (a) get the data it needs via replication or (b) get the data it needs by attaching the EBS volume from the old broker, and then repeat the process for each of the other servers in your fleet.
Only use the “immutable infrastructure” approach for stateless services. Remember, the end goal isn’t immutable infrastructure itself, but making your infrastructure easier to maintain and reason about. If the vast majority of infrastructure is immutable, but a small, rarely-updated part is mutable (e.g., the database), that’s OK! You’re still getting a lot of benefits and as long as you’re aware of the trade-offs you’re making, this can be a pragmatic and reasonable approach.
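To make the rolling deployment for a distributed data store a bit more concrete, here is a rough sketch in Python. It is only an in-memory simulation, not real Kafka or AWS tooling: the Cluster class, the rolling_replace function, and the broker and partition names are all hypothetical, made up for illustration. The idea it shows is the safety check: before taking a broker out of rotation, confirm every piece of data still has a replica somewhere else, then swap in a brand-new server and move on to the next one.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class Cluster:
    """Hypothetical in-memory model of a replicated data store."""
    replicas: Dict[str, Set[str]]  # partition name -> ids of brokers holding a copy
    versions: Dict[str, str]       # broker id -> deployed version

    def available_without(self, broker: str) -> bool:
        # Every partition must keep at least one replica on another broker.
        return all(holders - {broker} for holders in self.replicas.values())


def rolling_replace(cluster: Cluster, new_version: str) -> List[Tuple[str, str]]:
    """Replace brokers one at a time; return the (old, new) id pairs."""
    replaced = []
    # sorted() takes a snapshot, so brokers added below are not revisited.
    for old in sorted(cluster.versions):
        if not cluster.available_without(old):
            raise RuntimeError(f"removing {old} would make data unavailable")
        # "Deploy a new server" instead of modifying the old one in place.
        new = f"{old}-{new_version}"
        del cluster.versions[old]
        cluster.versions[new] = new_version
        # Stand-in for the wait: the new broker gets the old broker's data,
        # either via replication or by re-attaching the old volume.
        for holders in cluster.replicas.values():
            if old in holders:
                holders.discard(old)
                holders.add(new)
        replaced.append((old, new))
    return replaced


cluster = Cluster(
    replicas={"p0": {"b1", "b2"}, "p1": {"b2", "b3"}},
    versions={"b1": "v1", "b2": "v1", "b3": "v1"},
)
rolling_replace(cluster, "v2")
```

In a real system the "wait for data" step is the slow part, and you would poll the data store's own replication status rather than assume the copy is instant as this sketch does. If any data lives on only one broker, the check fails and the rollout stops, which is the situation where one of the other approaches above is the better fit.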