You'd think that this topic would have been done to death, but given that *every* job I've started in the past 10+ years has used …
I've been writing code professionally for 24 years, 15 of them with Python and 9 of those with Docker. I got tired of running into the same complications every time I started a new job, so I wrote this. Maybe you'll find it useful, or it could even start a conversation, but this post has been a long time coming.
Update: I had a few requests for a demo repo as a companion to this post, so I wrote one today. It includes a very small Django demo using Docker, Compose, and GitLab CI.
Hey OP, it looks like you're the author of the post? If so I'm curious how you handle cloud services like AWS or Azure when taking this approach? One of the major issues I've run into when working with teams is how to test or evaluate against cloud services without creating an entire infrastructure in the cloud for testing.
For AWS, my favourite one is LocalStack, a Docker image that you can stand up like any other service and then tell it to emulate common AWS services: S3, Lambda, etc. They claim to support 80 different services which is... nuts. They've got a strange licensing model though, which last time I used it meant that they support some of the more common services for free, but if you want more you gotta pay... and they aren't cheap. I don't know if anything like this exists for Azure.
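To give a rough idea of what pointing an app at LocalStack can look like, here is a minimal sketch. It assumes the emulator runs as a Compose service named `localstack` on its default edge port 4566, and that the endpoint is passed in through an `AWS_ENDPOINT_URL` environment variable; the helper names are made up for illustration.

```python
import os


def aws_client_kwargs(environ=None):
    """Build keyword arguments for boto3.client() from environment variables.

    If AWS_ENDPOINT_URL is set (e.g. http://localstack:4566, an assumed
    Compose service name), requests go to the emulator instead of real AWS.
    """
    environ = os.environ if environ is None else environ
    kwargs = {"region_name": environ.get("AWS_DEFAULT_REGION", "us-east-1")}
    endpoint = environ.get("AWS_ENDPOINT_URL")
    if endpoint:
        kwargs["endpoint_url"] = endpoint
    return kwargs


def make_s3_client(environ=None):
    """Create an S3 client; hypothetical helper, requires boto3."""
    import boto3  # imported lazily so the helper above works without boto3

    return boto3.client("s3", **aws_client_kwargs(environ))
```

With the variable unset, the same code talks to real AWS, so the application itself doesn't need to know which one it is using.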
The next-best choice is to use a stand-in. Many cloud services are just managed+branded Free software projects. RDS is either PostgreSQL or MySQL, ElastiCache is just Redis, etc. For these, you can just stand up a copy of the actual service and since the APIs are identical, you should be fine. Where it gets tricky is when the cloud provider has messed with the API or added functionality that doesn't exist elsewhere. SQS for example is kind of like RabbitMQ but not.
In those cases, it's a question of how your application interacts with this service. If it's by way of an external package (say Celery to SQS for example), then using RabbitMQ locally and SQS in production is probably fine because it's Celery that's managing the distinction and not you. They've done the work of testing compatibility, so theoretically you don't have to.
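As a sketch of what that looks like in practice, the only thing that changes between environments is the broker URL handed to Celery. The variable name `CELERY_BROKER_URL`, the `rabbitmq` hostname, and the guest credentials below are assumptions for the example, not something from the post.

```python
import os

# Assumed default: a RabbitMQ container reachable as "rabbitmq" in Compose.
DEFAULT_BROKER_URL = "amqp://guest:guest@rabbitmq:5672//"


def celery_broker_url(environ=None):
    """Return the broker URL for Celery's `broker_url` setting.

    Locally this falls back to RabbitMQ; in production the environment
    would set CELERY_BROKER_URL to something like "sqs://".
    """
    environ = os.environ if environ is None else environ
    return environ.get("CELERY_BROKER_URL", DEFAULT_BROKER_URL)
```

The task code stays identical in both cases, which is the whole point of letting Celery own the difference.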
If however your application is the kind of thing that interacts with this service on a low level, opening a direct connection and speaking its protocol yourself, then swapping in a stand-in is probably not a good idea.
That leaves the third option, which isn't great, but I've done it and it's not so bad: use the cloud service in development. Normally this is done by having separate services spun up per user or even with a role account. When your app writes to an S3 bucket locally, it's actually writing to a real bucket called companyname-username-projectbucket. With tools like Terraform, the fiddly process of setting all this up can be drastically simplified, so it's not so bad -- just make sure that the developers are aware of the fact that their actions can incur costs is all.
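The per-developer naming scheme is simple enough to sketch. The function below is a hypothetical helper, not something from the demo repo; it only builds the name and checks it against S3's basic rules (lowercase letters, digits and hyphens here, 3 to 63 characters).

```python
import re


def dev_bucket_name(company, username, project):
    """Build a per-developer bucket name like companyname-username-projectbucket."""
    parts = []
    for part in (company, username, project):
        # Lowercase and replace anything outside a-z, 0-9 and "-" with a hyphen.
        cleaned = re.sub(r"[^a-z0-9-]", "-", part.lower()).strip("-")
        if not cleaned:
            raise ValueError(f"cannot build a bucket name from {part!r}")
        parts.append(cleaned)
    name = "-".join(parts)
    if not 3 <= len(name) <= 63:
        raise ValueError(f"bucket name must be 3-63 characters: {name!r}")
    return name
```

The resulting name could then be fed to whatever provisions the bucket, Terraform or otherwise.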
If none of the above are suitable, then it's probably time to stub out the service and then rely more heavily on a QA or staging environment that's better reflective of production.
A CLI to simplify local vs docker vs cloud operations. Reduces chance of operator error. Have had good luck with python click library.
Moving config settings into separate JSON and .env files to avoid loading too many config and secrets in the docker-compose file.
For AWS, I'd go with CDK. That way, cloud deployment is all in python (or typescript).
For cloud, you can also package Django into a single lambda, with dependencies inside a lambda layer. Not sure I'd use it in heavy production, but for small apps, really handy.
Inside Django settings, you can switch DB and services whether running local (sqlite, Redis), docker (postgres, RabbitMQ), or cloud (RDS, SQS).
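A minimal sketch of that kind of switch, assuming a made-up `APP_ENV` variable with the values `local`, `docker`, and `cloud`, and assuming the cloud database is PostgreSQL on RDS. The `DB_*` variable names and defaults are illustrative too.

```python
import os


def database_settings(environ=None):
    """Return a Django DATABASES dict for the current environment."""
    environ = os.environ if environ is None else environ
    mode = environ.get("APP_ENV", "local")
    if mode == "local":
        return {
            "default": {
                "ENGINE": "django.db.backends.sqlite3",
                "NAME": "db.sqlite3",
            }
        }
    if mode in ("docker", "cloud"):
        # Same engine in both cases; only the host and credentials differ.
        return {
            "default": {
                "ENGINE": "django.db.backends.postgresql",
                "NAME": environ.get("DB_NAME", "app"),
                "USER": environ.get("DB_USER", "app"),
                "PASSWORD": environ.get("DB_PASSWORD", ""),
                "HOST": environ.get("DB_HOST", "postgres"),
                "PORT": environ.get("DB_PORT", "5432"),
            }
        }
    raise ValueError(f"unknown APP_ENV: {mode!r}")
```

In `settings.py` this would be used as `DATABASES = database_settings()`.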
I don't mean to be snarky, but I feel like you didn't actually read the post 'cause pretty much everything you've suggested is the opposite of what I was trying to say.
A CLI to make things simple sounds nice, but given that the whole idea is to harmonise the develop/test/deploy process, writing a whole program to hide the differences is counterproductive.
Config settings should be hard-coded into your docker-compose file and absolutely not stored in .json or .env files. The litmus test here is: "How many steps does it take to get this project running?" If it's more than 1 (docker compose up) it's too many.
Suggesting that one package Django into a single Lambda seems like an odd take on a post about Docker.
I did read the post, but I assumed it was the starting point of a system or mechanism, not the end-point. Wanting to just run "docker compose up" is fine, but there is more to developing and deploying to production (and continuing post-launch).
That's why I mentioned the CLI. It lets you go from a simple local app (Django on sqlite) to a Docker one (postgres, celery, redis, etc.), to all the way out to the cloud (ECS/EKS/serverless lambda/RDS), without having to remember what commands do what or managing lots of separate docker-compose files.
I can see we are VERY far apart on how docker should be used in moving toward a production-ready system.
For one thing, recommending putting secrets inside docker-compose is an instantly disqualifying piece of advice. There's a whole 'secrets' section of docker compose that is there to prevent people from inadvertently including those in cleartext and baking them into images: https://docs.docker.com/compose/how-tos/use-secrets/.
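For what it's worth, reading a Compose secret from the application side is not much work. Compose mounts each secret as a file under `/run/secrets/<name>`; the sketch below reads from there and falls back to an upper-cased environment variable. The function name and the fallback behaviour are assumptions for the example.

```python
import os
from pathlib import Path


def read_secret(name, secrets_dir="/run/secrets", environ=None):
    """Read a secret from a mounted secrets file, else from the environment.

    Looks for <secrets_dir>/<name> first, then for an environment variable
    named after the secret in upper case (e.g. db_password -> DB_PASSWORD).
    """
    environ = os.environ if environ is None else environ
    path = Path(secrets_dir) / name
    if path.is_file():
        # Secret files often end with a newline; strip surrounding whitespace.
        return path.read_text().strip()
    try:
        return environ[name.upper()]
    except KeyError:
        raise RuntimeError(f"secret {name!r} not found") from None
```

A settings module could then call something like `read_secret("db_password")` instead of carrying the value in cleartext.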
GitHub itself has a secret scanning mechanism to prevent leakage: https://docs.github.com/en/code-security/secret-scanning/introduction/about-secret-scanning. For GitLab, there's also Blackbox or HashiCorp Vault. Putting an AWS key/secret inside a repo can be VERY expensive and open one to legal liability if the account is misused. Repeated infractions could lead to AWS banning one's account.
I really recommend you take down that part of your post, instead of proliferating bad practices.
Nice. You should check out devcontainers if you haven’t already. Maybe it deviates a little from the dev/prod parity idea, but you can use it with a compose file like you described. It’s saved my current team quite a bit of headache in maintaining local dev environments and keeping everyone in sync as the project evolves.
I either missed it or it isn't in the "developer tools" section: how do you connect this to an IDE or editor with an LSP or DAP? The image might have python:3.12 but locally you only have python:3.6 mind you, so it's not something one can ignore. How do you handle this?
It is not realistic to replicate a production setup in development when you're working with sensitive user data. I've worked in different contexts (law enforcement, healthcare, financial services) where we've had complicated setups (in one instance including a thing called a pre-staging environment), but never would a sizeable team of developers have access to user data, and thus to a realistic setup in terms of size, let alone of quality of data.
Just trying not to confuse realistic testing with self-deception :)
Not convinced testing with synthetic data can pretend to be similar to a production environment.