Three months ago a new client came to us with a project hosted on Amazon Elastic Beanstalk (EB). Since then we’ve learned a lot about this technology and today I’d like to share some thoughts with you.
For some time, Elastic Beanstalk was a nice alternative to bare EC2 (Amazon Elastic Compute Cloud) or even Heroku. But today, in the era of Docker clusters, it is clear that it has fallen far behind its competition. I will try to explain why, using the Python environment as an example.
WHY YOU SHOULD STAY AWAY FROM IT
An EB deployment starts with the eb CLI zipping the entire app and uploading it to AWS. This triggers a deployment to the requested environment. The zipped app is then downloaded to each target EC2 instance and deployed there. In our case this took up to 10 minutes.
The process itself consists of 2 major steps:
- Configuring EC2 instance (stored in .ebextensions)
- Installing app’s requirements (stored in requirements.txt)
Our requirements.txt file is huge and (sadly) includes some GitHub dependencies, so even if all requirements are already installed, running pip install is quite slow.
This could easily be solved by moving the installation process out of EC2 (to Lambda, maybe) and bundling an already prepared virtualenv in the zip file itself. This would also solve the slow rollback problem described in the next point.
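To make the idea concrete, here is a minimal sketch of such a build step. It is our illustration only, not something EB supports out of the box. Since virtualenvs are generally not relocatable, the sketch swaps in pip's `--target` option to install into a plain directory instead. The function names and the `vendor` prefix are hypothetical.

```python
import subprocess
import sys
import zipfile
from pathlib import Path


def install_requirements(requirements: Path, target: Path) -> None:
    """Install dependencies once, at build time, into a plain directory."""
    subprocess.run(
        [sys.executable, "-m", "pip", "install",
         "--target", str(target), "-r", str(requirements)],
        check=True,
    )


def build_bundle(app_dir: Path, deps_dir: Path, out_zip: Path,
                 deps_prefix: str = "vendor") -> list:
    """Zip the app source together with the pre-installed dependencies.

    `deps_prefix` is a hypothetical folder name inside the archive.
    `out_zip` should live outside `app_dir` so it does not include itself.
    """
    names = []
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, prefix in ((app_dir, ""), (deps_dir, deps_prefix)):
            for path in sorted(root.rglob("*")):
                if path.is_file():
                    arcname = (Path(prefix) / path.relative_to(root)).as_posix()
                    archive.write(path, arcname)
                    names.append(arcname)
    return names
```

With a bundle like this, the instances would only need to unpack the archive and put the `vendor` directory on the Python path, so a deployment or rollback would not have to run pip at all.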
What can be more frustrating than slow deployments? You guessed it: slow rollbacks.
Rolling back works very much like deploying: EB downloads the zip file and installs it. And the worst part is that there is absolutely no caching. So imagine a scenario where you introduced a terrible bug in the newest release and want to do a rollback ASAP… nope.
In a typical workflow, when you want to introduce something new, you first have to do it manually on an EC2 instance. After you succeed, you have to create a config file in .ebextensions.
- The biggest problem is that you can test .ebextensions only by deploying the app, and believe me, it is really frustrating when after the 10th deployment you still have errors.
- The second problem is the limited choice of ways to provide file content in the files section of a config file. There are only 2 ways to do it: S3 or inline. There definitely should be a way to simply copy files that are already in the app.
- Another problem is that you can’t edit the WSGI configuration to suit your needs. We wanted to add some rewrite rules in the VirtualHost section; not a chance. We ended up doing some nasty workarounds.
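One way to soften the "S3 or inline" limitation is to generate the inline variant from files in the repository before deploying. The sketch below is only an illustration of that idea. It assumes that .ebextensions config files may be written as JSON as well as YAML, and that the `files` entries accept the `mode`, `owner`, `group` and `content` keys. The function name and the paths used are hypothetical.

```python
import json
from pathlib import Path


def render_files_config(mapping: dict, mode: str = "000644",
                        owner: str = "root", group: str = "root") -> str:
    """Render a `files` section that inlines the content of local files.

    `mapping` maps a target path on the instance to a local file in the app.
    """
    files = {}
    for target_path, local_file in mapping.items():
        files[target_path] = {
            "mode": mode,
            "owner": owner,
            "group": group,
            "content": Path(local_file).read_text(),
        }
    return json.dumps({"files": files}, indent=2)


# Hypothetical usage in a pre-deploy script:
#   config = render_files_config({"/etc/myapp/settings.ini": "conf/settings.ini"})
#   Path(".ebextensions/10_files.config").write_text(config)
```

It is still a workaround, but the files stay in the app where they can be edited and reviewed, and the generated config never has to be written by hand.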
NO WAY TO RUN SINGLE PROCESS ACROSS ALL EC2S
If you use autoscaling and run more than one EC2 instance, there is literally no way to run a single process on only one of them.
In our stack (web app + celery workers + a single celerybeat) we have to run celerybeat ONCE. We ended up running celerybeat in a separate environment to achieve that. This means that we have to deploy to 2 environments each time we roll out a new version.
ERROR PRONE DEPLOYMENTS
Because versions share the same virtualenv, you cannot remove a dependency; it stays there forever. We had a major problem when we removed a large dependency and moved a small method from it to our own package with the same name. Do you know what happened? We got an ImportError because Python tried to import it from the old location.
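The mechanism is easy to reproduce outside EB. The sketch below is a simplified illustration, not our actual setup: it creates two throwaway directories that both contain a package with the same (hypothetical) name, `bigdep_demo`, and puts the stale one first on `sys.path`, which is our assumption about what effectively happened on the instance.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def write_package(root: Path, name: str, source: str) -> None:
    """Create a one-file package `name` under `root`."""
    package_dir = root / name
    package_dir.mkdir(parents=True)
    (package_dir / "__init__.py").write_text(source)


def demo_shadowing() -> str:
    """Show that a leftover package hides a new one with the same name."""
    with tempfile.TemporaryDirectory() as tmp:
        stale = Path(tmp) / "stale_site_packages"  # the dependency never removed
        app = Path(tmp) / "app"                    # our own package, same name
        write_package(stale, "bigdep_demo", "def old_api():\n    return 'old'\n")
        write_package(app, "bigdep_demo", "def small_method():\n    return 'new'\n")

        saved_path = list(sys.path)
        sys.path[:0] = [str(stale), str(app)]  # the stale location is searched first
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("bigdep_demo")
            loaded_from = Path(module.__file__).parent.parent.name
            try:
                from bigdep_demo import small_method  # noqa: F401
                outcome = "imported"
            except ImportError:
                outcome = "ImportError"
        finally:
            # Restore the interpreter state so the demo has no side effects.
            sys.path[:] = saved_path
            sys.modules.pop("bigdep_demo", None)
    return f"{loaded_from}: {outcome}"


print(demo_shadowing())
```

The first matching package on the search path wins, so the new `small_method` is never found for as long as the old copy is still installed.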
ELASTIC BEANSTALK DOCKER ENVIRONMENT
There are two versions of it: Single Container Docker, which runs a single Docker container on each EC2 instance, and Multi Container Docker, which runs the same set of containers on each EC2 instance.
This could solve some of the problems EB has, for example by making deployments less error prone, but it is still not a really comprehensive and future-proof solution.
Are there any other alternatives?
Yes. One of them is Kubernetes backed with something like Deis to get Heroku-like deployments. The downside of this solution is complex maintenance if you don’t have proper devops experience. However, Google already offers managed Kubernetes clusters.
We are going to start evaluating one of them soon, so stay tuned for a new blog post.