A wrap-up of my experience over the past two days. I tried to automate the deployment of Kubeflow onto a self-built Kubernetes cluster with Ansible. A 5/7 experience, with partial success so far.
I generally like the idea of Kubeflow, which, in contrast to SageMaker, lets me work and test small things out without being constantly connected, which feels great on a train or a plane. I wanted to play with it more extensively so that I could not only share my data-science experience, but also understand whether deploying it is easy enough to be handled by a product team without burdening dedicated in-company platform people.
Good pieces
Ansible has got much better since the last time I went through its docs to get a general feeling for how things work. The modules have become more idiomatic, and there's much less need to plug in raw commands. Some interfaces have become nicer (passing a list of packages to apt makes more sense than looping over them with with_items). Asserts seemed like a great idea, but after some time I found them more distracting than helpful. Inventory plugins are really cool: no more need for separate magical scripts to get the list of hosts. Yet if you're not an AWS or GCP user, you might be unpleasantly surprised by their quality. The Scaleway plugin works, but it feels unpolished and needs experimentation to make things go as expected. In the end I managed to create a flow of playbooks that creates a set of machines and then provisions them, which looks like one step away from pretty nice scaling automation.
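To illustrate the apt point, here is a rough sketch of the two styles side by side. The package names are placeholders I picked for the example, not the list used in these playbooks.

```yaml
# Older style: one loop iteration per package (package names are illustrative).
- name: Install packages one at a time
  ansible.builtin.apt:
    name: "{{ item }}"
    state: present
  with_items:
    - curl
    - ca-certificates

# Newer style: hand the whole list to the module in a single task.
- name: Install packages in one go
  ansible.builtin.apt:
    name:
      - curl
      - ca-certificates
    state: present
    update_cache: true
```

The second form reads closer to what you actually mean, and the module can resolve the whole list in a single apt run.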
Getting Kubernetes to the point where it could run some pods was easy: kubeadm does all the work for you. I felt it was the only easy part of Kubernetes :)
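As a minimal sketch of what that step can look like from Ansible (not the exact task from my playbooks): the pod_network_cidr variable is a hypothetical name, and its value depends on the network add-on you pick.

```yaml
# Assumption: pod_network_cidr is a variable you define yourself.
- name: Initialise the control plane with kubeadm
  ansible.builtin.command:
    cmd: "kubeadm init --pod-network-cidr={{ pod_network_cidr }}"
    # kubeadm writes this file on success, so the task is skipped on re-runs.
    creates: /etc/kubernetes/admin.conf
```

Using creates is a cheap way to keep a raw command idempotent without writing a separate check task.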
Challenging pieces
I understand that most of this comes down to me not having read enough about some specific technology, library or component.
- Kubernetes documentation is visually nice. Content-wise, it needs a lot of improvement and more thorough explanations. I don't like the parts where you are supposed to take some command and run it just because you were told to. Because of that I had to go through pieces of the cgroups documentation, which was not a bad thing, since installing Kubernetes implies a good knowledge of the Linux ecosystem. The part about network add-ons, however, is mind-bending: you're given 6 alternatives without a proper decision strategy for choosing one over the others, and next to them a link to 10+ more, each with a one-line description. Yet going with this magic made things work.