(2023-04-10) ReThyncing DevOps
------------------------------

This summer, I'm about to reach the 7-year mark of working as a DevOps engineer, whatever that means to you. Yes, the term "DevOps" still doesn't have a commonly understood definition, which is a problem in itself, because customers, vendors and contractors often think about different things when they merely hear this term, let alone when they actually try to build any kind of business relationship on this ground with each other.

Some folks perceive DevOps as a bunch of sysadmins on steroids doing cloud deployment automation and/or SRE-style support, some view them as the guys who actually help development teams organize CI/CD and release management processes, and some even think that there's no such thing as a "DevOps engineer" at all: it's just a set of common practices and methodologies used by whatever specialists the teams already have, and a new job position isn't necessary to implement them.

Well, the truth is, since IRL I've had to do all this stuff for different customers, I have to agree with all these points, even with the last one stating we don't exist. Why? Because in an ideal world, we wouldn't. In an ideal world, developers would know how to write code that's easy to deploy and scale, and sysadmins would know how to offer developers a way to deploy and scale this code with little to no human intervention. But we don't live in an ideal world either, and that's why DevOps engineers exist. And the practices we employ are also far from ideal.

There are, however, several major issues we could have avoided en masse (and on the individual/team level, anyone still can) that have become deeply rooted in the DevOps discipline as a whole over all these years, as this discipline started to encompass more and more practices from different areas of IT, and these issues just won't go away unless a conscious collective effort is made to eliminate them entirely.

The first and probably the biggest major issue is the totally blind assumption that everything that starts small _will_ grow large. Hence all the premature project complexity added for the sake of scalability that might never become necessary. Writing a simple network service script that's reasonable enough to put behind a properly configured xinetd and that's it? Hold up a second:

- "No, you don't know where you're going to run it from, containerize it!"
- "No, you don't know what the workload is gonna be, put your service container into a Kubernetes cluster!"
- "No, you don't know how many machines you'll need, put your cluster into an autoscaling host group and configure load balancing on the cloud side too!"
- "No, it's considered bad practice to do it manually, use IaC tools like Terraform or CloudFormation!"
- "Oh, and don't forget to use Helm charts instead of bare Kubernetes manifests because the latter will drive you insane!"
- "No, you can't just maintain your service code and IaC files in one place, they need to be in separate repos, and... oh, why didn't you already put everything in Git, it would make your life so much easier!"
- "Fine, you have everything in Git, but where's your CI/CD process? Write a Jenkinsfile or .gitlab-ci.yml or something already..."
- "What do you mean you're running Jenkins directly on the host? You have to create a corresponding autoscaling cluster for its worker nodes, and then also maintain it within your IaC repo, and make two jobs where one builds your service and puts it into the artifact storage (try Artifactory), and the other deploys it onto your cluster from there, and..."

Fuck. Off. I'm rsyncing my script onto the server along with the xinetd.conf. And it works without any of your resource-hogging bullshit. IF and when I need to build a scalable system, I'll build a scalable system. I know how to build them without your "best practices".
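To make that concrete, here's a minimal sketch of such a setup. Everything in it is hypothetical: the service name (myecho), the port (10101) and the paths are made up, and the script itself is just a trivial line-uppercasing responder. The point is that xinetd accepts the TCP connection and hands it to the program on stdin/stdout, so the "network service" doesn't even need socket code.

A hypothetical /etc/xinetd.d/myecho:

    service myecho
    {
        type        = UNLISTED
        port        = 10101
        socket_type = stream
        protocol    = tcp
        wait        = no
        user        = nobody
        server      = /usr/local/bin/myecho.py
        disable     = no
    }

And the hypothetical /usr/local/bin/myecho.py it points to:

    #!/usr/bin/env python3
    # xinetd hands us the accepted connection on stdin/stdout,
    # so the whole "service" is just a filter over standard I/O
    import sys

    for line in sys.stdin:
        # answer every request line with its uppercased version;
        # flush so the client gets the reply without buffering delays
        sys.stdout.write(line.upper())
        sys.stdout.flush()

Deployment then really is an rsync of the script into /usr/local/bin/ and the stanza into /etc/xinetd.d/ on the target host, marking the script executable and telling xinetd to reload its configuration. No registries, no pipelines, no clusters.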
The second major issue arises as a direct consequence of the first one: making everything VCS-centric instead of human-centric. Despite the name, version control systems are now being used for every single aspect of development except the actual version control. Now, Git itself is still great for the purposes it was initially created for: to simplify collaboration on the project code for entire teams. And for these purposes, it's easy to see why it surpassed CVS, Subversion, Mercurial etc. And I might not like every aspect of it ("you made a typo? screw you, you can't override your already pushed commit unless you have an explicit permission to force-push, so everyone will see from now on that you made a typo"), but for TEAM collaboration on SOFTWARE code, I guess nothing better exists yet. However, there are cases where I consider Git (or any VCS, for that matter) a bit of an overkill:

1. Personal projects. If you work on the code alone, rsync over SSH is fine enough for both remote storage and deployment. If you need to preserve a version, just make a tarball of your code tree before further editing, name it accordingly and back it up in a reliable place. And anyway, lots of small files with unreadable metadata in .git must die. I publish my projects on various Git hosting services (started with GitHub long ago, then moved to GitLab, now on SourceHut) more out of courtesy to those who might be ready to fork them than out of personal necessity. And mind you, I started doing this when I was too poor and uninformed to order a VPS I could trust at least to some extent. However, now that I have one, with all the recent repo service owners' craziness (like the "no-crypto" policy on SourceHut and all the SJW madness on GitHub/GitLab), instead of increasing the entropy by installing Forgejo/Gitea or whatever on my VPS, I'm actually considering stopping this altogether and providing a simpler access interface to all of my source code trees or tarballs instead. Maybe even via Gopher, why not?

2. Documentation. Unless we are talking about code comments or docstrings that various languages support, any other documentation must be separated from the code entirely. And collaborating on it is much better done on wiki-type resources or in real-time collaborative editors. I won't endorse anything in particular here, but there are several cool FOSS options for this.

3. Infrastructure as code. Again, there is nothing wrong with describing infrastructure in a nice, readable, declarative syntax. I fully support this movement. What I don't support is putting it into Git. Repeat after me seven times: infrastructure is not software. Being able to describe it as code doesn't make it software. Software development practices don't apply to infrastructure. Files that describe infrastructure can never be subject to large team collaboration, or you WILL run into a disaster sooner or later.

The third major issue is trying to containerize everything. Containers are not a bad thing per se; on x86 systems I use them for a lot of things myself, and even this host actually runs many containerized web proxy/gateway services behind Traefik. However, in my case it was a necessity because I probably couldn't have gotten these services to work all at once on the host system otherwise. And this is the problem: containerization discourages service developers from writing simple, interoperable, portable and secure code. They no longer have to test it anywhere except a single version of a single Linux distro of their choice. Instead of trying to reduce external dependencies as much as possible, they encapsulate all of them into the image and tell us "here, you're not going to have any hiccups if you run this as a whole thing". Instead of providing command-line options for us to be able to specify where to look for all the configuration and data, they gladly hardcode these paths because all the customization can be done via volume mounts anyway. Instead of learning to work within limited user permission boundaries, they get root access to everything within the container and manipulate various directories they don't really need to touch, not to mention opening privileged network ports they don't need to use.
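For contrast, here's a minimal sketch of the kind of service that paragraph wishes for, with all names assumed for illustration (the --config and --socket options, service.json and /tmp/demo.sock are made up): the config location and the listening socket come from the command line, the service binds a Unix domain socket instead of a privileged TCP port, and it runs happily as an unprivileged user.

    #!/usr/bin/env python3
    # Hypothetical container-free service: no hardcoded paths, no root,
    # no TCP port at all: it listens on a Unix domain socket.
    import argparse
    import json
    import os
    import socket

    def main():
        parser = argparse.ArgumentParser(description="container-free demo service")
        parser.add_argument("--config", default="service.json",
                            help="path to the JSON config file")
        parser.add_argument("--socket", default="/tmp/demo.sock",
                            help="Unix domain socket to listen on")
        args = parser.parse_args()

        # read settings from wherever the operator decided to keep them
        with open(args.config) as f:
            cfg = json.load(f)
        greeting = cfg.get("greeting", "hello")

        # clean up a stale socket from a previous run, then listen
        if os.path.exists(args.socket):
            os.unlink(args.socket)
        srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        srv.bind(args.socket)
        srv.listen(8)

        while True:
            conn, _ = srv.accept()
            with conn:
                data = conn.recv(4096)
                reply = "%s: %s\n" % (greeting, data.decode(errors="replace").strip())
                conn.sendall(reply.encode())

    if __name__ == "__main__":
        main()

Anything that can speak to a Unix socket (a reverse proxy, a neighbouring service, a quick test with socat) can consume it, and relocating its config or socket is a command-line flag away rather than a rebuild of an image.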
Containerization is great if you know what you're doing, and only if you could write a proper service without it but genuinely need to isolate all internal networking (if your services, for some reason, can't be designed to conduct all internal communication via Unix domain sockets instead) and/or to pin particular versions of particular dependencies you really cannot get rid of. In any other case, it is the opposite of efficient and also creates totally unnecessary entities ready to be shaved off with Occam's razor.

Finally, the fourth major issue is a neverending stream of buzzwords aimed at concealing the truth. I could babble about them all day, but my favorite example is "serverless". Lolwut? Serverless? Does it mean some peer-to-peer architecture? No, it's a glorified name for a fucking function-as-a-service (yes, FFaaS, now THAT should be the term!) that still runs on servers and consumes extra resources you're paying for. And no, the AWS Lambda name is in no way better. Historically, the word "lambda" came from lambda calculus, where it denoted _anonymous_ functions, and it has meant anonymous functions ever since. In AWS though, every "lambda" has a name; well, how would you run it otherwise?

I love my job, and if I quit it, I think I'll quit the sphere of IT entirely. But, after seven years of experience in the DevOps world, I fully and firmly understand that the only buzzword and methodology it desperately needs right now is the KISS principle. Keep it simple, stupid.

--- Luxferre ---