New Sheriff in Town
Having hopped on the CI/CD train early, I spent most of my first engagements introducing a client to CI/CD and standing up an initial implementation. Later, as DevOps and CI/CD matured, it became common to take over a failed or barely limping CI/CD implementation. The T word the title refers to is TRANSITION. This article discusses some of the pitfalls and gotchas of inheriting someone's CI/CD infrastructure. Why must we deal with another person's mistakes? Typically, after a client has spent X dollars, where X may be large, it is politically infeasible to ditch the current bad way of doing things. For this scenario I recommend a more adaptive approach to resolving the situation. So congratulations: you’ve won a new CI/CD contract, you are the new sheriff in town, and you are stuck with a mess. What do you do now?
Unless things have gone really badly, there will be a handoff from the old DevOps team. This handoff usually lasts days or weeks. Depending on the size and complexity of the implementation, that is rarely enough time to glean everything from the incumbents, so you need to choose your questions carefully. Here I will give two pieces of advice: do not criticize the outgoing team's implementation in any way, and prioritize learning the path the application takes from code to production. The rationale behind the first item is simple. You need the information the outgoing team has, and criticizing their implementation makes them less likely to give it to you willingly. Remember, the shortcomings may not be their fault. The previous team may have been limited by time, resources, or any number of other factors, and opening old wounds will not encourage them to share what they know with you.
Path to Production
Ferreting out this information is important for many reasons, and it will have to be done no matter what. Having the old team guide you through the process is much easier than reverse engineering it. What exactly is the path to production? Well, it is different for every organization, but the fundamentals are the same: source code, through some process, is transferred to a production machine. In very immature organizations this can be as simple as a file copy from the developer's machine. In more advanced CI implementations there may be gates along the way, such as static analysis, automated testing, and binary packaging. Whatever the process is, make sure the outgoing team walks you through it step by step before they go. This is the perfect opportunity to create a wiki to document the process.
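One way to take notes during the walkthrough is to write the path down as an ordered list of stages, each with a gate that must pass before the next one runs. The sketch below is only an illustration of that idea. The stage names come from the examples above, while the Stage class, the run_path function, and the always-passing gates are hypothetical placeholders, not anything a particular CI tool provides.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    """One step on the path to production (hypothetical model)."""
    name: str
    gate: Callable[[], bool]  # returns True when the gate passes


def run_path(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failing gate.

    Returns the names of the stages that completed.
    """
    completed = []
    for stage in stages:
        if not stage.gate():
            break
        completed.append(stage.name)
    return completed


# Placeholder gates: in a real pipeline each would call the actual tool.
path_to_production = [
    Stage("static analysis", lambda: True),
    Stage("automated testing", lambda: True),
    Stage("binary packaging", lambda: True),
    Stage("copy to production", lambda: True),
]

# The same path with a failing test gate never reaches packaging.
broken_path = [
    Stage("static analysis", lambda: True),
    Stage("automated testing", lambda: False),
    Stage("binary packaging", lambda: True),
]

print(run_path(path_to_production))
print(run_path(broken_path))
```

Even if nothing ever executes this, writing the path in this form forces the question "what has to be true before the next step runs?" for every stage, which is exactly what you want the outgoing team to answer.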
Owning the Process, Warts and All
The first step to fixing a problem is understanding it. You have been brought in to correct current CI/CD deficiencies, but at this point you only know them secondhand. After you get as much information as possible from the outgoing team and learn the path to production, it is time to put all your wikis and notes to the test. What has worked for me in the past is working with project management to have you or your team introduce a low-impact change, or story in the agile parlance, into the production environment. In the ideal scenario, you have the expertise of the previous team standing at the ready, but you make every attempt to introduce the change by yourselves. After walking through a small release, and possibly making a few mistakes, two things will happen. First, you will have gained the confidence to keep the current process moving. Not stopping forward progress can be very important for client credibility, and it reduces the overall cost of your introduction to the project. Second, you will learn firsthand what the flaws are.
Manual Steps, One of the Warts
True CI/CD is supposed to be highly automated. Unless you are Netflix or Google, it is highly unlikely that you have true full automation. Usually there is some minor but critical set of tasks, not completely obvious, that are manual. One of the advantages of owning the process is exposing these little warts as quickly as possible. But owning the process as described will not expose all of them. Since it only focuses on the path to production, many other IT processes are missed: upgrading the platform application stack, OS upgrades, and hardware replacement, for example. While you should focus on the path to production first, see if the outgoing team can address these items in some way before they leave. At best there is a script they forgot to tell you about originally; at worst you have a roadmap for future work.
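A simple inventory is often enough to turn these warts into that roadmap. The following is a minimal sketch, assuming you record each task with the area it belongs to and whether it is automated. The Task class, the roadmap function, and the sample entries are all hypothetical and only illustrate the bookkeeping.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    """One operational task learned during the handoff (hypothetical model)."""
    name: str
    area: str                       # e.g. "path to production" or "platform"
    automated: bool
    why_manual: Optional[str] = None


def roadmap(tasks: List[Task]) -> List[Task]:
    """Return the manual tasks, path-to-production items first.

    sorted() is stable, so tasks keep their recorded order within each group.
    """
    manual = [t for t in tasks if not t.automated]
    return sorted(manual, key=lambda t: t.area != "path to production")


# Sample entries, invented for illustration only.
inventory = [
    Task("OS upgrades", "platform", False, "no patching process handed over"),
    Task("package the build", "path to production", True),
    Task("restart app server after deploy", "path to production", False,
         "script exists but nobody trusts it"),
    Task("hardware replacement", "platform", False),
]

for task in roadmap(inventory):
    print(f"{task.area}: {task.name} ({task.why_manual or 'reason unknown'})")
```

Any entry that still reads "reason unknown" is a question worth asking before the outgoing team leaves.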
Changing the Tires at 90mph
By this point you or your team should have at least a minimal idea of how the old CI/CD infrastructure works. The next step is incremental improvement, an approach that fits very well with the agile development philosophy. Internally, it is time to decide what to change first, hopefully for the better. My advice is to pick several items that are visible to both management and the CI/CD consumers (developers and testers), and to start small. This is where maintaining the CI/CD pipeline sets us apart from normal developers: just because our services run in dev doesn’t mean they have dev SLAs. Even though there may be major pain points in the CI infrastructure, picking a low-risk item and doing it right is more beneficial than picking a high-risk, high-reward change and having a troubled rollout. This technique is similar to Owning the Process, except now you are dealing with the internals of the CI infrastructure, not just rolling code to production.
Since your time and interactions with the outgoing team may be limited, be choosy about what questions you ask during the transition. First, figure out how code gets to production. Second, figure out how to own that process. Third, find all the manual steps and understand why they are manual. Finally, once you understand what the system is doing, try to effect change for the better by starting small.