What CI/CD practitioners know that ML engineers don’t… yet
With the accelerated adoption of AI in the software stack, we at CircleCI have seen our users take on new challenges and complexity. Beyond the technical challenges, the arrival of ML engineers and data scientists adds a new layer of cross-team collaboration complexity for organizations to handle. Model evaluation, data quality, and operations are no longer restricted to a small set of experts or isolated systems; every software engineer is navigating the shift from deterministic to probabilistic applications.
And of course, the pressure to ship products fast and efficiently is greater than ever.
ML engineers are also finding themselves more involved in the DevOps process. This is not always an easy road, as they are expected to hit the ground running with model deployment, or even to take on the whole software delivery pipeline.
In the DevOps ecosystem, we believe most developer tools are struggling to genuinely solve the newfound complexity of CI/CD for AI-powered applications. For the time being, few members of the DevOps ecosystem properly address it. Including us.
Here’s why.
The world of machine learning has evolved relatively separately from that of DevOps, with different practices. But with the rise of large language models, those worlds are colliding, and we are now collectively trying to catch up. For the past year, CircleCI has been focused on bringing ML model builders the same speed, efficiency, and confidence in shipping their work that we have provided to software engineers through 10+ years of experimentation with DevOps tooling and best practices.
For us at CircleCI, this new use case meant questioning the core assumption of CI/CD platforms: they are all centered on the version control system (VCS), and they all follow a linear flow - build, test, release, deploy - for each code change committed to the repository.
But AI brings a new paradigm, one where change lives outside of the VCS. Events happening in a dataset or model registry do not result in a commit to the repository, and yet they impact the application's behavior. We consequently want to give ML engineers the same control over testing and deploying the application that developers expect for an edit to the codebase. We used to treat models as mostly separate from the application, used in only a few bespoke systems, but AI is now becoming as much a part of the stack as a database.
That's why we started thinking about how to shift our CI/CD platform to better accommodate the builders of AI-enabled applications. We first set out to let users of model registries automatically execute a comprehensive set of tests to validate the performance of the models and datasets they rely on.
This work resulted in our newly released inbound webhooks, which connect the CI pipeline to events happening outside of the VCS. With this release, we moved away from the old paradigm, where users needed to trigger their pipeline manually for every change, to a new one, where these changes automatically trigger a CI workflow.
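To make the mechanism concrete, here is a minimal sketch of the new flow: any system that can send an HTTP POST can now start a CI workflow. The webhook URL and payload fields below are hypothetical placeholders, not a fixed schema; a real URL is generated when you set up an inbound webhook for your CircleCI project.

```python
# A minimal sketch of the new flow: any system that can send an
# HTTP POST can now start a CI workflow.
import requests

# Placeholder URL - the real one is generated when you set up an
# inbound webhook for your CircleCI project.
WEBHOOK_URL = "https://example.circleci.com/webhooks/<your-webhook-id>"

# Illustrative payload describing an out-of-VCS event, such as a model
# registry update. The field names here are assumptions, not a fixed schema.
event = {
    "event": "model.updated",
    "model": "my-org/my-image-detector",
    "revision": "v2.1",
}

response = requests.post(WEBHOOK_URL, json=event, timeout=10)
response.raise_for_status()  # a non-2xx status means no pipeline was triggered
```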
Since the release of inbound webhooks in December 2023, we have monitored usage and immediately saw our users trigger workflows from ML platforms, including - you guessed it - Hugging Face. Indeed, one of the first use cases to surface was Hugging Face users wanting to trigger model evaluations anytime a new version of their model is published on the platform.
In 2024, we are focused on accelerating this shift, enabling automated and easily scalable integration and deployment for AI models and AI-enabled applications.
If you want to see how to use webhooks with Hugging Face, this tutorial will show you how to trigger a pipeline whenever an image detection model hosted on Hugging Face is updated. These webhooks are available to all CircleCI users who authenticate via GitHub App. Beyond Hugging Face events, you can use our webhooks to create events relevant to your particular workflow and deployment process from any platform, including Weights & Biases, Docker Hub, or Replicated.
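As a taste of what such a triggered pipeline can run, here is a sketch of a simple evaluation job for an object detection model. The model repo, test image path, and confidence threshold are illustrative assumptions, not taken from the tutorial.

```python
# A sketch of an evaluation job the triggered pipeline could run.
# The model repo, test image path, and 0.8 threshold are illustrative.
from PIL import Image
from transformers import pipeline

detector = pipeline("object-detection", model="my-org/my-image-detector")

image = Image.open("tests/fixtures/street_scene.jpg")
detections = detector(image)

# Fail the CI job if the freshly published model version no longer
# produces any confident detection on a known-good image.
confident = [d for d in detections if d["score"] >= 0.8]
assert confident, f"Model regression: no detection scored above 0.8 ({detections})"
print(f"{len(confident)} confident detections - model update looks healthy")
```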
With manual pipeline triggers eliminated and feedback loops made faster, we want to double down on enabling model deployment. The first milestone is our Amazon SageMaker integration, and we are working to extend support to other platforms.
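For a sense of what this kind of deployment automation involves, here is a sketch of a canary rollout expressed directly against the SageMaker API via boto3. The endpoint, config, and alarm names are illustrative, and this is not the integration's internal code.

```python
# A sketch of a canary rollout on SageMaker via boto3.
# Endpoint, config, and alarm names below are illustrative.
import boto3

sm = boto3.client("sagemaker")

sm.update_endpoint(
    EndpointName="image-detector-prod",      # hypothetical live endpoint
    EndpointConfigName="image-detector-v2",  # config pointing at the new model
    DeploymentConfig={
        "BlueGreenUpdatePolicy": {
            "TrafficRoutingConfiguration": {
                "Type": "CANARY",
                # Shift 10% of capacity to the new fleet first...
                "CanarySize": {"Type": "CAPACITY_PERCENT", "Value": 10},
                # ...then wait 5 minutes before shifting the rest.
                "WaitIntervalInSeconds": 300,
            },
            "TerminationWaitInSeconds": 300,
        },
        # Roll back automatically if a CloudWatch alarm fires during rollout.
        "AutoRollbackConfiguration": {
            "Alarms": [{"AlarmName": "image-detector-5xx-rate"}]
        },
    },
)
```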
The rise of AI in software development has brought fresh challenges to CI/CD processes, and most platforms have yet to grasp the nuances of AI applications. We hope to support a growing number of Hugging Face users, and we are eager to hear feedback on these webhooks and on all our work to tackle these challenges. As ML teams join the effort on fast software delivery, we are committed to continuing to ease the DevOps processes for them.
TL;DR
- Hugging Face users can now automatically trigger a pipeline to run their tests for every update of their Hugging Face model or dataset
- Deploying model endpoints on SageMaker with canary or blue/green deployment is now possible on our platform
- We launched a course on automated testing for LLMOps with DeepLearning.AI
- We regularly release new work to support delivery pipelines for AI-enabled applications - all updates are published in this product newsletter