IBM Develops AI Agents to Automate Software Engineering Tasks

IBM Research is testing a set of artificial intelligence (AI) agents that both discover bugs in code found in a GitHub repository and recommend ways to remediate them.

Ruchir Puri, chief scientist at IBM, said version 1.0 of the software engineering (SWE) AI agents makes use of multiple large language models (LLMs) to automate tasks.
He added that the overall goal is to substantially reduce the backlog of bugs that application developers would otherwise need to address manually.

Instead, an application developer can assign a GitHub bug report to the agent by tagging it with IBM SWE. The agent will then find the problematic code and suggest a fix that the developer can review before implementing, said Puri.
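As a rough illustration of that tag-then-review flow, the sketch below filters a backlog for tagged issues, produces a placeholder suggestion, and records a developer's approval. It is not IBM's implementation or API: the label string `ibm-swe`, the `Issue` and `SuggestedFix` types, and the `propose_fix` stand-in are all hypothetical names chosen for the example, and no model is actually called.

```python
from dataclasses import dataclass, field

# Hypothetical label string; the article says only that issues are tagged "IBM SWE".
AGENT_LABEL = "ibm-swe"


@dataclass
class Issue:
    number: int
    title: str
    labels: list[str] = field(default_factory=list)


@dataclass
class SuggestedFix:
    issue_number: int
    file_path: str
    patch: str
    approved: bool = False  # stays False until a developer reviews the suggestion


def issues_for_agent(issues):
    """Return only the issues a developer has tagged for the agent."""
    return [issue for issue in issues if AGENT_LABEL in issue.labels]


def propose_fix(issue):
    """Stand-in for the agent's localize-and-fix step (placeholder values only)."""
    return SuggestedFix(issue.number, "path/to/suspect_file.py", "<diff placeholder>")


def review(fix, approve):
    """Record the developer's decision on a suggested fix."""
    fix.approved = approve
    return fix


if __name__ == "__main__":
    backlog = [
        Issue(101, "Crash on empty input", ["bug", AGENT_LABEL]),
        Issue(102, "Update docs", ["docs"]),
    ]
    for issue in issues_for_agent(backlog):
        fix = review(propose_fix(issue), approve=True)
        print(issue.number, fix.file_path, fix.approved)
```

The point of the sketch is the human checkpoint: a suggestion starts out unapproved and changes state only through an explicit review step, mirroring the review-before-implementing flow Puri describes.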

On average, the SWE agents can localize and fix problems within five minutes, with a 23.7% success rate on the SWE-bench tests. That score places the IBM SWE agent high on the SWE-bench leaderboard, well above many other agents that rely on massive frontier models, such as GPT-4o and Claude 3.

IBM is building a series of AI agents, including one for editing lines of code based on developer requests, that makes use of IBM’s Granite LLM found on the watsonx cloud service. There is also an agent that can be used for developing and executing tests.

In addition, IBM is building an orchestration framework that will simplify creating workflows spanning multiple agents, noted Puri.

It’s not clear just how AI will impact application development, but the role of software engineers is clearly going to evolve, he noted. AI won’t eliminate the need for software engineers, but it will, for example, enable application developers to spend much less time fixing bugs, said Puri.

DevOps teams are, of course, already making extensive use of AI. A Techstrong Research survey finds a third of DevOps professionals (33%) work for organizations that make use of AI to build software, while another 42% are considering it. Only 6% said they have no plans to use AI.

However, only 9% have fully integrated AI into their DevOps pipelines. Another 22% have partially achieved that goal, while 14% are doing so only for new projects. A total of 28% said they expect to integrate AI into their workflows in the next 12 months.

It’s now only a matter of time before software engineering becomes a lot less tedious, which in turn should give application developers an opportunity to spend more time building new applications rather than maintaining legacy ones, said Puri. Most of the tasks being automated are not ones that software engineering professionals enjoyed doing in the first place, he noted.

Hopefully, that will lead to more applications being built and deployed faster than ever. In the meantime, DevOps teams should create an inventory of the tasks that will soon be automated using AI agents. It won’t be long before DevOps teams consist of a mix of humans and AI agents that are specially trained to complete tasks whenever directed.