In a scientific paper, Google’s advertising team describes how it used artificial intelligence to handle four migration tasks from legacy to modern code: switching from 32-bit to 64-bit IDs, updating JUnit3 to JUnit4, converting time handling from Joda to java.time, and cleaning up unused example code across all repositories.
According to the paper, the team saved about fifty percent of the time and effort across all tasks. The gain in pure coding was larger, but it was partly offset by increased testing effort.
Challenges of the work
The Google Ads repository that the team had to rework contains over 500 million lines of code. The team used Gemini as the AI model, in a version tailored to the tasks: it was trained on Google repositories and had to make changes across all sources. The common problem in all tasks was the huge variation in the locations that needed replacing.
AI was not the only tool: traditional methods such as abstract syntax trees (AST), heuristics, and regular-expression search were used as well. ASTs mainly served to find the places that needed changing. The LLM then made the change itself. Finally, each change went through human review, just like any other code change.
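The division of labor described here, where an AST locates candidate sites and the LLM edits them afterwards, can be sketched roughly as follows. This is only an illustration: Google worked on C++ and Java, while the sketch uses Python's built-in `ast` module as a stand-in, and the sample source, the function name `find_id_sites`, and the `_id` naming heuristic are all assumptions, not details from the paper.

```python
import ast

# Hypothetical sample code to scan; not taken from the paper.
SOURCE = '''
def load_campaign(campaign_id: int, name: str) -> dict:
    user_id: int = 7
    retries: int = 3
    return {"campaign": campaign_id, "user": user_id, "name": name}
'''

def find_id_sites(source: str) -> list[tuple[int, str]]:
    """Return (line, name) for int-annotated names that look like ids."""
    sites = []
    for node in ast.walk(ast.parse(source)):
        # Function parameters such as `campaign_id: int`
        if isinstance(node, ast.arg):
            name, annotation = node.arg, node.annotation
        # Annotated assignments such as `user_id: int = 7`
        elif isinstance(node, ast.AnnAssign) and isinstance(node.target, ast.Name):
            name, annotation = node.target.id, node.annotation
        else:
            continue
        is_int = isinstance(annotation, ast.Name) and annotation.id == "int"
        # Naming heuristic (an assumption): ids end in "_id"
        if is_int and name.endswith("_id"):
            sites.append((node.lineno, name))
    return sorted(sites)

# Locations that would be handed to the LLM for the actual edit
candidates = find_id_sites(SOURCE)
```

The parser only finds and reports locations; it changes nothing. That keeps the deterministic part (where to edit) separate from the generative part (how to edit).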
The prompts consisted of general instructions about the task followed by more specific sub-steps, which were interchangeable depending on the application. The team also got help by asking the AI which of its suggested changes was best.
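A prompt built from a general instruction plus interchangeable sub-steps could be assembled along these lines. This is a minimal sketch: the wording of `GENERAL`, the grouping keys, and the function `build_prompt` are hypothetical, and only the sub-step sentences are taken from the prompt quoted in this article.

```python
# General task description; the wording is an assumption for illustration.
GENERAL = "You are migrating identifiers from 32-bit to 64-bit integers."

# Interchangeable sub-steps, grouped under hypothetical keys.
SUB_STEPS = {
    "change_type": ["{id} should be of type int64_t."],
    "update_tests": [
        "Update the tests to reflect a large id.",
        "Initialize the {id}s with values larger than 10000000000.",
    ],
}

def build_prompt(task_keys, **fields):
    """Join the general instruction with the selected sub-steps and fill in fields."""
    lines = [GENERAL]
    for key in task_keys:
        lines.extend(SUB_STEPS[key])
    return "\n".join(lines).format(**fields)

prompt = build_prompt(["change_type", "update_tests"], id="campaign_id")
print(prompt)
```

Swapping the list of keys changes which sub-steps the model sees, while the general instruction stays the same.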
Moving from 32 bit to 64 bit
Google Ads uses IDs for users, merchants, campaigns, and so on in C++ and Java code. They are not defined as separate types: most are plain integers, and some are text. The change from 32 bits to 64 bits affected “thousands of places”, including all class interfaces. The affected files were collected via AST, and the AI made the changes itself. However, the Google team applied the AI only to parts of the code that could be verified by unit tests, and those tests had to be adapted as well.
Human review found eighty percent of the fragments changed by the LLM to be correct. Where errors occurred, the AI had generally changed too much. Here is an example prompt:
{id} should be of type int64_t.
Update the tests to reflect a large id.
Initialize the {id}s with values larger than 10000000000.
If necessary add new test parameters with large ids.
If previous id was negative, new value should be negative.
This tells the AI to adapt only the class constructor in which the type should change. From that, the LLM also correctly adjusted the usages in the private section and within the class. The following images show the diff and the changes in the test files.
The diff shows changes that the AI completed correctly.
(Image: Google)
The test also shows correct changes made by the AI.
(Image: Google)
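Why the prompt asks for test ids larger than 10000000000 can be shown with a little arithmetic: such a value does not fit into 32 bits, so a test using it fails as soon as any code path still truncates the id. The following sketch simulates that truncation in Python. The helper names `fits_int32` and `wrap_int32` are illustrative assumptions, not part of the paper.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1

# A value just above the threshold named in the prompt
large_id = 10_000_000_001

def fits_int32(value: int) -> bool:
    """True if the value can be stored in a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX

def wrap_int32(value: int) -> int:
    """Simulate two's complement truncation to a signed 32-bit integer."""
    wrapped = value & 0xFFFFFFFF
    return wrapped - 2**32 if wrapped >= 2**31 else wrapped

print(fits_int32(large_id))   # False
print(wrap_int32(large_id))   # 1410065409, a different id
print(wrap_int32(-large_id))  # -1410065409, the sign is preserved here
```

A small test id such as 42 passes through a 32-bit path unchanged and would hide a missed location. A large id makes the remaining 32-bit usage visible.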
Problems switching the time API
Other migrations produced similar experiences. Switching from the old Joda library to the java.time package was difficult because not all types have a counterpart. For example, joda.time.Interval has no equivalent in Java’s standard time API, so such changes require intervention in the logic of the function. Here the authors of the paper admit: “Migration is still ongoing and we face challenges in solving it.”
The paper concludes that the best migration route is a mix of old and new methods. “We found that the planning capabilities of the LLM were not necessary and added a layer of complexity that should be avoided if possible.” AST has the advantage of always being correct. AST parsers are also suitable for checking changed code.
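Using a parser to check changed code might look like the following sketch. It again uses Python's `ast` module as a stand-in for the C++ and Java tooling, and the acceptance rule (the code must parse and the function signatures must be unchanged) is an assumption chosen for illustration, not the check described in the paper.

```python
import ast

def parses(source: str) -> bool:
    """True if the source is syntactically valid."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

def function_signatures(source: str) -> dict:
    """Map each function name to its list of positional parameter names."""
    return {
        node.name: [a.arg for a in node.args.args]
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.FunctionDef)
    }

def acceptable(before: str, after: str) -> bool:
    """Accept a change only if it parses and leaves all signatures intact."""
    return parses(after) and function_signatures(before) == function_signatures(after)

# Hypothetical original code and three hypothetical model outputs
BEFORE = "def total(user_id, amount):\n    return amount\n"
GOOD = "def total(user_id, amount):\n    return int(amount)\n"
RENAMED = "def total(uid, amount):\n    return amount\n"
BROKEN = "def total(user_id, amount:\n    return amount\n"

print(acceptable(BEFORE, GOOD))     # True
print(acceptable(BEFORE, RENAMED))  # False: the model changed too much
print(acceptable(BEFORE, BROKEN))   # False: the output does not parse
```

A check like this is cheap and deterministic, so it can reject obviously wrong output before a human reviewer sees it.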
The team recommends using smaller, optimized models: “They increase the reliability of the results and reduce the effort required for debugging.” Human review can be a hurdle because it requires experts. In the future, better tools could avoid this (keyword: agents), but the paper does not elaborate on this point.
The paper also says: “Whether this has long-term effects on quality should be monitored.” Overall, however, Google plans to further expand AI-assisted migration in its advertising division.
(Who)