Jackie Davalos and Dina Bass
When software developer Nikolai Avteniev got his hands on a preview version of Microsoft Corp’s Copilot coding assistant in 2021, he quickly saw the potential.
Developed by Microsoft’s GitHub coding platform and based on a version of OpenAI’s generative artificial intelligence, the assistant wasn’t perfect and sometimes got things wrong. But Avteniev, who works for ticket seller StubHub, was surprised by how ably it finished lines of code with just a few prompts. All he had to do was press the tab key, and Copilot filled in the rest. “Instead of using 15 keystrokes, it took three,” he recalled recently. “It was a nice little speed boost.”
Three years later, and now infused with the latest version of OpenAI’s GPT-4 technology, GitHub’s Copilot can do a lot more, including answering engineers’ questions and converting code from one programming language to another. As a result, the assistant is responsible for an increasingly significant percentage of the software being written and is even being used to program corporations’ critical systems.
Along the way, Copilot is gradually revolutionising the working lives of software engineers — the first professional cohort to use generative AI en masse. Microsoft says Copilot has attracted 1.3 million customers so far, including 50,000 businesses ranging from small startups to corporations like Goldman Sachs, Ford, and Ernst & Young. Engineers say Copilot saves them hundreds of hours a month by handling tedious and repetitive tasks, affording them time to focus on knottier challenges.
Acquired by Microsoft in 2018 for $7.5 billion, GitHub dominates its market and is betting Copilot has the AI horsepower to fight off rival services including Tabnine, Amazon’s CodeWhisperer and Google-backed Replit Ghostwriter. GitHub’s AI assistant is also a kind of beta test for a host of other Copilots that Microsoft is baking into Office, Windows, Bing and other business lines.
As is true with AI generally, GitHub Copilot has limitations. Developers say it sometimes pulls up outdated code, provides unhelpful answers to questions and generates suggestions that are buggy or could infringe copyright. Because the tool is trained on public and open repositories of code, engineers run the risk of replicating security issues or injecting new ones into their work, particularly if they blindly accept Copilot’s recommendations.
GitHub emphasises that the tool is an assistant, not a substitute for human programmers, and has put the onus on customers to use it wisely. Robust guidelines are required to prevent lazy programmers from simply accepting what Copilot suggests, said GitHub CEO Thomas Dohmke. He expressed confidence that engineers would keep one another honest.
Generative AI is the latest in a long line of innovations that have transformed computer coding over the years. Last century, program compilers accelerated software development by rapidly translating commands into ones and zeros that computers can understand. More recently, Linux popularised open-source coding, letting programmers leverage one another’s work rather than writing everything from scratch.
Coding assistants like GitHub’s Copilot could be even more revolutionary because generative AI holds the potential power to automate large swathes of what software engineers currently do.
For now, it mostly makes them more efficient. StubHub’s Avteniev, who also teaches software engineering at City College of New York, says Copilot’s predictive ability helps programmers stay in “the flow” because they no longer have to stop to look things up. Avteniev has been coding for more than 20 years, but even he sometimes forgets programming languages — forcing him to waste time Googling them. “Copilot stops you from having to exit your current coding process,” he said. “Even when it produces gibberish, it’s still easier to just accept what it does and then correct it myself.”
Aaron Hedges, a developer for more than 15 years, was getting burned out before Copilot arrived. Hedges works for ReadMe, a startup that helps companies create technical descriptions of their application programming interfaces, or APIs. Like Avteniev, he makes good use of Copilot’s auto-complete function. “Because I’m a fairly senior engineer, I can look at that and go, ‘Oh yeah, that looks right.’” He also likes that he can ask questions without leaving his programming window. “I don’t have to shift away and open a browser, which can be really disruptive,” he said.
At $10 a month, a Copilot subscription is a bargain that Hedges willingly pays himself. After work, he builds websites for Dungeons & Dragons fans. With a toddler and another baby on the way, leisure time is precious. “Those two hours I get to myself to code in the evening are super important to me,” he said. “The more efficient I can be, the better.”
Few tasks are more tedious than debugging software — a process that can consume as much as 50 per cent of an engineer’s time. Figma, which helps developers design app or website interfaces, says Copilot can create defect-testing programs in minutes rather than hours. “That is the real value of AI,” said Abhishek Mathur, the company’s vice president of engineering. “It doesn’t replace our work, but frees up our time to develop creative solutions.”
Some companies are starting to deploy Copilot to create code for critical systems. Brewer Carlsberg uses it to write code for an existing tool that helps the sales force plan, prepare for and document sales calls. Mindful of Copilot’s limitations, the beer maker uses its own quality-assurance process to check that the code it has created works as intended, according to Chief Information Officer Sarah Haywood. Eventually, she said, companies will be able to outsource that task as well. “As time goes on, people will build more trust in AI,” she said. “I don’t think we should be having to double-check everything that AI does, otherwise we’re not really adding any value.”
In an attempt to assess the technology’s accuracy, Canada’s University of Waterloo published an experiment last year. Researchers collected a dataset made up of code snippets that had known flaws and the fixes for those mistakes. The researchers prompted Copilot to create these exact snippets to see whether it would spit out the buggy versions. The assistant replicated the flawed version 33 per cent of the time, less frequently than a human. In a quarter of the cases, the AI spit out code with the fix. Copilot generally was better at avoiding basic errors than more complex ones, said Mei Nagappan, a computer science professor at the school and one of the study’s authors. “The analogy here is that we are in an era of driver assist right now, not yet at the self-driving stage,” he said.
Software engineers can be slow to change their work habits. Many welcome Copilot but are wary about becoming too reliant on it. A recent GitHub-funded study found developers accepted the assistant’s suggestions just 27 per cent of the time.
Engineers also can be quick to blame Copilot if something goes awry. When Etsy’s site crashed for short periods last October and December, some of the company’s developers fingered Copilot for the outage. Etsy confirmed the incidents but disputed that Copilot was responsible. “While we certainly understand that engineers may discuss how Copilot could theoretically play a role in outages or issues, we have zero evidence that the tool has actually led to any customer-facing impacts,” a spokesperson said.
Copilot is expected to improve dramatically in the coming years. GitHub is already rolling out enhancements, including an enterprise version that can answer questions based on a customer’s own programming code, which should help new engineers get up to speed and enable veteran coders to work faster. In the coming months, GitHub also will let engineers use their employer’s own codebase to help auto-complete programs they’re working on. That will make the code generated more customised and helpful.
GitHub can’t afford to sit still. At least a dozen startups are looking to disrupt the market. Some are leveraging new models that have dramatically boosted the amount of information code assistants can draw on quickly, making it easier for them to generate entire programs. “An AI programmer that can see all of your code is going to be able to make much better decisions and write much more coherent code than one that can only sort of look at your code through a paper towel roll, a small amount at a time,” said Nat Friedman, an investor and former GitHub CEO.
Friedman is backing a startup called Magic AI that plans to create “a superhuman software engineer.” Peter Thiel-backed Cognition AI, meanwhile, is working on an assistant that can handle software projects on its own. Princeton University this month released an open-source model for an AI software engineering agent, and it seems that not a week goes by without a new startup popping up.
In interviews, few coders expressed fears that AI will replace them. As in many industries, they say, automation will free them up to focus on more challenging and interesting tasks. But Jensen Huang, CEO of the red-hot AI chipmaker Nvidia Corp, has a less-rosy perspective. He recently predicted that coding as a career is doomed. Now that AI is making it possible to code in plain English, Huang said, anyone can become a programmer.