ChatGPT and the future of coding

Like everyone else, I’ve been playing a lot with ChatGPT, sometimes asking it various questions, but mostly about coding. I tried to replicate the experience some people report on Twitter of coding entire apps with ChatGPT and having it fix all the errors that show up. My experience has been more of a game of whack-a-mole: I tell it to fix something, it introduces another bug or omits an endpoint, and after 10 minutes of this I just go fix it myself. I find the claims that ChatGPT is the future of coding somewhat overblown.

But I also used it in a slightly different way: I started by defining a web app that does what I want and then asked it to change the storage layer. I started off with Deta (a free “cloud” for personal projects), then asked it to move to SQLite. For this kind of work it does a pretty good job. However, I still have to review all the code, and I inevitably run into bugs and regressions.

Using ChatGPT to replace ORMs

But normally, changing the storage layer is long and boring work: even if you are just switching between SQL databases, you have to make sure you are not using some database-specific language construct, or relying on an implicit ordering that exists in one flavour but not in another. This is part of the reason why things like ORMs came into existence: to abstract away the differences between SQL databases and let you easily change from one to another (with the caveat that doing so is still a bad idea once you’re in production and will most likely result in performance issues).
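To make the kind of difference an ORM hides concrete, here is a made-up example (table and column names invented): MySQL’s GROUP_CONCAT aggregation has no direct equivalent in Postgres, where the same thing is written with string_agg.

    # Hypothetical queries illustrating a MySQL-only construct and its Postgres spelling.
    mysql_query = (
        "SELECT user_id, GROUP_CONCAT(tag SEPARATOR ', ') AS tags "
        "FROM user_tags GROUP BY user_id"
    )
    postgres_query = (
        "SELECT user_id, string_agg(tag, ', ') AS tags "
        "FROM user_tags GROUP BY user_id"
    )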

But if I can just have my AI assistant (be it ChatGPT, GitHub Copilot, or any other LLM-based tool) rewrite all the SQL code (or even move to a NoSQL database), I don’t particularly care about that higher level of abstraction anymore. I’ll just ask Copilot to migrate from MySQL to Postgres, generate new, better indices, and use built-in ARRAY columns instead of the normalized tables I used in MySQL (just to give an example).
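As a made-up illustration of what such a migration could produce (the schema is invented for this example), a normalized tag table in MySQL might collapse into a single Postgres ARRAY column with a GIN index:

    # Hypothetical before/after schemas for the MySQL-to-Postgres migration described above.
    mysql_schema = """
    CREATE TABLE articles (id INT PRIMARY KEY, title VARCHAR(255));
    CREATE TABLE article_tags (        -- normalized: one row per (article, tag) pair
        article_id INT NOT NULL,
        tag VARCHAR(64) NOT NULL
    );
    """

    postgres_schema = """
    CREATE TABLE articles (
        id    INTEGER PRIMARY KEY,
        title TEXT,
        tags  TEXT[]                   -- built-in ARRAY column replaces article_tags
    );
    CREATE INDEX articles_tags_idx ON articles USING GIN (tags);
    """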

Making design patterns obsolete

And I’m thinking that maybe in the future many of today’s coding best practices won’t be relevant anymore. For example, you normally want to avoid writing duplicate code, both because programmers are lazy and don’t want to make the same change in multiple places, and because we might forget to change one of the copies and thus introduce bugs. But if I can ask Copilot to change a code pattern wherever it appears, I might not even want to deduplicate anymore, because sometimes code is easier to read when you don’t have to jump between several functions all the time. And Copilot can be really smart about it, changing the pattern not only where it appears in exactly the same form, but also where there are small differences (unrelated to the change we asked for).

Previously, this was a real dilemma: if I have a piece of code repeated 5 times, each copy slightly different, do I extract them into a common function with lots of ifs? Do I create some complicated class hierarchy that lets me reuse the common functionality while customizing the differences? Or do I simply leave things duplicated and hope I will remember to fix bugs in every copy? With ChatGPT, that tradeoff might no longer be necessary.
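Here is a toy Python sketch of that dilemma (the exporter functions are invented for illustration): two near-duplicate functions on one side, and one shared helper that hides the difference behind a parameter on the other.

    # Option 1: keep the near-duplicates and let the AI assistant edit all of them.
    def export_csv(rows):
        header = "name,email"
        return "\n".join([header] + [f"{r['name']},{r['email']}" for r in rows])

    def export_tsv(rows):
        header = "name\temail"
        return "\n".join([header] + [f"{r['name']}\t{r['email']}" for r in rows])

    # Option 2: extract a shared helper and push the differences into a parameter.
    def export_delimited(rows, sep=","):
        lines = [sep.join(["name", "email"])]
        lines += [sep.join([r["name"], r["email"]]) for r in rows]
        return "\n".join(lines)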

Another quite common pattern (in some programming languages) is dependency injection, which lets you decide at runtime where a certain dependency comes from. In my experience, you rarely want to fiddle with that, with one big exception: testing. Dependency injection makes testing a lot easier, especially in more statically typed languages (Java, C#, etc.). But what if, when pushing the “Run test” button in the IDE, ChatGPT could swap all the relevant database connections (for example) from the big SQL server to an in-memory database? Then you wouldn’t have to look at ugly dependency injection code and scratch your head trying to figure out where the heck a certain value came from.
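A minimal Python sketch of that idea, using an in-memory SQLite database as the stand-in test dependency (the repository class and table are invented for illustration):

    import sqlite3

    # The repository receives its connection from outside instead of creating it itself,
    # so a test can inject a throwaway in-memory database.
    class UserRepository:
        def __init__(self, connection):
            self.connection = connection

        def count_users(self):
            return self.connection.execute("SELECT COUNT(*) FROM users").fetchone()[0]

    def test_count_users():
        conn = sqlite3.connect(":memory:")   # in-memory DB instead of the big SQL server
        conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        conn.execute("INSERT INTO users (name) VALUES ('alice')")
        assert UserRepository(conn).count_users() == 1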

The future of coding

This is still far off in the future: ChatGPT is too unreliable to do this at scale today. But lots of people are exploring LLMs as the future of coding (Geoffrey Litt, for example), and maybe the next level of programming will be to feed some sort of high-level templates to ChatGPT, which will spit out code at “build time”. We will be able to tell it that we want to read the code, so it will output something that is easy for humans to read and understand, or specify that the target is the computer, so it will output something that can be compiled efficiently.

Of course, this will require several things: having ChatGPT create a test suite first, so that you can verify the changes work reliably; finding a good temperature at which it generates code without becoming too creative; and finding a way to work with large codebases (even the new 32K-token context size is not enough for more than tens of files, and I don’t think the current way of working with embeddings is very good either).
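Just to make the temperature knob concrete, here is a hedged sketch using the openai Python package (the model name and prompt are placeholders, and the exact client API has changed between library versions):

    import openai

    # A low temperature keeps sampling close to the most likely tokens,
    # which is usually what you want for code generation.
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": "Rewrite this MySQL query for Postgres: "
                       "SELECT user_id, GROUP_CONCAT(tag) FROM user_tags GROUP BY user_id",
        }],
        temperature=0.2,
    )
    print(response.choices[0].message.content)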

But I believe we have some exciting new ways of working ahead of us! So let’s explore the future of coding!

Image at the top brought to you by DALL-E 2 and proofreading done by ChatGPT.

GPT-3 and AGI

One of the most impressive/controversial papers of 2020 was GPT-3 from OpenAI. Architecturally it’s nothing particularly new: it’s mostly a scaled-up version of GPT-2, which came out in 2019. But the scale matters: with 175 billion parameters, GPT-3 was by far the largest language model at the time it was released.

It’s a fairly simple algorithm: it learns to predict the next word in a text, and it learns to do this by training on several hundred gigabytes of text gathered from the Internet. To use it, you give it a prompt (a starting sequence of words) and it starts generating more words, one at a time, until it decides to finish the text by emitting a stop token.
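In pseudocode, generation is just a loop over next-word predictions. Here is a minimal sketch, assuming a model() function that returns a probability distribution over the vocabulary; this illustrates the idea only, not the real GPT-3 internals:

    import random

    def generate(model, prompt_tokens, stop_token, max_tokens=100):
        tokens = list(prompt_tokens)
        for _ in range(max_tokens):
            # assumed: model() returns a dict mapping each candidate next token to its probability
            probs = model(tokens)
            next_token = random.choices(list(probs.keys()),
                                        weights=list(probs.values()))[0]
            if next_token == stop_token:   # the model decides the text is finished
                break
            tokens.append(next_token)
        return tokens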

Using this seemingly stupid approach, GPT-3 is capable of generating a wide variety of interesting texts: it can write poems (not prize-winning, but still), write news articles, imitate well-known authors, make jokes, argue for its own self-awareness, do basic math and, shockingly for programmers all over the world (who are now afraid the robots will take their jobs), write simple programs.

That’s amazing for such a simple approach. The internet was divided upon seeing these results: some were welcoming our new GPT-3 overlords, while others were skeptical, calling it just fancy parroting with no real understanding of what it says.

I think both sides have a grain of truth. On one hand, it’s easy to find failure cases, to make it say things like “a horse has five legs”, showing it doesn’t really know what a horse is. But are humans that different? Think of a small child who is taught by his parents to say “please” before his requests. I remember being amused by a small child protesting “But I said please!” after his parents refused him something. The kid probably thought “please” is a magic word that can unlock anything. Well, not quite: we say it because society likes polite people, but saying “please” when wishing for a unicorn won’t make the unicorn any more likely to appear.

And it’s not just little humans who do that. Sometimes even grownups parrot things without thinking, because that’s what they have heard all their lives and they never questioned it. It actually takes a lot of effort to think, to keep your thoughts consistent and to produce novel ideas. In this sense, an artificial intelligence that is merely around human level might turn out to be a disappointment.

On the other hand, I believe there is a reason why this amazing result happened in natural language processing and not in, say, computer vision. It has long been recognized that language is a powerful tool; there is even a saying about it: “The pen is mightier than the sword”. Human language is so powerful that we can encode everything there is in this universe into it, and then some (think of all the sci-fi and fantasy books). More than that, we use language to get others to do our bidding, to motivate them, to cooperate with them and to change their inner state, making them happy or inciting them to anger.

While there is common ground in the physical world, oftentimes it is not very relevant to the point we are making: “A rose by any other name would smell as sweet”. Does it matter what a rose is when the rallying call is to get more roses? As long as the message gets across and is understood in the same way by all listeners, no, it doesn’t. Similarly, if GPTx can effect the desired change in its readers, it might be good enough, even if it doesn’t have some mythical understanding of what those words mean.