The Utility of Today's AI
Last month’s article foreshadowed this article’s topic with the sources cited at the bottom. ChatGPT generated the Rust code snippets after I told it to “Write a Rust queue using a linked list.” I had to specify a linked list; otherwise ChatGPT would have reached for the standard library’s queue type, VecDeque, which is backed by a contiguous, vector-like buffer rather than a linked list. DALL-E generated the image when given the title of the article, “Rust queue performance.”
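For readers curious what that prompt produces, here is a minimal sketch of a linked-list queue in Rust. This is my own illustration of the technique, not ChatGPT's actual output; the type and method names are mine. It uses the classic pattern of owned `Box` nodes from the head plus a raw tail pointer so both `push` and `pop` are O(1):

```rust
use std::ptr;

// One node of the singly linked list; each node owns the next.
struct Node<T> {
    value: T,
    next: Option<Box<Node<T>>>,
}

// FIFO queue: pop from the head, push at the tail.
pub struct Queue<T> {
    head: Option<Box<Node<T>>>,
    // Raw pointer to the last node so push is O(1); null when empty.
    tail: *mut Node<T>,
    len: usize,
}

impl<T> Queue<T> {
    pub fn new() -> Self {
        Queue { head: None, tail: ptr::null_mut(), len: 0 }
    }

    pub fn push(&mut self, value: T) {
        let mut node = Box::new(Node { value, next: None });
        let raw: *mut Node<T> = &mut *node;
        if self.tail.is_null() {
            self.head = Some(node);
        } else {
            // SAFETY: tail points at the current last node, which the
            // list still owns, so writing its `next` field is valid.
            unsafe { (*self.tail).next = Some(node); }
        }
        self.tail = raw;
        self.len += 1;
    }

    pub fn pop(&mut self) -> Option<T> {
        self.head.take().map(|node| {
            self.head = node.next;
            if self.head.is_none() {
                self.tail = ptr::null_mut();
            }
            self.len -= 1;
            node.value
        })
    }

    pub fn len(&self) -> usize {
        self.len
    }
}

fn main() {
    let mut q = Queue::new();
    q.push(1);
    q.push(2);
    q.push(3);
    assert_eq!(q.pop(), Some(1));
    assert_eq!(q.pop(), Some(2));
    assert_eq!(q.len(), 1);
    println!("front after pops: {:?}", q.pop()); // prints "front after pops: Some(3)"
}
```

The raw tail pointer stays valid because moving a `Box` moves only the handle, not the heap allocation it points to; the `unsafe` block is the price of an O(1) tail append in safe-by-default Rust.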
For the past few months I’ve been working to find a way to use the explosion of new large language models, or LLMs. I’ve tried ChatGPT, Bard, DALL-E, Studio Bot, and others. My first attempts were simply to help with my day-to-day job: “Write unit tests for this class”, “What’s causing this runtime error?”, “Tell me how to fix this compiler issue.” Unfortunately, LLMs proved woefully incapable. The problem always came down to context: LLMs only have access to the code you put directly into the prompt, and in projects comprising hundreds of thousands of lines, pasting in enough of the codebase simply wasn’t possible. The code they generated would never compile as-is, and had to be carefully rewritten before it could even start to work. After a dozen different attempts I ultimately concluded it was easier just not to use LLMs for these cases. This was particularly frustrating with the Studio Bot plugin for Android Studio. Because it was integrated directly into my IDE, I assumed it would be able to reference the rest of the project when offering solutions. Unfortunately, it turned out to be just a text query to the Google Bard API, with the same limitations as all the other LLMs.
My next attempts were closer to the live demos at Google I/O 2023. I tried using ChatGPT and Bard to write emails. The generated responses were mediocre and derivative, the text equivalent of the “uncanny valley.” Associates who received the emails could immediately tell that they were generated by an LLM, even when they had never previously received writing samples from me to compare against. Essays were similarly hamstrung, and given the manner in which LLMs are built, couldn’t include references. So using ChatGPT for long-form writing was out.
It’s only recently that I’ve begun to find use cases for these new technologies. As someone who writes regularly, writing is very much a skill I’d like to consciously improve. The GRE has an analytical writing section scored from 0 to 6 in half-point increments, and they provide prompts and example essays along with the scores those essays received. I fed ten of those sample essays into ChatGPT and asked it to grade them and to suggest ways the essays could be improved. Every grade ChatGPT gave was within half a point of the official score, and the suggestions for improvement were specific enough to be actionable.
As LLMs continue to improve and I get better at writing prompts, I hope to find more use cases in which they provide real utility. I’ll let you know as I find them.