Welcome to The Long View—where we peruse the news of the week and strip it to the essentials. Let’s work out what really matters.
This week, a conundrum: On the one hand, researchers say ChatGPT is losing the plot; and on the other, we hear outsourced coding jobs in India will be replaced by AI.
1. ChatGPT Has Gotten Much Worse, Supposedly
First up this week: Researchers say OpenAI’s LLM is giving far poorer results than it did a few months ago. Notably, it’s producing buggier code.
Analysis: Questionable—but the point is: We don’t know
Reviewers of the research point to methodology problems with the study. But the bigger issue might be that “Open” AI is anything but.
Benj Edwards: Study claims ChatGPT is losing capability
“Left stumbling in the dark”
Researchers from Stanford University and University of California, Berkeley [uploaded a preprint] paper that purports to show changes in GPT-4’s outputs. … Using API access, they tested the March and June 2023 versions of these models on tasks like … code generation.
…
[It] fuels a common-but-unproven belief that the AI language model has grown worse at coding and compositional tasks over the past few months. … Popular theories about why include OpenAI “distilling” models to reduce their computational overhead, … training to reduce harmful outputs, [and] conspiracy theories such as OpenAI reducing GPT-4’s coding capabilities so more people will pay for GitHub Copilot.
…
Meanwhile, OpenAI has consistently denied any claims that GPT-4 has decreased in capability. [But] with a closed, black box model like GPT-4, researchers are left stumbling in the dark trying to define the properties of a system that may have additional unknown components, such as safety filters, or the recently rumored eight “mixture of experts” models working in concert.
Katyanna Quach: LLMs are getting dumber
“Be wary”
The team … examined both models’ coding capabilities … on a list of 50 easy programming challenges taken from the LeetCode set. A response containing bug-free code that gives the correct answer is considered directly executable code. The number of directly executable scripts generated by GPT-4 dropped from 52% to 10% [between the March and June versions].
…
[They’re] warning developers to test the models’ behavior periodically in case any tweaks and changes have knock-on effects elsewhere in applications and services relying on them. … Businesses relying on software like OpenAI’s technologies to power their products and services … should be wary about how their behaviors can change over time.
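What might "test the models' behavior periodically" look like in practice? Here is a minimal Python sketch, not the researchers' actual harness. It scores a batch of saved model responses against the "directly executable" definition quoted above (the code runs and gives the correct answer) and flags a drop against a stored baseline. Every name in it (strip_fences, is_directly_executable, executable_rate, has_regressed, the 0.05 tolerance) is hypothetical, and it assumes you have already collected the responses as strings by whatever means you use to call the model.

```python
import re
from typing import Any, List, Sequence, Tuple

# Markdown code-fence marker, built this way so the listing itself stays fence-safe.
FENCE = "`" * 3

# One test case: (positional arguments, expected return value).
Case = Tuple[Tuple[Any, ...], Any]


def strip_fences(response: str) -> str:
    """Return the body of the first fenced block, or the response unchanged."""
    pattern = re.compile(FENCE + r"[\w+-]*\n(.*?)" + FENCE, re.DOTALL)
    match = pattern.search(response)
    return match.group(1) if match else response


def is_directly_executable(response: str, func_name: str, cases: Sequence[Case]) -> bool:
    """True if the code runs and returns the expected answer for every case."""
    namespace: dict = {}
    try:
        # Assumption: responses are run in a sandbox or are otherwise trusted.
        exec(strip_fences(response), namespace)
        func = namespace[func_name]
        return all(func(*args) == expected for args, expected in cases)
    except Exception:
        # Syntax errors, missing functions and runtime errors all count as failures.
        return False


def executable_rate(responses: List[str], func_name: str, cases: Sequence[Case]) -> float:
    """Fraction of responses that are directly executable."""
    if not responses:
        raise ValueError("need at least one response to score")
    passed = sum(is_directly_executable(r, func_name, cases) for r in responses)
    return passed / len(responses)


def has_regressed(baseline: float, current: float, tolerance: float = 0.05) -> bool:
    """True if the current rate fell below the baseline by more than the tolerance."""
    return baseline - current > tolerance
```

Run on a schedule against a fixed prompt set, a check like has_regressed(0.52, 0.10) would have caught the drop reported above. Stripping fences before running is a deliberate choice in this sketch: it separates "the code is wrong" from "the code is wrapped in formatting." And since exec runs whatever the model wrote, anything untrusted belongs in a sandbox.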
It’s definitely getting worse at coding, thinks r3trohack3r:
During the early access program [I gave] programming tasks [to] GPT-4 regularly. [But] the GPT-4 I use now feels like a shadow of the GPT-4 I used [then]. GPT-4, back then, ported dirbuster to POSIX compliant multi-threaded C by name only. It required three prompts.
…
Most programming tasks I threw at it, it could work its way through with a little guidance. … Now, it’s basically worthless at helping me with programming tasks, beyond trivial problems.
Regardless of whether GPT is losing the plot, Simon Willison points to the bigger issue:
Honestly, the lack of release notes and transparency may be the biggest story here. How are we meant to build dependable software on top of a platform that changes in completely undocumented and mysterious ways every few months?
Don’t believe the “natural language processing” hype, says Matthew Slyman:
OpenAI/ChatGPT is only one of many AI NLP systems. Their products are good, but better systems exist. Broaden your horizons! These systems have been under development for ~10 years (from ~70 years prior research). Progress is getting faster.
…
Every few generations of technology, we will see new capabilities emerge in AI/ML systems. Otherwise progress will be incremental. We have just witnessed a step change of this type for NLP. The last major “step change” was 10 years ago. [But] where are all those self-driving cars we were promised?
…
No, ChatGPT won’t make 90% of jobs obsolescent within 6-12 months! … Most folks hawking services on social media as AI/ML experts and “thought leaders” are charlatans.
Wait, you still think OpenAI actually has a genuine AI model? NinjaNerd56 isn’t fooled:
It’s a roomful of interns, jacked up on Red Bull and high-THC weed.
Neither is this Anonymous Coward:
The Mechanical Turks are getting fed up and starting to refuse the gig, the next bunch picking up the slack have different areas of expertise. What, you believed all that ******** about LLMs?
2. India’s Dev Jobs Gone in Two Years, Says PHB
Speaking of alleged charlatans, the pointy-haired boss of Stability AI is again making headlines with outlandish claims. This time, he’s saying most outsourced development jobs in India are toast.
Analysis: Garbage in, garbage out
CEO Emad Mostaque’s reasoning is that LLMs can code just as well—if not better. This hilariously dim PHB-PoV only demonstrates his lack of understanding of LLMs and coding.
Ryan Browne: Most outsourced coders in India will be gone in 2 years due to A.I.
“Under threat from the impacts of advanced AI”
Mostaque said that most of India’s coders will lose their jobs. … On a call with UBS analysts, [he] said … it is now possible for software to be developed with far fewer people: … “These models are like really talented grads.”
…
“Why would you have to write code where the computer can write code better? When you deconstruct the programming thing from bug testing to unit testing to ideation, an AI can do that, just better,” Mostaque said.
…
India is home to more than 5 million software programmers, who are most under threat from the impacts of advanced AI tools like ChatGPT. … Asia’s second-largest country is a prime location for companies that outsource back-office jobs and other roles overseas.
Doesn’t sound like a long-term solution, thinks furry_wookie:
This will be fun. AI just copies other people’s work and reformats it—it does not “create.” If AI takes over and no one is ever coding again … there will never be anything new for the AI to copy/steal. The entire world will come to a halt, with no progress made—no new works, no new inventions, no new programming languages, no new databases, no new platforms or frameworks—nothing.
But this Anonymous Coward foresees a dev employment collapse in India for different reasons:
One smart engineer with Terraform and AWS can do the work of 100 outsourced warm bodies from companies like these doing it manually. Their business model is obsolete.
The Moral of the Story:
It takes 20 years to build a reputation and five minutes to ruin it
—Warren Buffett
You have been reading The Long View by Richi Jennings. You can contact him at @RiCHi or @richij.