Counterfeit Philosophers in Testing
Critical thinking about the writings of humans, whose goals and desires may not align with yours, is just as important as contemplating the impact of generative AI. I recommend reading the original thoughts of Daniel C. Dennett, a widely respected philosopher, in his recent article ‘The Problem with Counterfeit People’, and we’ll see how it relates to Software Testing.
It’s intriguing to observe the cognitive dissonance that arises when discussing AI in the world of Software Testing. Some express both concerns that AI could lead to the destruction of professional societies (like Sofware Testing), even society as a whole, while simultaneously dismissing AI as a mere trick with little practical value. Dennett, however, sees AI not as a trick, but as a potentially perilously good emulation of human thinking and the appearance of semantic understanding.
Some vocal folks in the testing world, who often aren’t even aware of how to test AI-based systems, say that AI cannot test without the constant oversight of human beings. These humans being them of course. These folks often say this without understanding the technology, or, gasp, without even testing it :) A hint that you are reading or listening to such a message, is to look for attempts to redefine existing terms and create new ones. They also often slip and reveal that their first priority is job security, not Truthiness or understanding. They also often loudly claim their technical prowess, often dating to the year 2000, but truly competent folks understand their own weaknesses and under-advertise their skills. It would be OK and we could laugh, if not for the influence this thinking might have on other folks. Then again, if they are susceptible to these arguments, it's not likely they would be doing much else anyway, or even reading this article.
“The irony of Software Testers that claim to be technical, and yet warning against the use of tools is something to giggle at.”
In the world of Software Testing and Quality, a common mantra has been to advocate for the end-user/customer by adopting their perspective when evaluating/testing/checking/rating a product. Skilled testers excel at understanding diverse users and determining whether the system under test fulfills their needs and goals. Although testers strive to emulate users, it’s important to recognize that they are not the users themselves. The best testers possess the ability to empathize with and represent users who are different from them. While human testers play a crucial role, it is worth considering the potential value of an AI system that has read the collective knowledge and experiences of all humans, coupled with its comprehensive exposure to the web of applications. Such an AI should serve as a valuable tool for testers, enabling them to gain insights from diverse perspectives and apply that knowledge to their testing efforts. Many who profess that AI isn’t useful, or even dangerous for Software Testing don’t realize there were actually more people weaving cotton after the invention of the loom. Its ok, the water is warm. Luddites often let fear, rather than logic, data and history, drives their anti-AI narrative.
“Software testers that claim to be scientific should probably be hard at work testing these new AI systems, and skeptical of their own biases.”
Some say that AI cannot test. Tell that to Microsoft’s co-Pilot :) Some will say ‘That is just checking’. Did you catch that? Yeah, that is the strange attempt at redefining a well-understood and technical word. The AI generates what people call “Unit Tests”. Interesting. But — this is obviously dangerous and testers should not use this! These testers are redefining themselves out of relevance.
Danger, Will Robinson! (https://about.codecov.io/blog/writing-better-tests-with-ai-and-github-copilot/)
Most software testers who dismiss the usefulness, even autonomous usefulness, of AI for testing don’t seem to have fully explored the potential value offered by these new AI systems. They are often influenced by confirmation bias — understandable given their professional livelihoods. Nevertheless, it is disappointing when this mindset discourages other testers from embracing new tools that could enhance the profession, the quality of software, and their own careers. It’s important to note that utilizing tools is a differentiating factor between humans and other organisms. Even AI can connect and interact with tools. You can learn more about recent updates on AI’s ability to use tools in OpenAI’s blog post. Oh, and what about user interface testing? Open AI is going multi-modal soon and can combine the processing of images (screenshots) and the LLM.
“I hope more testers actively engage in applying AI — or the developers of the AI models, and products built on them, will just keep moving forward without us.”
Another deep irony is that for all the faults and gotchas with these LLMs (and AI in general), what these models need is great testing so the issues can be fixed. The more concerned a great software tester is, they should contribute to improving the quality of these systems. How many LLM test suites have been created for re-use by these AI-negative testers? Don’t these testers realize that creating a test suite, even ad-hoc rating and testing, is actually training data for the next versions? Have any of these testers taken an afternoon to try and “finetune” / fix a model? All it takes is a simple list of questions and answers that are “correct” — then they could not only test the models but build better LLMs in general. LLMs that answer all their gotcha questions correctly, and learn to mimic their version of Truthiness. There is even a UI to finetune, and all their ‘gotchas’ have either been solved by recent versions of LLMs, or can be easily translated into finetuning data. Many of these anti-AI folks still claim that LLMs are ‘only word predictors’. It can be argued that all their posts are as well — very predictable. Regardless, they must be correct because they understand these AI systems and how to use them, right? Well, they better tell the researchers at Microsoft they are wrong when they say LLMs even have the “spark of general intelligence”.
“Sigh, you’d think testers would test — at least file the bugs they post.”
Regarding those arguments against AI again, I think we can all take the advice of Dennett. What is particularly interesting is that Dennett’s argument doesn’t rely on analogies related to civil rights, warfare, or aliens. Instead, he emphasizes the inherent similarity between AI and human cognition and explains his thinking with a mix of simple and practical examples. To gain further clarity on Dennett’s perspective, I recommend watching his insightful interview on the subject. Dennett’s emphasis on civil and constructive conversations is a notable highlight. Dennet also suggests that perhaps the most dangerous aspect of modern AI is that it can ‘reproduce’ and evolve. Misinformation and self-serving memes in the world of testing are apparently already propagating and evolving.
Dennett points to deception as a core concern when it comes to AI — precisely because of its competence. Exploring his original, non-derivative viewpoints, alongside others, contributes to a more comprehensive understanding of the capabilities, ethical considerations, and consequences surrounding AI and just maybe new ways to think about how to apply it to Software Testing between now and the end of the world.
Yet, people around the testing industry are racing to write about the problems with using AI for testing. There are people posting on the dangers and ineffectiveness of AI-based tools who Try to seem to know what they are talking about by creating counterfeit adaptations of legit thinkers and misapply them to software testing. I predict many people will be taken in by these posts that really only serve the person posting. This will have a short-term effect of making it even harder for professional testers to get work and be heard in this modern world of AI. In the long run, the paranoia about AI will go the same way as all the other historical concerns about bicycles, looms, television, and the internet. Testers that don’t adopt or adapt AI in their work will find that they will have more quality problems as more code is generated by AI, and will need still more people to try to find the bugs, and it will be too late for them.
The best we can do is to make use of AI in responsible ways and call out counterfeit testing philosophers wherever we find them.
— Jason Arbon