1. You frame your paper as a response to what you call ‘the problem of bot speech’. Can you start by explaining what this problem consists in? What are bots, and why is their speech seen as puzzling?

People engage with chatbots as though they produce meaningful speech, but most of our standard accounts of what makes speech meaningful suggest that the bot’s outputs are meaningless. This is because it doesn’t look like chatbots have communicative intentions (e.g. Grice, 1957), follow social conventions (e.g. Lewis, 1975), or are appropriately causally linked to the world to produce meaningful speech (e.g. Putnam, 1981). The problem of bot speech is the challenge of figuring out what to do about this tension.

As I saw it at the time, one could either stick with our traditional theories and say that bots were ‘stochastic parrots’ and that people were ‘deluded’ if they took them to produce meaningful speech (sometimes called ‘the ELIZA effect’). Or one could reject our standard theories and say that chatbots do produce meaningful speech and we have just been very wrong about meaning in the past. This may be true (most philosophical theories are false), but these theories have proven useful in explaining a bunch of phenomena in semantics and pragmatics, as well as in signalling games more generally, and I think we should be cautious about discarding them just because systems designed to simulate human conversations can successfully simulate human conversations.

When I wrote the paper, I took ‘chatbot’ in a wide sense, as a system designed to converse with people. This includes anything from Joseph Weizenbaum’s ELIZA (1966) to the customer service chatbots that have been around for a couple of decades to chatbots grounded in Large Language Models. While I wrote the paper before ChatGPT was released (I think my original examples used BERT), I think it’s a good example of what is meant by ‘bot’.

2. The solution you propose is that the manner in which we engage with bots can be understood as a kind of fiction or make-believe. What do you mean by ‘make-believe’ in this context? How is this supposed to work?

Fictionalism is a classic ‘middle way’ approach. The idea is that people don’t need to believe they are literally talking to some kind of intelligent agent with beliefs and intentions in order to use a chatbot. We regularly engage in similar activities, games of make-believe, that involve ascribing meaning to things provisionally and for local purposes. When you engage with a customer service chatbot to top up your phone, you don’t act as if the sounds it makes are meaningless because the system lacks communicative intentions; you understand that, in order to get what you want, you need to play along. Similarly, when you use a pocket calculator, you can understand what it says despite the fact that it isn’t following social conventions in the way that Lewis laid out in Convention.

Our ways of engaging with the world encompass much more than having true beliefs about it, and there are many more attitudes we can take to the world than just believing propositions or not. We regularly choose to imagine things about the world in various games of make-believe, and some of these games involve props. If I say ‘the floor is lava’, then we can play a game involving the floor within which one is at lesser or greater risk of burning alive. If I say ‘this orrery is the solar system’, then we can play a different game from which we can learn something about the relations between the planets.
One of Kendall Walton’s great ideas was that engaging with a prop imaginatively does not preclude us from learning about the world through these games. Our engagement with chatbots can be like this. We can ‘go along with’ the customer service bot to find out when the next screening of a film is, and we can go along with ChatGPT to find out who is in that film. In either case, ‘going along with’ things doesn’t mean committing ourselves to false beliefs. In the paper, I argued that the text may be literally meaningless but fictionally meaningful because, when interacting with chatbots, we engage in a make-believe in which the text is meaningful, or in which we are engaging with another thinking being.

In essence, though, it’s an empirical question what people think they are doing when they engage with bots, and there is interesting empirical work being done. Merel Semeijn at the University of Groningen is doing cool work on this, as is a team at UiO (Ingrid Lossius Falkum, Nicholas Allott, and others). I suspect that there is a range of different attitudes that people bring to bots. The key thing I wanted to argue is that people don’t need to believe that chatbots have communicative intentions in order to use them, and we don’t necessarily need to abandon the limited progress we have made in philosophy and linguistics on what makes speech meaningful.

3. You mentioned that you wrote the paper before ChatGPT was released. Would you change anything about it now?

I think the general idea is still right, but if I wrote the paper today, there are at least two things I would change. First, I’d say more about semantic externalism, since a lot more has been said about it since (see bibliography below). I largely took it for granted that externalist arguments pushed against the idea that bot speech was meaningful. In his paper Brains in a Vat, the arch-externalist Hilary Putnam proposed a ‘Turing Test for Reference’ and described a system not unlike contemporary chatbots, but concluded that such a system’s words would not genuinely refer to anything.
I get the impression that contemporary semantic externalists seem much more willing to grant reference in cases of weak causal relations: if there is some kind of ‘natural history’ connecting a token to a referent, then that’s enough for the token to refer to that referent. In part, I think the issue is that pure causal theories are a bit light on theory. While traditional externalists gestured at some conditions on causal chains (e.g. there must be intentional repetition between tokens), they seemed to want to avoid attributing any explanatory significance to the internal architecture of a system. In contrast, teleosemanticists have gone into much more theoretical detail (e.g. there must be senders and receivers with a certain history, etc.).

You might agree with the observation that, when working out what someone is saying, we sometimes make allowances for their false beliefs and attend to the history of the words they use (this is what our thought experiments show), but this doesn’t entail that systems without beliefs produce meaningful speech in virtue of those words’ histories. For that, you’d need another argument. In any case, if you think that all that is required for reference is some history or causal chain connecting words to their referents, then simple n-gram models could be used to generate text that referred to the world, and we have had systems capable of referring to the world since at least the 1950s (a minimal sketch of such a model appears at the end of this answer). Such systems didn’t produce meaningful strings, but their tokens stood in causal chains.

Personally, I don’t see much theoretical value in talking of reference independently of the assertion of true or false propositions, and I doubt it’s worth doing without more cognitive architecture in place. We can say that something refers in virtue of the history of its words, but once we’ve said that, it’s not clear what has been explained. Does this tell us the system is trying to get things about its referent right? That it regards its referent as a stable entity to which incompatible properties shouldn’t be ascribed? That it forms goals or plans involving this referent? Not really. The cost of making reference trivial is making it theoretically uninteresting.

On the other hand, you might be a social externalist and say that what matters is whether or not a chatbot is a member of our linguistic community. I’m more sympathetic to this and actually argued for it years ago, proposing a form of the Turing Test for determining membership of a linguistic community (Mallory, 2020). You might think that all sorts of systems can be members of our linguistic community while lacking various properties, including thought, subjectivity, or original intentionality. This is something we may choose to do, but there is more to being a member of a linguistic community than being taken to be a member by others. We could say that the system secures reference through our interpretive practices that connect it to the world, but I think there are interesting questions about how things come to be connected to distal stimuli that can’t be solved by simply saying that we take them to be connected to distal stimuli. Similarly, we might all call fool’s gold ‘gold’, but that doesn’t change its atomic structure (by analogy with ‘fool’s gold’, you might call ‘artificial intelligence’ ‘fool’s intelligence’). Broadening our conception of meaning narrows the range of inferences we can draw from ascriptions of meaning (‘this text is meaningful, therefore…’).
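To make the n-gram point above concrete, here is a minimal sketch, assuming nothing beyond the Python standard library and an invented toy corpus: a bigram generator whose every output token descends, by counting and sampling, from tokens in its source text. The names (`corpus`, `successors`, `generate`) are illustrative, not from the paper.

```python
import random
from collections import defaultdict

# A toy bigram model. Every token it emits is causally chained back to
# the corpus it was built from, yet nothing here has beliefs, goals, or
# communicative intentions. The corpus is an invented placeholder.
corpus = (
    "the cat sat on the mat . the cat saw the moon . "
    "the moon rose over the mat ."
).split()

# Record, for each word, which words follow it in the corpus.
successors = defaultdict(list)
for current_word, next_word in zip(corpus, corpus[1:]):
    successors[current_word].append(next_word)

def generate(seed: str, length: int = 12) -> str:
    """Emit a string by repeatedly sampling a recorded successor."""
    words = [seed]
    for _ in range(length - 1):
        options = successors.get(words[-1])
        if not options:  # dead end: this word never had a successor
            break
        words.append(random.choice(options))
    return " ".join(words)

print(generate("the"))
# One possible output: 'the cat saw the moon rose over the mat . the cat'
```

Each occurrence of ‘moon’ in the output descends, via counting and sampling, from occurrences of ‘moon’ in the corpus, so its tokens stand in causal chains of exactly the cheap sort at issue; yet nothing in this loop plans, believes, or intends anything.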
Fictionalists agree that we can ascribe content to these systems; after all, we do this in the games of make-believe. Where they disagree is over whether these ascriptions play the same role that they played in our theories of meaning for humans: whether they support the same inferences and can play the same explanatory role in accounts of cognition and communication.

My second regret is using the term ‘literally’ in the paper. I wrote that bot speech is ‘literally meaningless’ but ‘fictionally meaningful’, yet I don’t really want to cling to a robust distinction between literal and non-literal claims. I think I should have just said it is meaningless ‘according to our best theories’ and meaningful relative to various non-explanatory interests, like buying a cinema ticket. You can see that I’m trying desperately (and probably not successfully) to maintain what I take to be helpful differences between some of our language-games without grounding those differences in the idea that some games carve reality at its joints.

4. You argue that a fictionalist approach is preferable to claiming that people are deluded when they treat the outputs of chatbots as meaningful speech. Can you say more about why you think your account offers the more plausible interpretation?

I don’t think people are generally foolish, and I do think they are very used to engaging in local imaginative practices that may superficially look like holding false beliefs. We do this from an early age. People are clearly capable of using chatbots without believing that those bots have thoughts or communicative intentions. They also engage with toys, artworks, video games, and mathematical models without literally believing that the contents of these engagements are true. I presented this as an inference to the best explanation in the paper, and I still think it is just a more plausible account of what’s going on.

However, it’s clear that some people do literally believe that they are talking to thinking, conscious agents when they engage with chatbots. In part, this is because that’s what it can feel like when chatting to a bot, and in part, because there is a massive advertising and lobbying industry, permeating both media and academic research, that encourages people to view these systems that way.

Behind all of this is a question of whether we want to extend our use of certain everyday terms (e.g. meaning, understanding, reasoning, knowledge, agent) to some new domain. I think philosophy has a role to play in the peculiar democratic deliberative process of metasemantic negotiation. But as with a lot of democratic processes, there are vested interests pouring money into it to ensure that we start calling token generation ‘reasoning’ or calling slightly better results than an alternative model ‘introspection’. These terms have normative and epistemic significance that inspires trust in users of these systems as well as encouraging misunderstandings about how the systems operate. This should concern us. The major, widely used chatbots these days are surveillance devices that have been designed to make people as dependent on them as possible while encouraging them to share personal information with companies that have dubious commitments to democracy and which are centralising political and epistemic power in a dangerous way. How people interpret these systems is ideologically significant.

Author’s work:
Other cited work: