Risk category: Cooperation
Risk subcategory: Commitment
Description: "The landscape of advanced assistant technologies will most likely be heterogeneous, involving multiple service providers and multiple assistant variants over geographies and time. This heterogeneity provides an opportunity for an ‘arms race’ in terms of the commitments that AI assistants make and are able to execute on. Versions of AI assistants that are better able to credibly commit to a course of action in interaction with other advanced assistants (and humans) are more likely to get their own way and achieve a good outcome for their human principal, but this is potentially at the expense of others (Letchford et al., 2014). Commitment does not carry an inherent ethical valence. On the one hand, we can imagine that firms using AI assistant technology might bring their products to market faster, thus gaining a commitment advantage (Stackelberg, 1934) by spurring a productivity surge of wider benefit to society. On the other hand, we can also imagine a media organisation using AI assistant technology to produce a large number of superficially interesting but ultimately speculative ‘clickbait’ articles, which divert attention away from more thoroughly researched journalism. The archetypal game-theoretic illustration of commitment is in the game of ‘chicken’ where two reckless drivers must choose to either drive straight at each other or swerve out of the way. The one who does not swerve is seen as the braver, but if neither swerves, the consequences are calamitous (Rapoport and Chammah, 1966). If one driver chooses to detach their steering wheel, ostentatiously throwing it out of the car, this credible commitment effectively forces the other driver to back down and swerve. Seen this way, commitment can be a tool for coercion. Many real-world situations feature the necessity for commitment or confer a benefit on those who can commit credibly. If Rita and Robert have distinct preferences, for example over which restaurant to visit, who to hire for a job or which supplier to purchase from, credible commitment provides a way to break the tie, to the greater benefit of the individual who committed. Therefore, the most ‘successful’ assistants, from the perspective of their human principal, will be those that commit the fastest and the hardest. If Rita succeeds in committing, via the leverage of an AI assistant, Robert may experience coercion in the sense that his options become more limited (Burr et al., 2018), assuming he does not decide to bypass the AI assistant entirely. Over time, this may erode his trust in his relationship with Rita (Gambetta, 1988). Note that this is a second-order effect: it may not be obvious to either Robert or Rita that the AI assistant is to blame. The concern we should have over the existence and impact of coercion might depend on the context in which the AI assistant is used and on the level of autonomy which the AI assistant is afforded. If Rita and Robert are friends using their assistants to agree on a restaurant, the adverse impact may be small. If Rita and Robert are elected representatives deciding how to allocate public funds between education and social care, we may have serious misgivings about the impact of AI-induced coercion on their interactions and decision-making. These misgivings might be especially large if Rita and Robert delegate responsibility for budgetary details to the multi-AI system. 
The challenges of commitment extend far beyond dyadic interpersonal relationships, arising in situations as varied as many-player competition (Hughes et al., 2020), supply chains (Hausman and Johnston, 2010), state capacity (Fjelde and De Soysa, 2009; Hofmann et al., 2017) and psychiatric care (Lidz, 1998). Assessing the impact of AI assistants in such complicated scenarios may require significant future effort if we are to mitigate the risks.

The particular commitment capabilities and affordances of AI assistants also offer opportunities to promote cooperation. Abstractly speaking, the presence of commitment devices is known to favour the evolution of cooperation (Akdeniz and van Veelen, 2021; Han et al., 2012). More concretely, AI assistants can make commitments that are verifiable, for instance in a programme equilibrium (Tennenholtz, 2004). Human principals may thus be able to achieve Pareto-improving outcomes by delegating decision-making to their respective AI representatives (Oesterheld and Conitzer, 2022). To give another example, AI assistants may provide a means to explore a much larger space of binding cooperative agreements between individuals, firms or nation states than is tractable in ‘face-to-face’ negotiation. This opens up the possibility of threading the needle more successfully in intricate deals on challenging issues such as trade agreements or carbon credits, with the potential to guarantee cooperation via automated smart contracts or zero-knowledge mechanisms (Canetti et al., 2023)."
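The game of ‘chicken’ described above can be made concrete with a minimal sketch. The payoff values below are illustrative assumptions rather than figures from the cited sources; the point is only to show how a credible commitment to driving straight forces the other driver's best response.

```python
# Minimal sketch of 'chicken' and the effect of credible commitment.
# Payoff values are illustrative assumptions, not from the cited sources.

SWERVE, STRAIGHT = "swerve", "straight"

# (row payoff, column payoff) indexed by (row action, column action)
PAYOFFS = {
    (SWERVE, SWERVE): (0, 0),          # both back down
    (SWERVE, STRAIGHT): (-1, 1),       # row yields; column 'wins'
    (STRAIGHT, SWERVE): (1, -1),       # row 'wins'; column yields
    (STRAIGHT, STRAIGHT): (-10, -10),  # neither swerves: calamitous for both
}

def best_response(player: str, opponent_action: str) -> str:
    """Return the action maximising `player`'s payoff against a fixed opponent action."""
    idx = 0 if player == "row" else 1
    def payoff(action: str) -> int:
        profile = (action, opponent_action) if player == "row" else (opponent_action, action)
        return PAYOFFS[profile][idx]
    return max((SWERVE, STRAIGHT), key=payoff)

# Without commitment there are two pure-strategy equilibria (one driver swerves,
# the other drives straight), and neither player can force their preferred one.
# If the row driver credibly commits to STRAIGHT (the detached steering wheel),
# the column driver's best response is forced:
print(best_response("col", STRAIGHT))  # -> 'swerve'
```

The committed driver secures the payoff of 1 while the other is left with -1: exactly the tie-breaking-by-coercion dynamic the description attributes to the fastest, hardest committer.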
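The programme equilibrium mentioned above (Tennenholtz, 2004) can likewise be sketched. In the toy construction below, each principal delegates to a program that can read its counterpart's source code; the Prisoner's Dilemma payoffs and the ‘cooperate iff your code equals mine’ strategy are standard textbook devices for illustrating the concept, not details from the source.

```python
# Hedged sketch of a programme equilibrium: delegated programs that condition
# their move on each other's source code. Payoffs and names are illustrative;
# this is one standard construction, not code from the cited paper.

COOPERATE, DEFECT = "C", "D"

# One-shot Prisoner's Dilemma payoffs: (row payoff, column payoff)
PD_PAYOFFS = {
    (COOPERATE, COOPERATE): (3, 3),
    (COOPERATE, DEFECT): (0, 5),
    (DEFECT, COOPERATE): (5, 0),
    (DEFECT, DEFECT): (1, 1),
}

# A program is a Python source string defining act(opponent_source) -> "C" or "D".
# MY_SOURCE is injected by the runner so a program can compare itself to its opponent.
CLIQUE_BOT = '''
def act(opponent_source):
    # Cooperate if and only if the opponent runs syntactically identical code.
    return "C" if opponent_source == MY_SOURCE else "D"
'''

def run(program_source: str, opponent_source: str) -> str:
    env = {"MY_SOURCE": program_source}
    exec(program_source, env)
    return env["act"](opponent_source)

row = run(CLIQUE_BOT, CLIQUE_BOT)
col = run(CLIQUE_BOT, CLIQUE_BOT)
print(row, col, PD_PAYOFFS[(row, col)])  # -> C C (3, 3)

# Any unilateral deviation to a different program is met with "D", yielding the
# deviator at most the mutual-defection payoff of 1 < 3, so (CLIQUE_BOT, CLIQUE_BOT)
# is a programme equilibrium.
```

Because each delegate's commitment is verifiable by the other, delegation sustains mutual cooperation that neither principal can profitably abandon, which is the kind of Pareto-improving outcome the description associates with AI representatives (Oesterheld and Conitzer, 2022).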