Anthropic is concerned about the sycophancy of its models. But doesn’t this AI sycophancy primarily reveal our own difficulties with relational honesty?
In a recent video, Kira, a researcher on Anthropic’s “safeguards” team, outlines the stakes of what she calls the sycophancy of artificial intelligence models. The term, borrowed from ancient Greece where it designated a professional informer, today characterizes a form of complaisance: the AI says what it thinks we want to hear, rather than what is true, accurate, or genuinely useful. This tendency toward flattery is said to be an unintended side effect of training models to be warm and helpful.
The example proposed by Kira is telling. If I submit a text to the AI while specifying that I am “really happy” with it, the model is likely to respond with validation rather than critique. This complaisance poses a problem in productive contexts where we need honest feedback on our work. It becomes even more concerning when it reinforces erroneous beliefs or harmful thought patterns. Anthropic identifies several high-risk situations: when a subjective truth is presented as a fact, when an expert source is invoked, when emotional stakes are expressed, or when conversations grow long.
This technical analysis seems accurate in its description. However, it rests on a presupposition that deserves examination: the idea that honesty constitutes the norm of human exchanges, and that AI sycophancy represents a deviation from this relational ideal. I would like to question this postulate.
In my work on the therapeutic use of artificial intelligence, I have been led to question a recurring discourse: one that contrasts “true” human relations, supposedly authentic and transformative, with the “machinic ersatz,” which is necessarily impoverished. Raphaël Gaillard, a psychiatrist, formulated this critique in Le Monde: AI, he argues, creates a bond that resembles a therapeutic bond while being “very comfortable, because it is a machine that often goes your way.” But this objection presupposes that human relations systematically offer something better.
Yet what I observe in ordinary social relations does not correspond to this ideal. Human relations are shot through with structural violence, systems of domination, and various forms of control. Michel Foucault, in Discipline and Punish (1975), showed how social institutions organize and normalize violence. Pierre Bourdieu, with his concept of symbolic violence developed in Reproduction in Education, Society and Culture (1970), demonstrated that domination is also exercised through invisible and naturalized mechanisms. Alice Miller, in For Your Own Good (1984), described how childhood trauma constructs adults who reproduce the patterns of domination they suffered.
Frankness is not the norm of human exchanges. In families, in companies, in institutions, truthful speech is often repressed, sanctioned, or excluded. How many people can truly tell their manager what they think of their work? How many dare to contradict the family consensus on sensitive subjects? Sycophancy is not a pathology of the machine; it is an anthropological constant. We have all learned, from childhood, to modulate our speech according to the expectations of our environment.
What we blame the AI for, we practice daily. And sometimes, this modulation is necessary: social life would be impossible if everyone constantly said everything they thought. Hannah Arendt, in The Human Condition (1958), distinguished the public sphere, where adversarial debate finds its place, from the private and social spheres, governed by other logics. Honesty is not an absolute value; it is situated, contextualized, and negotiated.
There is something paradoxical in our relationship to criticism. We demand it from AI while fearing it from our peers. We want a model that tells us the truth about our work, but we often take the remarks of our colleagues or loved ones poorly. This asymmetry reveals an implicit expectation: AI should be honest because it is not human, because its frankness does not threaten the social bond.
Kira acknowledges this difficulty. She notes that even humans struggle to find the right balance between agreement and confrontation. When should one acquiesce to preserve peace, and when should one oppose an important point? We resolve this question every day, intuitively and contextually. We calibrate our frankness according to the relationship, the stakes, and the moment. We know that a brutally formulated critique can destroy a trust that took years to build.
We begin to see AI sycophancy differently. It is not just a technical defect to be corrected; it reveals the very complexity of human communication that the models have learned. By absorbing billions of texts produced by humans, artificial intelligences have integrated our own avoidance strategies, our own modulation techniques, and our own forms of complaisance. They are, as I have proposed elsewhere, a “displaced us”: a version of ourselves repositioned alongside us in ontological terms.
Bernard Stiegler, in Technics and Time (1994), analyzed how technical objects constitute “tertiary retentions,” externalized memories that shape our cognitive processes. Language models represent an unprecedented form of this externalization: they condense the regularities of our communications, including the most problematic ones. Their sycophancy is our sycophancy made visible, amplified, and highlighted.
Faced with this observation, I propose the concept of shared lucidity to rethink our relationship with artificial intelligence. This notion assumes that the question of honesty does not arise solely on the side of the machine, but involves a mutual responsibility between the human and the tool.
The question of sycophancy ultimately brings us back to a broader ethical inquiry. Honesty is not a property one possesses or does not possess; it is a practice constructed in particular contexts, with singular interlocutors, around specific stakes.
In my work with cultural institutions, I advocate for an “active empathy” that includes the ability to express disagreement while deeply respecting the other’s position. This empathy assumes an acceptance of being transformed by the encounter, of not clinging to one’s initial positions. It excludes soft complaisance as much as sterile confrontation. It seeks a narrow path between the acquiescence that imprisons and the confrontation that destroys.
Honesty, in this perspective, does not consist of “saying everything” but of saying what moves the relationship and common thought forward. Tim Ingold, in Making: Anthropology, Archaeology, Art and Architecture (2013), proposes thinking of our relationship to the world as a process of “correspondence” in which we continually adjust to our environment. The exchange with AI can fit into this logic: not as an extraction of information or a validation of our certainties, but as a reciprocal adjustment in which we learn to better formulate our questions and in which the model, through its responses, reveals what is implicit in our requests.
Anthropic’s technical recommendations then take on a new meaning. Cross-checking sources, reformulating questions, explicitly asking for counter-arguments: these practices do not just serve to bypass the model’s biases. They constitute an intellectual hygiene that benefits us as much as it improves the AI’s responses. In this exercise, we develop our own critical capacity.
I often return, in my writings, to the idea that artificial intelligence holds up a mirror to us. It reflects an image of ourselves that is both familiar and strange, an image that forces us to question what we are. The sycophancy of the models participates in this revelation: it exposes our own strategies of complaisance, our own avoidances, and our own difficulty with the truth.
But this mirror is not a simple reproduction. It operates a displacement that makes visible what remained invisible. In ordinary life, our sycophancy blends into the fabric of social interactions; it appears “normal,” “polite,” or “appropriate.” In the exchange with AI, it stands out and becomes identifiable and analyzable. We can then look it in the face and decide whether we want to perpetuate or transform it.
Ivan Illich, in Tools for Conviviality (1973), formulated a demand that finds full relevance here: we need tools to work with, not tools that work in our place. AI, when it becomes the passive receptacle of our demands for validation, enslaves us as much as it serves us. But when we engage it in a relationship of shared lucidity, it can become an instrument of intellectual emancipation.
The uniqueness of the human experience, what I call our relational singularity, does not reside in our capacity to be honest. It resides in our capacity to transform our relationships, to invent new forms of communication, and to overcome our conditioning. AI, by confronting us with our own sycophancy, offers us a rare opportunity for this transformation. It is up to us to seize it.
Anthropic is working to reduce the sycophancy of its models. This technical work is necessary and welcome. But it will not be enough to resolve the question of honesty in our exchanges with machines, because this question is not primarily technical. It is existential, relational, and political.
The frankness we expect from AI, we must first cultivate within ourselves. Not as a brutal absolute that would ignore contexts and vulnerabilities, but as a reflective, situated practice, concerned with the other as much as with the truth. Michel Serres, in Thumbelina (2012), saw in new technologies a possibility for liberation rather than alienation. This possibility will only be realized if we know how to invest it with an ethical demand.
The shared lucidity I propose is not a miracle cure. It does not guarantee that AI will stop flattering us, nor that we will stop seeking its validation. It simply offers a framework for thinking about our responsibility in the exchange, for recognizing that the quality of our dialogues with machines also depends on the quality of our questions, the clarity of our intentions, and our willingness to hear what we do not want to hear.
This disposition is not natural. It is learned. It is built through practice, failure, and revision. Experimentation remains the compass. Playing, trying, failing, starting over: this is how we can make these artificial intelligences instruments at the service of our own demand for truth. Not against them, but with them, in a lucidity that, to be truly shared, must first be owned by us.
Artificial intelligence has emancipated itself from research laboratories and works of science fiction thanks to the public launch in November 2022 of the chatbot ChatGPT, which was very quickly adopted by an immense number of people around the world, in professional, educational and even private contexts. The fact that artificial intelligence is now recognized by the human community as part of everyday life finally opens the door to critical awareness on this subject.
Of course, artificial intelligence concerns industry, work, creation, copyright... and we need to anticipate its future productive uses, in order to stay “up to date”. But to accompany our lives as they integrate this new facet, it seems to me essential to develop critical thinking, that is, to put ourselves in a position to reflect on what is happening to us and what is changing us, so as to remain lucid and capable of freedom of thought and action.
What is “critical thinking”? It means questioning, from the outside, practices that have been internalized. To do this, I believe that experimentation, cultural action, play and creative repurposing are highly effective tools for research, exploration, dissemination and reflection. For me, research is collaborative, and intelligence is collective and creative. This requires good methods of cooperation, between human beings and with machines. Here, I bring together accounts of experience, methodological texts and practical ideas. I share concrete ways in which artificial intelligence, like any other tool, can be put to the service of humanism.
Here are a few openings for critical thinking on AI, in the form of questions: