De l’imagination à la compositionnalité artificielle / From imagination to artificial compositionality

01/2023

J’utilise depuis 2010 le concept d’imagination artificielle (ImA) pour désigner la production d’images automatisée par les réseaux récursifs de neurones. Stratégiquement cette notion permet de ramener l’imagination, non à l’expression d’une intériorité dont la nature subjective est floue, mais au simple fait de produire des images et introduit une ambiguïté, un trouble par rapport à l’imagination anthropologique, déstituant celle-ci de ses privilèges auto-accordés.
Si je garde cette formule du fait de sa polysémie et de sa capacité à désigner l’artificialisation de l’imagination tout comme la manière dont nous imaginons l’artificiel, liant ces deux phénomènes d’une manière inextricable afin d’en problématiser l’influence réciproque, la notion de compositionnalité est techniquement plus précise.
Celle-ci consiste en la capacité de composer des paramètres fruit d’un calcul statistique sur des données, c’est-à-dire d’arranger plusieurs éléments probables d’une manière cohérente. C’est cette cohérence de la composition visuelle qui constitue la spécificité de la production automatique d’images et son effet de réalisme.
On peut distinguer la compositionnalité d’autres modes d’appareillage traditionnels et fondamentaux pour la sélection culturelle tels que la citation, la fragmentation, le collage, la traduction, la transduction, etc.
C’est cette compositionnalité alliée à la technologie CLIP corrélant des paramètres visuels et langagiers dans l’espace latent qui permet aux êtres humains de produire un texte pour générer une image. Et il faut remarquer que socialement, face à cette possibilité d’écrire un texte pour produire une image (qui intensifie en profondeur les liens complexes qui se sont tissés au cours de l’histoire entre le texte et l’image), les résultats sont médiocres. Tout se passe comme si la compositionnalité automatique révélait le manque d’imagination anthropologique ou plus exactement le rendait lisible, au sens strict du terme, grâce à ces prompts textuels.
On voit l’influence de la culture américaine qui a déterminé tant et si bien l’imaginaire des individus qu’ils ne sont plus capables que d’imaginer un mélange de Star Wars, Le seigneur des anneaux, Tim Burton et tout un ensemble de clichés diffusés sur Netflix. Si l’espace latent contient un nombre de possibilités d’images transfinies (plus grand que ce que nous pouvons parcourir), les images passées comme les images à venir (ce pour quoi il peut produire des images réalistes qui n’avaient jamais été composées), ces possibilités sont réduites drastiquement par la finitude de l’imagination humaine déterminée par les industries culturelles.
On peut dès lors parler de l’imagination artificielle sous la forme de :
1/l’imagination (de l’) artificiel.le pour désigner la boucle de réciprocité entre les technologies et les êtres humains, puis distinguer
2/la compositionnalité automatique permet d’intégrer visuellement des paramètres visuels et textuels de
3/l’imagination humaine comme capacité de projeter (par un texte, par un geste) une image sur un support matériel que celui-ci soit sous la main ou plus éloigné (dans des datacenters par exemple). Même en automatisant les prompts, comme c’est déjà le cas, il est toujours question pour définir l’élément 1 de préciser les relations et les rétroactions entre les éléments 2 et 3.

Un modèle scientifique en thermodynamique expliquant la compositionnalité de la diffusion.

Since 2010 I have used the concept of artificial imagination (ImA) to refer to the automated production of images by recursive neural networks. Strategically, this notion ties imagination not to the expression of an interiority whose subjective nature is vague, but to the simple fact of producing images. It also introduces an ambiguity, a disturbance with respect to anthropological imagination, stripping the latter of its self-granted privileges.
I keep this formula for its polysemy and for its capacity to designate both the artificialization of imagination and the way we imagine the artificial, linking the two phenomena inextricably in order to problematize their reciprocal influence. The notion of compositionality, however, is technically more precise.
Compositionality is the capacity to compose parameters that result from a statistical calculation on data, that is, to arrange several probable elements in a coherent way. It is this coherence of the visual composition that constitutes the specificity of automatic image production and its effect of realism.
Compositionality can be distinguished from other traditional modes of apparatus (appareillage) that are fundamental to cultural selection, such as quotation, fragmentation, collage, translation, transduction, etc.
It is this compositionality, combined with CLIP technology, which correlates visual and linguistic parameters in latent space, that allows human beings to write a text in order to generate an image. It should be noted that socially, given this possibility of writing a text to produce an image (which deeply intensifies the complex links woven throughout history between text and image), the results are mediocre. Everything happens as if automatic compositionality revealed the lack of anthropological imagination or, more exactly, made it readable, in the strict sense of the term, through these textual prompts.
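As a rough illustration of what "correlating visual and linguistic parameters in latent space" means, here is a minimal sketch. It is not CLIP itself: the three-dimensional vectors, the file names and the `best_match` helper are hypothetical and written by hand, whereas CLIP learns high-dimensional embeddings with trained text and image encoders. The only idea retained is that a text and an image are placed in the same vector space, where their proximity can be measured.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical, hand-written embeddings standing in for encoder outputs.
text_embeddings = {
    "a castle in the fog": [0.9, 0.1, 0.2],
    "a spaceship in orbit": [0.1, 0.9, 0.3],
}
image_embeddings = {
    "castle.png": [0.8, 0.2, 0.1],
    "spaceship.png": [0.2, 0.8, 0.4],
}

def best_match(prompt):
    """Return the image whose embedding lies closest to the prompt's embedding."""
    query = text_embeddings[prompt]
    return max(
        image_embeddings,
        key=lambda name: cosine_similarity(query, image_embeddings[name]),
    )

print(best_match("a castle in the fog"))
```

In a text-to-image system the prompt's embedding guides the generation of a new image instead of retrieving an existing one, but the shared space between text and image plays the same role.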
We see the influence of American culture, which has so thoroughly determined the imagination of individuals that they are only capable of imagining a mixture of Star Wars, The Lord of the Rings, Tim Burton and a whole set of clichés streamed on Netflix. The latent space contains a transfinite number of possible images (more than we could ever traverse), past images as well as images to come, which is why it can produce realistic images that had never been composed. Yet these possibilities are drastically reduced by the finitude of a human imagination determined by the cultural industries.
We can therefore speak of artificial imagination in the form of:
1/artificial imagination, or the imagination of the artificial, to designate the loop of reciprocity between technologies and human beings, and then distinguish
2/automatic compositionality, which visually integrates visual and textual parameters, from
3/human imagination as the capacity to project (through a text, through a gesture) an image onto a material support, whether that support is at hand or more distant (in datacenters, for example). Even when prompts are automated, as is already the case, defining element 1 always requires specifying the relations and feedback loops between elements 2 and 3.

A scientific model in thermodynamics explaining the compositionality of diffusion.
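
For readers who want a formula behind this caption: diffusion models borrow their name and principle from nonequilibrium thermodynamics. What follows is the standard forward (noising) step of that family of models, given as general background and not as a transcription of the pictured model. The symbols $x_t$, $\beta_t$ and $I$ are notation introduced here.

```latex
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\right), \qquad t = 1, \dots, T
```

Here $x_0$ is an image and $\beta_t \in (0, 1)$ is the variance of the noise added at step $t$. A network is then trained to reverse these steps, so that a coherent image can be composed starting from pure noise.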