I am not sure that Generative AI will be around as an end-user tool for a very long time. I expect this technology to be transformed into invisible functionality, hidden from the surface, and to thus dissolve into the foundation of contemporary media. It happened to 3D acceleration, which is powering slide transitions nowadays and, at times, website backgrounds. It happened to sound mixing which is today automatically done in hardware and software on every device. It happened to text-to-speech systems and the seamless shuffle of music clips. These all used to be exciting new technologies before they were built into more and more devices, gradually losing their sheen. Before they became commodities.
When they were new, these technologies used to be the basis for artworks. There is a whole art scene around exploring new inventions, pushing them until they are so close to breaking that their aesthetic features turn visible. To the curious, experiencing such artworks demonstrates the materiality of a technology. By breaking it, or bringing it close to breaking point, the seams show. The technology and its relation to the viewer of the artwork become apparent. That, in turn, helps the viewer to build a deeper understanding of what the technology means.
All art has materiality, even the most ephemeral piece of performance art[1]. All materiality is rooted in a process of transformation. This transformation turns something natural into something artificial. The transformation process is an amalgamation of creation, curation, interpretation, selection, layering, processing, communication, and many more transformative practices performed by the artist, the curator, and the audience. It needs all of them to turn a piece of nature into an art piece.
I always found it interesting that the arts managed to survive the Enlightenment period. The intuitive and haphazard practice of alchemy was replaced with the structured process of chemistry. Astrology and augury died at the hands of celestial mechanics found in astronomy and statistics. Many spiritual practices fell victim to psychiatry and medicine. Art survived, even thrived, in part thanks to embracing new technological abilities.
“At first glance the arts might seem to have been in a situation like religion’s. Having been denied by the Enlightenment all tasks they could take seriously, they looked as though they were going to be assimilated to entertainment pure and simple, and entertainment itself looked as though it were going to be assimilated, like religion, to therapy. The arts could save themselves from this levelling down only by demonstrating that the kind of experience they provided was valuable in its own right and not to be obtained from any other kind of activity. Each art, it turned out, had to perform this demonstration on its own account. What had to be exhibited was not only that which was unique and irreducible in art in general, but also that which was unique and irreducible in each particular art.” (Greenberg 1960)[2]
Many art forms still struggle with the development of what makes them unique and irreproducible in other media. Video games, for example, are an art form too dominated by entertainment to have an easy time finding its artistic footing. Many approach this medium as film with interactivity, storytelling with agency, or mindless entertainment. Video games are all of those, just like there are other media that feature a gradient between entertainment, utility, and art – e.g. music, writing, and photography. One thing that sets video games apart from most other forms of creative expression is their procedurality. Video games only happen when a player is interacting and they only progress via the player’s contribution. Their ultimate manifestation is dependent on the player in a more explicit way than the ultimate manifestation of a book is dependent on a reader[3].
One way this procedurality is further emphasised is via procedural generation, a technique where the designer devises a system for creating game assets, stories, utterances, or game rules, instead of designing them directly. This is achieved by creating a set of elements as well as rules about how they can be combined. The elements can be words, images, rooms, enemies, attributes of characters, body parts, or any other pieces of a game that are used for assembling a shape via combination: sentences are combinations of words and dungeons are combinations of rooms. Enemies are combinations of traits and abilities. And so on.
The design process is usually that the designer first creates a large number of examples of the outcome of the algorithm. Then they disassemble those examples into combinable elements – e.g. words, limbs, rooms – and define permitted probabilistic rules of combination. As a next step a wider array of base elements is created – more limbs, more words to choose from and so on. Then the designer goes back and forth between the source materials and the results of the combination process tuning the data as well as the rules. The resulting process of combining them at run time, while the game is playing, is defined by rules as well as probabilities. The designer can still know all possible combinations. The concrete manifestation is not deterministic but the design space is limited and clearly defined. One could argue that procedural generation is among the most unique and irreducible manifestations of materiality in video games.
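The process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular game does it: the element names, the weights, the forbidden pairing, and the function names are all invented for the example. What it is meant to show is that the concrete enemy is drawn at random while the full design space remains small enough to enumerate.

```python
import itertools
import random

# Hypothetical elements with hypothetical weights (higher = more likely).
# A designer would arrive at these by disassembling hand-made examples.
ELEMENTS = {
    "body": {"slime": 3, "skeleton": 2, "golem": 1},
    "trait": {"tiny": 2, "armoured": 1, "ghostly": 1},
    "ability": {"bites": 3, "casts fire": 1},
}
SLOTS = ("body", "trait", "ability")

# A hypothetical rule of combination: pairings the designer does not permit.
FORBIDDEN = {("slime", "armoured")}


def is_allowed(body, trait, ability):
    """Return True if the designer's rules permit this combination."""
    return (body, trait) not in FORBIDDEN


def design_space():
    """Enumerate every enemy the system can ever produce."""
    options = [list(ELEMENTS[slot]) for slot in SLOTS]
    return [combo for combo in itertools.product(*options) if is_allowed(*combo)]


def weight(combo):
    """Relative likelihood of a combination: the product of its element weights."""
    total = 1
    for slot, choice in zip(SLOTS, combo):
        total *= ELEMENTS[slot][choice]
    return total


def generate_enemy(rng):
    """Draw one permitted enemy at run time, according to the weights."""
    space = design_space()
    return rng.choices(space, weights=[weight(c) for c in space], k=1)[0]


if __name__ == "__main__":
    print(len(design_space()), "possible enemies")
    print(generate_enemy(random.Random()))
```

Tuning then means editing `ELEMENTS` and `FORBIDDEN` and looking at the results again. The designer can always call `design_space()` and read through the complete list of outcomes, which is exactly what stops being possible with a large generative model.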
In the case of Generative AI, there is a wholly different materiality in play. The artist working with such a system cannot specify the full creative range of the system because the model they are working with is too large and the stochastic processes are too granular to determine every possible outcome in advance. Instead, the artist has to rely on interacting with probabilities trained into the system via the data, encoded in the training algorithm, and employed by the retrieval method. There are ways to constrain the output via fine-tuning and filtering, but there’s no definitive guarantee of constraining expressive possibilities. That fact in itself is part of the unique and irreducible aspects of the medium. Additionally, large models in particular are prone to produce very middle-of-the-road results. In text, this takes on the shape of cringe, in images it takes on the shape of kitsch. Both of these forms can be described as “pretentious” in the sense of them pretending to be something they are not. And yet the expression the model lands on is the result of careful training. Maybe the cringe and the kitsch are not by-products but hints at uniqueness[4]? A uniqueness of the average, an irreducible blandness?
I cannot say what the materiality of Generative AI is going to be, or whether it will even get the chance to manifest before these systems become commodified, obsolete, or invisible. The current implementations and interfaces to this new medium do not allow for enough control (for anyone but experts) to have the amount of influence on the expressive range necessary for creating groundbreaking works. The market driving Generative AI more and more into the mainstream works against artists’ needs to exert a high level of control. Every developer is focussed on generality, ubiquity, safety[5], and neutrality.
There are countless examples of media-aware artworks based on machine learning algorithms – Mario Klingemann’s and Holly Herndon’s works come to mind. It is no coincidence that their art is based on highly customised and considerably smaller models than the mainstream juggernauts. Simply querying an off-the-shelf product to produce art means to naively bypass most of the artistic process and there is little artistic merit in it. Cleverly constructing a critical perspective on our relation to our past, to technology, to the imaginary, and to identity is a whole different beast.
Somewhere in that sphere of critical inferences, cultural criticism, mainstream-persiflage, subversion, and stochastic forms of expression, there are countless artworks buried. All we have to do is uncover them by finding that coveted uniqueness and that which is irreducible in machine learning art.
[1] I have yet to figure out how to apply this to John Cage’s 4′33″ but it has to be possible. Something along the lines of the void becoming filled by the audience.
[2] Clement Greenberg, “Modernist Painting” (1960), Clement Greenberg – The Collected Essays and Criticism. Modernism with a Vengeance, Volume 4 (1957-1969), John O’Brian (ed.), The University of Chicago Press, Chicago and London, 1993, p. 86.
[3] One could of course argue that a book does not proceed without a reader flipping the pages but I’d rather recommend flipping the pages of this classic:
Wolfgang Iser, “The Implied Reader: Patterns of Communication in Prose Fiction from Bunyan to Beckett”, Johns Hopkins University Press, Baltimore, 1978.
[4] A lot of people who have been working with this technology before the hype and mass adoption long for the days of small, faulty models whose output you could recognise at first glance.
[5] I am not arguing against AI safety here; I am arguing that the liberty an artist needs is a different liberty than what a public chatbot should demonstrate.