The politics of image search - A conversation with Sebastian Schmieg [Part I]
Visual artist, computer programmer and data activist. Lives and works between Brussels and London.
Last year, for a few months, the images displayed on the pages of The Photographers’ Gallery’s website were overlaid with four intriguing keywords: problem, solution, past and future. At the same time, on the Media Wall of the Gallery, a script recursively exploiting Google’s reverse image search was producing a continuous chain of images. The discussion that follows took place while these two processes, creatively engineered by the digital artist Sebastian Schmieg, were unfolding.
Here we present the first part of the interview, in which Schmieg reflects on these projects and speculates on their development. He also critically discusses the impact of machine learning on visual culture, as well as the imbalances and asymmetries in the production of artificial intelligence.
Shall we make a start by talking a bit about your background? It is interesting to sketch not just how you came to do what you are doing today but also how it fits in the context of The Photographers’ Gallery digital programme.
Well, I guess like many people, I started dabbling with the Internet. I was lucky that we had an early Internet connection, so I started making websites. I initially aspired to become a web designer, but I ended up studying with the net artist Olia Lialina. Back then I didn’t realise how much that would influence me.
I first did some studies in computer science, which was interesting, but I was totally missing the creative part, as they stupidly say these days. I continued studying at the University of the Arts in Berlin, and I always had a strong focus on programming. I wouldn’t say that this element can be found in all my projects, but it’s a mind-set that I have. That background connects to the Decision Space project: studying in the design department, I was asked to create solutions, which I reluctantly did, but I was always better at looking into problems. [Laughter]
So you found yourself some good problems to show and discuss?
Yeah, and I found that’s a process that has stuck with me.
‘Search by Image’ was the first project of yours that I came across, and it’s from there that I became more familiar with your work and curious about what you were doing. This project has had some exposure in the past, and [at the time of this conversation] it is showing on the Media Wall of The Photographers’ Gallery. We can see how it’s still super relevant, even quite some years after its appearance. I would like, if that’s not too far in the past for you, to ask you about the context from which the original idea to make that work came. If I’m not mistaken, it was in 2011, and I think Google Image Search had already existed for 10 years by then.
So how did you come to it?
It was in that year that the search-by-image feature was actually released (by Google).
A friend told me about this brand new feature which instantly fascinated me. I was basically just playing around with it, without having any idea where it would lead. I was just throwing any image that I could find at it, and trying to find out what would happen. And when I saw that, for example, when you feed dogs into it you get back naked men or women, it became clear that this is a totally different way of searching to the one Google regularly offers. I then instantly had the idea to generate videos with it and I made lots of experiments with that. Out of this process the video that got the most exposure is the one starting with a transparent PNG.
The production of the video was quite fast: it took one night to scrape the images and to then put them together. But the process as a whole was much longer since there is no API (Application Programming Interface) and you are not supposed to do that. So you have to find a way to programme it, and then to find the right things.
I actually published the ‘Search by Image’ series while studying at the Piet Zwart Institute in the Netherlands, where I did a study abroad. And it is interesting for me, after presenting it there, to now be continuing with it so many years later. Also, my perception of it has changed: in the beginning it was for me a way to really look at an algorithm, while these days I think about it more as a narrative work, and not as pure as I thought it would be.
‘To look at it as a way to interact with an algorithm’: Would you like to further explore the relationship between solution and problem, considering how hard it is to figure out what these algorithms are actually doing?
At the time I had been very much concerned with the archive, and specifically with the way that Google has one. I was thus questioning how that archive is organized beyond its interface, which always looks so clean and easy to navigate. Understanding how that archive worked would allow me to traverse it and to interact with it in a different way; to do a sort of creative misuse of it. And to me, the empty PNG was really the purest way I could find to interact with it. When I worked with it, the interesting thing that came up was that the results you get change on a daily basis, or even every hour. In 2011, searching with the transparent PNG would lead to animated explosions, whereas now it connects to things like business cards or some Instagram icon.
Did you keep this sort of versioning of the results?
Yes, I have many, many folders full of sequences that I did.
That’s very interesting, because you have a sort of timeline of Google’s algorithmic unconscious, so to say, and of what this unconscious projects onto an empty image. And going from explosions to business cards... that is quite a mixture [laughter].
The story of the Internet…
So when you say that you’re now more interested in the narrative, do you find that this is a general evolution in your work?
My approach and my way of making art change regularly, as I try to find a good way of speaking about the things I want to speak about. But at the beginning of this project, I thought it would be something more pure, closer to Google’s internals. In the beginning I also put the algorithm for doing it on my website, because I wanted to at least make my side open, and show what I was doing.
And are you using Google’s API today?
They still don’t have an API for it, so it’s the same thing. I’m using different code now, a bit more complex. You have to automate a few processes, click there, click here and so on. For example, in the case of the Media Wall at The Photographers' Gallery, one day the sequence got stuck and it wouldn’t continue. What was actually happening was that they had an iframe overlaying the search results, and you had to scroll down and click on “I agree to the terms of service”.
So, in a way, you have to emulate a user?
Yes, exactly. It’s basically like automating a browser.
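The recursive mechanism Schmieg describes can be sketched in a few lines: feed an image in, take the top result, feed that back in, and repeat. The sketch below is a toy illustration only; the filenames and the `best_match` lookup are hypothetical stand-ins for the real scraped search step, which in the actual piece is performed by automating a browser, since Google offers no API for search-by-image.

```python
# Minimal sketch of the chain logic behind the recursive image search.
# best_match() stands in for the scraped "search by image" step; in the
# real installation this would drive an automated browser session.

def build_chain(start_image, best_match, steps):
    """Repeatedly feed the top result back in as the next query."""
    chain = [start_image]
    for _ in range(steps):
        chain.append(best_match(chain[-1]))
    return chain

# Hypothetical toy lookup standing in for real, ever-changing results.
toy_results = {
    "transparent.png": "explosion.jpg",
    "explosion.jpg": "business_card.jpg",
    "business_card.jpg": "instagram_icon.png",
}

chain = build_chain("transparent.png", toy_results.get, 3)
print(chain)
```

Because the real results shift daily, running the same chain twice would produce different sequences; the toy lookup above is fixed only to make the loop legible.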
It is interesting to see that over the years the situation has not changed and that they (Google) never made it easier…
I don’t think they are making anything easier if they are not benefiting from it.
Further on the perception of the project though, speaking with The Photographers’ Gallery digital programme team, they told me something about the 'Search by Image: Lena/Fabio' sequence that I found interesting and I had noticed myself too.
Quite quickly during the sequence – having started with either Lena or Fabio – the succession of images led to male politicians, or superhero guys. So again, this is nothing that I’m doing with my algorithm; it’s really a Google thing. In that sense I think that the project is exposing something which is already there: a strong, huge cluster in their system, which I guess is white-male dominated.
Is this maybe contingent? Do you think it relates to Donald Trump’s presidency or is it a regular tendency?
That’s a good question… Well, I only did the experiment of the sequence starting with Lena and Fabio for The Photographers’ Gallery so I don’t have any data over the years to back that up. But I wouldn’t say it’s necessarily a Trump thing.
For example, when I did a contribution to the TPG Loose Associations magazine, I was asked the question “How do you visualise the woman today?”. So what I did was to search for the first image that you get when you put the word ‘woman’ into Google image search. I then used that as the starting point for the ‘Search by Image’ sequence, and in that case you would again get Superman, or the actors that play Superman. It wasn’t planned, but as soon as I ran it once that was instantly the case. I didn’t plan to transform a woman into superheroes. It just happened.
So what is the profile of this user that you are emulating? I suppose you have to give it a browser brand and version. And do you see differences when you run the script from your home in Berlin, or from The Photographers' Gallery in London?
Of course, there are differences. The search is always really tied to your location so in the case of The Photographers’ Gallery, it’s running on a server in London. And what I did for the first time, in order to avoid this problem of constantly having to click on the ‘I accept the Terms of Service’, which would pop up again and again, was to actually create an account that also logs in automatically.
When I set up the piece at The Photographers' Gallery, a friend of mine was in Brazil so I used one of his spare phone numbers to make an account. For that reason the account tied with 'Search by Image: Lena / Fabio' had this weird identity, based in London and in Brazil at the same time.
It is interesting to think about the search results as linked to this user with one leg in Brazil and the other in London, and probably a third one in Berlin…
Yes, that’s true. I wonder what I should do with this user?
It has definitely been the most active user on Google for the last few months, really. A user that is uploading images every few seconds, nonstop -
maybe it’s already a Super-Power user by now.
This project could also be a tool for research - I was wondering whether you have been contacted by people who are working in social science or visual culture?
No, but Google contacted me… it was 2012 or 2013 actually. They liked the project and they wanted to publish one video on their Google search blog; a blog they run about their search products. Although I was assuming they would pick the transparent PNG video, they chose a way more abstract piece, which actually made me question what their reading of the video was. What did it originally show that they didn’t want to tell? The fact that they showed interest in the work was flattering, but looking at the work as an enquiry into and criticism of their inner workings, it was also uncomfortable for me, like producing some kind of advertisement for their products, which I didn’t want to do at all.
I mean, they should collect such things but maybe this is what they aim to do through the Google Cultural Institute now.
Would you like to describe your current project commissioned by The Photographers' Gallery, 'Decision Space'?
There are two strong currents running through my practice. One, which started with Search by Image, is looking at computer vision and archives of images. The other one is looking at how this constant connectivity changes the way we work.
These two things together brought me to look into neural networks and how they learn to look at an image and to understand it. I realised that at the core of artificial intelligence there is a huge number of people who are tagging or describing images for very little money; and on the other hand there are people like you and me, and everything we do is also put into these datasets to train artificial intelligence.
I got interested in what these systems are trained on because given the extent to which they are used and will be used – from smartphones to surveillance to warfare – the photographic training data sets have serious ethical implications that need to be addressed. So when somebody is sitting there doing a job for really shitty money that influences the outcome of the system too.
So, Decision Space is an attempt to create a new dataset, which you can then use to train a so-called artificial intelligence, or a neural network; and I created that dataset by using all the photos that were part of The Photographers' Gallery online image collection. Visitors to the website could then help categorise each image. For instance, when you go to the website you can decide by saying ‘that, for me, represents the problem, the solution, the past or the future’.
At the same time there is also a dedicated website which shows all images from the online collection, one after another. It recreates an interface similar to what you would have when you work as a dedicated image annotator or classifier, as a Mechanical Turk worker - so to speak.
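The dataset Schmieg describes could be assembled by tallying visitors' decisions per image. A minimal sketch, assuming a simple vote log of (image, label) pairs; the field names and file names are illustrative, not the project's actual data format. Notably, the tally keeps the full, possibly contradictory distribution of labels rather than forcing a single "correct" answer, which is exactly the ambiguity the project is after.

```python
# Sketch of aggregating crowd decisions into a training set for
# Decision Space's four labels. Names are illustrative assumptions.
from collections import Counter

LABELS = {"problem", "solution", "past", "future"}

def aggregate(votes):
    """Tally votes per image, keeping the full (possibly
    contradictory) label distribution for each image."""
    tally = {}
    for image, label in votes:
        assert label in LABELS, f"unknown label: {label}"
        tally.setdefault(image, Counter())[label] += 1
    return tally

votes = [
    ("photo_001.jpg", "future"),
    ("photo_001.jpg", "past"),    # another visitor disagrees
    ("photo_001.jpg", "future"),
    ("photo_002.jpg", "problem"),
]
tally = aggregate(votes)
print(tally["photo_001.jpg"])
```

A classifier trained on such a tally could target the vote distribution itself, rather than a hard label, preserving the disagreement in the data.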
If I am not mistaken, you usually reverse-engineer an existing algorithm or, for example in earlier projects, the way people annotate datasets. But now you take the decision and responsibility to constitute the dataset yourself. As it is the first decision to make, I wondered what kind of questions this brings, and what’s the difference, in a way?
I wanted to create the situation of a mechanical turk – which we all are of course: select all images, tag your friend, rate this product – rather than using mechanical turks. I was interested in how I could set up this whole little factory myself and understand how this little assembly line works. For example, datasets are often based on photos from Flickr. There, people often upload their photos using a creative commons license – since of course sharing is caring – but that also means that these photos might end up being used by Microsoft or other companies to train their system.
Looking at the Gallery's website, I was trying to find a way to speak about appropriating the work of all these photographers, and the cognitive resources of all visitors to the website.
In the case of Mechanical Turk workers, most of the time a lot of precautions are put in place in order to avoid ambiguously labelled images. So for instance, when Fei-Fei Li explains how they did ImageNet, she shows how every time the annotator needs to annotate an image, the annotator is tested. Does (s)he understand the meaning of the label?
It is only when the person has passed the test that (s)he can begin the annotation. Which leads me to your decision to stay ambiguous. The four words “past”, “future”, “problem”, “solution” appear without definition or context. It is puzzling and enigmatic for the person who comes to the website, as they don’t know what they are supposed to put behind those words. But that’s also a decision you took, and I was wondering how you came to it?
This stage that we are presently in with Artificial Intelligence is a naïve one. I wanted to extrapolate what is in the future and suggest that there are some concepts that are neither true nor false, and not as innocent as dog or cat. The project plays with the idea that there are concepts that are really biased by your background, your cultural background. And obviously there is no exact future, problem, etc., so for me the dataset shows how being totally contradictory can still make sense.
You can look at an image and say I can understand why a person clicked on ‘future’ but I also understand why another person clicked on ‘past’. I really wanted to explore this ambiguity and in a way break the logic of ‘you just need to scale up these systems’ in order to construct this super-intelligent thing.
To quickly talk a bit about influences, I was really fascinated by the MIT Moral Machine, as I think on the one hand it’s an utterly stupid thing, but on the other it’s absolutely revealing, especially considering that it is an approach taken to decide what a car should do. The whole setup is already highly problematic for me, but I found it a really strong influence on the project.
Could you say a few words about 'Segmentation Network'? Because I tend to read these projects side by side, as two different approaches to the same problem.
To explain it briefly, Segmentation Network is looking at the Microsoft COCO dataset, which is quite special because it’s not only saying 'okay in this photo there is a dog and the owner, and the dog is there'. Instead, the data has proper segmentation. There are lines around the dog and the owner, locating them and tracing their shapes, so the data is very clear. And the data has been created by Mechanical Turks, click by click. I was then trying to highlight two things: first, the amount of labour that goes into it. Most of the time when you talk about these systems, it’s either about clever engineers, or about the super intelligent algorithm, but it’s hardly ever about the people that are at the core of it.
Secondly, I wanted to look at how this assembly line defines computer vision: the machine can only see what is inside a segment. Everything that is outside of this will remain unrecognised or be deemed irrelevant. For me, a key point to realise is that it is not just these people doing the work, but all of us. So, with Decision Space I wanted to highlight that we are the same, we are also workers, cloud-workers, turning our cultural background and our biases into data and software.
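The click-by-click segmentation Schmieg refers to is stored in COCO as flat polygon coordinate lists traced around each object. The sketch below shows that annotation shape with a made-up example (the rectangle and image id are invented for illustration; the overall JSON layout follows the COCO format), and computes the polygon's area with the shoelace formula, one concrete sense in which the machine "sees" only what falls inside the segment.

```python
# Sketch of a COCO-style instance annotation: each object is a flat
# [x1, y1, x2, y2, ...] polygon traced by an annotator. The example
# annotation itself is made up for illustration.

def polygon_area(coords):
    """Shoelace formula over a flat [x1, y1, x2, y2, ...] list."""
    xs, ys = coords[0::2], coords[1::2]
    n = len(xs)
    return abs(sum(xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i]
                   for i in range(n))) / 2.0

annotation = {
    "image_id": 1,                # hypothetical image
    "category_id": 18,            # "dog" in the COCO category list
    "segmentation": [[10.0, 10.0, 60.0, 10.0, 60.0, 40.0, 10.0, 40.0]],
}

# A 50 x 30 rectangle traced click by click.
area = polygon_area(annotation["segmentation"][0])
print(area)
```

Everything outside those traced coordinates simply does not exist for the trained model, which is the point made above.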
[end of Part I]