Seeing Machines research doc.
Seeing machines
Humans are not the only ones who perceive the world. In the digital age, machines see the world for other machines. By producing images, Artificial Intelligence (AI) is able to orient itself in the world – just as humans, who created the image to guide themselves through the universe.
Approaching ‘seeing machines’ introduces us to a new field of algorithms that rely on photographic technologies. They produce images meant solely for machine-to-machine communication, images that are not optimized to be seen by humans. I attempt to see from their perspective, curious about the ways in which photography could transform. By adopting their vision I want to explore how machine perception can expand the field of photography, open our view of external reality and broaden our outlook on life. Can they be a legitimate voice in the discourse of photography?
Introduction
From a young age I was drawn to the image, and to photography in particular: how it mediated between the world and myself. It was an entrance that made the world imaginable without immediate access to that world, opening up a new view and increasing awareness of our environment. The world constantly changes, and the image changes with it.
Nowadays my urgency still lies with the image. But the deluge of images, the saturation, has prompted me to ask whether it still makes sense to photograph within the existing framework. Anyone can produce pictures without knowing about the complex processes behind them. Everybody has cameras and image-processing software at their fingertips. Knowledge of craftsmanship is no longer necessary; no training in equipment is required. Billions of images are added daily, so that the photograph almost becomes a disposable product. The consequence is that I no longer see the value of making more photographs.
Photography, as we traditionally know it, has undergone a transition. The further I came to realize that, the more I wanted to dissociate myself from its tradition. Today, in the age of smartphones, satellite images, CCTV, machine image algorithms and drone media, image practices have become all-pervasive. The definition of photography expands. From this I concluded that this research is about ‘seeing machines’. Without question, the photographic landscape and its image-making devices will keep growing, and they will play a fundamental role in many basic elements of our lives. This development makes me very curious, and that is why I do not abandon my practice. I haven’t seen anything yet.
Relevance of the Topic
The contemporary revolution in photography offers opportunities that exceed the wildest expectations. The medium is embedded in our everyday life on many different levels: from automated license plate recognition systems, CCTV, Google Earth, smartphones, machine image algorithms and drone media to the advent of infinite image storage. It comprises many different kinds of technologies, imaging devices and practices. The medium has always been closely linked with the possibilities of technological advancement. This evolving relationship has created new ways to represent our physical world, to shape and regulate it. It has ultimately transformed society and affects the photographic landscape. Photography, as it was once understood, has gone beyond its existing framework. The current time asks us to consider a new perspective. How do I relate to photography when I think in terms of imaging systems instead of photos? The definition extends to help us see what photography has become.
In this post-humanist world, the human is no longer automatically the subject who sees. Over the last decade or so, something radical has happened: images have become disconnected from human action and vision. The shift has been barely noticed. An invisible landscape of images is produced by machine image algorithms solely for other machines to see. Trevor Paglen introduced me to the idea of photography as seeing machines. By ‘seeing’ I mean the capability to ‘perceive’. “Now objects perceive me,” the painter Paul Klee wrote in his notebook, according to Paul Virilio in The Vision Machine. The French theorist explains that we are on the verge of synthetic vision, the automation of perception – “a machine that would be capable not only of recognising the contours of shapes, but also of completely interpreting the visual field.” (Virilio 1994 [1988])
Historically, the performance of machines has gone beyond expectations. In the field of perception – the process of ordering sensory information and turning it into concepts – this understanding has become a reality. Through Artificial Intelligence (AI), neural networks can mimic the behaviour of real neurons. A neural network generates a variety of patterns, composed in several layers; connected to each other, the layers stack up to form the full network architecture. These patterns might look absurd to humans, but for machine vision they are the most realistic representation of a certain thing. The network singles out the best ‘performing’ ones: the images the system classifies with a high percentage of probability. Neural networks “learn” from a collection of training images; by training they evolve their own set of relevant characteristics. Machine learning can perform functions such as object detection, facial recognition or, for example, automated transportation. For its input the algorithm relies on lens-based media to orient itself in the world. Just like humans, who almost forgot that they created the image to guide themselves through the universe.
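That “high percentage of probability” is typically produced by a softmax step at the end of the network. As a minimal illustration – the labels and raw scores below are made up, not taken from any real model – turning output scores into probabilities and singling out the best ‘performing’ class looks like this:

```python
import math

def softmax(scores):
    """Turn raw output scores into probabilities that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical labels and raw scores from a network's final layer.
labels = ["cat", "dog", "car"]
scores = [2.0, 1.0, 0.1]

probs = softmax(scores)
# The class with the highest probability is the one the
# system 'singles out' as the best performing.
best_label, best_prob = max(zip(labels, probs), key=lambda lp: lp[1])
```

A real classifier performs the same ranking over hundreds or thousands of labels learned from its training images.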
In closing, I think there is a new frontier, and it will challenge the established ways of seeing. Embracing machine perception will introduce us to a “New Vision”. This double point of view will expand our understanding, just like the invention of photography did. I wonder: if photography became this life-shaping medium, what will its nearly identical replica achieve? From its origin, the development of photography has been a process of increasing awareness of the concept of knowledge, supporting our visual capacities where they were inadequate. […] “Embracing nonhuman vision as both a concept and a mode of being in the world will allow humans to see beyond the humanist limitations of their current philosophies and worldviews, to unsee themselves in their godlike positioning of both everywhere and nowhere, and to become reanchored and reattached again.” (Zylinska 15) […] As an artist I find it very exciting to adopt this “New Vision” and explore how this tool could work for me. As machine vision develops, can it be a legitimate voice in the discourse of my practice? In what respect will it engage with external reality and aesthetic integration? In ways that are hard to imagine from today’s point of view – but I think this can be a new entrance to make the world imaginable. […]
Insight from Experimentation
‘Seeing machines’ is the working title of my artistic research. I came across this concept a few years ago in an interview with the artist Ola Lanko. She described it as the constant recordings from, for example, satellites or CCTV: a production process decoupled from human intervention. Although I could not grasp exactly what the definition contained, I found it intuitively far more exciting than the traditional understanding of photography. It is also a better translation of the 21st-century photographic landscape, because it encompasses many different kinds of imaging devices and practices. So, to reinvent the medium for myself, I started exploring within this domain.
I started out by asking what ‘seeing machines’ meant for me, and it became a rich topic of exploration. I approached it in different ways until I found something I had not been aware of, something that made me very curious. Through the series ‘Is Photography Over?’, published a few years ago by Trevor Paglen, I became aware of the concept behind ‘seeing machines’. He first sketches out that traditional photography theory and practice seem to be at a standstill. I agree, because experience has shown that at the academy no questions are asked about what photography has become, despite the extreme change in the photographic landscape. We need a broader debate on this subject – and, interestingly, as I think back, to ask questions such as ‘can the definition of photography expand’ or ‘what has photography become’. Paglen does approach these questions, and his work freed my mind. His question-based text gave me the confidence to commit myself to the question of whether machine perception can expand the field of photography.
“Seeing machines is an expansive definition of photography. It is intended to encompass the myriad ways that not only humans use technology to “see” the world, but the ways machines see the world for other machines.” (Paglen)
To me, this was exciting enough to start a practical research into the technique. I began with a simple question: how do computers read images? I stumbled on different techniques that all led to deep learning. [HOG/CNN] What drew me most was an image from Stanford’s CS231n, a course taught by Andrej Karpathy and Justin Johnson. [x] It shows the process inside a neural network, entirely abstract to the human eye. Although I was not familiar with the process, I wanted to explore this field and had to master the technical part, no matter what. Experience in practice would provide me with answers.
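The answer to that question starts very plainly: to a computer, a grayscale photograph is nothing but a grid of numbers. A toy sketch of my own (not taken from the course material):

```python
# A grayscale "image" as a computer reads it: a grid of brightness
# values (0 = black, 255 = white). This tiny 4x4 picture contains
# a bright vertical stripe.
image = [
    [0, 255, 255, 0],
    [0, 255, 255, 0],
    [0, 255, 255, 0],
    [0, 255, 255, 0],
]

# Everything a vision algorithm does - HOG gradients, convolutions -
# begins as arithmetic on these numbers, e.g. the mean brightness:
mean_brightness = sum(sum(row) for row in image) / 16
```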
Torch-visbox: https://github.com/Aysegul/torch-visbox
By Aysegul
To gain insight into how computers approach the image, I worked within the field of algorithms and deep learning. I had to find myself a tool – a script on GitHub – that visualizes the activations produced on each layer of a trained ConvNet as it processes an image or video. I had never worked in a terminal before, so I needed some guidance from other students. Together we managed to install and run the script within Torch7 and Python (Ubuntu). Torch-visbox shows the input layer (the image), the middle layers of nodes called hidden layers, and the output (object detection). The design of this neural network architecture contains five hidden layers. The information runs through these nodes and seems to become more compressed deeper in the network. [x] There were two ways of running the script, showing the process as text or as images. When executing in text only, you can see that it singles out the ten best ‘performing’ ones, classified and ranked by their percentage of probability. [x] The script is designed to always provide you with an answer, even if it was not trained on the input you feed it. The machine’s script determines the “style” of ‘seeing’ and is interrelated with how it ‘wants’ to see the world.
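The compression visible in the deeper layers can be sketched with a max-pooling step, one common way a ConvNet shrinks its representation between layers. The numbers below are invented for illustration, not taken from torch-visbox:

```python
def max_pool(grid):
    """Halve a 2D activation grid by keeping the maximum of each
    2x2 block - deeper layers keep less spatial detail."""
    h, w = len(grid), len(grid[0])
    return [[max(grid[r][c], grid[r][c + 1],
                 grid[r + 1][c], grid[r + 1][c + 1])
             for c in range(0, w, 2)]
            for r in range(0, h, 2)]

# Invented 4x4 activations from one hidden layer.
layer = [[1, 3, 2, 0],
         [4, 2, 1, 1],
         [0, 1, 5, 6],
         [2, 1, 7, 2]]

pooled = max_pool(layer)  # the 4x4 grid shrinks to 2x2
```

Stacking a few of these steps is why the visualization looks ever more compressed the deeper you go.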
There are many types of artificial neural networks, and their number grows exponentially. In order to catch more of a glimpse of AI, I wanted to look deeper into the matter. Only this time I wanted to explore the hidden layers live, with my camera as input.
Ml4a-ofx: https://github.com/ml4a/ml4a-ofx
By Gene Kogan
The script is executed within Xcode and requires openFrameworks to run (Mac). It reveals what convolutional neural networks see by processing real-time footage from the webcam. This time it clearly shows how the network scans the images looking for patterns.
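That “scanning for patterns” is, at its core, a convolution: a small filter slides across the frame and responds where the pixels match its pattern. A minimal pure-Python sketch of my own (not the ml4a-ofx code):

```python
def convolve(image, kernel):
    """Slide a kernel over the image; each output value measures
    how strongly the local patch matches the kernel's pattern."""
    n, k = len(image), len(kernel)
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(k) for j in range(k))
             for c in range(n - k + 1)]
            for r in range(n - k + 1)]

# A vertical-edge filter: it fires where brightness jumps
# from dark (left) to bright (right).
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

image = [[0, 0, 9, 9],
         [0, 0, 9, 9],
         [0, 0, 9, 9],
         [0, 0, 9, 9]]

response = convolve(image, kernel)  # strong response along the edge
```

A trained network does exactly this, but with many filters per layer whose patterns it has learned from the training images rather than being hand-written.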
It is striking that the nodes show us the presence of the features inside. This became clear through some experiments: I exposed the neural network to a ‘dark room’. The performance of machine vision takes place in the hidden layers; the production of images is invisible to us. To me, this refers to photography’s darkroom.
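The dark-room experiment can be reasoned through with the same arithmetic – a schematic illustration of my own, not the actual network: a black frame is a grid of zeros, and any weighted sum over zeros is zero, so the filters have nothing to respond to.

```python
# A 'dark room' frame: every pixel is 0 (black).
black_frame = [[0] * 4 for _ in range(4)]

# An arbitrary 2x2 filter applied to the top-left patch.
filter_2x2 = [[1, -1],
              [-1, 1]]
activation = sum(black_frame[i][j] * filter_2x2[i][j]
                 for i in range(2) for j in range(2))
# No light in, no activation out - the hidden layers stay dark too.
```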
We have to use tools to see beyond our human limitations. [x]
Photography can “lend form to things that were not normally visible to the human eye – providing them the appearance of something permanent and solid, or at least bound by shape and structure.” In the words of László Moholy-Nagy, a ‘New Vision’ that creates a whole new way of seeing the outside world, one the eye alone could not. I had to shed light on these influences from photography.