We often think of ‘seeing’ as sitting at the intersection of the physiological and the psychological, of the natural and the cultural. In this view, the physiological and natural have to do with the basic ‘wiring’ of the eye and the brain, whilst the psychological and cultural come about from visual experience, language, historical context, class, gender, etc.; that is, from the visual information from which we have learned.

The borders of ‘the cultural’ are contested and paradoxical, but when this way of thinking is applied to machine vision, it necessarily foregrounds training data (qua visual experience) as the site of culturally situated vision. This ignores the fact that the neural ‘wirings’ (computer vision algorithms) are themselves artificial technologies. For instance, by training a convolutional network entirely on fractals, one can create a workable neural machine vision algorithm with no extrinsic training data, yet this algorithm obviously also has a ‘way of seeing’. I seek to redress the balance by investigating the ideologies of vision (scopic regimes, if one prefers) of the neural wiring itself: of max-pooling, the encoder-decoder architecture, skip-connections, the triplet loss, dropout, momentum, attention layers, etc.