Abstract

We present a web application that facilitates multimodal search within institutional image collections using current-generation machine learning models such as CLIP. Further, we discuss image retrieval as a combined computer vision and human-computer interaction problem, and propose that the standardization of feature extraction is one of the main problems that digital art history faces today.

Find the web application itself at imgs.ai.