I have used VGG-16, a machine vision model developed by Karen Simonyan and Andrew Zisserman for object detection and image classification, to organise the photographs within the kolam. Ideally, I would have developed my own model trained on old photographs, but I didn't have the resources to gather large numbers of photographs, to hire image annotators or to access the computational power required.

The model is trained on images from ImageNet, one of the most influential visual datasets in the field of machine learning. It contains more than 14 million images scraped from the Internet, labelled by Amazon Mechanical Turk crowdworkers and organised into more than 21,000 categories.

However, like the vast majority of machine vision models, VGG-16 detects objects in images and categorises them using just 1,000 classification classes:

The categories of ImageNet were created using WordNet, a lexical database of English nouns, verbs, adjectives and adverbs grouped into sets expressing distinct concepts. The inability of these inherited categories to describe abstract concepts, such as ‘fear’ or ‘happiness’, and the racial and gender biases embedded in them have been publicly criticised and debated.

Given the limited and problematic set of 1,000 categories available to VGG-16, its classifications are often incorrect:

This misleading result is partly produced by labels that are not relevant to these types of images. For example, it is very unlikely that a European Fire Salamander (category id 25) appears in these photographs, while a category like bottu or bindi (the red dot applied between the eyebrows) would have been helpful.

Kimono (category id 614), the traditional Japanese garment, is probably the closest match to something like a saree, the traditional Indian garment. We should also remember that the images from my family album and stars.archive are quite different from the mostly born-digital images of the ImageNet database.
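For readers curious how these labels are produced, here is a minimal sketch using the VGG-16 weights bundled with Keras; this is a common way of running the model, not necessarily my exact setup, and the file name is hypothetical.

```python
import numpy as np
from tensorflow.keras.applications.vgg16 import (
    VGG16, preprocess_input, decode_predictions)
from tensorflow.keras.preprocessing import image

model = VGG16(weights="imagenet")  # the 1,000-class ImageNet classifier

# "album_photo.jpg" is a hypothetical file name standing in for any photograph.
img = image.load_img("album_photo.jpg", target_size=(224, 224))
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))

# decode_predictions maps the 1,000 output probabilities back to
# human-readable ImageNet labels, e.g. 'European_fire_salamander' or 'kimono'.
for _, label, score in decode_predictions(model.predict(x), top=5)[0]:
    print(label, score)
```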

Instead of having the model classify the images from stars.archive using preset categories, I am interested in something called an embedding. Before the model assigns labels from the 1,000 categories to an image, it creates a unique series of numbers (an embedding) for that image, which it later uses to classify it. This embedding is a mathematical representation of the image. The method by which the model assigns numerical values to parts of the image is tweaked and optimised over time; researchers run tests until the predictions become more and more ‘accurate’. Similar images, in theory, should have embeddings that are closer in value. For example, the difference between the embeddings of two images of ‘men wearing ties’ should be smaller than the difference between the embedding of an image of ‘a man wearing a tie’ and that of an image of ‘a fire truck’.

In some respects, it doesn't matter how incorrect the labels are, as long as similar images receive similar embeddings in the step before categorisation.
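As a minimal sketch of how such an embedding can be extracted, the snippet below reads out the activations of the penultimate fully connected layer (‘fc2’) of the Keras-bundled VGG-16. This is one common approach; it is an assumption on my part rather than the exact pipeline behind the vectors printed below, which may have been further reduced in dimensionality.

```python
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing import image

# Load VGG-16 with its ImageNet weights, then expose the 4096-dimensional
# 'fc2' layer (just before the 1,000-way classification) as the output.
base = VGG16(weights="imagenet")
embedder = Model(inputs=base.input, outputs=base.get_layer("fc2").output)

def embed(path):
    # VGG-16 expects a 224x224 RGB image, preprocessed in its specific way.
    img = image.load_img(path, target_size=(224, 224))
    x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
    return embedder.predict(x)[0]  # a vector of numbers: the embedding
```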

For example, the embedding for this image:

is:
[ 1.91794758e+01 2.45684185e+01 -1.48634014e+01 1.00456610e+01 1.23722658e+01 3.67962575e+00 1.86768460e+00 -5.08265734e+00 2.82952571e+00 -1.82983720e+00 -1.37407389e+01 1.27371275e+00 -2.83554959e+00 1.48696537e+01 -9.56767321e-01 1.30410614e+01 1.56107402e+00 -5.20693350e+00 6.45998001e+00 -4.87222195e+00 4.18484402e+00 2.02562189e+00 1.62268889e+00 6.55196619e+00 -3.79729223e+00 9.08865213e-01 -4.80334014e-01 3.64598536e+00 2.87132716e+00 1.45120323e-02 -1.86411238e+00 2.61045027e+00 1.78066015e+00 2.33796835e+00 -1.94519413e+00 -1.07703531e+00 7.75017262e-01 1.07955289e+00 -3.98576546e+00 9.80376065e-01 5.26948977e+00 2.49461865e+00 2.67107844e+00 3.21161079e+00 2.43223262e+00 1.71105647e+00 2.77814841e+00 -1.22377649e-01 -5.11695576e+00 -4.16733980e+00 -2.58463597e+00 3.96219110e+00 -1.90711915e+00 2.47059897e-01 -2.32500839e+00 -1.60414040e-01 -5.07035303e+00 2.42650795e+00 -3.34037542e-01 1.68242824e+00 5.26408195e-01 -2.52320242e+00 -8.63885343e-01 1.78900170e+00 -1.23478949e+00 5.99599659e-01 3.45318055e+00 -2.19646502e+00 5.45571625e-01 2.45045096e-01 -2.75517106e+00 -1.44544095e-01 -8.69867802e-01 -1.63786530e+00 -1.37256712e-01 -4.81543827e+00 -2.63040376e+00 2.89736366e+00 4.22516489e+00 8.83858562e-01 4.44634581e+00 3.95269275e+00 -6.53525293e-01 -6.16135776e-01 -2.35084200e+00 -2.76584029e-02 6.25337064e-01 1.61569691e+00 -8.54805112e-01 1.72879767e+00 -3.45003515e-01 2.26746488e+00 4.09944773e+00 2.24842310e+00 3.15522969e-01 -6.14471555e-01 -2.87589818e-01 3.92144263e-01 2.94887733e+00 -1.43602991e+00 -1.35511529e+00 -1.95796967e+00 1.34478807e+00 -2.93180680e+00 -1.86152124e+00 -2.01161075e+00 4.60683680e+00 1.30669916e+00 1.25809121e+00 -2.18553036e-01 8.61385822e-01 1.95201397e+00 3.50504255e+00 7.50711799e-01 -2.04655600e+00 -9.43926513e-01 4.96668756e-01 1.10160053e-01 3.54241967e-01 9.09713864e-01 2.38036990e+00 -1.31741774e+00 -2.44925737e+00 1.64631522e+00 1.11341393e+00 1.35700965e+00 -1.38343012e+00 4.72699106e-01 4.72502738e-01 2.89054537e+00 2.44591951e-01 -3.90299439e-01 -6.78489387e-01 2.36203742e+00 -1.68411446e+00 -8.83051515e-01 -2.78843880e-01 -2.56786275e+00 -1.00011373e+00 1.04334641e+00 -4.61221159e-01 3.97799224e-01 9.04500484e-04 1.60928559e+00 -2.47642875e+00 1.29292533e-01 7.82566145e-02 -8.67262840e-01 5.52608311e-01 3.39104271e+00 -8.92583787e-01 2.21043205e+00 -1.06479669e+00 -5.92290044e-01 -1.02788627e-01 9.00046885e-01 -2.02432442e+00 -2.95873284e-02 1.08827424e+00 -6.82543874e-01 5.28323352e-01 -2.82444894e-01 -3.99644554e-01 -1.87972867e+00 -6.41972721e-01 -1.41761875e+00 -1.04742336e+00 4.30806786e-01 -6.55426145e-01 -4.04562056e-03 -5.76043725e-01 -4.61630583e-01 -4.09543514e-01 3.32864904e+00 1.22535360e+00 3.17406297e+00 6.27566218e-01 4.49247241e-01 -8.92385483e-01 6.72171295e-01 -3.64607871e-01 -8.71325195e-01 -6.57851756e-01 -1.02997088e+00 -1.42854404e+00 -7.70915031e-01 -6.37771368e-01 -4.93319094e-01 3.81817371e-01 -5.55320323e-01 -2.09226227e+00 -7.80385435e-01 -1.67794597e+00 -4.49801832e-01 -6.85872793e-01 2.21424580e+00 4.23316479e+00 -6.19300008e-01 1.86850464e+00 -3.34484614e-02 3.88091350e+00 -2.42278114e-01 6.84221506e-01 -8.82503688e-01 2.12381554e+00 8.49718750e-01 -2.18842477e-01 -6.81617916e-01 -2.72165871e+00 5.63951731e-01 1.34849000e+00 7.51719236e-01 -1.12067580e+00 1.74883485e+00 1.72191286e+00 9.62925732e-01 4.30905968e-01 2.55084777e+00 2.16137981e+00 9.20611978e-01 -7.78484702e-01 9.10550475e-01 7.25476146e-01 -2.22245902e-01 -2.53118634e-01 4.85176146e-01 -5.03149629e-01 1.59352708e+00 1.73332858e+00 
-1.19019949e+00 4.07134265e-01 -1.17331676e-01 2.63678288e+00 -9.51610267e-01 -9.12574768e-01 -7.83781767e-01 -1.06167269e+00 -4.37193990e-01 -5.36050856e-01 6.24245703e-01 -3.14469635e-02 -2.35921097e+00 1.75781369e+00 -9.93272722e-01 -1.51024699e+00 -1.61797667e+00 1.11819232e+00 3.06039572e-01 -1.61693978e+00 3.64957690e-01 4.65739518e-01 7.12308809e-02 1.37075335e-02 -6.52577877e-01 1.46927685e-02 -1.13664818e+00 2.54228055e-01 1.05104399e+00 4.17230837e-02 7.13630170e-02 1.21708453e+00 9.03380990e-01 -8.21467116e-02 -7.59055734e-01 -3.96756232e-01 -1.06059361e+00 9.26356852e-01 -2.47443207e-02 -1.37211287e+00 -5.55631638e-01 -1.09995258e+00 9.90472436e-02 -1.16855562e+00 -3.36859494e-01 -4.77764785e-01 1.14232063e+00 -2.68011713e+00 2.02437788e-01 1.65829599e+00 8.72483909e-01 -3.99105817e-01 4.87966716e-01 -1.04135334e-01 -1.77713990e-01 6.71906590e-01 1.76185417e+00 3.88295412e-01 -1.08416092e+00 9.82434750e-02 -3.77404451e-01 -2.53587574e-01 1.14483452e+00 -9.93044376e-01 1.16938555e+00 4.82089259e-02 5.48468471e-01 1.48385793e-01 -4.37964976e-01 -3.75440359e-01 -5.34462452e-01]

The embedding for this image

is:
[ 1.39509230e+01 1.57726021e+01 -1.39290342e+01 1.53334646e+01 1.88452137e+00 -8.16810894e+00 2.29300952e+00 -1.72023892e+00 5.74233770e+00 -7.48942995e+00 -1.37936316e+01 4.39429283e-02 -3.44045138e+00 9.43470573e+00 4.76403093e+00 1.45046339e+01 7.12236091e-02 1.43390208e-01 5.68005514e+00 -1.48921275e+00 1.46231210e+00 2.20108795e+00 7.35101759e-01 1.14242649e+00 -1.91482663e+00 -2.41836381e+00 7.03861237e+00 5.00103569e+00 4.17246675e+00 5.65432596e+00 -5.29990768e+00 -3.74641800e+00 4.68931627e+00 -1.25669646e+00 -4.84356451e+00 -1.92194963e+00 1.13912916e+00 -3.63552499e+00 -3.31987286e+00 -1.95919001e+00 -7.68998504e-01 4.78142738e+00 3.70770311e+00 -1.76924789e+00 3.50682306e+00 -6.87864721e-01 1.84991884e+00 1.37862384e+00 -2.10066748e+00 -7.68206954e-01 -9.06196296e-01 -4.01076889e+00 -2.59839535e+00 2.32850626e-01 -1.00757694e+00 -1.20353532e+00 -2.06982851e+00 -1.78817451e+00 1.58575892e+00 3.17387283e-01 -4.46884155e+00 3.97757292e-01 -2.68035591e-01 -3.34565687e+00 7.80928671e-01 -1.53702700e+00 2.37716842e+00 9.92273390e-01 2.08555889e+00 -3.00560594e+00 -3.57237172e+00 1.41698670e+00 -1.95569432e+00 4.26490784e-01 -1.63480723e+00 -2.27633381e+00 -8.08896601e-01 4.04194266e-01 4.05652905e+00 -1.12160891e-01 -1.45051014e+00 -1.78686154e+00 8.09084535e-01 1.77147672e-01 9.33111548e-01 1.07338572e+00 8.99628341e-01 -1.84138298e+00 -3.51665109e-01 -7.59162664e-01 -1.08349502e+00 -1.03144896e+00 2.89804316e+00 2.60064363e+00 4.09009123e+00 -8.09401155e-01 -4.56716156e+00 -4.50304747e-01 -1.42174351e+00 -2.19219351e+00 -2.04563469e-01 1.86300039e+00 3.20406938e+00 1.44611990e+00 1.61961150e+00 -2.05519319e+00 2.30542397e+00 -1.60238683e+00 4.89198089e-01 -1.74054217e+00 -6.79731548e-01 7.04344213e-01 9.44294155e-01 1.16075671e+00 -2.04535395e-01 4.49702215e+00 2.05011159e-01 -5.03859967e-02 9.97282386e-01 6.58375025e-02 -1.38229835e+00 4.49326247e-01 -2.67472804e-01 -3.04264218e-01 -8.39258730e-01 1.05518758e-01 4.11018580e-01 2.12912291e-01 6.66252255e-01 6.46552503e-01 -1.41004014e+00 2.73475707e-01 4.26153511e-01 9.68831778e-01 -1.07074946e-01 6.70713663e-01 5.41332364e-01 -9.55132484e-01 7.26018012e-01 1.90517890e+00 -1.35173416e+00 -1.90488720e+00 1.67615068e+00 1.46352148e+00 1.33349013e+00 3.54355633e-01 4.60074931e-01 -7.83267021e-01 7.80422926e-01 -2.09911942e-01 -1.23356327e-01 1.32152081e+00 -1.29597366e+00 -2.04120541e+00 1.36339259e+00 -3.81184518e-02 -1.10634100e+00 -3.82290572e-01 -1.17844427e+00 3.03505182e+00 -5.04788041e-01 -4.21788096e-01 -1.53143871e+00 -9.80525732e-01 2.65300369e+00 -1.15192771e-01 9.97484922e-01 5.00579476e-02 5.76818645e-01 1.00528097e+00 -2.24905670e-01 -1.49113774e+00 2.68705726e-01 2.85539556e+00 -3.98975879e-01 -1.57772756e+00 1.93733072e+00 -1.19273293e+00 2.10888028e+00 1.46575105e+00 1.49359614e-01 -1.72821856e+00 1.80095822e-01 1.33534026e+00 -4.83725339e-01 -4.43692833e-01 4.74454701e-01 -7.04340696e-01 -1.73896566e-01 6.39160514e-01 -6.68460190e-01 -6.77079976e-01 9.32900429e-01 9.56034482e-01 3.12611163e-01 5.79683602e-01 1.82459641e+00 -6.77413166e-01 6.85611844e-01 -4.84373599e-01 -1.46384373e-01 -2.86526471e-01 4.21394289e-01 -5.28891742e-01 -3.02784353e-01 -7.52703846e-01 1.42143500e+00 -2.01210904e+00 -2.75162548e-01 -3.70684683e-01 -5.36288381e-01 2.12948847e+00 1.65410280e+00 1.56391427e-01 6.66792512e-01 2.68948138e-01 -1.29685390e+00 -1.46059543e-02 -1.59904373e+00 2.95166016e-01 -9.25642550e-01 -9.10231411e-01 2.58096278e-01 1.18902540e+00 -8.23410302e-02 1.15398794e-01 2.87378252e-01 1.49307108e+00 -4.62432086e-01 
-1.20593572e+00 5.06623328e-01 -1.50804833e-01 -1.56888410e-01 1.27353773e-01 -1.37721586e+00 -1.26553357e+00 5.87539673e-01 -1.12404859e+00 1.21043909e+00 1.19625163e+00 -1.21831465e+00 -2.36697912e-01 4.14913952e-01 -6.80861592e-01 3.07412148e-02 5.09652257e-01 -1.71673745e-01 -6.43420637e-01 -9.84362006e-01 -9.74100649e-01 2.35690176e-02 1.47413343e-01 -1.40842330e+00 8.99301171e-01 -2.12456799e+00 4.25302416e-01 -3.00494224e-01 8.08667421e-01 -2.70122814e+00 -2.07691050e+00 -9.61755514e-01 -6.22252047e-01 6.28189683e-01 -6.53336287e-01 -1.82604864e-01 -8.48317742e-01 -2.17714801e-01 -1.04423857e+00 -3.50459218e-01 2.08377004e+00 -5.58921814e-01 2.73671418e-01 1.57726455e+00 1.26133990e+00 -6.70335472e-01 1.36419249e+00 9.39889550e-01 5.56972623e-01 -4.83564526e-01 8.68434906e-01 -5.11888713e-02 -7.79513896e-01 1.09836310e-01 -8.08777660e-02 8.07954490e-01 -1.13288558e+00 8.39772463e-01 7.30755866e-01 -1.61235243e-01 -1.56402171e-01 1.93634421e-01 2.00064874e+00 -5.93690991e-01 -1.79873213e-01 1.92424357e+00 6.48455679e-01 -1.77089557e-01 -5.82734048e-01 -9.92835402e-01 1.03992963e+00]

The distance between these two embeddings, measured with cosine distance (one minus the cosine similarity, a standard method of measuring how alike two sets of numbers are), is 0.33302491903305054.

Whereas, if we compare the embedding for the first image and the one for this image:


The distance between the embeddings is much higher at 1.0102533549070358.
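Assuming SciPy, distances like these can be reproduced with scipy.spatial.distance.cosine, which returns one minus the cosine similarity. A small sketch, reusing the embed() function from above; the file names are hypothetical.

```python
from scipy.spatial.distance import cosine

# embed() is the sketch from earlier; the file names here are hypothetical.
a = embed("photo_01.jpg")
b = embed("photo_02.jpg")
c = embed("photo_03.jpg")

print(cosine(a, b))  # small value: similar images (e.g. ~0.333 above)
print(cosine(a, c))  # larger value: dissimilar images (e.g. ~1.010 above)
```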

Using these calculations, we can arrange all the images according to similarity, as in the grid you can see in the background of this page. Looking at this grid, I intuitively started to map out groupings.
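The layout method is not spelled out here, but one common way to turn embeddings into a similarity grid, sketched below under my own assumptions, is to project them to two dimensions (with scikit-learn's t-SNE, for example) and snap each point to a free cell of a square grid; the function name grid_layout is hypothetical.

```python
import numpy as np
from sklearn.manifold import TSNE

def grid_layout(embeddings, side):
    """Assign each embedding a unique (row, col) cell on a side x side grid.

    Requires side * side >= len(embeddings).
    """
    # Project the high-dimensional embeddings down to two dimensions,
    # so that similar images land near each other on the plane.
    xy = TSNE(n_components=2, init="pca", random_state=0).fit_transform(
        np.asarray(embeddings))
    # Normalise coordinates into [0, 1] so they map onto grid indices.
    xy -= xy.min(axis=0)
    xy /= np.maximum(xy.max(axis=0), 1e-9)
    cells, taken = [], set()
    for x, y in xy:
        r, c = int(y * (side - 1)), int(x * (side - 1))
        # Greedy collision handling: walk forward until a free cell appears.
        while (r, c) in taken:
            c = (c + 1) % side
            if c == 0:
                r = (r + 1) % side
        taken.add((r, c))
        cells.append((r, c))
    return cells
```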

For example, I saw close-cropped portraits of men, or images with large standing mirrors. I also tried to place more value on meaning than on the number of images within a cluster. Even though there might have been fewer images in a grouping, if that group of images illustrated a cultural aspect that I wanted to give space to, such as the images with standing mirrors that reveal the intricate stitching of flowers into a woman's hair, I included it as a cluster.

Using embeddings instead of label-based categorisation allowed me to avoid the limitations and biases of the VGG-16 model while still employing it to start organising a large collection of photographs. The algorithmic sorting facilitated an initial categorisation, one that I continued and finalised, making meaning of the images together with my family members and researchers from the stars.archive.