Getting to know your data with metric learning

Dmitry Kan
2 min read · May 15, 2022

I hope your Spring (or Autumn, if you happen to be in the Southern Hemisphere) is going well so far. In this new episode of Vector Podcast I sat down with Yusuf Sarıgöz, AI Research Engineer at Qdrant, one of the 7 vector databases discussed in the related blog post.

We discussed metric learning, a technique that can be used in data-scarce scenarios when you need to build a production-grade model, for instance a classifier.

Episode of Vector Podcast on metric learning with Yusuf Sarıgöz

You might have only a small amount of labelled data, but if you manage to present the metric learning model with positive and negative samples, it will learn embeddings that are well suited to your end-user model.
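To make the idea of positive and negative samples more concrete, here is a minimal sketch of a triplet loss, one common way to train on such samples. The post does not say which loss is used, so treat the choice of loss, the `triplet_loss` name, the `margin` value and the toy vectors as assumptions made purely for illustration.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hypothetical helper: penalize the embedding when the positive sample
    is not at least `margin` closer to the anchor than the negative sample."""
    d_pos = np.linalg.norm(anchor - positive)  # distance anchor -> positive
    d_neg = np.linalg.norm(anchor - negative)  # distance anchor -> negative
    return max(0.0, d_pos - d_neg + margin)

# Toy 2-dimensional embeddings (made-up values)
anchor = np.array([0.0, 0.0])
positive = np.array([0.0, 1.0])       # same class as the anchor
far_negative = np.array([3.0, 4.0])   # different class, already far away
near_negative = np.array([1.0, 0.0])  # different class, too close

well_separated = triplet_loss(anchor, positive, far_negative)
too_close = triplet_loss(anchor, positive, near_negative)
```

When the negative is already far enough away, the loss is zero and nothing needs to change. When it sits as close as the positive, the loss is positive, and training would push the two apart.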

To implement metric learning in the form of a neural network, we can use an AutoEncoder architecture:

Metric learning with AutoEncoder neural network

The input sample X with dimension K is passed through the Encoder, which computes an embedding of dimension N << K. The Decoder then reconstructs the input sample as X’ from the embedding. The network optimizes for the objective D(X, X’) -> 0, that is, the distance between the input sample and its reconstruction should be as close to 0 as possible.
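The Encoder/Decoder loop described above can be sketched in a few lines of NumPy. This is a deliberately simplified, linear autoencoder trained with plain gradient descent, not the architecture discussed in the episode. The synthetic data, the sizes K = 8 and N = 2, the learning rate and the use of mean squared error as the distance D are all assumptions chosen to keep the example small.

```python
import numpy as np

rng = np.random.default_rng(0)

K, N = 8, 2          # input dimension K, embedding dimension N << K
num_samples = 200

# Synthetic data lying close to an N-dimensional subspace of R^K,
# so that a small embedding is enough to reconstruct it.
latent = rng.normal(size=(num_samples, N))
mixing = rng.normal(size=(N, K))
X = latent @ mixing + 0.01 * rng.normal(size=(num_samples, K))

# Linear Encoder (K -> N) and Decoder (N -> K), small random initial weights
W_enc = 0.1 * rng.normal(size=(K, N))
W_dec = 0.1 * rng.normal(size=(N, K))

def encode(x):
    return x @ W_enc

def decode(z):
    return z @ W_dec

def reconstruction_loss(x):
    # D(X, X'): mean squared error between input and reconstruction
    return float(np.mean((decode(encode(x)) - x) ** 2))

initial_loss = reconstruction_loss(X)

learning_rate = 0.01
for step in range(5000):
    Z = encode(X)        # embeddings
    err = decode(Z) - X  # reconstruction error X' - X
    # Gradients of the squared error averaged over samples
    # (constant factors are absorbed into the learning rate)
    grad_dec = Z.T @ err / num_samples
    grad_enc = X.T @ (err @ W_dec.T) / num_samples
    W_dec -= learning_rate * grad_dec
    W_enc -= learning_rate * grad_enc

final_loss = reconstruction_loss(X)
print(f"reconstruction loss: {initial_loss:.4f} -> {final_loss:.6f}")
```

A real setup would typically use a deep learning framework, non-linear layers and mini-batches, but the objective stays the same: drive the distance between X and X’ towards 0.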

Once the optimum is reached across all input samples, we can discard the Decoder part of our network and use the Encoder to compute embeddings for our data. Once you have the embeddings, you can store them in a vector database for efficient retrieval during classification.
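As a rough illustration of this last step, the sketch below keeps only an Encoder, embeds a few labelled samples, and classifies a new sample by looking up its nearest neighbours. The fixed `W_enc` weights stand in for a trained Encoder, the in-memory `index` array stands in for a vector database, and the samples and labels are invented for the example. None of this is the API of Qdrant or of any other vector database.

```python
import numpy as np

# Hypothetical stand-in for a trained Encoder: a fixed linear map K=4 -> N=2.
# In practice these weights would come from the trained network.
W_enc = np.array([
    [1.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 1.0],
])

def encode(x):
    return np.atleast_2d(np.asarray(x, dtype=float)) @ W_enc

# A small labelled set (made-up values and labels)
labelled_X = np.array([
    [1.0, 0.9, 0.0, 0.1],
    [0.9, 1.0, 0.1, 0.0],
    [0.0, 0.1, 1.0, 0.9],
    [0.1, 0.0, 0.9, 1.0],
])
labels = ["class_a", "class_a", "class_b", "class_b"]

# In-memory stand-in for a vector database: one embedding per labelled sample
index = encode(labelled_X)

def classify(x, k=1):
    """Return the majority label among the k nearest stored embeddings."""
    query = encode(x)[0]
    distances = np.linalg.norm(index - query, axis=1)
    nearest = np.argsort(distances)[:k]
    votes = [labels[i] for i in nearest]
    return max(set(votes), key=votes.count)

prediction = classify([0.8, 0.9, 0.2, 0.0])
```

A vector database plays the role of `index` and the distance search here, typically with approximate nearest neighbour search so that lookups stay fast as the number of embeddings grows.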

The episode is available on YouTube, Spotify and Apple Podcasts.

I’ve also received a request to publish the RSS feed of the podcast episodes, so that you can plug it into your favorite podcast software. Here it is:

https://media.rss.com/vector-podcast/feed.xml

You will find lots of links to papers, blogs and tools on the topic of metric learning to optimize your learning process. Good luck, and remember to subscribe to the podcast to get new episodes in your stream.

Written by Dmitry Kan

Founder and host of Vector Podcast, software engineer, product manager, but also: cat lover and cyclist. Host: https://www.youtube.com/c/VectorPodcast
