Yury Malkov — Staff Engineer, Twitter — Author of the most adopted ANN algorithm HNSW
In the quest to deepen my knowledge in vector search, I’ve read a few most important research papers in the space. One of such papers is co-authored by Yury Malkov, who invented the HNSW algorithm — Hierarchical Navigable Small World Graph. Today this algorithm has been implemented in all vector databases in different languages — Rust, Go, C++ and Java.
In this episode I sat down with Yury to get into the fascinating topic of Approximate Nearest Neighbour (ANN) algorithms, that can handle hundreds of millions of points in a multidimensional space, as well multistage ranking and whether symbolic filtering is a good fit in an ANN algorithm.
You can watch the episode on YouTube:
And listen to it on Spotify:
And Apple Podcasts:
Vector Podcast: Yury Malkov - Staff Engineer, Twitter - Author of the most adopted ANN algorithm…
Topics: 00:00 Introduction 01:04 Yury's background in laser physics, computer vision and startups 05:14 How Yury…
Here are the topics we covered:
01:04 Yury’s background in laser physics, computer vision and startups
05:14 How Yury entered the field of nearest neighbor search and his impression of it
09:03 “Not all Small Worlds are Navigable”
10:10 Gentle introduction into the theory of Small World Navigable Graphs and related concepts
13:55 Further clarification on the input constraints for the NN search algorithm design
15:03 What did not work in NSW algorithm and how did Yury set up to invent new algorithm called HNSW
24:06 Collaboration with Leo Boytsov on integrating HNSW in nmslib
26:01 Differences between HNSW and NSW
27:55 Does algorithm always converge?
31:56 How FAISS’s implementation is different from the original HNSW
33:13 Could Yury predict that his algorithm would be implemented in so many frameworks and vector databases in languages like Go and Rust?
36:51 How our perception of high-dimensional spaces change compared to 3D?
38:30 ANN Benchmarks
41:33 Feeling proud of the invention and publication process during 2,5 years!
48:10 Yury’s effort to maintain HNSW and its GitHub community and the algorithm’s design principles
53:29 Dmitry’s ANN algorithm KANNDI, which uses HNSW as a building block
1:02:16 Java / Python Virtual Machines, profiling and benchmarking. “Your analysis of performance contradicts the profiler”
1:05:36 What are Yury’s hopes and goals for HNSW and role of symbolic filtering in ANN in general
1:13:05 The future of ANN field: search inside a neural network, graph ANN
1:15:14 Multistage ranking with graph based nearest neighbor search
1:18:18 Do we have the “best” ANN algorithm? How ANN algorithms influence each other
1:21:27 Yury’s plans on publishing his ideas
1:23:42 The intriguing question of Why