Photo by Benjamin Suter on Unsplash

This year’s Berlin Buzzwords focused in particular on what may become the future of search: vector search. Other really cool topics included scaling Kafka, distributed systems tracing with OpenTelemetry and increasing job satisfaction. A few sessions dedicated to vector search included some impressive demos of dense retrieval techniques that enable question answering in your text lake.

In the “Ask Me Anything: Vector Search!” session, Max Irwin and I discussed major topics of vector search, ranging from its areas of applicability, to how it compares to good ol’ sparse search (TF-IDF/BM25), to its readiness for prime…


Making Sense of Big Data

Neural Search in Elasticsearch: from vanilla to KNN to hardware acceleration

BERT (Image via Flickr, licensed under CC BY-SA 2.0 / background blurred by author)

In two previous blog posts on my journey with BERT, “Neural Search with BERT and Solr” and “Fun with Apache Lucene and BERT”, I’ve taken you through what it takes to enable semantic search powered by BERT in Solr (in fact, you can plug in any dense embedding method other than BERT, as long as it outputs a float vector; a binary vector can also work). While it feels cool and modern to empower your search experience with a tech like BERT, making it performant is still important for productization. You want your search engine operations…


In the era of Internet distractions, I find podcasts one of the best sources of knowledge (after books). I seek shows that educate me in some direction and give tangible value, as opposed to “just storytelling”. The following is a succinct list of podcasts that worked for me in three areas of my interest: profession; business; history, culture, science and entertainment.

Photo by Juja Han on Unsplash


In his book on communication secrets, “Five Stars”, Carmine Gallo talks about SCARF, an acronym introduced by David Rock that reflects the foundation of high-performing organizations.

More so when we are remote, but also when we are in the office (hopefully someday!), we care about being connected, recognized and supportive, and about being good team players. What makes a good team truly good for all of us? I’d like to share a little Slack app for you to try out and improve your team and company culture: Wowwlr. But first, let’s see what the integral components of a high-performing team are.

Enable your team to rock

Rock presents…


After publishing the blog post on neural search with BERT and Solr (6.6.0), I got a few questions on how to run this with version 8.6.x of Solr. Solving this took me a few days of going back and forth, quite honestly a bit of despair, and finally a helpful hint from the Lucene committer Adrien Grand (https://twitter.com/jpountz/status/1324093784460873731). I thought I’d share a few bits on what it took to upgrade vector query functionality from Solr 6.6 to 8.6.x, and also explain the nitty-gritty details of storing the dense embedding in Lucene and querying it in Solr.

Bert with Lucene in mind

Background

The…


It is exciting to read about the latest research advances in computational linguistics. In particular, the better the language models we build, the more accurate the downstream NLP systems we can design.

Update: if you are looking to run neural search with the latest Solr versions (starting with version 8.x), I have just published a new blog post where I walk you through the low-level implementation of the vector format and search, and the story of upgrading from 6.x to 8.x: https://medium.com/@dmitry.kan/fun-with-apache-lucene-and-bert-embeddings-c2c496baa559

Bert in Solr hat

Having a background in production systems, I have a strong conviction that it is important to deploy the latest theoretical achievements into real-life systems. …


How often do you project your mental image of a great working environment onto your current company setting? How often do you look for ways to improve the environment in the hope of attracting more talent or retaining and motivating your current staff? How frequently do you, as a staff member, think about your role within the organisation and your ability to connect with other individuals and teams to achieve goals?

Microsoft office in Finland (Espoo). Why is it here? Read on (copyright: Dmitry Kan, 2019)

As AlphaSense was prepping to announce its Series B funding round led by Eric Schmidt’s Innovation Endeavors, I worked for a month in New York City, interacting closely with…


Long gone are the days when company culture did not matter or was a second-class citizen. Today, when choosing a company to work for, above all you choose the culture (maybe without even realizing it, thinking that you are after technology or product). When you look at the job openings or office photos with employee smiles and a generally cheerful atmosphere, you will likely not see the culture of the company. You may get a glimpse of it during the interview process, but it is not enough.

Why is culture important?

What is culture? Citing Wikipedia:

Culture (/ˈkʌltʃər/) is the social behavior and…


The Martian is my top favourite movie (and book), one that shows in action the excitement around engineering professions. Mark Watney, left alone on Mars, fights for his life with all the knowledge and skills he has, from botany to chemistry and physics. Of all engineering professions, software engineering is probably the most booming right now in light of Artificial Intelligence breakthroughs. But does this profession have ethical aspects that we, as engineers and humans, need to be continuously thinking about?

I began to follow the work of Yuval Noah Harari and his call to humanity on a…


Many machine learning and deep learning problems are aimed at building a mapping function of roughly the following form:

Input X --> Output Y,

where:

X is some sort of an object: an email text, an image, a document;

Y is either a single class label from a finite set of labels (like spam / no spam, a detected object or a cluster name for the document) or some number (like next month’s salary or a stock price).
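To make the mapping above a bit more tangible, here is a minimal Python sketch of both flavours of Y: a class label and a number. Everything in it is an assumption made up for illustration (the function names, the keyword list and the coefficients); a real system would learn the mapping from data rather than hard-code it.

```python
from typing import Set

# Hypothetical keyword list, invented for this sketch only.
SPAM_MARKERS: Set[str] = {"lottery", "winner", "free"}


def classify_email(text: str) -> str:
    """X is an email text, Y is a class label from a finite set."""
    words = set(text.lower().split())
    # Label the email as spam if it shares any word with the marker list.
    return "spam" if words & SPAM_MARKERS else "no spam"


def predict_salary(years_of_experience: float) -> float:
    """X is a number describing a person, Y is some number (a salary)."""
    base, per_year = 3000.0, 250.0  # made-up coefficients, not learned
    return base + per_year * years_of_experience


if __name__ == "__main__":
    print(classify_email("You are a lottery winner"))
    print(classify_email("Meeting moved to noon"))
    print(predict_salary(4))
```

In both cases the shape is the same: one object goes in, and one label or one number comes out.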

While such tasks can be daunting to solve (like sentiment analysis or predicting stock prices in real time), they require rather clear…

Dmitry Kan

Founder, tech team lead, software engineer, manager, but also: cat lover and cyclist. Follow me on Twitter: twitter.com/@DmitryKan
