Vector database provider Qdrant introduces new algorithm for hybrid search

04/07/2024

Vector database startup Qdrant wants to tailor its open-source vector database and search engine to modern AI and search use cases such as Retrieval Augmented Generation (RAG). The company is now introducing a new search algorithm called BM42, which it positions as an alternative to established methods such as BM25 and SPLADE.

BM42 combines the best of both worlds

According to the announcement by Qdrant CTO and co-founder Andrey Vasnetsov, BM42 combines the strengths of the classic BM25 algorithm with those of Transformer-based AI models: the simplicity and interpretability of BM25 on the one hand, and the semantic understanding of Transformer models on the other.

Unlike in classic search applications, documents in RAG systems are typically very short, so the term frequency within a single document carries little statistical signal. BM42 addresses this problem by replacing the term weights within a document with semantic information from a Transformer model. It retains the inverse document frequency (IDF) known from BM25, which measures the importance of a term relative to the entire document collection. Instead of the statistical term frequency within a document, however, BM42 uses attention values from the Transformer model to determine how important a term is for the document as a whole.
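
The scoring idea described here can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Qdrant's implementation: the function names `idf` and `bm42_like_score`, the toy corpus, and the attention weights are invented for the example, and the IDF uses a common BM25 formulation.

```python
import math

def idf(term, docs):
    """BM25-style inverse document frequency over a collection of token lists."""
    n_docs = len(docs)
    n_containing = sum(1 for doc in docs if term in doc)
    return math.log((n_docs - n_containing + 0.5) / (n_containing + 0.5) + 1.0)

def bm42_like_score(query_tokens, doc_attention, docs):
    """Sum IDF(term) * attention weight for each query term found in the document.

    doc_attention maps each token of one document to its [CLS] attention weight
    (hypothetical values here; a real system would take them from a transformer).
    """
    return sum(
        idf(term, docs) * doc_attention[term]
        for term in query_tokens
        if term in doc_attention
    )

# Toy collection of tokenized documents (assumed data, for illustration only).
docs = [
    ["hello", "world", "programming"],
    ["world", "news", "today"],
    ["programming", "languages", "guide"],
]

# Hypothetical attention weights for the first document.
doc0_attention = {"hello": 0.2, "world": 0.5, "programming": 0.3}

score = bm42_like_score(["hello", "programming"], doc0_attention, docs)
print(f"{score:.3f}")
```

The rarer term "hello" contributes more per unit of attention than "programming", which appears in two of the three toy documents. The attention weight takes the place that term frequency occupies in BM25.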

According to Vasnetsov, using attention scores lets BM42 take the semantic meaning of words into account without additional training. It also uses a special tokenization method that is better suited to search tasks. The [CLS] token represents the entire sequence in classification tasks. As the following listing shows, the attention row of the [CLS] token can be used to determine the importance of each token for the document as a whole.

sentences = "Hello, World - is the starting point in most programming languages"

features = transformer.tokenize(sentences)

# ...

attentions = transformer.auto_model(**features, output_attentions=True).attentions

weights = torch.mean(attentions[-1][0,:,0], axis=0)
#                ▲               ▲  ▲   ▲
#                │               │  │   └─── [CLS] token is the first one
#                │               │  └─────── First item of the batch
#                │               └────────── Last transformer layer
#                └────────────────────────── Average all 6 attention heads

for weight, token in zip(weights, tokens):
    print(f"{token}: {weight}")

# [CLS]       : 0.434 // Filter out the [CLS] token
# hello       : 0.039
# ,           : 0.039
# world       : 0.107 // <-- The most important token
# -           : 0.033
# is          : 0.024
# the         : 0.031
# starting    : 0.054
# point       : 0.028
# in          : 0.018
# most        : 0.016
# programming : 0.060 // <-- The third most important token
# languages   : 0.062 // <-- The second most important token
# [SEP]       : 0.047 // Filter out the [SEP] token
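
The filtering step noted in the listing's comments can be sketched as follows. The weights are copied from the output above, with the special tokens written as [CLS] and [SEP]; the helper name `rank_tokens` is hypothetical and only serves the illustration.

```python
# Token weights copied from the listing output.
token_weights = [
    ("[CLS]", 0.434), ("hello", 0.039), (",", 0.039), ("world", 0.107),
    ("-", 0.033), ("is", 0.024), ("the", 0.031), ("starting", 0.054),
    ("point", 0.028), ("in", 0.018), ("most", 0.016),
    ("programming", 0.060), ("languages", 0.062), ("[SEP]", 0.047),
]

SPECIAL_TOKENS = {"[CLS]", "[SEP]"}

def rank_tokens(pairs, special=SPECIAL_TOKENS):
    """Drop special tokens and sort the rest by descending weight."""
    kept = [(tok, w) for tok, w in pairs if tok not in special]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

top3 = [tok for tok, _ in rank_tokens(token_weights)[:3]]
print(top3)  # ['world', 'languages', 'programming']
```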

Developers benefit from BM42

According to the announcement, BM42 offers developers several advantages beyond better search results for short documents and results that remain traceable and interpretable. The algorithm can be integrated easily into existing systems and is efficient, with low memory requirements. Support for different Transformer models gives users additional flexibility. According to Vasnetsov, the best results are currently achieved by combining BM42 with dense embeddings in a hybrid search approach, in which the sparse model handles exact token matching and the dense model handles semantic similarity.
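
The article does not say how the sparse and dense results are merged. One widely used option is Reciprocal Rank Fusion (RRF), sketched below as an assumption rather than as Qdrant's method. The function name, the document IDs, and the two result lists are hypothetical; `k=60` is a commonly used default for the RRF constant.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    Each document receives 1 / (k + rank) from every list it appears in
    (rank starts at 1); documents are returned by descending total score.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

# Hypothetical result lists: one from a sparse (BM42-style) search,
# one from a dense embedding search.
sparse_hits = ["doc_a", "doc_b", "doc_c"]
dense_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([sparse_hits, dense_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, while a document found by only one of the two searches still stays in the fused result.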

On November 20 and 21, 2024, the KI Navigator enters its second round. Organized by DOAG, Heise Medien and de'ge'pol, the event will again take place at the Nuremberg Convention Center East. KI Navigator is a conference on the practical use of AI in the three areas of IT, business and society. It is dedicated to concrete applications of artificial intelligence.

The program includes, among other things, talks on vector databases and Retrieval Augmented Generation. Tickets are available at the early-bird price until September 30.

More details about BM42, as well as technical background on search algorithms such as BM25 and SPLADE, can be found in Andrey Vasnetsov's announcement. Readers who want to join the discussion and get to know the Qdrant project can do so on its Discord channel.

(Map)
