Vector Databases Are the Wrong Abstraction
published on 2024/10/31
Vector databases treat embeddings as independent data, divorced from the source data from which embeddings are created, rather than what they truly are: derived data. By treating embeddings as independent data, we’ve created unnecessary complexity for ourselves.
In this post, we'll propose a better way: treating embeddings more like database indexes through what we call the "vectorizer" abstraction. This approach automatically keeps embeddings in sync with their source data, eliminating the maintenance costs that plague current implementations.