The current similar items implementation is tightly coupled to MongoDB.
Refactoring plan for similar items
- The main function for `similar_items` is still `search_similar_items`
- `search_similar_items` takes the main parameters (e.g. `item_id`, `venue_ids`, `excluded_brands`) and 2 repositories:
- `EmbeddingRepository`: retrieves the embedding for an item id
- `VectorSearchRepository`: searches for similar items based on the embedding returned by `EmbeddingRepository`
- For `EmbeddingRepository`, we will have `MongoDBEmbeddingRepository` (fetches the item embedding from MongoDB, as we do right now) and later `FeaturePlatformEmbeddingRepository`, which fetches the item embedding from the feature platform
- For `VectorSearchRepository`, we will have `MongoDBVectorSearchRepository` (fetches similar items from MongoDB, as we do right now) and `ArgoVectorSearchRepository`, which fetches similar items from Argo
- With those 2 repositories, we will be able to swap different implementations into `search_similar_items`, allowing us to try out different approaches without changing the function itself
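
A minimal sketch of how the plan above could look in Python. It is illustrative only: the method names (`get_embedding`, `search`), the `limit` parameter, the `Item` dataclass and the two in-memory repositories are assumptions made for this example, not the actual code. The in-memory classes stand in for `MongoDBEmbeddingRepository` / `MongoDBVectorSearchRepository` so the example runs without any external service. Dropping the query item from its own results is also an assumption.

```python
from __future__ import annotations

import math
from dataclasses import dataclass
from typing import Protocol, Sequence


class EmbeddingRepository(Protocol):
    """Retrieves the embedding for an item id (hypothetical interface)."""

    def get_embedding(self, item_id: str) -> Sequence[float] | None: ...


class VectorSearchRepository(Protocol):
    """Searches for similar items given an embedding (hypothetical interface)."""

    def search(self, embedding: Sequence[float], venue_ids: Sequence[str],
               excluded_brands: Sequence[str], limit: int) -> list[str]: ...


@dataclass(frozen=True)
class Item:
    # Hypothetical record shape, used only by the in-memory stand-ins below.
    item_id: str
    venue_id: str
    brand: str
    embedding: tuple[float, ...]


class InMemoryEmbeddingRepository:
    """Stand-in for MongoDBEmbeddingRepository: looks embeddings up in a dict."""

    def __init__(self, items: Sequence[Item]) -> None:
        self._by_id = {item.item_id: item for item in items}

    def get_embedding(self, item_id: str) -> Sequence[float] | None:
        item = self._by_id.get(item_id)
        return item.embedding if item else None


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


class InMemoryVectorSearchRepository:
    """Stand-in for MongoDBVectorSearchRepository: brute-force cosine search."""

    def __init__(self, items: Sequence[Item]) -> None:
        self._items = list(items)

    def search(self, embedding, venue_ids, excluded_brands, limit):
        candidates = [
            item for item in self._items
            if item.venue_id in venue_ids and item.brand not in excluded_brands
        ]
        # Most similar first.
        candidates.sort(key=lambda item: _cosine(embedding, item.embedding), reverse=True)
        return [item.item_id for item in candidates[:limit]]


def search_similar_items(item_id: str, venue_ids: Sequence[str],
                         excluded_brands: Sequence[str],
                         embedding_repo: EmbeddingRepository,
                         vector_search_repo: VectorSearchRepository,
                         limit: int = 10) -> list[str]:
    embedding = embedding_repo.get_embedding(item_id)
    if embedding is None:
        return []  # unknown item: nothing to search with
    # Ask for one extra hit, since the query item may match itself (assumption).
    hits = vector_search_repo.search(embedding, venue_ids, excluded_brands, limit + 1)
    return [hit for hit in hits if hit != item_id][:limit]


items = [
    Item("a", "v1", "x", (1.0, 0.0)),
    Item("b", "v1", "y", (0.9, 0.1)),
    Item("c", "v1", "z", (0.0, 1.0)),
    Item("d", "v2", "y", (1.0, 0.0)),     # other venue, filtered out
    Item("e", "v1", "bad", (1.0, 0.01)),  # excluded brand, filtered out
]
result = search_similar_items(
    "a", ["v1"], ["bad"],
    embedding_repo=InMemoryEmbeddingRepository(items),
    vector_search_repo=InMemoryVectorSearchRepository(items),
    limit=2,
)
print(result)
```

Because `search_similar_items` depends only on the two protocols, trying `FeaturePlatformEmbeddingRepository` or `ArgoVectorSearchRepository` would, under this sketch, only mean passing a different object for `embedding_repo` or `vector_search_repo`.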