Skip to content

RAGatouille API Server#92

Open
0-hero wants to merge 7 commits into
AnswerDotAI:mainfrom
0-hero:api-server
Open

RAGatouille API Server#92
0-hero wants to merge 7 commits into
AnswerDotAI:mainfrom
0-hero:api-server

Conversation

@0-hero

@0-hero 0-hero commented Jan 30, 2024

Copy link
Copy Markdown

No changes on core RAGatouille module. Just a fastapi addition

Looking for suggestion on the implementation, routes etc...

API Endpoints

Index

  • POST /api/v1/index/: Create an index with documents.

Document Management

  • POST /api/v1/documents/search/: Search documents.
  • [ISSUE] POST /api/v1/documents/add/: Add documents to an existing index.
  • [ISSUE] DELETE /api/v1/documents/delete/: Delete documents from an index.

Encode

  • POST /api/v1/encode/: Encode documents.
  • POST /api/v1/encode/search/: Search through encoded documents.
  • POST /api/v1/encode/clear/: Clear all encoded documents.

Rerank

  • POST /api/v1/rerank/: Rerank a set of documents based on a query.

Known Issues. Still exploring the cause

  1. Add to index doesn’t work
2024-01-31 00:15:41,992 - app.api.endpoints.document_management - ERROR - Failed to add documents to index: Error in void faiss::Clustering::train_encoded(faiss::idx_t, const uint8_t *, const faiss::Index *, faiss::Index &, const float *) at /Users/runner/work/faiss-wheels/faiss-wheels/faiss/faiss/Clustering.cpp:281: Error: 'nx >= k' failed: Number of training points (114) should be at least as large as number of clusters (128)
  1. Search after delete & Search after failed "add" document throws below error.
2024-01-30 18:11:28,756 - app.api.endpoints.search - INFO - Initiating search with query: 'egypt'. Index: 'string', k: 10
2024-01-30 18:11:28,848 - app.api.endpoints.search - ERROR - Search operation failed for query: 'egypt': 0
Traceback (most recent call last):
  File “…/RAGatouille-experiments/serve/app/api/endpoints/search.py", line 13, in search
    results = rag.search(
  File "…/RAGatouille-experiments/ragatouille/RAGPretrainedModel.py", line 292, in search
    return self.model.search(
  File "…/RAGatouille-experiments/ragatouille/models/colbert.py", line 458, in search
    document_id = self.pid_docid_map[id_]
KeyError: 0
INFO:     127.0.0.1:64170 - "POST /api/v1/search/ HTTP/1.1" 500 Internal Server Error

@0-hero

0-hero commented Jan 30, 2024

Copy link
Copy Markdown
Author

@bclavie @anirudhdharmarajan Any thoughts on the above issues?

EDIT: Just saw issue 1 might be related to #71

@bclavie

bclavie commented Jan 30, 2024

Copy link
Copy Markdown
Contributor

Thanks a lot for this! I’m likely not going to be able to review bigger PRs in the next few days but will do so ASAP (or defer to @Anmol6 / @anirudhdharmarajan if they have the bandwidth to do so!)

@0-hero

0-hero commented Jan 30, 2024

Copy link
Copy Markdown
Author

@bclavie No problem, thanks! I’m running additional tests, while trying to fix the above

@0-hero

0-hero commented Feb 7, 2024

Copy link
Copy Markdown
Author

@bclavie gentle bump!

@bclavie

bclavie commented Feb 7, 2024

Copy link
Copy Markdown
Contributor

Hey, sorry for the delay in reviewing this, still getting some random bouts of poor health (COVID really is tricky!)

As for add_to_index, I think it's fine for it to have this known issue, as the issue is being fixed by @anirudhdharmarajan and isn't related to this service (the function itself has a fundamental issue atm). Moreover, I think this style of on-the-fly serving would work better with an HNSW style index than the optimised one currently used if CRUD is an issue!

Things are looking pretty good at first glance and I really do appreciate it, it lowers the barrier to entry to even more!

A nitpick: could you migrate the dependency management to poetry, or, at least (I don't mind this option!) pin the versions in requirements.txt? Some of those libs move fast and it'd be best to have them pinned!

@0-hero

0-hero commented Feb 13, 2024

Copy link
Copy Markdown
Author

Sorry to hear that @bclavie. Updated the requirements & migrated the dependency management to poetry

@0-hero

0-hero commented Feb 14, 2024

Copy link
Copy Markdown
Author

@bclavie do you think this is ready for the merge?

@hiepxanh

Copy link
Copy Markdown

@bclavie hi, what do you think, I love this feature so I can implement it and deploy on docker <3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants