Exercise 9
- Implement a simple retriever using BM25s (pip install bm25s) https://github.com/xhluca/bm25s
- Use pymupdf to implement a pdf reader.
- Implement a text chunker that creates chunks using chunk_size and chunk_overlap
- Create a pipeline that reads a pdf file, splits its text into chunks, and loads the chunks into the retriever.
- Implement a Streamlit-based UI for retrieving and displaying the top_k most relevant chunks to a given query.
- (Optional) Add an LLM from Mistral to create a RAG chatbot experience.
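
As a starting point for the chunker step, here is a minimal sketch of one possible approach. It assumes that chunk_size and chunk_overlap are measured in characters (the exercise does not say whether characters, words, or tokens are meant), and the function name chunk_text and its default values are illustrative choices, not part of the exercise.

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks.

    Assumption: sizes are counted in characters, not words or tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    # Each new chunk starts this many characters after the previous one.
    step = chunk_size - chunk_overlap

    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1):
        print(chunk)
```

In the pipeline, the text extracted from each pdf page would be passed through a function like this, and the resulting list of strings would serve as the corpus indexed by the retriever. Splitting on raw character offsets can cut words in half; splitting on whitespace or sentences first is a reasonable refinement.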