I would like my model to know the code libraries I use and help me write code with them. I use llama.cpp’s server and web UI for inference, but I have no clue how to get started with RAG, since it seems it is not natively supported with llama.cpp’s server implementation. It almost looks like I would need to code my own agent.
I am not interested in commercial offerings or APIs. If you use RAG, how do you do it?
You can use something like Anything LLM for RAG:
https://github.com/Mintplex-Labs/anything-llm
It works with local models.
https://docs.anythingllm.com/agent/usage#what-is-rag-search-and-how-to-use-it