Building a Memory System
What I learnt from building a memory system for faffit.com.

Context: faffit is a personal assistant as a service; it lives inside the user's WhatsApp.

Problem Statement: Pull various user details (address, office, etc.), persona (recent baby, constant travel between London and Mumbai, etc.), and preferences (likes Italian food, etc.) from a conversation that spans years and hundreds of completed tasks.
Approach 1: The intuitive thing to do was to chunk the messages into batches, extract information from each batch, and finally combine the outputs together.
If you think about what we did there, we made three passes to compress the chat history into the memory document:
- A first PROMPT that takes each batch and outputs some interesting things
- A second pass that combines them all and removes duplicates
- A third pass that collapses everything into three categories: details, persona, preferences
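The three passes above can be sketched roughly as the pipeline below. This is an illustrative sketch, not faffit's actual code: `call_llm` is a hypothetical stand-in for a real model call, stubbed here so the shape of the pipeline is clear.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; replace with your provider's API."""
    return f"<llm output for: {prompt[:40]}...>"

def extract_batch(messages: list[str]) -> str:
    """Pass 1: pull interesting facts out of one batch of messages."""
    return call_llm("Extract user details/persona/preferences:\n" + "\n".join(messages))

def dedupe(extractions: list[str]) -> str:
    """Pass 2: combine the extractions and remove duplicates."""
    return call_llm("Combine and deduplicate these notes:\n" + "\n".join(extractions))

def categorize(combined: str) -> str:
    """Pass 3: collapse into details / persona / preferences."""
    return call_llm("Sort into details, persona, preferences:\n" + combined)

def build_memory(history: list[str], batch_size: int = 50) -> str:
    batches = [history[i:i + batch_size] for i in range(0, len(history), batch_size)]
    extractions = [extract_batch(b) for b in batches]
    return categorize(dedupe(extractions))
```

Note that every arrow in this pipeline is lossy: by the time `categorize` runs, it only sees the output of `dedupe`, not the original messages.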
At every stage you lose significant context the AI needs to do the job correctly. Sure, you can attach messages as citations to help preserve context through each pass, but every action the LLM took there was a decision (an inference made by the AI), and the context behind that decision is lost.
Approach 2: Now think of a different system: one with a live memory document, and chats batched by task. The LLM's inputs are the previous memory document (starting with an empty one) and the new task-related messages.

The output of this LLM is a new memory document. We then have a separate agent that takes the previous document and the new memory document and merges them, acting like an 'apply code diff' model.
Here it's ONE PASS between the actual messages and the final document.
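The one-pass design can be sketched as below. All names are illustrative, and `call_llm` is stubbed with a placeholder so the control flow runs offline; in a real system both functions would be actual model calls.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; stubbed so the sketch runs offline."""
    return "<updated memory document>"

def propose_memory(prev_memory: str, task_messages: list[str]) -> str:
    """One pass: previous memory + new task messages -> proposed document."""
    return call_llm(
        "Update the memory document with facts from these messages.\n"
        "MESSAGES:\n" + "\n".join(task_messages) +
        "\nCURRENT MEMORY:\n" + prev_memory)

def apply_diff(prev_memory: str, proposed: str) -> str:
    """Separate agent that reconciles the two documents, like an
    'apply code diff' model merging an edit into a file."""
    return call_llm(
        "Merge the proposed update into the old document.\n"
        "OLD:\n" + prev_memory + "\nPROPOSED:\n" + proposed)

memory = ""  # start from an empty document
tasks = [["We moved to Koramangala last month"],
         ["Book an Italian place for Friday"]]
for task_messages in tasks:
    memory = apply_diff(memory, propose_memory(memory, task_messages))
```

The key property: the raw task messages and the final document are only ever one model call apart, so no intermediate extraction step throws context away.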
Most importantly, you can have a feedback loop in this system. How?
Let's say a new task is going on right now and we have messages related to it. A human can look at the diff and give feedback on what the AI missed. A feedback agent collects these as examples, and we use those examples to improve the system prompt for our LLM.
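A minimal sketch of that feedback loop, assuming the names below (they are illustrative, not faffit's actual code): missed facts flagged by the human reviewer are stored as examples and folded back into the updater's system prompt.

```python
from dataclasses import dataclass, field

@dataclass
class FeedbackStore:
    """Collects human corrections on memory diffs as few-shot examples."""
    examples: list[str] = field(default_factory=list)

    def record(self, task_messages: list[str], missed_fact: str) -> None:
        """Store one correction: the messages plus what the AI missed."""
        self.examples.append(
            "Messages:\n" + "\n".join(task_messages) +
            f"\nMissed: {missed_fact}")

    def system_prompt(self, base: str) -> str:
        """Fold collected examples into the updater's system prompt."""
        if not self.examples:
            return base
        return base + "\n\nPast mistakes to avoid:\n" + "\n\n".join(self.examples)

store = FeedbackStore()
store.record(["Found a 2BHK near the office"],
             "User prefers homes within walking distance of work")
prompt = store.system_prompt("You update the user's memory document.")
```

In practice the "feedback agent" could also be an LLM that rewrites the system prompt wholesale rather than appending examples verbatim.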
Approach 3: You'd think that's where one would stop, but we can improve it further while preserving the ONE PASS approach. In the earlier approaches we took the texts either batch-wise or task-wise. What about the context and details that fall between this segregation?
How can we get that context? Have an agent look at the batched data and ask questions. For example, after looking at the data for the task 'Finding a Home in Koramangala', it might ask:
- Where is the user living now? Where is the user from? Where is the user's office located? We can pose these questions to the rest of the conversation data, get answers, and feed those answers in along with the current batched data.
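The enrichment step might look like the sketch below. The question generation is hand-written here (a real system would ask an LLM), and the keyword-overlap lookup is a naive stand-in for whatever retrieval or QA the real system uses; all names are illustrative.

```python
def generate_questions(task_messages: list[str]) -> list[str]:
    # A real system would ask an LLM to generate these from the task data;
    # hand-written here for the sketch.
    return ["where is the user living now?",
            "where is the user's office located?"]

def answer_from_history(question: str, history: list[str]) -> list[str]:
    # Naive keyword overlap as a stand-in for real retrieval/QA.
    keywords = {w.strip("?',.") for w in question.lower().split() if len(w) > 4}
    return [m for m in history
            if keywords & {w.strip("?',.") for w in m.lower().split()}]

def enrich(task_messages: list[str], history: list[str]) -> list[str]:
    """Prepend answers mined from the rest of the conversation."""
    context = []
    for q in generate_questions(task_messages):
        context += answer_from_history(q, history)
    return context + task_messages
```

The enriched batch then feeds the same one-pass memory update as before, so the single-pass property is preserved while the cross-task context is recovered.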
So we enrich the system with:
- added context from the rest of the conversation
- feedback given by the user (from Approach 2)
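Putting the pieces together, a single update step would look roughly like this: enrich the task's messages with context pulled from the rest of the conversation, then make the one-pass memory update with the human feedback examples baked into the system prompt. Again, every name and the `call_llm` stub are illustrative assumptions.

```python
def call_llm(system: str, prompt: str) -> str:
    """Hypothetical LLM call; swap in a real provider."""
    return f"[memory updated using {len(prompt)} chars of context]"

def update_memory(memory: str, task_messages: list[str],
                  history: list[str], feedback_examples: list[str]) -> str:
    # (1) added context from the rest of the conversation
    answers = [m for m in history if m not in task_messages]  # stand-in retrieval
    # (2) feedback folded into the system prompt
    system = "Update the memory document." + (
        "\nPast misses:\n" + "\n".join(feedback_examples)
        if feedback_examples else "")
    prompt = ("CONTEXT:\n" + "\n".join(answers) +
              "\nMESSAGES:\n" + "\n".join(task_messages) +
              "\nMEMORY:\n" + memory)
    return call_llm(system, prompt)
```

The messages-to-document path is still a single model call; the enrichment and feedback only add inputs to it.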