Video: The AI Agent Memory Landscape | Duration: 1848s | Summary: The AI Agent Memory Landscape | Chapters: Welcome and Introduction (33.14s), Context Graphs Overview (130.655s), Agent Memory Types (330.625s), Data Models & Reasoning (463.535s), Project Scaffolding Demo (644.44s), Real-World Data Integration (951.81s), Multi-Agent Memory (1171.89s), TCK and Coordination (1263.84s), Resources and Q&A (1644.065s), Q&A and Wrap-Up (1700.445s), Closing Remarks (1804.1s)
Transcript for "The AI Agent Memory Landscape":
I can see that some people are starting to join, so we'll give folks a few more seconds. I'd like to welcome you all once again to another session and introduce William, who will be talking about the AI agent memory landscape. With that, I'll hand it over to you, William.

Great, thanks a lot. Hey, everyone, thanks for joining today. I hope NODES AI has been a good event for those of you watching live, and if you're catching the recording, welcome to you as well. You can find my slides by scanning that QR code or via the URL in the bottom left, graphstuff.com/nodesai2026. I'll put that in the chat too, and we'll share it again at the end; the slides have some useful resources to check out. I want to pick up right where we left off in the panel discussion and the keynote earlier this morning and drill into some of the topics we touched on, specifically context graphs, agent memory, and ontologies. So I'm going to jump right in.

The first thing I want to talk about, and this came up in the panel discussion, is the relationship between context graphs and agent memory. I said earlier today that there's a close relationship; I don't think it's quite one to one, but it's very close. We'll go through a refresher on context graphs in a second, since there's been a lot of discussion about them today. With memory, we're typically thinking about some text representation, an expression of a preference, or something that happened, and it's often initially implemented as file-, vector-, or text-based storage. With a context graph, or graph-based memory, we're actually extracting the entities: what are the things mentioned in the conversation, and what are the relationships between them? I think that's one of the first and most important things to think about when working with graphs for agent memory: how do we go from unstructured data to identifying the things? Entities are the things that exist. Events are the things that have happened. And in context graph terminology, the decision trace is the why: the policies that were applied, the risk factors, the reasoning applied either by a human employee or by an agent applying some policy. We need some insight into the decision that was made.

In a blog post a few months ago, I showed a demo application, which is probably still up and running: a context graph for a financial services organization, where we can do things like ask for credit approval decisions or respond to customer support questions as they come in. The architecture for this is that we give the agent tools.
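To make those three layers concrete, here's a minimal sketch of the shapes involved. The dataclass names and fields are invented for illustration; the Neo4j agent memory package defines its own schema.

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Entity:
    name: str         # the things that exist: a customer, a product
    entity_type: str  # e.g. "Person" or "Organization"

@dataclass
class Event:
    description: str  # the things that happened
    involves: list[Entity] = field(default_factory=list)

@dataclass
class DecisionTrace:
    decision: str                # what was decided
    policies_applied: list[str]  # the why: policies and risk factors
    reasoning: str               # reasoning from a human employee or an agent
    triggered_by: Event | None = None

request = Event("customer requested a credit line increase",
                involves=[Entity("Acme Corp", "Organization")])
trace = DecisionTrace("approved", ["credit-policy-7"],
                      "low risk: long history, no late payments", request)
```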
Those tools give the agent some functionality, a way to interact with and understand its environment by querying and updating the context graph. Finding the most relevant decisions, applying those policies, and generating a recommendation is a core part of what that agent is doing. Graph data science is an important component of this: we can use the structure of the graph through node embeddings, and we can use things like community detection to help us find and surface the most recent or most relevant context and decisions for whatever step of the agent interaction we're in.

This is largely based on the Neo4j agent memory Python package, which exposes short-term, long-term, and reasoning memory. That's an important observation: context graphs need three types of memory. When we talk about agent memory, we're often thinking about preferences or conversations, but reasoning, the decision traces, is also an important part of agent memory. Sometimes this is called procedural memory or experiential memory. These abstractions are all baked into the Neo4j agent memory package, which we can install through pip. It integrates with pretty much any Python-based agent framework and has lots of configurable functionality for things like entity extraction and working with graph data science.

Short-term memory we can think of as conversations: conversation history, session state. Really, though, it's an entry point into entity extraction. We're dealing with conversational, unstructured data, conversations with an agent, and in that text we're going to mention things: customers, products, accounts. We need to go through an entity extraction and entity resolution phase to understand what the actual thing is that we're talking about. An LLM-only approach to entity extraction and resolution can be quite slow and quite expensive, so as part of the Neo4j agent memory package we built a multistage entity extraction pipeline. It can use named entity recognition with statistical NLP methods, as well as smaller local models like the GLiNER fine-tuned models, which run on CPU, locally, essentially for free, with fallback to an LLM for more complex extraction and resolution.

Now, the data model we apply during entity extraction and resolution is really important, and I think it depends on the domain you're working with. By default, Neo4j agent memory uses the POLE+O entity model, a common model used in investigations: person, organization, location, event, object. I think of this as the starting point. As you go through and implement your own agent memory, I'd really encourage you to extend the base data model, the base ontology, and we'll see how to do that in a moment.
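As a rough picture of that multistage idea, here's a minimal sketch: a cheap local model first, with an LLM fallback. The llm_extract function is a hypothetical placeholder, and the package's real pipeline, including its GLiNER stage, is considerably more involved.

```python
import spacy  # requires: python -m spacy download en_core_web_sm

def llm_extract(text: str) -> list[tuple[str, str]]:
    """Hypothetical fallback: wire up an LLM call of your choice here."""
    raise NotImplementedError

# Cheap statistical NER first; it runs locally on CPU, essentially for free.
nlp = spacy.load("en_core_web_sm")

def extract_entities(text: str) -> list[tuple[str, str]]:
    doc = nlp(text)
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    if entities:
        return entities       # fast path: the local model was enough
    return llm_extract(text)  # fallback for more complex extraction

print(extract_entities("Acme Corp opened an account for Jane Doe in London."))
```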
Reasoning memory is the third type of memory supported by the Neo4j agent memory package, and it's all about understanding the why, the missing piece, the explanation. What execution plan did the agent make? Why was the decision made? What are the decision traces? What can we observe from the agent's reasoning phase? Those sorts of things are captured in reasoning memory. Altogether, short-term memory, long-term memory, and reasoning memory are the main abstractions for working with memory in the Neo4j agent memory package, and together they make up your context graph. They're all related: messages can trigger a reasoning trace, the reasoning phase for the agent; we extract specific entities from specific messages; messages trigger tool calls; those sorts of connections.

We've been working a lot with the different framework and agent cloud orchestration vendors, so we've spent a lot of work making sure Neo4j agent memory integrates well with Google ADK, AWS, and the Microsoft agent framework, and you'll see some specific examples and integrations with those tools as well. But there was still a bit of a challenge I saw after we released this, and it was great to see folks adopting it, especially some of the more sophisticated agent memory integrations out there. I think there's still some overhead in surfacing this data. How do I actually surface the decisions that have been made in the organization? How do I actually get that data into the graph? How do I expose it as a tool for my agent to make sense of? To help smooth that process, we created the create context graph project. It's inspired by create-react-app, which for a while was the default way of creating a React application. You go through an interactive scaffold and choose what domain you're working with, whether you want to import real data, or whether you want sample data just to get started. That's the basic idea, so let's take a look at what it looks like.

If I run uvx create-context-graph, it takes me through this interactive workflow. Let's say I'm going to create a science graph project, so we can pull in SaaS data or generate demo data; let's do demo data for scientific research. We can choose which agent framework we want; let's go with Claude Code. And I'm going to use an existing Neo4j instance. We could also connect to a Neo4j Aura instance or use Docker, but I have an instance running locally, so: localhost. There we go. We'll skip the optional enhancements and add-ons. Here I'm being asked to add an Anthropic API key; these are optional, and I can use whatever model I want for my agent. By default, we use an Anthropic Sonnet model and OpenAI embeddings.

So we've now provisioned this full stack context graph application, and it's installing some dependencies. This step here downloads the spaCy model, which is probably the longest step of what we'll see. While that runs, here are the docs: this is the create context graph page, which has good tutorials, and we're basically following the "your first context graph app" tutorial. Now we're installing the front end dependencies; the front end is a Next.js application. And we saw earlier that we had our choice of back end framework: we chose the Claude Agent SDK, I think.
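For reference, the whole demo flow boils down to a few commands. I've reconstructed the spelling from the spoken demo, so check the create context graph docs for the exact invocations:

```
# Scaffold a new project (interactive prompts: domain, sample data vs.
# SaaS connector, agent framework, Neo4j connection)
uvx create-context-graph

# Then, from the generated project directory:
make seed    # load sample data into your Neo4j instance
make start   # run the back end and the Next.js front end
```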
We could have chosen any of the Python frameworks supported by Neo4j agent memory. Okay, cool. The next step is to seed our database: run make seed, which loads some sample data. And finally, make start runs the back end and the front end application. Now if we go to localhost:3000 and zoom in a bit, we see our full stack scientific research context graph. The first thing we see is the data model, but we can also ask questions, something like: show me the citation network around deep learning in drug discovery. Our agent has access to a number of tools, and it chooses which tools to execute to try to answer the question. Initially, it found a paper; we can explore the paper and the citation network around it to bring in more data. That's essentially what our agent is doing here: text-to-Cypher, to bring in more data to answer our question.

Now, what data did we actually bring in? This is sample data: we see documents, decision traces, science papers, researchers with their affiliations. This data matches the domain I chose, and it was all generated by piping work documentation through the extraction pipeline. In this case, we're working with science papers and meeting notes from organizations that allocate grants, those sorts of things: documents that represent that kind of work product, run through the Neo4j agent memory entity extraction pipeline to construct the context graph.

So that's one way to get started with context graphs and the Neo4j agent memory package, using sample data, which is helpful for demos. But how do we surface the real-world decisions that occurred? For that, we can use the SaaS connectors in the create context graph project. Currently, there's support for GitHub, Linear, Google Workspace, Claude Code, and maybe a few others we've added. The basic idea is that we can include a connector; you saw in the interactive flow that we were asked whether we wanted demo data or to connect to a SaaS service to pull in real data. For Claude Code, for example, the connector works by parsing the JSONL session files we have locally on our machine into a context graph, so we can look at the graph structure of the sessions we've had with Claude Code. Claude Code is a coding agent: I interact with it by sending messages that might be feedback on a plan or maybe a PRD, but are somehow related to some project I'm working on, and it calls tools, say, to update a file. Those are the sorts of interactions we capture.
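As a rough sketch of that parsing step: the session-file location and event fields below are assumptions for illustration, and the real connector does much more than this.

```python
import json
from pathlib import Path

# Hypothetical location and field names; Claude Code's actual session
# layout may differ. This only shows the shape of the parsing step.
SESSIONS = Path.home() / ".claude" / "projects"

def iter_events(session_file: Path):
    """Yield one event dict per JSONL line (messages, tool calls, ...)."""
    with session_file.open() as f:
        for line in f:
            if line.strip():
                yield json.loads(line)

for path in SESSIONS.rglob("*.jsonl"):
    for event in iter_events(path):
        # In the real connector these would become nodes and relationships,
        # e.g. (:Session)-[:HAS_MESSAGE]->(:Message)-[:TRIGGERED]->(:ToolCall)
        print(path.stem, event.get("type", "unknown"))
```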
Now, an important component of context graphs is surfacing the decision trace: what decisions were made, either by a human or by an agent, and materializing those from our work product. In the Claude Code connector for create context graph, we do this using heuristics. Essentially, we look for cases where there was a user correction, where the user chose to change a dependency, where there was an error and we can see how it was resolved, or where the user was presented with an alternative and had to make a clear decision, maybe an architecture decision, those sorts of things.

What's really important is when we can combine these data sources and query across them. Here we're adding the Google Workspace and Linear connectors, so we can combine projects we found in Claude Code with Linear work items and maybe PRDs we found in Google Workspace, for example.

Here's another look at some of the heuristics we use for decision extraction, with some examples. Explicit statements, "always use single quotes," for instance, are easy to identify with regexes. Behavioral patterns are more interesting and trickier to identify.
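Here's a minimal sketch of that explicit-statement case. The pattern is illustrative only; the connector's actual heuristics are broader.

```python
import re

# Illustrative pattern only: behavioral patterns, the trickier case,
# can't be captured with a regex like this.
EXPLICIT_PREFERENCE = re.compile(
    r"\b(always|never|prefer)\s+(?:use\s+)?(?P<what>[\w'-]+(?: [\w'-]+)?)",
    re.IGNORECASE,
)

def extract_explicit_decisions(message: str) -> list[str]:
    """Pull 'always use single quotes'-style statements out of a message."""
    return [m.group(0) for m in EXPLICIT_PREFERENCE.finditer(message)]

print(extract_explicit_decisions("Always use single quotes in this repo."))
# -> ['Always use single quotes']
```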
Cool. So this is my second demo. I'll skip the import process since we're limited on time, but I'll show you a bit of what it looks like. I pointed this at my Aura instance; let me zoom in a little. If we start by querying for projects, I have 16 projects. This one, grouse flocks, was an experiment with adding some new visualization styling using the NVL visualization package. We can see the messages that were sent and the tool calls that were made. Here's a tool named edit, which is essentially editing a file, and we see the file that was updated, and so on. And we have the same full stack front end to interact with this, as part of create context graph.

Cool. So the third piece I want to talk about is multi-agent memory. It's somewhat intuitive when we're interacting with a single agent, say a coding agent, to use the Claude Code example: these are the preferences I personally care about, we can extract them, and maybe a markdown file saved locally is fine. But in the case where we have hundreds or thousands of agents collaboratively working together, how can we enable memory? Specifically, how can the context graph we're using with Neo4j agent memory be a shared coordination layer for our swarm of agents? Databases were created to solve exactly this coordination problem, and there are a lot of advantages to leveraging the graph database as our context graph memory layer.

Let's look at another example. Here we have an agent swarm, again in the financial services domain: a series of agents all implemented in the same agent framework, AWS Strands. They share a memory layer, but each has a somewhat different persona: a know-your-customer (KYC) agent, an anti-money-laundering agent, a compliance agent. They have somewhat different personas, outcomes, and goals, but they share that memory layer. Because they're all using Neo4j agent memory, they all follow the same conventions, the same data models, the same contract with the database for the shape of their memory. A typical workflow might be something like this: the KYC agent ingests new customer data, goes through entity extraction, and maybe has a process for flagging suspicious transactions. This can happen entirely in the memory layer: the KYC agent flags a suspicious relationship, and that's immediately available for, say, the credit agent to pick up on, because it's in the shared memory layer.

Now, that example makes sense to us because all the agents are written with the same framework and the same Neo4j agent memory integration for Strands. But what if we had a system with many different agents implemented in many different languages or frameworks, with different teams throughout the organization working on different pieces of technology? Having a shared memory layer where all of our agents conform to the same conventions means the memory layer can be the coordination layer for our agents. But we need some way to certify that those agents are all using the same conventions, that the memory layers they use are compatible. This is where Technology Compatibility Kit (TCK) testing comes in, and I think it's something we'll be working on quite a bit in our agent memory tooling going forward: certifying that different agent memory implementations adhere to the same conventions, the same shape and behavior of agent memory in the graph. When we do that, we can share the agent memory layer as a coordination layer. We've begun some initial work; there's a link here to the agent memory TCK repo, which has the specification and some of the conventions we use in the Neo4j agent memory tooling. This is by no means done and comprehensive; it's a first stab at what this looks like. There's an example application in there with, I think, six agents written in different languages, from Python, TypeScript, Go, and C#, plus multiple Python frameworks. And just earlier today I got the R version working, so if anyone is interested in R and agent memory, we now have an agent memory client for that too. Anyway, that's the vision I want to set out for the future: imagine all of these agents, in different languages and different frameworks, working together, contributing to and learning from the same shared memory layer.
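To illustrate the coordination idea, and not the Neo4j agent memory API itself, here's a minimal sketch using the plain Neo4j Python driver: one agent writes a flag, and another immediately reads it from the same graph. The labels, properties, and credentials are invented.

```python
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "password"))

def kyc_flag_suspicious(tx, customer_id: str, reason: str):
    # The KYC agent records a finding in the shared graph.
    tx.run("MERGE (c:Customer {id: $id}) "
           "MERGE (c)-[:FLAGGED_AS]->(:Flag {reason: $reason})",
           id=customer_id, reason=reason)

def credit_check_flags(tx, customer_id: str) -> list[str]:
    # The credit agent sees the flag with no extra coordination machinery.
    result = tx.run("MATCH (:Customer {id: $id})-[:FLAGGED_AS]->(f:Flag) "
                    "RETURN f.reason AS reason", id=customer_id)
    return [record["reason"] for record in result]

with driver.session() as session:
    session.execute_write(kyc_flag_suspicious, "cust-42",
                          "suspicious transaction pattern")
    print(session.execute_read(credit_check_flags, "cust-42"))

driver.close()
```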
Cool. Well, I think we're about out of time, but I want to leave some resources up, and then we'll go through questions if we have time. Here are links to some of the things I talked about. GraphAcademy is always a good place to start, and there's a new GraphAcademy context graph course; take a look at that. The create context graph project is the command line tool used to scaffold your context graph application, which is built on top of the Neo4j agent memory package. So those are some resources I'd encourage folks to check out.

I want to leave you with what I think was my favorite paper from last year; it came out at the end of last year. It's a survey of memory in the age of AI agents, and I think it does a really good job of presenting a few different ways to slice up the memory landscape based on the form, function, or dynamics of the memory system. I talked about some of the same aspects, just maybe not through the same lens, so I'd encourage you to check the paper out as a different lens on the memory landscape.

And I see a few folks asking for the slides. Yes, the slides: I usually have a link at the end, and I forgot that, but if we zoom all the way back to the beginning, here they are. The link is graphstuff.com/nodesai2026. I'll type that into the chat as well; I think I typed it right, we'll see.

Cool. We have just a few minutes left, so I'm sure we have time for a question or two. Let me scroll through the chat; maybe we can take one or two. I see a few folks asking about the difference between the industry domains in create context graph and the connector data. Yeah, that's a good shout. The idea with create context graph is really just to get started with a full stack context graph application: you choose a domain and an agent framework, and then whether you want to work with sample data or to connect your own SaaS service like Linear or GitHub. If you do the latter, we'll actually try to identify some of the decisions that were made in your work documents, through the process I talked about. Cool. And I see we have one minute left, so maybe we'll pause there, because I think we're heading into the next session soon.

Hey, everyone. Thanks, William, again for your time, and thanks to everyone who attended this session. It's now time for the closing keynote, so I just want to thank you all for watching. William, anything else you want to say?

I think that covered it. Thanks, everyone, for joining. And please, if you try out our agent memory tooling or create context graph, let us know what feedback you have; it's always helpful to hear from users to guide what we work on going forward. So thanks a lot, everyone.

Thank you. We'll see you all in the closing keynote.