Abstract: Retrieval-augmented generation pipelines store large volumes of embedding vectors in vector databases for semantic search. In Compute Express Link (CXL)-based tiered memory systems, ...