Our low-end edge LLM demo video

We are working on running local LLMs on resource-constrained edge devices. This video demonstrates our KV cache sharing approach, in which two Raspberry Pi Zero 2 W devices share their KV caches through an intermediate cache server to reduce TTFT (time to first token) when running inference on similar prompts. [Paper]
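To illustrate the idea, here is a minimal toy sketch of prefix-based KV cache sharing. All names (`KVCacheServer`, `prefill`) and the storage scheme are illustrative assumptions, not the actual implementation from the video or paper; the point is only that a device reusing a shared KV cache for a matched prompt prefix needs to prefill far fewer tokens, which is what shortens TTFT.

```python
# Hypothetical sketch of KV cache sharing via an intermediate server.
# Names and data layout are illustrative, not the paper's implementation.

class KVCacheServer:
    """Toy intermediate server mapping tokenized prompt prefixes to KV caches."""

    def __init__(self):
        self.store = {}  # tuple(tokens) -> KV cache (here, a tuple of entries)

    def put(self, tokens, kv_cache):
        # Store every prefix so other devices can reuse partial matches.
        for n in range(1, len(tokens) + 1):
            self.store[tuple(tokens[:n])] = kv_cache[:n]

    def longest_prefix(self, tokens):
        # Return the longest stored prefix of `tokens` and its KV cache.
        for n in range(len(tokens), 0, -1):
            key = tuple(tokens[:n])
            if key in self.store:
                return list(key), self.store[key]
        return [], None


def prefill(tokens, server):
    """Prefill a prompt, reusing any shared KV cache for a matched prefix.

    Only the unmatched suffix needs a real prefill pass; returns how many
    tokens were actually computed (a proxy for TTFT)."""
    prefix, kv = server.longest_prefix(tokens)
    remaining = tokens[len(prefix):]
    # Stand-in for actually computing KV entries for the remaining tokens:
    new_kv = (kv or ()) + tuple(f"kv({t})" for t in remaining)
    server.put(tokens, new_kv)
    return len(remaining)
```

For example, if device A prefills "what is the capital of france" (6 tokens computed), device B asking "what is the capital of japan" matches a 5-token prefix on the server and only computes 1 token.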