

So if I have two machines running the same local LLM, I can achieve data compression by transmitting a short prompt and letting the receiver regenerate the LLM's expected response, rather than transmitting the response itself? That's my understanding of the article.
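
For what it's worth, here's roughly how I picture that working, as a minimal sketch. It assumes both ends pin byte-identical model weights and use fully deterministic (greedy) decoding; the model name, expand(), and the parameters are placeholders of mine, not anything from the article:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL = "gpt2"  # placeholder; both machines must load byte-identical weights

    tok = AutoTokenizer.from_pretrained(MODEL)
    model = AutoModelForCausalLM.from_pretrained(MODEL)

    def expand(prompt: str, max_new_tokens: int = 128) -> str:
        # Greedy decoding (do_sample=False) makes the output a pure
        # function of (weights, prompt, max_new_tokens), so both
        # machines regenerate the same text from the same prompt.
        ids = tok(prompt, return_tensors="pt").input_ids
        out = model.generate(ids, max_new_tokens=max_new_tokens, do_sample=False)
        return tok.decode(out[0], skip_special_tokens=True)

    # Sender ships only (prompt, max_new_tokens); receiver runs expand()
    # with the same arguments and reconstructs the full text, modulo any
    # floating-point nondeterminism across different hardware or runtimes.

The obvious catch is that this only "compresses" text the model would actually produce verbatim from that prompt, which is what the next question gets at.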
Neat idea, but what if you want to transmit information that the LLM can't tokenize and regenerate accurately?
Extraordinary claims require extraordinary evidence.