LocalLLaMA stsquad 10mo ago 96%

How to make LLMs go fast

https://vgel.me/posts/faster-inference/

I found this post useful for building a layman's understanding of LLM inference and some of the underlying architectural choices that are made.
