Calculate GPU memory requirements and max concurrent requests for self-hosted LLM inference. Support for Llama, Qwen, DeepSeek, Mistral and more. Plan your AI infrastructure efficiently.
Calculate the GPU memory you need for LLM inference
Built to simplify planning for self-hosted AI deployments. Unlike other AI infrastructure tools, SelfHostLLM lets you precisely estimate GPU requirements and concurrency for Llama, Qwen, DeepSeek, Mistral, and more using custom config. But now I want to see Apple silicon added to the mix! Update: Now there's a Mac version too!
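To make the estimate concrete, here is a minimal sketch of the kind of calculation such a tool performs, assuming FP16 weights and a fixed KV-cache cost per token. The function name, constants, and overhead figure are illustrative assumptions, not SelfHostLLM's exact formula.

```python
# Illustrative sketch of a GPU memory / concurrency estimate.
# All constants are simplified assumptions, not SelfHostLLM's methodology.

def estimate_concurrency(
    total_vram_gb: float,       # e.g. 24 for an RTX 4090, 80 for an H100
    params_billion: float,      # model size in billions of parameters
    bytes_per_param: float,     # 2.0 for FP16, ~0.5 for 4-bit quantization
    context_tokens: int,        # max context length per request
    kv_bytes_per_token: float,  # KV-cache bytes per token (model dependent)
    overhead_gb: float = 2.0,   # CUDA context, activations, fragmentation
) -> int:
    """Rough upper bound on concurrent requests that fit in VRAM."""
    weights_gb = params_billion * bytes_per_param           # static weights
    kv_per_request_gb = context_tokens * kv_bytes_per_token / 1e9
    free_gb = total_vram_gb - weights_gb - overhead_gb      # left for KV cache
    if free_gb <= 0:
        return 0  # the model itself does not fit
    return int(free_gb // kv_per_request_gb)

# Example: an 8B model in FP16 on a 24 GB GPU with 8k context,
# assuming ~128 KB of KV cache per token.
print(estimate_concurrency(24, 8, 2.0, 8192, 131_072))
```

On these assumptions, the 8B model's weights take 16 GB, leaving about 6 GB for KV cache after overhead, or roughly five concurrent 8k-token requests.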
Super useful — sizing GPU memory and concurrency upfront saves a ton of headaches. Love that it works with different models.
No way, this is exactly what I needed! Figuring out GPU memory for LLMs has always been such a headache—super smart to automate it. Any plans to support multi-GPU setups?
Hi all, I'm the creator of SelfHostLLM.org. You can read more about why I created it here: https://www.linkedin.com/posts/e...
Very cool calculator, looking forward to checking this out.
A measure of community engagement at launch. Higher means more people noticed and interacted with the product. It's a traction signal, not a quality rating.
The interest score is discussion threads divided by the engagement count. Above 0.30 is strong. Below 0.15 suggests the product got clicks but not conversation.
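A hypothetical sketch of that ratio, assuming the raw thread and engagement counts are available; the function names and zero-engagement guard are illustrative.

```python
# Hypothetical computation of the interest score described above.

def interest_score(discussion_threads: int, engagement: int) -> float:
    """Threads-per-engagement ratio; guards against zero engagement."""
    return discussion_threads / engagement if engagement else 0.0

def label(score: float) -> str:
    if score > 0.30:
        return "strong"                        # lots of conversation
    if score < 0.15:
        return "clicks but not conversation"   # attention without discussion
    return "moderate"

# Example: 12 threads on 60 engagements -> 0.20 -> "moderate"
print(label(interest_score(12, 60)))
```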
Categories come from the product's launch tags. Most products appear in 2-3 categories. The primary category is listed first.
The scores reflect launch-period engagement. Historical data is preserved and doesn't change retroactively. The build date at the bottom shows when the index was last refreshed.