Discover reviews on "best llm to run locally on a 5060 ti" based on Reddit discussions and experiences.
Last updated: February 18, 2026 at 03:54 PM
Summary of Reddit Comments on the Best LLM to Run Locally on a 5060 Ti
Gemma 3 27B and Mistral/Magistral 24B
- Great for handling emails and Q&A tasks.
- Suitable for light to medium coding tasks.
- Commenters suggest keeping multiple models installed and switching between them for different tasks (see the query sketch below).
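To make the "one model per task" workflow concrete, here is a minimal sketch of sending a prompt to a locally served model through an OpenAI-compatible chat endpoint. The localhost URL and model tag are assumptions about a typical Ollama or LM Studio setup, not details from the comments; substitute whatever server and model you actually run.

```python
# Minimal sketch: one chat turn against a local OpenAI-compatible server.
# BASE_URL and MODEL are assumptions about the local setup -- adjust as needed.
import requests

BASE_URL = "http://localhost:11434/v1"  # Ollama's default OpenAI-compatible endpoint (assumed)
MODEL = "gemma3:27b"                    # hypothetical local model tag

def ask(prompt: str) -> str:
    """Send a single user message and return the model's reply text."""
    resp = requests.post(
        f"{BASE_URL}/chat/completions",
        json={
            "model": MODEL,
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0.3,
        },
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(ask("Draft a short, polite reply declining this meeting invite."))
```

The same call pattern works for the coding-oriented models below; only the model tag changes.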
Qwen3 Coder 30B and Devstral 2 24B
- Great for light to medium coding tasks.
- Commenters suggest choosing between them depending on the specific coding task.
GPT-OSS-120B
- Requires at least 64GB of system RAM and two 5060 Ti 16GB GPUs for faster output speed (see the memory estimate below).
- Considered a level above the 30B models.
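A rough back-of-envelope calculation shows why a ~120B-parameter model overflows a single 16GB card and leans on system RAM. The bit widths and the 15% overhead factor below are illustrative assumptions, not measurements of any particular build.

```python
# Back-of-envelope weight footprint: parameters * bits_per_weight / 8, plus a rough
# overhead factor for runtime buffers and KV cache. Illustrative only.

def weight_memory_gb(params_billions: float, bits_per_weight: float, overhead: float = 1.15) -> float:
    """Approximate weight footprint in GB at a given quantization level."""
    weight_bytes = params_billions * 1e9 * (bits_per_weight / 8)
    return weight_bytes * overhead / 1e9

for bits in (16, 8, 4):
    print(f"120B at {bits}-bit ~= {weight_memory_gb(120, bits):.0f} GB")

# Even at ~4 bits the weights land in the 60-70 GB range, which is why the comments
# point to pooling two 16GB GPUs and spilling the remainder into 64GB of system RAM.
```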
Qwen3 Family of Models
- Excellent for building RAG applications for document databases.
- Useful for creating embeddings and querying content (see the retrieval sketch below).
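As a sketch of the embeddings/RAG use case, the snippet below embeds a few documents with a locally served embedding model and ranks them against a query by cosine similarity. The Ollama endpoint path and the embedding model tag are assumptions about the local setup, not something named in the comments.

```python
# Minimal retrieval sketch: embed documents locally, then rank them against a query
# by cosine similarity. Endpoint and model tag are placeholders for your own setup.
import requests
import numpy as np

EMBED_URL = "http://localhost:11434/api/embeddings"  # Ollama embeddings endpoint (assumed)
EMBED_MODEL = "nomic-embed-text"                     # placeholder embedding model tag

def embed(text: str) -> np.ndarray:
    """Return the embedding vector for a piece of text from the local server."""
    resp = requests.post(EMBED_URL, json={"model": EMBED_MODEL, "prompt": text}, timeout=60)
    resp.raise_for_status()
    return np.array(resp.json()["embedding"])

docs = [
    "Invoice #1042 is due on March 3rd.",
    "The VPN config lives in /etc/wireguard/wg0.conf.",
    "Quarterly OKRs were updated after the planning meeting.",
]
doc_vecs = [embed(d) for d in docs]

query_vec = embed("When is the invoice due?")
scores = [
    float(np.dot(v, query_vec) / (np.linalg.norm(v) * np.linalg.norm(query_vec)))
    for v in doc_vecs
]
best = int(np.argmax(scores))
print(f"Best match ({scores[best]:.3f}): {docs[best]}")
```

A real document database would store the vectors in a proper index (SQLite, FAISS, or a vector store), but the scoring step is the same.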
Rei 24B KTO and GLM 4.6v
- Reported at roughly 17.5 tokens/second at Q4 quantization (see the throughput sketch below).
- Works well for roleplaying and one-off chats.
- Performs satisfactorily for coding tasks.
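Throughput claims like ~17.5 tokens/second are easy to sanity-check on your own hardware. The sketch below times one completion against a local OpenAI-compatible server and divides completion tokens by wall-clock time; the URL, model tag, and the presence of a usage field in the response are assumptions about that server.

```python
# Rough tokens/second check against a local OpenAI-compatible server.
# Elapsed time includes prompt processing, so treat the result as an
# end-to-end figure rather than pure generation speed.
import time
import requests

BASE_URL = "http://localhost:11434/v1"  # assumed local server
MODEL = "your-local-model"              # placeholder model tag

payload = {
    "model": MODEL,
    "messages": [{"role": "user", "content": "Write two paragraphs about GPU memory bandwidth."}],
    "max_tokens": 256,
}

start = time.perf_counter()
resp = requests.post(f"{BASE_URL}/chat/completions", json=payload, timeout=300)
resp.raise_for_status()
elapsed = time.perf_counter() - start

completion_tokens = resp.json()["usage"]["completion_tokens"]
print(f"{completion_tokens} tokens in {elapsed:.1f}s -> {completion_tokens / elapsed:.1f} tokens/second")
```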
SEED OSS 36B
- Requires a 5090 to run.
- Known for being a smart model.
Pros of Larger LLMs (e.g., 20B, 30B Models)
- Fast generation speeds for summarization tasks.
- Suitable for chat, coding tasks, and document/database processing.
Cons of Larger LLMs
- Requires powerful GPUs and extensive system RAM.
- Can be expensive and power-intensive.
RTX 3060 vs. 3090
- RTX 3090 recommended over the 3060 for dense models; its 24GB of VRAM and higher memory bandwidth keep larger models fully on-GPU.
Choosing the Right Setup
- Consider system RAM, GPU VRAM, and CPU power for optimal performance (a quick VRAM check is sketched below).
- Evaluate the power consumption and PSU requirements.
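Before committing to a large quantized download, it is worth checking how much VRAM is actually free. The sketch below shells out to nvidia-smi (which ships with the NVIDIA driver, so no extra packages are assumed) and reports per-GPU totals, which also covers multi-GPU setups like the dual 5060 Ti configuration mentioned above.

```python
# Query per-GPU total/free memory via nvidia-smi's CSV output.
import subprocess

def gpu_memory_mib() -> list[tuple[int, int]]:
    """Return (total_MiB, free_MiB) for each GPU reported by nvidia-smi."""
    out = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=memory.total,memory.free",
            "--format=csv,noheader,nounits",
        ],
        text=True,
    )
    rows = []
    for line in out.strip().splitlines():
        total, free = (int(x.strip()) for x in line.split(","))
        rows.append((total, free))
    return rows

if __name__ == "__main__":
    for i, (total, free) in enumerate(gpu_memory_mib()):
        print(f"GPU {i}: {free} MiB free of {total} MiB")
```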
Importance of Context in Model Usage
- The context a model is used in, and what it is contrasted against, play a significant role in its perceived value.
- The setup can influence the overall user experience and model performance.
Considerations for Local LLMs
- Prioritize VRAM for handling larger models efficiently.
- Evaluate the trade-offs between local and cloud-based LLMs based on use cases and requirements.




