How to Set Up LM Studio for Local AI on Any GPU (4GB to 24GB)

Run AI Locally — No Cloud, No Subscription, Full Privacy



LM Studio lets you run large language models on your own hardware. No API fees, no data leaving your machine, and surprisingly capable results even on modest GPUs.

Installing LM Studio




    [*]Download from lmstudio.ai (Windows, macOS, Linux)
    [*]Install — it's a single executable, no complex setup
    [*]Launch and it auto-detects your GPU


GPU Tiers & Recommended Models



4GB VRAM (GTX 1650, RTX 3050)

    [*]Phi-3 Mini 3.8B (Q4) — Microsoft's compact model. Great for coding and reasoning.
    [*]TinyLlama 1.1B — Very fast, decent for simple tasks
    [*]Qwen2 1.5B — Good multilingual support
    [*]Settings: Use Q4_K_M quantization, context length 2048


8GB VRAM (RTX 3060, RTX 4060)

    [*]Llama 3.1 8B (Q4) — Best overall quality-to-size ratio. The sweet spot.
    [*]Mistral 7B (Q5) — Excellent for creative writing
    [*]CodeLlama 7B — Specialized for programming tasks
    [*]Gemma 2 9B (Q4) — Google's model, strong at reasoning
    [*]Settings: Q4_K_M or Q5_K_M quantization, context 4096


12GB VRAM (RTX 3060 12GB, RTX 4070)

    [*]Llama 3.1 8B (Q8) — Higher quality quantization, noticeably better
    [*]Mixtral 8x7B (Q3) — Mixture of experts, very capable
    [*]DeepSeek Coder V2 16B (Q4) — Best local coding model
    [*]Settings: Q5_K_M to Q8_0, context 8192


16-24GB VRAM (RTX 4080, RTX 4090, RTX 3090)

    [*]Llama 3.1 70B (Q3-Q4) — Near GPT-4 quality for many tasks (expect partial CPU offload, since even Q3 weights don't fully fit in 24GB)
    [*]Qwen2 72B (Q3) — Excellent at coding and math
    [*]Mixtral 8x7B (Q6) — High quality mixture of experts
    [*]DeepSeek V2.5 (Q4) — Powerful general-purpose model
    [*]Settings: The highest-bit quantization your VRAM allows, context up to 32K
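
How do you know if a model fits? A rough sanity check is to estimate the weight footprint from the parameter count and the bits per weight of the quantization, plus some headroom for the KV cache and runtime overhead. This is a back-of-the-envelope heuristic, not anything LM Studio computes for you, and the bits-per-weight values below are approximate averages for common GGUF quants:

BITS_PER_WEIGHT = {"Q3_K_M": 3.9, "Q4_K_M": 4.8, "Q5_K_M": 5.7, "Q8_0": 8.5}  # approximate averages

def estimated_vram_gb(params_billion: float, quant: str, overhead_gb: float = 1.5) -> float:
    """Rough GB needed to hold the weights fully on GPU, plus KV-cache/overhead headroom."""
    weight_gb = params_billion * BITS_PER_WEIGHT[quant] / 8
    return weight_gb + overhead_gb

print(estimated_vram_gb(8, "Q4_K_M"))   # Llama 3.1 8B at Q4_K_M: roughly 6 GB
print(estimated_vram_gb(70, "Q3_K_M"))  # a 70B at Q3: roughly 36 GB, hence partial CPU offload on 24GB cards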


Setting Up the Local API Server



LM Studio includes a built-in OpenAI-compatible API server:




    [*]Go to the "Local Server" tab in LM Studio
    [*]Load your model
    [*]Click "Start Server" — runs on localhost:1234
    [*]Any app that supports OpenAI API can now use your local model
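
Before wiring anything else to it, you can confirm the server is reachable. A minimal check with the requests library, assuming LM Studio's default port and the standard OpenAI-style model listing:

import requests

# LM Studio exposes the usual OpenAI-compatible endpoints on localhost:1234 by default.
resp = requests.get("http://localhost:1234/v1/models")
resp.raise_for_status()
for model in resp.json()["data"]:
    print(model["id"])  # whichever models LM Studio currently has available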


Building Automation Workflows



1. Python Automation

from openai import OpenAI

# Point the OpenAI client at LM Studio's local server; the key is unused but the client requires one.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="local-model",  # LM Studio serves whichever model is currently loaded
    messages=[{"role": "user", "content": "Summarize this article: ..."}],
    temperature=0.7,
)
print(response.choices[0].message.content)


2. Batch Processing

    [*]Process CSV files — summarize, categorize, or extract data from hundreds of rows (see the sketch after this list)
    [*]Email drafting — generate personalized emails from a contact list
    [*]Content generation — produce blog outlines, social media posts in bulk
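
Here is a minimal sketch of the CSV case: each row is sent to the local server and the summary is written back out. The file name and "text" column are placeholders, so adjust them to your data.

import csv
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

# Hypothetical input: a CSV with a "text" column; the output gains a "summary" column.
with open("articles.csv", newline="", encoding="utf-8") as f_in, \
     open("articles_summarized.csv", "w", newline="", encoding="utf-8") as f_out:
    reader = csv.DictReader(f_in)
    writer = csv.DictWriter(f_out, fieldnames=list(reader.fieldnames) + ["summary"])
    writer.writeheader()
    for row in reader:
        resp = client.chat.completions.create(
            model="local-model",
            messages=[{"role": "user", "content": f"Summarize in one sentence: {row['text']}"}],
            temperature=0.3,
        )
        row["summary"] = resp.choices[0].message.content.strip()
        writer.writerow(row)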


3. Integration with n8n or Make.com

    [*]Connect LM Studio's API to workflow automation tools
    [*]Trigger AI processing from webhooks, file uploads, or schedules
    [*]Build complete automation pipelines without paying for cloud AI
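
Tools like n8n and Make.com talk to APIs through a generic HTTP request node, and the request they need is just the standard OpenAI chat-completions payload. The sketch below makes the same call with plain requests so you can copy the URL and JSON body into whichever node your tool provides:

import requests

# The JSON body an HTTP node (n8n, Make.com, etc.) should POST to LM Studio.
payload = {
    "model": "local-model",
    "messages": [{"role": "user", "content": "Categorize this support ticket: ..."}],
    "temperature": 0.2,
}
resp = requests.post("http://localhost:1234/v1/chat/completions", json=payload, timeout=120)
print(resp.json()["choices"][0]["message"]["content"])

Note that if the workflow tool runs in Docker, localhost will refer to the container, so point it at the host machine's address instead.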


4. Document Processing

    [*]Use with LangChain to process PDFs, analyze contracts, summarize research
    [*]RAG (Retrieval Augmented Generation) — query your own documents
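
The generation half of RAG is just another chat completion against the local server; retrieval is where the real work happens. The sketch below is deliberately simplified (naive keyword scoring instead of embeddings, hard-coded text chunks instead of a PDF loader) to show the shape of the pipeline; in practice you would swap in LangChain's document loaders and a proper vector store:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

# Toy corpus: in a real pipeline these chunks come from your PDFs or contracts.
chunks = [
    "The lease term is 24 months beginning January 1st.",
    "Either party may terminate with 60 days written notice.",
    "Monthly rent is due on the first business day of each month.",
]

def retrieve(question: str, k: int = 2) -> list[str]:
    """Naive retrieval: rank chunks by word overlap with the question."""
    q_words = set(question.lower().split())
    return sorted(chunks, key=lambda c: -len(q_words & set(c.lower().split())))[:k]

question = "How much notice is needed to terminate?"
context = "\n".join(retrieve(question))
resp = client.chat.completions.create(
    model="local-model",
    messages=[{"role": "user", "content": f"Answer using only this context:\n{context}\n\nQuestion: {question}"}],
)
print(resp.choices[0].message.content)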


Performance Tips




    [*]Close other GPU-heavy apps when running LM Studio
    [*]Use GPU offloading — move as many layers to GPU as VRAM allows
    [*]Lower context length if responses are slow
    [*]Q4_K_M is the best quality-to-speed sweet spot for most users
    [*]Enable mmap for faster model loading


Cost Savings vs Cloud AI




Cloud AI (GPT-4 API): ~$20-100/month for moderate use
Local AI (LM Studio): $0/month after the hardware, aside from electricity
Break-even: immediate if you already own a suitable GPU; otherwise roughly the card's price divided by your monthly API spend (see the worked example below)
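
Your own break-even depends on what the hardware costs you and what you would otherwise spend on the API. The numbers below are placeholders; plug in your own:

gpu_cost = 400           # hypothetical: a used 12GB card; use 0 if you already own a gaming GPU
monthly_api_spend = 60   # hypothetical: what you currently pay for cloud AI per month

months_to_break_even = gpu_cost / monthly_api_spend
print(f"Break-even after about {months_to_break_even:.1f} months")  # ~6.7 months with these numbers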


Local AI is not just about saving money — it's about privacy, speed, and unlimited usage. Once it's set up, it runs forever for free.