# Using Local AI Models with Circuitry
Circuitry supports local AI models served through tools like LM Studio and Ollama, allowing you to run AI workflows entirely on your own machine without cloud API costs.
## Why Local Models Need CORS Enabled

When using local models with Circuitry (especially when Circuitry is deployed), your browser needs to communicate directly with your local model server. This requires CORS (Cross-Origin Resource Sharing) to be enabled.

### The Technical Details

```
Your Browser (Circuitry)
        ↓ Direct connection
Your Local Model Server (LM Studio/Ollama)
```

Without CORS enabled, your browser will block these requests for security reasons.
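A quick way to see whether CORS is on is to send a request with an `Origin` header and check the response for an `Access-Control-Allow-Origin` header. This is a generic HTTP check, not a Circuitry feature; the sketch below assumes LM Studio's default port, so swap in `11434` for Ollama:

```bash
# Show only the response headers for a cross-origin-style request.
# If CORS is enabled, the output should include an
# Access-Control-Allow-Origin header; if it is missing, the browser will block the call.
curl -s -D - -o /dev/null \
  -H "Origin: http://localhost:3000" \
  http://localhost:1234/v1/models
```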
## Setting Up LM Studio

### 1. Install and Configure LM Studio
- Download LM Studio from lmstudio.ai
- Install and launch LM Studio
- Download a model (e.g., Llama 3.2, Qwen, Mistral)
### 2. Enable CORS

**Important:** This step is required for Circuitry to work with LM Studio.

#### Option A: Using LM Studio GUI (Windows/macOS/Linux)
- Open LM Studio
- Click the Settings icon (⚙️) in the sidebar
- Go to the Server tab
- Find the "Enable CORS" checkbox
- Check the box to enable CORS
- Click Start Server (or restart if already running)
#### Option B: Using LM Studio CLI (macOS/Linux)

If the GUI CORS toggle doesn't work or you prefer the command line:

```bash
# macOS/Linux - Start server with CORS enabled
lms server start --cors
```

The CLI will output:

```
W CORS is enabled. This means any website you visit can use the LM Studio server.
Success! Server is now running on port 1234
```

**Note:** On macOS, you may need to use the CLI method, as the GUI toggle sometimes doesn't properly set CORS headers.
### 3. Note Your Server Details

After starting the server, LM Studio will show:

- **Server Address**: Usually `http://localhost:1234`
- **Model Name**: The loaded model (e.g., `llama-3.2-3b-instruct`)
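If you want to double-check the exact model name before entering it in Circuitry, the running server's OpenAI-compatible model list (the same endpoint used in the Linux troubleshooting step later in this guide) shows it:

```bash
# List the models the LM Studio server currently serves;
# the "id" values here should match the model name shown in the app
curl http://localhost:1234/v1/models
```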
### 4. Add Model to Circuitry

- In Circuitry, click the Settings icon (⚙️)
- Go to Model Settings
- Click + Add Custom Model
- Fill in:
  - **Model ID**: The exact model name from LM Studio (e.g., `llama-3.2-3b-instruct`)
  - **Display Name**: A friendly name (e.g., `Llama 3.2 3B (Local)`)
  - **Provider**: Select `Local`
  - **Endpoint**: Enter your server address (e.g., `http://localhost:1234`)
- Click Save
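Before building a workflow around it, it can help to confirm the model answers over HTTP at all. A minimal sketch against LM Studio's OpenAI-compatible chat endpoint, using the example model ID above (substitute your own):

```bash
# One-off chat completion request to the local LM Studio server
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.2-3b-instruct",
    "messages": [{"role": "user", "content": "Reply with one short sentence."}]
  }'
```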
## Setting Up Ollama

### 1. Install Ollama

```bash
# macOS/Linux
curl -fsSL https://ollama.com/install.sh | sh

# Or download from: https://ollama.com/download
```

### 2. Pull a Model

```bash
ollama pull llama3.2
# or
ollama pull qwen2.5-coder
```
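To confirm the pull succeeded, and to see the exact names you'll later enter as Model IDs, list your local models:

```bash
# Show all locally available Ollama models and their sizes
ollama list
```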
### 3. Start Ollama with CORS Enabled

#### Option A: Set Environment Variable (Permanent)

```bash
# macOS/Linux - Add to ~/.zshrc or ~/.bashrc
export OLLAMA_ORIGINS="http://localhost:3001,http://localhost:3000"

# Windows PowerShell (current session only; use setx or System Properties to persist it)
$env:OLLAMA_ORIGINS="http://localhost:3001,http://localhost:3000"

# Then restart Ollama
ollama serve
```

#### Option B: Run with Environment Variable (Temporary)

```bash
# macOS/Linux
OLLAMA_ORIGINS=* ollama serve

# Windows (Command Prompt)
set OLLAMA_ORIGINS=* && ollama serve
```

**Important for Security:** Using `OLLAMA_ORIGINS=*` allows all origins. In production, specify exact origins like `http://localhost:3001`.
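To verify the setting took effect, you can repeat the generic header check from earlier against Ollama, using one of the origins you allowed above:

```bash
# If OLLAMA_ORIGINS includes this origin, the response headers
# should contain: Access-Control-Allow-Origin: http://localhost:3001
curl -s -D - -o /dev/null \
  -H "Origin: http://localhost:3001" \
  http://localhost:11434/api/tags
```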
### 4. Verify Ollama is Running

```bash
# Check if Ollama is running
curl http://localhost:11434/api/tags

# Test model generation
ollama run llama3.2 "Hello!"
```
### 5. Add Model to Circuitry

- In Circuitry, click the Settings icon (⚙️)
- Go to Model Settings
- Click + Add Custom Model
- Fill in:
  - **Model ID**: The Ollama model name (e.g., `llama3.2`, `qwen2.5-coder`)
  - **Display Name**: A friendly name (e.g., `Llama 3.2 (Ollama)`)
  - **Provider**: Select `Local`
  - **Endpoint**: Enter `http://localhost:11434`
- Click Save
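As with LM Studio, you can sanity-check the model over HTTP before wiring it into Circuitry. A minimal sketch against Ollama's generate endpoint, using the example model name above:

```bash
# Ask Ollama for a single, non-streaming completion
curl http://localhost:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.2", "prompt": "Reply with one short sentence.", "stream": false}'
```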
## Alternative: Using ngrok for Remote Access

If you can't enable CORS or want to access your local models remotely, use ngrok.

### What is ngrok?

ngrok creates a secure tunnel to your localhost, allowing you to access your local model server from anywhere (including when Circuitry is deployed).

### Setup ngrok
1. **Install ngrok:**

   ```bash
   # macOS (with Homebrew)
   brew install ngrok/ngrok/ngrok

   # Or download from: https://ngrok.com/download
   ```

2. **Sign up for a free account:**

   - Go to ngrok.com
   - Sign up (the free tier is sufficient)
   - Copy your auth token

3. **Configure ngrok:**

   ```bash
   ngrok config add-authtoken YOUR_AUTH_TOKEN
   ```

4. **Start the LM Studio server:**

   ```bash
   # macOS/Linux
   lms server start --cors

   # Or use the GUI
   ```

5. **Create the ngrok tunnel:**

   ```bash
   # For LM Studio (port 1234)
   ngrok http 1234

   # For Ollama (port 11434)
   ngrok http 11434
   ```

6. **Copy the public URL:** ngrok will output something like:

   ```
   Forwarding https://abc123.ngrok-free.app -> http://localhost:1234
   ```

7. **Use in Circuitry:**

   - Add a custom model with endpoint: `https://abc123.ngrok-free.app`
   - No CORS issues! (ngrok handles it)
   - Works from anywhere, even when Circuitry is deployed
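Before adding the endpoint in Circuitry, you can confirm the tunnel actually reaches your local server. The hostname below is the placeholder from step 6, so substitute your own Forwarding URL:

```bash
# LM Studio behind ngrok (OpenAI-compatible model list)
curl https://abc123.ngrok-free.app/v1/models

# Ollama behind ngrok
curl https://abc123.ngrok-free.app/api/tags
```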
### ngrok Benefits

- ✅ **No CORS issues** - ngrok provides proper headers
- ✅ **Remote access** - Access your local models from anywhere
- ✅ **Works with deployed apps** - Even Vercel deployments can reach your local machine
- ✅ **Secure** - HTTPS encryption and authentication

### ngrok Free Tier Limits
- 1 online ngrok process
- 40 connections/minute
- Random URL (changes on restart)
- Perfect for development/personal use
## Troubleshooting

### Error: "Cannot connect to local model"

**Cause:** CORS is not enabled or the server is not running.

**Solutions:**
1. **For LM Studio:**

   Try the GUI first:

   - Open LM Studio Settings → Server
   - Ensure "Enable CORS" is checked
   - Restart the server
   - Verify the server is running (green indicator)

   If the GUI doesn't work (especially on macOS):

   ```bash
   # Stop any running server in the GUI
   # Then use the CLI:
   lms server start --cors
   ```

   Still not working? Use ngrok:

   ```bash
   # Terminal 1: Start LM Studio
   lms server start

   # Terminal 2: Create tunnel
   ngrok http 1234

   # Use the ngrok URL as your endpoint
   ```
2. **For Ollama:**

   - Ensure `OLLAMA_ORIGINS` is set
   - Restart Ollama: `ollama serve`
   - Check if it's running: `curl http://localhost:11434/api/tags`

   If CORS still fails:

   ```bash
   # Terminal 1: Start Ollama
   OLLAMA_ORIGINS=* ollama serve

   # Terminal 2: Create tunnel
   ngrok http 11434

   # Use the ngrok URL as your endpoint
   ```
3. **Check Endpoint:**

   - LM Studio default: `http://localhost:1234`
   - Ollama default: `http://localhost:11434`
   - ngrok URL: `https://xxxxx.ngrok-free.app`
   - Can you access the endpoint in your browser?
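To answer that last question from the terminal instead of the browser, hit each default endpoint directly (the same commands used elsewhere in this guide):

```bash
# LM Studio: should return JSON describing the loaded models
curl http://localhost:1234/v1/models

# Ollama: should return JSON listing the pulled models
curl http://localhost:11434/api/tags
```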
Error: "Model not found"
Cause: Model ID doesn't match what's loaded in LM Studio/Ollama.
Solutions:
-
For LM Studio:
- Check the exact model name shown in LM Studio
- It might include path components:
llama-3-8b-instruct
ormodels/llama-3-8b
- Copy the exact name to Circuitry
-
For Ollama:
- List available models:
ollama list
- Use the name shown in the list
- Common names:
llama3.2
,qwen2.5-coder
,mistral
- List available models:
Error: "Connection refused"
Cause: Local model server is not running.
Solutions:
-
For LM Studio:
- Launch LM Studio
- Load a model
- Click "Start Server"
- Verify the green "Running" indicator
-
For Ollama:
- Start Ollama:
ollama serve
- Or run a model:
ollama run llama3.2
- Start Ollama:
### Platform-Specific CORS Issues

#### macOS

If the LM Studio GUI CORS toggle doesn't work on macOS:

```bash
# Use the CLI instead
lms server start --cors
```
#### Windows
The GUI toggle should work. If not:
- Try restarting LM Studio completely
- Check Windows Defender firewall settings
- Use ngrok as an alternative
#### Linux

Should work with both the GUI and CLI. If issues persist:

```bash
# Check if the port is accessible
curl http://localhost:1234/v1/models

# If blocked, check the firewall
sudo ufw status
sudo ufw allow 1234
```
### Still Having Issues?

- **Check Firewall**: Ensure your firewall allows localhost connections
- **Check Port**: Ensure the port is not in use by another application

  ```bash
  # macOS/Linux - Check what's using port 1234
  lsof -i :1234

  # Windows
  netstat -ano | findstr :1234
  ```

- **Browser Console**: Open browser DevTools (F12) → Console for detailed error messages
- **Try ngrok**: Skip CORS entirely and use an ngrok tunnel
- **Verify Model Loaded**: Make sure a model is actually loaded in LM Studio
## Using Local Models in Workflows

Once configured, local models work exactly like cloud models:

- **In Agent Nodes:**
  - Select your local model from the model dropdown
  - Configure temperature and other settings
  - Execute the workflow

- **In Chat Nodes:**
  - Select your local model
  - Chat naturally - it runs entirely on your machine

- **In Wizards:**
  - Note: Wizards may still use cloud models by default
  - This is a server-side limitation we're working on
## Benefits of Local Models

- ✅ **Privacy** - Your data never leaves your machine
- ✅ **Cost** - No API fees
- ✅ **Speed** - No internet latency (for fast models)
- ✅ **Offline** - Works without an internet connection
- ✅ **Customization** - Use any model you want
## Recommended Models

### For Code Generation

- **Qwen 3 Coder (30B)** - Latest state-of-the-art code generation model with excellent performance
- **Qwen 2.5 Coder (7B, 14B, 32B)** - Excellent for code tasks
- **DeepSeek Coder (6.7B, 33B)** - Strong coding performance
- **Code Llama (7B, 13B, 34B)** - Meta's code-focused model

### For General Chat

- **Llama 3.3 (70B)** - Latest and most capable open-source model
- **Llama 3.2 (1B, 3B)** - Fast and capable
- **Llama 3.1 (8B, 70B)** - Very capable, larger sizes are slower
- **Mistral (7B)** - Good balance of speed and quality

### For Reasoning

- **GPT-OSS (20B)** - OpenAI's open-source reasoning model with strong analytical capabilities
- **Qwen 2.5 (7B, 14B, 32B, 72B)** - Strong reasoning
- **Llama 3.3 (70B)** - Excellent reasoning and instruction following
- **Llama 3.1 (70B)** - Excellent reasoning (requires powerful hardware)
## Hardware Requirements

- **Small models (3B-7B):** 8GB RAM minimum, runs on CPU
- **Medium models (8B-14B):** 16GB RAM recommended, GPU helpful
- **Large models (30B+):** 32GB+ RAM, GPU highly recommended