Using Local AI Models with Circuitry

Circuitry supports local AI model servers such as LM Studio and Ollama, allowing you to run AI workflows entirely on your own machine without cloud API costs.

Why Local Models Need CORS Enabled

When using local models with Circuitry (especially when Circuitry itself is deployed rather than running on your machine), your browser needs to communicate directly with your local model server. This requires CORS (Cross-Origin Resource Sharing) to be enabled on that server.

The Technical Details

Your Browser (Circuitry)
    ↓ Direct connection
Your Local Model Server (LM Studio/Ollama)

Without CORS enabled, your browser will block these requests for security reasons.
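
If you want to confirm whether a local server is actually sending CORS headers, a quick curl check works. This is a minimal sketch assuming LM Studio's default port 1234; for Ollama, use port 11434 and the /api/tags path instead.

# Send a request with an explicit Origin header and inspect the response headers
curl -i -H "Origin: http://localhost:3001" http://localhost:1234/v1/models

# If CORS is enabled you should see an Access-Control-Allow-Origin header in the
# response; if it is missing, the browser will block the request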

Setting Up LM Studio

1. Install and Configure LM Studio

  1. Download LM Studio from lmstudio.ai
  2. Install and launch LM Studio
  3. Download a model (e.g., Llama 3.2, Qwen, Mistral)

2. Enable CORS

Important: This step is required for Circuitry to work with LM Studio.

Option A: Using LM Studio GUI (Windows/macOS/Linux)

  1. Open LM Studio
  2. Click the Settings icon (⚙️) in the sidebar
  3. Go to the Server tab
  4. Find the "Enable CORS" checkbox
  5. Check the box to enable CORS
  6. Click Start Server (or restart if already running)

Option B: Using LM Studio CLI (macOS/Linux)

If the GUI CORS toggle doesn't work or you prefer command line:

# macOS/Linux - Start server with CORS enabled
lms server start --cors

The CLI will output:

W CORS is enabled. This means any website you visit can use the LM Studio server.
Success! Server is now running on port 1234

Note: On macOS, you may need to use the CLI method as the GUI toggle sometimes doesn't properly set CORS headers.

3. Note Your Server Details

After starting the server, LM Studio will show:

  • Server Address: Usually http://localhost:1234
  • Model Name: The loaded model (e.g., llama-3.2-3b-instruct)
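
To double-check the server address and find the exact model identifier the server exposes, you can query LM Studio's OpenAI-compatible models endpoint (assuming the default port 1234):

# The "id" fields in the response are the exact Model IDs to use in Circuitry
curl http://localhost:1234/v1/models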

4. Add Model to Circuitry

  1. In Circuitry, click the Settings icon (⚙️)
  2. Go to Model Settings
  3. Click + Add Custom Model
  4. Fill in:
    • Model ID: The exact model name from LM Studio (e.g., llama-3.2-3b-instruct)
    • Display Name: A friendly name (e.g., Llama 3.2 3B (Local))
    • Provider: Select Local
    • Endpoint: Enter your server address (e.g., http://localhost:1234)
  5. Click Save
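
Before using the model in a workflow, you can sanity-check the endpoint and Model ID with a direct request. The model name below is only an example; substitute whatever /v1/models returned for your setup.

# macOS/Linux - minimal chat completion against LM Studio's OpenAI-compatible API
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.2-3b-instruct",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}]
  }'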

Setting Up Ollama

1. Install Ollama

# macOS/Linux
curl -fsSL https://ollama.com/install.sh | sh

# Or download from: https://ollama.com/download

2. Pull a Model

ollama pull llama3.2
# or
ollama pull qwen2.5-coder

3. Start Ollama with CORS Enabled

Option A: Set Environment Variable (Permanent)

# macOS/Linux - Add to ~/.zshrc or ~/.bashrc
export OLLAMA_ORIGINS="http://localhost:3001,http://localhost:3000"

# Windows PowerShell (current session only)
$env:OLLAMA_ORIGINS="http://localhost:3001,http://localhost:3000"

# Windows - persist the setting for future sessions
setx OLLAMA_ORIGINS "http://localhost:3001,http://localhost:3000"

# Then restart Ollama
ollama serve

Option B: Run with Environment Variable (Temporary)

# macOS/Linux
OLLAMA_ORIGINS=* ollama serve

# Windows (Command Prompt)
set OLLAMA_ORIGINS=* && ollama serve

Important for Security: Using OLLAMA_ORIGINS=* allows requests from any origin. Outside of quick local testing, restrict it to the exact origins Circuitry runs on, such as http://localhost:3001.

4. Verify Ollama is Running

# Check if Ollama is running
curl http://localhost:11434/api/tags

# Test model generation
ollama run llama3.2 "Hello!"
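
Those checks confirm the server is up, but they don't prove CORS is configured. To verify that too, repeat the request with an explicit Origin header (the origin below is an example; use the one Circuitry actually runs on):

# Request Ollama's model list with an explicit Origin header
curl -i -H "Origin: http://localhost:3001" http://localhost:11434/api/tags

# An Access-Control-Allow-Origin header in the response means CORS is set up;
# if it is missing, re-check OLLAMA_ORIGINS and restart Ollama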

5. Add Model to Circuitry

  1. In Circuitry, click the Settings icon (⚙️)
  2. Go to Model Settings
  3. Click + Add Custom Model
  4. Fill in:
    • Model ID: The Ollama model name (e.g., llama3.2, qwen2.5-coder)
    • Display Name: A friendly name (e.g., Llama 3.2 (Ollama))
    • Provider: Select Local
    • Endpoint: Enter http://localhost:11434
  5. Click Save
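
As with LM Studio, a quick request is an easy way to confirm the endpoint and model name before running a workflow. The model name below is an example; use one that appears in ollama list.

# Quick non-streaming generation test against Ollama's API
curl http://localhost:11434/api/generate \
  -d '{"model": "llama3.2", "prompt": "Say hello in one sentence.", "stream": false}'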

Alternative: Using ngrok for Remote Access

If you can't enable CORS or want to access your local models remotely, use ngrok:

What is ngrok?

ngrok creates a secure tunnel to your localhost, allowing you to access your local model server from anywhere (including when Circuitry is deployed).

Setup ngrok

  1. Install ngrok:

    # macOS (with Homebrew)
    brew install ngrok/ngrok/ngrok
    
    # Or download from: https://ngrok.com/download
    
  2. Sign up for free account:

    • Go to ngrok.com
    • Sign up (free tier is sufficient)
    • Copy your auth token
  3. Configure ngrok:

    ngrok config add-authtoken YOUR_AUTH_TOKEN
    
  4. Start LM Studio server:

    # macOS/Linux
    lms server start --cors
    
    # Or use the GUI
    
  5. Create ngrok tunnel:

    # For LM Studio (port 1234)
    ngrok http 1234
    
    # For Ollama (port 11434)
    ngrok http 11434
    
  6. Copy the public URL: ngrok will output something like:

    Forwarding  https://abc123.ngrok-free.app -> http://localhost:1234
    
  7. Use in Circuitry:

    • Add custom model with endpoint: https://abc123.ngrok-free.app
    • No CORS issues! (ngrok handles it)
    • Works from anywhere, even when deployed
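
Before adding the ngrok URL to Circuitry, it is worth checking that the tunnel actually reaches your model server. Replace the URL below with the one ngrok printed for you; the path depends on which server is behind the tunnel.

# LM Studio behind the tunnel
curl https://abc123.ngrok-free.app/v1/models

# Ollama behind the tunnel
curl https://abc123.ngrok-free.app/api/tags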

ngrok Benefits

  ✅ No CORS issues - ngrok provides proper headers
  ✅ Remote access - Access your local models from anywhere
  ✅ Works with deployed apps - Even Vercel deployments can reach your local machine
  ✅ Secure - HTTPS encryption and authentication

ngrok Free Tier Limits

  • 1 online ngrok process
  • 40 connections/minute
  • Random URL (changes on restart)
  • Perfect for development/personal use

Troubleshooting

Error: "Cannot connect to local model"

Cause: CORS is not enabled or the server is not running.

Solutions:

  1. For LM Studio:

    Try GUI first:

    • Open LM Studio Settings → Server
    • Ensure "Enable CORS" is checked
    • Restart the server
    • Verify the server is running (green indicator)

    If GUI doesn't work (especially on macOS):

    # Stop any running server in GUI
    # Then use CLI:
    lms server start --cors
    

    Still not working? Use ngrok:

    # Terminal 1: Start LM Studio
    lms server start
    
    # Terminal 2: Create tunnel
    ngrok http 1234
    
    # Use the ngrok URL as your endpoint
    
  2. For Ollama:

    • Ensure OLLAMA_ORIGINS is set
    • Restart Ollama: ollama serve
    • Check if running: curl http://localhost:11434/api/tags

    If CORS still fails:

    # Terminal 1: Start Ollama
    OLLAMA_ORIGINS=* ollama serve
    
    # Terminal 2: Create tunnel
    ngrok http 11434
    
    # Use the ngrok URL as your endpoint
    
  3. Check Endpoint:

    • LM Studio default: http://localhost:1234
    • Ollama default: http://localhost:11434
    • ngrok URL: https://xxxxx.ngrok-free.app
    • Can you access the endpoint in your browser?

Error: "Model not found"

Cause: Model ID doesn't match what's loaded in LM Studio/Ollama.

Solutions:

  1. For LM Studio:

    • Check the exact model name shown in LM Studio
    • It might include path components: llama-3-8b-instruct or models/llama-3-8b
    • Copy the exact name to Circuitry
  2. For Ollama:

    • List available models: ollama list
    • Use the name shown in the list
    • Common names: llama3.2, qwen2.5-coder, mistral

Error: "Connection refused"

Cause: Local model server is not running.

Solutions:

  1. For LM Studio:

    • Launch LM Studio
    • Load a model
    • Click "Start Server"
    • Verify the green "Running" indicator
  2. For Ollama:

    • Start Ollama: ollama serve
    • Or run a model: ollama run llama3.2

Platform-Specific CORS Issues

macOS

If the LM Studio GUI CORS toggle doesn't work on macOS:

# Use the CLI instead
lms server start --cors

Windows

The GUI toggle should work. If not:

  1. Try restarting LM Studio completely
  2. Check Windows Defender firewall settings
  3. Use ngrok as an alternative

Linux

CORS should work with both the GUI toggle and the CLI on Linux. If issues persist:

# Check if the port is accessible
curl http://localhost:1234/v1/models

# If blocked, check firewall
sudo ufw status
sudo ufw allow 1234

Still Having Issues?

  1. Check Firewall: Ensure your firewall allows localhost connections
  2. Check Port: Ensure the port is not in use by another application
    # macOS/Linux - Check what's using port 1234
    lsof -i :1234
    
    # Windows
    netstat -ano | findstr :1234
    
  3. Browser Console: Open browser DevTools (F12) → Console for detailed error messages
  4. Try ngrok: Skip CORS entirely and use ngrok tunnel
  5. Verify Model Loaded: Make sure a model is actually loaded in LM Studio

Using Local Models in Workflows

Once configured, local models work exactly like cloud models:

  1. In Agent Nodes:

    • Select your local model from the model dropdown
    • Configure temperature and other settings
    • Execute the workflow
  2. In Chat Nodes:

    • Select your local model
    • Chat naturally - it runs entirely on your machine
  3. In Wizards:

    • Note: Wizards may still use cloud models by default
    • This is a server-side limitation we're working on

Benefits of Local Models

  ✅ Privacy - Your data never leaves your machine
  ✅ Cost - No API fees
  ✅ Speed - No internet latency (for fast models)
  ✅ Offline - Works without internet connection
  ✅ Customization - Use any model you want

Recommended Models

For Code Generation

  • Qwen 3 Coder (30B) - Latest state-of-the-art code generation model with excellent performance
  • Qwen 2.5 Coder (7B, 14B, 32B) - Excellent for code tasks
  • DeepSeek Coder (6.7B, 33B) - Strong coding performance
  • Code Llama (7B, 13B, 34B) - Meta's code-focused model

For General Chat

  • Llama 3.3 (70B) - Latest and most capable open-source model
  • Llama 3.2 (1B, 3B) - Fast and capable
  • Llama 3.1 (8B, 70B) - Very capable, larger sizes are slower
  • Mistral (7B) - Good balance of speed and quality

For Reasoning

  • GPT-OSS (20B) - OpenAI's open-source reasoning model with strong analytical capabilities
  • Qwen 2.5 (7B, 14B, 32B, 72B) - Strong reasoning
  • Llama 3.3 (70B) - Excellent reasoning and instruction following
  • Llama 3.1 (70B) - Excellent reasoning (requires powerful hardware)

Hardware Requirements

  • Small models (3B-7B): 8GB RAM minimum, runs on CPU
  • Medium models (8B-14B): 16GB RAM recommended, GPU helpful
  • Large models (30B+): 32GB+ RAM, GPU highly recommended

Next Steps