Ollama lets you run large language models (LLMs) like Llama 3, Mistral, Gemma, and Phi locally on your own hardware—no cloud subscription, no data leaving your machine, complete privacy.
The problem: Ollama listens on localhost:11434 by default, which means only software running on the same machine can access it. The moment you want to query your local LLM from your phone, a second laptop, or a remote application, you hit a wall.
This guide shows you how to use SocketXP Remote Access Solution to expose your Ollama server to the internet with a permanent public HTTPS URL—without port forwarding, without a VPN, and without touching your router settings.
Why Access Ollama Remotely?
- Use your local LLM from any device: Query your models from a phone, tablet, or any laptop without being on the same WiFi.
- Build applications that call your private LLM: Connect web apps, mobile apps, or scripts running on other machines to your Ollama API.
- Share with teammates: Let collaborators query your models without setting up their own hardware.
- Avoid cloud API costs: Keep workloads on your local GPU instead of paying per token to a cloud provider.
How It Works
Ollama starts a REST API server on localhost:11434. SocketXP installs a lightweight agent on the same machine and creates a secure outbound SSL/TLS tunnel to the SocketXP Cloud Gateway.
SocketXP then assigns a permanent public HTTPS URL to that tunnel. Any HTTP request sent to that URL is securely forwarded through the tunnel to your local Ollama server—and the response travels back the same way.

Because the agent makes an outbound connection, your router never needs to be configured to allow inbound traffic. This also works through corporate firewalls, 4G/5G cellular networks, and Starlink satellite connections.
Step-by-Step: Access Ollama Remotely
Step 1: Start Ollama on Your Machine
Start the Ollama server:
$ ollama serve
2025/01/01 10:00:00 Listening on 127.0.0.1:11434 (version 0.x.x)
Verify it is running locally:
$ curl http://localhost:11434
Ollama is running
You do not need to change OLLAMA_HOST or any Ollama settings. SocketXP connects to localhost, which Ollama already listens on.
Step 2: Install the SocketXP Agent
Download and install the SocketXP agent on the machine running Ollama.
Step 3: Get Your Authentication Token
Sign up at the SocketXP Web Portal and copy your authentication token.

Authenticate the agent:
$ socketxp login "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
Step 4: Create the HTTPS Tunnel to Ollama
$ socketxp connect http://localhost:11434
Public URL -> https://your-user-id-abc123.socketxp.com
SocketXP outputs a permanent public HTTPS URL. Ollama is now reachable at that address from any machine on the internet.
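Before wiring the URL into an application, it can help to confirm that the tunnel really reaches Ollama. The sketch below repeats the Step 1 check from Python using only the standard library. The function name `ollama_is_reachable` is made up for this example, and the check assumes the root path still returns the "Ollama is running" banner shown earlier.

```python
import urllib.error
import urllib.request


def ollama_is_reachable(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if GET <base_url>/ answers with Ollama's banner text."""
    try:
        with urllib.request.urlopen(base_url.rstrip("/") + "/", timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, OSError, ValueError):
        # Unreachable host, refused connection, timeout, or malformed URL.
        return False
    return "Ollama is running" in body


# Example calls (replace the placeholder host with your own public URL):
# ollama_is_reachable("http://localhost:11434")
# ollama_is_reachable("https://your-user-id-abc123.socketxp.com")
```

Running the same function against both the local address and the public URL shows quickly whether a failure lies with Ollama itself or with the tunnel.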
Step 5: Query Your Models Remotely
Send API requests to the SocketXP public URL from any device:
$ curl https://your-user-id-abc123.socketxp.com/api/generate \
-d '{
"model": "llama3",
"prompt": "What is the capital of France?",
"stream": false
}'
Or use the OpenAI-compatible endpoint:
$ curl https://your-user-id-abc123.socketxp.com/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "llama3",
"messages": [{"role": "user", "content": "Hello!"}]
}'
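For scripts that cannot shell out to curl, the /api/generate call above could be made from Python with the standard library alone. This is a minimal sketch: the helper names `build_generate_request` and `generate` are illustrative, and the host is the same placeholder used in the curl examples.

```python
import json
import urllib.request


def build_generate_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build the same POST that the curl /api/generate example sends."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(base_url: str, model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the request and return the generated text."""
    req = build_generate_request(base_url, model, prompt)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        reply = json.loads(resp.read().decode("utf-8"))
    # With "stream": false, Ollama returns a single JSON object whose
    # "response" field holds the generated text.
    return reply["response"]


# Example call (placeholder host):
# print(generate("https://your-user-id-abc123.socketxp.com", "llama3",
#                "What is the capital of France?"))
```

The generous default timeout is a guess that leaves room for a model that must be loaded into memory before it answers; adjust it for your hardware.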
Using with OpenAI SDK or LangChain
Because Ollama exposes an OpenAI-compatible API, you can point any OpenAI SDK or LangChain client to your SocketXP URL:
from openai import OpenAI

client = OpenAI(
    base_url="https://your-user-id-abc123.socketxp.com/v1",
    api_key="ollama",  # Ollama ignores the key, but the SDK requires it
)

response = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "Explain gradient descent in one paragraph."}],
)
print(response.choices[0].message.content)
This lets you swap your cloud LLM endpoint for your private Ollama instance in any application that uses the OpenAI SDK.
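One way to make that swap without editing code is to read the endpoint from the environment. The sketch below is an assumption-laden illustration: the variable name `LLM_BASE_URL` and the helper `resolve_endpoint` are invented here, and the cloud default is the public OpenAI API base URL.

```python
import os
from typing import Mapping, Tuple

# Hypothetical variable name, chosen for this sketch only.
BASE_URL_VAR = "LLM_BASE_URL"
CLOUD_DEFAULT = "https://api.openai.com/v1"


def resolve_endpoint(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (base_url, api_key) for constructing an OpenAI client.

    If LLM_BASE_URL is set, the private Ollama endpoint is used with a
    placeholder key, since Ollama ignores it. Otherwise the cloud endpoint
    is used with whatever OPENAI_API_KEY holds.
    """
    private_url = env.get(BASE_URL_VAR)
    if private_url:
        return private_url.rstrip("/"), "ollama"
    return CLOUD_DEFAULT, env.get("OPENAI_API_KEY", "")


# Example wiring (requires the openai package):
# from openai import OpenAI
# base_url, api_key = resolve_endpoint()
# client = OpenAI(base_url=base_url, api_key=api_key)
```

Setting `LLM_BASE_URL=https://your-user-id-abc123.socketxp.com/v1` before starting the application would then route requests to the private instance, and unsetting it would fall back to the cloud provider.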
Keeping the Tunnel Running Permanently
To ensure the Ollama tunnel stays up after reboots or terminal sessions close, configure the SocketXP agent as a systemd service. See the SocketXP Getting Started guide for setup instructions.
Once configured as a service, the agent starts automatically with the machine and maintains the tunnel—your Ollama URL is always accessible.
Other Local LLM Servers
The same SocketXP method works for any local LLM inference server:
| Server | Default port | SocketXP command |
|---|---|---|
| Ollama | 11434 | socketxp connect http://localhost:11434 |
| LM Studio | 1234 | socketxp connect http://localhost:1234 |
| llama.cpp server | 8080 | socketxp connect http://localhost:8080 |
| vLLM | 8000 | socketxp connect http://localhost:8000 |
| Jan.ai | 1337 | socketxp connect http://localhost:1337 |
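If you script your setup, the table above can be captured as a small lookup. This is only a convenience sketch: the dictionary keys and the function name `socketxp_command` are invented for the example, and the ports are the defaults listed in the table, which your own installation may override.

```python
from typing import Optional

# Default ports copied from the table above; the keys are arbitrary labels.
DEFAULT_PORTS = {
    "ollama": 11434,
    "lm-studio": 1234,
    "llama.cpp": 8080,
    "vllm": 8000,
    "jan": 1337,
}


def socketxp_command(server: str, port: Optional[int] = None) -> str:
    """Return the tunnel command for a known server, or for an explicit port."""
    if port is None:
        # Raises KeyError for a server that is not in the table.
        port = DEFAULT_PORTS[server]
    return f"socketxp connect http://localhost:{port}"


print(socketxp_command("ollama"))
```

Passing `port` explicitly covers servers started on a non-default port.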
Security Considerations
The Ollama API does not include built-in authentication. The public URL SocketXP creates is accessible to anyone who knows it. Keep these points in mind:
- Keep the public URL private. Do not share it publicly.
- If you need to share access with multiple people, consider running an authenticated reverse proxy (such as nginx with HTTP basic auth) in front of Ollama on a different local port, then tunnel that port instead.
- SocketXP encrypts all traffic in transit with SSL/TLS, so the content of API requests and responses is protected between your machine and the caller.
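To illustrate the authenticated-proxy idea, here is a rough standard-library sketch that stands in for the nginx setup mentioned above. It is not an nginx configuration and not production-ready: the port 8081 and the credentials are placeholders, the names `is_authorized` and `AuthProxy` are invented, and it buffers each upstream reply in full, so streamed responses arrive all at once.

```python
import base64
import hmac
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

UPSTREAM = "http://localhost:11434"  # Ollama's default address
PROXY_PORT = 8081                    # hypothetical local port to tunnel instead
USERNAME, PASSWORD = "ollama-user", "change-me"  # placeholders, replace both


def is_authorized(header_value) -> bool:
    """Check an HTTP Basic Authorization header against the configured credentials."""
    if not header_value or not header_value.startswith("Basic "):
        return False
    try:
        decoded = base64.b64decode(header_value[len("Basic "):], validate=True)
    except ValueError:  # binascii.Error is a ValueError subclass
        return False
    expected = f"{USERNAME}:{PASSWORD}".encode("utf-8")
    # Constant-time comparison avoids leaking how much of the value matched.
    return hmac.compare_digest(decoded, expected)


class AuthProxy(BaseHTTPRequestHandler):
    def _forward(self):
        if not is_authorized(self.headers.get("Authorization")):
            self.send_response(401)
            self.send_header("WWW-Authenticate", 'Basic realm="ollama"')
            self.send_header("Content-Length", "0")
            self.end_headers()
            return
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length) if length else None
        req = urllib.request.Request(
            UPSTREAM + self.path,
            data=body,
            method=self.command,
            headers={"Content-Type": self.headers.get("Content-Type", "application/json")},
        )
        try:
            with urllib.request.urlopen(req, timeout=300) as upstream:
                status, payload = upstream.status, upstream.read()
                content_type = upstream.headers.get("Content-Type", "application/octet-stream")
        except urllib.error.HTTPError as err:
            # Pass Ollama's own error status and body back to the caller.
            status, payload = err.code, err.read()
            content_type = err.headers.get("Content-Type", "text/plain")
        except OSError:
            status, payload, content_type = 502, b"upstream unreachable", "text/plain"
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    do_GET = do_POST = do_DELETE = _forward


def serve() -> None:
    """Listen on localhost only, so the proxy is reachable solely via the tunnel."""
    ThreadingHTTPServer(("127.0.0.1", PROXY_PORT), AuthProxy).serve_forever()
```

With `serve()` running, you would tunnel the proxy port rather than Ollama's, for example `socketxp connect http://localhost:8081`, and callers would need to send the matching Basic credentials with every request.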
Why SocketXP vs. Other Methods for Remote Ollama Access?
Port forwarding: Requires a public IP, router configuration, and exposes your machine directly to the internet.
SSH tunnel: Works but requires running a tunnel command every time from a machine that already has SSH access—impractical for applications or mobile clients.
ngrok free tier: URL changes every session. Applications that use a hardcoded endpoint break every time ngrok restarts.
SocketXP: Permanent URL, no router configuration, works on any internet connection, and the URL stays the same indefinitely.
Conclusion
SocketXP gives your locally hosted Ollama server a permanent, secure public address with a single command: socketxp connect http://localhost:11434. Your models run on your own hardware, your data stays on your machine, and you can reach the API from any device anywhere in the world.
For more on SocketXP’s tunneling capabilities, visit SocketXP IoT Remote Access or the Getting Started guide.
Frequently Asked Questions
1. Can I access Ollama remotely without a public IP or port forwarding?
Yes. SocketXP installs a lightweight agent on the machine running Ollama. The agent makes an outbound SSL/TLS connection to the SocketXP Cloud Gateway, which gives you a permanent public HTTPS URL. No inbound ports need to be opened on your router or firewall.
2. Does SocketXP work with Ollama without changing OLLAMA_HOST?
Yes. By default, Ollama listens on localhost (127.0.0.1:11434). SocketXP connects to localhost on the same machine, so you do not need to set OLLAMA_HOST or change any Ollama configuration. The tunnel handles the external connectivity.
3. Can I access Ollama from my phone or another computer using the SocketXP URL?
Yes. The SocketXP public URL is reachable from any device with an internet connection—your phone, another laptop, or a cloud server—without needing to be on the same network as your Ollama machine.
4. Is it safe to expose the Ollama API remotely via SocketXP?
SocketXP encrypts all traffic with SSL/TLS. However, the Ollama API itself does not require authentication by default, so anyone who knows your public URL can query your models. Keep the URL private, share it only with trusted users, or place an authenticated reverse proxy in front of Ollama if you need access control.
5. Can I use the SocketXP Ollama URL with OpenAI-compatible clients?
Yes. Ollama exposes an OpenAI-compatible API at /v1/chat/completions. Set your client’s base URL to the SocketXP public URL with /v1 appended (for example, https://your-user-id-abc123.socketxp.com/v1) and it will work with any OpenAI SDK or tool that supports a custom base URL.
6. Does this work with LM Studio or other local LLM servers?
Yes. Any local LLM server that listens on a localhost port—LM Studio (port 1234), llama.cpp server (port 8080), or a custom inference server—can be exposed remotely using the same SocketXP method: socketxp connect http://localhost:<port>.
7. Can I run multiple Ollama models and access them remotely?
Yes. Ollama serves every model you have pulled through a single API endpoint (port 11434) and loads each one on demand. You select the model in your API request body using the model field (e.g., "model": "llama3"). One SocketXP tunnel covers all your Ollama models.
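To see which model names are available through the tunnel, a client could query Ollama's /api/tags endpoint, which lists the locally installed models. The sketch below uses only the standard library; the function names are illustrative and the host is the placeholder used throughout this guide.

```python
import json
import urllib.request
from typing import List


def parse_model_names(tags_payload: dict) -> List[str]:
    """Extract model names from the JSON object returned by /api/tags."""
    return [entry["name"] for entry in tags_payload.get("models", [])]


def list_models(base_url: str, timeout: float = 10.0) -> List[str]:
    """Fetch /api/tags from the given base URL and return the model names."""
    url = base_url.rstrip("/") + "/api/tags"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_model_names(json.loads(resp.read().decode("utf-8")))


# Example call (placeholder host):
# print(list_models("https://your-user-id-abc123.socketxp.com"))
```

Any name returned here can be passed as the model field in a request sent through the same URL.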
8. Can I use Ollama with AI coding assistants like Cursor or Continue.dev remotely via SocketXP?
Yes. Tools like Cursor, Continue.dev, and other OpenAI-compatible coding assistants allow you to set a custom base URL for the language model API. Set the base URL to your SocketXP public URL and select the Ollama model you want to use. Your coding assistant will send requests through the SocketXP tunnel to your local Ollama instance.
9. Can I use LangChain or LlamaIndex with a remote Ollama instance via SocketXP?
Yes. Both LangChain and LlamaIndex support Ollama as an LLM backend and allow you to set a custom base URL. Point the base_url to your SocketXP public URL instead of http://localhost:11434, and all inference calls from your application will be routed to your remote Ollama server.
10. How do I run Ollama on a home server and use it from my work laptop?
Install Ollama and the SocketXP agent on your home server. Run socketxp connect http://localhost:11434 to get a permanent public URL. From your work laptop—even behind a corporate firewall—use that URL as the Ollama API endpoint in your applications or coding tools. No VPN is required.
11. How do I keep the Ollama SocketXP tunnel running permanently after a reboot?
Configure the SocketXP agent as a Linux systemd service following the instructions in the SocketXP Getting Started guide. The service starts automatically when the machine boots, re-establishes the tunnel, and your Ollama public URL remains reachable without any manual intervention.
12. What is the best alternative to ngrok for exposing Ollama remotely?
SocketXP is a strong alternative to ngrok for Ollama because it provides a permanent URL that does not change between sessions. ngrok free tier URLs are ephemeral—they change every time you restart the tunnel, which breaks any client application or coding tool that has the URL hardcoded. SocketXP’s permanent URL is more practical for ongoing Ollama API access.
13. Can I expose Ollama running inside a Docker container remotely?
Yes. If Ollama is running inside Docker and you have mapped its port to the host (e.g., -p 11434:11434), the host machine’s localhost:11434 is accessible. Run the SocketXP agent on the host machine and create the tunnel with socketxp connect http://localhost:11434 to expose it remotely.