Xano Documentation
  • 👋Welcome to Xano!
  • 🌟Frequently Asked Questions
  • 🔐Security & Compliance (Trust Center)
  • 🙏Feature Requests
  • 💔Known Issues
  • Before You Begin
    • Using These Docs
    • Where should I start?
    • Set Up a Free Xano Account
    • Key Concepts
    • The Development Life Cycle
    • Navigating Xano
    • Plans & Pricing
  • The Database
    • Designing your Database
    • Database Basics
      • Using the Xano Database
      • Field Types
      • Relationships
      • Database Views
      • Export and Sharing
      • Data Sources
    • Migrating your Data
      • Airtable to Xano
      • Supabase to Xano
      • CSV Import & Export
    • Database Performance and Maintenance
      • Storage
      • Indexing
      • Maintenance
      • Schema Versioning
  • 🛠️The Function Stack
    • Building with Visual Development
      • APIs
        • Swagger (OpenAPI Documentation)
      • Custom Functions
        • Async Functions
      • Background Tasks
      • Triggers
      • Middleware
      • Configuring Expressions
      • Working with Data
    • Functions
      • AI Tools
      • Database Requests
        • Query All Records
          • External Filtering Examples
        • Get Record
        • Add Record
        • Edit Record
        • Add or Edit Record
        • Patch Record
        • Delete Record
        • Bulk Operations
        • Database Transaction
        • External Database Query
        • Direct Database Query
        • Get Database Schema
      • Data Manipulation
        • Create Variable
        • Update Variable
        • Conditional
        • Switch
        • Loops
        • Math
        • Arrays
        • Objects
        • Text
      • Security
      • APIs & Lambdas
        • Realtime Functions
        • External API Request
        • Lambda Functions
      • Data Caching (Redis)
      • Custom Functions
      • Utility Functions
      • File Storage
      • Cloud Services
    • Filters
      • Manipulation
      • Math
      • Timestamp
      • Text
      • Array
      • Transform
      • Conversion
      • Comparison
      • Security
    • Data Types
      • Text
      • Expression
      • Array
      • Object
      • Integer
      • Decimal
      • Boolean
      • Timestamp
      • Null
    • Environment Variables
    • Additional Features
      • Response Caching
  • Testing and Debugging
    • Testing and Debugging Function Stacks
    • Unit Tests
    • Test Suites
  • CI/CD
  • File Storage
    • File Storage in Xano
    • Private File Storage
  • Realtime
    • Realtime in Xano
    • Channel Permissions
    • Realtime in Webflow
  • Maintenance, Monitoring, and Logging
    • Statement Explorer
    • Request History
    • Instance Dashboard
      • Memory Usage
  • Building Backend Features
    • User Authentication & User Data
      • Separating User Data
      • Restricting Access (RBAC)
      • OAuth (SSO)
    • Webhooks
    • Messaging
    • Emails
    • Custom Report Generation
    • Fuzzy Search
    • Chatbots
  • Xano Features
    • Snippets
    • Instance Settings
      • Release Track Preferences
      • Static IP (Outgoing)
      • Change Server Region
      • Direct Database Connector
      • Backup and Restore
      • Security Policy
    • Workspace Settings
    • Advanced Back-end Features
      • Xano Link
      • Developer API (Deprecated)
    • Metadata API
      • Master Metadata API
      • Tables and Schema
      • Content
      • Search
      • File
      • Request History
      • Workspace Import and Export
      • Token Scopes Reference
  • Build With AI
    • Building a Backend Using AI
    • Get Started Assistant
    • AI Database Assistant
    • AI Lambda Assistant
    • AI SQL Assistant
    • API Request Assistant
    • Template Engine
    • Streaming APIs
  • Using AI Builders with Xano
  • Build For AI
    • MCP Builder
      • Connecting Clients
      • MCP Functions
    • Xano MCP Server
  • Xano Transform
    • Using Xano Transform
  • Xano Actions
    • What are Actions?
    • Browse Actions
  • Team Collaboration
    • Realtime Collaboration
    • Managing Team Members
    • Branching & Merging
    • Role-based Access Control (RBAC)
  • Agencies
    • Xano for Agencies
    • Agency Features
      • Agency Dashboard
      • Client Invite
      • Transfer Ownership
      • Agency Profile
      • Commission
      • Private Marketplace
  • Enterprise
    • Xano for Enterprise
    • Enterprise Features
      • Microservices
        • Ollama
          • Choosing a Model
      • Tenant Center
      • Compliance Center
      • Security Policy
      • Instance Activity
      • Deployment
      • RBAC (Role-based Access Control)
      • Xano Link
  • Your Xano Account
    • Account Page
    • Billing
    • Referrals & Commissions
  • Troubleshooting & Support
    • Error Reference
    • Troubleshooting Performance
      • When a single workflow feels slow
      • When everything feels slow
      • RAM Usage
      • Function Stack Performance
    • Getting Help
      • Granting Access
      • Community Code of Conduct
      • Community Content Modification Policy
  • Special Pricing
    • Students & Education
    • Non-Profits
  • Security
    • Best Practices
Powered by GitBook
On this page
  • What is Ollama?
  • What can I do with Ollama + Xano?
  • Getting Started
  • If you're not on an Enterprise plan, reach out to our team to get started.
  • Head to your instance settings, and select Microservices
  • Choose a model to deploy
  • Add a persistent volume
  • Add a deployment
  • Deploy your chosen model
  • Interact with your Ollama deployment
  • What's next?

Was this helpful?

  1. Enterprise
  2. Enterprise Features
  3. Microservices

Ollama

Deploy LLMs as a part of your Xano instance

Last updated 17 days ago

Was this helpful?

Quick Summary

Through Xano's Microservice feature, you can deploy Ollama as a part of your Xano instance, enabling secure communication with an off-the-shelf or customizable LLM tailored specifically to your needs.

What is Ollama?

Ollama is a platform that enables seamless integration of large language models (LLMs) into various applications. It provides an environment where businesses can deploy pre-built or customized LLMs, facilitating secure and efficient communication tailored to specific business needs.

This makes it easier for companies to leverage advanced AI technology without extensive in-house development, and without the concerns of sending data to a third-party AI provider. In addition, deploying Ollama as a part of your Xano instance can be a significant cost-saving measure in comparison to leveraging a third party service.

What can I do with Ollama + Xano?

Deploying Ollama as a microservice in Xano helps address a critical data privacy challenge when building AI-enabled applications that handle sensitive information.

By processing data within your controlled infrastructure boundary rather than sending it to external AI providers, using Xano with microservices removes one significant barrier to developing secure applications that can still leverage the benefits of AI models.

While this approach addresses an important technical aspect of data privacy, comprehensive security and compliance will require implementing additional safeguards appropriate to your specific regulatory environment.

Getting Started

1

If you're not on an Enterprise plan, reach out to our team to get started.

Talk to Sales

2

Head to your instance settings, and select Microservices

3

Choose a model to deploy

You'll want to know which model you're deploying so we understand the resources that need to be allocated. Check out the resource linked below if you need help choosing a model.

Choosing a Model

4

Add a persistent volume

A persistent volume is just a place that the microservice can store data that remains between restarts. Ollama will use this volume to store the model(s) that you're working with.

How much storage do I need?

When browsing Ollama models, use the Size column for your chosen model to determine how large of a volume you should deploy. Make sure to add a little extra when creating your volume, just for some breathing room.

Name the volume ollama, select the size, and choose SSD as the storage class. When you're ready, click Add

5

Add a deployment

Click Add under Deployments

Fill out the following information. For this example, we'll be deploying the phi3:mini model, which should work with the following example values.

Parameter
Purpose
Example Value

Deployment Name

Name of the deployment

ollama

Replicas

Number of container instances

1

Docker Config

Source of the Docker image

public repo

Deployment Strategy

Update strategy

RollingUpdate

Container Name

Name of the container

ollama

Container Type

Container type

Standard

Docker Image

Docker image to use

ollama/ollama

Container Port

Port the container listens on

11434

Service Port

Port exposed to the service

11434

Persistent Volume Name

Name of the persistent storage volume you created in the previous step

ollama

Volume Type

Volume type

Persistent Volume

Mount Path

Path in container where volume is mounted

/root/.ollama/models

Min CPU

Minimum CPU allocation

500m

Max CPU

Maximum CPU allocation

2000m

Min RAM

Minimum RAM allocation

4096Mi

Max RAM

Maximum RAM allocation

8192Mi

Click Add and then Update & Deploy

6

Deploy your chosen model

All interactions with Xano microservices are facilitated by the Microservice function, which is a REST API request to your chosen microservice.

Use the cURL below to get started. Add a Microservice function to your function stack, and click Import cURL

curl -X POST http://localhost:11434/api/pull \
  -H "Content-Type: application/json" \
  -d '{
    "name": "phi3:mini"
}'

Please note that the Content-Type: application/json header is required, or your requests will fail.

Click on the host dropdown and choose your deployment, likely called ollama

Change the timeout to a value that will allow you to monitor the deployment right inside of Xano. Below are some recommended values. You can also monitor the deployment via the logs of the microservice, available from your instance settings where you deployed the microservice earlier.

phi3:mini (~5 GB)

10–60 seconds

5 min (300s)

mistral:7b (~8 GB)

30–120 seconds

5–10 min (300–600s)

llama3:70b (~40 GB)

5–20 minutes or more

15–30 min

Once the model has downloaded, you'll be returned a 200 status response, as shown below. The model will be saved to your persistent storage volume, so you should only have to do this once.

7

Interact with your Ollama deployment

You're ready to receive generations from your Ollama microservice. Here's an example cURL command to get you started.

curl -X POST http://localhost:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "model": "phi3:mini",
    "prompt": "Explain the difference between supervised and unsupervised learning.",
    "stream": false
}'

The above command will return an output like the one shown below.

8

What's next?

Ollama offers a set of standard commands that can be issued via REST API endpoints. You can review them here, and use them as necessary inside of your function stacks.