Using Local LLMs with Ollama

Introduction

BrainSoup supports local Large Language Models (LLMs) through Ollama, an open-source application. This feature is especially useful for users who prioritize data privacy and security, or who want to control costs by running models on their own hardware.

What is Ollama?

Ollama is a tool for running, creating, and sharing LLMs locally. It supports a variety of models, including Llama 2, Mistral, and Phi-2, which makes it suitable for both personal and organizational use. Models are managed through a command-line interface, with an optional REST API for advanced use cases. Ollama runs on macOS, Linux, and Windows.

Step 1: Downloading and Installing Ollama

Visit the official Ollama website and download the latest version of the application for your operating system.
Follow the installation instructions provided on the website to set up Ollama on your machine.

Step 2: Downloading a Model with the Ollama CLI

Open a terminal window on your machine.
Use the following command to download a model from the Ollama repository (replace <model_name> with the name of the model you wish to download):

ollama pull <model_name>

Tips:

You can browse the models available for download in the Ollama library. To see which models are already installed on your machine, run the command ollama list.
Some models are specifically optimized for certain domains or tasks, such as mathematics, programming, medical applications, role-playing, and more. By combining agents with different models, you can create your personalized team of experts.
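The download step can also be scripted, for example to prepare several models in one go. The following is a minimal Python sketch, assuming the ollama executable is installed and on your PATH. The helper names pull_command and pull_model are illustrative and are not part of Ollama or BrainSoup.

```python
import subprocess


def pull_command(model_name: str) -> list[str]:
    """Build the argument list for `ollama pull <model_name>`."""
    name = model_name.strip()
    if not name:
        raise ValueError("model_name must not be empty")
    return ["ollama", "pull", name]


def pull_model(model_name: str) -> None:
    """Download a model by invoking the Ollama CLI.

    Raises FileNotFoundError if the ollama executable is not found,
    and subprocess.CalledProcessError if the command exits with an error.
    """
    subprocess.run(pull_command(model_name), check=True)
```

Calling pull_model("mistral") is then equivalent to typing ollama pull mistral in a terminal.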

Step 3: Integrating Ollama with BrainSoup

Once Ollama is installed, BrainSoup detects it automatically when both applications run on the same machine, and all installed Ollama models become available within BrainSoup.

For Local Installation:

BrainSoup detects Ollama automatically.
All models managed by Ollama will be listed in BrainSoup under the AI providers section in settings.

For Remote Installation or Docker:

If you have Ollama installed on a different machine or within a Docker container, you need to specify the URL of your Ollama server:

Navigate to the Settings screen in BrainSoup.
Go to the AI providers section.
Enter the URL of your remote Ollama server.
Click on the Connect button to establish the connection.
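Before entering a remote URL in BrainSoup, it can help to confirm that the Ollama server is reachable and to see which models it exposes. The sketch below is one way to do that in Python using Ollama's REST API. It assumes the server listens on Ollama's default port 11434 and that its /api/tags endpoint returns a JSON object with a "models" list. The function names are illustrative, and the example address is a placeholder for your own server.

```python
import json
from urllib.parse import urlparse
from urllib.request import urlopen

# Ollama's default address when running locally (assumption: default port unchanged).
DEFAULT_OLLAMA_URL = "http://localhost:11434"


def tags_endpoint(base_url: str = DEFAULT_OLLAMA_URL) -> str:
    """Return the URL of the model-listing endpoint for an Ollama server."""
    parsed = urlparse(base_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(
            f"expected a URL such as {DEFAULT_OLLAMA_URL}, got {base_url!r}"
        )
    return base_url.rstrip("/") + "/api/tags"


def model_names(tags_response: dict) -> list[str]:
    """Extract model names from a decoded /api/tags response."""
    return [entry["name"] for entry in tags_response.get("models", [])]


def list_remote_models(
    base_url: str = DEFAULT_OLLAMA_URL, timeout: float = 5.0
) -> list[str]:
    """Query the server and return the names of the models it has installed."""
    with urlopen(tags_endpoint(base_url), timeout=timeout) as response:
        return model_names(json.load(response))
```

For example, list_remote_models("http://192.168.1.20:11434") returns the installed model names if the server is reachable, and raises an error (such as urllib.error.URLError) if it is not. Note that the URL must include the http:// or https:// scheme.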

Step 4: Getting Started with Local LLMs in BrainSoup

All the models managed by Ollama are now accessible within BrainSoup, and each agent can be assigned its own model. To select a model for an agent:

Open the agent settings by double-clicking on the agent's name in the left pane.
In the AI settings section, select the desired model from the dropdown list.

Conclusion

Integrating local LLMs via Ollama gives you direct control over your data privacy and computational resources. With this setup, you can use advanced language models while keeping full ownership of your data and infrastructure.

Note: Most Ollama LLMs don't support function calls and are not multimodal, but your agent can still use tools, see images and listen to audio thanks to BrainSoup's ability to delegate these abilities to a more powerful LLM when needed. This multi-LLM cooperation is the cornerstone of BrainSoup, allowing you to leverage the strengths of different models without being limited by their individual capabilities.

Warning: By default, most Ollama LLMs will run with a very small context window of 2048 tokens, which is just enough to manage simple conversations. For more advanced scenarios, where the agent needs to access documents and use tools, a context window of at least 8192 tokens is recommended. To increase the context window of an Ollama model, please follow our dedicated tutorial: Optimizing Ollama models for BrainSoup.
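As a rough illustration of what raising the context window involves, Ollama lets you derive a new model from an existing one with a Modelfile that sets the num_ctx parameter. The Python sketch below only generates such a Modelfile. The function name and the model names are illustrative, and the dedicated tutorial may describe a different procedure.

```python
def build_modelfile(base_model: str, num_ctx: int = 8192) -> str:
    """Return Modelfile text that derives a model with a larger context window.

    FROM selects the existing model to build on; PARAMETER num_ctx sets the
    context window size in tokens.
    """
    if num_ctx <= 0:
        raise ValueError("num_ctx must be a positive number of tokens")
    return f"FROM {base_model}\nPARAMETER num_ctx {num_ctx}\n"
```

After saving the result of build_modelfile("mistral") to a file named Modelfile, running ollama create mistral-8k -f Modelfile creates a model called mistral-8k (a name chosen here for illustration) that can then be selected in BrainSoup like any other Ollama model. Keep in mind that a larger context window increases memory usage.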