Running an LLM Locally Using Ollama
In the evolving world of AI, Large Language Models (LLMs) like GPT and BERT have become pivotal. Accessing these models typically requires cloud services, but you can run some of them locally, right from your own hardware, provided you have adequate system resources.
Running Ollama using Docker
Ollama is a command-line tool that makes it easy to run large language models locally and chat with them. It offers a range of open-source models such as Mistral, Llama 2, Code Llama, and more, catering to different requirements (see Ollama on GitHub for minimum system requirements).
If, like me, you already have Docker installed and want to avoid installing anything else, you can run Ollama as a Docker container (see Ollama on Docker Hub):
# start the Ollama server container, persisting downloaded models in the "ollama" volume
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
# then run the LLM
docker exec -it ollama ollama run llama2
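The first command starts the Ollama server in the background; the second drops you into an interactive chat with Llama 2. If you want to confirm the server is up before chatting, you can hit its root endpoint, which should reply with a short status message (something like "Ollama is running"):
# quick check that the server is listening on port 11434
curl http://localhost:11434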
Here is that chatbot telling me a joke and helping me solve an equation:
There are a number of other LLMs you can run, as listed in the Ollama Library, e.g.:
docker exec -it ollama ollama run mistral
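If you want to download a model ahead of time without starting a chat, or see which models you already have locally, the same ollama CLI inside the container provides pull and list subcommands:
# download a model without starting an interactive session
docker exec -it ollama ollama pull codellama
# list the models stored locally
docker exec -it ollama ollama list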
You can also invoke the model via Ollama's REST API:
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "what is python?"
}'
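By default (at the time of writing), /api/generate streams the answer back as a sequence of JSON objects, one chunk at a time. If you prefer a single JSON response containing the full answer, set stream to false:
# same request, but with streaming disabled so the whole answer comes back at once
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "what is python?",
  "stream": false
}'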
This approach with Ollama and Docker makes it straightforward to use LLMs directly on personal devices, and because everything runs locally, your prompts and data never leave your machine, giving you both accessibility and privacy.