Ollama (Download and run large language models locally)

Ollama is an application that lets you run large language models
offline.

Optional dependencies such as CUDA or ROCm are detected automatically,
if present, when the ollama libraries are compiled.

CUDA=ON: build with CUDA; the default is CUDA=OFF.
Build and runtime dependencies:
  development/cudatoolkit_13
  system/nvidia-driver or system/nvidia-legacy580-driver

ROCM=ON: build with ROCm; the default is ROCM=OFF.
Build and runtime dependencies:
  development/rocmtoolkit_7

Building ollama requires network access and google-go-lang.

To verify the installation:
  $ nohup ollama serve &
  $ ollama --version
  $ ollama run gemma3:270m

See also:
  https://docs.ollama.com/faq
  https://docs.ollama.com/quickstart
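
In the verification steps, the server is started in the background, so the
next command can run before the server is ready to accept connections. The
sketch below is one possible way to wait for it, not part of this package.
It assumes curl is installed and that the server listens on Ollama's default
address, 127.0.0.1:11434. The function name wait_for_ollama, the OLLAMA_URL
variable, and the default of 10 retries are all hypothetical and chosen for
illustration only.

```shell
# Hypothetical helper: poll the local Ollama server until it responds.
# Usage: wait_for_ollama [max_tries]
# OLLAMA_URL is an illustrative override, not a variable read by ollama itself.
wait_for_ollama() {
    url="${OLLAMA_URL:-http://127.0.0.1:11434}"
    tries="${1:-10}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        # /api/version answers as soon as the HTTP server is up.
        if curl -fsS "$url/api/version" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    # The server never answered within the allowed number of tries.
    return 1
}
```

After "nohup ollama serve &", running "wait_for_ollama 30 && ollama run
gemma3:270m" would start the model only once the server answers. The helper
returns non-zero if the server does not respond in time.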