
How to run LLaMA (and other LLMs) on Android.

Hello, everyone! I wanted to share my experience of successfully running LLaMA on an Android device. The model that performed best for me was llama3.2:1b on a mid-range phone with around 8 GB of RAM, and I also got it up and running on a lower-end phone with 4 GB. Several other models worked quite well too, including qwen2.5:0.5b, qwen2.5:1.5b, qwen2.5:3b, smallthinker, tinyllama, deepseek-r1:1.5b, and gemma2:2b. I hope this helps anyone looking to experiment with these models on mobile devices!


Step 1: Install Termux

  1. Download and install Termux from the Google Play Store or F-Droid.
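
If you're not sure how much RAM your phone has, you can check from the Termux shell once it's installed. This assumes /proc/meminfo is readable, which is the case on most (but not all) Android builds:

```bash
# Total and currently available memory, straight from the kernel
head -n 3 /proc/meminfo
```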

Step 2: Set Up proot-distro and Install Debian

  1. Open Termux and update the package list:

     ```bash
     pkg update && pkg upgrade
     ```

  2. Install proot-distro:

     ```bash
     pkg install proot-distro
     ```

  3. Install Debian using proot-distro:

     ```bash
     proot-distro install debian
     ```

  4. Log in to the Debian environment:

     ```bash
     proot-distro login debian
     ```

     You will need to log in like this every time you want to run Ollama. Everything from here on has to be repeated each session, except the one-time installation steps (Step 3 and the first half of Step 4).
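
A handy trick once everything below is installed: proot-distro can also run a single command inside Debian without an interactive login, using `--` to separate the command. A minimal sketch:

```bash
# Runs one command inside Debian, then drops you back into Termux
proot-distro login debian -- uname -a
```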

Step 3: Install Dependencies

  1. Update the package list in Debian:

     ```bash
     apt update && apt upgrade
     ```

  2. Install curl:

     ```bash
     apt install curl
     ```
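
Optionally, confirm curl installed correctly before moving on:

```bash
# Prints version info if curl is installed and on PATH
curl --version
```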

Step 4: Install Ollama

  1. Run the following command to download and install Ollama:

     ```bash
     curl -fsSL https://ollama.com/install.sh | sh
     ```

  2. Start the Ollama server:

     ```bash
     ollama serve &
     ```

     The trailing & puts the server in the background. After it prints its startup logs, press Ctrl+C (or just Enter) to get your prompt back; the server keeps running.
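
To verify the server is actually up, you can poke it with curl; Ollama listens on port 11434 by default:

```bash
# Should answer with "Ollama is running"
curl http://127.0.0.1:11434
```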

Step 5: Download and run the Llama3.2:1B Model

  1. Use the following command to download the Llama3.2:1B model:

     ```bash
     ollama run llama3.2:1b
     ```

     This step fetches and runs the lightweight 1-billion-parameter version of the Llama 3.2 model.
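
Besides the interactive chat prompt, you can also script against Ollama's local HTTP API from inside Debian. A quick example using the /api/generate endpoint (same default port as above):

```bash
# One-shot, non-streaming generation request to the local server
curl http://127.0.0.1:11434/api/generate -d '{
  "model": "llama3.2:1b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```

The other models from the intro work the same way, e.g. `ollama run qwen2.5:0.5b` should be an even lighter option for 4 GB devices.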

Running LLaMA and other similar models on Android devices is definitely achievable, even with mid-range hardware. The performance varies depending on the model size and your device's specifications, but with some experimentation, you can find a setup that works well for your needs. I’ll make sure to keep this post updated if there are any new developments or additional tips that could help improve the experience. If you have any questions or suggestions, feel free to share them below!
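
If you end up repeating the per-session steps a lot, a small wrapper script saves some typing. This is only a sketch, run from Termux; the ~/llm.sh name and the 3-second startup wait are arbitrary choices of mine:

```bash
#!/data/data/com.termux/files/usr/bin/bash
# Save as ~/llm.sh in Termux and start a session with: bash ~/llm.sh
proot-distro login debian -- bash -c '
  # start the Ollama server only if it is not already answering
  curl -s http://127.0.0.1:11434 >/dev/null || (ollama serve >/dev/null 2>&1 &)
  sleep 3   # give the server a moment to come up
  ollama run llama3.2:1b
'
```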

– llama


17 comments
  • Is there an alternative Android app that enables downloading LLaMA locally (without using a terminal)?

    • There are a few. There's Private AI: it's free (as in beer), but it's not libre (open source). The app is a bit sketchy too, so I'd still recommend following the tutorial instead.

      Out of curiosity, why do you not want to use a terminal for that?

      • Thanks for the suggestion.

        I like GUI AI apps such as ChatGPT. I'm currently in the process of setting up a local model that can also connect to the internet and work cross-platform.
