On-Device AI Models
Every model runs entirely on your iPhone. Choose what you need, download only what you want, and use it offline forever.
LiberaGPT ships compact, open-source models, each optimized for the Neural Engine with quantization that balances quality and speed.
Language Models (GGUF)
Seven quantized models (Q4_K) optimized for on-device inference via llama.cpp. Download the ones you need in Settings. All models run with GPU acceleration on iPhone 17 Pro-tier devices.
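The download sizes listed below follow roughly from parameter count times effective bits per weight. A back-of-the-envelope sketch (the 6.5 bits/weight figure is an illustrative assumption that folds in embedding tables and metadata, not a llama.cpp spec):

```python
def estimate_gguf_size_mb(params_billions: float, bits_per_weight: float = 6.5) -> float:
    """Rough file-size estimate for a quantized GGUF model.

    Q4_K stores most weights near 4.5 bits, but small models carry
    proportionally large embedding tables, so the effective average
    can run well above that (6.5 here is an assumption, not a spec).
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e6

# ~1B parameters at ~6.5 bits/weight lands near the 806 MB listed for Gemma 3 1B
print(round(estimate_gguf_size_mb(1.0)))
```

The same arithmetic explains why the 3.8B Phi-4 Mini download is roughly three times the 1B Gemma download at the same quantization level.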
Gemma 3 1B Instruct
1 billion parameters · 32K context · Q4_K_M quantization · 806 MB download · ~50 tok/sec
Google's latest compact instruction model. Fast inference with minimal battery impact. Ideal for quick answers, summarization, and everyday tasks. GGUF format with GPU acceleration support.
SmolLM3 3B Instruct
3 billion parameters · 128K context · Q4_K_M quantization · 1.9 GB download · ~25 tok/sec
Hugging Face's compact yet capable model with a massive context window. Extended conversations, long document analysis, and detailed reasoning. Balanced performance for most tasks.
Phi-4 Mini 3.8B Instruct
3.8 billion parameters · 128K context · Q4_K_M quantization · 2.5 GB download · ~18 tok/sec
Microsoft's latest Phi generation. Strong reasoning capabilities, instruction following, and language understanding. Best-in-class quality for its size. Deep Mode recommended.
StableLM Zephyr 1.6B
1.6 billion parameters · 4K context · Q4_K_S quantization · 989 MB download · ~28 tok/sec
Stability AI's lightweight instruction model. Excellent speed-to-quality ratio for everyday tasks. Minimal memory footprint, ideal for older devices or battery-sensitive workflows.
EXAONE Deep 2.4B
2.4 billion parameters · 32K context · Q4_K_M quantization · 1.6 GB download · ~22 tok/sec
LG AI Research's bilingual model (English/Korean). Strong multilingual capabilities and reasoning. Optimized for technical and analytical tasks with extended context support.
EXAONE 4.0 1.2B Instruct
1.2 billion parameters · 65K context · Q4_K_M quantization · 851 MB download · ~30 tok/sec
LG AI's newest lightweight model with massive 65K context window. Excellent for document-heavy workflows and extended conversations. Fast inference with minimal storage footprint.
AceInstruct 1.5B
1.5 billion parameters · 128K context · Q4_K_M quantization · 1.2 GB download · ~26 tok/sec
Instruction-tuned variant with strong instruction-following capabilities. Reliable for structured outputs, task execution, and complex multi-step instructions. A good balance of size and capability.
Semantic Search (Planned)
Document retrieval system for RAG. Currently uses a hash-based fallback; semantic-embedding integration is planned for a future release.
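The shipped fallback's internals aren't specified here, but a hash-bucket retriever of the general kind described could look like this sketch (all function names are hypothetical, not LiberaGPT's API):

```python
import hashlib

def hash_vector(text: str, dims: int = 256) -> list[int]:
    """Map each token to a bucket via a stable hash; count hits per bucket."""
    vec = [0] * dims
    for token in text.lower().split():
        h = int(hashlib.md5(token.encode()).hexdigest(), 16)
        vec[h % dims] += 1
    return vec

def overlap(a: list[int], b: list[int]) -> int:
    """Shared token mass between two bucket-count vectors."""
    return sum(min(x, y) for x, y in zip(a, b))

def search(query: str, docs: list[str]) -> str:
    """Return the document whose buckets best overlap the query's."""
    q = hash_vector(query)
    return max(docs, key=lambda d: overlap(q, hash_vector(d)))

docs = ["battery life tips", "model download sizes", "offline privacy"]
print(search("which model download is smallest", docs))
```

Unlike semantic embeddings, this only matches exact token spellings, which is why it is a fallback rather than the end goal.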
all-MiniLM-L6-v2 (Planned)
Sentence embeddings · 384 dimensions · 22 million parameters · 90 MB
Compact sentence transformer for generating dense vector embeddings. Will convert imported documents into semantically searchable vectors. Planned integration with Core ML and local sqlite-vec storage.
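Once documents are embedded, semantic search reduces to nearest-neighbour ranking by cosine similarity. A minimal pure-Python sketch (the 3-dimensional toy vectors stand in for all-MiniLM-L6-v2's 384 dimensions):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product over the product of vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec: list[float], doc_vecs: list[list[float]], k: int = 2) -> list[int]:
    """Return indices of the k document vectors most similar to the query."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]

docs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
print(top_k([1.0, 0.05, 0.0], docs, k=2))  # → [0, 1]
```

In production this ranking would run over sqlite-vec storage rather than in-memory lists, but the similarity metric is the same.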
Model Selection Philosophy
Every model in LiberaGPT is chosen for a specific purpose: speed, accuracy, context length, or specialization. You control which models to download and use. Lightweight models provide fast, efficient responses for everyday tasks. Larger models offer more power when you need it. Voice models enable natural interaction. Embedding models make your documents searchable.
All models run entirely offline, all data stays local, and you're never locked into a single provider. This is honest AI: you see what's running, you choose what to install, and you understand the trade-offs.