This is a submission for the Gemma 4 Challenge: Build with Gemma 4
What I Built
I built Aether. It is a local Android assistant. It replaces your default cloud assistant with a completely offline alternative. You trigger it just like normal. You hold the power button and Aether pops up instantly. Everything happens directly on your device.
The app registers natively as your default Android assistant using the ACTION_ASSIST intent. It uses the built-in Android speech recognition for voice input.
For the foundation, I used the open source FriedGPT codebase. I stripped out all the old cloud API routes to create a clean boundary for local inference. Your chat history and text data stay entirely on your phone.
Demo
Here is a quick video showing Aether running on a real phone.
Note: this phone's modest specifications (2.0GHz CPU and 4.0GB RAM) make text generation slow. It should run much faster on a more capable phone.
Code
Aether is open source under the Apache 2.0 license. You can check out the architecture and the migration path on GitHub here:
https://github.com/arjuncodess/aether
How I Used Gemma 4
I used the Gemma 4 E2B model for this project. E2B stands for Effective 2 Billion parameters. I downloaded the LiteRT-LM version from Hugging Face. The app loads this model directly from your local storage.
Gemma 4 handles all the text generation right on the phone. It acts as the core brain for the entire chat experience. The new architecture features a huge 128K context window. This means the model easily remembers past messages in your device-local chat history without losing the thread.
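Even with a large context window, the stored chat history still has to fit a token budget before it goes to the model. Here is a minimal sketch of that trimming step in Python. It is not Aether's actual code: `estimate_tokens` and `fit_history` are hypothetical names, and the "about 4 characters per token" estimate is an assumption standing in for the model's real tokenizer.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (role, text)

def estimate_tokens(text: str) -> int:
    """Rough token estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)

def fit_history(messages: List[Message], budget: int) -> List[Message]:
    """Keep the most recent messages whose estimated tokens fit in `budget`."""
    kept: List[Message] = []
    used = 0
    # Walk from newest to oldest so recent turns are kept first.
    for role, text in reversed(messages):
        cost = estimate_tokens(text)
        if used + cost > budget:
            break
        kept.append((role, text))
        used += cost
    kept.reverse()  # restore chronological order
    return kept

history = [("user", "a" * 40), ("assistant", "b" * 40), ("user", "c" * 40)]
recent = fit_history(history, budget=25)        # drops the oldest message
everything = fit_history(history, budget=128_000)  # keeps all three
```

With a 128K budget, trimming rarely kicks in for ordinary chats, which is why the model can keep the whole thread in view.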
Why This Model Is a Perfect Fit for Mobile
Running models on a phone is tough. You have strict memory limits. If an app uses too much RAM, the Android system kills it immediately.
Gemma 4 E2B solves this perfectly. It uses Per-Layer Embeddings and 4-bit weights through LiteRT. This keeps the memory footprint under 1.5GB. The model runs smoothly in the background while you have other apps open.
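A back-of-envelope calculation shows why 4-bit weights matter on a phone. This sketch takes "Effective 2 Billion" at face value as the number of weights held in memory, which is an assumption. It also ignores the KV cache, activations, and runtime overhead, so it is a lower bound and not a measurement. The helper names are hypothetical.

```python
def weight_bytes(num_params: int, bits_per_weight: int) -> int:
    """Bytes needed to store `num_params` weights at `bits_per_weight` bits each."""
    return num_params * bits_per_weight // 8

def to_gib(num_bytes: int) -> float:
    """Convert bytes to GiB (1 GiB = 1024**3 bytes)."""
    return num_bytes / (1024 ** 3)

effective_params = 2_000_000_000  # assumption: 2 billion weights resident in RAM

four_bit = weight_bytes(effective_params, 4)      # 1,000,000,000 bytes
sixteen_bit = weight_bytes(effective_params, 16)  # 4,000,000,000 bytes

print(f"4-bit:  {to_gib(four_bit):.2f} GiB")     # 4-bit:  0.93 GiB
print(f"16-bit: {to_gib(sixteen_bit):.2f} GiB")  # 16-bit: 3.73 GiB
```

Under these assumptions, 16-bit weights alone would nearly fill a 4.0GB phone, while the 4-bit weights leave room for the cache and the rest of the system.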
It also uses a hybrid attention mechanism. It mixes local sliding windows with global attention. You get fast processing speeds without sacrificing context. You get a smart assistant that works instantly even in airplane mode.
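To make the idea concrete, here is a small sketch of the two attention masks being mixed. It only illustrates the general technique. The sequence length and window size are made-up values, not Gemma 4's real configuration, and the function names are hypothetical.

```python
from typing import List

def causal_mask(seq_len: int) -> List[List[bool]]:
    """Global causal attention: token i may attend to every token j <= i."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]

def sliding_window_mask(seq_len: int, window: int) -> List[List[bool]]:
    """Local attention: token i attends to the last `window` tokens, itself included."""
    return [[i - window < j <= i for j in range(seq_len)] for i in range(seq_len)]

def attended_pairs(mask: List[List[bool]]) -> int:
    """Count allowed (query, key) pairs, a rough proxy for attention cost."""
    return sum(sum(row) for row in mask)

global_cost = attended_pairs(causal_mask(8))            # 36 pairs
local_cost = attended_pairs(sliding_window_mask(8, 3))  # 21 pairs
```

The local count grows linearly with sequence length while the global count grows quadratically. Using local windows in most layers and global attention in a few keeps long-range context available at a lower overall cost.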
Future Roadmap
This current version is just the first step. I have plenty of bugs to track down and fix. I want to improve the UI design and polish this into a daily-driver product.
The next major feature is adding a tool layer. Gemma 4 supports native function calling out of the box. I plan to map those capabilities to offline Android actions. Soon, Aether will set alarms, launch apps, manage your calendar, and control your clipboard.
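Here is a rough sketch of what that tool layer could look like. It assumes the model emits a tool call as JSON with `name` and `arguments` fields, which is an illustrative format and not Gemma 4's documented one. The handlers are hypothetical stubs that return strings where a real app would call Android APIs.

```python
import json
from typing import Callable, Dict

# Hypothetical handlers; real ones would call Android APIs.
def set_alarm(hour: int, minute: int) -> str:
    return f"alarm set for {hour:02d}:{minute:02d}"

def launch_app(package: str) -> str:
    return f"launching {package}"

# Allowlist: the model can only trigger tools registered here.
TOOLS: Dict[str, Callable[..., str]] = {
    "set_alarm": set_alarm,
    "launch_app": launch_app,
}

def dispatch(model_output: str) -> str:
    """Parse an assumed JSON tool call and run the matching handler."""
    try:
        call = json.loads(model_output)
        name = call["name"]
        args = call.get("arguments", {})
    except (json.JSONDecodeError, KeyError, TypeError, AttributeError):
        return "error: not a valid tool call"
    handler = TOOLS.get(name) if isinstance(name, str) else None
    if handler is None:
        return f"error: unknown tool {name!r}"
    try:
        return handler(**args)
    except TypeError:
        # Wrong or missing argument names, or arguments that are not an object.
        return f"error: bad arguments for {name}"

result = dispatch('{"name": "set_alarm", "arguments": {"hour": 7, "minute": 5}}')
```

Routing every call through an explicit allowlist means a malformed or unexpected model output ends in an error string, not an unintended system action.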
Giving system access to a local model is far safer for your privacy than handing your personal data over to a massive AI company.
A local model keeps your life private.