
Using mimik ai for Android AI applications

Objective

The objective of this article is to demonstrate how to integrate AI components, including language models, with Android applications using the mimik Client Library.

Intended Readers

The intended readers of this document are Android software developers who want to familiarize themselves with how the mimik Client Library interfaces with mimik ai.

What You'll Be Doing

Learning about topics relevant to working with the mimik Client Library AI interfaces, such as:

  • Integrating AI components
  • Configuring AI language model sources
  • Downloading AI language models
  • Referencing downloaded AI language models
  • Chatting with downloaded AI language models
  • Processing AI language model chat stream responses

Prerequisites

  • Understanding the mimik Client Library components integration and initialization process as laid out in this article.

  • Understanding how to work with the mim OE Runtime in an Android application.

  • A real Android 64-bit device. mim OE does not run on emulated devices, and the AI runtime requires a 64-bit device.

Overview

Our example AI use case brings together the mimik Client Library (AI variant), the mILM Client Library, the mILM microservice running on mim OE, and a downloadable AI language model.

As a package deployed via the mimik Client Library to your Android application, the mimik ai components add a simple yet powerful interface to the world of AI language models. To get your application ready to communicate with AI language models, we need to deploy the mILM microservice.

In this tutorial, we won't cover how to integrate and initialize the mimik Client Library or how to start the mim OE Runtime; see the prerequisites above for links to those tutorials.

So, let's begin!

Application Setup

Before using the mILM Client Library, we need to configure some dependencies and resources in our Android application. The following assumes a working application built by following the tutorials listed in the prerequisites section.

The first thing we need to do is declare the dependencies for the mimik AI libraries. In our module-level Gradle build script, we change the mimik Client Library declaration from the regular variant to the AI variant, then add a dependency on the mILM Client Library. This looks as follows:

// Regular mimik Client Library declaration
//implementation("com.mimik.mim-oe-sdk-android:mim-oe-client-developer:3.14.0.1")
// AI mimik Client Library declaration
implementation("com.mimik.mim-oe-sdk-android:mim-oe-ai-client-developer:3.14.0.1")

// mILM Client Library declaration. We exclude the mim-oe-client module to avoid duplicate classes
implementation("com.mimik.mim-oe-sdk-android:mim-oe-milm-client:0.1.1") {
    exclude(group = "com.mimik.mim-oe-sdk-android", module = "mim-oe-client")
}

Next, we need to add the mILM microservice package to the application's resources. To do this, copy the .tar file, found in the release .zip file on the GitHub page, into the raw resources folder of the application module, {module}/src/main/res/raw/.

Note that Android does not allow -, ., or uppercase characters in resource filenames. The easiest way to satisfy this is to rename the mILM-v1.x.x.tar file to milm_v1.tar.
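After the rename, the package resolves to the R.raw.milm_v1 identifier and can be opened as a raw resource stream, which is how we will hand it to the deployment call later on:

// Open the mILM microservice package from raw resources
val microservicePackageStream = resources.openRawResource(R.raw.milm_v1)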

Using the mILM Client Library

Now that our application has all the necessary declarations and resources, we can begin using the mILM Client Library. We will start by using the library to deploy the mILM microservice to the mim OE instance.

The following code deploys the default mILM microservice and checks the status of the deployment. The default configuration automatically generates an API key for the microservice and configures the client library to use it; we can also specify our own API key instead, as shown in the commented-out first line.

//MimOEClientMilm.setMilmApiKey(context, "someApiKey") // Optional: specify an API key instead of generating one automatically
val status = MimOEClientMilm.deployDefaultMilmMicroservice(
    context, // Application context
    mimOEClient, // Our mim OE client instance, which has already been started
    resources.openRawResource(R.raw.milm_v1) // Our mILM microservice package resource
)
if (status.error != null) {
    Log.e("Error!", "Failed to deploy microservice! ${status.error.message}")
} else {
    Log.i("Success!", "Successfully deployed microservice!")
}

Once the microservice has been successfully deployed, we can get a Retrofit2 wrapper class to interact with it, which we call the mILM Provider. This wrapper class sets up an HTTP client for making API calls to the mILM microservice and provides functions that streamline those calls.

val mILMProvider = MimOEClientMilm.getMilmProvider(
    context,
    mimOEClient
)

Configuring the AI language model source

Before we can begin downloading AI language models to the user's device, there are a few preparatory steps to take, as shown in the code example below.

The first task is to decide which AI language model to download. To simplify this process, we've provided an example definition of a third-party model. However, you can work with any AI language model that fits within the hardware and software capabilities of your Android device.

Once we have chosen a model, we create a ModelDownload object to represent it, then use the mILM Provider to have the microservice start the model download. Since the API responds as a stream, we need to handle the streaming response; in this example we use Kotlin Coroutines and simply log the download progress to Logcat.

// Function to queue the model download
fun queueModel() {
    // Example ModelDownload object
    val model = ModelDownload(
        id = "hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF",
        obj = "model",
        url = "https://huggingface.co/hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF/resolve/main/llama-3.2-1b-instruct-q8_0.gguf?download=true",
        ownedBy = "hugging-quants"
    )

    val call = mILMProvider.queueModel(model)
    // Use Kotlin Coroutines to handle the HTTP streaming response
    CoroutineScope(Dispatchers.IO).launch {
        modelDownloadStreamingCoroutine(call).collectLatest {
            withContext(Dispatchers.Main) {
                // Handle parsed streaming response
                Log.d("queueModel", "${it.size} / ${it.totalSize} downloaded")
            }
        }
    }
}

// Coroutine function to parse the streaming response
private fun modelDownloadStreamingCoroutine(call: Call<ResponseBody>) = flow {
    try {
        val response = call.awaitResponse() // Make the API call
        val gson = Gson() // GSON parser
        if (response.isSuccessful) {
            response.body()?.byteStream()?.bufferedReader().use { input -> // Create a reader from the response stream
                while (currentCoroutineContext().isActive) {
                    val line = input?.readLine() // Read a line from the streaming response
                    if (line != null && line.startsWith("data:")) { // Streaming response data lines always start with 'data:'
                        try {
                            val data = gson.fromJson(
                                line.substring(5).trim(), // Parse the JSON after the 'data:' prefix
                                ModelStatus::class.java // Expected format of the response data
                            )

                            if (data.size == data.totalSize) {
                                // Model queuing is finished, end streaming
                                Log.d("queueModel", "Download finished")
                                break
                            }
                            emit(data) // Emit the data back to the handler
                        } catch (e: Exception) {
                            Log.e("Error!", "Exception while parsing streaming response data!")
                        }
                    } else {
                        delay(100) // Wait briefly if no data is available, to avoid busy-waiting
                    }
                }
            }
        } else {
            Log.e("Error!", "API call failed!")
        }
    } catch (e: IOException) {
        Log.e("Error!", "IOException while making API call!")
    }
}
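As a usage sketch, the size fields from each ModelStatus update can drive a download progress indicator. A minimal illustration, assuming the size fields are numeric byte counts; the updateProgressBar() callback is a hypothetical stand-in for your own UI code:

// Convert a ModelStatus update into a progress percentage for display
// updateProgressBar() is a placeholder for your own UI code
fun onModelStatus(status: ModelStatus) {
    if (status.totalSize > 0) {
        val percent = ((status.size.toDouble() / status.totalSize) * 100).toInt()
        updateProgressBar(percent)
    }
}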

Using the MilmProvider wrapper class, we can get a list of queued and downloaded models with the getModels() function. This is a network call, so it cannot be run on the main thread.

Executors.newSingleThreadExecutor().execute {
    val getModelListResponse = MimOEClientMilm.getMilmProvider(this@MainActivity, mimOEClient).getModels().execute()
    if (getModelListResponse.isSuccessful) {
        val list: List<MilmModel>? = getModelListResponse.body()?.data
        list?.forEach { model ->
            Log.d("getModels", "Found model ${Gson().toJson(model)}")
        }
    }
}
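If we need a specific model from this list, for example to confirm it has been downloaded before chatting with it, we can search by the id chosen at download time. A minimal sketch, assuming MilmModel exposes the same id field as ModelDownload:

// Find a previously queued model by the id used at download time
// (assumes MilmModel exposes an id field matching ModelDownload)
val targetId = "hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF"
val targetModel = list?.firstOrNull { it.id == targetId }
if (targetModel == null) {
    Log.d("getModels", "Model $targetId has not been downloaded yet")
}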

To start chatting with a downloaded AI language model, we use the sendCompletion() function of MilmProvider. To use this API, we need the id of the model we want to query, which we chose when we initially started the model download, as well as the prompt we want to send. The API can be configured to respond in a streaming or non-streaming manner; in the following example we configure it for streaming and handle it with Kotlin Coroutines again, much like the model download.

During the first prompt response from a model, mILM inserts messages indicating the initialization of the AI language model. We filter these messages out while handling the streaming response.

// Function to send a prompt to the AI language model and log the response
fun sendPrompt(prompt: String) {
    val query = MilmQuery(
        model = model.id, // id of the model we want to query (the ModelDownload created earlier)
        messages = listOf(Message("user", prompt)), // List of Message objects containing the prompt we want to send
        temperature = null, // Parameter controlling the randomness of the AI language model output
        maxTokens = null, // Maximum number of tokens the AI language model will use to generate the output
        stream = true // Whether the response is streaming or not
    )

    val completionCall = MimOEClientMilm.getMilmProvider(this@MainActivity, mimOEClient).sendCompletion(query)
    // Use Coroutines to make the API call asynchronously
    CoroutineScope(Dispatchers.IO).launch {
        completionStreamingCoroutine(
            completionCall
        ).collectLatest {
            withContext(Dispatchers.Main) {
                val content = it.choices[0].delta?.content
                // Check whether the stream contains specific keywords indicating AI model states
                if (content != null) {
                    if (content.contains("Model Ready") || content.contains("Model Loading")) {
                        // Ignore these messages, since they are not part of the AI language model's response to the prompt
                    } else {
                        Log.d("completionResponse", "Received partial response: $content")
                    }
                }
            }
        }
    }
}

// Coroutine function to parse the streaming response
private fun completionStreamingCoroutine(call: Call<ResponseBody>) = flow {
    try {
        val response = call.awaitResponse() // Make the API call
        val gson = Gson() // GSON parser
        if (response.isSuccessful) {
            response.body()?.byteStream()?.bufferedReader().use { input -> // Create a reader from the response stream
                while (currentCoroutineContext().isActive) {
                    val line = input?.readLine() // Read a line from the streaming response
                    if (line != null && line.startsWith("data:")) { // Streaming response data lines always start with 'data:'
                        try {
                            val data = gson.fromJson(
                                line.substring(5).trim(), // Parse the JSON after the 'data:' prefix
                                MilmResponse::class.java // Expected format of the response data
                            )

                            if (data.choices[0].finishReason == "stop") {
                                // The completion response is complete, end streaming
                                Log.d("completionResponse", "Response finished")
                                break
                            }
                            emit(data) // Emit the data back to the handler
                        } catch (e: Exception) {
                            Log.e("Error!", "Exception while parsing streaming response data!")
                        }
                    } else {
                        delay(100) // Wait briefly if no data is available, to avoid busy-waiting
                    }
                }
            }
        } else {
            Log.e("Error!", "API call failed!")
        }
    } catch (e: IOException) {
        Log.e("Error!", "IOException while making API call!")
    }
}
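Since sendCompletion() also supports non-streaming responses, here is a minimal sketch of that variant. We set stream = false, run the call off the main thread like any other network call, and simply log the raw response body; this is an illustration under the assumption that the full completion arrives as a single JSON object:

// Non-streaming variant: with stream = false, the full completion arrives in one response
Executors.newSingleThreadExecutor().execute {
    val query = MilmQuery(
        model = model.id,
        messages = listOf(Message("user", "Hello!")),
        temperature = null,
        maxTokens = null,
        stream = false // Request a single, complete response
    )
    val response = MimOEClientMilm.getMilmProvider(this@MainActivity, mimOEClient)
        .sendCompletion(query)
        .execute()
    if (response.isSuccessful) {
        Log.d("completionResponse", "Full response: ${response.body()?.string()}")
    }
}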

Example project also works offline

Because mILM downloads the AI language model fully onto the device, the example application can chat with the model even when the device's internet connection is disabled, for example in airplane mode.
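Only the initial model download needs connectivity, so one practical pattern is to gate queueModel() on a network check. A minimal sketch using Android's ConnectivityManager; the hasInternet() helper name is our own:

// Returns true if the device currently has a validated internet connection.
// Useful for gating the model download; chatting works offline once the model is on the device.
fun hasInternet(context: Context): Boolean {
    val cm = context.getSystemService(Context.CONNECTIVITY_SERVICE) as ConnectivityManager
    val capabilities = cm.getNetworkCapabilities(cm.activeNetwork) ?: return false
    return capabilities.hasCapability(NetworkCapabilities.NET_CAPABILITY_VALIDATED)
}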

See the full Android application project example on GitHub.

Additional reading

To get more out of this article, you can further familiarize yourself with the concepts and techniques used throughout, such as the mim OE Runtime, Retrofit2, Kotlin Coroutines, and streaming HTTP responses.
