You’re probably thinking of building an app that has an AI/ML component, or you might want to add an intelligent component to an existing one. But what exactly is possible, and how can you do it?
Artificial Intelligence and Machine Learning are popular terms that have spread all over the IT scene, and there is a trend of applying concepts from those domains to everything related to IT (building apps, recommending movies, showing ads, finding the best time to travel from one place to another, etc.).
Mobile phones are among the devices that use AI and ML the most, and I’m not talking about a single-scope approach where AI techniques are used to solve one particular problem. In a mobile phone, AI is used from the hardware (custom AI chips) to the software (Siri, picture taking, recommendations, natural language processing, and many more).
Now, what’s the difference between AI and ML?
AI is the science, and all the practices involved, of making a computer behave in ways that we previously thought only humans could (such as playing chess, driving cars, writing poems, etc.). ML, on the other hand, is a subset of practices in the AI domain that allow computer algorithms to improve through experience (for instance, categorizing music, recommending movies, suggesting diagnoses, and so on).
In the realm of mobile applications, ML is one of the most common branches of AI in use. That is mainly because of hardware considerations (computational power, energy efficiency, and storage), but not only: mobile phones are usually used by a single user for day-to-day tasks, and they are meant to solve a multitude of problems, not a single one. Other AI-related areas, such as computer vision, audio processing, and deep neural networks, are also used in mobile products, but they are not as widespread.
So what is currently possible from an ML perspective in the realm of mobile applications?
Well, everything is possible, but if you’re just starting out in this world and you don’t have a large team of scientists and developers, your options might be a little more limited. Luckily, there are a ton of resources and tools available that can easily be used in MVPs or existing apps. I’m going to talk about a few of them today.
First, we need to talk about how those tools can be accessed. There are two main ways:
- API-based: these are usually external services that take some data as input and give you back a response based on it. All the processing happens on a server somewhere, and an Internet connection is needed to reach it, so they don’t work for offline apps.
- Device-based: the algorithms and the AI models live on the device. They can usually be used offline, although some of them require an Internet connection as well.
There is a third model, a hybrid of the two described above, in which a model is periodically downloaded from a server and then run on the device. However, it is not as commonly used as the other two.
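To make the difference between the access models a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: `on_device_model` stands in for a model bundled with the app, and `send_request` is a hypothetical function that would perform the network call to an external service.

```python
from typing import Callable, Optional

# In this sketch a "model" is any function from input text to a label.
Model = Callable[[str], str]


def on_device_model(text: str) -> str:
    """Stand-in for a model shipped with the app; needs no network."""
    return "question" if text.strip().endswith("?") else "statement"


def make_api_model(send_request: Callable[[str], str]) -> Model:
    """Wrap a remote service. `send_request` is a hypothetical transport
    function that would do the HTTP call in a real app."""
    def predict(text: str) -> str:
        return send_request(text)
    return predict


def classify(text: str, online: bool,
             api_model: Optional[Model],
             local_model: Optional[Model]) -> str:
    """Use the API when online, otherwise fall back to the device model."""
    if online and api_model is not None:
        return api_model(text)
    if local_model is not None:
        return local_model(text)
    raise RuntimeError("no model available offline")
```

With only an API model the app has nothing to fall back on when it is offline, which is exactly the limitation of the API-based approach.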
ML techniques are heavily used for image processing, but they also apply to other domains (medicine, predictions, etc.). The product of every ML algorithm is an ML model that can be used for solving a certain problem. One can think of an ML model as a blueprint for solving that problem: once we know how to do multiplication, we can apply the same operation to any pair of numbers.
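The blueprint idea can be illustrated with a deliberately tiny example. The sketch below assumes the "problem" is multiplying by an unknown factor, and that the model is a single number learned from a few made-up examples with a least-squares fit. Real models have many more parameters, but the train-once, apply-many-times pattern is the same.

```python
def train(examples):
    """Fit y = w * x by least squares and return the weight w (the 'model')."""
    numerator = sum(x * y for x, y in examples)
    denominator = sum(x * x for x, _ in examples)
    return numerator / denominator


def predict(weight, x):
    """Apply the learned model to an input it has never seen."""
    return weight * x


# Made-up training examples that all follow y = 3 * x.
examples = [(1, 3), (2, 6), (4, 12)]
weight = train(examples)
```

Once `weight` has been learned, `predict(weight, 10)` applies the same rule to a value that was not in the training examples.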
Training a model
One way to get a model is to train it on the user’s behavior: an app can learn which action the user is most likely to take. An example would be a food ordering application that makes suggestions based on previous behavior (for instance, recommending vegetarian dishes and restaurants at lunchtime to users who have eaten vegetarian food for lunch multiple times). The model is updated every time the user makes a choice, which results in completely different app experiences for different users.
The example above, recommending food based on previous orders, can take multiple other parameters into consideration, such as the time of year, the time of day, the location of the user, what other users on the platform are trying out, and so on. A model does more than make a few simple decisions: it can take into account far more parameters than a person realistically could, and it can make really good decisions about what to recommend if it’s configured and built correctly, with high-quality data.
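As a rough illustration of a model that is updated with every choice, here is a minimal Python sketch of the food ordering example. It is an assumption-heavy toy: the class name, the context and category strings, and the `min_orders` threshold are all hypothetical, and a simple counter stands in for what would be a real learning algorithm.

```python
from collections import defaultdict


class MealRecommender:
    """Counts what a user ordered in each context and suggests the favourite."""

    def __init__(self):
        # context (e.g. "lunch") -> category (e.g. "vegetarian") -> order count
        self._counts = defaultdict(lambda: defaultdict(int))

    def record_order(self, context, category):
        """Update the model with one more choice made by the user."""
        self._counts[context][category] += 1

    def recommend(self, context, min_orders=3):
        """Return the most-ordered category for this context, or None
        until at least `min_orders` orders have been seen there."""
        seen = self._counts.get(context)
        if not seen or sum(seen.values()) < min_orders:
            return None
        return max(seen, key=seen.get)
```

The context could just as well be a tuple such as (time of day, season, city) to take more parameters into account. Returning `None` until enough orders have been recorded mirrors the fact that such a model needs some time before its suggestions are worth showing.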
For this sort of intelligent algorithm to work well, it needs the right data, and it needs some time to learn before accurate predictions can be made. In addition, it can raise privacy issues if the model, or the data used to compute it, leaves the user’s device, so think about it carefully before deciding that you want to use it.
Another approach is to distribute the same model to all the users of the app: a model trained to solve a certain, well-defined problem, for instance identifying the season from the photos taken by the user. This can be achieved by training the model either with your own data (if you have enough of it) or with one of the datasets available on the web. You can find datasets in many domains, mainly (but not only) intended for academic purposes. Some examples of the available datasets are:
- Food type datasets
- Datasets for diagnosing certain diseases
- Language datasets
- Voice datasets
Using a pre-trained model
Knowledge is power, or, more commonly nowadays, data is power: no matter how good the coders on your team are, they won’t be able to implement great ML algorithms without a large set of data that covers as many as possible of the cases of the problem you want to solve. Unfortunately, finding the right data to train your model can be hard. Not only do you need correct data, correctly labeled and written in a consistent format, you also need a ton of it to make your algorithm perform well.
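To give an idea of what "correctly labeled and written in a consistent format" means in practice, here is a small Python sketch of a sanity check you might run before training. The function name, the row format (a pair of features and label), and the file names in the usage note are assumptions made for this example.

```python
from collections import Counter


def check_dataset(rows, allowed_labels):
    """Return (clean rows, problems, examples per label).

    Each row is expected to be a (features, label) pair. Labels are
    normalized to lowercase with surrounding whitespace removed.
    """
    clean, problems = [], []
    for index, row in enumerate(rows):
        if len(row) != 2:
            problems.append((index, "expected (features, label)"))
            continue
        features, label = row
        label = str(label).strip().lower()
        if label not in allowed_labels:
            problems.append((index, f"unknown label {label!r}"))
            continue
        clean.append((features, label))
    label_counts = Counter(label for _, label in clean)
    return clean, problems, label_counts
```

For instance, calling it with rows such as `("pizza.jpg", "Pizza ")` and the allowed labels `{"pizza", "sushi"}` would normalize the label to `"pizza"`, while a row labeled `"car"` would end up in the list of problems. The per-label counts help you spot labels that have too few examples.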
Luckily, there is some good news as well: on the web you will find a lot of pre-trained models that you can use in your app, and even APIs that expose trained models for solving certain problems, such as:
- Recognizing the main object in an image
- Recognizing people, faces, body parts, and body poses in an image
- Recognizing brands, names, text, or food (including the type) in an image
- Finding the answer to a question in a text
Another class of intelligent algorithms is Natural Language Processing (NLP), which handles text and audio (containing someone speaking) in order to extract certain content from the analyzed phrases or sentences. It is a central piece of software such as Siri, Alexa, and Google Assistant.
The good news is that a lot of tools are available that can process natural language data and, with high accuracy, do things such as:
- Identifying the language of the data (English, German, French, etc.)
- Identifying parts of speech (nouns, verbs, adjectives, numbers, pronouns, and many others)
- Identifying punctuation
- Identifying business or organization names
- Finding similar words or sentences
- Training models for identifying custom items such as names of products or sports rules (e.g. offside, goal, handball, etc.)
- Analyzing the sentiment of a text (positive, negative, etc.)
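To show what the simplest possible version of one of these tasks looks like, here is a toy sentiment analyzer in Python. It is only a sketch under strong assumptions: the word lists are made up for this example, and counting words is a much cruder technique than the trained models that real NLP tools use.

```python
import re

# Tiny hand-made word lists, purely illustrative.
POSITIVE = {"good", "great", "love", "tasty", "fast"}
NEGATIVE = {"bad", "cold", "slow", "hate", "awful"}


def sentiment(text):
    """Return 'positive', 'negative' or 'neutral' by counting listed words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A word-counting approach like this cannot handle negation ("not good") or sarcasm, which is one of the reasons why pre-trained tools are usually the better choice.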
As you can see, with the aid of those tools we can easily build more complex and intelligent systems: for instance, an app that takes voice commands for completing certain actions, or one that updates the scoreboard while you watch a football game in the app, purely by analyzing the sound of the commentators.
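The scoreboard idea can be sketched in a few lines of Python, assuming that a speech-to-text tool has already turned the commentary into text. The function, the team names in the usage note, and the keyword rule are all hypothetical; a real app would rely on a trained model for recognizing custom items rather than on plain keyword matching.

```python
import re


def update_score(score, transcript_line, teams):
    """Return a new score dict, adding one goal when the line contains
    the word 'goal' and mentions exactly one of the known teams."""
    text = transcript_line.lower()
    if not re.search(r"\bgoal\b", text):
        return score
    mentioned = [team for team in teams if team.lower() in text]
    if len(mentioned) != 1:
        return score  # no team or several teams: leave the score untouched
    updated = dict(score)
    updated[mentioned[0]] = updated.get(mentioned[0], 0) + 1
    return updated
```

With the made-up teams "Lions" and "Tigers", a line such as "GOAL for the Lions!" would add one goal for the Lions, while "Great save by the Tigers keeper" would change nothing. A phrase like "goal kick for the Lions" would wrongly count as a goal, which shows why keyword matching alone is not enough.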
Please note that we haven’t talked about AR/VR techniques and tools, even though they rely heavily on AI and ML. They will be the subject of a future article.