Author: Shashank Raj (Business Analyst, Valiance)
ML models have become indispensable in a world where large volumes of data are generated every day. From predictive modelling to classification, data scientists and ML engineers build a variety of models for business growth and research purposes.
The steps typically followed in building an ML model are:
- Model building
- Model selection
- Model deployment
Taking an ML model from conceptualisation to production is a complex, time-consuming process. One has to choose the best algorithm for the model, manage large amounts of data to train and test it, and then deploy it for continuous use and improvement. Data scientists are typically responsible for model building, whereas ML engineers specialise in deploying these models. In this article, we look specifically at step 3: the deployment of ML models.
ML model deployment:
Deploying an ML model means integrating it into an existing production environment, where it can take in input and return output that can be used to make practical business decisions.
Example: Credit default prediction model
Here, we developed an ML model based on logistic regression. The model was trained on customers' historical credit data. Once the model has been tested, we want to use it to score fresh credit offtake. For this purpose, the model has to be integrated with the new credit database so that it continuously predicts the default risk of each incoming transaction.
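The credit-default workflow can be sketched in a few lines. This is a minimal illustration, not the actual model: the three features (age, income, credit utilisation) and the tiny synthetic dataset are assumptions for demonstration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical historical credit data: [age, income (k), credit utilisation]
X_train = np.array([
    [25, 30, 0.9], [40, 80, 0.2], [35, 55, 0.7],
    [50, 120, 0.1], [29, 35, 0.8], [45, 90, 0.3],
])
y_train = np.array([1, 0, 1, 0, 1, 0])  # 1 = defaulted, 0 = repaid

# Model building: fit logistic regression on the historical data
model = LogisticRegression()
model.fit(X_train, y_train)

# Scoring a fresh credit application with the trained model
new_application = np.array([[33, 45, 0.75]])
default_risk = model.predict_proba(new_application)[0, 1]
print(f"Predicted default risk: {default_risk:.2f}")
```

In production, the `fit` step runs offline on the historical database, while the `predict_proba` call is what the deployed model executes for each new transaction.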
There are various tools that provide a platform for deploying these models:
- Google Cloud
- Amazon SageMaker
- Azure ML services
How to approach ML model deployment?
- API-first approach – Web APIs make it easy for applications written in different languages to work together. If a frontend developer needs an ML model to build an ML-powered web application, they just need the URL of the endpoint from where the API is served.
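As a sketch of the API-first idea, the snippet below wraps a model behind a single HTTP endpoint using Flask. The route name `/predict`, the JSON payload shape, and the `score()` stub are assumptions; in practice the endpoint would load and call the trained model.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def score(features):
    # Stub standing in for a real trained model's prediction
    return 0.5

@app.route("/predict", methods=["POST"])
def predict():
    # The frontend POSTs a JSON payload of features to this endpoint
    payload = request.get_json()
    risk = score(payload["features"])
    return jsonify({"default_risk": risk})

# To serve the API locally: app.run(host="0.0.0.0", port=8080)
```

A frontend developer only needs the endpoint URL (e.g. `https://host/predict`); the language and framework behind it are invisible to them.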
The basic architecture for deploying an ML model is:
The user, working in a browser, is the frontend. The backend server responds to the frontend's requests; to do so, it talks to databases, other APIs, and microservices. The backend may also spawn other jobs, such as ML jobs, at the user's request.
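The "backend spawns ML jobs" part can be sketched with a simple in-process job queue; a real system would hand work to a task queue such as Celery or a cloud queue instead. The job shape and the `run_ml_job` function here are illustrative stubs.

```python
import queue
import threading

jobs = queue.Queue()
results = {}

def run_ml_job(job_id, payload):
    # Stand-in for real ML work (batch scoring, retraining, ...)
    results[job_id] = sum(payload) / len(payload)

def worker():
    # Background worker: drains jobs submitted by the backend
    while True:
        job = jobs.get()
        if job is None:          # sentinel: shut down the worker
            break
        run_ml_job(*job)
        jobs.task_done()

t = threading.Thread(target=worker, daemon=True)
t.start()

# Backend handling a user request: enqueue the ML job instead of blocking
jobs.put(("job-1", [0.2, 0.4, 0.9]))
jobs.join()                      # wait only for demonstration purposes
jobs.put(None)
t.join()
print(results["job-1"])
```

The key design point is that the request handler returns immediately after enqueuing; the expensive ML work happens asynchronously in the worker.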
The architecture of deployment depends on:
- Scalability, i.e. the volume of data the model will have to serve
- Testing, i.e. the ability to test different versions of the model
- Automation, i.e. eliminating manual steps wherever possible to reduce the chance of error
- Extensibility, i.e. whether the model can be extended to other similar use cases
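The testing point above is often implemented by splitting live traffic between model versions. Below is a minimal deterministic hash-based split; the two `model_v*` stubs and the 10% candidate share are illustrative assumptions, not a prescribed setup.

```python
import hashlib

def model_v1(features):
    return 0.40   # stub: current production model

def model_v2(features):
    return 0.35   # stub: candidate model under test

def route(user_id, features, candidate_share=0.10):
    # Hash the user id so each user consistently sees the same version
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    if bucket < candidate_share * 100:
        return "v2", model_v2(features)
    return "v1", model_v1(features)

version, risk = route("user-42", [33, 45, 0.75])
print(version, risk)
```

Hashing the user id (rather than picking randomly per request) keeps each user's experience stable, which makes version comparisons cleaner.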
Here we look at the broad steps involved in deploying a locally trained ML model on Google Cloud; other platforms follow broadly similar steps:
One important factor affecting the accuracy of a deployed model is whether the data used to generate predictions differs from the data used to train it. For example, changing economic conditions could drive up the cost of a product, affecting sales predictions. A robust, continuously evolving model and ML architecture are therefore required.
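A simple guard against this train/serve mismatch is to compare serving-time feature statistics against those recorded at training time. The z-score check and the 3-sigma threshold below are illustrative assumptions; production systems use richer tests (e.g. the population stability index).

```python
import statistics

def drift_score(train_values, serve_values):
    # How many training standard deviations the serving mean has shifted
    mu = statistics.mean(train_values)
    sigma = statistics.stdev(train_values)
    return abs(statistics.mean(serve_values) - mu) / sigma

# Hypothetical product-price feature: training vs serving distributions
train_prices = [10.0, 11.0, 9.5, 10.5, 10.2, 9.8]
serve_prices = [14.8, 15.2, 15.0, 14.9]   # prices rose after training

score = drift_score(train_prices, serve_prices)
ALERT_THRESHOLD = 3.0   # assumed: flag shifts beyond 3 sigma
if score > ALERT_THRESHOLD:
    print(f"Data drift detected (score={score:.1f}): consider retraining")
```

Monitoring like this is what turns a one-off deployment into the "continuously evolving" architecture the article calls for.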
As machine learning techniques continue to evolve and take on more complex tasks, so does our knowledge of how to manage and deliver such applications to production.