Case Studies


Financial Services

AI based Credit Risk Scoring model for Micro Small Consumer Loans

Client Background
The client is a FinTech that provides underserved populations with limited credit history in East Africa and Colombia access to micro loans ranging from 5 to 200 USD, based on their needs.

Business Objective
The client is looking to expand into new markets using its credit platform, powered by ML models built on alternative data. These models decide the first cut of customer eligibility based on a credit score, after which rule-based business logic decides the credit limit, tracked on a real-time basis. Customer eligibility is a function of the model-generated score, while the credit limit is a function of recent transaction patterns.

Solution
The solution development journey was divided into the following phases.

Problem discovery & understanding: During this two-week phase, our team of data scientists and data engineers worked with the client's team to identify and explore relevant datasets: customer demographics, wallet transaction data, past loan history, and repayment data. We further worked with the business to define the "risky" customer to be modelled against. Based on past loan history and wallet transaction patterns, the customer base was segmented into four groups, each modelled separately due to their different risk characteristics.

Model development & training: We created a comprehensive list of features in collaboration with the client's team, chosen for their potential to distinguish risky from non-risky borrowers. Features were based on transaction patterns across time intervals (past 2 weeks, 4 weeks, etc.), time of day, consistency and volatility of transactions, percentage and indexed growth in transactions, and past loan repayment behaviour. We used feature reduction techniques, including Information Value (IV) and Variance Inflation Factor (VIF), to reduce the model to a set of 30 features. Based on the decision boundary we chose tree-based algorithms, selected the best model (XGBoost) after comparing AUC, accuracy, rank-ordering, and KS against Random Forest, Decision Tree, and GBM, and tuned its hyperparameters. Out-of-time validation was done on 3 months of data beyond the training window cut-off. We built a normalized scoring framework by fixing a factor and offset for the four segment models.

Model deployment and post-implementation: We implemented PSI (Population Stability Index) and CSI (Characteristic Stability Index) to compare each week's output against the development sample; a scheduled job emails the PSI and CSI as an Excel attachment (a sketch of the PSI computation follows this case study). We used MLflow to deploy the model trained in Python for scoring in PySpark, with NiFi jobs for the data pipeline.

Outcome
The trained models produced a lift of 60-70% within the first three deciles for all four customer segments in out-of-time validation. Automating credit risk scoring and limit assignment removed subjectivity and cut loan application turnaround time to a few minutes, improving the overall customer experience.
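As an illustration of the weekly stability monitoring described above, here is a minimal sketch of a PSI computation. The bin count, variable names, and synthetic score distributions are illustrative assumptions, not the client's actual implementation.

```python
import numpy as np

def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index between a development-time score
    distribution (expected) and a recent weekly scoring batch (actual)."""
    # Bin edges are fixed from the development sample so weekly runs are comparable.
    edges = np.percentile(expected, np.linspace(0, 100, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf  # catch out-of-range scores

    exp_pct = np.histogram(expected, edges)[0] / len(expected)
    act_pct = np.histogram(actual, edges)[0] / len(actual)

    # A small floor avoids division by zero and log of zero for empty bins.
    exp_pct = np.clip(exp_pct, 1e-6, None)
    act_pct = np.clip(act_pct, 1e-6, None)

    return float(np.sum((act_pct - exp_pct) * np.log(act_pct / exp_pct)))

rng = np.random.default_rng(0)
dev_scores = rng.beta(2, 5, 10_000)    # stand-in for development-sample scores
week_scores = rng.beta(2.2, 5, 2_000)  # stand-in for this week's scores
print(f"PSI = {psi(dev_scores, week_scores):.4f}")
```

A common rule of thumb reads PSI below 0.1 as stable, 0.1 to 0.25 as worth monitoring, and above 0.25 as a drift that needs investigation.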

Industrial

Business Intelligence suite for Power Transmission Utility

Client Background
Our client is a State Power Transmission Utility (STU) responsible for building, operating, and maintaining transmission substations and lines at Extra High Voltages (EHV) of 132 kV, 220 kV, 400 kV and above. It aims to be among the best power transmission utilities in India in operating efficiency, system standards, and commercially viable operation within three years.

Business Objective
The client is undertaking various IT initiatives to improve its operational efficiency. As part of these initiatives, it is deploying applications to streamline processes, reduce personnel workload, and improve internal workflow, communication, and collaboration. To this end, it intends to develop an online platform for monitoring transmission project, operations, finance, and HR parameters on a real-time basis, acting as a decision support system.

Solution
The Valiance team spent the initial few weeks with the client understanding the KPI and access-control requirements of different business units. We further discussed the present application stack, deployment infrastructure, and technology roadmap with the client's IT team. Considering the needs of different internal stakeholders and their varied access requirements, we proposed a suite of web and mobile applications built natively for the cloud. AWS was selected as the cloud platform for its alignment with the client's technology roadmap and its vision of IT as a business catalyst.

The next three months were spent in extensive design and development activity for the power transmission utility, covering:
- Designing user interfaces and user journeys for both web and mobile workflows.
- Designing the data architecture on the cloud to support ingestion of input data and storage of processed data and KPIs for BI consumption. We used S3 for input data storage and RDS to store KPIs.
- Developing the data pipeline to ingest data into S3, then process and move it to RDS (a sketch follows this case study).
- Developing the website and mobile apps for both Android and iOS. The website was deployed on Elastic Beanstalk with Route 53 acting as the DNS provider.

We followed an agile methodology during the design and development phase. After another month of client UAT and feedback, the application was moved to production and three years of historical data was migrated. Incremental data migration was set up at a daily frequency. The mobile apps were also published to the Android and iOS app stores.

Outcome
Reporting time was cut to under an hour, compared to the days or weeks previously needed in certain departments. Substantial cost savings, to the tune of 90% of the manual effort spent consolidating datasets and generating reports.
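A minimal sketch of the kind of S3-to-RDS load step described above. The bucket, key, table, KPI, and connection details are hypothetical placeholders; the actual pipeline's scheduling and orchestration are not shown.

```python
import io

import boto3
import pandas as pd
from sqlalchemy import create_engine

# Hypothetical names; the real bucket, prefix, and RDS endpoint differ.
BUCKET = "stu-input-data"
KEY = "daily/2023-01-01/substation_readings.csv"
RDS_URL = "postgresql+psycopg2://user:password@rds-host:5432/kpi_db"

def load_daily_file(bucket: str, key: str) -> None:
    # 1. Pull the raw CSV dropped into S3 by the daily ingest.
    s3 = boto3.client("s3")
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    df = pd.read_csv(io.BytesIO(body))

    # 2. Derive an example KPI: daily average load per substation.
    kpi = (
        df.groupby("substation_id", as_index=False)
          .agg(avg_load_mw=("load_mw", "mean"), readings=("load_mw", "size"))
    )

    # 3. Append processed KPIs to the RDS table the BI layer reads from.
    engine = create_engine(RDS_URL)
    kpi.to_sql("daily_substation_kpi", engine, if_exists="append", index=False)

if __name__ == "__main__":
    load_daily_file(BUCKET, KEY)
```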

Public Sector

TechSagar – India’s Cybertech Repository

Client Background
The Data Security Council of India (DSCI) is a not-for-profit industry body on data protection in India, set up by NASSCOM®, committed to making cyberspace safe, secure, and trusted by establishing best practices, standards, and initiatives in cyber security and privacy. To further its objectives, DSCI engages with governments and their agencies, regulators, industry sectors, industry associations, and think tanks for policy advocacy, thought leadership, capacity building, and outreach activities.

Business Objective
Cyber technology capabilities have become central to the national strategic outlook, and a concerted effort is needed to develop critical technology capabilities in the country for India's geopolitical advantage. Start-ups, enterprises, academia, researchers, and R&D institutes need to synergise their efforts and work in tandem to achieve this national goal. To further it, TechSagar, India's Cybertech Repository, was conceptualized by the Government of India in partnership with DSCI. This web-based repository serves as a consolidated, comprehensive repository of India's cybertech capabilities and provides actionable insights about the capabilities of Indian industry, academia, and research across 25 technology areas such as IoT, AI/ML, Blockchain, Cloud & Virtualisation, Robotics & Automation, AR/VR, Wireless & Networking, and more.

Solution
Our engagement began at the product ideation stage. We held several closed discussions with business stakeholders and the product team to conceptualize the product, user journey, and UX. The UX and development teams worked in agile mode to quickly build initial working prototypes that were refined with continuous client feedback. The first public launch came within an eight-month timeframe and was widely publicized and captured in the media.

As of now, the repository features 4,000+ entities from industry, academia, and research, including large enterprises and start-ups, providing a country-level view of India's cyber competencies. It also provides information about over 5,000 products and solutions and 3,500+ services from start-ups and large enterprises. Entity data is continuously refined and enriched through integration with public data sources and third-party data providers. The repository allows targeted search, granular navigation, and drill-down across more than 3,000 niche capabilities (a sketch of such a capability query follows this case study).

Technologies used: AWS EC2, MongoDB, ReactJS, Redux, Python Django, CloudFront.

Outcome
The platform went live within 8 months and is publicly accessible. It has facilitated increased collaboration between different stakeholders and forged new connections and relationships for the benefit of the country.
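A minimal sketch of the kind of capability-based entity search the repository supports, using MongoDB from the stated stack. The collection name, document shape, and field names are assumptions for illustration only.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder connection string
entities = client["techsagar"]["entities"]         # hypothetical db/collection names

# Hypothetical document shape:
# {"name": "...", "type": "startup", "tech_areas": ["AI/ML"],
#  "capabilities": ["federated learning", "model compression"]}

def search_entities(tech_area, capabilities, entity_type=None):
    """Targeted search: entities in a technology area matching any of the
    given niche capabilities, optionally filtered by entity type."""
    query = {
        "tech_areas": tech_area,
        "capabilities": {"$in": capabilities},
    }
    if entity_type:
        query["type"] = entity_type
    # Project only the fields needed for a results list; sort for stable display.
    return list(
        entities.find(query, {"_id": 0, "name": 1, "type": 1, "capabilities": 1})
                .sort("name", 1)
    )

results = search_entities("AI/ML", ["federated learning"], entity_type="startup")
for r in results:
    print(r["name"], "-", ", ".join(r["capabilities"]))
```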

Retail

Data Integration For Integrated Business Planning Solutions For Supply And Demand Planning

Client Background
The client is a US-based, 3 billion USD plus footwear designer and distribution brand with 140 company-owned stores, selling in 50 countries.

Business Objective
Given the disruption caused by the Covid pandemic, it is imperative for the customer to break data silos and incorporate relevant data sources into their demand and supply planning, and to move away from manual planning by stakeholders such as demand planners, supply planners, and S&OP teams. To achieve this, the customer is implementing an AI/ML-driven integrated business planning (IBP) solution for supply and demand planning.

Solution
Valiance, in its role as implementation partner for the IBP solution, is responsible for ensuring that the datasets required to drive the various IBP modules (demand planning, supply planning, control tower, etc.) are properly understood, documented, and mapped. Source datasets then need to be picked up from sources such as SFTP or via API calls, cleaned, transformed, and fed into the destination system.

To achieve this, our team of data architects and data engineers worked with the business and technology teams to propose a data architecture based on MSBI technologies, selected to fit the existing technology landscape. The team narrowed down the data mapping, the data models for staging, and the data load frequencies for the different datasets. Over the course of the next 6 months we implemented SSIS routines for demand planning addressing the following scenarios (a sketch of the data-quality handling follows this case study):
- Data transformations considering downstream needs.
- Duplicate records and bad data quality.
- Recovery from job failures.
- Notifications for success and failure.
- Audit logs for traceability.

The SSIS data pipeline processed 50 million records for historical runs and 1 million records for incremental runs.

❝ Valiance's leadership team has been engaged right alongside their resources, making a positive difference in a very difficult project. They are going above and beyond to make the project successful, and I appreciate their contributions very much. The team is knowledgeable and dedicated to the customer's success. ❞
- SVP for Apparel and Fashion, Global Supply Chain ISV

Outcome
Successful data integration for demand planning underpins reliable demand forecasts across product categories, styles, geographies, and SKUs, which in turn enables proper supply chain management, sourcing of materials, production planning, and distribution. At Valiance we have developed a precise understanding of the data relevant to apparel, footwear, and lifestyle (AFL) companies with respect to demand planning, supply planning, S&OP, and IBP. This capability, combined with our data science expertise, enables us to help such clients increase demand forecast accuracy, directly resulting in better sourcing precision and more optimized manufacturing and distribution.
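The routines above are native SSIS packages; as a language-neutral illustration, here is a minimal pandas sketch of the kind of dedup, quarantine, and audit-logging rules they implement. The column names, sample rows, and rejection rules are hypothetical.

```python
import logging

import pandas as pd

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("demand_etl")  # stands in for SSIS audit logging

# Hypothetical extract: weekly order history pulled from SFTP/API sources.
raw = pd.DataFrame({
    "sku":   ["A1", "A1", "B2", "C3", None],
    "week":  ["2023-01", "2023-01", "2023-01", "2023-01", "2023-02"],
    "units": [120, 120, -5, 300, 40],
})

# 1. Drop exact duplicate records (same SKU, week, and units).
deduped = raw.drop_duplicates(subset=["sku", "week", "units"])

# 2. Quarantine bad data instead of silently dropping it, for traceability.
bad_mask = deduped["sku"].isna() | (deduped["units"] < 0)
rejected = deduped[bad_mask]
clean = deduped[~bad_mask]

# 3. The audit log mirrors the SSIS success/failure notifications.
log.info("rows in=%d, duplicates removed=%d, rejected=%d, loaded=%d",
         len(raw), len(raw) - len(deduped), len(rejected), len(clean))

# 4. Transform for the downstream demand-planning module (example: totals per SKU).
output = clean.groupby(["sku", "week"], as_index=False)["units"].sum()
print(output)
```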

Retail

AI Based Demand Forecasting For A Global Beverages Company

Client Background
The client is one of the largest brewing companies in the United States, employing over 30,000 people. Before its acquisition, it was the world's largest brewing company by revenue and third by brewing volume. The division operates 12 breweries in the United States and 17 overseas.

Business Objective
The client wanted to replace its legacy APO system with a machine learning forecasting capability that can automatically enrich forecasts with external drivers and minimize manual enrichment.

Solution
Our data science team worked with the client and the ISV to acquire the following datasets, needed for forecasting, on Google Cloud Platform:
- Orders: weekly data for each of 10,000 SKUs.
- Macro-economic drivers: Consumer Price Index, LCU per USD, Retail Sales Index (USD), Industrial Production Index, and Inverse Exchange Rate.
- Weather drivers: weighted precipitation, snow, weighted temperature, max weighted temperature, and min weighted temperature.
- Marketing spend drivers: digital media buying, brand events, traditional media buying, and brand promotion.
- Promotions: two types of promotions in Canada, price-drop promotions and goodies promotions.

Google Dataproc was used to process and prepare the data for model training, with Vertex AI used for model training (a training sketch follows this case study).

The demand forecasting capability we developed delivers an operational weekly forecast at the item, sales channel, and monthly level. It has been implemented for orders, shipments, and POS for the USA, Mexico, and Canada.

Outcome
Forecasted values are surfaced in the supply chain solution to provide a granular view at the region, province, item, and customer level. We achieved accuracy levels of 90% for the 12-week forward forecast on the 20% of SKUs that drive roughly 80% of the customer's business; for the remaining SKUs, accuracy levels were 65%.
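A minimal sketch of a driver-enriched weekly forecasting model of the kind described above, using XGBoost on a time-ordered split. The features, coefficients, and synthetic data are illustrative assumptions, not the production Vertex AI pipeline.

```python
import numpy as np
import pandas as pd
from xgboost import XGBRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for the prepared training table: one row per SKU-week,
# with lagged demand plus macro/weather/marketing drivers as features.
n = 5_000
df = pd.DataFrame({
    "units_lag_1": rng.poisson(100, n).astype(float),
    "units_lag_4": rng.poisson(100, n).astype(float),
    "cpi":         rng.normal(120, 2, n),
    "temperature": rng.normal(18, 8, n),
    "media_spend": rng.gamma(2, 50, n),
    "promo_flag":  rng.integers(0, 2, n).astype(float),
})
df["units"] = (
    0.6 * df["units_lag_1"] + 0.2 * df["units_lag_4"]
    + 1.5 * df["temperature"] + 15 * df["promo_flag"]
    + rng.normal(0, 10, n)
)

features = ["units_lag_1", "units_lag_4", "cpi", "temperature",
            "media_spend", "promo_flag"]

# Time-ordered split: never validate a forecaster on data older than training data.
split = int(n * 0.8)
train, valid = df.iloc[:split], df.iloc[split:]

model = XGBRegressor(n_estimators=300, max_depth=6, learning_rate=0.05)
model.fit(train[features], train["units"])

pred = model.predict(valid[features])
wmape = np.abs(valid["units"] - pred).sum() / valid["units"].abs().sum()
print(f"Forecast accuracy (1 - WMAPE): {1 - wmape:.1%}")
```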

Retail

Data Integration for Merchandise Financial & Assortment Planning

Client Background
The client is a Mexico-based retail giant operating the largest department stores in Mexico, a USD 4.5 billion company with 35,000 employees.

Business Objective
Implementation of a sophisticated AI/ML-driven IBP platform for Merchandise Financial Planning, Assortment Planning, Replenishment, Allocation, and Demand Planning. The implementation will help the client:
- Make annual projections for sales LP and turnover margin.
- Determine OTB (open-to-buy).
- Synchronize with daily and weekly goals for salespeople.
- Reconcile with the global plan and channel mix.
- Reconcile with the assortment.

Solution
Valiance, in its role as implementation partner for the IBP solution, is responsible for ensuring that the datasets required to drive the various IBP modules, namely Merchandise Financial Planning (MFP) and Assortment Planning (AP), are properly understood, documented, and mapped to business points. Source datasets then need to be picked up from Google Cloud Storage (GCS) using gsutil commands written inside batch files, validated, cleaned, transformed, and fed into the destination system (a fetch sketch follows this case study).

To achieve this, our team of data architects and data engineers worked with the business and technology teams to propose a data architecture based on MSBI technologies, especially SSIS. The team completed the data mapping, the staging data models, and the data load frequencies for the different datasets. Over the course of the next 12 months we implemented SSIS routines for MFP and AP addressing the following scenarios:
- Fetching data from Google Cloud Storage as UTF-8 compatible text files and loading it into staging tables.
- Data transformations between staging and output tables, ensuring correct Spanish-language data is pushed.
- Duplicate records and bad data quality.
- Recovery from job failures.
- Notification emails for success and failure.
- Audit logs for traceability.

Outcome
Successful implementation of the data integration will serve as the foundation for the second phase of the project, in which the process will be enhanced with analytics to improve forecast accuracy, increase inventory turns, and reduce standing orders and lost sales. At Valiance we have developed a precise understanding of the data relevant to AFL companies with respect to demand planning, supply planning, S&OP, and IBP. This capability, combined with our data science expertise, enables us to help such clients increase demand forecast accuracy, directly resulting in better sourcing precision and more optimized manufacturing and distribution.
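A minimal sketch of the GCS fetch step described above, wrapping gsutil in Python rather than a Windows batch file. The bucket, prefix, and staging path are hypothetical placeholders.

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical locations; the real bucket, prefix, and staging folder differ.
GCS_PREFIX = "gs://retail-ibp-extracts/mfp/2023-01-01/*.txt"
STAGING_DIR = Path(r"C:\etl\staging\mfp")

def fetch_extracts() -> None:
    STAGING_DIR.mkdir(parents=True, exist_ok=True)
    # Mirrors the batch-file step: gsutil copies the day's UTF-8 text extracts
    # into the local staging folder that the SSIS package reads from.
    result = subprocess.run(
        ["gsutil", "-m", "cp", GCS_PREFIX, str(STAGING_DIR)],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        # A failure surfaces to the job scheduler, which sends the notification email.
        print(result.stderr, file=sys.stderr)
        raise RuntimeError("gsutil copy failed")
    print(f"Fetched extracts into {STAGING_DIR}")

if __name__ == "__main__":
    fetch_extracts()
```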

Industrial

Industrial IoT Implementation For A Global Chemicals Manufacturer

Client Background
The client is a leading Fortune 500 company in the chemicals business, with a significant presence across the specialty chemicals value chain through offerings in epoxy, food phosphates, sulphites, and water treatment chemicals. Its comprehensive product portfolio provides solutions across a wide range of industries such as water treatment, disinfection, cleaning and sanitization, food, pharmaceuticals, agriculture, and allied sectors.

Business Objective
As part of a digital transformation exercise aimed at business and operational excellence, the client needs to collect data from plant assets and devices using industrial sensors. This data will be used for remote asset monitoring, production process monitoring for yield maximization, safety monitoring to prevent industrial mishaps, predictive maintenance workloads, and energy optimization.

Solution
The client engaged Valiance to implement a data collection and ingestion pipeline from plant assets using open source technologies. Data was to be extracted from sensors, filtered for anomalies, and ingested into databases for further monitoring and analysis. The physical assets included:
- Air compressor
- Cell house
- Chlorine compressor
- Chlorine liquifier

Our team created a modular data pipeline using Python libraries, e.g. separate scripts for fetching data from sensors, data validation, and data ingestion. We also maintained metadata for sensors, equipment, and other configurations. For each physical asset, properties such as temperature, pressure, and energy consumption are monitored. An analytical module detects error values and abnormalities based on expected values and historical data; any alarms raised are sent to end users as notifications through email, SMS, and app channels. The application is deployed on Azure virtual machines using FastAPI (an ingestion sketch follows this case study), and we use Power BI to display metrics for the operational teams.

Outcome
We successfully completed the first phase of asset digitization by putting monitoring applications in place. As a next step we are building predictive capabilities and dockerising the applications to make them suitable for multi-cloud and hybrid deployments.
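A minimal sketch of a FastAPI sensor-ingestion endpoint with an expected-range check of the kind described above. The route, payload shape, sensor IDs, and limits are illustrative assumptions.

```python
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()

# Hypothetical per-sensor metadata: expected operating ranges.
SENSOR_LIMITS = {
    "air_compressor_pressure": (4.0, 9.0),      # bar
    "chlorine_liquifier_temp": (-40.0, -20.0),  # degrees C
}

class Reading(BaseModel):
    sensor_id: str
    value: float
    timestamp: str  # ISO-8601

@app.post("/readings")
def ingest(reading: Reading):
    limits = SENSOR_LIMITS.get(reading.sensor_id)
    if limits is None:
        raise HTTPException(status_code=404, detail="unknown sensor")

    lo, hi = limits
    abnormal = not (lo <= reading.value <= hi)
    if abnormal:
        # In the real pipeline this would fan out to email/SMS/app channels.
        print(f"ALARM: {reading.sensor_id}={reading.value} outside [{lo}, {hi}]")

    # Persisting to the monitoring database is elided in this sketch.
    return {"accepted": True, "abnormal": abnormal}

# Run with: uvicorn sensors:app --reload  (assuming this file is sensors.py)
```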

Retail

ML based Demand Forecasting for Fashion & Lifestyle Retailer

Client Background
The client is a global well-being company retailing, distributing, and manufacturing a portfolio of leading international and home-grown brands across the sport, food, and health sectors.

Business Objective
The client has both seasonal and core lines, with more than 200,000 SKUs overall (including products launched as New Product Initiatives) and over 500 retail stores. This called for a non-traditional, machine-learning forecasting approach able to incorporate multiple external variables such as Google mobility, weather events, Covid, promotional flags, and a floating calendar.

Solution
The project was executed in three phases:
- Phase 1: data collection and harmonization.
- Phase 2: exploratory data analysis and segmentation.
- Phase 3: ML model building and iteration, followed by model training and validation.

The third and final phase consisted of building and validating the ML models. The following modelling techniques were used:
- Linear Regression: easier to interpret and debug, so a good starting point.
- XGBoost: a tree-based approach using boosting, a homogeneous weak-learner ensemble in which learners learn sequentially and adaptively to improve the model's predictions.
- Random Forest: a bagging approach; also a homogeneous weak-learner ensemble, but the learners are trained independently in parallel and combined by averaging.
- Gradient Boosting.
- Light Gradient Boosted Machine (LightGBM).

Model validation in forecasting: the dataset is divided into training and validation sets. We go back some time period (say, a number of weeks) and forecast that period using only prior historical sales, then evaluate the model by comparing actual sales against forecast sales on the validation set (a sketch of this backtest follows this case study).

External variables taken into consideration included:
- Macro-economic: GDP, inflation, industrial production, unemployment rate, inverse exchange rate.
- Holiday information: festivals, national holidays, other holidays.
- Floating calendar: special events such as sports leagues.
- Promotional details: discounts and offers on items.
- Item-level information: SKU, style colour, sub-class, category; item attributes (e.g. polo, grey colour).
- Weather information: temperature, wind speed, humidity, rain.
- Marketing spend: expenses across marketing channels, for example digital media, traditional media, and brand events.
- Mobility data from Google/Apple: mobile phone traffic across places such as parks and residential areas.
- Special days: Thanksgiving, Christmas, New Year, Black Friday, etc.
- Product attributes: size, colour, design, etc.
- Sales channel.
- Store format.
- Covid data: total confirmed/tested cases for a region.
- Other specific variables: holiday calendar, promotions, markdowns.

Outcome
- 80% improvement in forecast accuracy.
- 8X reduction in manual intervention.
- 40% reduction in inventory value.
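A minimal sketch of the backtest-style validation described above: hold out the last few weeks, forecast them from history only, and compare actuals against the forecast. The seasonal-naive stand-in model and synthetic series are illustrative assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Synthetic weekly sales for one SKU: trend + yearly seasonality + noise.
weeks = pd.date_range("2021-01-04", periods=156, freq="W-MON")
sales = (200 + 0.3 * np.arange(156)
         + 30 * np.sin(np.arange(156) * 2 * np.pi / 52)
         + rng.normal(0, 10, 156))
series = pd.Series(sales, index=weeks)

HOLDOUT = 12  # go back 12 weeks and pretend we are forecasting them

train, actuals = series.iloc[:-HOLDOUT], series.iloc[-HOLDOUT:]

# Stand-in model: seasonal naive forecast (same weeks one year earlier).
forecast = train.iloc[-52:-52 + HOLDOUT].to_numpy()

# Evaluate actual vs forecast sales on the holdout, as the validation step describes.
wmape = (np.abs(actuals.to_numpy() - forecast).sum()
         / np.abs(actuals.to_numpy()).sum())
print(f"12-week backtest accuracy (1 - WMAPE): {1 - wmape:.1%}")
```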

Industrial

Machine Learning Model to Predict Likelihood of Mineral

Client Background
The client is a multinational metals and mining corporation producing iron ore, copper, diamonds, gold, and uranium.

Business Objective
To meet current and future demand and explore new business opportunities, the client works to find new mineral sites. It wants to process remote sensing (RS) image data to identify new mineral occurrence locations of commercial interest.

Solution
We developed a machine learning model that identifies new mineral occurrence locations by scoring sites on likelihood, using band ratios and other combinations of band reflectance values. Key steps involved (a feature sketch follows this case study):
- Re-factor the code to Python and access AWS processing infrastructure.
- Test and validate the re-factored Python code against the current process.
- Extract information from RS scenes to create prediction variables.
- Use the information gained from this analysis to create the machine learning model.
- Validate the machine learning model against hold-out data.
- Score incoming RS scene data to identify new potential mineral sites.

Outcome
Our solution achieved an accuracy of 95% on round 1 of images.
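A minimal sketch of using band ratios of reflectance values as prediction variables for a likelihood model, as described above. The band indices, ratio choices, labels, and synthetic data are illustrative assumptions, not the client's actual features.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)

# Synthetic stand-in: per-site mean reflectance in 6 spectral bands.
n_sites = 2_000
reflectance = rng.uniform(0.05, 0.6, size=(n_sites, 6))

def band_ratios(r: np.ndarray) -> np.ndarray:
    """Prediction variables from band combinations (hypothetical choices):
    two simple band ratios plus a normalized difference."""
    b = {i: r[:, i] for i in range(r.shape[1])}
    return np.column_stack([
        b[3] / b[1],                    # simple band ratio
        b[4] / b[5],
        (b[2] - b[0]) / (b[2] + b[0]),  # normalized difference
    ])

X = band_ratios(reflectance)
# Synthetic labels loosely tied to the first ratio, standing in for known occurrences.
y = (X[:, 0] + rng.normal(0, 0.3, n_sites) > np.median(X[:, 0])).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# Score held-out sites on likelihood of mineral occurrence.
likelihood = model.predict_proba(X_te)[:, 1]
print(f"hold-out accuracy: {model.score(X_te, y_te):.2%}")
print("top-5 site likelihoods:", np.sort(likelihood)[-5:].round(3))
```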

Financial Services

Assessing The Effectiveness Of Communication Methods and Content Through A/B Testing

Client Background
The client is a mobile app based FinTech in the Indian market, operating in personal loans, housing loans, health insurance, and mutual funds. Its aim is to make personal finance simple and accessible to a wider audience.

Business Objective
The client is a start-up in its growth phase, experimenting continuously to learn what works best in its market. We worked with the Growth and Marketing teams; the business objective was to identify how efficient the different modes of communication are (SMS, push notifications, emails, performance marketing) and what type of messaging content drives higher customer engagement.

Solution
- We leveraged data from the customer engagement and communications platforms the client was using: MoEngage, AppsFlyer, Kaleyra, and Gupshup.
- We then performed a series of A/B tests to assess the performance of different modes of communication and different types of messaging content (a test sketch follows this case study).
- The user base was divided into cohorts with a uniform distribution across cohorts based on user attributes (demographic and bureau information).
- Success metrics were defined to track the effectiveness of the campaigns, viz. user reachability, user engagement (views/clicks), and user conversions (from onboarding through the conversion journey).
- User engagement via clicks was studied using AppsFlyer data, which tracked engagement across the different deep links incorporated into the campaigns.
- We defined the attribution time window for linking a success event to a particular campaign event by examining the data distribution and studying industry standards.
- We built a dashboard on Superset (later migrated to Tableau) to help the different stakeholders track the performance of the campaign experiments over time.

Outcome
These activities helped the client identify the appropriate communication channels and the content that worked best for different sets of customers, enhancing its marketing strategy and letting it channel budget accordingly. Success metrics were defined as the proportion of users who engaged on the platform after receiving a communication, how far along the product journey they proceeded, and, ultimately, how many converted to paid customers.
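A minimal sketch of evaluating one such A/B test: comparing click-through proportions between two cohorts with a two-sample z-test. The channels, counts, and significance threshold are illustrative assumptions.

```python
from statsmodels.stats.proportion import proportions_ztest

# Hypothetical campaign results: cohort A got SMS, cohort B got push notifications.
clicks = [420, 510]      # users who clicked the deep link, per cohort
sent = [10_000, 10_000]  # users reached, per cohort

# Two-sided test: is the click-through rate different between channels?
z_stat, p_value = proportions_ztest(count=clicks, nobs=sent)

ctr_a, ctr_b = clicks[0] / sent[0], clicks[1] / sent[1]
print(f"CTR A (SMS):  {ctr_a:.2%}")
print(f"CTR B (push): {ctr_b:.2%}")
print(f"z = {z_stat:.2f}, p = {p_value:.4f}")

if p_value < 0.05:  # conventional threshold; choose per business risk appetite
    winner = "push notifications" if ctr_b > ctr_a else "SMS"
    print(f"Statistically significant difference; {winner} performed better.")
else:
    print("No statistically significant difference between channels.")
```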
