HRM teams continually work to improve their hiring process so that they bring the best talent into the organisation. Even when hiring managers assess a candidate’s behavioural and cultural fit alongside experience and skill sets, HR teams often cannot evaluate a candidate’s long-term success, which leads to high voluntary attrition.

Problem Statement

Organisations invest significant resources in hiring and training new employees and in running training programs for their existing employees. All of this is done on the presumption that it will improve employee productivity, and it takes a significant gestation period to pay off. High voluntary attrition can harm the organisation’s growth, the morale of existing employees and business continuity, and it has a significant impact on the bottom line.

Business Need

The key objectives of the solution are:

  • A classification model to predict the chance of an employee leaving the organisation, which the HRM team can use to anticipate resource requirements and to improve employee policies
  • Intervention strategies based on employee segments

Proposed Solution

The HRM team needs to identify employees at high risk of attrition and intervene in time to prevent voluntary attrition. Employees would be scored monthly on their propensity to leave in the coming month, giving HRM and managerial teams an advance signal and time to intervene.

Example: Assume that training a new employee costs $2,000. If we can predict which employee is going to leave next month and offer him or her a bonus program worth $500 to stay for the next 6 months, we retain an experienced, well-trained employee, with higher morale, at a lower cost than training a replacement.
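The arithmetic behind this example can be sketched in a few lines of Python. The $2,000 and $500 figures come from the example; the `precision` and `retention_rate` parameters are hypothetical additions that show how the saving shrinks when some flagged employees were never going to leave, or leave despite the bonus. The sketch also assumes the bonus is paid to every flagged employee.

```python
def net_saving_per_flagged_employee(
    replacement_cost: float,
    bonus_cost: float,
    precision: float = 1.0,
    retention_rate: float = 1.0,
) -> float:
    """Expected saving from offering a bonus to one flagged employee.

    precision: share of flagged employees who really would have left (assumption).
    retention_rate: share of those who stay because of the bonus (assumption).
    """
    # The training cost is only avoided when the flag was right AND the bonus worked.
    avoided_cost = precision * retention_rate * replacement_cost
    # The bonus is assumed to be paid to every flagged employee.
    return avoided_cost - bonus_cost


# Best case from the example: every prediction is right and every bonus works.
best_case = net_saving_per_flagged_employee(2000, 500)

# A more cautious, purely illustrative scenario.
cautious = net_saving_per_flagged_employee(2000, 500, precision=0.6, retention_rate=0.5)

print(f"best case: ${best_case:.0f}, cautious: ${cautious:.0f}")
```

Under the cautious assumptions the saving per flagged employee falls sharply, which is why the quality of the prediction matters as much as the size of the bonus.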

This frequently updated ML Score will be a significant tool in combating attrition, as companies can design retention strategies accordingly, with direct impact on the bottom line.

Indicative attributes (in general, the more exhaustive the attributes, the better the model can classify employees):

  1. Age
  2. Business travel frequency
  3. Daily rate
  4. Department: Sales, Research & Development, Human Resources, Marketing etc.
  5. Distance from home
  6. Educational qualification
  7. Employee count
  8. Environment satisfaction
  9. Gender: Male, Female
  10. Hourly rate
  11. Job involvement (Feedback)
  12. Job level & Role
  13. Job satisfaction from employee surveys
  14. Marital status
  15. Monthly income
  16. Monthly rate
  17. Number of companies worked
  18. Appraisal hike
  19. Performance rating & Percentile
  20. Standard hours: True or False
  21. Stock option level if Applicable
  22. Total working years
  23. Work life balance survey
  24. Years at company
  25. Years in current role
  26. Years since last promotion
  27. Years with current manager
  28. Market salary benchmarking

Our Analytical Approach



It is important to test the outcomes generated by AI algorithms through manual validation. The AI predictions must be compared with actual outcomes to understand the effectiveness of the algorithm. Comparing these predictions with the results of the current workflow helps the AI system learn continuously.

Alternative feedback channels must be created for the AI system to validate its outcomes, particularly for use cases where reasoning is crucial. For example, when determining the medical admissibility of a claims application, it is important to consider the mandatory documentation before processing the claim. A man+machine ecosystem can gather enough relational information to design a complex system that automates such high-level tasks in future.
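One simple way to compare predictions with actual outcomes is to count the four possible cases and derive accuracy, precision and recall from them. The sketch below is illustrative only: the function name `evaluate` and the sample lists are hypothetical, with 1 meaning "left" and 0 meaning "stayed".

```python
def evaluate(predicted, actual):
    """Compare predicted attrition flags with actual outcomes (1 = left, 0 = stayed)."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and of equal length")

    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)  # flagged, and left
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)  # flagged, but stayed
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)  # missed leaver
    tn = sum(1 for p, a in pairs if p == 0 and a == 0)  # correctly not flagged

    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical predictions for eight employees and what actually happened.
predicted = [1, 0, 1, 1, 0, 0, 1, 0]
actual = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = evaluate(predicted, actual)
print(metrics)
```

Recall shows how many real leavers the model caught, while precision shows how many of the flagged employees really left. Both are worth tracking alongside accuracy, since leavers are usually a minority.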

Key Steps Involved

  • Standardize the data provided by the client
  • Perform statistical analysis to study the impact of each attribute on attrition
  • Use the information gained from the above analysis to create a Machine Learning model to predict attrition
  • Integrate model with data stream for monthly run
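As one possible form of the statistical analysis step, a Pearson chi-square statistic can indicate whether a categorical attribute (for example business travel frequency) is associated with attrition. This is a minimal sketch under that assumption, not necessarily the test the authors used; the function name and the tiny sample are hypothetical.

```python
from collections import Counter


def chi_square_statistic(categories, left_flags):
    """Pearson chi-square statistic for a categorical attribute vs. attrition."""
    n = len(categories)
    observed = Counter(zip(categories, left_flags))  # cell counts
    row_totals = Counter(categories)                 # counts per attribute value
    col_totals = Counter(left_flags)                 # counts per outcome

    stat = 0.0
    for row, row_total in row_totals.items():
        for col, col_total in col_totals.items():
            # Count expected in this cell if attribute and attrition were independent.
            expected = row_total * col_total / n
            stat += (observed[(row, col)] - expected) ** 2 / expected
    return stat


# Hypothetical sample: travel frequency and whether the employee left (1) or stayed (0).
travel = ["Frequent"] * 4 + ["Rare"] * 4
left = [1, 1, 1, 0, 0, 0, 0, 1]
stat = chi_square_statistic(travel, left)
print(stat)
```

The statistic is compared with a chi-square distribution with (rows - 1) × (columns - 1) degrees of freedom. For a 2×2 table that is 1 degree of freedom, where the 5% critical value is about 3.84, so this tiny sample would not count as evidence of association.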


Building Machine Learning Model

  1. Gathering the data: We had employee data for the last 4 years. The data contained basic details such as age, gender, educational qualification, address and place of residence, and professional details such as date of joining, experience, skills, projects worked on, designation, year-end reviews and reasons for resignation (in the case of ex-employees).
  2. We ran association analysis on the historical data to check for associations between the variables.
  3. Building the model: Tool – Python
    • Divide the dataset into train and test data. We used 70% of the data to train the model and the remaining 30% to test it
    • Classification algorithms used were Random Forest, Decision Tree, Gradient Boosting
    • The models were validated using the test data for accuracy and then the champion model was selected
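The model-building steps above can be sketched with scikit-learn. This is an illustration, not the authors' actual code: the employee data is not available, so `make_classification` generates a synthetic stand-in, and every parameter value (sample size, number of features, class weights, random seeds) is an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the employee data: about 20% of rows are "leavers" (label 1).
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.8, 0.2], random_state=0
)

# 70% of the data for training, 30% for testing, keeping the leaver share equal in both.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# The three algorithm families named in the text.
candidates = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

# Fit each candidate and record its accuracy on the held-out test data.
scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

# The champion is the candidate with the highest test accuracy.
champion = max(scores, key=scores.get)
print(scores)
print("Champion:", champion)
```

Because leavers are usually a minority, accuracy alone can favour a model that rarely flags anyone. Precision and recall on the leaver class are worth checking before settling on a champion.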

The model scores existing employees on a scale from 0 to 1: a score close to 0 indicates that the employee is least likely to leave the company, and a score close to 1 indicates that the employee is most likely to leave the organisation.
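To connect the score with segment-based intervention, the monthly scores can be bucketed into risk segments. The sketch below is illustrative: the cutoffs of 0.3 and 0.7, the segment names and the employee IDs are all hypothetical and would need to be agreed with the HRM team.

```python
def risk_segment(score, medium_cutoff=0.3, high_cutoff=0.7):
    """Map an attrition score in [0, 1] to a risk segment (cutoffs are assumptions)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= high_cutoff:
        return "high"
    if score >= medium_cutoff:
        return "medium"
    return "low"


# Hypothetical monthly scores keyed by employee ID.
monthly_scores = {"E001": 0.82, "E002": 0.45, "E003": 0.07}
segments = {emp: risk_segment(score) for emp, score in monthly_scores.items()}
print(segments)
```

With a scikit-learn classifier, scores of this kind are typically taken from the second column of `predict_proba`, which holds the estimated probability of the positive (leaver) class.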



HR Analytics is no longer a luxury for organisations; it is now an essential ingredient for success. ML will play a crucial role in the evolving path of HR teams in every organisation.

The attrition prediction model can lead to overall savings in the Human Resource Management process, which includes hiring new employees, retention, and spending on the training and development of new employees. Attrition causes losses to projects through the departure of valuable employees. The model can guide the organisation on further actions to take. Organisations can thus keep track of the budget they have spent on human resource management and the budget to be spent in future, and take the necessary actions. Through these data-driven models they can forge long-term engagement with their employees.

Author: Shashank Raj (Business Analyst) and Kavita Yadav

