Organizations around the world strive to be profitable, and to become more profitable they must satisfy the needs of their customers. But when individual customers differ from one another, how can an organization do that effectively?


The answer is to recognize these differences and divide the customers into distinct segments. But how do organizations segment their customers? In this article, we'll help you understand this from a data science perspective.


What is customer segmentation?

Customer segmentation is the process of dividing the customer base into different segments, where each segment represents a group of customers who have common characteristics and similar interests. As explained above, customer segmentation is done to better understand the needs of customers and to deliver targeted products, services, and content. Over time, all sorts of organizations, from e-commerce to pharmaceuticals to digital marketing, have recognized the importance of customer segmentation and are using it to improve customer profitability.


Customer segmentation can be carried out on the basis of various traits. These include:


  1. Demographic:
    This includes variables like age, gender, income, location, family situation, education, etc. A simple example of demographic segmentation could be a vehicle manufacturer that sells luxury cars (say, Mercedes). Here, the company would likely target an audience that has a higher income.

  2. Geographic:
    Geographic segments are created from factors like country, radius around a certain location, climate, and urban or rural setting. Example: in India, almost two-thirds of the population resides in rural areas while the remaining one-third resides in urban areas, so the customer outreach strategy for each segment will need to be different.

  3. Psychographic:
    Here, customers are categorized by factors that relate to their personalities and characteristics, like interests, values, lifestyle, attitudes, etc.

  4. Behavioral:
    Behavioral segmentation involves analyzing purchasing habits and brand interactions. E-commerce companies frequently use behavioral segmentation to position their products through recommendation engines.


How to perform customer segmentation?


Start with –


Identifying the problem statement

One of the first steps is to identify the need for the segmentation exercise. The problem statement and the expected output will guide the process of segmentation. For example:




  1. A media house may want to deliver targeted and tailored content to its audience
  2. An e-commerce company may want to retain its customers


In both cases, the intent or need behind customer segmentation is different. This in turn determines the approach taken to achieve the desired outcome.


Gathering data

The next step is to have the right data for the analysis. Data can come from different sources: the company's internal database, surveys, and other campaigns. Third-party platforms like Google, Facebook, and Instagram also have advanced analytics capabilities that allow behavioral and psychographic data about customers to be captured.
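As a rough illustration of this step, the sketch below assumes the internal database exposes raw transactions as (customer, order value) pairs and aggregates them into per-customer behavioral features. The data, the field names, and the `build_customer_features` helper are all hypothetical and made up for this example.

```python
from collections import defaultdict

# Hypothetical raw transactions: (customer_id, order_value).
# These values are invented for illustration only.
transactions = [
    ("c1", 120.0),
    ("c2", 15.5),
    ("c1", 80.0),
    ("c3", 42.0),
    ("c2", 9.5),
    ("c1", 100.0),
]


def build_customer_features(rows):
    """Aggregate transactions into simple per-customer behavioral features."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for customer_id, value in rows:
        totals[customer_id] += value
        counts[customer_id] += 1
    return {
        cid: {
            "orders": counts[cid],
            "total_spend": totals[cid],
            "avg_order_value": totals[cid] / counts[cid],
        }
        for cid in totals
    }


features = build_customer_features(transactions)
for cid, row in sorted(features.items()):
    print(cid, row)
```

In practice the same aggregation would usually be done in the database or with a data-frame library; the point here is only that segmentation works on one row of features per customer.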


Creating the customer segments

Once you have defined the problem statement and gathered all the required data for it, the next step is to carry out the segmentation exercise. The key steps here are:


  1. Determine the number of segments required
  2. Select the segment variables
  3. Model the data through clustering algorithms
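One practical detail when selecting segment variables: distance-based clustering is sensitive to the scale of each variable, so variables are commonly standardized before modelling. The sketch below shows one way to do this (z-scores) for two selected variables. The customer records, the variable names, and the `standardize` helper are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical customers described by two selected segment variables.
customers = {
    "c1": {"annual_income": 30000.0, "orders_per_year": 2.0},
    "c2": {"annual_income": 45000.0, "orders_per_year": 12.0},
    "c3": {"annual_income": 60000.0, "orders_per_year": 5.0},
    "c4": {"annual_income": 120000.0, "orders_per_year": 1.0},
}


def standardize(records, variables):
    """Return a z-score vector per customer for the chosen variables."""
    stats = {}
    for var in variables:
        values = [rec[var] for rec in records.values()]
        stats[var] = (mean(values), pstdev(values))
    scaled = {}
    for cid, rec in records.items():
        scaled[cid] = [
            # Guard against a constant column (standard deviation of zero).
            (rec[var] - stats[var][0]) / stats[var][1] if stats[var][1] > 0 else 0.0
            for var in variables
        ]
    return scaled


scaled = standardize(customers, ["annual_income", "orders_per_year"])
for cid, vector in scaled.items():
    print(cid, [round(v, 2) for v in vector])
```

Without this step, a variable measured in tens of thousands (income) would dominate one measured in single digits (orders) in any distance calculation.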


Data science and statistical analysis, supported by machine learning tools, help organizations handle large customer databases and apply segmentation techniques. Clustering, a data science method, is a good fit for customer segmentation in most cases. The right clustering algorithm to use depends on the type of clustering you want.


Many algorithms use similarity or distance measures between data points in the feature space in an effort to discover dense regions of observations. Some of the widely used machine learning clustering algorithms are:


  • K-means clustering: A method to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean (the cluster center, or centroid), which serves as a prototype of the cluster.
  • BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies): An unsupervised data mining algorithm used to perform hierarchical clustering over particularly large datasets.
  • DBSCAN (Density-Based Spatial Clustering of Applications with Noise): A density-based, non-parametric clustering algorithm: given a set of points in some space, it groups together points that are closely packed and treats points in low-density regions as noise.
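To make the first of these concrete, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python, using Euclidean distance as the distance measure. The six two-dimensional points stand in for already-scaled customer features and are invented for illustration; the `k_means` helper and its evenly spaced initialization are simplifying assumptions, not a production approach.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_means(points, k, max_iter=100):
    """Partition points into k clusters by alternating assignment and update."""
    n = len(points)
    # Naive deterministic initialization: k evenly spaced points from the input.
    centroids = [list(points[i * n // k]) for i in range(k)]
    labels = [None] * n
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster with the nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical customers: two visibly separated groups in feature space.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
    (8.0, 8.0), (8.2, 7.9), (7.8, 8.1),
]
labels, centroids = k_means(points, k=2)
print(labels)
print(centroids)
```

For real workloads, a library implementation would normally be preferred over hand-written code. For instance, scikit-learn provides `KMeans`, `Birch`, and `DBSCAN` in its `sklearn.cluster` module.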


Segmentation backed by data science helps organizations forge a deeper relationship with their customers. It helps them make informed retention decisions, build new features, and strategically position their product in the market.

