These days, when we talk about hiring, recruitment and human resources, we often come across the term ‘analytics’ or, more specifically, HR analytics. Related to HR analytics are several other terms, such as data mining and predictive analytics. But how many of us know exactly what these terms mean, or how many terms exist in the domain of HR analytics?
In this post, we take a look at a glossary of HR analytics.
#1. HR analytics
HR analytics refers to the application of data mining and business analytics techniques to talent data. It usually refers to analytics that measure the performance and efficiency metrics that matter only to HR.
#2. Predictive analytics
Predictive analytics is a branch of advanced analytics used to make predictions about unknown future events. In recruitment, it is about predicting which candidates are likely to suit a vacant position.
Predictive analytics draws on many techniques, including statistical modelling, data mining, artificial intelligence and machine learning, to analyse existing data and make predictions about future events. In recruitment, it allows organisations to become proactive, anticipating behaviours and outcomes based on actual data.
#3. Data mining
Data mining is almost like digging for gold. Just as gold diggers sift through piles of dust and sand in the hope of striking a piece of shiny gold, data mining is the method of discovering patterns in piles of raw data and turning them into concrete information, which can later be used to make predictions about staffing.
#4. Machine learning
Machine learning is a branch of artificial intelligence (AI) that gives computers the ability to learn without being explicitly programmed. It is mostly achieved through various pattern recognition processes. With the help of machine learning, a system can start recognising the raw data points in candidates’ information, work history and profiles.
#5. Descriptive analytics
Descriptive analytics mines historical performance data to look for the reasons behind past successes or failures.
#6. Cost modelling
Cost modelling helps those in the C-suite to understand and interpret several HR-related expenses. These include recruitment and onboarding costs, the estimated time required for an employee to reach maximum productivity, compensation, employee turnover, and overall productivity costs. Cost modelling can also offer an insightful picture of retention and recruitment plans, even for a stipulated period.
#7. Decision tree
A decision tree is a model that looks like a tree. It comprises decisions and their possible consequences, and it is a significant tool for making predictions. A decision tree allows you to predict what might happen by learning from existing data.
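To make the idea of "learning from existing data" a little more concrete, here is a minimal sketch of a decision tree in plain Python. It is an illustration only, not a production method: the data set (tenure in years, engagement score, and whether the employee quit or stayed) is entirely hypothetical, and the choice of Gini impurity as the splitting criterion is an assumption made for this example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0 means perfectly pure)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature_index, threshold) that most reduces impurity, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively split the data; a leaf is simply the majority label."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    lo = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    hi = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in lo], [y for _, y in lo], depth + 1, max_depth),
        "right": build_tree([r for r, _ in hi], [y for _, y in hi], depth + 1, max_depth),
    }

def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree

# Hypothetical records: (tenure in years, engagement score out of 10)
rows = [(1, 3), (2, 4), (1, 8), (5, 7), (6, 2), (7, 9), (3, 2), (8, 6)]
labels = ["quit", "quit", "stayed", "stayed", "stayed", "stayed", "quit", "stayed"]

tree = build_tree(rows, labels)
print(predict(tree, (2, 3)))  # quit   (short tenure, low engagement)
print(predict(tree, (6, 8)))  # stayed (long tenure)
```

On this toy data the tree first splits on tenure and then, for short-tenure employees, on engagement. Real HR data sets are far larger and noisier, so a tree learned from them would rarely be this clean.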
#8. R
Many HR practitioners use Excel. However, most predictive HR analysts use R, which is a popular tool among data scientists. R is a free, open-source environment for statistical computing and visualisation. It also enables you to work with massive data sets that would be too large to handle in Excel.
#9. Structured data vs. unstructured data
There are two types of data in the HR analytics domain: structured and unstructured. When data is neatly organised into a spreadsheet or database, it is called structured data.
On the other hand, data that is not organised in this way, such as free text, is referred to as unstructured data. Its lack of structure makes it time-consuming and tiring to use.
#10. Multivariate analysis
Multivariate analysis is essentially the statistical procedure of simultaneously analysing multiple independent (or predictor) variables with multiple dependent (outcome or criterion) variables using matrix algebra (most multivariate analyses are correlational).
In human resources, for example, you might want to predict how age and engagement level influence someone’s compensation and performance rating. Here there are two independent variables (age and engagement level) and two dependent variables (compensation and performance rating), which is what makes this a multivariate analysis. Take a look at the image below:
#11. Quantitative scissors
Quantitative scissors is a phrase used by data scientists to describe the moment when an employee begins to be profitable. Consider the example of two intersecting lines: one is a cost line and the other is a benefit line. Once the benefit line rises above the cost line, the employee becomes an asset to the organisation rather than an expense.
The term was first introduced by Pasha Roberts, chief scientist at Talent Analytics.
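The crossing point of the two lines can be found with a few lines of Python. The sketch below assumes that both lines are cumulative monthly totals, which is one possible reading of the idea, and every figure in it (the up-front hiring cost, the monthly salary, the ramp-up of the benefit) is hypothetical.

```python
def break_even_month(monthly_costs, monthly_benefits):
    """Return the first month (1-based) in which the cumulative benefit
    exceeds the cumulative cost, or None if that never happens."""
    total_cost = 0
    total_benefit = 0
    months = zip(monthly_costs, monthly_benefits)
    for month, (cost, benefit) in enumerate(months, start=1):
        total_cost += cost
        total_benefit += benefit
        if total_benefit > total_cost:
            return month
    return None

# Hypothetical figures: 6,000 hiring cost plus 4,000 salary in month one,
# then 4,000 salary per month; the benefit ramps up as the hire settles in.
costs = [10000] + [4000] * 11
benefits = [0, 2000, 4000, 6000] + [8000] * 8

print(break_even_month(costs, benefits))  # 7
```

With these made-up numbers the lines cross in month 7: the cumulative benefit (36,000) overtakes the cumulative cost (34,000) for the first time.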
#12. Boosting
When one creates an algorithm, one wants it to be as accurate as possible. Boosting is an iterative statistical technique, used in the process of developing an algorithm, that creates multiple extra training data sets. A model is created for each of these data sets. The data sets are created deliberately: the weight of the data points that the previous model misclassified is increased, so the next model fits those errors much better. This process repeats itself several times. Together, these models decide on the most likely outcome. They make this choice based on a weighted vote in which more accurate models have more voting power than less accurate models.
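The reweighting and weighted vote described above can be sketched with AdaBoost, one well-known boosting algorithm (the choice of AdaBoost is an assumption here, as is the toy one-dimensional data set). Each model is a "decision stump", a tree with a single split, and labels are encoded as +1 and -1.

```python
import math

def stump_predict(x, threshold, polarity):
    """A one-split model: predicts `polarity` at or below the threshold, the opposite above."""
    return polarity if x <= threshold else -polarity

def train_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(
                w for w, x, y in zip(weights, xs, ys)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best

def adaboost(xs, ys, rounds=3):
    """Train `rounds` stumps, raising the weight of misclassified points each time."""
    n = len(xs)
    weights = [1.0 / n] * n
    models = []
    for _ in range(rounds):
        error, threshold, polarity = train_stump(xs, ys, weights)
        error = max(error, 1e-10)  # guard against division by zero
        if error >= 0.5:           # no better than guessing: stop
            break
        alpha = 0.5 * math.log((1 - error) / error)  # voting power of this model
        models.append((alpha, threshold, polarity))
        # Misclassified points get heavier, correctly classified ones lighter.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for w, x, y in zip(weights, xs, ys)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return models

def predict(models, x):
    """Weighted vote: more accurate models carry a larger alpha."""
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in models)
    return 1 if score >= 0 else -1

# Toy data that no single stump can classify perfectly.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

models = adaboost(xs, ys, rounds=3)
print([predict(models, x) for x in xs] == ys)  # True
```

The best single stump misclassifies three of the ten points, yet three stumps voting together classify all of them correctly, because each round concentrates on the points the previous round got wrong.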
#13. Random forest
In contrast to boosting, the random forest technique randomises the algorithm instead of the data. Usually, a decision tree algorithm selects the best attribute on which to split its branches. In a random forest, however, this selection is randomised, which leads to the production of many different trees. Together, these random trees form a forest that usually produces a much better result than a single tree.
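The attribute randomisation described above can be illustrated with a deliberately simplified sketch. Each "tree" here is only a single split, chosen from a random subset of the attributes, and the forest predicts by majority vote. Full random forest implementations typically also resample the training rows, which is left out here; the data set and the function names are hypothetical.

```python
import random
from collections import Counter

def train_random_stump(rows, labels, rng, n_candidates):
    """Train a one-split tree using only a random subset of the attributes.
    Returns (feature, threshold, label_if_below_or_equal, label_if_above)."""
    features = rng.sample(range(len(rows[0])), n_candidates)
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = sum(y != left_label for y in left)
            errors += sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    if best is None:  # no usable split: always predict the majority label
        majority = Counter(labels).most_common(1)[0][0]
        return (features[0], float("inf"), majority, majority)
    return best[1:]

def build_forest(rows, labels, n_trees=9, n_candidates=2, seed=0):
    """Grow several stumps, each seeing a different random attribute subset."""
    rng = random.Random(seed)
    return [train_random_stump(rows, labels, rng, n_candidates) for _ in range(n_trees)]

def forest_predict(forest, row):
    """Majority vote over all trees in the forest."""
    votes = Counter(lo if row[f] <= t else hi for f, t, lo, hi in forest)
    return votes.most_common(1)[0][0]

# Hypothetical records: (tenure in years, engagement score, commute in km)
rows = [(1, 3, 10), (2, 4, 25), (3, 2, 5), (2, 1, 30),
        (5, 7, 12), (6, 8, 28), (7, 9, 8), (8, 6, 20)]
labels = ["quit", "quit", "quit", "quit", "stayed", "stayed", "stayed", "stayed"]

forest = build_forest(rows, labels)
print(forest_predict(forest, (2, 3, 15)))  # quit
```

Because every tree is offered a different pair of attributes, some trees end up splitting on tenure and others on engagement, so the forest does not depend on any single attribute.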
#14. Pruning
The technique of pruning is associated with the concept of a decision tree. Pruning is used to reduce the complexity of a decision tree. A decision tree is built by taking the most important attribute to split its branches, and this process continues until the tree is complete.
(to be continued in part II)