
Decision Tree using CART algorithm Solved Example 2 – Loan Approval Data Set

In this tutorial, we will see how to apply the Classification And Regression Trees (CART) decision tree algorithm (Solved Example 2) to construct the optimal decision tree for the given Loan Approval data set, and then use that tree to predict the class label for a new example.

Age    | Job   | House | Credit    | Loan Approved
Young  | False | No    | Fair      | No
Young  | False | No    | Good      | No
Young  | True  | No    | Good      | Yes
Young  | True  | Yes   | Fair      | Yes
Young  | False | No    | Fair      | No
Middle | False | No    | Fair      | No
Middle | False | No    | Good      | No
Middle | True  | Yes   | Good      | Yes
Middle | False | Yes   | Excellent | Yes
Middle | False | Yes   | Excellent | Yes
Old    | False | Yes   | Excellent | Yes
Old    | False | Yes   | Good      | Yes
Old    | True  | No    | Good      | Yes
Old    | True  | No    | Excellent | Yes
Old    | False | No    | Fair      | No

New example to classify:

Age    | Job   | House | Credit | Loan Approved
Young  | False | No    | Good   | ?

Solution:

First, we need to determine the root node of the tree.

Start with any variable, in this case, Age. It can take three values: Young, Middle, and Old.

Start with the Young value of Age. There are five instances where Age is Young.

In two of the five instances, the loan approval decision was yes, and in the other three, the loan approval decision was no.

Thus, if the decision rule were Age: Young → No, then three of the five loan approval decisions would be correct and two would be incorrect, giving two errors out of five. This is recorded in the first row of the table below.

Similarly, we will write all rules for the Age attribute.

Age Attribute

Age value | Count | Yes | No
Young     | 5     | 2   | 3
Middle    | 5     | 3   | 2
Old       | 5     | 4   | 1

Rules, individual error, and total for Age attribute

Attribute | Rule         | Error | Total Error
Age       | Young → No   | 2/5   | 5/15
          | Middle → Yes | 2/5   |
          | Old → Yes    | 1/5   |
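To make the counting concrete, here is a minimal Python sketch (the variable names are my own, not from the tutorial) that tallies the class labels for each Age value and scores the majority-vote rule for each value:

```python
from collections import Counter

# Loan Approval data set from the tutorial:
# (Age, Job, House, Credit, Loan Approved)
data = [
    ("Young",  False, "No",  "Fair",      "No"),
    ("Young",  False, "No",  "Good",      "No"),
    ("Young",  True,  "No",  "Good",      "Yes"),
    ("Young",  True,  "Yes", "Fair",      "Yes"),
    ("Young",  False, "No",  "Fair",      "No"),
    ("Middle", False, "No",  "Fair",      "No"),
    ("Middle", False, "No",  "Good",      "No"),
    ("Middle", True,  "Yes", "Good",      "Yes"),
    ("Middle", False, "Yes", "Excellent", "Yes"),
    ("Middle", False, "Yes", "Excellent", "Yes"),
    ("Old",    False, "Yes", "Excellent", "Yes"),
    ("Old",    False, "Yes", "Good",      "Yes"),
    ("Old",    True,  "No",  "Good",      "Yes"),
    ("Old",    True,  "No",  "Excellent", "Yes"),
    ("Old",    False, "No",  "Fair",      "No"),
]

# Tally the class labels for each value of Age (column 0).
counts = {}
for row in data:
    counts.setdefault(row[0], Counter())[row[4]] += 1

# Each value predicts its majority label; the error for that value
# is the number of minority (misclassified) instances.
age_errors = {}
total = 0
for value, c in counts.items():
    majority = c.most_common(1)[0][0]
    errors = sum(c.values()) - c[majority]
    age_errors[value] = (majority, errors, sum(c.values()))
    total += errors

print(age_errors)  # {'Young': ('No', 2, 5), 'Middle': ('Yes', 2, 5), 'Old': ('Yes', 1, 5)}
print(f"Total error for Age: {total}/{len(data)}")  # 5/15
```

The per-value errors (2/5, 2/5, 1/5) and the total (5/15) match the table above.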

Job Attribute

Job value | Count | Yes | No
False     | 10    | 4   | 6
True      | 5     | 5   | 0

Rules, individual error, and total for Job attribute

Attribute | Rule       | Error | Total Error
Job       | False → No | 4/10  | 4/15
          | True → Yes | 0/5   |

House Attribute

House value | Count | Yes | No
No          | 9     | 3   | 6
Yes         | 6     | 6   | 0

Rules, individual error, and total for House attribute

Attribute | Rule      | Error | Total Error
House     | No → No   | 3/9   | 3/15
          | Yes → Yes | 0/6   |

Credit Attribute

Credit value | Count | Yes | No
Fair         | 5     | 1   | 4
Good         | 6     | 4   | 2
Excellent    | 4     | 4   | 0

Rules, individual error, and total for Credit attribute

Attribute | Rule            | Error | Total Error
Credit    | Fair → No       | 1/5   | 3/15
          | Good → Yes      | 2/6   |
          | Excellent → Yes | 0/4   |

Consolidated rules, errors for the individual attribute values, and the total error of each attribute are given below.

Attribute | Rule            | Error | Total Error
Age       | Young → No      | 2/5   | 5/15
          | Middle → Yes    | 2/5   |
          | Old → Yes       | 1/5   |
Job       | False → No      | 4/10  | 4/15
          | True → Yes      | 0/5   |
House     | No → No         | 3/9   | 3/15
          | Yes → Yes       | 0/6   |
Credit    | Fair → No       | 1/5   | 3/15
          | Good → Yes      | 2/6   |
          | Excellent → Yes | 0/4   |
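The consolidated totals can be reproduced for all four attributes at once. A short Python sketch (the helper name `total_error` is my own, not part of CART terminology):

```python
from collections import Counter

# Loan Approval data set: (Age, Job, House, Credit, Loan Approved)
data = [
    ("Young",  False, "No",  "Fair",      "No"),
    ("Young",  False, "No",  "Good",      "No"),
    ("Young",  True,  "No",  "Good",      "Yes"),
    ("Young",  True,  "Yes", "Fair",      "Yes"),
    ("Young",  False, "No",  "Fair",      "No"),
    ("Middle", False, "No",  "Fair",      "No"),
    ("Middle", False, "No",  "Good",      "No"),
    ("Middle", True,  "Yes", "Good",      "Yes"),
    ("Middle", False, "Yes", "Excellent", "Yes"),
    ("Middle", False, "Yes", "Excellent", "Yes"),
    ("Old",    False, "Yes", "Excellent", "Yes"),
    ("Old",    False, "Yes", "Good",      "Yes"),
    ("Old",    True,  "No",  "Good",      "Yes"),
    ("Old",    True,  "No",  "Excellent", "Yes"),
    ("Old",    False, "No",  "Fair",      "No"),
]

def total_error(rows, col):
    """Total misclassifications when each value of the attribute in
    column `col` predicts its majority class label (column 4)."""
    counts = {}
    for row in rows:
        counts.setdefault(row[col], Counter())[row[4]] += 1
    return sum(sum(c.values()) - c.most_common(1)[0][1] for c in counts.values())

errors = {name: total_error(data, col)
          for name, col in {"Age": 0, "Job": 1, "House": 2, "Credit": 3}.items()}
print(errors)  # {'Age': 5, 'Job': 4, 'House': 3, 'Credit': 3}
```

House and Credit tie at 3 errors; the tutorial breaks this tie by inspecting the individual rules, as discussed next.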

From the table above, we can see that the attributes House and Credit are tied for the minimum total error of 3/15 (3 errors out of 15 examples), so we break the tie using the errors of the individual attribute values. Both attributes have one rule that produces zero errors: Yes → Yes for House and Excellent → Yes for Credit. The tie remains, so we compare the remaining rules: House has only one rule that produces errors (No → No, 3/9), while Credit has two (Fair → No, 1/5 and Good → Yes, 2/6). Hence we choose House as the splitting attribute.

Now we build the tree with House as the root node, with one branch for each possible value of the House attribute. Since the rule Yes → Yes produces zero errors, the branch for House = Yes ends in the leaf Yes. For the remaining value, No, we take the corresponding subset of the data and continue building the tree. The tree with House as the root node is:

[Figure: Tree with House as the root node]

Now, for the right subtree, we consider only the nine examples where House = No, write all possible rules for the remaining attributes, and find their total errors. Based on the resulting table, we will construct the rest of the tree.

Consolidated rules, errors for the individual attribute values, and the total error of each attribute for the House = No subset are given below.

Attribute | Rule            | Error | Total Error
Age       | Young → No      | 1/4   | 2/9
          | Middle → No     | 0/2   |
          | Old → Yes       | 1/3   |
Job       | False → No      | 0/6   | 0/9
          | True → Yes      | 0/3   |
Credit    | Fair → No       | 0/4   | 2/9
          | Good → Yes/No   | 2/4   |
          | Excellent → Yes | 0/1   |
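These subtree numbers can be checked by filtering the data down to the nine House = No rows and rerunning the same majority-vote error count; a sketch (assuming the same tuple layout as the data set above):

```python
from collections import Counter

# Loan Approval data set: (Age, Job, House, Credit, Loan Approved)
data = [
    ("Young",  False, "No",  "Fair",      "No"),
    ("Young",  False, "No",  "Good",      "No"),
    ("Young",  True,  "No",  "Good",      "Yes"),
    ("Young",  True,  "Yes", "Fair",      "Yes"),
    ("Young",  False, "No",  "Fair",      "No"),
    ("Middle", False, "No",  "Fair",      "No"),
    ("Middle", False, "No",  "Good",      "No"),
    ("Middle", True,  "Yes", "Good",      "Yes"),
    ("Middle", False, "Yes", "Excellent", "Yes"),
    ("Middle", False, "Yes", "Excellent", "Yes"),
    ("Old",    False, "Yes", "Excellent", "Yes"),
    ("Old",    False, "Yes", "Good",      "Yes"),
    ("Old",    True,  "No",  "Good",      "Yes"),
    ("Old",    True,  "No",  "Excellent", "Yes"),
    ("Old",    False, "No",  "Fair",      "No"),
]

def total_error(rows, col):
    """Total misclassifications under per-value majority-vote rules."""
    counts = {}
    for row in rows:
        counts.setdefault(row[col], Counter())[row[4]] += 1
    return sum(sum(c.values()) - c.most_common(1)[0][1] for c in counts.values())

# Right subtree: keep only the rows where House == "No".
subset = [row for row in data if row[2] == "No"]

errors = {name: total_error(subset, col)
          for name, col in {"Age": 0, "Job": 1, "Credit": 3}.items()}
print(len(subset), errors)  # 9 {'Age': 2, 'Job': 0, 'Credit': 2}
```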

From the table above, Job has the lowest total error (0/9), so Job is chosen as the splitting attribute. When Job is False the answer is No, and when Job is True the answer is Yes; both rules produce zero errors, so both branches end in leaves.

The final decision tree for the given Loan Approval data set is,

[Figure: Final decision tree for the given Loan Approval data set]

Also, from the decision tree above, the prediction for the new example is:

Age   | Job   | House | Credit | Loan Approved
Young | False | No    | Good   | No
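The finished tree is small enough to write out as a plain function; a minimal sketch (the function name `predict` is my own):

```python
# The final decision tree from the tutorial as a function:
# root splits on House, then the House = No branch splits on Job.
def predict(age, job, house, credit):
    if house == "Yes":              # House: Yes -> Yes (zero errors)
        return "Yes"
    return "Yes" if job else "No"   # House = No: split on Job

# Loan Approval data set: (Age, Job, House, Credit, Loan Approved)
data = [
    ("Young",  False, "No",  "Fair",      "No"),
    ("Young",  False, "No",  "Good",      "No"),
    ("Young",  True,  "No",  "Good",      "Yes"),
    ("Young",  True,  "Yes", "Fair",      "Yes"),
    ("Young",  False, "No",  "Fair",      "No"),
    ("Middle", False, "No",  "Fair",      "No"),
    ("Middle", False, "No",  "Good",      "No"),
    ("Middle", True,  "Yes", "Good",      "Yes"),
    ("Middle", False, "Yes", "Excellent", "Yes"),
    ("Middle", False, "Yes", "Excellent", "Yes"),
    ("Old",    False, "Yes", "Excellent", "Yes"),
    ("Old",    False, "Yes", "Good",      "Yes"),
    ("Old",    True,  "No",  "Good",      "Yes"),
    ("Old",    True,  "No",  "Excellent", "Yes"),
    ("Old",    False, "No",  "Fair",      "No"),
]

# The tree reproduces every training label...
assert all(predict(*row[:4]) == row[4] for row in data)

# ...and classifies the new example (Young, False, No, Good) as No.
print(predict("Young", False, "No", "Good"))  # No
```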

Summary:

In this tutorial, we saw how to apply the Classification And Regression Trees (CART) decision tree algorithm (Solved Example 2) to construct the optimal decision tree for the Loan Approval data set and used it to predict the class label of a new example.
