Decision Tree
Definition
A decision tree is a non-parametric supervised learning model that predicts the value of a target variable by learning simple decision rules inferred from the data features. Formally, it is a directed tree structure consisting of the following components (a minimal code sketch follows the list):
- Split Nodes: Represent a test on an attribute (e.g., “is $x_j \le \theta$?” for a numeric feature, or “is $x_j = v$?” for a categorical one).
- Edges: Represent the outcome of the test, leading to child nodes.
- Leaf Nodes: Store the final predicted value or class label.
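To make this concrete, here is a minimal sketch of such a node in Python; the field names (`feature`, `threshold`, `prediction`) are illustrative choices, not a standard interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Split node: which attribute to test and the threshold to compare against.
    feature: Optional[int] = None
    threshold: Optional[float] = None
    # Edges: outcomes of the test, each leading to a child node.
    left: Optional["Node"] = None   # taken when the test is satisfied
    right: Optional["Node"] = None  # taken otherwise
    # Leaf node: the stored class label or regression value.
    prediction: Optional[float] = None

    def is_leaf(self) -> bool:
        return self.prediction is not None


def predict(node: Node, x) -> float:
    """Route an example down the tree until a leaf is reached."""
    while not node.is_leaf():
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.prediction
```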
Representational Expressivity
Any Boolean function can be represented by a decision tree. This is demonstrated by converting the function into its Disjunctive Normal Form (DNF), where each product term corresponds to a path from the root to a leaf with label $1$ (all remaining leaves are labeled $0$). As a result, a sufficiently deep decision tree can achieve zero empirical risk on any consistent binary training set, which typically necessitates stopping criteria to prevent overfitting.
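As a quick illustration of this point (a sketch assuming scikit-learn's `DecisionTreeClassifier` as the implementation), a tree grown without any stopping criterion memorizes even random labels, driving training error to zero:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 30))  # random Boolean feature vectors
y = rng.integers(0, 2, size=200)        # random labels: no signal to learn

clf = DecisionTreeClassifier(random_state=0)  # no depth limit or stopping rule
clf.fit(X, y)
print(clf.score(X, y))  # 1.0: zero empirical risk, pure memorization
```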
Top-Down Induction
Trees are typically constructed recursively using a greedy approach (e.g., ID3, C4.5, CART). At each node, the process identifies the attribute that achieves the optimal value of a splitting criterion $C$, such as the Gini index or entropy.
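For reference, here is a minimal sketch of the two criteria just mentioned, computed from empirical class frequencies (the function names are illustrative):

```python
import numpy as np


def entropy(labels: np.ndarray) -> float:
    """Shannon entropy H(S) = -sum_k p_k * log2(p_k)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def gini(labels: np.ndarray) -> float:
    """Gini index G(S) = 1 - sum_k p_k^2."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(1.0 - (p ** 2).sum())


def information_gain(labels: np.ndarray, partitions: list) -> float:
    """Entropy reduction achieved by splitting `labels` into `partitions`."""
    n = len(labels)
    weighted = sum(len(part) / n * entropy(part) for part in partitions)
    return entropy(labels) - weighted
```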
Construction Algorithm
Given a dataset $S$ and a splitting criterion $C$ (a recursive sketch in code follows the steps):
Step 1 (Initialization): Assign all examples $(x, y) \in S$ to the root node.
Step 2 (Attribute Evaluation): For each attribute $a$ in the attribute set $A$, partition the examples at the current node by the attribute’s value and compute the criterion $C$ for this partitioning.
Step 3 (Splitting): Identify the attribute $a^*$ that achieves the optimal value of $C$ and set it as the splitting feature. If the node is already pure (e.g., $C = 0$), designate it as a leaf and return.
Step 4 (Recursion): Create child nodes for each partition. For each child node, if it is pure, designate it as a leaf; otherwise, repeat from Step 2 using the subset of data in the child node.
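Putting the four steps together, here is a compact recursive sketch of the procedure for categorical attributes, reusing the `information_gain` helper from the previous sketch; the dictionary-based tree representation is an illustrative choice, not part of any standard.

```python
import numpy as np
from collections import Counter


def build_tree(X: np.ndarray, y: np.ndarray, attributes: list) -> dict:
    """ID3-style top-down induction over categorical attributes.

    Leaves are {'label': c}; split nodes are
    {'attribute': a, 'children': {value: subtree}}.
    """
    # Steps 1 and 3: a pure node (or one with no attributes left to test)
    # becomes a leaf storing the majority label.
    if len(set(y.tolist())) == 1 or not attributes:
        return {"label": Counter(y.tolist()).most_common(1)[0][0]}

    # Step 2: evaluate the criterion (information gain) for every attribute.
    gains = {
        a: information_gain(y, [y[X[:, a] == v] for v in np.unique(X[:, a])])
        for a in attributes
    }
    best = max(gains, key=gains.get)

    # Step 4: create one child per observed attribute value and recurse
    # on the corresponding subset of the data.
    remaining = [a for a in attributes if a != best]
    children = {}
    for v in np.unique(X[:, best]):
        mask = X[:, best] == v
        children[v] = build_tree(X[mask], y[mask], remaining)
    return {"attribute": best, "children": children}


# Example usage: tree = build_tree(X, y, attributes=list(range(X.shape[1])))
```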