Machine learning beginners often experience a moment of excitement and confusion when they first train a model and see a large error value printed on the screen.
For example:
```python
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_absolute_error

# X (input features) and y (target house prices) are assumed to be
# loaded beforehand, e.g. from the Melbourne housing dataset.
train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=0)

melbourne_model = DecisionTreeRegressor()
melbourne_model.fit(train_X, train_y)

val_predictions = melbourne_model.predict(val_X)
print(mean_absolute_error(val_y, val_predictions))
```
Output:
265806.91478373145
At first glance, this number may look shocking.
But what exactly does it mean?
Let us understand the entire machine learning workflow behind this code step by step.
What Is Happening in This Code?
This code performs a complete beginner-level machine learning pipeline:
Data
→ Split Data
→ Train Model
→ Make Predictions
→ Measure Error
The model used here is a Decision Tree regressor from the scikit-learn library.
Step 1: Splitting the Dataset
train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=0)
In machine learning:
- X contains the input features
- y contains the target values
For a house-price dataset:
| Features (X) | Target (y) |
|---|---|
| Rooms, Area, Location | House Price |
The train_test_split() function randomly divides the dataset into two parts.
| Dataset Portion | Purpose |
|---|---|
| Training Data | Used to teach the model |
| Validation Data | Used to test the model |
Think of it like studying for an exam:
- Training set → study materials
- Validation set → final exam
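By default, train_test_split() holds out 25% of the rows for validation; the test_size parameter controls this. A minimal sketch on a toy list (not the housing data):

```python
from sklearn.model_selection import train_test_split

rows = list(range(100))  # stand-in for 100 samples
train, val = train_test_split(rows, random_state=0)
print(len(train), len(val))  # 75 25 (default: 25% held out)
```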
Why Do We Split the Data?
Suppose a student memorizes answers instead of understanding concepts.
They may perform well on practice questions but fail on new questions.
Machine learning models behave similarly.
If we test the model on the same data used for training, it may simply memorize the dataset instead of learning useful patterns.
The validation set helps us answer the real question:
Can the model predict unseen data accurately?
This idea is fundamental in Machine Learning.
Understanding random_state=0
random_state=0
Data splitting involves randomness.
Without a fixed random state:
- the split changes every time
- results become inconsistent
Setting random_state=0 ensures:
- the same training/validation split every run
- reproducible experiments
This is very important in real-world data science workflows.
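A quick way to convince yourself: run the split twice with the same random_state and compare the results. A minimal sketch using a toy list rather than the housing data:

```python
from sklearn.model_selection import train_test_split

data = list(range(10))
split_a = train_test_split(data, random_state=0)
split_b = train_test_split(data, random_state=0)
print(split_a == split_b)  # True: identical split on every run
```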
Step 2: Creating the Decision Tree Model
melbourne_model = DecisionTreeRegressor()
This creates a Decision Tree regression model.
A decision tree works like a series of questions.
Example:
- Is number_of_rooms > 3?
  - YES → Is area > 2000 sq ft?
    - YES → expensive house
    - NO → medium-priced house
  - NO → cheaper house
The algorithm automatically discovers such rules from the data.
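If you want to see the questions a fitted tree actually asks, scikit-learn can print them with export_text(). The snippet below is a small sketch on made-up data; the feature names and values are illustrative, not from the real dataset:

```python
from sklearn.tree import DecisionTreeRegressor, export_text

# Toy data: [rooms, area] → price (illustrative values only)
X_toy = [[2, 900], [3, 1500], [4, 2500], [5, 3000]]
y_toy = [300_000, 450_000, 850_000, 1_000_000]

model = DecisionTreeRegressor(max_depth=2)
model.fit(X_toy, y_toy)
print(export_text(model, feature_names=["rooms", "area"]))
```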
Step 3: Training the Model
melbourne_model.fit(train_X, train_y)
This is where learning happens.
The model studies relationships between:
Input Features → Output Values
Example:
| Rooms | Area | Price |
|---|---|---|
| 2 | 900 | 300000 |
| 4 | 2500 | 850000 |
The model gradually learns patterns such as:
- larger houses tend to cost more
- more rooms often increase price
- location impacts valuation
Training is essentially the model discovering mathematical patterns hidden inside data.
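As a toy illustration, fitting on just the two rows from the table above already gives the model a feature-to-price mapping (illustrative values, not a real training run):

```python
from sklearn.tree import DecisionTreeRegressor

# The two rows from the table above: [rooms, area] → price
X_train = [[2, 900], [4, 2500]]
y_train = [300_000, 850_000]

model = DecisionTreeRegressor()
model.fit(X_train, y_train)        # the tree stores the learned splits
print(model.predict([[3, 1800]]))  # returns one of the learned prices
```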
Step 4: Making Predictions
val_predictions = melbourne_model.predict(val_X)
Now the trained model predicts prices for houses it has never seen before.
Example:
| Actual Price | Predicted Price |
|---|---|
| 500000 | 470000 |
| 900000 | 850000 |
The closer the predictions are to actual values, the better the model.
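A convenient way to eyeball this is to line up actual and predicted values side by side. A sketch, assuming val_X and val_y are pandas objects and melbourne_model is the fitted model from the earlier code:

```python
import pandas as pd

# First five validation houses: actual price vs model prediction
comparison = pd.DataFrame({
    "actual": val_y.head(),
    "predicted": melbourne_model.predict(val_X.head()),
})
print(comparison)
```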
Step 5: Measuring Prediction Error
mean_absolute_error(val_y, val_predictions)
This calculates:
Mean Absolute Error (MAE):

$$\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert$$

Where:
- $y_i$ = actual value
- $\hat{y}_i$ = predicted value
- $n$ = number of predictions
Plain English Meaning of MAE
The calculation process is:
- Find each prediction error (actual − predicted)
- Ignore the sign (take the absolute value)
- Add up all the absolute errors
- Divide by the number of predictions
Example:
| Actual | Predicted | Absolute Error |
|---|---|---|
| 500000 | 470000 | 30000 |
| 800000 | 850000 | 50000 |
Average these errors: (30,000 + 50,000) / 2 = 40,000 → MAE.
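Reproducing the table in code makes the definition concrete (a toy calculation, not the real dataset):

```python
from sklearn.metrics import mean_absolute_error

actual = [500_000, 800_000]
predicted = [470_000, 850_000]

# |500000 - 470000| = 30000, |800000 - 850000| = 50000, mean = 40000
print(mean_absolute_error(actual, predicted))  # 40000.0
```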
Interpreting Your Result
Output:
265806.91478373145
This means:
On average, the model’s predictions differ from actual house prices by about $265,807.
Whether this is good or bad depends entirely on the dataset.
For example:
| Average House Price | MAE Quality |
|---|---|
| $300,000 | Very poor |
| $5 million | Reasonable |
Machine learning metrics always need business or real-world context.
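One quick way to add that context is to express the MAE as a fraction of the typical price. A sketch, assuming val_y is a pandas Series of actual validation prices from the earlier code:

```python
from sklearn.metrics import mean_absolute_error

mae = mean_absolute_error(val_y, val_predictions)
print(f"MAE is {mae / val_y.mean():.0%} of the average house price")
```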
The Hidden Problem: Overfitting
A Decision Tree can become too complex.
Instead of learning general patterns, it memorizes training data.
This phenomenon is called:
Overfitting
An overfitted model:
- performs extremely well on training data
- performs poorly on unseen validation data
This is one of the most important concepts in machine learning.
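A simple diagnostic is to compare the error on training data with the error on validation data; a large gap between the two is the classic overfitting signature. A sketch reusing the variables from the main example:

```python
from sklearn.metrics import mean_absolute_error

# Error on data the model has already seen vs data it has not
train_mae = mean_absolute_error(train_y, melbourne_model.predict(train_X))
val_mae = mean_absolute_error(val_y, melbourne_model.predict(val_X))
print(f"train MAE: {train_mae:,.0f}  validation MAE: {val_mae:,.0f}")
# An unrestricted tree often scores near 0 on training data
# while the validation MAE stays large
```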
How Developers Reduce Overfitting
One common solution is limiting tree complexity.
Example:
DecisionTreeRegressor(max_leaf_nodes=100)
This prevents the tree from becoming excessively detailed.
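In practice, developers often try several values of max_leaf_nodes and keep the one with the lowest validation MAE. A sketch reusing train_X, val_X, train_y, val_y from earlier; the candidate values are arbitrary:

```python
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_absolute_error

for leaf_nodes in [5, 50, 500, 5000]:
    model = DecisionTreeRegressor(max_leaf_nodes=leaf_nodes, random_state=0)
    model.fit(train_X, train_y)
    mae = mean_absolute_error(val_y, model.predict(val_X))
    print(f"max_leaf_nodes={leaf_nodes}: validation MAE = {mae:,.0f}")
```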
Other advanced methods include:
- Random Forest
- Gradient Boosting
These techniques usually produce more accurate and stable predictions.
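A Random Forest, for example, averages many decision trees and often lowers the validation error with a one-line model swap. A sketch with the same variables as before:

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

forest = RandomForestRegressor(random_state=0)
forest.fit(train_X, train_y)
print(mean_absolute_error(val_y, forest.predict(val_X)))
```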
Why This Workflow Matters
This simple example represents the foundation of modern AI systems.
The same workflow powers:
- recommendation engines
- fraud detection systems
- medical diagnosis tools
- stock forecasting
- ad targeting
- demand prediction
- pricing systems
Almost every practical AI system follows this structure:
Collect Data
→ Train Model
→ Predict Outcomes
→ Evaluate Accuracy
→ Improve Model
Final Thoughts
Although the code is short, it introduces several foundational concepts in Data Science and Machine Learning:
- dataset splitting
- model training
- prediction generation
- error evaluation
- overfitting
- model generalization
Understanding these ideas deeply is far more important than simply running the code.
Once these fundamentals become clear, advanced topics like neural networks, deep learning, and large language models become much easier to understand.