Implementation of Simple Linear Regression in Python

 


In this tutorial, we will implement simple linear regression in Python, step by step, using a salary dataset.

Importing the Necessary libraries

To begin the implementation, we first import the necessary libraries: NumPy for numerical computation, Matplotlib for visualization, and pandas for reading the dataset.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

Importing the dataset

Next, we read the dataset. The salary dataset used in this implementation is stored in the file Salary_Data.csv and has two columns, YearsExperience and Salary. After reading the dataset, separate it into the feature and the target: store the feature (YearsExperience) in X and the target (Salary) in y.

dataset = pd.read_csv('Salary_Data.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values

Next, display the first five rows of the salary dataset using the head() method of the pandas DataFrame.

dataset.head()
   YearsExperience   Salary
0              1.1  39343.0
1              1.3  46205.0
2              1.5  37731.0
3              2.0  43525.0
4              2.2  39891.0
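The slicing used for X and y can be checked on a small sample. The sketch below is only an illustration: it assumes an in-memory CSV built from the five rows shown above in place of the real Salary_Data.csv file, and it confirms that X comes out as a 2-D array while y is 1-D.

```python
import io

import pandas as pd

# In-memory stand-in for Salary_Data.csv (assumption: only the five
# rows displayed by dataset.head() are used here).
csv_text = """YearsExperience,Salary
1.1,39343.0
1.3,46205.0
1.5,37731.0
2.0,43525.0
2.2,39891.0
"""
dataset = pd.read_csv(io.StringIO(csv_text))

X = dataset.iloc[:, :-1].values  # every column except the last -> 2-D array
y = dataset.iloc[:, -1].values   # last column only -> 1-D array

print(X.shape)  # (5, 1)
print(y.shape)  # (5,)
```

scikit-learn estimators expect the feature matrix to be 2-D, which is why X is sliced with :-1 rather than selected as a single column.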

Splitting the dataset into the Training set and Test set

Next, divide the dataset into two parts, a training set and a test set, using the train_test_split function from sklearn. The test_size parameter is set to 1/3, so one third of the rows are held out for testing, and random_state is set to 0, so the split is the same on every run. You can change these parameters as per your requirements.

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 1/3, random_state = 0)
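To see the effect of test_size and random_state, here is a minimal sketch on hypothetical stand-in data. The 30-row size is an assumption chosen for illustration, not read from the salary file.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 30 rows, one feature (not the salary dataset).
X = np.arange(30, dtype=float).reshape(-1, 1)
y = 2.0 * X.ravel() + 1.0

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1/3, random_state=0
)
print(X_train.shape, X_test.shape)  # (20, 1) (10, 1)

# Repeating the call with the same random_state gives the same split.
X_train2, X_test2, _, _ = train_test_split(X, y, test_size=1/3, random_state=0)
print(np.array_equal(X_test, X_test2))  # True
```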

Training the Simple Linear Regression model on the Training set

A linear regression algorithm is used to create the model. The LinearRegression class is imported from the sklearn.linear_model module, and the model is trained by calling fit on the training data.

from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)

The fit call returns the fitted regressor, which is displayed as follows (the exact parameter list depends on the scikit-learn version):

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)
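The fitted model stores the slope in coef_ and the intercept in intercept_. The sketch below is an illustration on hypothetical noise-free data (the line y = 3x + 5, not the salary dataset); it checks that predict returns the same value as computing intercept + slope * x by hand.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical noise-free data lying exactly on y = 3x + 5.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = 3.0 * X.ravel() + 5.0

regressor = LinearRegression()
regressor.fit(X, y)

slope = regressor.coef_[0]        # one coefficient per feature
intercept = regressor.intercept_

x_new = np.array([[10.0]])        # predict expects a 2-D array
manual = intercept + slope * x_new[0, 0]
predicted = regressor.predict(x_new)[0]

print(np.isclose(manual, predicted))  # True
```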

Predicting the Test set results

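Use the trained model to predict salaries for the test set, then place the actual and predicted values side by side in a DataFrame for comparison.

y_pred = regressor.predict(X_test)
pd.DataFrame(data={'Actuals': y_test, 'Predictions': y_pred})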
    Actuals    Predictions
0   37731.0   40835.105909
1  122391.0  123079.399408
2   57081.0   65134.556261
3   63218.0   63265.367772
4  116969.0  115602.645454
5  109431.0  108125.891499
6  112635.0  116537.239698
7   55794.0   64199.962017
8   83088.0   76349.687193
9  101302.0  100649.137545
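The table can also be condensed into summary numbers. The tutorial does not compute any metrics, so the following is an optional sketch: it copies the ten actual and predicted values from the table above and passes them to mean_absolute_error and r2_score from sklearn.metrics.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, r2_score

# Values copied from the Actuals / Predictions table above.
y_test = np.array([37731.0, 122391.0, 57081.0, 63218.0, 116969.0,
                   109431.0, 112635.0, 55794.0, 83088.0, 101302.0])
y_pred = np.array([40835.105909, 123079.399408, 65134.556261, 63265.367772,
                   115602.645454, 108125.891499, 116537.239698, 64199.962017,
                   76349.687193, 100649.137545])

mae = mean_absolute_error(y_test, y_pred)  # average absolute error, in salary units
r2 = r2_score(y_test, y_pred)              # share of variance explained by the model

print(f"MAE: {mae:.2f}")
print(f"R^2: {r2:.4f}")
```

An R^2 close to 1 indicates that the fitted line explains most of the variation in the test salaries.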

Visualising the Training set results

Here a scatter plot shows the training points in red, and the fitted regression line is drawn over them in blue. The title of the plot is set to Salary vs Experience (Training set), xlabel is set to Years of Experience, and ylabel is set to Salary.

plt.scatter(X_train, y_train, color = 'red')
plt.plot(X_train, regressor.predict(X_train), color = 'blue')
plt.title('Salary vs Experience (Training set)')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.show()
[Figure: Salary vs Experience (Training set), with the fitted regression line]

Visualising the Test set results

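The test points are plotted in red against the same regression line. The line is still drawn from X_train because the model, and therefore the line, does not change between the two plots. The title of the plot is set to Salary vs Experience (Test set), xlabel is set to Years of Experience, and ylabel is set to Salary.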

plt.scatter(X_test, y_test, color = 'red')
plt.plot(X_train, regressor.predict(X_train), color = 'blue')
plt.title('Salary vs Experience (Test set)')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.show()
[Figure: Salary vs Experience (Test set), with the fitted regression line]
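As a variation on the two plots, both sets can be drawn side by side against one regression line. This is only a sketch under stated assumptions: the data is synthetic (generated with NumPy, not the salary dataset), the Agg backend is selected so the script runs without a display, and the output file name is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no display needed
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical synthetic data standing in for the salary dataset.
rng = np.random.default_rng(0)
X = rng.uniform(1, 10, size=(30, 1))
y = 9000 * X.ravel() + 26000 + rng.normal(0, 5000, size=30)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1/3, random_state=0
)
regressor = LinearRegression().fit(X_train, y_train)

# Evaluate the fitted line on an evenly spaced grid of x values.
x_line = np.linspace(X.min(), X.max(), 100).reshape(-1, 1)
y_line = regressor.predict(x_line)

fig, axes = plt.subplots(1, 2, figsize=(10, 4), sharey=True)
panels = [(X_train, y_train, "Training set"), (X_test, y_test, "Test set")]
for ax, (X_part, y_part, name) in zip(axes, panels):
    ax.scatter(X_part, y_part, color="red")   # observed points
    ax.plot(x_line, y_line, color="blue")     # same fitted line in both panels
    ax.set_title(f"Salary vs Experience ({name})")
    ax.set_xlabel("Years of Experience")
axes[0].set_ylabel("Salary")

fig.savefig("salary_vs_experience.png")  # hypothetical output file name
```

Sharing the y-axis makes it easier to judge how well the line fitted on the training points carries over to the test points.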

Summary:

In this tutorial, we implemented simple linear regression in Python: we read the salary dataset, split it into training and test sets, trained a LinearRegression model, predicted the test set results, and visualised the fitted line against both sets.
