Thursday, September 7, 2017

ML Udemy course: Python-Data-Science-and-Machine-Learning-Bootcamp\Machine Learning Sections\Logistic-Regression

Python-Data-Science-and-Machine-Learning-Bootcamp\Machine Learning Sections\Logistic-Regression Project


Logistic Regression for classification
Steps
>import the data and clean up its format, e.g. handle missing data
>plot some correlation graphs

>import the logistic regression tool
>split the data into training and testing sets
>fit the logistic model and predict the values
>compute prediction metrics and check the model
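The first two steps (checking for missing data and looking at correlations) are not shown in the code below, so here is a minimal sketch of what they might look like. The toy DataFrame and its column names are made up for illustration and are not the advertising data; filling with the median is just one simple option.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for a real CSV; column names are made up.
toy = pd.DataFrame({
    "age": [25.0, 32.0, np.nan, 41.0, 38.0],
    "income": [40000.0, 52000.0, 61000.0, np.nan, 58000.0],
    "clicked": [0, 1, 0, 1, 1],
})

# Count the missing values in each column.
missing_counts = toy.isnull().sum()
print(missing_counts)

# One simple option: fill numeric gaps with the column median.
filled = toy.fillna(toy.median(numeric_only=True))

# Pairwise Pearson correlations, i.e. the numbers behind a correlation plot.
corr = filled.corr()
print(corr.round(2))
```

On real data it is worth checking how many values are missing before deciding whether to fill them or drop the rows.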

Code
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline

ad_data = pd.read_csv('advertising.csv')

ad_data.head()

sns.set_style('whitegrid')
ad_data['Age'].hist(bins=30)

plt.xlabel('Age')

sns.jointplot(x='Age',y='Area Income',data=ad_data)

sns.jointplot(x='Age',y='Daily Time Spent on Site',data=ad_data,color='red',kind='kde');


sns.jointplot(x='Daily Time Spent on Site',y='Daily Internet Usage',data=ad_data,color='green')


sns.pairplot(ad_data,hue='Clicked on Ad',palette='bwr')



Split the data into training set and testing set using train_test_split

from sklearn.model_selection import train_test_split

X = ad_data[['Daily Time Spent on Site', 'Age', 'Area Income','Daily Internet Usage', 'Male']]

y = ad_data['Clicked on Ad']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
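As a side note on the split above, here is a small sketch of what random_state and the optional stratify argument do. It uses randomly generated data (X_demo and y_demo are hypothetical names, not part of the course notebook) so that it runs without the CSV.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 200 rows, 3 features, a balanced binary target.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
y_demo = np.array([0, 1] * 100)

# The same random_state gives the same split on every run.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.33, random_state=42)
X_tr2, X_te2, _, _ = train_test_split(
    X_demo, y_demo, test_size=0.33, random_state=42)

print(X_tr.shape, X_te.shape)
print(np.array_equal(X_te, X_te2))

# stratify keeps the class ratio roughly the same in both parts.
_, _, y_tr_s, y_te_s = train_test_split(
    X_demo, y_demo, test_size=0.33, random_state=42, stratify=y_demo)
print(y_tr_s.mean(), y_te_s.mean())
```

Stratifying is optional here, but it can help when one class is much rarer than the other.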

from sklearn.linear_model import LogisticRegression
logmodel = LogisticRegression()

logmodel.fit(X_train,y_train)

out:
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
          penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
          verbose=0, warm_start=False)
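To make it clearer what the fitted model does with its coefficients, the sketch below rebuilds the predicted probabilities by hand: a linear score passed through the sigmoid function. It assumes a binary target and the default LogisticRegression settings, and it uses generated data (X_demo, y_demo and the labelling rule are made up) instead of the advertising data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: 2 features and a noisy linear labelling rule.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 2))
noise = rng.normal(scale=0.5, size=200)
y_demo = (X_demo[:, 0] - X_demo[:, 1] + noise > 0).astype(int)

model = LogisticRegression()
model.fit(X_demo, y_demo)

# Linear score z = w . x + b, then the sigmoid turns it into a probability.
z = X_demo @ model.coef_.ravel() + model.intercept_[0]
manual_proba = 1.0 / (1.0 + np.exp(-z))

# Compare with the probability of class 1 reported by scikit-learn.
sk_proba = model.predict_proba(X_demo)[:, 1]
print(np.allclose(manual_proba, sk_proba))

# predict() corresponds to thresholding that probability at 0.5.
manual_pred = (manual_proba > 0.5).astype(int)
print(np.array_equal(manual_pred, model.predict(X_demo)))
```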

predictions = logmodel.predict(X_test)
from sklearn.metrics import classification_report
print(classification_report(y_test,predictions))
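The precision and recall in the classification report come from the confusion matrix. A small sketch with hand-made labels (y_true and y_pred here are invented for illustration, not the model's actual predictions) shows how the numbers relate:

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical labels, not results from the advertising model.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
y_pred = [0, 0, 1, 0, 1, 1, 0, 1, 1, 0]

# Rows are the true classes, columns are the predicted classes.
cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()
print(cm)

# Precision: of everything predicted as 1, how much really was 1.
precision = tp / (tp + fp)
# Recall: of everything that really was 1, how much was found.
recall = tp / (tp + fn)
print(precision, recall)

# The same values from scikit-learn's helper functions.
print(precision_score(y_true, y_pred), recall_score(y_true, y_pred))
```

For the real model, the same call would be confusion_matrix(y_test, predictions).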
