Logistic Regression_1

Question: Suppose that you are the administrator of a university department and you want to determine each applicant's chance of admission based on their results on two exams.
Theoretical background

At its core, logistic regression still rests on a linear function, just as in linear regression:

$$ f(x_{1},x_{2},x_{3},\dots,x_{n})=\theta_{0}+\theta_{1}x_{1}+\dots+\theta_{n}x_{n} $$
$$ \Theta=\begin{bmatrix} \theta_{0}\\ \vdots\\ \theta_{n} \end{bmatrix}\quad X=\begin{bmatrix} 1\\ x_{1}\\ \vdots\\ x_{n} \end{bmatrix} $$
$$ f(x_{1},x_{2},x_{3},…,x_{n})=X^{T}\Theta $$
Logistic regression is used to classify data: the sigmoid function maps $f(X)$ into the interval $(0,1)$, producing a probability that is compared against the 0.5 decision threshold to make a prediction:

$$ p(X)=\frac{1}{1+e^{-f(X)}} $$
Constructing the cost function:
This function (the cross-entropy) drives $f(X)$ toward $+\infty$ for samples whose $p(X)$ should approach 1, and toward $-\infty$ for samples whose $p(X)$ should approach 0:

$$ J(\Theta)=-\frac{1}{m}\sum_{i=1}^{m}\left(y_{i}\ln p(x^{i})+(1-y_{i})\ln\bigl(1-p(x^{i})\bigr)\right) $$

As in linear regression, gradient descent takes the partial derivative with respect to each parameter; the result is:
When $j=0$, $x_{j}=1$ (the bias term):

$$ \theta_{j}=\theta_{j}-\frac{\alpha }{m}\sum_{i=1}^{m}(p(x^{i})-y^{i})x_{j}^{i} $$

Looping for a fixed number of iterations and taking the final $\Theta$ gives:

$$ p(X)=\frac{1}{1+e^{-(X^T\Theta)}} $$
$$ \hat{y}=1 \quad \text{if} \quad p(X)>0.5 \quad \text{else} \quad \hat{y}=0 $$
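Before moving to the code, here is a quick sketch of the partial derivative behind the update rule above (this step is implicit in the text; it uses the identity $\sigma'(z)=\sigma(z)(1-\sigma(z))$):

$$ \frac{\partial J}{\partial \theta_{j}}=-\frac{1}{m}\sum_{i=1}^{m}\left(\frac{y_{i}}{p(x^{i})}-\frac{1-y_{i}}{1-p(x^{i})}\right)p(x^{i})\bigl(1-p(x^{i})\bigr)x_{j}^{i}=\frac{1}{m}\sum_{i=1}^{m}\bigl(p(x^{i})-y_{i}\bigr)x_{j}^{i} $$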
Reading and processing the data

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

path = 'Logistic Regression_1.txt'
data = pd.read_csv(path, names=['Exam1', 'Exam2', 'Accepted'])
data.head()
```
|   | Exam1     | Exam2     | Accepted |
|---|-----------|-----------|----------|
| 0 | 34.623660 | 78.024693 | 0        |
| 1 | 30.286711 | 43.894998 | 0        |
| 2 | 35.847409 | 72.902198 | 0        |
| 3 | 60.182599 | 86.308552 | 1        |
| 4 | 79.032736 | 75.344376 | 1        |
```python
fig, ax = plt.subplots()
ax.scatter(data[data['Accepted'] == 0]['Exam1'], data[data['Accepted'] == 0]['Exam2'],
           c='r', marker='x', label='y=0')
ax.scatter(data[data['Accepted'] == 1]['Exam1'], data[data['Accepted'] == 1]['Exam2'],
           c='b', marker='o', label='y=1')
ax.legend()
ax.set(xlabel='exam1', ylabel='exam2')
plt.show()
```
```python
def get_Xy(data):
    # prepend a column of ones so theta_0 acts as the bias term
    # (note: calling this twice raises an error, since 'ones' would already exist)
    data.insert(0, 'ones', 1)
    X = data.iloc[:, 0:-1].values
    y = data.iloc[:, -1].values.reshape(-1, 1)
    return X, y

X, y = get_Xy(data)
X.shape  # (100, 3)
y.shape  # (100, 1)
```
Building the loss function

```python
def sigmoid(z):
    return 1 / (1 + np.exp(-z))
```
```python
def costFunction(X, y, theta):
    A = sigmoid(X @ theta)              # predicted probabilities, shape (m, 1)
    first = y * np.log(A)
    second = (1 - y) * np.log(1 - A)
    return -np.sum(first + second) / len(X)
```
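If the linear score $X\Theta$ saturates, `sigmoid` can return exactly 0 or 1 and `np.log` produces `-inf`. A minimal hedge against that, assuming we are free to clip the probabilities (the name `costFunction_safe` and the `eps` parameter are illustrative, not part of the original notebook):

```python
def costFunction_safe(X, y, theta, eps=1e-12):
    # clip predicted probabilities away from exact 0/1 so np.log stays finite
    A = np.clip(sigmoid(X @ theta), eps, 1 - eps)
    # np.mean over the (m, 1) array equals the sum divided by m
    return -np.mean(y * np.log(A) + (1 - y) * np.log(1 - A))
```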
```python
theta = np.zeros((3, 1))
theta.shape   # (3, 1)

cost_init = costFunction(X, y, theta)
print(cost_init)
# 0.6931471805599453  (= ln 2: with theta = 0, every prediction is exactly 0.5)
```
Building gradient descent

```python
def gradientDescent(X, y, theta, alpha, iters):
    m = len(X)
    costs = []
    for i in range(iters):
        # vectorized update: theta_j -= alpha/m * sum((p(x) - y) * x_j)
        theta = theta - X.T @ (sigmoid(X @ theta) - y) * alpha / m
        cost = costFunction(X, y, theta)
        costs.append(cost)
        if i % 1000 == 0:
            print(cost)
    return costs, theta
```
Iterating to get the result

```python
alpha = 0.004
iters = 200000
costs, theta_final = gradientDescent(X, y, theta, alpha, iters)
```
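Since `gradientDescent` returns the full cost history, a quick sanity check is to plot it and confirm the curve keeps decreasing (a minimal sketch using the `costs` list from above):

```python
# plot the cost history to verify convergence
fig, ax = plt.subplots()
ax.plot(range(len(costs)), costs)
ax.set(xlabel='iterations', ylabel='cost')
plt.show()
```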
```python
theta_final
# array([[-23.77288372],
#        [  0.20687383],
#        [  0.19997746]])
```
```python
def predict(X, theta):
    prob = sigmoid(X @ theta)
    return [1 if x >= 0.5 else 0 for x in prob]
```
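An equivalent vectorized form, which keeps the `(m, 1)` shape and so would avoid the reshape in the accuracy step below (the name `predict_vec` is illustrative):

```python
def predict_vec(X, theta):
    # boolean mask over all rows at once; astype(int) maps True/False to 1/0
    return (sigmoid(X @ theta) >= 0.5).astype(int)
```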
Estimating the accuracy

```python
y_ = np.array(predict(X, theta_final))
y_pre = y_.reshape(len(y_), 1)
acc = np.mean(y_pre == y)   # fraction of correct predictions
print(acc)
# 0.91
```
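As an optional cross-check that is not part of the original exercise (it assumes scikit-learn is installed), `sklearn.linear_model.LogisticRegression` should reach a similar accuracy; exact agreement is not expected, because it applies L2 regularization by default:

```python
from sklearn.linear_model import LogisticRegression

# drop the manually added ones column: sklearn fits its own intercept
clf = LogisticRegression()
clf.fit(X[:, 1:], y.ravel())
print(clf.score(X[:, 1:], y.ravel()))  # expected to be close to 0.91
```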
Plotting the decision boundary

```python
# boundary: theta_0 + theta_1*x1 + theta_2*x2 = 0, solved for x2
coef1 = -theta_final[0, 0] / theta_final[2, 0]
coef2 = -theta_final[1, 0] / theta_final[2, 0]
x = np.linspace(20, 100, 100)
f = coef1 + coef2 * x

fig, ax = plt.subplots()
ax.scatter(data[data['Accepted'] == 0]['Exam1'], data[data['Accepted'] == 0]['Exam2'],
           c='r', marker='x', label='y=0')
ax.scatter(data[data['Accepted'] == 1]['Exam1'], data[data['Accepted'] == 1]['Exam2'],
           c='b', marker='o', label='y=1')
ax.legend()
ax.set(xlabel='exam1', ylabel='exam2')
ax.plot(x, f, c='g')
plt.show()
```
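For reference, the green line is the decision boundary $p(X)=0.5$, i.e. $X^{T}\Theta=0$, solved for $x_{2}$; this is exactly where `coef1` and `coef2` come from:

$$ \theta_{0}+\theta_{1}x_{1}+\theta_{2}x_{2}=0 \quad\Rightarrow\quad x_{2}=-\frac{\theta_{0}}{\theta_{2}}-\frac{\theta_{1}}{\theta_{2}}x_{1} $$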
Site: Code (Jupyter) and the data used: https://github.com/codeYu233/Study/tree/main/Logistic%20Regression_1
Note: This exercise and its dataset come from Andrew Ng's Machine Learning course (Stanford University) on Coursera. They are shared here for study and discussion only, and will be removed immediately if there is any issue.