交通事故理赔审核
标杆:交通事故理赔审核
LASSO逻辑回归模型(Python)
该模型预测结果的PR-AUC为:0.714644
# -*- coding: utf-8 -*-
import pandas as pd
from sklearn.linear_model import LogisticRegression
# 读取数据
train = pd.read_csv("train.csv")
test = pd.read_csv("test.csv")
submit = pd.read_csv("sample_submit.csv")
# 删除id
train.drop('CaseId', axis=1, inplace=True)
test.drop('CaseId', axis=1, inplace=True)
# 取出训练集的y
y_train = train.pop('Evaluation')
# 建立LASSO逻辑回归模型
clf = LogisticRegression(penalty='l1', C=1.0, random_state=0)
clf.fit(train, y_train)
y_pred = clf.predict_proba(test)[:, 1]
# 输出预测结果至my_LASSO_prediction.csv
submit['Evaluation'] = y_pred
submit.to_csv('my_LASSO_prediction.csv', index=False)
随机森林分类模型(Python)
该模型预测结果的PR-AUC为:0.850897
# -*- coding: utf-8 -*-
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
# 读取数据
train = pd.read_csv("train.csv")
test = pd.read_csv("test.csv")
submit = pd.read_csv("sample_submit.csv")
# 删除id
train.drop('CaseId', axis=1, inplace=True)
test.drop('CaseId', axis=1, inplace=True)
# 取出训练集的y
y_train = train.pop('Evaluation')
# 建立随机森林模型
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(train, y_train)
y_pred = clf.predict_proba(test)[:, 1]
# 输出预测结果至my_RF_prediction.csv
submit['Evaluation'] = y_pred
submit.to_csv('my_RF_prediction.csv', index=False)