sklearn provides two voting estimators: VotingRegressor and VotingClassifier. Both take a list of models, where each model is given as a tuple: the first element is a name and the second is the model itself. Each model must have a unique name. See the example below:
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import VotingClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
models = [('lr', LogisticRegression()), ('svm', SVC())]
ensemble = VotingClassifier(estimators=models)  # hard voting (the default)
models = [('lr', LogisticRegression()),
          ('svm', make_pipeline(StandardScaler(), SVC(probability=True)))]
# soft voting averages predicted probabilities; SVC needs probability=True
# so that it exposes predict_proba
ensemble = VotingClassifier(estimators=models, voting='soft')
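A minimal end-to-end sketch of the API described above, using an assumed toy dataset from make_classification: fit a hard-voting ensemble and score it.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Illustrative data: 200 samples, default 20 features, 2 classes
X, y = make_classification(n_samples=200, random_state=0)

models = [('lr', make_pipeline(StandardScaler(), LogisticRegression())),
          ('svm', make_pipeline(StandardScaler(), SVC()))]
ensemble = VotingClassifier(estimators=models)  # hard voting by default
ensemble.fit(X, y)
print(ensemble.predict(X[:5]))  # majority-vote class labels
print(round(ensemble.score(X, y), 3))
```

With hard voting, each base model casts one vote per sample and the majority label wins (ties are broken by ascending class-label order).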
We can use an example to measure how much the ensemble improves over the individual models.
First, create a random dataset with 1000 samples and 20 features:
from sklearn.datasets import make_classification
def get_dataset():
    X, y = make_classification(n_samples=1000,    # number of samples
                               n_features=20,     # total number of features
                               n_informative=15,  # informative features
                               n_redundant=5,     # redundant features
                               random_state=2)
    return X, y
Some notes on the parameters of make_classification:
n_samples: number of samples, default 100
n_features: total number of features, default 20
n_informative: number of informative features
n_redundant: number of redundant features, generated as random linear combinations of the informative features
n_repeated: number of duplicated features, drawn at random from the informative and redundant features
n_classes: number of classes
n_clusters_per_class: number of clusters per class
random_state: random seed
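A quick, illustrative check of what make_classification returns with the parameters used in this tutorial:

```python
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=20,
                           n_informative=15, n_redundant=5,
                           random_state=2)
print(X.shape)      # (1000, 20)
print(len(set(y)))  # 2 -- n_classes defaults to 2
```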
Use KNN models as the base models:
def get_voting():
    models = list()
    models.append(('knn1', KNeighborsClassifier(n_neighbors=1)))
    models.append(('knn3', KNeighborsClassifier(n_neighbors=3)))
    models.append(('knn5', KNeighborsClassifier(n_neighbors=5)))
    models.append(('knn7', KNeighborsClassifier(n_neighbors=7)))
    models.append(('knn9', KNeighborsClassifier(n_neighbors=9)))
    ensemble = VotingClassifier(estimators=models, voting='hard')
    return ensemble
To show the improvement from each model, add the following function:
def get_models():
    models = dict()
    models['knn1'] = KNeighborsClassifier(n_neighbors=1)
    models['knn3'] = KNeighborsClassifier(n_neighbors=3)
    models['knn5'] = KNeighborsClassifier(n_neighbors=5)
    models['knn7'] = KNeighborsClassifier(n_neighbors=7)
    models['knn9'] = KNeighborsClassifier(n_neighbors=9)
    models['hard_voting'] = get_voting()
    return models
Next, define the following function, which returns the scores from repeated stratified 10-fold cross-validation (3 repeats) as a list:
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
def evaluate_model(model, X, y):
    # repeated, stratified, shuffled K-fold cross-validation
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
    scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv,
                             n_jobs=-1, error_score='raise')
    return scores
Then put it all together:
from sklearn.neighbors import KNeighborsClassifier
import matplotlib.pyplot as plt
X, y = get_dataset()
models = get_models()
results, names = list(), list()
for name, model in models.items():
    score = evaluate_model(model, X, y)
    results.append(score)
    names.append(name)
    print('%s %.3f (%.3f)' % (name, score.mean(), score.std()))
plt.boxplot(results, labels=names, showmeans=True)
plt.show()
knn1 0.873 (0.030)
knn3 0.889 (0.038)
knn5 0.895 (0.031)
knn7 0.899 (0.035)
knn9 0.900 (0.033)
hard_voting 0.902 (0.034)
[Figure: box plot of each KNN model's accuracy and the hard-voting ensemble]
As the results show, accuracy improves steadily, and the hard-voting ensemble scores highest.
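As a variation on the setup above: since KNeighborsClassifier exposes predict_proba, the same ensemble can also use soft voting, which averages the per-class probabilities instead of counting votes. A sketch with the same assumed dataset and K values:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=1000, n_features=20,
                           n_informative=15, n_redundant=5, random_state=2)
models = [('knn%d' % k, KNeighborsClassifier(n_neighbors=k))
          for k in (1, 3, 5, 7, 9)]
# Soft voting: average predict_proba outputs, then take the argmax class
soft = VotingClassifier(estimators=models, voting='soft')
soft.fit(X, y)
print(round(soft.score(X, y), 3))  # training accuracy
```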
bagging
Similarly, after generating a dataset, we use a simple example to introduce the corresponding bagging API:
from numpy import mean
from numpy import std
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.ensemble import BaggingClassifier
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, n_redundant=5, random_state=5)
model = BaggingClassifier()
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
n_scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1, error_score='raise')
print('Accuracy: %.3f (%.3f)' % (mean(n_scores), std(n_scores)))
Accuracy: 0.861 (0.042)
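BaggingClassifier bags decision trees by default, and the number of bootstrap models is controlled by n_estimators. A sketch (the values and the lighter 5-fold CV below are illustrative, not from the tutorial):

```python
from numpy import mean
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20,
                           n_informative=15, n_redundant=5, random_state=5)
for n in (10, 50):
    model = BaggingClassifier(n_estimators=n, random_state=1)
    scores = cross_val_score(model, X, y, scoring='accuracy', cv=5, n_jobs=-1)
    print('n_estimators=%d: %.3f' % (n, mean(scores)))
```

More estimators usually stabilize the ensemble at the cost of training time; past a certain point the accuracy curve flattens out.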
