Decision Trees in R

# Computing entropy
a <- rep(0.5, 2)  # repeat 0.5 twice
-sum(a*log2(a))
1
b <- rep(0.25,4)
-sum(b*log2(b))
2
c <- rep(0.125,8)
-sum(c*log2(c))
3
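
The three calculations above all follow the same pattern: a uniform distribution over n outcomes has entropy log2(n) bits. As a minimal sketch, the formula can be wrapped in a helper. The name `entropy` is an assumption of this example and is not part of the original post; dropping zero probabilities follows the usual convention that 0 * log2(0) counts as 0.

```r
# Hypothetical helper (not from the original post): Shannon entropy in bits.
entropy <- function(p) {
  # p must be a probability vector: non-negative and summing to 1
  stopifnot(all(p >= 0), abs(sum(p) - 1) < 1e-8)
  p <- p[p > 0]          # drop zeros so that 0 * log2(0) does not yield NaN
  -sum(p * log2(p))
}

entropy(rep(1/8, 8))     # uniform over 8 outcomes, same as the last case above
entropy(c(0.9, 0.1))     # a skewed distribution carries less than 1 bit
entropy(c(1, 0))         # a certain outcome carries no information
```

The more evenly the probability is spread, the higher the entropy, which is why a decision tree looks for splits that produce skewed (purer) child nodes.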



# Implementing a decision tree in R
library(rpart)
iris.rp = rpart(data=iris,Species~.,method="class")
iris.rp
n= 150 

node), split, n, loss, yval, (yprob)
      * denotes terminal node

1) root 150 100 setosa (0.33333333 0.33333333 0.33333333)  
  2) Petal.Length< 2.45 50   0 setosa (1.00000000 0.00000000 0.00000000) *
  3) Petal.Length>=2.45 100  50 versicolor (0.00000000 0.50000000 0.50000000)  
    6) Petal.Width< 1.75 54   5 versicolor (0.00000000 0.90740741 0.09259259) *
    7) Petal.Width>=1.75 46   1 virginica (0.00000000 0.02173913 0.97826087) *
plot(iris.rp,uniform = T,branch = 0,margin = 0.1,main="Classification Tree")
text(iris.rp,use.n =T ,fancy = T,col="blue")
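
To connect the entropy calculation with the tree, the information gain of the root split (Petal.Length < 2.45) can be worked out by hand. This is only a sketch: the helper names `entropy` and `class_entropy` are assumptions of this example. Note also that `rpart` uses the Gini index by default for classification, so this illustrates the entropy criterion rather than reproducing what the call above computed internally.

```r
# Hypothetical helpers (not from the original post)
entropy <- function(p) {
  p <- p[p > 0]                      # treat 0 * log2(0) as 0
  -sum(p * log2(p))
}
class_entropy <- function(y) entropy(as.numeric(table(y)) / length(y))

left <- iris$Petal.Length < 2.45     # the root split reported by rpart

h_root  <- class_entropy(iris$Species)          # three balanced classes
h_left  <- class_entropy(iris$Species[left])    # pure node: setosa only
h_right <- class_entropy(iris$Species[!left])   # versicolor vs virginica, 50/50

# Information gain = parent entropy minus size-weighted child entropy
gain <- h_root - (mean(left) * h_left + mean(!left) * h_right)
gain
```

The left child is pure, so all remaining uncertainty sits in the right child, which the tree then splits on Petal.Width.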

# The kyphosis dataset (spinal curvature after surgery)
library(rpart)
head(kyphosis)
| Kyphosis | Age | Number | Start |
|---|---|---|---|
| absent | 71 | 3 | 5 |
| absent | 158 | 3 | 14 |
| present | 128 | 4 | 5 |
| absent | 2 | 5 | 1 |
| absent | 1 | 4 | 15 |
| absent | 1 | 2 | 16 |
[1] 4
Error in shape(kyphosis): could not find function "shape"
# R has no shape() function; dim(kyphosis) returns the row and column counts

ct <- rpart.control(xval = 10,minsplit = 20,cp=0.1)
fit <- rpart(Kyphosis~Age + Number +Start,data = kyphosis,method = "class",control =ct,parms = list(prior =c(0.65,0.35),split = 'information'))
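
The call above packs several options into one line. The sketch below repeats it with each option commented and, as an assumption of this example, also fits a tree with default settings (`fit_default`, a name not in the original post) so the two complexity tables can be compared.

```r
library(rpart)

ct <- rpart.control(
  xval     = 10,   # number of cross-validation folds
  minsplit = 20,   # a node needs at least 20 observations before a split is tried
  cp       = 0.1   # skip splits that do not improve the fit by at least this factor
)

fit <- rpart(Kyphosis ~ Age + Number + Start,
             data = kyphosis, method = "class", control = ct,
             parms = list(
               prior = c(0.65, 0.35),   # class priors, in factor-level order (absent, present)
               split = "information"    # entropy-based splitting instead of the default Gini
             ))

# Hypothetical comparison model using rpart's default settings
fit_default <- rpart(Kyphosis ~ Age + Number + Start,
                     data = kyphosis, method = "class")

fit$cptable          # complexity table: CP, number of splits, errors
fit_default$cptable
```

A larger `cp` stops the tree from growing earlier, so comparing the `nsplit` column of the two tables shows how much the stricter setting restricts the tree.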



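# Draw the two plots side by side
par(mfrow = c(1, 2))  # 1 row, 2 columns, so both plots share one device
# Option 1: base graphics
plot(fit)
text(fit, use.n = TRUE, all = TRUE, cex = 0.9)
# Option 2: the rpart.plot package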
library(rpart.plot)
rpart.plot(fit, branch = 1, branch.type = 2, type = 1, extra = 102,
           shadow.col = "gray", box.col = "green", border.col = "blue",
           split.col = "red", split.cex = 1.2, main = "Kyphosis decision tree")



res <- predict(fit)
result <- ifelse(res[,2]>0.5,'present','absent')
table(kyphosis$Kyphosis,result)
         result
          absent present
  absent      53      11
  present      3      14
a <- table(kyphosis$Kyphosis,result)
# Accuracy
(a[1,1] +a[2,2 ])/sum(a)
0.827160493827161
# Recall
a[2,2]/sum(a[2,])
0.823529411764706
# Precision
a[2,2]/sum(a[,2])
0.56
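
The three metrics above can be gathered into one small function. This is a sketch only: `classification_metrics` is a hypothetical helper, and the matrix below re-enters the counts from the confusion table shown earlier, with rows as actual classes and columns as predicted classes. Because `predict(fit)` was called without new data, these figures describe the fit on the training set, not performance on unseen cases.

```r
# Hypothetical helper (not from the original post)
classification_metrics <- function(cm, positive) {
  tp <- cm[positive, positive]                 # true positives
  c(accuracy  = sum(diag(cm)) / sum(cm),       # correct predictions / all cases
    recall    = tp / sum(cm[positive, ]),      # TP / all actual positives
    precision = tp / sum(cm[, positive]))      # TP / all predicted positives
}

# Counts copied from the confusion table above (filled column by column)
cm <- matrix(c(53, 3, 11, 14), nrow = 2,
             dimnames = list(actual    = c("absent", "present"),
                             predicted = c("absent", "present")))

classification_metrics(cm, "present")
```

Recall is high while precision is much lower: the model finds most true kyphosis cases but also flags 11 healthy patients as present.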
