Clustering and Association Rules in R




a <- c(10, 9, 8)
b <- c(4, 3, 2)
c <- c(8, 9, 10)
a
- 10
- 9
- 8
# Cosine similarity
sum(a*b)/sqrt(sum(a^2)*sum(b^2))
0.984682118265774
sum(a*c)/sqrt(sum(a^2)*sum(c^2))
0.983673469387755
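The two expressions above repeat the same formula, so it can be wrapped in a small function. This is a minimal sketch; `cosine_sim` is a hypothetical helper name, not a base R function.

```r
# Cosine similarity between two numeric vectors of equal length.
# `cosine_sim` is an illustrative helper, not part of base R.
cosine_sim <- function(u, v) {
  stopifnot(length(u) == length(v))
  sum(u * v) / sqrt(sum(u^2) * sum(v^2))
}

a <- c(10, 9, 8)
b <- c(4, 3, 2)
c <- c(8, 9, 10)

sim_ab <- cosine_sim(a, b)  # same value as the first expression above
sim_ac <- cosine_sim(a, c)  # same value as the second expression above
print(c(sim_ab, sim_ac))
```

Cosine similarity depends only on the angle between the vectors, not on their lengths, which is why `a` and `b` score slightly higher than `a` and `c` even though `b` has much smaller components.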
x<- rbind(a,b,c)
x
| a | 10 | 9 | 8 |
|---|---|---|---|
| b | 4 | 3 | 2 |
| c | 8 | 9 | 10 |
# Euclidean distance
dist(x)
a b
b 10.392305
c 2.828427 10.770330
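To see what `dist()` computes, one entry can be checked by hand. The sketch below assumes the default `method = "euclidean"`; `euclid` is a hypothetical helper name.

```r
# Euclidean distance between two vectors: square root of the summed squared differences.
# `euclid` is an illustrative helper, not part of base R.
euclid <- function(u, v) sqrt(sum((u - v)^2))

x <- rbind(a = c(10, 9, 8), b = c(4, 3, 2), c = c(8, 9, 10))

d_ab <- euclid(x["a", ], x["b", ])  # distance between rows a and b
d <- as.matrix(dist(x))             # full symmetric distance matrix
print(d_ab)
print(d)
```

Converting the `dist` object with `as.matrix()` gives a square matrix indexed by row name, which is easier to look up than the lower-triangle printout.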

newiris <- iris
newiris$Species <- NULL
head(newiris)
| Sepal.Length | Sepal.Width | Petal.Length | Petal.Width |
|---|---|---|---|
| 5.1 | 3.5 | 1.4 | 0.2 |
| 4.9 | 3.0 | 1.4 | 0.2 |
| 4.7 | 3.2 | 1.3 | 0.2 |
| 4.6 | 3.1 | 1.5 | 0.2 |
| 5.0 | 3.6 | 1.4 | 0.2 |
| 5.4 | 3.9 | 1.7 | 0.4 |
kc <- kmeans(newiris,3)
kc
K-means clustering with 3 clusters of sizes 38, 62, 50

Cluster means:
  Sepal.Length Sepal.Width Petal.Length Petal.Width
1     6.850000    3.073684     5.742105    2.071053
2     5.901613    2.748387     4.393548    1.433871
3     5.006000    3.428000     1.462000    0.246000

Clustering vector:
  [1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 [38] 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 [75] 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 1 1 1 1 2 1 1 1 1
[112] 1 1 2 2 1 1 1 1 2 1 2 1 2 1 1 2 2 1 1 1 1 1 2 1 1 1 1 2 1 1 1 2 1 1 1 2 1
[149] 1 2

Within cluster sum of squares by cluster:
[1] 23.87947 39.82097 15.15100
 (between_SS / total_SS =  88.4 %)

Available components:
[1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
[6] "betweenss"    "size"         "iter"         "ifault"
table(kc$cluster,iris$Species)
    setosa versicolor virginica
  1      0          2        36
  2      0         48        14
  3     50          0         0
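`kmeans()` starts from randomly chosen centers, so the cluster numbering (and occasionally the partition itself) can change between runs. A hedged sketch of a more reproducible call follows; the seed value `42` and `nstart = 25` are arbitrary choices made for illustration, not settings from the original run.

```r
# Fix the random seed and try several random starts, keeping the best solution.
# The seed and nstart values are illustrative assumptions.
set.seed(42)
newiris <- iris[, -5]                          # drop the Species column
kc <- kmeans(newiris, centers = 3, nstart = 25)

print(sort(kc$size))       # cluster sizes, sorted so label order does not matter
print(kc$tot.withinss)     # total within-cluster sum of squares
print(table(kc$cluster, iris$Species))
```

Because the labels 1, 2, 3 are arbitrary, the rows of the cross-table may appear in a different order from the output shown above.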

library(cluster)
med <- pam(iris[,-5],3)
med
Medoids:
      ID Sepal.Length Sepal.Width Petal.Length Petal.Width
[1,]   8          5.0         3.4          1.5         0.2
[2,]  79          6.0         2.9          4.5         1.5
[3,] 113          6.8         3.0          5.5         2.1
Clustering vector:
  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 [38] 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 [75] 2 2 2 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 2 3 3 3 3 2 3 3 3 3
[112] 3 3 2 2 3 3 3 3 2 3 2 3 2 3 3 2 2 3 3 3 3 3 2 3 3 3 3 2 3 3 3 2 3 3 3 2 3
[149] 3 2
Objective function:
    build      swap
0.6709391 0.6542077

Available components:
 [1] "medoids"    "id.med"     "clustering" "objective"  "isolation"
 [6] "clusinfo"   "silinfo"    "diss"       "call"       "data"
table(med$clustering,iris$Species)
    setosa versicolor virginica
  1     50          0         0
  2      0         48        14
  3      0          2        36
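One way to summarize a cluster-versus-species table in a single number is purity: assign each cluster to its most common species and count the fraction of flowers that agree. The sketch below re-enters the counts from the PAM table above by hand; `purity` is an illustrative name, and this measure is one of several possible choices.

```r
# Cross-table of cluster (rows) against species (columns), copied from the output above.
tab <- matrix(c(50,  0,  0,
                 0, 48, 14,
                 0,  2, 36),
              nrow = 3, byrow = TRUE,
              dimnames = list(cluster = c("1", "2", "3"),
                              species = c("setosa", "versicolor", "virginica")))

# Purity: sum of each cluster's largest cell, divided by the total count.
purity <- sum(apply(tab, 1, max)) / sum(tab)
print(purity)
```

The k-means table shown earlier contains the same counts under different cluster labels, so it gives the same purity.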




library(arules)
Loading required package: Matrix

Attaching package: 'arules'

The following objects are masked from 'package:base':

    abbreviate, write
data(Groceries)
Groceries
transactions in sparse format with
 9835 transactions (rows) and
 169 items (columns)
summary(Groceries)
transactions as itemMatrix in sparse format with
 9835 rows (elements/itemsets/transactions) and
 169 columns (items) and a density of 0.02609146

most frequent items:
      whole milk other vegetables       rolls/buns             soda
            2513             1903             1809             1715
          yogurt          (Other)
            1372            34055

element (itemset/transaction) length distribution:
sizes
   1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16
2159 1643 1299 1005  855  645  545  438  350  246  182  117   78   77   55   46
  17   18   19   20   21   22   23   24   26   27   28   29   32
  29   14   14    9   11    4    6    1    1    1    1    3    1

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.000   2.000   3.000   4.409   6.000  32.000

includes extended item information - examples:
       labels  level2           level1
1 frankfurter sausage meat and sausage
2     sausage sausage meat and sausage
3  liver loaf sausage meat and sausage
freq = eclat(Groceries,parameter = list(support=0.06,maxlen=10))
Eclat

parameter specification:
 tidLists support minlen maxlen            target   ext
    FALSE    0.06      1     10 frequent itemsets FALSE

algorithmic control:
 sparse sort verbose
      7   -2    TRUE

Absolute minimum support count: 590

create itemset ...
set transactions ...[169 item(s), 9835 transaction(s)] done [0.00s].
sorting and recoding items ... [20 item(s)] done [0.00s].
creating sparse bit matrix ... [20 row(s), 9835 column(s)] done [0.00s].
writing ... [21 set(s)] done [0.00s].
Creating S4 object ... done [0.00s].
inspect(freq)
items support count
[1] {other vegetables,whole milk} 0.07483477 736
[2] {whole milk} 0.25551601 2513
[3] {other vegetables} 0.19349263 1903
[4] {rolls/buns} 0.18393493 1809
[5] {yogurt} 0.13950178 1372
[6] {soda} 0.17437722 1715
[7] {root vegetables} 0.10899847 1072
[8] {tropical fruit} 0.10493137 1032
[9] {bottled water} 0.11052364 1087
[10] {sausage} 0.09395018 924
[11] {shopping bags} 0.09852567 969
[12] {citrus fruit} 0.08276563 814
[13] {pastry} 0.08896797 875
[14] {pip fruit} 0.07564820 744
[15] {whipped/sour cream} 0.07168277 705
[16] {fruit/vegetable juice} 0.07229283 711
[17] {domestic eggs} 0.06344687 624
[18] {newspapers} 0.07981698 785
[19] {brown bread} 0.06487036 638
[20] {bottled beer} 0.08052872 792
[21] {canned beer} 0.07768175 764
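The `support` column is simply `count` divided by the number of transactions. A small base R check using three rows copied from the output above (the variable names are illustrative):

```r
# Number of transactions in Groceries, as reported above.
n_transactions <- 9835

# Counts copied from the inspect(freq) output.
counts <- c("whole milk" = 2513,
            "other vegetables" = 1903,
            "other vegetables,whole milk" = 736)

# Support = fraction of transactions that contain the itemset.
support <- counts / n_transactions
print(round(support, 8))
```

An itemset is reported only when this fraction reaches the `support = 0.06` threshold passed to `eclat()`, which is why just one two-item set appears in the list.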
model <- apriori(Groceries,parameter = list(support=0.01,confidence=0.5))
Apriori

Parameter specification:
 confidence minval smax arem  aval originalSupport maxtime support minlen
        0.5    0.1    1 none FALSE            TRUE       5    0.01      1
 maxlen target   ext
     10  rules FALSE

Algorithmic control:
 filter tree heap memopt load sort verbose
    0.1 TRUE TRUE  FALSE TRUE    2    TRUE

Absolute minimum support count: 98

set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[169 item(s), 9835 transaction(s)] done [0.00s].
sorting and recoding items ... [88 item(s)] done [0.00s].
creating transaction tree ... done [0.00s].
checking subsets of size 1 2 3 4 done [0.00s].
writing ... [15 rule(s)] done [0.00s].
creating S4 object ... done [0.00s].
inspect(model)
     lhs                                      rhs                support    confidence lift     count
[1]  {curd,yogurt}                         => {whole milk}       0.01006609 0.5823529  2.279125  99
[2]  {other vegetables,butter}             => {whole milk}       0.01148958 0.5736041  2.244885 113
[3]  {other vegetables,domestic eggs}      => {whole milk}       0.01230300 0.5525114  2.162336 121
[4]  {yogurt,whipped/sour cream}           => {whole milk}       0.01087951 0.5245098  2.052747 107
[5]  {other vegetables,whipped/sour cream} => {whole milk}       0.01464159 0.5070423  1.984385 144
[6]  {pip fruit,other vegetables}          => {whole milk}       0.01352313 0.5175097  2.025351 133
[7]  {citrus fruit,root vegetables}        => {other vegetables} 0.01037112 0.5862069  3.029608 102
[8]  {tropical fruit,root vegetables}      => {other vegetables} 0.01230300 0.5845411  3.020999 121
[9]  {tropical fruit,root vegetables}      => {whole milk}       0.01199797 0.5700483  2.230969 118
[10] {tropical fruit,yogurt}               => {whole milk}       0.01514997 0.5173611  2.024770 149
[11] {root vegetables,yogurt}              => {other vegetables} 0.01291307 0.5000000  2.584078 127
[12] {root vegetables,yogurt}              => {whole milk}       0.01453991 0.5629921  2.203354 143
[13] {root vegetables,rolls/buns}          => {other vegetables} 0.01220132 0.5020921  2.594890 120
[14] {root vegetables,rolls/buns}          => {whole milk}       0.01270971 0.5230126  2.046888 125
[15] {other vegetables,yogurt}             => {whole milk}       0.02226741 0.5128806  2.007235 219
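The rule measures are related by simple arithmetic: support is the rule's count divided by the number of transactions, and lift is confidence divided by the support of the right-hand side. The sketch below checks this for rule [1], `{curd,yogurt} => {whole milk}`, using numbers copied from the outputs above (2513 is the `whole milk` count reported for Groceries); the variable names are illustrative.

```r
n <- 9835                 # transactions in Groceries
count_rule <- 99          # transactions containing curd, yogurt and whole milk
count_rhs  <- 2513        # transactions containing whole milk
confidence <- 0.5823529   # as reported for rule [1]

support <- count_rule / n                # share of all transactions matching the rule
lift <- confidence / (count_rhs / n)     # confidence relative to the baseline rate of whole milk
count_lhs <- count_rule / confidence     # implied number of transactions with {curd,yogurt}

print(c(support = support, lift = lift, count_lhs = count_lhs))
```

A lift above 1 means the right-hand side appears more often with the left-hand side than it does in the data overall.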
inspect(subset(model,subset = rhs%in%"whole milk"&lift>2.2))
    lhs                                 rhs          support    confidence lift     count
[1] {curd,yogurt}                    => {whole milk} 0.01006609 0.5823529  2.279125  99
[2] {other vegetables,butter}        => {whole milk} 0.01148958 0.5736041  2.244885 113
[3] {tropical fruit,root vegetables} => {whole milk} 0.01199797 0.5700483  2.230969 118
[4] {root vegetables,yogurt}         => {whole milk} 0.01453991 0.5629921  2.203354 143
