simhash实现

import com.clearspring.analytics.hash.MurmurHash/*** Created by fhqplzj on 17-3-1 at 下午6:07.*/
object Sim {def simHash(features: Array[String], weights: Array[Int]): Long = {val hist = Array.ofDim[Int](64)features.zip(weights).foreach {case (feature, weight) => {val hash = MurmurHash.hash64(feature)for (i <- 0 until 64) {if ((hash & (1 << i)) == 0) {hist(i) += -weight} else {hist(i) += weight}}}}var result: Long = 0for (i <- 0 until 64) {if (hist(i) >= 0) {result |= (1 << i)}}result}def main(args: Array[String]): Unit = {val features = "zhao jun haha".split(" ")val weights = Array.fill(features.length)(1)println(simHash(features, weights))}
}


本文来自互联网用户投稿,文章观点仅代表作者本人,不代表本站立场,不承担相关法律责任。如若转载,请注明出处。 如若内容造成侵权/违法违规/事实不符,请点击【内容举报】进行投诉反馈!

相关文章

立即
投稿

微信公众账号

微信扫一扫加关注

返回
顶部