Key Project Questions

Interview cram sheet:

https://blog.csdn.net/Mrerlou/article/details/114295888?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522167863142316800186566694%2522%252C%2522scm%2522%253A%252220140713.130102334..%2522%257D&request_id=167863142316800186566694&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~sobaiduend~default-1-114295888-null-null.142^v73^control,201^v4^add_ask,239^v2^insert_chatgpt&utm_term=%E9%9B%86%E7%BE%A4%E8%B5%84%E6%BA%90%E5%88%86%E9%85%8D%E5%8F%82%E6%95%B0%EF%BC%88%E9%A1%B9%E7%9B%AE%E4%B8%AD%E9%81%87%E5%88%B0%E7%9A%84%E9%97%AE%E9%A2%98%EF%BC%89%20&spm=1018.2226.3001.4187

Hadoop outages

Ways to handle data skew in Hadoop

https://blog.csdn.net/a934079371/article/details/109233998?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522167863132316800186547751%2522%252C%2522scm%2522%253A%252220140713.130102334.pc%255Fall.%2522%257D&request_id=167863132316800186547751&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~first_rank_ecpm_v1~rank_v31_ecpm-9-109233998-null-null.142^v73^control,201^v4^add_ask,239^v2^insert_chatgpt&utm_term=Hadoop%E8%A7%A3%E5%86%B3%E6%95%B0%E6%8D%AE%E5%80%BE%E6%96%9C%E6%96%B9%E6%B3%95%20%20%E9%9D%A2%E8%AF%95&spm=1018.2226.3001.4187

Cluster resource allocation parameters (problems encountered in the project)

Handling small files in HDFS

https://blog.csdn.net/gym02/article/details/124169185?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522167863123316800227426570%2522%252C%2522scm%2522%253A%252220140713.130102334..%2522%257D&request_id=167863123316800227426570&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~baidu_landing_v2~default-2-124169185-null-null.142^v73^control,201^v4^add_ask,239^v2^insert_chatgpt&utm_term=HDFS%E5%B0%8F%E6%96%87%E4%BB%B6%E5%A4%84%E7%90%86%20%E9%9D%A2%E8%AF%95&spm=1018.2226.3001.4187

Hadoop tuning

Hive vs. Spark comparison

Flume crashes

Flume tuning

Common Kafka problems:

https://www.cnblogs.com/erlou96/p/14401394.html

Kafka crashes

https://blog.csdn.net/huaxing_ba/article/details/125023374?utm_medium=distribute.pc_relevant.none-task-blog-2~default~baidujs_utm_term~default-1-125023374-blog-122420561.pc_relevant_landingrelevant&spm=1001.2101.3001.4242.2&utm_relevant_index=4

Kafka message loss

https://blog.csdn.net/YoungJ_Zhou/article/details/125605128?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522167861085216800226539195%2522%252C%2522scm%2522%253A%252220140713.130102334..%2522%257D&request_id=167861085216800226539195&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~baidu_landing_v2~default-4-125605128-null-null.142^v73^control,201^v4^add_ask,239^v2^insert_chatgpt&utm_term=kafka%E6%95%B0%E6%8D%AE%E4%B8%A2%E5%A4%B1&spm=1018.2226.3001.4187

Kafka duplicate data

https://blog.csdn.net/feiying0canglang/article/details/120514976?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522167861119616800225535554%2522%252C%2522scm%2522%253A%252220140713.130102334..%2522%257D&request_id=167861119616800225535554&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~sobaiduend~default-2-120514976-null-null.142^v73^control,201^v4^add_ask,239^v2^insert_chatgpt&utm_term=kafka%20%E6%95%B0%E6%8D%AE%E9%87%8D%E5%A4%8D&spm=1018.2226.3001.4187

Kafka message backlog

https://blog.csdn.net/y277an/article/details/117828798?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522167862968516800227442646%2522%252C%2522scm%2522%253A%252220140713.130102334..%2522%257D&request_id=167862968516800227442646&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~sobaiduend~default-1-117828798-null-null.142^v73^control,201^v4^add_ask,239^v2^insert_chatgpt&utm_term=Kafka%E6%B6%88%E6%81%AF%E6%95%B0%E6%8D%AE%E7%A7%AF%E5%8E%8B%20&spm=1018.2226.3001.4187

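Kafka tuning

Maximum size of a single Kafka message

Why Kafka is fast

  • 1. Disk read/write principles

  • 2. Page cache plus mmap

  • 3. Zero-copy

  • 4. Storage design

  • 5. Batched reads and writes

  • 6. Batch compression

  • 7. Message write path

  • 8. Message read path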

https://blog.csdn.net/feiying0canglang/article/details/120495496?ops_request_misc=&request_id=&biz_id=102&utm_term=Kafka%E4%B8%BA%E4%BB%80%E4%B9%88%E5%BF%AB&utm_medium=distribute.pc_search_result.none-task-blog-2~all~sobaiduweb~default-3-120495496.nonecase&spm=1018.2226.3001.4187
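
Points 5 and 6 in the list above (batching and batch compression) can be illustrated with a small standalone sketch. It does not use Kafka or any Kafka client API; the payload shape, the 4-byte length prefix, and all function names are assumptions made for illustration only. The idea it shows is that compressing many similar records together lets the compressor exploit redundancy across records, which per-record compression cannot do.

```python
import json
import zlib


def make_messages(n):
    """Build n small, similar JSON log records (hypothetical payload shape)."""
    return [
        json.dumps({"user_id": i % 50, "event": "click", "page": "/home"}).encode("utf-8")
        for i in range(n)
    ]


def size_compressed_individually(messages):
    """Total bytes when every record is compressed on its own."""
    return sum(len(zlib.compress(m)) for m in messages)


def pack_batch(messages):
    """Concatenate records with a 4-byte length prefix, then compress once."""
    raw = b"".join(len(m).to_bytes(4, "big") + m for m in messages)
    return zlib.compress(raw)


def unpack_batch(compressed):
    """Reverse pack_batch: decompress once, then split on the length prefixes."""
    raw = zlib.decompress(compressed)
    messages, pos = [], 0
    while pos < len(raw):
        length = int.from_bytes(raw[pos:pos + 4], "big")
        pos += 4
        messages.append(raw[pos:pos + length])
        pos += length
    return messages


if __name__ == "__main__":
    msgs = make_messages(1000)
    print("per-record compression:", size_compressed_individually(msgs), "bytes")
    print("batch compression:     ", len(pack_batch(msgs)), "bytes")
```

In a real deployment the equivalent behavior would normally come from producer settings rather than hand-written code, so treat this only as a way to see why batching helps.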

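Custom UDF and UDTF functions

Hive tuning

Ways to handle data skew in Hive

Users active on 3 consecutive days within 7 days

1-, 7-, and 30-day metrics

Apportionment

Notes

Sqoop: null values, consistency, data skew

What to do when an Azkaban job fails?

Azkaban failure alerting

Spark data skew

Spark tuning

Exactly-once consumption in Spark Streaming

Why Spark is less stable than Hive

Flink data skew

Flink watermarks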
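
One common way to answer this question is the "date minus rank" trick: within a user's sorted, de-duplicated active days, consecutive days all map to the same value when the row rank is subtracted from the date. The sketch below is a minimal Python version of that idea, assuming "3 consecutive times" means 3 consecutive calendar days and that the 7-day window ends on a given day (inclusive). The function name, parameters, and input shape are hypothetical.

```python
from datetime import date, timedelta


def users_with_consecutive_days(events, end_day, window_days=7, run_length=3):
    """Return user ids with at least run_length consecutive active days
    inside the window_days-long window that ends on end_day (inclusive).

    events is an iterable of (user_id, day) pairs, where day is a datetime.date.
    """
    start_day = end_day - timedelta(days=window_days - 1)

    # Collect distinct active days per user, restricted to the window.
    days_by_user = {}
    for user_id, day in events:
        if start_day <= day <= end_day:
            days_by_user.setdefault(user_id, set()).add(day)

    result = set()
    for user_id, days in days_by_user.items():
        # Consecutive days share the same (day - rank) anchor value.
        run_sizes = {}
        for rank, day in enumerate(sorted(days)):
            anchor = day - timedelta(days=rank)
            run_sizes[anchor] = run_sizes.get(anchor, 0) + 1
        if max(run_sizes.values()) >= run_length:
            result.add(user_id)
    return result


if __name__ == "__main__":
    sample = [
        ("a", date(2023, 3, 1)), ("a", date(2023, 3, 2)), ("a", date(2023, 3, 3)),
        ("b", date(2023, 3, 1)), ("b", date(2023, 3, 3)), ("b", date(2023, 3, 5)),
    ]
    print(users_with_consecutive_days(sample, end_day=date(2023, 3, 7)))
```

The same grouping can be written in Hive SQL with a window rank and a date subtraction; the Python form is used here only to make the logic easy to run and check.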

https://blog.csdn.net/mynameisgt/article/details/124205582?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522167863162316800192275657%2522%252C%2522scm%2522%253A%252220140713.130102334..%2522%257D&request_id=167863162316800192275657&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~sobaiduend~default-2-124205582-null-null.142^v73^control,201^v4^add_ask,239^v2^insert_chatgpt&utm_term=Flink%E6%B0%B4%E4%BD%8D%E7%BA%BF%20&spm=1018.2226.3001.4187

Flink dimension-table joins when the table is too large (LRU eviction):

https://zhuanlan.zhihu.com/p/266638799?utm_id=0
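
The LRU idea mentioned above can be sketched independently of Flink: keep only the most recently used dimension rows in memory and fall back to the external store on a miss. The class below is a minimal illustration, not Flink API code. The `loader` callback stands in for a lookup against the real dimension store, and the class name, `capacity`, and the hit/miss counters are assumptions made for this example.

```python
from collections import OrderedDict


class LruDimCache:
    """Bounded cache for dimension rows with least-recently-used eviction."""

    def __init__(self, loader, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._loader = loader          # hypothetical lookup into the external store
        self._capacity = capacity
        self._entries = OrderedDict()  # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._entries:
            # Mark the entry as most recently used.
            self._entries.move_to_end(key)
            self.hits += 1
            return self._entries[key]

        # Miss: load from the external store and cache the result.
        self.misses += 1
        value = self._loader(key)
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)
        return value


if __name__ == "__main__":
    dim_table = {1: "cn", 2: "us", 3: "jp"}  # stand-in for the external dimension table
    cache = LruDimCache(dim_table.get, capacity=2)
    for key in (1, 2, 1, 3, 2):
        cache.get(key)
    print("hits:", cache.hits, "misses:", cache.misses)
```

A production version would usually also expire entries after a time limit so that changes in the dimension table eventually become visible; that is left out here to keep the eviction logic in focus.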

Flink backpressure:

https://blog.csdn.net/young_0609/article/details/123438764?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522167860573216782428633478%2522%252C%2522scm%2522%253A%252220140713.130102334..%2522%257D&request_id=167860573216782428633478&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~sobaiduend~default-3-123438764-null-null.142^v73^control,201^v4^add_ask,239^v2^insert_chatgpt&utm_term=Flink%E5%8F%8D%E5%8E%8B&spm=1018.2226.3001.4449

Flink process functions

Flink SQL

Flink multi-stream joins

https://blog.csdn.net/weixin_45366499/article/details/115208985

Exactly-once consumption in Flink

Flink GC

Elasticsearch: use cases and internals

OLAP engine comparison:

https://blog.csdn.net/weixin_52672149/article/details/119248932?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522167862560016800215038718%2522%252C%2522scm%2522%253A%252220140713.130102334.pc%255Fall.%2522%257D&request_id=167862560016800215038718&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~first_rank_ecpm_v1~rank_v31_ecpm-2-119248932-null-null.142^v73^control,201^v4^add_ask,239^v2^insert_chatgpt&utm_term=elasticsearch%20%20%E5%9C%A8olap%E5%BA%94%E7%94%A8&spm=1018.2226.3001.4187

ClickHouse internals

ClickHouse use cases

https://blog.csdn.net/huzechen/article/details/106030627?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522167862456216800211577110%2522%252C%2522scm%2522%253A%252220140713.130102334.pc%255Fall.%2522%257D&request_id=167862456216800211577110&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~first_rank_ecpm_v1~rank_v31_ecpm-6-106030627-null-null.142^v73^control,201^v4^add_ask,239^v2^insert_chatgpt&utm_term=clickhouse%20%E9%80%82%E7%94%A8%E5%9C%BA%E6%99%AF&spm=1018.2226.3001.4187

MySQL primary-replica replication lag

https://blog.csdn.net/KIMTOU/article/details/125033199

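Overall advertising data flow:

DSP creates the ad campaign plan -> bidding system bids automatically -> delivery side serves the ad -> over-delivery traffic control -> server side collects logs -> anti-fraud tagging -> billing system charges -> data warehouse persists the logs, layers them, and aggregates ad clicks, impressions, and spend -> delivery results are shown to advertisers and to each department's business teams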

