Key Project Questions
Interview cheat sheet:
https://blog.csdn.net/Mrerlou/article/details/114295888?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522167863142316800186566694%2522%252C%2522scm%2522%253A%252220140713.130102334..%2522%257D&request_id=167863142316800186566694&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~sobaiduend~default-1-114295888-null-null.142^v73^control,201^v4^add_ask,239^v2^insert_chatgpt&utm_term=%E9%9B%86%E7%BE%A4%E8%B5%84%E6%BA%90%E5%88%86%E9%85%8D%E5%8F%82%E6%95%B0%EF%BC%88%E9%A1%B9%E7%9B%AE%E4%B8%AD%E9%81%87%E5%88%B0%E7%9A%84%E9%97%AE%E9%A2%98%EF%BC%89%20&spm=1018.2226.3001.4187
Hadoop crashes
Handling data skew in Hadoop
https://blog.csdn.net/a934079371/article/details/109233998?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522167863132316800186547751%2522%252C%2522scm%2522%253A%252220140713.130102334.pc%255Fall.%2522%257D&request_id=167863132316800186547751&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~first_rank_ecpm_v1~rank_v31_ecpm-9-109233998-null-null.142^v73^control,201^v4^add_ask,239^v2^insert_chatgpt&utm_term=Hadoop%E8%A7%A3%E5%86%B3%E6%95%B0%E6%8D%AE%E5%80%BE%E6%96%9C%E6%96%B9%E6%B3%95%20%20%E9%9D%A2%E8%AF%95&spm=1018.2226.3001.4187
Cluster resource allocation parameters (problems encountered in projects)
Handling small files in HDFS
https://blog.csdn.net/gym02/article/details/124169185?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522167863123316800227426570%2522%252C%2522scm%2522%253A%252220140713.130102334..%2522%257D&request_id=167863123316800227426570&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~baidu_landing_v2~default-2-124169185-null-null.142^v73^control,201^v4^add_ask,239^v2^insert_chatgpt&utm_term=HDFS%E5%B0%8F%E6%96%87%E4%BB%B6%E5%A4%84%E7%90%86%20%E9%9D%A2%E8%AF%95&spm=1018.2226.3001.4187
Hadoop optimization
Hive vs. Spark comparison
Flume crashes
Flume optimization
Common Kafka questions:
https://www.cnblogs.com/erlou96/p/14401394.html
Kafka crashes
https://blog.csdn.net/huaxing_ba/article/details/125023374?utm_medium=distribute.pc_relevant.none-task-blog-2~default~baidujs_utm_term~default-1-125023374-blog-122420561.pc_relevant_landingrelevant&spm=1001.2101.3001.4242.2&utm_relevant_index=4
Kafka data loss
https://blog.csdn.net/YoungJ_Zhou/article/details/125605128?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522167861085216800226539195%2522%252C%2522scm%2522%253A%252220140713.130102334..%2522%257D&request_id=167861085216800226539195&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~baidu_landing_v2~default-4-125605128-null-null.142^v73^control,201^v4^add_ask,239^v2^insert_chatgpt&utm_term=kafka%E6%95%B0%E6%8D%AE%E4%B8%A2%E5%A4%B1&spm=1018.2226.3001.4187
Kafka duplicate data
https://blog.csdn.net/feiying0canglang/article/details/120514976?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522167861119616800225535554%2522%252C%2522scm%2522%253A%252220140713.130102334..%2522%257D&request_id=167861119616800225535554&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~sobaiduend~default-2-120514976-null-null.142^v73^control,201^v4^add_ask,239^v2^insert_chatgpt&utm_term=kafka%20%E6%95%B0%E6%8D%AE%E9%87%8D%E5%A4%8D&spm=1018.2226.3001.4187
Kafka message backlog
https://blog.csdn.net/y277an/article/details/117828798?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522167862968516800227442646%2522%252C%2522scm%2522%253A%252220140713.130102334..%2522%257D&request_id=167862968516800227442646&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~sobaiduend~default-1-117828798-null-null.142^v73^control,201^v4^add_ask,239^v2^insert_chatgpt&utm_term=Kafka%E6%B6%88%E6%81%AF%E6%95%B0%E6%8D%AE%E7%A7%AF%E5%8E%8B%20&spm=1018.2226.3001.4187
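Kafka optimization
Size of a single log message sent through Kafka
Why Kafka is fast
1. How disk reads and writes work
2. Use of the page cache + mmap
3. Zero-copy
4. Storage design
5. Batched reads and writes
6. Batch compression
7. Message write path
8. Message read path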
https://blog.csdn.net/feiying0canglang/article/details/120495496?ops_request_misc=&request_id=&biz_id=102&utm_term=Kafka%E4%B8%BA%E4%BB%80%E4%B9%88%E5%BF%AB&utm_medium=distribute.pc_search_result.none-task-blog-2~all~sobaiduweb~default-3-120495496.nonecase&spm=1018.2226.3001.4187
Custom UDF and UDTF functions
Hive optimization
Handling data skew in Hive
Active 3 consecutive times within 7 days
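A minimal sketch of the check behind this metric, written in Python rather than Hive SQL. It assumes that "3 consecutive times" means activity on 3 consecutive calendar days, and that the 7-day window ends on a given date (inclusive). The function name and parameters are hypothetical, chosen only for illustration.

```python
from datetime import date, timedelta


def has_consecutive_active_days(active_dates, window_end, window_days=7, run_length=3):
    """Return True if there are `run_length` consecutive active days in the window.

    Assumption: the window covers `window_days` calendar days ending on
    `window_end` (inclusive); several activity records on one day count once.
    """
    window_start = window_end - timedelta(days=window_days - 1)
    # Deduplicate and keep only the days that fall inside the window.
    days = sorted({d for d in active_dates if window_start <= d <= window_end})

    streak = 0
    previous = None
    for d in days:
        if previous is not None and d - previous == timedelta(days=1):
            streak += 1   # extends the current run of consecutive days
        else:
            streak = 1    # a gap (or the first day) starts a new run
        if streak >= run_length:
            return True
        previous = d
    return False
```

In a warehouse the same idea is usually expressed per user: deduplicate activity by day, then look for a run of adjacent dates inside the window.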
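1-, 7-, and 30-day metrics
Apportionment
Notes
Sqoop: null values, consistency, data skew
What to do when an Azkaban job fails?
Azkaban failure alerting
Spark data skew
Spark optimization
Spark Streaming exactly-once consumption
Why Spark is less stable than Hive
Flink data skew
Flink watermarks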
https://blog.csdn.net/mynameisgt/article/details/124205582?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522167863162316800192275657%2522%252C%2522scm%2522%253A%252220140713.130102334..%2522%257D&request_id=167863162316800192275657&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~sobaiduend~default-2-124205582-null-null.142^v73^control,201^v4^add_ask,239^v2^insert_chatgpt&utm_term=Flink%E6%B0%B4%E4%BD%8D%E7%BA%BF%20&spm=1018.2226.3001.4187
Flink dimension-table joins when the table is too large (LRU eviction):
https://zhuanlan.zhihu.com/p/266638799?utm_id=0
Flink backpressure:
https://blog.csdn.net/young_0609/article/details/123438764?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522167860573216782428633478%2522%252C%2522scm%2522%253A%252220140713.130102334..%2522%257D&request_id=167860573216782428633478&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~sobaiduend~default-3-123438764-null-null.142^v73^control,201^v4^add_ask,239^v2^insert_chatgpt&utm_term=Flink%E5%8F%8D%E5%8E%8B&spm=1018.2226.3001.4449
Flink process functions
Flink SQL
Flink multi-stream joins
https://blog.csdn.net/weixin_45366499/article/details/115208985
Flink exactly-once consumption
Flink GC
Elasticsearch: use cases and principles
OLAP engine comparison:
https://blog.csdn.net/weixin_52672149/article/details/119248932?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522167862560016800215038718%2522%252C%2522scm%2522%253A%252220140713.130102334.pc%255Fall.%2522%257D&request_id=167862560016800215038718&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~first_rank_ecpm_v1~rank_v31_ecpm-2-119248932-null-null.142^v73^control,201^v4^add_ask,239^v2^insert_chatgpt&utm_term=elasticsearch%20%20%E5%9C%A8olap%E5%BA%94%E7%94%A8&spm=1018.2226.3001.4187
ClickHouse principles
ClickHouse use cases
https://blog.csdn.net/huzechen/article/details/106030627?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522167862456216800211577110%2522%252C%2522scm%2522%253A%252220140713.130102334.pc%255Fall.%2522%257D&request_id=167862456216800211577110&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~first_rank_ecpm_v1~rank_v31_ecpm-6-106030627-null-null.142^v73^control,201^v4^add_ask,239^v2^insert_chatgpt&utm_term=clickhouse%20%E9%80%82%E7%94%A8%E5%9C%BA%E6%99%AF&spm=1018.2226.3001.4187
MySQL master-slave replication lag
https://blog.csdn.net/KIMTOU/article/details/125033199
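Overall advertising data flow:
DSP creates the ad campaign plan -> bidding system bids automatically -> delivery side serves the ad -> over-delivery traffic control -> server side collects logs -> anti-fraud tagging -> billing system charges -> data warehouse persists the logs, layers them, and aggregates ad clicks, impressions, and spend -> campaign results are shown to advertisers and to business stakeholders in each department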
