High Performance Spark - Holden Karau, Rachel Warren

Publication date

2016-07-25

ISBN

9781491943205

Rating

★★★★★
Notable excerpts
  • "From a code readability perspective, this solution is ugly. It requires dozens of lines of code and four passes through the data. However, we expect that it will avoid memory errors on the executors and complete faster than the `groupByKey` or secondary sort solutions. This is because the data on ea"
User reviews
Genuinely awesome.
The most comprehensive and useful book explaining Spark that I have seen so far.
Brief.
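Read it on and off over two weeks and learned quite a lot.
Chapters 2 through 6 are decent.
I think it is slightly weaker than Spark: The Definitive Guide, but the Goldilocks example in Chapter 6 is good.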
A really good book; it is a pity it was introduced in China a bit late. I hope the next edition has more content on DataFrame/Dataset.
Good for leveling up and filling gaps in what you know.
Bookmarked.