Stream Processing with Apache Spark

Gerard Maas

Publisher

O'Reilly Media

Publication date

2018-07-25

ISBN

9781491944240

Rating

★★★★★

Book description

To build analytics tools that provide faster insights, you need to know how to process data in real time, which means moving from batch processing to stream processing. Fortunately, Apache Spark, the in-memory data processing framework, includes an extension devoted to fault-tolerant stream processing: Spark Streaming.

If you're familiar with Apache Spark and want to learn how to implement it for streaming jobs, this practical book is a must.

Understand how Spark Streaming fits in the big picture

Learn core concepts such as Spark RDDs, Spark Streaming clusters, and the fundamentals of a DStream

Discover how to create a robust deployment

Dive into streaming algorithmics

Learn how to tune, measure, and monitor Spark Streaming

About the Author

François Garillot worked on Scala's type system in 2006, earned his PhD from the French École Polytechnique in 2011, and worked at Typesafe, after a brief stint in Internet advertising. He's worked on interactive interfaces to the Scala compiler, while nourishing a strong enthusiasm for data analytics in his spare time, until Apache Spark let him fulfill this ...

User reviews

Reasonably up to date. Given how short it is, a quick read-through is still worthwhile.

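This book is no more than adequate. It mainly covers stream processing with Spark, but unfortunately it is a 2019 book. It uses Spark 2.4, which by 2023 is already an old version. The two key topics it covers for stream processing are Spark Streaming and Structured Streaming. From Spark 3 onward, the emphasis has shifted toward the latter, yet the book gives the two almost equal coverage. In 2019 there was admittedly no way to foresee how Spark 3 would develop. I recommend reading books on Spark 3 and on stream processing with Spark 3 instead.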
