Machine Learning in Action - Peter Harrington

Machine Learning in Action

Peter Harrington

Publication date

2012-04-18

ISBN

9781617290183

Rating

★★★★★
Book description

It's been said that data is the new "dirt"—the raw material from which and on which you build the structures of the modern world. And like dirt, data can seem like a limitless, undifferentiated mass. The ability to take raw data, access it, filter it, process it, visualize it, understand it, and communicate it to others is possibly the most essential business problem for the coming decades.

"Machine learning," the process of automating tasks once considered the domain of highly-trained analysts and mathematicians, is the key to efficiently extracting useful information from this sea of raw data. By implementing the core algorithms of statistical data processing, data analysis, and data visualization as reusable computer code, you can scale your capacity for data analysis well beyond the capabilities of individual knowledge workers.

Machine Learning in Action is a unique book that blends the foundational theories of machine learning with the practical realities of building tools for everyday data analysis. In it, you'll use the flexible Python programming language to build programs that implement algorithms for data classification, forecasting, recommendations, and higher-level features like summarization and simplification.

As you work through the numerous examples, you'll explore key topics like classification, numeric prediction, and clustering. Along the way, you'll be introduced to important established algorithms, such as Apriori, through which you identify association patterns in large datasets, and AdaBoost, a meta-algorithm that can increase the efficiency of many machine learning tasks.
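To make "association patterns" concrete, here is a minimal sketch of Apriori-style frequent-itemset mining. It is not the book's code: the function name `frequent_itemsets`, the `min_support` parameter, and the grocery transactions are all assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support=0.5):
    """Return {itemset: support} for every itemset meeting min_support.

    Hypothetical helper for illustration; not taken from the book.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in `itemset`.
        return sum(itemset <= t for t in transactions) / n

    # Level 1 candidates: every single item that appears anywhere.
    current = {frozenset([item]) for t in transactions for item in t}
    result = {}
    k = 1
    while current:
        # Keep only the candidates that are frequent enough.
        frequent = {}
        for candidate in current:
            s = support(candidate)
            if s >= min_support:
                frequent[candidate] = s
        result.update(frequent)
        k += 1
        # Join frequent sets into size-k candidates, then prune any candidate
        # that has an infrequent (k-1)-subset (the Apriori principle).
        joined = {a | b for a in frequent for b in frequent if len(a | b) == k}
        current = {
            c for c in joined
            if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
        }
    return result


# Toy transactions, invented for this example.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]
itemsets = frequent_itemsets(baskets, min_support=0.5)
```

The pruning step is what keeps the search tractable: an itemset can only be frequent if all of its subsets are, so most large candidates are discarded without ever scanning the data.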



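AI reading guide
Key highlights
  • Implements classic machine learning algorithms in Python code
  • Covers the core of classification, regression, and unsupervised learning
  • Emphasizes building code from the ground up to understand how the algorithms work
Who should read it
  • Beginning programmers with a Python foundation
  • Practitioners who want to get started with machine learning quickly
  • Readers who prefer to understand theory through code examples
Before you read
  • Check Python version compatibility
  • Supplement with mathematical theory for a deeper understanding
  • Debug alongside the source code to master the details
Reader consensus
  • An excellent hands-on introduction to machine learning
  • The code implementations help build an intuitive grasp of the concepts
  • The mathematical theory is thin and needs to be paired with other textbooks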

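This guide was generated from the book's description, table of contents, excerpts, short comments, and reviews; it is not a substitute for a close reading of the full text.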

Notable excerpts
  • "Pros: High accuracy, insensitive to outliers, no assumptions about data Cons: Computationally expensive, requires a lot of memory Works with: Numeric values, nominal values The first machine-learning algorithm we’ll look at is k-Nearest Neighbors (kNN). It works like this: we have an existing set of"
  • "Pros: Computationally cheap to use, easy for humans to understand learned results, missing values OK, can deal with irrelevant features Cons: Prone to overfitting Works with: Numeric values, nominal values"
  • "General approach to decision trees 1. Collect: Any method. 2. Prepare: This tree-building algorithm works only on nominal values, so any continuous values will need to be quantized. 3. Analyze: Any method. You should visually inspect the tree after it is built. 4. Train: Construct a tree data struct"
  • "Logistic regression Pros: Computationally inexpensive, easy to implement, knowledge representation easy to interpret Cons: Prone to underfitting, may have low accuracy Works with: Numeric values, nominal values"
  • "The clear syntax of Python has earned it the name executable pseudo-code."
  • "With Python, you can program in any style you’re familiar with: object-oriented, procedural, functional, and so on."
  • "With Python it’s easy to process and manipulate text"
  • "Python is popular in the scientific and financial communities as well. A number of scientific libraries such as SciPy and NumPy allow you to do vector and matrix operations."
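The k-Nearest Neighbors excerpt above is cut off before it describes the method. As a rough illustration of the general idea (label a new point by majority vote among the closest labeled points), here is a minimal standard-library sketch. It is not the book's implementation, which works with NumPy; the function name `knn_classify`, the Euclidean distance choice, and the toy data are assumptions. `math.dist` requires Python 3.8 or later.

```python
import math
from collections import Counter


def knn_classify(query, points, labels, k=3):
    """Label `query` by majority vote among its k nearest labeled points.

    Hypothetical helper for illustration; not taken from the book.
    """
    # Euclidean distance from the query to every labeled point.
    distances = [
        (math.dist(query, point), label)
        for point, label in zip(points, labels)
    ]
    # Keep the k closest points and count how often each label occurs.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data, invented for this example.
points = [(1.0, 1.1), (1.0, 1.0), (0.0, 0.0), (0.0, 0.1)]
labels = ["A", "A", "B", "B"]
print(knn_classify((0.1, 0.2), points, labels, k=3))
```

Every prediction scans the whole training set, which matches the "computationally expensive, requires a lot of memory" drawback quoted in the excerpt.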
About the author
Peter Harrington holds bachelor's and master's degrees in electrical engineering. He worked for Intel Corporation for seven years in California and China. Peter holds five US patents, and his work has been published in three academic journals. He is currently the chief scientist for Zillabyte Inc. Peter spends his free time competing in programming competitions and building 3D printers.
Table of contents
Part 1: Classification
1 Machine learning basics
2 Classifying with k-nearest neighbors
3 Splitting datasets one feature at a time: decision trees
4 Classifying with probability distributions: Naïve Bayes

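User reviews
An accessible introduction to Python and machine learning. Very interesting.
A superb introductory book; many concepts that used to be fuzzy to me became clear through the book's examples.
So-so.
Skimmed through it as a refresher on Python and the related libraries. Suitable for beginners.
An introductory book, with loads of Python code.
Books from industry practitioners are down to earth.
Chapter 3, on decision trees, promised to test a decision tree on a real example, but there was no testing content at all. Disappointing.
Just average; a simple introductory book.
This book suits readers who are strong on theory but weak in practice. The theoretical treatment of the algorithms is nearly useless; some algorithms are not explained clearly at all, and you have to look up other material to learn them. A practice-oriented book can skip formula derivations, but is it too much to ask that the formulas that do appear be described clearly? Also, machine learning has advanced rapidly in recent years and the book is clearly somewhat dated: its code is Python 2, and many of the modules it requires have no Python 3 version. Some examples scrape data from websites that no longer exist. Some algorithms that have become popular in recent years, such as deep learning, are not covered. Still, as an introduction for newcomers the book is very readable, and it gives a sense of the basic problems, basic algorithms, and overall landscape of machine learning. Above all, you can pick up some of the author's good coding habits.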
Algorithms