Doing Data Science

Cathy O'Neil, Rachel Schutt

Publication date

2013-11-03

ISBN

9781449358655

Rating

★★★★★
Book description
Now that answering complex and compelling questions with data can make the difference in an election or a business model, data science is an attractive discipline. But how can you learn this wide-ranging, interdisciplinary field? With this book, you’ll get material from Columbia University’s "Introduction to Data Science" class in an easy-to-follow format. Each chapter-long lecture features a guest data scientist from a prominent company such as Google, Microsoft, or eBay teaching new algorithms, methods, or models by sharing case studies and actual code they use. You’ll learn what’s involved in the lives of data scientists and be able to use the techniques they present.
Guest lectures focus on topics such as:
  • Machine learning and data mining algorithms
  • Statistical models and methods
  • Prediction vs. description
  • Exploratory data analysis
  • Communication and visualization
  • Data processing
  • Big data
  • Programming
  • Ethics
  • Asking good questions
If you’re familiar with linear algebra, probability and statistics, and have some programming experience, this book will get you started with data science.
Doing Data Science is a collaboration between course instructor Rachel Schutt (also employed by Google) and data science consultant Cathy O’Neil (former quantitative analyst for D.E. Shaw), who attended and blogged about the course.
Notable excerpts
  • "Exploratory data analysis Visualization (for exploratory data analysis and reporting) Dashboards and metrics Find business insights Data-driven decision making Data engineering/Big Data (Mapreduce, Hadoop, Hive, and Pig) Get the data themselves Build data pipelines (logs→mapreduce→dataset→join with "
  • "Being humanist in the context of data science means recognizing the role your own humanity plays in building models and algorithms, thinking about qualities you have as a human that a computer does not have (which includes the ability to make ethical decisions), and thinking about the humans whose l"
About the authors
Cathy O’Neil earned a Ph.D. in math from Harvard, was a postdoc in the MIT math department, and was a professor at Barnard College, where she published a number of research papers in arithmetic algebraic geometry. She then chucked it and switched over to the private sector. She worked as a quant for the hedge fund D.E. Shaw in the middle of the credit crisis, and then for RiskMetrics, a risk software company that assesses risk for the holdings of hedge funds and banks. She is currently a data scientist on the New York start-up scene, writes a blog at mathbabe.org, and is involved with Occupy Wall Street.
Rachel Schutt is a Senior Research Scientist at Johnson Research Labs, and most recently was a Senior Statistician at Google Research in the New York office. She is also an adjunct assistant professor in the Department of Statistics at Columbia University, where she taught Introduction to Data Science. She earned a Ph.D. in statistics from Columbia University, and master’s degrees in mathematics and operations research from the Courant Institute and Stanford University, respectively. Her statistical research interests include modeling and analyzing social networks, epidemiology, hierarchical modeling, and Bayesian statistics. Her education-related research interests include curriculum design. Rachel enjoys designing and creating complex, thought-provoking situations for other people. She won the Howard Levene Outstanding Teaching Award at Columbia and also taught probability and statistics at Cooper Union, and remedial math as a high school teacher in San Jose, CA. She was a mathematics curriculum expert for the Princeton Review, and won a game design award for best family game at the Come Out and Play Festival in New York.
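User reviews
I now have a rough idea of what data science is.
It covers a lot of ground, and the writing is concise and easy to understand.
Explaining data science clearly in a 400-page book is a tall order. Still, someone had to be the first to write a book treating data science as a complete subject, right?
Various data scientists share their own first-hand experience, which I found quite useful.
Built around case studies and told by front-line practitioners, it works well as an introduction. BTW, the typography and layout are nice.
When I was in college, we invited Cathy O'Neil to give us a lecture.
A very good introductory book.
This one is plenty as a general-audience primer.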