A list of the best data science books. This collection has something to offer even the most experienced data scientist.
1) Hadoop: The Definitive Guide: Storage and Analysis at Internet Scale, by Tom White
Hadoop is mostly written in Java, but that doesn’t exclude the use of other programming languages, particularly Python, with this distributed storage and processing framework. With this concise book, you’ll learn how to use Python with the Hadoop Distributed File System (HDFS), MapReduce, the Apache Pig platform and Pig Latin script, and the Apache Spark cluster-computing framework.
2) Doing Data Science: Straight Talk from the Frontline, by Cathy O’Neil and Rachel Schutt
Every data scientist should keep this book nearby. It is written with the hope that it will find its way into the hands of someone (you?) who will make even more of it than what it is, and go on to solve important problems.
3) R Programming for Data Science, by Roger Peng
You’ll learn the fundamentals of programming with R, from reading and writing data to customization, visualization and predictive analysis.