Veridical Data Science


Book by Bin Yu and Rebecca L. Barter: “Most textbooks present data science as a linear analytic process involving a set of statistical and computational techniques without accounting for the challenges intrinsic to real-world applications. Veridical Data Science, by contrast, embraces the reality that most projects begin with an ambiguous domain question and messy data; it acknowledges that datasets are mere approximations of reality while analyses are mental constructs.
Bin Yu and Rebecca Barter employ the innovative Predictability, Computability, and Stability (PCS) framework to assess the trustworthiness and relevance of data-driven results relative to three sources of uncertainty that arise throughout the data science life cycle: the human decisions and judgment calls made during data collection, cleaning, and modeling. By providing real-world data case studies, intuitive explanations of common statistical and machine learning techniques, and supplementary R and Python code, Veridical Data Science offers a clear and actionable guide for conducting responsible data science. Requiring little background knowledge, this lucid, self-contained textbook provides a solid foundation and principled framework for future study of advanced methods in machine learning, statistics, and data science…”