Data Science for Business

Foster Provost, Leonard N. Stern School of Business, New York University, New York, NY

Tom Fawcett, Silicon Valley Data Science, Mountain View, CA, USA

What you need to know about data mining and data-analytic thinking

 

Design

The book builds up the reader's understanding of data science by discussing the fundamental principles in the context of business examples, and then shows specifically how the principles can provide understanding of many of the most common methods and techniques used in data science.  After reading the book, the reader should be able to (i) discuss data science intelligently with data scientists and with other stakeholders, (ii) better understand proposals for data science projects and data science investments, and (iii) participate integrally in data science projects. 


As one example, a fundamental principle of data science is that solutions for extracting useful knowledge from data must carefully consider the problem from the business perspective.  This may sound obvious at first, but the notion underlies many choices that must be made in the process of data analytics, including problem formulation, method choice, solution evaluation, and general strategy formulation.  Another fundamental principle is that some data items can give us information about other data items.  This principle manifests itself throughout data science: in the basic notion of finding “correlations” among variables, in the specific design of many particular data mining procedures, and more generally as the basis for all predictive analytics. 


Audience

Data Science for Business is intended for business people who will be managing or working with data scientists, for developers who will be implementing data science solutions, and for aspiring data scientists.  By its very nature the material is somewhat technical: the goal is to really understand data science, not to give a high-level overview.  However, the book does not presume a sophisticated mathematical background, relegating the few technical details to optional "starred" sections.


More info

The table of contents is available on this page, and some advance reviews are available.

 

Data Science for Business by Foster Provost and Tom Fawcett is intended for those who need to understand data science/data mining, and those who want to develop their skill at data-analytic thinking.  Data Science for Business is not a book about algorithms.  Instead it presents a set of fundamental principles for extracting useful knowledge from data.  These fundamental principles are the foundation for many algorithms and techniques for data mining, but also underlie the processes and methods for approaching business problems data-analytically, evaluating particular data science solutions, and evaluating general data science plans.

About the authors


Foster Provost is a Professor and NEC Faculty Fellow at the NYU Stern School of Business, where he has taught data science to MBAs for 15 years. Previously, he worked as a data scientist for what's now Verizon for five years, winning a President's Award for his work there. Professor Provost's research and teaching focus on data science, machine learning, business analytics, (social) network data, and crowd-sourcing for data analytics. He was Editor-in-Chief of the journal Machine Learning from 2004 to 2010 and was Program Chair of the premier data science conference in 2001. Professor Provost has worked with companies large and small on improving their data science capabilities.  He has collaborated with AT&T, IBM, and others, and he has founded several data-science-based companies focused on modeling consumer behavior data, especially for marketing and advertising applications.  His prior work applied and extended data science methods to business applications including fraud detection, counterterrorism, network diagnosis, and more. Professor Provost’s work has won (among others) IBM Faculty Awards, the aforementioned President's Award, Best Paper awards at KDD, including the 2012 Best Industry Paper, and the INFORMS Design Science Award.


Tom Fawcett is a Principal Data Scientist at Silicon Valley Data Science.  He is an active member of the machine learning and data mining communities.  He has a Ph.D. in machine learning from UMass-Amherst and has worked in industrial research (GTE Laboratories, NYNEX/Verizon Labs, HP Labs, etc.).  In his career he has published numerous conference and journal papers in machine learning.  He has just completed a five-year term as action editor of the Machine Learning journal, before which he was an editorial board member. In 2003 he co-chaired the program of the premier machine learning conference (ICML), and he has organized many workshops and journal special issues. He received a Best Paper Award from KDD, a SCOPUS Award (most cited paper) from Pattern Recognition Letters, and a President's Award from Verizon.

Data Science for Business, listed as required reading by Fortune and Harvard University, is already being used as a textbook by over 150 universities in 26 countries around the world, in a wide variety of programs, including MBA, MS in Business Analytics, MS in Data Science, Business Undergraduate, Executive Education, and Arts & Science Undergraduate.

We adapted the first chapter of our book into an article, Data Science and its Relationship to Big Data and Data-Driven Decision Making, published in the inaugural issue of Big Data Journal.  It is freely available here.

“A must-read resource for anyone who is serious about embracing the opportunity of big data.”

— Craig Vaughan, Global Vice President, SAP


“There is no other book on practical data science for business applications that simultaneously has as much authority and as much clarity. Students cannot stop raving about this book.”

— Prof. Sinan Aral, MIT Sloan School of Management