What is data science?

What Data Science (and Predictive Analytics) Can Do for You

Data Science (which includes Predictive Analytics and Machine Learning), while not new, is more actionable than ever. It enables you to use all your data to make classifications and predictions on a case-by-case basis. For example, instead of an overall sales forecast, you can predict each individual customer’s likelihood of buying (or of “churning”). Wikipedia's article on Data Science offers a general description of the field.

With Data Science plus simple business rules (such as “is the churn score > 0.6? Then send a marketing offer”), you can impact business results. But Data Science doesn’t address complex decisions such as “how do I allocate my marketing budget over email, online ads, social media and more?” or “how do I manage inventory and reorders to minimize stock-outs and holding cost?”  (Management science does.)
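To make the "score plus business rule" idea concrete, here is a minimal Python sketch. It assumes a predictive model has already produced a churn score between 0 and 1 for each customer; the `Customer` class, the customer IDs and the scores are made up for illustration, and only the 0.6 cutoff comes from the text above.

```python
from dataclasses import dataclass

CHURN_THRESHOLD = 0.6  # cutoff quoted in the text above


@dataclass
class Customer:
    customer_id: str    # hypothetical identifier
    churn_score: float  # assumed to be a model output in [0, 1]


def customers_to_offer(customers, threshold=CHURN_THRESHOLD):
    """Return the IDs of customers whose churn score exceeds the threshold."""
    return [c.customer_id for c in customers if c.churn_score > threshold]


# Illustrative data only: these scores are invented, not model output.
customers = [
    Customer("A-001", 0.82),
    Customer("A-002", 0.35),
    Customer("A-003", 0.60),  # exactly at the cutoff: no offer, the rule is strictly ">"
    Customer("A-004", 0.71),
]

print(customers_to_offer(customers))
```

Note that the rule treats each customer independently. It says nothing about how much budget the offers consume in total, which is the kind of question the rest of this paragraph assigns to management science.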

In the past, managers couldn’t answer both kinds of questions in an integrated way. But now you can, if you have the right analytical and model-building tools.

Data Driven, Machine-Generated Models

The models used in data science are standard mathematical forms -- such as linear or logistic regression, or neural networks -- 'trained' or fitted to your past data by machine learning algorithms.  Human expertise is needed to ask the right questions, and select the right data to be fed into machine learning, but the models are automatically generated. (In contrast, management science models describe the structure of some part of your present or future business -- they require a human modeler with business domain knowledge, model-building expertise and tools.)
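As a rough illustration of what "fitted to your past data" means, the sketch below trains a one-variable logistic regression by plain gradient descent, using only the Python standard library. The data (months of inactivity and whether the customer churned) is invented, and the learning rate and iteration count are arbitrary choices; production tools use more robust fitting algorithms than this.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w*x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # prediction error for this case
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Made-up "past data": months since last purchase, and whether the customer churned.
months_inactive = [1, 2, 3, 4, 5, 6, 7, 8]
churned = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = fit_logistic(months_inactive, churned)


def predict(months):
    """Estimated churn probability for a single new case."""
    return sigmoid(w * months + b)


for m in (1, 4, 8):
    print(m, round(predict(m), 2))
```

The human contribution here is choosing the question (will this customer churn?) and the input column; the coefficients `w` and `b` are produced by the algorithm.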

Data Systems and Languages

How are data science models written and tested?  There are now many tools available for data science -- indeed, the field is crowded, with end-user offerings from IBM, Microsoft, SAS, Oracle, RapidMiner and KNIME, developer tools from Google, Amazon, Intel, Microsoft and others, open-source tools such as R and Python libraries, and billions of dollars in venture funding for data science startups in the past few years.

  • Diagram editors are the most common end-user tool, used to construct a multi-stage workflow that includes data "wrangling" steps, training and validating a machine learning model, and applying the model to new data. Nodes represent operations, and arrows indicate the flow of data -- normally a table where each row is an individual case -- from stage to stage.
  • SDKs / Object Libraries are popular among developers programming in R or Python.  Compared to using a diagram editor, more skill, time and effort are needed to create an entire application in code.  SDKs and Web APIs offer flexibility for deploying models in production; many vendors support PMML (Predictive Modeling Markup Language) as a way to deploy models.
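The multi-stage workflow described above (wrangle, train, validate, apply) can also be written directly in code. The following Python sketch is illustrative only: the function names, the tiny dataset and the deliberately trivial "cutoff" model are all assumptions, and it does not use any vendor's SDK. Each function plays the role of one node in a diagram, and the table passed between them is a list of rows, one per case.

```python
def wrangle(rows):
    """Data "wrangling" stage: drop rows with missing values and coerce types."""
    return [
        {"visits": int(r["visits"]), "bought": int(r["bought"])}
        for r in rows
        if r.get("visits") is not None and r.get("bought") is not None
    ]


def train(rows):
    """Training stage: a trivial model, a cutoff halfway between the class means."""
    buyers = [r["visits"] for r in rows if r["bought"] == 1]
    others = [r["visits"] for r in rows if r["bought"] == 0]
    cutoff = (sum(buyers) / len(buyers) + sum(others) / len(others)) / 2
    return {"cutoff": cutoff}


def apply_model(model, visits):
    """Scoring stage: classify one new case."""
    return 1 if visits > model["cutoff"] else 0


def validate(model, rows):
    """Validation stage: fraction of held-out rows classified correctly."""
    hits = sum(apply_model(model, r["visits"]) == r["bought"] for r in rows)
    return hits / len(rows)


# Invented data: site visits per customer and whether they bought.
raw = [
    {"visits": 1, "bought": 0}, {"visits": 2, "bought": 0},
    {"visits": None, "bought": 1},  # missing value, removed by wrangle()
    {"visits": 8, "bought": 1}, {"visits": 9, "bought": 1},
    {"visits": 3, "bought": 0}, {"visits": 7, "bought": 1},
]

clean = wrangle(raw)
train_rows, valid_rows = clean[:4], clean[4:]  # simple split for the sketch
model = train(train_rows)
accuracy = validate(model, valid_rows)
new_predictions = [apply_model(model, v) for v in (2, 10)]

print(model, accuracy, new_predictions)
```

Even for a toy case like this, the code version takes more care than dragging four nodes onto a canvas, which is the trade-off the second bullet describes.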

Frontline offers a comprehensive platform for data science and machine learning that's easy to learn and use in Microsoft Excel and/or our RASON® modeling language. Our XLMiner SDK (Software Development Kit) is a rich object library for C++, C#, Java, R and Python, with full PMML support.  What distinguishes Frontline's platform from others is its integrated, comprehensive support for management science -- making complex decisions using optimization, simulation, and similar methods.

Where other platforms seek to "tack on" management science methods, by simply offering "nodes with R code" or a simple linear programming solver that assumes you have a "coefficient matrix", Frontline's platform is designed from the ground up to support complex multi-faceted resource allocation decisions, as well as simple "one case at a time" business rule decisions.