Comprehensible Knowledge Discovery From Data

Dr. Michael Pazzani

Thursday, October 30, 2008, 12:00pm - 07:00pm

Rutgers University, Office of VP for Research and Graduate and Professional Education and the Department of Computer Sci

Knowledge discovery in databases is a field whose goal is to turn data into knowledge. For example, by analyzing a database of credit card customers, we can determine which types of customers are most likely to be profitable for the company. By "mining" databases of medical records, we may uncover new cost-effective procedures for screening for diseases. We review advances from the past two decades of research in statistics, neural networks, and artificial intelligence that have produced a variety of approaches for building accurate descriptive or predictive models. However, we show that experts are unwilling to accept the results of these techniques when those results don't make sense, are difficult to understand, or violate prior understanding. We discuss the factors that make learned knowledge acceptable to experts and describe modifications to rule learning, linear regression, and text classification algorithms that make the learned models more comprehensible.

