## Learning from Data: Artificial Intelligence and Statistics V

Ten years ago Bill Gale of AT&T Bell Laboratories was the primary organizer of the first Workshop on Artificial Intelligence and Statistics. In the early days of the Workshop series it seemed clear that researchers in AI and statistics had common interests, though with different emphases, goals, and vocabularies. In learning and model selection, for example, AI's historical goal of building autonomous agents probably contributed to a focus on parameter-free learning systems, which relied little on an external analyst's assumptions about the data. This seemed at odds with statistical strategy, which stemmed from a view that model selection methods were tools to augment, not replace, the abilities of a human analyst. Thus, statisticians have traditionally spent considerably more time on exploiting prior information about the environment to model data, and on exploratory data analysis methods tailored to their assumptions. In statistics, special emphasis is placed on model checking, making extensive use of residual analysis, because all models are 'wrong', but some are better than others. It is increasingly recognized that AI researchers and/or AI programs can exploit the same kind of statistical strategies to good effect. Often AI researchers and statisticians emphasized different aspects of what in retrospect we might now regard as the same overriding tasks.

Two Algorithms for Inducing Structural Equation Models from Data | 3 |

Using Causal Knowledge to Learn More Useful Decision Rules From Data | 13 |

A Causal Calculus for Statistical Research | 23 |

Likelihood-based Causal Inference | 35 |

Inference and Decision Making | 45 |

Ploxoma Testbed for Uncertain Inference | 47 |

Solving Influence Diagrams Using Gibbs Sampling | 59 |

Modeling and Monitoring Dynamic Systems by Chain Graphs | 69 |

Propagation of Gaussian Belief Functions | 79 |

On Test Selection Strategies for Belief Networks | 89 |

Representing and Solving Asymmetric Decision Problems Using Valuation Networks | 99 |

A Hill-Climbing Approach for Optimizing Classification Trees | 109 |

Search Control in Model Hunting | 119 |

Learning Bayesian Networks is NP-Complete | 121 |

Heuristic Search for Model Structure: The Benefits of Restraining Greed | 131 |

Learning Possibilistic Networks from Data | 143 |

Detecting Imperfect Patterns in Event Streams Using Local Search | 155 |

Structure Learning of Bayesian Networks by Hybrid Genetic Algorithms | 165 |

An Axiomatization of Loglinear Models with an Application to the Model-Search Problem | 175 |

Detecting Complex Dependencies in Categorical Data | 185 |

Classification | 197 |

A Comparative Evaluation of Sequential Feature Selection Algorithms | 199 |

Classification Using Bayes Averaging of Multiple Relational Rule-based Models | 207 |

Picking the Best Expert from a Sequence | 219 |

Hierarchical Clustering of Composite Objects with a Variable Number of Components | 229 |

Searching for Dependencies in Bayesian Classifiers | 239 |

General Learning Issues | 249 |

Statistical Analysis of Complex Systems in Biomedicine | 251 |

Learning in Hybrid Noise Environments Using Statistical Queries | 259 |

On the Statistical Comparison of Inductive Learning Methods | 271 |

Dynamical Selection of Learning Algorithms | 281 |

Learning Bayesian Networks Using Feature Selection | 291 |

Data Representations in Learning | 301 |

EDA Tools and Methods | 311 |

Rule Induction as Exploratory Data Analysis | 313 |

Non-Linear Dimensionality Reduction: A Comparative Performance Analysis | 323 |

Omega-Stat: An Environment for Implementing Intelligent Modeling Strategies | 333 |

Framework for a Generic Knowledge Discovery Toolkit | 343 |

Control Representation in an EDA Assistant | 353 |

Decision and Regression Tree Induction | 363 |

A Further Comparison of Simplification Methods for Decision-Tree Induction | 365 |

Robust Linear Discriminant Trees | 375 |

Tree Structured Interpretable Regression | 387 |

An Exact Probability Metric for Decision Tree Splitting | 399 |

Natural Language Processing | 411 |

Two Applications of Statistical Modelling to Natural Language Processing | 413 |

A Model for Part-of-Speech Prediction | 423 |

Viewpoint-Based Measurement of Semantic Similarity between Words | 433 |

Part-of-Speech Tagging from Small Data Sets | 443 |

Learning from Data: Artificial Intelligence and Statistics V. Doug Fisher, Hans-J. Lenz

