Abstract:
This paper establishes a general framework for Bayesian model-based clustering, in which subset labels are exchangeable, and items are also exchangeable, possibly up to covariate effects. It is rich enough to encompass a variety of existing procedures, including some recently discussed methodologies involving stochastic search or hierarchical clustering, but more importantly allows the formulation of clustering procedures that are optimal with respect to a specified loss function. Our focus is on loss functions based on pairwise coincidences, that is, whether pairs of items are clustered into the same subset or not.