Giving AdaBoost a Parallel Boost
Stefano Merler; Cesare Furlanello; Bruno Giovanni Caprile
2004-01-01
Abstract
AdaBoost is one of the most popular classification methods in use. Unlike other ensemble methods (e.g., Bagging), AdaBoost is inherently sequential. In many data-intensive, real-world applications this may limit the practical applicability of the method. In this paper, a scheme is presented for the parallelization of AdaBoost. The procedure builds upon earlier results concerning the dynamics of AdaBoost weights, and yields approximations to the standard AdaBoost models that can be easily and efficiently distributed over a network of computing nodes. Margin maximization properties of the proposed procedure are discussed, and experiments are reported on both synthetic and benchmark data sets.
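To illustrate why standard AdaBoost is inherently sequential, the following is a minimal sketch of the conventional AdaBoost loop (not the parallel scheme proposed in the paper). It assumes binary labels in {-1, +1} and decision stumps as weak learners; names such as `adaboost` and `n_rounds` are illustrative only. Each round's example weights depend on the classifier fitted in the previous round, which is the dependency the paper's parallelization scheme must approximate.

```python
# Minimal sketch of the standard (sequential) AdaBoost loop.
# Assumes binary labels y in {-1, +1}; decision stumps as weak learners.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost(X, y, n_rounds=50):
    n = len(y)
    w = np.full(n, 1.0 / n)              # example weights, updated every round
    models, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = np.clip(np.sum(w * (pred != y)), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        # The weight update uses the classifier just trained, so round t+1
        # cannot start before round t finishes: the loop is sequential.
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()
        models.append(stump)
        alphas.append(alpha)
    return models, alphas

def predict(models, alphas, X):
    # Weighted majority vote of the weak learners.
    score = sum(a * m.predict(X) for m, a in zip(models, alphas))
    return np.sign(score)
```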