Researchers in disciplines such as pattern recognition, statistics, and machine learning have explored ensemble methodology since the late 1970s. Given the growing interest in the field, they now face a wide variety of methods. This book aims to impose a degree of order on this diversity by presenting a coherent and unified repository of ensemble methods, theories, trends, challenges, and applications. It describes the classical methods in detail, as well as recently developed extensions and novel approaches. Along with an algorithmic description of each method, it explains the circumstances in which the method is applicable and the consequences and trade-offs of using it.
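As a concrete instance of what an ensemble method does, the sketch below combines the label predictions of several base classifiers by plurality vote. It is a minimal illustration and is not taken from the book: the function name `majority_vote` and the three prediction lists are hypothetical, and voting is only one of many combination schemes.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label predictions by plurality vote.

    `predictions` holds one list of labels per base model; all lists
    must have the same length (one label per sample).
    """
    if not predictions:
        raise ValueError("need at least one base model")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")

    combined = []
    for sample_votes in zip(*predictions):
        # Pick the most frequent label for this sample; on a tie,
        # Counter keeps the label that was seen first.
        label, _ = Counter(sample_votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical outputs of three base classifiers on four samples.
# Assuming the true labels are [1, 0, 1, 1], each model errs on a
# different sample, so the vote can correct every individual error.
model_a = [0, 0, 1, 1]
model_b = [1, 1, 1, 1]
model_c = [1, 0, 0, 1]

ensemble = majority_vote([model_a, model_b, model_c])
print(ensemble)
```

The vote helps here only because the base models make their mistakes on different samples; models that err together gain nothing from being combined.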
Automatic machine learning (AutoML) is an area of research aimed at automating machine learning (ML) activities that currently require human experts. One of the most challenging tasks in this field is the automatic generation of end-to-end ML pipelines: combining multiple types of ML algorithms into a single architecture for end-to-end analysis of previously unseen data. This task poses two challenges. The first is the need to explore a large search space of algorithms and pipeline architectures. The second is the computational cost of training and evaluating multiple pipelines. In this study we present DeepLine, a reinforcement-learning-based approach for automatic pipeline generation. Our approach uses an efficient representation of the search space and leverages knowledge gained from previously analyzed datasets to make the problem more tractable. Additionally, we propose a novel hierarchical-actions algorithm that serves as a plugin, mediating the environment-agent interaction in deep reinforcement learning problems. The plugin significantly speeds up the training of our model. Evaluation on 56 datasets shows that DeepLine outperforms state-of-the-art approaches in both accuracy and computational cost.
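The abstract does not specify how the hierarchical-actions plugin works, so the following is only a sketch of the general idea of replacing one choice over a large action set with a short sequence of small choices. It should not be read as DeepLine's algorithm: `build_hierarchy`, `select_action`, the `primitive_i` action names, and the fixed branching factor are all assumptions made for illustration.

```python
def build_hierarchy(actions, branching):
    """Group a flat action list into a tree of nested lists.

    Every node has at most `branching` children. Leaf actions must not
    themselves be lists, because lists mark the internal nodes.
    """
    if branching < 2:
        raise ValueError("branching must be at least 2")
    nodes = list(actions)
    if not nodes:
        raise ValueError("need at least one action")
    # Repeatedly bundle neighbouring nodes until the root is small enough.
    while len(nodes) > branching:
        nodes = [nodes[i:i + branching] for i in range(0, len(nodes), branching)]
    return nodes


def select_action(tree, choose):
    """Walk from the root to a leaf action.

    `choose(n)` stands in for the agent: it returns an index in range(n).
    Returns the selected action and the number of decisions taken.
    """
    node, steps = tree, 0
    while isinstance(node, list):
        node = node[choose(len(node))]
        steps += 1
    return node, steps


# 27 hypothetical pipeline primitives, exposed three options at a time.
actions = [f"primitive_{i}" for i in range(27)]
tree = build_hierarchy(actions, branching=3)

# A stand-in policy that always takes the last option offered.
action, steps = select_action(tree, choose=lambda n: n - 1)
print(action, steps)
```

Under these assumptions the agent never scores more than three options at once and reaches any of the 27 primitives in three decisions, which keeps the output size fixed as the action set grows.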