Publications

526 publications from this institution

Troika – An improved stacking schema for classification tasks

Stacking is a general ensemble method in which several base classifiers are combined by a single meta-classifier that learns from their outputs. This approach offers several advantages: simplicity, performance similar to that of the best classifier, and the ability to combine classifiers induced by different inducers. Its disadvantage is that on multiclass problems stacking seems to perform worse than other meta-learning approaches. In this paper we present Troika, a new stacking method for improving ensemble classifiers. The new scheme is built from three layers of combining classifiers. The method was tested on various datasets, and the results indicate that it outperforms the legacy ensemble schemes Stacking and StackingC, especially when the classification task involves more than two classes.

Eitan Menahem, Lior Rokach, Yuval Elovici, 2009, Article
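As a rough illustration of the plain stacking idea described in the abstract above (base classifiers whose outputs feed one meta-classifier), here is a minimal sketch. It is not Troika and does not reproduce its three-layer scheme. The base learner (a nearest-centroid rule on a single feature), the lookup-table meta-learner, the even/odd hold-out split, and the toy data are all assumptions chosen to keep the example small.

```python
from collections import Counter, defaultdict

def train_centroid(rows, labels, feature):
    """Hypothetical base learner: nearest class centroid on one feature only."""
    sums, counts = defaultdict(float), Counter()
    for row, label in zip(rows, labels):
        sums[label] += row[feature]
        counts[label] += 1
    centroids = {label: sums[label] / counts[label] for label in counts}
    return lambda row: min(centroids, key=lambda c: abs(row[feature] - centroids[c]))

def train_stacking(rows, labels, n_features):
    # Even-indexed rows train the base learners; odd-indexed rows are held out
    # so the meta-learner sees base outputs on data the base learners did not fit.
    base_rows, base_labels = rows[::2], labels[::2]
    meta_rows, meta_labels = rows[1::2], labels[1::2]
    bases = [train_centroid(base_rows, base_labels, f) for f in range(n_features)]

    # Meta-learner: for each tuple of base outputs, keep the most common true label.
    votes = defaultdict(Counter)
    for row, label in zip(meta_rows, meta_labels):
        votes[tuple(b(row) for b in bases)][label] += 1
    table = {outputs: c.most_common(1)[0][0] for outputs, c in votes.items()}

    def predict(row):
        outputs = tuple(b(row) for b in bases)
        if outputs in table:
            return table[outputs]
        return Counter(outputs).most_common(1)[0][0]  # unseen pattern: plain vote
    return predict

# Toy three-class data (an assumption): feature 0 separates "a" from the rest,
# feature 1 separates "c" from the rest, so no single base learner suffices.
rows = [(0, 0), (1, 1), (10, 0), (11, 1), (10, 10), (11, 11),
        (0, 1), (1, 0), (10, 1), (11, 0), (10, 11), (11, 10)]
labels = ["a", "a", "b", "b", "c", "c", "a", "a", "b", "b", "c", "c"]

predict = train_stacking(rows, labels, n_features=2)
probes = [(0.5, 0.5), (10.5, 0.5), (10.5, 10.5)]
print([predict(p) for p in probes])
```

On this toy data neither base learner can name all three classes on its own, but the meta-learner recovers them from the pair of base outputs, which is the effect stacking relies on.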

Fast and space-efficient shapelets-based time-series classification

Time series classification is a research area which has drawn much attention over the past decade. A novel approach for classification of time series uses shapelets. A shapelet is a subsequence extracted from one of the time series in the dataset which best separates time series coming from different classes of the dataset. A disadvantage of current shapelet-based classification approaches is their high time and memory consumption, which results from the examination of all possible subsequences. In this study, our initial goal was to find an evaluation order of the shapelet space which enables fast generation of an accurate classification model with a small memory footprint. The comparative analysis we conducted clearly indicates that a random evaluation order yields the best results. We present an algorithm for randomized model generation for shapelet-based classification that can generate a model with surprisingly high accuracy after evaluating only an exceedingly small fraction (∼10^-3) of the shapelet space and has modest memory requirements. We propose several methods for estimating the number of shapelets to examine, and present an extensive evaluation on 51 datasets establishing the effectiveness of our approach.
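The idea in the abstract above, scoring a randomly drawn subset of candidate subsequences instead of enumerating all of them, can be sketched as follows. This is a generic illustration and not the authors' algorithm: the quality measure (information gain of the best distance threshold), the function names, the parameters, and the toy data are assumptions, and all series are assumed to have equal length.

```python
import math
import random
from collections import Counter

def subsequence_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any same-length window."""
    m = len(shapelet)
    return min(
        math.sqrt(sum((series[i + j] - shapelet[j]) ** 2 for j in range(m)))
        for i in range(len(series) - m + 1)
    )

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split_gain(distances, labels):
    """Best information gain over all thresholds on the distance values."""
    pairs = sorted(zip(distances, labels))
    base, n, best = entropy(labels), len(labels), 0.0
    for k in range(1, n):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # no threshold can separate equal distances
        left = [y for _, y in pairs[:k]]
        right = [y for _, y in pairs[k:]]
        gain = base - (k / n) * entropy(left) - ((n - k) / n) * entropy(right)
        best = max(best, gain)
    return best

def random_shapelet_search(dataset, labels, n_candidates, min_len, max_len, seed=0):
    """Evaluate n_candidates randomly drawn subsequences and keep the best one."""
    rng = random.Random(seed)
    best_gain, best_shapelet = -1.0, None
    for _ in range(n_candidates):
        source = rng.choice(dataset)
        length = rng.randint(min_len, min(max_len, len(source)))
        start = rng.randrange(len(source) - length + 1)
        candidate = source[start:start + length]
        distances = [subsequence_distance(s, candidate) for s in dataset]
        gain = best_split_gain(distances, labels)
        if gain > best_gain:
            best_gain, best_shapelet = gain, candidate
    return best_shapelet, best_gain

# Toy data (an assumption): class "a" series contain a spike, class "b" are flat.
dataset = [[0, 0, 5, 0, 0, 0], [0, 0, 0, 5, 0, 0],
           [0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0]]
labels = ["a", "a", "b", "b"]
shapelet, gain = random_shapelet_search(dataset, labels, n_candidates=200,
                                        min_len=2, max_len=3)
print(shapelet, gain)
```

Only the current best candidate is stored, so memory use stays small regardless of how many candidates are sampled; the number of candidates to draw is the quantity the abstract says the authors estimate.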

Publications

Best Usage Context Prediction for Music Tracks

Improving Sentiment Analysis in an Online Cancer Survivor Community Using Dynamic Sentiment Lexicon

Machine Learning Prediction for Prognosis of Patients With Aortic Stenosis

Incorporating Fuzzy Logic in Data Mining Tasks

Automated algorithm selection using meta-learning and pre-trained deep convolution neural networks

Taxonomy of mobile users' security awareness

Boosting Simple Collaborative Filtering Models Using Ensemble Methods

Feature Selection by Combining Multiple Methods

Diffusion Ensemble Classifiers

Transfer Learning for Content-Based Recommender Systems using Tree Matching

Data Leakage/Misuse Scenarios

Identity theft, computers and behavioral biometrics

A GIS-based decision support system for hotel room rate estimation and temporal price prediction: The hotel brokers' context

Machine-learning analysis of factors that shape cancer aneuploidy landscapes reveals an important role for negative selection

An evolutionary algorithm for constructing a decision forest: Combining the classification of disjoints decision trees

Cost-Sensitive Detection of Malicious Applications in Mobile Devices

Evolving context-aware recommender systems with users in mind

Explainable decision forest: Transforming a decision forest into an interpretable tree