Classification
keywords: ALFA, classification, classifier
OUTLINE
- Load a classification model
- Train a classifier
- Test/Run a classifier
- k-fold cross validation
- Find the best classifier
- Find the top predictors
1. Load a classification model
This command loads a classifier and its supporting libraries into Alfarvis. Currently supported classifiers include:
- Support Vector Machines (SVMs)
- Random Forest (RF)
- Decision Trees (DT)
- Logistic Regression (LR)
To load a model, simply type one of the following commands:
load SVM
import RF
import decision trees
Once loaded, the classifier parameters are displayed in an editable panel to the right of the ALFA GUI. They can be edited and saved according to user preferences.
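Conceptually, loading a model corresponds to instantiating a classifier object with default parameters that the user can then inspect and edit. The sketch below is a minimal illustration assuming a scikit-learn backend; the class names and parameter values are generic examples, not Alfarvis internals.

from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

# "load SVM": create an SVM classifier with editable default parameters
svm_model = SVC(kernel='rbf', C=1.0)

# "import RF": create a Random Forest classifier with editable default parameters
rf_model = RandomForestClassifier(n_estimators=100)

# The editable panel in the ALFA GUI corresponds to these parameters,
# which can be inspected and changed before training:
print(svm_model.get_params())
svm_model.set_params(C=10.0)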
2. Train a classifier
Once a classifier is loaded, it can be trained on a dataset or on a set of variables. To train the classifier, simply type either of the following commands:
train [classifier name] on the [dataset]
train [classifier name] on [variable1] [variable2] and so on
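Conceptually, training fits the loaded classifier on a feature matrix and a label column. The sketch below is a minimal illustration, assuming a scikit-learn backend and a hypothetical CSV dataset whose class labels live in a column named 'label'; it is not the actual Alfarvis implementation.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical dataset: any CSV file whose class labels are in a 'label' column.
data = pd.read_csv('dataset.csv')

# "train RF on the dataset": use every other column as a predictor
X = data.drop(columns=['label'])
y = data['label']

# "train RF on variable1 variable2": restrict the predictors to those columns
# X = data[['variable1', 'variable2']]

rf_model = RandomForestClassifier(n_estimators=100)
rf_model.fit(X, y)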
3. Test/Run a classifier
The trained classifier can be run on a new test dataset using one of the following commands:
run [trained classifier model] on the [dataset]
run [trained classifier model] on [variable1] [variable2] and so on
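Running a trained classifier corresponds to calling its prediction routine on the test features. A minimal sketch under the same scikit-learn and 'label'-column assumptions as above (file names are hypothetical):

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Train on the hypothetical training file, as in the previous sketch.
train = pd.read_csv('dataset.csv')
rf_model = RandomForestClassifier(n_estimators=100)
rf_model.fit(train.drop(columns=['label']), train['label'])

# "run RF on the test dataset": predict labels for a new, hypothetical test file.
test = pd.read_csv('test_dataset.csv')
predictions = rf_model.predict(test.drop(columns=['label'], errors='ignore'))
print(predictions[:10])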
4. k-fold cross validation
A classification algorithm can also be evaluated on a dataset using leave-one-out or k-fold cross validation with either of the following commands:
k fold cross validation using [classifier name] on the [dataset]
leave one out cross validation using [classifier name] on the [dataset]
Once the command is run, an editable panel pops up on the right side of the GUI, allowing the user to select the desired value of k.
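Both evaluation modes are standard cross-validation schemes: k-fold splits the data into k parts and tests on each part in turn, while leave-one-out holds out a single sample per fold. A minimal sketch, again assuming a scikit-learn backend and a hypothetical 'label' column:

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, LeaveOneOut

data = pd.read_csv('dataset.csv')            # hypothetical file and 'label' column
X, y = data.drop(columns=['label']), data['label']

# "k fold cross validation using RF on the dataset" (k is chosen in the GUI panel)
k = 5
kfold_scores = cross_val_score(RandomForestClassifier(), X, y, cv=k)
print(kfold_scores.mean())

# "leave one out cross validation using RF on the dataset"
loo_scores = cross_val_score(RandomForestClassifier(), X, y, cv=LeaveOneOut())
print(loo_scores.mean())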
5. Find the best classifier
A set of classification algorithms can be evaluated on a dataset using k-fold cross validation to identify the best algorithm for that particular dataset.
Find the best classifier among [classifier1] [classifier2] and [classifier3] on the [dataset]
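Finding the best classifier amounts to cross-validating each candidate on the same dataset and ranking them by mean score. A minimal sketch under the same assumptions (scikit-learn backend, hypothetical file and 'label' column):

import pandas as pd
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

data = pd.read_csv('dataset.csv')            # hypothetical file and 'label' column
X, y = data.drop(columns=['label']), data['label']

# "Find the best classifier among SVM, RF, DT and LR on the dataset"
candidates = {
    'SVM': SVC(),
    'RF': RandomForestClassifier(),
    'DT': DecisionTreeClassifier(),
    'LR': LogisticRegression(max_iter=1000),
}
results = {name: cross_val_score(clf, X, y, cv=5).mean()
           for name, clf in candidates.items()}
best = max(results, key=results.get)
print('Best classifier:', best, 'mean accuracy:', results[best])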
6. Find the top predictors
The top predictors that produce the best classification result on a particular dataset can be identified using the following command:
Find the best predictors from the [dataset]
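Identifying the top predictors corresponds to ranking the input variables by how much they contribute to classification performance. The sketch below uses Random Forest feature importances as one possible ranking criterion; the criterion actually used by Alfarvis may differ, and the file and column names are hypothetical.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

data = pd.read_csv('dataset.csv')            # hypothetical file and 'label' column
X, y = data.drop(columns=['label']), data['label']

# "Find the best predictors from the dataset": rank variables by importance
rf = RandomForestClassifier(n_estimators=200).fit(X, y)
ranking = pd.Series(rf.feature_importances_, index=X.columns)
print(ranking.sort_values(ascending=False).head(10))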
Note: The commands shown in the examples are only representative. Different syntax and wording can also be used as long as they are unambiguous and clearly communicate the user's intent.