Classification
keywords: ALFA, classification, classifier
OUTLINE
- Load a classification model
- Train a classifier
- Test/Run a classifier
- k-fold cross validation
- Find the best classifier
- Find the top predictors
1. Load a classification model
This command loads a classifier and its supporting libraries into Alfarvis. Currently supported classifiers include:
- Support Vector Machines (SVMs)
- Random Forest (RF)
- Decision Trees (DT)
- Logistic Regression (LR)
To load a model, simply type one of the following commands:
load SVM
import RF
import decision trees
Once loaded, the classifier parameters are displayed in an editable panel to the right of the ALFA GUI. They can be edited and saved according to user preferences.
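Conceptually, loading a model corresponds to instantiating a classifier object with default parameters that the user can then inspect and edit. The sketch below is a minimal illustration assuming a scikit-learn backend; the class names and parameter values are generic examples, not Alfarvis internals.

from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

# "load SVM": create an SVM classifier with editable default parameters
svm_model = SVC(kernel='rbf', C=1.0)

# "import RF": create a Random Forest classifier with editable default parameters
rf_model = RandomForestClassifier(n_estimators=100)

# The editable panel in the ALFA GUI corresponds to these parameters,
# which can be inspected and changed before training:
print(svm_model.get_params())
svm_model.set_params(C=10.0)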
2. Train a classifier
Once a classifier is loaded, it can be trained on a dataset or on a set of variables. To train the classifier, simply type either of the following commands:
train [classifier name] on the [dataset]
train [classifier name] on [variable1] [variable2] and so on
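Conceptually, training fits the loaded classifier on a feature matrix and a label column. The sketch below is a minimal illustration, assuming a scikit-learn backend and a hypothetical CSV dataset whose class labels live in a column named 'label'; it is not the actual Alfarvis implementation.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical dataset: any CSV file whose class labels are in a 'label' column.
data = pd.read_csv('dataset.csv')

# "train RF on the dataset": use every other column as a predictor
X = data.drop(columns=['label'])
y = data['label']

# "train RF on variable1 variable2": restrict the predictors to those columns
# X = data[['variable1', 'variable2']]

rf_model = RandomForestClassifier(n_estimators=100)
rf_model.fit(X, y)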
3. Test/Run a classifier
The trained classifier can be run on a new test dataset using one of the following commands:
run [trained classifier model] on the [dataset]
run [trained classifier model] on [variable1] [variable2] and so on
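Running a trained classifier corresponds to calling its prediction routine on the test features. A minimal sketch under the same scikit-learn and 'label'-column assumptions as above (file names are hypothetical):

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Train on the hypothetical training file, as in the previous sketch.
train = pd.read_csv('dataset.csv')
rf_model = RandomForestClassifier(n_estimators=100)
rf_model.fit(train.drop(columns=['label']), train['label'])

# "run RF on the test dataset": predict labels for a new, hypothetical test file.
test = pd.read_csv('test_dataset.csv')
predictions = rf_model.predict(test.drop(columns=['label'], errors='ignore'))
print(predictions[:10])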
4. k-fold cross validation
A classification algorithm can also be evaluated on a dataset using leave-one-out or k-fold cross validation with either of the following commands:
k fold cross validation using [classifier name] on the [dataset]
leave one out cross validation using [classifier name] on the [dataset]
Once the command is run, an editable panel pops up on the right side of the GUI, allowing the user to select the desired value of k.
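Both evaluation modes are standard cross-validation schemes: k-fold splits the data into k parts and tests on each part in turn, while leave-one-out holds out a single sample per fold. A minimal sketch, again assuming a scikit-learn backend and a hypothetical 'label' column:

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, LeaveOneOut

data = pd.read_csv('dataset.csv')            # hypothetical file and 'label' column
X, y = data.drop(columns=['label']), data['label']

# "k fold cross validation using RF on the dataset" (k is chosen in the GUI panel)
k = 5
kfold_scores = cross_val_score(RandomForestClassifier(), X, y, cv=k)
print(kfold_scores.mean())

# "leave one out cross validation using RF on the dataset"
loo_scores = cross_val_score(RandomForestClassifier(), X, y, cv=LeaveOneOut())
print(loo_scores.mean())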
5. Find the best classifier
A set of classification algorithms can be evaluated on a dataset using k-fold cross validation to identify the best algorithm for that particular dataset.
Find the best classifier among [classifier1] [classifier2] and [classifier3] on the [dataset]
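Finding the best classifier amounts to cross-validating each candidate on the same dataset and ranking them by mean score. A minimal sketch under the same assumptions (scikit-learn backend, hypothetical file and 'label' column):

import pandas as pd
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

data = pd.read_csv('dataset.csv')            # hypothetical file and 'label' column
X, y = data.drop(columns=['label']), data['label']

# "Find the best classifier among SVM, RF, DT and LR on the dataset"
candidates = {
    'SVM': SVC(),
    'RF': RandomForestClassifier(),
    'DT': DecisionTreeClassifier(),
    'LR': LogisticRegression(max_iter=1000),
}
results = {name: cross_val_score(clf, X, y, cv=5).mean()
           for name, clf in candidates.items()}
best = max(results, key=results.get)
print('Best classifier:', best, 'mean accuracy:', results[best])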
6. Find the top predictors
The top predictors that produce the best classification result on a particular dataset can be identified using the following command:
Find the best predictors from the [dataset]
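Identifying the top predictors corresponds to ranking the input variables by how much they contribute to classification performance. The sketch below uses Random Forest feature importances as one possible ranking criterion; the criterion actually used by Alfarvis may differ, and the file and column names are hypothetical.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

data = pd.read_csv('dataset.csv')            # hypothetical file and 'label' column
X, y = data.drop(columns=['label']), data['label']

# "Find the best predictors from the dataset": rank variables by importance
rf = RandomForestClassifier(n_estimators=200).fit(X, y)
ranking = pd.Series(rf.feature_importances_, index=X.columns)
print(ranking.sort_values(ascending=False).head(10))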
Note: The commands shown in the examples are only representative. Different syntax and wording can also be used as long as they are unambiguous and clearly communicate the user's intent.