Basic Instructions

Enter text to receive a response from Alfarvis

Enter "Bye" to close the program

Alfa is data exploration and analysis software driven by natural-language commands


To begin data exploration, first load a dataset using the "load" command. For example, "load tour de france dataset" loads a cycling dataset


To get a list of datasets that can be loaded, call "list files"


Once you have loaded a dataset, you can run several kinds of commands: visualization, statistics, and machine learning


You can list all commands using "list commands", or use "list visualization/statistics... commands" to list the commands of a specific kind


You can also get individual help for each of the commands by calling "help [command name]"


Apart from the above commands, there are also data handling commands that can, for example, set row labels or set the reference used for legends


You can get a list of data handling commands by calling "list data handling commands"

______________________________________________________


Load the breast cancer dataset and set the reference to diagnosis

User: load breast cancer data set

Alfa:

Running command in background!

Loaded breast cancer tumor

Showing Statistics for breast cancer tumor

Loaded file: breast_tumor_data.csv

______________________________________________________


User: set reference as diagnosis

Alfa:

Setting reference to diagnosis

______________________________________________________


You can get summary statistics of any feature: mean, median, standard deviation, etc.

User: what is the mean of area

Alfa:


Feature    Mean
area       654.89


Saving result to mean.area
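
If you want to cross-check a statistic like this outside Alfa, it can be sketched with Python's standard `statistics` module. This is only an illustration: the values below are made-up placeholders rather than rows from breast_tumor_data.csv, and it is an assumption that the sample (n - 1) standard deviation is the one of interest, since the variant Alfa reports is not stated here.

```python
import statistics

# Hypothetical feature values; NOT taken from breast_tumor_data.csv.
area = [400.0, 500.0, 600.0, 700.0, 800.0]

summary = {
    "mean": statistics.mean(area),
    "median": statistics.median(area),
    "stdev": statistics.stdev(area),  # sample standard deviation (n - 1)
}

# Print one statistic per line, rounded to two decimals like Alfa's output.
for name, value in summary.items():
    print(f"{name:<8}{value:.2f}")
```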

______________________________________________________


You can also compute any of the summary statistics for each label/group

User: what is the groupwise mean of area

Alfa:


area      diagnosis
462.79    1.00
978.38    2.00
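
A groupwise statistic splits the feature by the reference column (here diagnosis) and computes the statistic once per group. A minimal sketch of that idea in plain Python follows; the rows and the helper name `groupwise_mean` are hypothetical and only illustrate the grouping step, not how Alfa implements it.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (diagnosis, area) rows; NOT taken from the real dataset.
rows = [(1, 450.0), (1, 470.0), (2, 950.0), (2, 1000.0), (2, 990.0)]

def groupwise_mean(pairs):
    """Return {label: mean of the values that carry that label}."""
    groups = defaultdict(list)
    for label, value in pairs:
        groups[label].append(value)
    return {label: mean(values) for label, values in sorted(groups.items())}

result = groupwise_mean(rows)
# Print the value first and the group label second, as in the table above.
for label, value in result.items():
    print(f"{value:>8.2f}  {label}")
```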


______________________________________________________


Similarly, you can compute a summary or a groupwise summary of the complete dataset or of a selected set of features. See the examples below

User: summary of breast cancer dataset

Alfa:


features           max           mean         min      stdev
area               2501.00       654.89       143.50   351.91
compactness        0.35          0.10         0.02     0.05
concave.points     0.20          0.05         0.00     0.04
concavity          0.43          0.09         0.00     0.08
diagnosis          2.00          1.37         1.00     0.48
fractal.dimension  0.10          0.06         0.05     0.01
id                 911320502.00  30371831.43  8670.00  125020585.61
perimeter          188.50        91.97        43.79    24.30
radius             28.11         14.13        6.98     3.52
smoothness         0.16          0.10         0.05     0.01
symmetry           0.30          0.18         0.11     0.03
texture            39.28         19.29        9.71     4.30


______________________________________________________


User: groupwise summary of breast cancer dataset

Alfa:


1_mean       1_stdev       2_mean       2_stdev       features           pValue: 0 vs 1
462.79       134.29        978.38       367.94        area               0.00
0.08         0.03          0.15         0.05          compactness        0.00
0.03         0.02          0.09         0.03          concave.points     0.00
0.05         0.04          0.16         0.08          concavity          0.00
0.06         0.01          0.06         0.01          fractal.dimension  0.77
26543824.62  116739662.03  36818050.44  137896549.64  id                 0.36
78.08        11.81         115.37       21.85         perimeter          0.00
12.15        1.78          17.46        3.20          radius             0.00
0.09         0.01          0.10         0.01          smoothness         0.00
0.17         0.02          0.19         0.03          symmetry           0.00
17.91        4.00          21.60        3.78          texture            0.00


______________________________________________________


You can use any of the visualization commands shown in the visualization section to visualize and explore different characteristics of the dataset. A few examples are shown below

User: bar plot of fractal.dimension and smoothness

Alfa:


Saving result to figure.1.bar

______________________________________________________


User: violin plot of symmetry and concavity

Alfa:


Saving result to figure.2.violin

______________________________________________________


User: scatter between perimeter and radius

Alfa:


Saving result to figure.3.scatter2d

______________________________________________________


User: multiscatter between compactness concave.points and texture

Alfa:


Saving result to figure.4.multiscatter

______________________________________________________


User: correlation between symmetry smoothness radius perimeter texture

Alfa:

Displaying the result as a heatmap


Saving result to figure.5.correlation
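
The heatmap shows one correlation value for every pair of the requested features. As a rough sketch of what such a matrix contains, the snippet below computes Pearson correlations in plain Python. Treat it as an assumption-laden illustration: the columns are placeholder values, `pearson` is a hypothetical helper, and the source does not say which correlation measure Alfa uses.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical columns; placeholders rather than the real measurements.
columns = {
    "radius": [10.0, 12.0, 14.0, 16.0],
    "perimeter": [63.0, 75.0, 88.0, 100.0],
    "smoothness": [0.12, 0.09, 0.11, 0.08],
}

# Square matrix: entry [i][j] is the correlation of feature i with feature j.
names = list(columns)
matrix = [[pearson(columns[a], columns[b]) for b in names] for a in names]

for name, row in zip(names, matrix):
    print(f"{name:<12}" + "".join(f"{value:>8.2f}" for value in row))
```

The matrix is symmetric with ones on the diagonal, which is why a heatmap of it is mirrored across the diagonal.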

______________________________________________________


User: ttest of breast cancer dataset

Alfa:


features           pValue: 0 vs 1
area               0.00
compactness        0.00
concave.points     0.00
concavity          0.00
fractal.dimension  0.77
id                 0.36
perimeter          0.00
radius             0.00
smoothness         0.00
symmetry           0.00
texture            0.00
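
Each p-value above comes from comparing one feature between the two reference groups. Which t-test variant Alfa runs is not stated, so the sketch below assumes Welch's unequal-variance test and computes only the t statistic and its approximate degrees of freedom. The group values and the helper name `welch_t` are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard error contributed by each group (sample variance / n).
    se_a = variance(group_a) / n_a
    se_b = variance(group_b) / n_b
    t = (mean(group_a) - mean(group_b)) / sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, dof

# Hypothetical area values for the two diagnosis groups; not real data.
group_1 = [450.0, 470.0, 440.0, 480.0]
group_2 = [950.0, 1000.0, 990.0, 940.0]

t_stat, dof = welch_t(group_1, group_2)
print(f"t = {t_stat:.2f}, degrees of freedom = {dof:.2f}")
```

Turning the t statistic into a p-value needs the Student t distribution, which the standard library does not provide. If SciPy is available, `scipy.stats.ttest_ind(group_1, group_2, equal_var=False)` returns both values.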


______________________________________________________


User: roc analysis of breast cancer dataset

Alfa:


AUC     features
0.94    area
0.86    compactness
0.96    concave.points
0.94    concavity
0.52    fractal.dimension
0.55    id
0.95    perimeter
0.94    radius
0.72    smoothness
0.70    symmetry
0.78    texture
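
An AUC near 0.5 (as for fractal.dimension and id) means the feature barely separates the two groups, while values near 1 mean it separates them well. One common way to compute AUC for a single feature is the fraction of (positive, negative) pairs in which the positive case has the higher value. The sketch below uses that pairwise formulation; the data, the helper name `roc_auc`, and the choice of group 2 as the positive class are all assumptions, not Alfa's documented behavior.

```python
def roc_auc(scores, labels, positive):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win (the Mann-Whitney U formulation of AUC).
    """
    pos = [s for s, lab in zip(scores, labels) if lab == positive]
    neg = [s for s, lab in zip(scores, labels) if lab != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical feature values and diagnosis labels; not the real data.
area = [450.0, 470.0, 600.0, 950.0, 580.0, 1000.0]
diagnosis = [1, 1, 1, 2, 2, 2]

auc = roc_auc(area, diagnosis, positive=2)
print(f"AUC for area: {auc:.2f}")
```

The double loop is quadratic in the number of rows, which is fine for a sketch but slow for large datasets; rank-based implementations avoid it.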


______________________________________________________


User: save notebook

Alfa: