# Dr. Stat

This is a demo of my chatbot called Dr. Stat.

It can answer questions about statistics and help a user select an appropriate statistical technique.

I’ve been working on this for quite a while now. Chatbots certainly aren’t anything new, but chatbots that are useful and display a more ‘natural’ way of conversing are rare.

I chose the domain of statistics for two reasons. Firstly, many of my students (and researchers I know) struggle with statistical concepts and I wanted to create something that would make their lives easier. Secondly, this domain has proven to be suitable for trying out different ideas about dialogue management.

Read on for a transcript with annotations.

The dialogue starts with the system asking how it may help the user. When humans converse, we are driven by goals. A goal can be to find out something or to inform someone of something, and goals can trigger other goals. When the system starts a new conversation, its only goal is to find out what question(s) the user has. To complete this goal, the system triggers an action: asking how it may help the user.

Aswin Dunno
Dr. Stat OK. Do you want to know which statistical test you should use?

Whenever the system outputs OK it is indicating that it has understood the user’s intent. In this case the user intended to say that it doesn’t know how the system can help. To help the user a bit, the system makes a guess. This is still the system trying to complete the goal of finding out how it may help the user.

Aswin Yes
Dr. Stat OK. Let’s see. How many dependent variables do you have?

The user can reply by typing or by pressing buttons. When the user confirms that it wants help selecting an appropriate test, the system changes its goal. It is no longer finding out what the user wants as this is known now. The new goal becomes figuring out what statistical technique is most appropriate for the user. To be able to answer this question the system needs some information first and starts collecting this information by asking questions. The “Let’s see” indicates that the system has found a plan to help find an appropriate test for the user. This plan consists of a series of questions and choices. The first question is about how many dependent variables there are.
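The opening turns described so far can be read as a small transition function from (goal, user intent) to (next goal, system reply). The sketch below is only an illustration of that idea, not Dr. Stat's actual code: the goal names, the intent labels, and the wording of the opening question are assumptions.

```python
# Hypothetical goal and intent names; the opening prompt wording is assumed.
FIND_QUESTION = "find_out_user_question"
FIND_TEST = "find_appropriate_test"


def respond(goal, intent):
    """Return (next_goal, reply) for the current goal and classified user intent."""
    if goal == FIND_QUESTION:
        if intent is None:
            # Start of the conversation: act on the initial goal.
            return goal, "How may I help you?"
        if intent == "dont_know":
            # The goal is unchanged; the system offers a guess to move things along.
            return goal, "OK. Do you want to know which statistical test you should use?"
        if intent == "confirm":
            # The user's question is now known, so the goal changes.
            return FIND_TEST, "OK. Let's see. How many dependent variables do you have?"
    return goal, "Sorry, I did not understand that."


goal, reply = respond(FIND_QUESTION, "dont_know")
next_goal, next_reply = respond(goal, "confirm")
```

Note that "OK" is only emitted on branches where the intent was recognised, which matches its role as an acknowledgement.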

Aswin What’s a dependent variable?
Dr. Stat A dependent variable is the outcome or effect variable. How many dependent variables do you have?

If the user cannot answer because it doesn’t know what a dependent variable is, it can simply ask. The system makes answering that question its top goal and explains what a dependent variable is. The system keeps track of previous goals, so after answering it returns to its previous goal: finding out how many dependent variables there are, which it needs in order to determine which test is most appropriate.
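One common way to keep track of previous goals is a stack: a new goal is pushed on top, and completing it exposes the goal underneath. The following is a minimal sketch under that assumption; the class and the goal names are hypothetical and are not taken from Dr. Stat itself.

```python
class GoalStack:
    """Last-in, first-out store of goals; the top goal is the one being pursued."""

    def __init__(self):
        self._goals = []

    def push(self, goal):
        self._goals.append(goal)

    def complete_top(self):
        """Remove and return the goal that was just fulfilled."""
        return self._goals.pop()

    def top(self):
        return self._goals[-1] if self._goals else None


goals = GoalStack()
goals.push("find_appropriate_test")
goals.push("find_number_of_dependent_variables")

# The user asks "What's a dependent variable?": explaining becomes the top goal.
goals.push("explain_dependent_variable")

# Once the explanation is given, the earlier goal is on top again,
# so the system repeats its question about dependent variables.
answered = goals.complete_top()
resumed = goals.top()
```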

Aswin Two or more
Dr. Stat Okay. How many independent variables do you have?

After the user answers this question the system makes finding an answer to the next question its top goal, and asks the next question.

Aswin Ehm, I mean, I have one dependent variable.
Dr. Stat OK. You have 1 dependent variable. Is the variable continuous or is it categorical?

The user is able to revise previously given answers. In this case the system reloads the plan and comes up with a different follow-up question. It also confirms that it has understood the user actually has only one dependent variable. Its knowledge base is also updated with this new information.
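Reloading the plan can be pictured as re-reading a decision tree against the updated knowledge base, so a revised answer naturally leads to a different next question. The sketch below assumes a plan written as a plain function and covers only the first few questions from the transcript; the slot names are hypothetical.

```python
def next_question(kb):
    """Return the first unanswered question on the path selected by the answers in kb."""
    if "dependent_count" not in kb:
        return "How many dependent variables do you have?"
    # Only a single dependent variable leads to the question about its type.
    if kb["dependent_count"] == 1 and "dependent_type" not in kb:
        return "Is the variable continuous or is it categorical?"
    if "independent_count" not in kb:
        return "How many independent variables do you have?"
    return None


def revise(kb, slot, value):
    """Overwrite an earlier answer, then re-read the plan from the top."""
    kb[slot] = value
    return next_question(kb)


kb = {"dependent_count": 2}            # "Two or more"
before = next_question(kb)             # asks about independent variables
after = revise(kb, "dependent_count", 1)  # "I mean, I have one dependent variable."
```

Because the next question is always derived from the stored answers rather than from a fixed position in a script, no special handling is needed for corrections.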

Aswin It’s continuous.
Dr. Stat OK. How many independent variables do you have?

The user provides another piece of information that the system stores in its working memory.

Aswin I have one continuous independent variable.
Dr. Stat OK. Does the data meet assumptions for parametric tests?

If the user had just answered that it has one independent variable, the system would have asked if the variable is continuous or categorical. In this case the user has already provided that information, so this extra question is skipped.
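Skipping a question works when a single utterance can fill more than one slot. As a rough illustration, the sketch below pulls a count and a variable type out of one sentence with regular expressions; the real system presumably uses something more robust, and the slot names and the tiny number vocabulary here are assumptions.

```python
import re

# Deliberately tiny vocabulary, for illustration only.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3}


def extract_slots(utterance):
    """Extract the count and type of independent variables, when mentioned."""
    text = utterance.lower()
    slots = {}
    count = re.search(r"\b(one|two|three|\d+)\b", text)
    if count:
        token = count.group(1)
        slots["independent_count"] = (
            NUMBER_WORDS[token] if token in NUMBER_WORDS else int(token)
        )
    kind = re.search(r"\b(continuous|categorical)\b", text)
    if kind:
        slots["independent_type"] = kind.group(1)
    return slots


def unanswered(slots):
    """List the slots the system still has to ask about, in plan order."""
    order = ("independent_count", "independent_type")
    return [name for name in order if name not in slots]


full = extract_slots("I have one continuous independent variable.")
partial = extract_slots("I have one independent variable.")
```

With `full`, nothing is left to ask; with `partial`, only the type question remains.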

Aswin What does that mean?
Dr. Stat Okay. Let’s see. For practical purposes, you can think of “parametric” as referring to tests, such as t-tests and analysis of variance, that assume the underlying source population(s) to be normally distributed; they generally also assume that one’s measures derive from an equal-interval scale. And you can think of “non-parametric” as referring to tests that do not make these particular assumptions. Does the data meet assumptions for parametric tests?

A question like “What does that mean?” is highly dependent on context. What does “that” refer to, for example? By using system goals as context, the intent classifier is able to classify this question as meaning “What does parametric mean?”.
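To make the role of context concrete, here is a deliberately simple, rule-based stand-in for the intent classifier: the same utterance resolves to a different concept depending on the goal that is currently active. The lookup table and the goal names are assumptions made for this sketch; the actual classifier may work quite differently.

```python
# Hypothetical mapping from the active goal to the concept its question is about.
CONCEPT_FOR_GOAL = {
    "find_number_of_dependent_variables": "dependent variable",
    "find_if_parametric_assumptions_met": "parametric",
}


def resolve_intent(utterance, current_goal):
    """Resolve a context-dependent question using the current goal as context."""
    text = utterance.lower().strip(" ?")
    if text == "what does that mean":
        concept = CONCEPT_FOR_GOAL.get(current_goal)
        if concept is not None:
            return ("explain", concept)
        # No goal to anchor "that" to, so the system has to ask back.
        return ("ask_clarification", None)
    return ("unknown", None)


in_context = resolve_intent("What does that mean?", "find_if_parametric_assumptions_met")
no_context = resolve_intent("What does that mean?", None)
```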

Aswin How can I check for normality?
Dr. Stat Okay. Let’s see. Do you want to check for normality visually or do you want to test for normality?

The system is capable of task switching. In answering how the user can check for normality, a temporary new plan (and goal) is activated and causes the system to suspend the current goals of finding out if the data meet assumptions for parametric tests and finding out which statistical test to use. This temporary plan needs to be completed first before going back to the original goals.

Aswin I wanna test for it.
Dr. Stat OK. Do you have more than 50 cases?

One more question needs to be answered before the system can give advice on how to test for normality.

Aswin Yes
Dr. Stat OK. You can test for normality by using the Kolmogorov-Smirnov test. Let’s find out which statistical test you should use. Does the data meet assumptions for parametric tests?

The system completes this temporary goal/plan by informing the user how it can test for normality and returns to the original goal and reloads the original plan. The system tells the user it is returning to its original plan by saying “Let’s find out which statistical test you should use”. Then the system asks the last unanswered question again.
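The suspend-and-resume behaviour can be sketched with a stack of plans, where finishing the top plan produces a reply that also announces and continues the plan underneath. This is an illustrative sketch only: the `Plan` and `DialogueManager` classes and their fields are hypothetical, and the messages are copied from the transcript.

```python
class Plan:
    def __init__(self, name, resume_message, questions):
        self.name = name
        self.resume_message = resume_message  # said when this plan is resumed
        self.questions = list(questions)      # unanswered questions, in order

    def current_question(self):
        return self.questions[0] if self.questions else None


class DialogueManager:
    def __init__(self):
        self.plans = []  # the last item is the active plan; the rest are suspended

    def start(self, plan):
        self.plans.append(plan)

    def finish_current(self, result):
        """Finish the active plan and build a reply that resumes the suspended one."""
        self.plans.pop()
        parts = [result]
        if self.plans:
            resumed = self.plans[-1]
            parts.append(resumed.resume_message)
            question = resumed.current_question()
            if question:
                parts.append(question)  # re-ask the last unanswered question
        return " ".join(parts)


dm = DialogueManager()
dm.start(Plan("select_test",
              "Let's find out which statistical test you should use.",
              ["Does the data meet assumptions for parametric tests?"]))
# "How can I check for normality?" suspends select_test by stacking a new plan.
dm.start(Plan("check_normality", "", ["Do you have more than 50 cases?"]))
reply = dm.finish_current(
    "You can test for normality by using the Kolmogorov-Smirnov test.")
```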

Aswin Yes
Dr. Stat Okay. You should use Pearson correlation or you should use regression.