Design of controlled experiments and comparison tests

  • Participants (also known as subjects, or respondents for surveys). Ideally these people are actual users. However, it is often possible to find substitutes for users who would produce comparable results.
  • Variables
    • Independent variable. This is the condition or the design that the experiment (or survey) manipulates. It can be a continuous value (e.g. font size) or it can be a discrete value (e.g. product ABC or XYZ).
    • Dependent variable. This is what the practitioner measures and compares among the different conditions or products.
  • Condition assignment
    • Between-groups design. Participants are randomly assigned to groups. Each group is tested with only one condition. The difference in performance is compared across groups.
    • Within-groups design. Each participant is tested with several conditions. The difference in performance is compared for each participant.
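
As an illustration of the random assignment used in a between-groups design, here is a minimal Python sketch. The function name, the participant labels, and the fixed seed are assumptions made for the example; the conditions reuse the ABC/XYZ products mentioned above.

```python
import random


def assign_between_groups(participants, conditions, seed=None):
    """Randomly split participants into one group per condition.

    Group sizes differ by at most one participant.
    """
    rng = random.Random(seed)          # seeded only so the example is repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)              # random order removes any selection bias
    groups = {condition: [] for condition in conditions}
    for index, participant in enumerate(shuffled):
        # Deal participants out to the conditions in turn, like dealing cards.
        groups[conditions[index % len(conditions)]].append(participant)
    return groups


# Hypothetical participants P1..P8, each tested with only one product.
participants = [f"P{n}" for n in range(1, 9)]
groups = assign_between_groups(participants, ["ABC", "XYZ"], seed=1)
for condition, members in groups.items():
    print(condition, members)
```

Each participant appears in exactly one group, so any later comparison is made across groups rather than within a participant.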

Issues for discussion

  • Advantages of the different designs
  • Order effects and counter-balancing
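
One common way to counter-balance a within-groups design is to rotate the order of conditions from one participant to the next, so that each condition appears equally often in each position. The sketch below assumes three hypothetical conditions A, B, C and six hypothetical participants; the function names are illustrative only.

```python
def rotated_orders(conditions):
    """Return one order per rotation of the condition list.

    Across the returned orders, each condition appears once in each position.
    """
    return [conditions[i:] + conditions[:i] for i in range(len(conditions))]


def counterbalance(participants, conditions):
    """Give each participant one of the rotated orders, cycling through them."""
    orders = rotated_orders(list(conditions))
    return {p: orders[i % len(orders)] for i, p in enumerate(participants)}


schedule = counterbalance(["P1", "P2", "P3", "P4", "P5", "P6"], ["A", "B", "C"])
for participant, order in schedule.items():
    print(participant, order)
```

Rotation balances the position of each condition, but it does not balance which condition immediately precedes which, so carry-over effects between specific pairs of conditions may still need separate consideration.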

Using statistical tests

A statistical test (e.g. t-test, ANOVA, chi-squared) helps determine whether the difference between the conditions (or products) is real or simply due to chance.

The measurements from different conditions will almost always differ by some amount. The null hypothesis asserts that the difference between the conditions is due simply to chance. The alternate hypothesis (also called the experimental hypothesis) asserts that the difference between the conditions is real. That is, the difference between the conditions (or the difference between the products) caused the difference in the dependent variable.

A statistical test produces the probability (the p-value) that a difference at least as large as the observed one could occur assuming the null hypothesis is true. If this probability is small (e.g. less than 0.05), the null hypothesis is rejected in favor of the alternate hypothesis.
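
To make the idea of "probability under the null hypothesis" concrete, here is a small sketch of an exact permutation test. This is not one of the tests named above (t-test, ANOVA, chi-squared); it is used here only because it needs nothing beyond the Python standard library and follows the definition directly. The task-completion times are made-up illustrative numbers.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference of group means.

    Under the null hypothesis the group labels are arbitrary, so every way of
    splitting the pooled data into groups of the same sizes is equally likely.
    The p-value is the fraction of splits whose difference in means is at
    least as large as the observed one.
    """
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    extreme = 0
    splits = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        difference = abs(sum_a / n_a - (total - sum_a) / n_b)
        if difference >= observed - 1e-12:   # small tolerance for float rounding
            extreme += 1
        splits += 1
    return extreme / splits


# Hypothetical task times in seconds for two products (between-groups design).
times_abc = [12, 14, 11, 13]
times_xyz = [18, 17, 19, 20]
p_value = permutation_p_value(times_abc, times_xyz)
print(f"p = {p_value:.4f}")
if p_value < 0.05:
    print("Reject the null hypothesis at the 0.05 level.")
```

Enumerating every split is only practical for small samples; for larger ones a statistics package or a random sample of splits would normally be used instead.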

Statistical tools can also provide confidence intervals for the averages of each condition.
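
As a rough sketch of how such an interval can be computed, the example below uses the normal approximation from the Python standard library. This is an assumption made to keep the example self-contained: with small samples, statistical tools typically use the t distribution, which gives a somewhat wider interval. The scores are made-up illustrative numbers.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(values, confidence=0.95):
    """Approximate confidence interval for the mean (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)    # about 1.96 for 95%
    centre = mean(values)
    margin = z * stdev(values) / sqrt(len(values))    # z times the standard error
    return centre - margin, centre + margin


# Hypothetical scores measured under one condition.
scores = [10, 12, 14, 16, 18]
low, high = mean_confidence_interval(scores)
print(f"mean = {mean(scores)}, 95% CI = ({low:.2f}, {high:.2f})")
```

When the intervals for two conditions are far apart, that is informal evidence of a real difference; overlapping intervals do not by themselves show that there is none.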

Many statistical tools can be found online. This online tool handles a variety of t-tests for comparing dependent variables from two categories.