A-B Testing Lab
Exploratory Analysis
This data is collected from a homepage, with a control group and experiment group, and whether or not they viewed the page or clicked the page. It includes the timestamp of action as well as the participant id.
# Create an object of the experimental group
eids = set(df[df.group=='experiment']['id'].unique())
# Create an object of the control group
cids = set(df[df.group=='control']['id'].unique())
# Print the overlap of the groups
print('Overlap of experiment and control groups: {}'.format(len(eids&cids)))
Conduct a Statistical Test
#Convert clicks into a binary variable on a user-by-user-basis
control = df[df.group=='control'].pivot(index='id', columns='action', values='count')
control = control.fillna(value=0)
experiment = df[df.group=='experiment'].pivot(index='id', columns='action', values='count')
experiment = experiment.fillna(value=0)
print("Sample sizes:\tControl: {}\tExperiment: {}".format(len(control), len(experiment)))
# Sample sizes: Control: 3332 Experiment: 2996
print("Total Clicks:\tControl: {}\tExperiment: {}".format(control.click.sum(), experiment.click.sum()))
# Total Clicks: Control: 932.0 Experiment: 928.0
print("Average click rate:\tControl: {}\tExperiment: {}".format(control.click.mean(), experiment.click.mean()))
# Average click rate: Control: 0.2797118847539016 Experiment: 0.3097463284379172
import flatiron_stats as fs
fs.p_value_welch_ttest(control.click, experiment.click)
# 0.004466402814337078
Analysis:
Does this result roughly match that of the previous statistical test?
Comment: Yes, both p-values are < 0.05, so we would reject the null hypothesis, indicating the the experimental page has a more effective design than does the control page.
Attribution: The Flatiron School