Testing Behavioral Differences Between Two Sets Of Visitors

Built-in reports in web analytics tools provide a lot of information but lack statistical measures of how significant the reported data is. For example, do visitors from search engines spend more time on the site than visitors who arrive directly? We can try to answer by comparing counts of page views, visits, time spent, etc. across sources. But does 100 seconds per visit from search traffic against 90 seconds per visit from direct access really mean that visitors from search engines spend more time? Or, as in the following example, do visitors from Country B view more pages per visit than visitors from Country A?

[Figure: comparisons]

The question can be answered with a simple t-test, which compares the means of two sets of data. But first, we need to prepare the data. The following is a sample data structure for testing the difference in time spent per visit between visitors located in two different countries.

Visitor ID | Country | Number of Visits | Time Spent
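As a rough illustration of the preparation step, the sketch below reads an export with these four columns and builds one list of seconds-per-visit values for each country. The post ran its analysis in R, so Python is used here only for illustration. The column names (`visitor_id`, `country`, `visits`, `time_spent`), the sample rows, and the choice to divide each visitor's total time by their number of visits are all assumptions, not details from the original export.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export: one row per visitor (values are made up for illustration).
SAMPLE_EXPORT = """visitor_id,country,visits,time_spent
v001,A,2,400
v002,A,1,150
v003,B,4,600
v004,B,1,200
"""

def time_per_visit_by_country(csv_text):
    """Return {country: [seconds per visit for each visitor]}."""
    samples = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        visits = int(row["visits"])
        if visits == 0:
            continue  # skip rows without visits to avoid division by zero
        samples[row["country"]].append(float(row["time_spent"]) / visits)
    return dict(samples)

samples = time_per_visit_by_country(SAMPLE_EXPORT)
print(samples)
```

Each list then serves as one of the two samples that the t-test compares.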

You can get the data from Data Warehouse if you are using Adobe Analytics, or from the Reporting API if you are using Google Analytics. However, you may not have the Visitor ID if you are using GA and didn’t set up User ID tracking, because the visitor tracking cookie is not available through the Reporting API. If this is the case, you may want to export the data in the following form instead.

Date | Country | Time Spent per Visit

The above two data structures lead to different interpretations of the final analysis. The one with Visitor ID tests the null hypothesis that visitors from the two countries spend the same amount of time per visit. The one with Date tests the null hypothesis that the average daily time spent per visit is the same for the two countries.

I will continue the example with the first data structure, as I have the Visitor ID. The exported data set could be very large, depending on your website traffic and the date range you select for the export, and the t-test may take a long time to run depending on your computer's processing power.
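If the export is too large to handle comfortably, one option is to note that a two-sample t-test only needs the count, mean, and variance of each group, and these can be accumulated one row at a time. The sketch below uses Welford's method for that. It is not what the post did, and the class name `RunningStats` and the sample values are hypothetical.

```python
class RunningStats:
    """Accumulate count, mean and variance one value at a time (Welford's method)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the current mean

    def add(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance (n - 1 in the denominator)."""
        return self._m2 / (self.n - 1)

# Made-up seconds-per-visit values, fed in one at a time.
stats = RunningStats()
for seconds in [200.0, 180.0, 210.0, 190.0, 170.0]:
    stats.add(seconds)
print(stats.n, stats.mean, stats.variance)
```

Keeping one such accumulator per country means the full export never has to sit in memory at once.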

The t-test was conducted using R and RStudio. The output first gave the mean time spent per visit for the two countries being tested: 189.74 seconds for country A and 173.62 seconds for country B in my case. Back to the original question: is 189.74 seconds significantly more than 173.62 seconds?
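To make the test itself concrete, here is a minimal sketch of the calculation behind a two-sample t-test, written in Python rather than R. It assumes Welch's unequal-variance form, which is the default of R's `t.test`; the post does not say which variant was used. The function name `welch_t` and the sample values are made up for illustration.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return (mean_a, mean_b, t statistic, degrees of freedom) for Welch's t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.fmean(sample_a), statistics.fmean(sample_b)
    # Squared standard error contributed by each group.
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t_stat = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return mean_a, mean_b, t_stat, df

# Made-up seconds-per-visit values, for illustration only.
country_a = [200.0, 180.0, 210.0, 190.0, 170.0]
country_b = [160.0, 175.0, 150.0, 185.0, 165.0]
mean_a, mean_b, t_stat, df = welch_t(country_a, country_b)
print(f"mean A = {mean_a:.2f}, mean B = {mean_b:.2f}, t = {t_stat:.3f}, df = {df:.1f}")
```

The two means on their own do not settle the question; the t statistic scales their difference by how much the values vary within each group.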

Next, we need to look at the confidence interval for the difference and its confidence level. I specified a 99.75% confidence level when running the t-test, and the confidence interval for the difference in means was 11.66 to 20.59 seconds. Because this interval does not include zero, the conclusion is that, with 99.75% confidence, visitors from country A spent between 11.66 and 20.59 seconds more per visit than visitors from country B.
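The reported figures can be checked for internal consistency with a little arithmetic. The sketch below uses a normal approximation to the t distribution, which is an assumption that is only reasonable for large samples. It backs out the standard error implied by the reported interval (the post does not state it) and rebuilds the interval around the difference of the two reported means. The function name `normal_ci` is hypothetical.

```python
from statistics import NormalDist

def normal_ci(diff, std_error, conf_level):
    """Two-sided confidence interval for a difference, using the normal approximation."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return diff - z * std_error, diff + z * std_error

# Figures reported in the post.
mean_a, mean_b = 189.74, 173.62
reported_low, reported_high = 11.66, 20.59
conf_level = 0.9975

diff = mean_a - mean_b  # point estimate of the difference in seconds
# Standard error implied by the reported interval (not stated in the post).
z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
implied_se = (reported_high - reported_low) / 2 / z

low, high = normal_ci(diff, implied_se, conf_level)
print(f"difference = {diff:.2f} s, implied SE = {implied_se:.2f} s")
print(f"interval = ({low:.2f}, {high:.2f}) s, excludes zero: {low > 0}")
```

The rebuilt interval lands on the reported one because the reported interval is centered on the difference of the two means, and since its lower bound is above zero, the difference is significant at the chosen level.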
