
Visualizing Web Analytics in R Part 4: Interactive Globe

This article is the fourth in a series about visualizing Google Analytics and other web analytics data using R. It focuses on interactive globe visualizations that show where search-related website traffic originates. The series aims to show how R and interactive visualizations can help answer the following business questions:

  • Which articles need work to improve search engine ranking
  • Which articles are well ranked but do not get clicked, and need work on titles or meta data
  • Where to focus efforts for new content
  • How to use passive web search data to focus new product development

The other articles in the series are:

Globe Showing Relative Page Use

Geographic visualizations are increasingly important in understanding many types of data. The section that follows shows a way to visualize Google Search Console and Google Analytics data on an interactive globe using the globejs function in the threejs package. This type of visualization is especially suited to origin-destination pair data like airline or telecommunications data, but is still very useful for point data like web analytics.

The first step in this process was to find geocoded values for the ISO Region Codes provided by Google Analytics. Country data is readily available, but state or province level data is more difficult to obtain. The analysis in this article aggregates data at the country level.
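The country-level aggregation and geocoding step can be sketched in base R. This is only an illustration under stated assumptions: the data frame names, the column names, the sample values and the approximate centroid coordinates are all hypothetical and do not come from the original analysis.

```r
# Hypothetical Google Analytics export: one row per region and article category.
ga_sessions <- data.frame(
  country  = c("US", "US", "DE", "DE", "IN"),
  region   = c("US-CA", "US-NY", "DE-BY", "DE-BE", "IN-KA"),
  category = c("general", "general", "general", "consumer", "banking"),
  sessions = c(120, 45, 30, 12, 8)
)

# Hypothetical country centroid lookup (approximate coordinates).
centroids <- data.frame(
  country = c("US", "DE", "IN"),
  lat     = c(39.8, 51.2, 20.6),
  long    = c(-98.6, 10.4, 79.0)
)

# Sum sessions per country and category, dropping the region level.
by_country <- aggregate(sessions ~ country + category,
                        data = ga_sessions, FUN = sum)

# Attach coordinates; all.x = TRUE keeps countries that lack a centroid
# so that missing geocodes show up as NA instead of silently vanishing.
geo_sessions <- merge(by_country, centroids, by = "country", all.x = TRUE)
```

Keeping unmatched rows makes it easy to check which country codes still need a geocoded value before plotting.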

Figure 1 shows the country of origin for sessions for each of four categories of article:

  • General articles are blue
  • Web commerce articles are red
  • Consumer articles are yellow
  • Banking articles are green

Because the globejs function applies a single arc height to all arcs, the latitude/longitude of each origin is jittered so that the different arcs do not overlap and are all visible. The session volume is scaled and applied to the line width, but the differences in width are not particularly easy to read in this visualization. The visualization does make it clear that the vast majority of traffic comes from the United States and Europe, and that “general” articles are read only in the US and Europe.
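The jittering and line-width scaling described above might look something like the following. This is a minimal sketch, not the code behind Figure 1: the sample data, the shared destination point, the one-degree jitter range and the 1 to 6 width scale are all assumptions made for illustration. The globejs call is skipped if the threejs package is not installed.

```r
set.seed(42)  # make the jitter reproducible

# Hypothetical country-level data: one row per country and article category.
geo_sessions <- data.frame(
  country  = c("US", "US", "DE", "DE", "IN"),
  category = c("general", "banking", "general", "consumer", "banking"),
  sessions = c(165, 45, 30, 12, 8),
  lat      = c(39.8, 39.8, 51.2, 51.2, 20.6),
  long     = c(-98.6, -98.6, 10.4, 10.4, 79.0)
)

# Category colors matching the legend in the article (hex values assumed).
category_colors <- c(general = "#0000ff", commerce = "#ff0000",
                     consumer = "#ffff00", banking = "#00ff00")

# Hypothetical destination shared by every arc (for example, a server location).
dest_lat  <- 37.4
dest_long <- -122.1

# Jitter each origin by up to one degree so arcs from the same country separate.
n <- nrow(geo_sessions)
arcs <- data.frame(
  origin_lat  = geo_sessions$lat  + runif(n, -1, 1),
  origin_long = geo_sessions$long + runif(n, -1, 1),
  dest_lat    = rep(dest_lat, n),
  dest_long   = rep(dest_long, n)
)

# Scale session volume into a line width between 1 and 6.
arc_lwd   <- 1 + 5 * geo_sessions$sessions / max(geo_sessions$sessions)
arc_color <- unname(category_colors[geo_sessions$category])

if (requireNamespace("threejs", quietly = TRUE)) {
  # One arc height for all arcs; color and width vary per arc.
  globe <- threejs::globejs(
    lat = arcs$origin_lat, long = arcs$origin_long,
    value = 1, color = arc_color,
    arcs = arcs, arcsColor = arc_color, arcsLwd = arc_lwd,
    arcsHeight = 0.4, arcsOpacity = 0.6, atmosphere = TRUE
  )
  # Print `globe` in an interactive session to display the widget.
}
```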

Figure 2 shows the same session data, but in a geographic bar chart format. For this particular dataset, this visualization is easier to understand and easier to generate, as it does not require artificially generating the origin-destination pairs. In Figure 2, it is clear that the US and the rest of the developed world generate the vast majority of the traffic.
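The bar format needs only a point, a height and a color per row, which is why it is simpler to produce. The sketch below is an illustration rather than the code behind Figure 2: the sample data, the 0 to 100 height scale and the fixed longitude offset per category (used so that bars from the same country sit side by side) are assumptions. The globejs call is skipped if the threejs package is not installed.

```r
# Hypothetical country-level data: one row per country and article category.
geo_sessions <- data.frame(
  country  = c("US", "US", "DE", "DE", "IN"),
  category = c("general", "banking", "general", "consumer", "banking"),
  sessions = c(165, 45, 30, 12, 8),
  lat      = c(39.8, 39.8, 51.2, 51.2, 20.6),
  long     = c(-98.6, -98.6, 10.4, 10.4, 79.0)
)

# Category colors matching the legend in the article (hex values assumed).
category_colors <- c(general = "#0000ff", commerce = "#ff0000",
                     consumer = "#ffff00", banking = "#00ff00")

# Assumed layout choice: shift each category by a fixed longitude offset
# so that bars for the same country do not sit on top of each other.
category_index <- match(geo_sessions$category, names(category_colors))
bar_long <- geo_sessions$long + (category_index - 2.5) * 1.5

# Scale session volume so the tallest bar has height 100.
bar_height <- 100 * geo_sessions$sessions / max(geo_sessions$sessions)
bar_color  <- unname(category_colors[geo_sessions$category])

if (requireNamespace("threejs", quietly = TRUE)) {
  globe <- threejs::globejs(
    lat = geo_sessions$lat, long = bar_long,
    value = bar_height, color = bar_color,
    atmosphere = TRUE
  )
  # Print `globe` in an interactive session to display the widget.
}
```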

Figure 1. Globe origin-destination visualization of Google Analytics session data by country of origin. Blue arcs show traffic for general articles, yellow arcs show consumer article traffic, red arcs show web-commerce article traffic and green arcs show banking article traffic. The line width of each arc shows the relative traffic volume.
Figure 2. Globe bar chart visualization of Google Analytics session data by country of origin. Blue bars show traffic for general articles, yellow bars show consumer article traffic, red bars show web-commerce article traffic and green bars show banking article traffic. The height of each bar shows the relative traffic volume.

These figures are useful for understanding the regional patterns in the data, and they make it easy for users to combine the visualization with their own knowledge of the underlying geography and demographics. Unfortunately, the globe visualizations cannot show article-level detail. For article-level detail, a heatmap is a better visualization; interactive heatmaps using the d3heatmap package are demonstrated in the next article in the series, Visualizing Web Analytics in R Part 5: Interactive Heatmap.

Notes

This article was written in RStudio and uses the threejs package for all graphics.