This past week I have been studying the Google Analytics data to determine the best topics to discuss in the upcoming month and to prepare for the SAS Global Forum 2012 conference. Google Analytics lets me measure which posts get the most hits when first posted and which continue to be popular or referenced. I’ll unveil my little-known but highly coveted analytics process. How’s that for some hype?

Getting the Google Analytics Data

In a past article, I discussed how I extracted the data from Google Analytics using Excellent Analytics. The following figure shows my query in the Excellent Analytics tool, the data results, and the results in SAS Enterprise Guide. The data consists of the date, post path and title (with URL), source (how the person came to the site), and visitor type. It’s probably obvious why I would want the date, post, and source, but what is not so obvious is visitor type. Visitor type allows me to analyze which topics bring new visitors to the site as well as which topics retain visitors.

Cleaning Up the Data

My original plan was to use MS Excel to complete this analysis since I thought it would be simple. However, I realized that the data needed some cleanup and I wanted to place it in categories (similar posts, similar sources, etc.). Obviously, SAS Enterprise Guide was a better choice for this task. I could clean the data and categorize it to my heart’s content.

Good thing I know how to parse data, because the variables I want are page_path and page_title, and I need them to be a little more categorical. When I started blogging, I let WordPress name my articles. Now I try to include the topic in the page path (URL) so I can parse it. The blog topics generally fit into about 10 SAS BI topics, so it’s easy for me to control in this manner. If there were more people writing for the site, I would probably find a way to use the WordPress categories.

This figure gives you an idea of how I cleaned the data. Since page_path can have a lot of variations, you have to look for patterns. For instance, every post path begins with the year and month, so those values always have more than three slashes. The value “/” appears when someone comes to the site home page, and “/yoast-ga/” appears when someone followed a link I had on the site.
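The pattern rules above can be sketched in a few lines. This is a Python illustration rather than the SAS Enterprise Guide logic shown in the figure. The keyword-to-topic map, the category labels, and the function name are all hypothetical, since the post does not list the actual topics or rules.

```python
# Hypothetical keyword-to-topic map; the real topic list is not given in the post.
TOPIC_KEYWORDS = {
    "stored-process": "Stored Processes",
    "enterprise-guide": "Enterprise Guide",
    "olap": "OLAP Cubes",
}


def categorize(page_path: str) -> str:
    """Assign a page_path to a coarse category using the patterns described above."""
    if page_path == "/":
        return "Site Home Page"
    if page_path.startswith("/yoast-ga/"):
        return "Followed Site Link"
    # Post paths start with /year/month/, so they contain more than three slashes.
    if page_path.count("/") > 3:
        parts = [p for p in page_path.split("/") if p]
        slug = parts[-1] if parts else ""
        for keyword, topic in TOPIC_KEYWORDS.items():
            if keyword in slug:
                return topic
        return "Other Post"
    return "Other"
```

With these assumed keywords, a path such as "/2011/12/stored-process-prompts/" lands in the Stored Processes category because the topic is part of the URL.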

Let’s Bounce …

To shorten my story, the data was put into a dataset, pulled into SAS Information Map Studio, and then lovingly placed in SAS Web Report Studio. In the report, I started with the unadulterated view, which was similar to this figure. It covers everything from the past three months, which is too much to review at once. Notice that I added the Bounce Rate, which indicates whether the visitor viewed more pages after looking at the landing page. Since this is a blog, it’s not odd that the bounce rate is high. Typically, someone reads the post for the day and then goes on about his business. This chart does give me some ideas, though: for instance, BI Dashboard and SAS Management Console have very low bounce rates.
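For anyone who wants to see the arithmetic, bounce rate is the share of visits that viewed only the landing page. Here is a minimal Python sketch of rolling it up by topic. It assumes each row carries a visit count and a bounce count. The row layout and function name are made up for illustration, and this is not how the report itself was built.

```python
from collections import defaultdict


def bounce_rate_by_topic(rows):
    """rows: iterable of (topic, visits, bounces) tuples (assumed layout).

    Returns {topic: bounces / visits}, summing the rows for each topic first.
    """
    visits = defaultdict(int)
    bounces = defaultdict(int)
    for topic, v, b in rows:
        visits[topic] += v
        bounces[topic] += b
    # Guard against topics with no visits to avoid dividing by zero.
    return {t: (bounces[t] / visits[t] if visits[t] else 0.0) for t in visits}
```

Summing before dividing matters: averaging the per-page rates would give small pages the same weight as popular ones.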


I cleaned up the chart so I could more easily understand which topics were the most popular last month. I added a filter and ranked the data by visits. I always throw out the Site Home Page visits because many people enter the site that way, so by default it’s always the highest. Likewise, many people are interested in the upcoming book, “Building Business Intelligence with SAS”, so that link ranks high. This leaves me with Enterprise Guide, Stored Processes, and OLAP Cubes. SAS EG is a favorite tool, but it’s difficult to find new things to discuss (The SAS Dummy, aka Chris Hemedinger, rules that realm!). Steve has OLAP covered and he has some good articles planned, so I’ll let him continue those discussions.
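The filter-and-rank step is easy to picture outside of SAS Web Report Studio, too. The sketch below is a Python illustration of the same idea: keep only last month’s rows, drop the entries that always dominate, and sort by visits. The row layout, function names, and exclusion list are assumptions, not anything taken from the report definition.

```python
from datetime import date


def previous_month(today):
    """Return (year, month) for the calendar month before `today`."""
    if today.month == 1:
        return today.year - 1, 12
    return today.year, today.month - 1


def rank_topics(rows, today, exclude=("Site Home Page",)):
    """rows: iterable of (visit_date, topic, visits) tuples (assumed layout).

    Returns [(topic, total_visits)] for last month, highest first.
    """
    year, month = previous_month(today)
    totals = {}
    for visit_date, topic, visits in rows:
        if (visit_date.year, visit_date.month) != (year, month):
            continue  # outside the previous month
        if topic in exclude:
            continue  # entries that would always rank first
        totals[topic] = totals.get(topic, 0) + visits
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Passing the book link’s category in `exclude` as well would remove it from the ranking the same way.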

I am curious about what attracts new visitors, so I added another line to see what percentage of each topic’s visits came from new users. New users were attracted to the stored processes topic, and it had a lower bounce rate. This is a topic close to my heart, so I want to explore it a little more. As a result, I created a stored process that I could set up as a link from this chart.
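The new-visitor line boils down to a simple ratio per topic. Here is a rough Python sketch. It assumes the visitor type values are "New Visitor" and "Returning Visitor" and that each row has a visit count. Both the row layout and the function name are illustrative only.

```python
def new_visitor_share(rows):
    """rows: iterable of (topic, visitor_type, visits) tuples (assumed layout).

    Returns {topic: fraction of that topic's visits made by new visitors}.
    """
    total = {}
    new = {}
    for topic, visitor_type, visits in rows:
        total[topic] = total.get(topic, 0) + visits
        if visitor_type == "New Visitor":  # assumed label for first-time visitors
            new[topic] = new.get(topic, 0) + visits
    # Topics with zero visits are skipped to avoid dividing by zero.
    return {t: new.get(t, 0) / total[t] for t in total if total[t]}
```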

What is so interesting about Stored Processes?

From the above chart, if you click on any of the bars, the stored process uses the data in a prompt and produces the chart below. The stored process is a PROC TABULATE that filters by the previous month and shows the pages associated with the topic. The results show that the new visitors were interested in prompts and the returning visitors were interested in prompts. Looks like I need to think of some more ideas about prompts or creative ways to use a stored process. For instance, linking a SAS Web Report Studio chart to a stored process as I did for this post. ;-)
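To give a feel for what that drill-down table contains, here is a small Python sketch of the same kind of cross-tabulation: pages down the side, visitor type across the top, visits in the cells. It is not the PROC TABULATE code behind the stored process, which this post does not show. The row layout and function name are hypothetical, and the rows are assumed to be filtered to the previous month already.

```python
def pages_by_visitor_type(rows, topic):
    """rows: iterable of (topic, page_title, visitor_type, visits) tuples (assumed layout).

    `topic` plays the role of the prompt value passed in from the clicked bar.
    Returns {page_title: {visitor_type: visits}} for that topic only.
    """
    table = {}
    for row_topic, page, visitor_type, visits in rows:
        if row_topic != topic:
            continue  # keep only the topic that was clicked
        cell = table.setdefault(page, {})
        cell[visitor_type] = cell.get(visitor_type, 0) + visits
    return table
```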

What about You?

Is there a topic you want to see more about?  If you want to prompt me with a stored process idea – leave a comment or drop me a line. If you have some other ideas, just let me know.