
Six Degrees of SEO Separation : Can We Reverse Engineer TrustRank and Hilltop?

0 Comments | This entry was posted on Jul 08 2009

Imagine playing the “Six Degrees of Separation” game, but with websites. The closer you are to having a link from Maclean’s magazine or Time, the better.

The ideal would be to actually be Maclean’s or Time, obviously.

But short of that, getting a link from either site is excellent (1 degree of separation). Getting a link from a site that was linked from Time is pretty good (2 degrees). Getting a link from a site that was linked from a site that was linked to by Time is good (3 degrees). A link from a site, which was linked from a site, which was linked from a site, which was linked from Time is OK…

TrustRank and Hilltop are two search engine algorithms that are based on a “seed set” of trusted sites. The closer you are to having a link (or several) from the trusted sites in the seed set, the more trust you have.

More trust means higher rankings. What if we could measure the trust our site has, or that of our competitors? Here’s an idea I just had to do exactly that.

Ironically, one of the most competitive keywords to rank for online is probably not something anyone has intentionally tried to rank for. I’m referring to “click here.” Those huge sites that have millions of links, many of which use “click here” as anchor text, end up ranking for that phrase.

And while the correlation isn’t perfect, the sites that rank for click here tend to be pretty well trusted authorities.

  • So perhaps by Googling “click here,” we can form a “seed set” of trusted sites?
  • Once you have that seed set, you can come up with search keywords in Yahoo Site Explorer that let you look for links from the likes of Time.com etc. This will show you any first degree links passing a lot of trust to the recipient.
  • If you build a scraper and get clever with it, you can even dig through some of those linking sites’ backlinks, and find 2nd degree trust relationships.
  • And while the use/benefit of this data may not have been initially evident, anyone using SEO for Firefox can look for .edu or .gov sites in a given site’s backlinks. Typically, these are pretty well trusted links too.
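To make the degrees idea concrete, here is a minimal Python sketch that walks outward from a seed set with a breadth-first search. The link graph, the `trust_degrees` function name and the `.example` domains are all made up for illustration. In practice the graph would have to come from whatever backlink data you manage to scrape, and nothing here reflects how TrustRank or Hilltop are actually implemented.

```python
from collections import deque


def trust_degrees(links, seeds, max_degree=3):
    """Return {site: degrees of separation from the nearest seed site}.

    links maps each site to the sites it links OUT to (hypothetical data,
    e.g. assembled from a backlink scraper).
    """
    degrees = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        site = queue.popleft()
        if degrees[site] >= max_degree:
            continue  # do not expand past the chosen depth
        for target in links.get(site, ()):
            if target not in degrees:  # first visit is the shortest path
                degrees[target] = degrees[site] + 1
                queue.append(target)
    return degrees


# Hypothetical link graph: each key links out to the listed sites.
links = {
    "time.com": ["a.example"],
    "a.example": ["b.example"],
    "b.example": ["c.example"],
    "c.example": ["d.example"],
}
degrees = trust_degrees(links, seeds=["time.com"])
print(degrees)
```

Lower numbers mean a site sits closer to the seed set. Sites that never show up in the result are more than `max_degree` links away from every seed.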

So SEJ readers, what do you think? Is this a valuable tip for finding out how much trust your competitors have? Is it too imprecise, or is an approximation like this better than nothing? Also, here’s a ping to @SEOmoz – Is this how your mozTrust metric works?

Gab Goldenberg writes on advanced seo for his blog, often sharing new techniques and ideas.

Check out the SEO Tools guide at Search Engine Journal.


Interview with Microsoft Bing’s Rajesh Srivastava, Principal Group Manager, Bing

0 Comments | This entry was posted on Jul 08 2009

Bing was launched by Microsoft as a consolidated search engine brand and technology with which the company hopes to compete side by side with Google. Bing’s much-awaited launch caused some real buzz in the Internet world. Today we are featuring an interview with Rajesh Srivastava, Principal Group Manager, Bing.

First question that probably interests everyone: why no Bing toolbar? I understand you may be tired of people asking you that – but still, just a few words on Bing’s plans on that.

We currently have two amazing toolbars in the MSN toolbar and Windows Live toolbar, and the search functionality for both is powered by Bing. We’re always listening to customers to learn what they want and introduce new products appropriately, however we don’t have any plans for a Bing toolbar to announce at this time.

How long did it take the whole Bing team to choose and approve the name?

We went through an extensive naming process to develop the new brand name: months of development and worldwide naming research helped us narrow in on a few top candidates. We did all the usual things you would expect: trademark searches, WHOIS lookups, usability and recall studies.

We were looking for a name that was short, easy to say and spell, and would be globally appropriate. In addition, we were looking for names that carried inherent qualities that spoke to the search category itself. Our research around Bing showed that it connoted “fast,” “easy,” and “delight” – all qualities that mapped very naturally to the search experience.

It was also seen as the friendliest and most approachable name option. We like Bing because it allows for a new experience beyond search. It sounds off in our heads when we think about that moment of discovery and decision making – about resolution of the important tasks we all think are important. 

Are you generally satisfied with the publicity? There have been a lot of positive reviews as well as very good trends. What do you think about how Bing was accepted?

The launch of Bing was a milestone for us and we’re pleased with the initial reviews, but this is only the beginning. We believe there is much yet to accomplish in search and the positive feedback we’ve seen to this point is confirmation that people want more from search. Bing is a first step on this journey of evolving search into a more refined tool to help customers cut through the Internet’s clutter to make faster, more informed decisions.

How was webmasters’ feedback? Did they find Bing ranking algorithm predictable / clear enough? Or was there much misunderstanding?

We are continuously refining and improving our crawling and indexing abilities, however with Bing, there were no major changes to our indexing which helped with a smooth transition for our webmaster community when we launched Bing. We continue to engage directly with our webmasters via forums and the Webmaster blog and are focused on making the experience transparent and predictable for this core audience.

A very good thing about Bing is its team’s willingness to interact with webmasters and searchers… How (fast) is users’ feedback implemented?

We maintain and manage an active community site for all our customers including webmasters, developers and our every day searchers. The community site features a blog where we share updates on features, details behind the design and development of Bing and tips and tools for increasing engagement with your site. 

We work as quickly as possible to respond to and implement feedback from customers. For example, when we launched Bing, our community made us aware of an issue with our video preview functionality. We worked quickly to resolve the issues and within a couple of days we had addressed the situation and updated our community on the blog.  It’s because of the two way dialogue we maintain with our passionate community that we are able to address and fix issues quickly.

How do you leverage social media to communicate with webmasters?

We see social media as an excellent way to communicate with webmasters and all our customers. In fact, last week we made available the Bing Toolbox which is designed to build community and provide all of the content and tools webmasters and developers need to enhance their sites, understand the impact of Bing’s new features, and get the most out of using the Bing API.  With the launch of Bing, we also launched a new Bing community site focused on engaging our broad search community including webmasters.

We publish regular blog posts and have active forum discussions on topics including SEO, SEM, and site architecture.  In addition to our community sites, forums and blogs, we’ve developed a strong following on Twitter where we have a team monitoring and responding to issues regularly. We also have a presence on Facebook which we use to communicate updates on Bing as well.

Launching a search engine in the social media era, do you feel it was worth trying to make the search more "social" (most established search engines are now trying to introduce some social search features like voting and commenting). Do you plan to start experimenting with user-enhanced search features?

Our main focus right now is providing the best search experience for our customers. We’re always looking at areas where we can grow and innovate, and social and user-enhanced search features are certainly on our list.

For example, through Facebook and Twitter we are engaging our users in cool and interesting ways. We continue to hear that one of the favorite features of Bing is the rich and interactive daily image that appears on the Bing homepage. Due to the great feedback through channels like Twitter and Facebook, we decided to open photo submissions and voting to the public in the form of a Bing Photo contest on Facebook. The winning photograph will be featured as the homepage image on Bing on August 3rd and the winning photographer will be given credit for the photo.

We’ve also created a photo sharing app on Facebook in which people can share or tag a Bing homepage image they like. This type of voting and commenting helps us better understand what our users like to see.  So these are a few ways we’re using social media to engage our users.  We’re also working with Twitter to incorporate searches for tweets into Bing. See our recent blog post for more details.

Meet Bing team on Twitter and Facebook!


Using the Adcenter Excel Plugin for Keyphrase Research

0 Comments | This entry was posted on Jul 08 2009

Posted by Tom_C

While brainstorming potential blog post topics I decided that keyphrase research doesn’t get enough love here and I merrily agreed to write a post on the topic. Little did I know that this would result in a morning of maths. Monday morning to be precise. Monday maths morning. I might make that a recurring theme.

This post is going to be all about the Adcenter Ad Intelligence plugin for Microsoft Excel. For those who don’t know what it is, this is a nifty plugin for Excel which allows you to import keyphrase data into your spreadsheets.

Why do you care about Ad Intelligence

Reasons you should care about the Adcenter Ad Intelligence plugin for Excel:

  • It’s really easy to install
  • It works from right within Excel so is really easy to use with existing keyphrase data
  • Data feels like it’s right at your fingertips – running functions and querying data is VERY quick.
  • The comparative keyphrase traffic levels are well correlated with Google’s so you can use this data to predict Google data with a high degree of accuracy*

Note that you’ll need to have an adcenter account to use the plugin but you can sign up for one of those relatively easily and given the above benefits I think it’s worth it.

* – Please see below for some maths notes. This is NOT a linear correlation so you can’t just use a multiple of the Adcenter data to generate the Google data. The important take-away for non-maths types is that the ORDERING of the keyphrases with respect to traffic levels is well correlated to Google’s data.

Screenshots

The plugin integrates very smoothly with Excel (as you would hope!) and can be installed very quickly. Once installed, you see a new tab within Excel with the following options:

Or, if you’re in the UK you get fewer options:

There’s a couple of awesome things to realise about this:

  1. Since it’s just another tab within Excel it’s going to integrate with zero hassle into any other data that you can format in Excel – for example a keyword export from Adwords or from Google Analytics…
  2. There’s a country/language dropdown which allows you to make sure you’re getting the correct data. International support FTW (though this is limited to only a few countries at the moment).

But what do all these functions do?

Keyword Extraction

You provide the URL and the tool returns a bunch of relevant keyphrases. Here’s the data for SEOmoz:

Keyword Suggestion

The keyword suggestion tool is really handy for doing speculative keyphrase research – trying to find keyphrases that you might have missed. There’s three aspects to the keyword suggestion tool.

The first is the "Campaign Association" which looks at the keywords that other advertisers have in their account as well as the selected phrase. For example here is the data for the term "football" (from the UK) and you can see that it contains nicely related phrases but which don’t always contain the keyword in them. Note how this has a commercial bent since it’s pulled from bidding data not search data:

The second is the "Queries that contain your keyword" which returns everything you might consider to be a phrase match in Google Adwords and is pulled from search data so doesn’t skew the data to commercial phrases:

The third is the "Related Search" which is most useful for pulling in keyphrases you might not have thought of. The following is the data returned for the phrase "car insurance" – you can see that not all keyphrases mention car or insurance (note that this is also pulled from search data not bidding data):

Popular Keywords

The popular keywords tool is useful for analysing trending terms and spikes in your data. I’m not as convinced about the usefulness of this data – looking at it I think tools like Google Trends are probably more useful. Here are the screenshots for the Animals Category (because the data it pulls out is amusing and all about monkeys):

"Traffic Spikes":

"Top Frequent Search Queries":

Traffic

This is, for me, one of the most interesting features of the tool and probably the one that I will use most frequently. This tool generates monthly (or daily) traffic figures for a list of keyphrases that you select. For example:

Note that this data has been chopped up a bit to fit into a blog post – in reality it gives you data going back 12 months as well as the projected data for the coming 3 months. This is pretty awesome. Of course, it’s only awesome if the data is accurate, right? Suffice to say that the data is pretty damned good. As stated above, the data correlates very well with the Google search volumes so you can use this tool to rank keyphrases by traffic level very quickly. Note though that the correlation is NOT linear so you can’t apply a simple multiple to the data to generate the estimated Google search volume. More on this in the maths section below.
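If you want to check the ordering claim on your own keyphrase lists, a rank correlation is one way to do it. The Python sketch below is an assumption about how you might test it, not the method used in this post: it computes Spearman's rank correlation by hand, the function names are my own, and the traffic figures are invented.

```python
import math


def ranks(values):
    """Rank values from 1 (smallest) upward; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Invented monthly volumes for five keyphrases (not real data).
adcenter = [900, 400, 120, 60, 10]
google = [21000, 9000, 4100, 1800, 700]

rho = spearman(adcenter, google)
print(round(rho, 3))
```

A value near 1 says the two tools put the keyphrases in the same order, even when the raw numbers are far apart.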

Demographics

The demographics tool is very cool and not only gives you a breakdown by age (truncated in the screenshot below – in reality the percentages add up to 100) but it also gives you a gender breakdown which can be very useful information:

Note how this data at least correlates with what you’d expect – the phrase "auto insurance quotes" is heavily male dominated while the more generic "car insurance" phrase is split about 50:50.

Summary

In summary, this tool is essential for performing keyphrase research – not only is the data solid but it’s also lightning quick (compared to other sources anyway) and provides you with some information which it can be very difficult to get elsewhere (such as the demographic split). There are a few other features which I’ve not included here, mainly related to bidding on adcenter or related to the content network. They’re not so applicable to keyphrase research so I’ve not included them here. Perhaps a future blog post!

If you’re a Non-maths Person (NMP) then you can stop reading here.

Maths

So while the above is all very well and certainly shaded pretty colours (that’s all done by default by the tool!) it’s pretty useless if the data isn’t of a high quality. I’ve spent a fair amount of time trying to ascertain whether the traffic data is worth using or not and the conclusion is that the Google Adwords traffic data that you get through the Adwords Tool is well correlated to the traffic data generated by the Ad Intelligence tool, but the correlation is not linear. Instead, I’ve found for UK data that the following equation determines the relationship:

$$\text{Google Data} = 81 \times (\text{Adcenter Data})^{0.8}$$
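As a quick illustration, the equation can be wrapped in a small Python helper. The function name and its parameters are my own labels; the 81 and 0.8 are simply the UK figures quoted above, so treat the output as a rough estimate rather than a true search volume.

```python
def estimate_google_volume(adcenter_volume, multiplier=81.0, exponent=0.8):
    """Estimate Google volume from Adcenter volume with the UK formula above."""
    if adcenter_volume < 0:
        raise ValueError("traffic volume cannot be negative")
    return multiplier * adcenter_volume ** exponent


# Hypothetical Adcenter volumes, chosen only to show the shape of the curve.
for volume in (100, 1000, 10000):
    print(volume, round(estimate_google_volume(volume)))
```

Because the exponent is below 1, a tenfold jump in the Adcenter figure only multiplies the estimate by about 6.3 (that is, 10 to the power 0.8), which is why a simple multiple won't work.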

How did I determine this formula?

Step 1 – Pull out keyphrase data levels from the adwords tool and the ad intelligence tool for the top football keyphrases (as determined by the Google tool)

Step 2 – Look at the correlation between the two data sets.

This provides some really positive-looking graphs like this one for instance over the top keyphrases related to ‘Football’:

However, looking at this data more closely we see that this data is severely skewed by the head terms – they’re so much bigger than the small values that the correlation coefficient is essentially looking at a straight line between the small figures and the one or two head terms. Removing the top four or five head terms we see that the correlation quickly falls apart:
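The head-term effect is easy to reproduce. In this Python sketch the numbers are invented purely to mimic the situation described: five small keyphrases with no real relationship between the two data sets, plus two huge head terms.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented data: (adcenter, google) pairs for the small "tail" keyphrases...
tail = [(10, 50), (20, 10), (30, 40), (40, 20), (50, 30)]
# ...and two made-up head terms that dwarf everything else.
head = [(5000, 9000), (8000, 12000)]

all_x, all_y = zip(*(tail + head))
tail_x, tail_y = zip(*tail)

r_all = pearson(all_x, all_y)
r_tail = pearson(tail_x, tail_y)
print(round(r_all, 3), round(r_tail, 3))
```

With the head terms included the coefficient comes out close to 1; take them away and what is left shows no positive correlation at all.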

Step 3 – I pulled out the data for plenty of other verticals and compared the data from Adwords and Adcenter.

Collating all the data together across a number of verticals we see that there isn’t a particularly strong correlation:

This is still being skewed by the head terms. Trying to come up with a way of smoothing the data I decided to analyse the logs of the traffic levels. What I saw was quite a remarkable result:

This shows a strong correlation over a large data set and implies that there is an underlying correlation but that it’s not linear. In fact, this is exactly the formula (y = 0.8055x + 2.1158) that I’ve used to generate the above equation for estimating Google data from Adcenter data.
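For anyone wanting to repeat the log trick on their own data, here is a minimal Python sketch of fitting a straight line to the logs and turning it back into a power law. Two assumptions to flag: it uses base-10 logs (the base isn't stated above), and the data is synthetic, generated from a known power law so that the fit can be checked. The `fit_power_law` name is my own.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = multiplier * x ** exponent by least squares on log10 values."""
    log_x = [math.log10(x) for x in xs]
    log_y = [math.log10(y) for y in ys]
    mean_x = sum(log_x) / len(log_x)
    mean_y = sum(log_y) / len(log_y)
    slope = sum(
        (a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y)
    ) / sum((a - mean_x) ** 2 for a in log_x)
    intercept = mean_y - slope * mean_x
    # log10(y) = slope * log10(x) + intercept  =>  y = 10**intercept * x**slope
    return 10 ** intercept, slope


# Synthetic data built from a known power law (not real traffic figures).
adcenter = [10, 50, 200, 1000, 5000, 20000]
google = [81 * x ** 0.8 for x in adcenter]

multiplier, exponent = fit_power_law(adcenter, google)
print(round(multiplier, 2), round(exponent, 3))
```

The slope of the fitted line becomes the exponent, and the intercept becomes the multiplier once you raise the log base to it. Real keyphrase data will scatter around the line rather than sit on it.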

Please note, one thing that I’ve not really touched on in this post is the fact that the Google Data isn’t 100% accurate (and in fact can be quite a way off at times) so I’m not advocating that you can find the true search volume, merely that you can find a relation between the data that Google gives you and the data that Adcenter gives you.
