
Tracking cookies. Advertising links. The things that make every web user spit teeth. Is there a way for Google to get around them?
According to a nice article I read by a guy called Bill Slawski (@bill_slawski), yes. In theory it is possible for Google to crawl a page and then recrawl it a few seconds later, to see if any of the content on the page has changed. Content that changes between those two crawls is transient content, and it can be a signifier of the kind of thing that the search engine really doesn’t need to index (given that it changes all the time, what’s the point?).
Indexing transient (short-lived or temporary) content does not seem to be in the best interests of the search engines. Even if the content is not an ad, it is still unlikely to be particularly apposite to the core subject of the page in question, otherwise it would not be changing. I am thinking, for example, of weather data on a surfing website: it is useful to users, who need to know wind speed and direction, but it isn’t actually the cut and thrust of what the page is about. If Google indexed that page according to its transient content as well as its static content, it would not know if it was looking at a page about surfing or about the weather.
A Token Gesture
Google can identify transient content by regularly crawling HTML and splitting the results into simple tokens. In my example above, these tokens would be basically identical except where the text of the weather information between the relevant tags had changed. By parsing these tokens out into a data table Google can quickly analyze the similarities and differences on the page to determine the transient content – and ignore it.
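To make the token idea a bit more concrete, here is a minimal sketch in Python. It is not Google's actual method, just an illustration under my own assumptions: the `tokenize` and `changed_tokens` helpers and the two sample crawls are made up, and the comparison uses the standard library's `difflib` rather than any kind of data table.

```python
import re
from difflib import SequenceMatcher


def tokenize(html):
    """Split HTML into simple tokens: whole tags, and words between tags."""
    return re.findall(r"<[^>]+>|[^<\s]+", html)


def changed_tokens(first_crawl, second_crawl):
    """Return (old_tokens, new_tokens) pairs for every span that differs."""
    old, new = tokenize(first_crawl), tokenize(second_crawl)
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    changes = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":
            changes.append((old[i1:i2], new[j1:j2]))
    return changes


# Hypothetical surfing page, crawled twice a few seconds apart.
crawl_1 = ("<h1>Surf report</h1><p>Best breaks in Cornwall</p>"
           "<span id='wind'>12 mph NW</span>")
crawl_2 = ("<h1>Surf report</h1><p>Best breaks in Cornwall</p>"
           "<span id='wind'>15 mph W</span>")

changes = changed_tokens(crawl_1, crawl_2)
for old_tokens, new_tokens in changes:
    print(old_tokens, "->", new_tokens)
```

On these two sample crawls, only the tokens inside the wind reading differ, so they are the candidates for transient content. The heading and the paragraph about surf breaks match exactly and would stay in the index.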
Time to Reflect
Obviously, content changes pretty regularly on successful websites anyway. It is Google’s job to make sure that it is absolutely clear on what is transient content and what is not. Otherwise everyone starts to get penalized for doing exactly what Google always told us we should, namely keep our content as fresh and changeable as possible.
The only way to do this is by repeated recrawling, with an algorithm in place to determine how frequently the content changes in the same part of the page. If content changes minute to minute, for example, it can safely be assumed to be transient and left out of the page’s index. If, on the other hand, the content is only changing once a week, it might be that it is part of a regular update and is still pertinent to the core subject enshrined in the rest of the content.
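A rough sketch of that frequency check might look like the following. The one-hour cutoff, the `classify_region` function, and the sample crawl data are all assumptions of mine for illustration; nothing here says what thresholds a search engine really uses.

```python
from datetime import datetime, timedelta


def classify_region(observations, transient_below=timedelta(hours=1)):
    """Classify one part of a page from repeated crawls.

    observations: list of (crawl_time, region_text), sorted by time.
    transient_below: assumed cutoff; faster average change means transient.
    """
    pairs = zip(observations, observations[1:])
    change_count = sum(
        1 for (_, before), (_, after) in pairs if before != after
    )
    if change_count == 0:
        return "static"
    span = observations[-1][0] - observations[0][0]
    average_gap = span / change_count
    return "transient" if average_gap < transient_below else "regular update"


start = datetime(2012, 1, 1)

# Weather widget: recrawled every minute, different every time.
weather = [(start + timedelta(minutes=m), f"{10 + m} mph") for m in range(5)]

# Blog column: recrawled daily, text changes once a week.
column = [(start + timedelta(days=d), f"post {d // 7}") for d in range(15)]

print(classify_region(weather))
print(classify_region(column))
```

With this sample data the weather widget changes about once a minute and falls under the cutoff, while the weekly column changes about every seven days and is treated as a regular update.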
What About the Rest of the Site?
Google can also look at other pages within the same site, to see if their HTML has similar code paths in it to the transient content already identified. If an identical or similar HTML path exists, Google can check it using the same rapid-fire technique to see if it, too, contains transient content.
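Here is one way the path comparison could be sketched, using Python's built-in `html.parser`. The path format (`html/body/div.weather/span`), the `PathCollector` class, and the sample page are my own inventions for the example, and it only looks for identical paths, not similar ones.

```python
from html.parser import HTMLParser

# Tags that never get a closing tag, so they are kept off the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class PathCollector(HTMLParser):
    """Record the element path leading to each piece of visible text."""

    def __init__(self):
        super().__init__()
        self.stack = []   # list of (tag, label) for the open elements
        self.paths = {}   # path string -> text found at that path

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        label = tag
        if attributes.get("id"):
            label += "#" + attributes["id"]
        elif classes:
            label += "." + classes[0]
        self.stack.append((tag, label))

    def handle_endtag(self, tag):
        # Pop back to the nearest open element with this tag name.
        for index in range(len(self.stack) - 1, -1, -1):
            if self.stack[index][0] == tag:
                del self.stack[index:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text:
            path = "/".join(label for _, label in self.stack)
            self.paths[path] = text


def text_paths(html):
    collector = PathCollector()
    collector.feed(html)
    collector.close()
    return collector.paths


def paths_to_recheck(known_transient_paths, other_page_html):
    """Paths on another page that match ones already flagged as transient."""
    return sorted(set(known_transient_paths) & set(text_paths(other_page_html)))


# Path flagged on the surf report page (assumed), checked against a second page.
known = {"html/body/div.weather/span"}
other_page = ("<html><body><div id='main'><p>Wetsuit guide</p></div>"
              "<div class='weather'><span>9 mph S</span></div></body></html>")

candidates = paths_to_recheck(known, other_page)
print(candidates)
```

Any path that comes back is only a candidate. It would still need the same quick recrawl to confirm that the content at that path really does change.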
So What’s the Point?
Transient content may well be damaging to your site’s index, because it dilutes the core message of your page. You’ve fought hard to get your site optimized for perfect indexing on all of your major keywords and phrases. It would be a shame for a simple date counter to offset some of that hard work by diminishing the watertight authority of your site.
Transient content is not the only content that could affect the strength of your indexing for your keywords. Boilerplate, which is stuff like menus and navigation bars, needs to be given a reduced weighting to prevent it, too, from diluting the main thrust of your page. Using a similar routine, Google can check the HTML in a page against the HTML in other pages on your site to determine how much boilerplate you have. Even when the links in your navigation are optimized, the existence of some of them can diminish the authority of each page, because they refer to other pages, which are about other things.
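As a simple illustration of the boilerplate check, the sketch below counts how many pages of a site each block of text appears on and gives repeated blocks a lower weight. The weighting formula, the `boilerplate_weights` function, and the sample site are assumptions made up for this example, not a description of how Google scores anything.

```python
from collections import Counter


def boilerplate_weights(pages):
    """Give each text block a weight between 0 and 1.

    pages: dict mapping a URL to the list of text blocks found on that page.
    A block seen on a single page keeps a weight of 1.0; the more pages
    it repeats on, the lower its weight (an assumed, illustrative formula).
    """
    page_count = len(pages)
    seen_on = Counter()
    for blocks in pages.values():
        seen_on.update(set(blocks))  # count each block once per page
    return {
        block: 1 - (count - 1) / page_count
        for block, count in seen_on.items()
    }


# Hypothetical three-page surfing site sharing one navigation bar.
site = {
    "/": ["Home | Shop | Contact", "Welcome to the surf school"],
    "/wetsuits": ["Home | Shop | Contact", "Wetsuit guide"],
    "/boards": ["Home | Shop | Contact", "Choosing a board"],
}

weights = boilerplate_weights(site)
for block, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{weight:.2f}  {block}")
```

The navigation bar shows up on every page, so it ends up with the lowest weight, while each page's own text keeps full weight.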
Google’s algorithms are there to help properly optimized sites get good rankings. Be aware of what you can and can’t do.
Guest Author: Roxanne Peterson works closely with communities such as SEO Positive. An expert in social media, internet communications and technologies, she has all-round expertise in social media and SEO activities. With in-depth knowledge of the latest developments in the field, she regularly shares them with her readers through her blogs and articles.

January 25th, 2012
Posted in: Guest Authors

