javascript - Is there a way to let crawlers ignore parts of a document? -


I am aware that you can control which documents a crawler/spider can access using robots.txt, meta tags, link attributes, and so on.

But in my special case I only want a portion of a document to be ignored. That portion cannot live in an iframe; it is "normal" content. Something like a <noscript> block would be amazing: a way to mark partial content as "don't index this, please."

  • At first I thought of using document.write() to write out those parts (see the sketch after this list), but I have learned that the assumption "spiders do not execute JavaScript" seems to be wrong.
  • I am also thinking about serving a different version of the page when I detect a crawler, but that detection is not accurate, right?
  • Also, I cannot put the content into an image.
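
For reference, the document.write() idea from the first bullet would look roughly like this. It is only a sketch (the element id "content" is borrowed from the code further down), and it fails for exactly the reason mentioned: many crawlers do execute JavaScript.

// Emit the "hidden" markup only when JavaScript actually runs.
// Crawlers that execute JavaScript will still see and index it.
document.write('<div id="content">this part should not be indexed</div>');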

Are there any tricks to avoid getting a specific part of a document (not specific words spread around the document) indexed?

[edit] I know about the "if the user agent is in a list of known robots" approach, but I don't like the idea. Possibly there is a more agnostic approach. The part to be suppressed contains dynamic content, and whatever I end up doing has to work in "old" browsers like IE6 :\
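
For completeness, the user-agent approach I would rather avoid could look roughly like this. This is a minimal sketch only, using a Node-style HTTP server for illustration; the list of bot substrings, the element id and the port are illustrative assumptions, and the list can never be complete, which is exactly the problem.

var http = require('http');

// Substrings commonly found in crawler user agents (incomplete by design).
var BOT_PATTERNS = ['googlebot', 'bingbot', 'slurp', 'baiduspider'];

function isCrawler(userAgent) {
  var ua = (userAgent || '').toLowerCase();
  for (var i = 0; i < BOT_PATTERNS.length; i++) {
    if (ua.indexOf(BOT_PATTERNS[i]) !== -1) return true;
  }
  return false;
}

http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/html'});
  if (isCrawler(req.headers['user-agent'])) {
    // Crawler detected: serve the page without the part that should not be indexed.
    res.end('<div id="content"></div>');
  } else {
    // Regular visitor: serve the full content.
    res.end('<div id="content">this is static</div>');
  }
}).listen(8080);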

The only difference between the static content and the dynamic content is the extension of the file I include:

var extension = "js"; // change to "php", for example, to load the dynamic content

function loadjs(filename) {
  var js = document.createElement('script');
  js.setAttribute("type", "text/javascript");
  js.setAttribute("src", filename);
  document.getElementsByTagName("head")[0].appendChild(js);
}

window.onload = function() {
  loadjs("somecontenttoload." + extension); // hard for crawlers to read
}

In somecontenttoload.js:

document.getElementById("content").innerHTML = "this is static";

In somecontenttoload.php:

<?php
  header("Content-Type: text/javascript");
  // load the data from the database
  $bla = .....;
?>
document.getElementById("content").innerHTML = "<? echo $bla; ?>";
