Thursday, June 03, 2004

Search Ranking Factors

I'm trying to footle about with some fairly rudimentary factors to improve the relevance of results in our intranet search engine. Assume that a matching word in a document can have a weighting of between 1 and 1000 depending upon its location within the document, with 1000 going to instances that occur very early on (this is set, I cannot alter it). What I want to know is: what should the related weighting be for instances in a) titles, b) subheadings (h1, h2 etc., which are unfortunately not very widespread on the intranet), and c) link text?
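To make the question concrete, here is a rough Python sketch of how I picture the scoring. It is not the engine's actual code: the function names, the document layout, and the linear fall-off from 1000 to 1 are all my own assumptions, and the three field weights are placeholder guesses, which are exactly the numbers I'm asking about.

```python
# Hypothetical sketch: every name and number below is a placeholder,
# not the real search engine's API or configuration.

MAX_BODY_WEIGHT = 1000  # fixed by the engine: a match at the very start of the body
MIN_BODY_WEIGHT = 1     # fixed by the engine: a match at the very end of the body

# The unknowns. These values are guesses to be tuned, not recommendations.
FIELD_WEIGHTS = {"title": 1500, "heading": 500, "link_text": 1500}


def body_weight(position, doc_length):
    """Assumed linear decay from 1000 (first word) down to 1 (last word)."""
    if doc_length <= 1:
        return MAX_BODY_WEIGHT
    fraction = position / (doc_length - 1)
    return round(MAX_BODY_WEIGHT - fraction * (MAX_BODY_WEIGHT - MIN_BODY_WEIGHT))


def score(term, doc):
    """Add up positional body weights, plus a flat weight per field match."""
    term = term.lower()
    body = doc.get("body", "").lower().split()
    total = sum(
        body_weight(i, len(body)) for i, word in enumerate(body) if word == term
    )
    for field, weight in FIELD_WEIGHTS.items():
        words = doc.get(field, "").lower().split()
        total += weight * words.count(term)
    return total


if __name__ == "__main__":
    page = {
        "title": "Holiday policy",
        "heading": "",
        "link_text": "Holiday policy",
        "body": "the policy on holiday",
    }
    print(score("policy", page))
```

Laid out like this, the trial-and-error part is just a matter of changing the numbers in FIELD_WEIGHTS, re-running a handful of known searches, and seeing whether the pages I expect come out on top.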

At the moment, I am reckoning that a) and c) should be fairly meaty, and should be the same, as document titles are generally used for link text on our intranet. But what should the exact figure be? 150? 500? 1500? There must be a heuristic out there that says a word in a title is worth X in the body text! I feel like this might be a trial and error situation... Any suggestions?

