<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Search Engine Marketing &#38; Optimization(SEM &#38; SEO) &#187; Search Engine Related</title>
	<atom:link href="http://semways.com/archives/category/search-engine-related/feed" rel="self" type="application/rss+xml" />
	<link>http://semways.com</link>
	<description>Search Engine Marketing(SEM) and Search Engine Optimization(SEO)</description>
	<lastBuildDate>Wed, 09 Dec 2009 06:04:08 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Information Search Engines can Trust</title>
		<link>http://semways.com/archives/18</link>
		<comments>http://semways.com/archives/18#comments</comments>
		<pubDate>Sun, 31 May 2009 08:59:57 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Search Engine Related]]></category>
		<category><![CDATA[Search Engines]]></category>

		<guid isPermaLink="false">http://www.semways.com/?p=18</guid>
		<description><![CDATA[As search engines index the web&#8217;s link structure and page contents, they find two distinct kinds of information about a given site or page &#8211; attributes of the page/site itself and descriptives about that site/page from other pages. Since the web is such a commercial place, with so many parties interested in ranking well for [...]]]></description>
			<content:encoded><![CDATA[<p><span style="font-family: Arial; color: #43443c; font-size: 10pt;" lang="EN-US">As search engines index the web&#8217;s link structure and page contents, they find two distinct kinds of information about a given site or page &#8211; attributes of the page/site itself and descriptions of that site/page from other pages. Since the web is such a commercial place, with so many parties interested in ranking well for particular searches, the engines have learned that they cannot always rely on websites to be honest about their importance. Thus, the days when artificially stuffed meta tags and keyword-rich pages dominated search results (pre-1998) have vanished and given way to search engines that measure trust via links and content.</span></p>
<p><span style="font-family: Arial; color: #43443c; font-size: 10pt;" lang="EN-US">The theory goes that if hundreds or thousands of other websites link to you, your site must be popular, and thus, have value. If those links come from very popular and important (and thus, trustworthy) websites, their power is multiplied to even greater degrees. Links from sites like NYTimes.com, Yale.edu, Whitehouse.gov and others carry with them inherent trust that search engines then use to boost your ranking position. If, on the other hand, the links that point to you are from low-quality, interlinked sites or automated garbage domains (aka link farms), search engines have systems in place to discount the value of those links.</span></p>
<p><span style="font-family: Arial; color: #43443c; font-size: 10pt;" lang="EN-US">The best-known system for ranking sites based on link data is the simplistic formula developed by Google&#8217;s founders &#8211; PageRank. PageRank, which relies on log-based calculations, is <a href="http://www.google.com/technology/" target="_blank"><span style="color: #135eb0;">described</span></a> by Google in their technology section:</span></p>
<p style="margin-left: 36pt;"><em><span style="font-family: Arial; color: #43443c; font-size: 10pt;" lang="EN-US">PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual page&#8217;s value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at more than the sheer volume of votes, or links a page receives; it also analyzes the page that casts the vote. Votes cast by pages that are themselves &#8220;important&#8221; weigh more heavily and help to make other pages &#8220;important.&#8221;</span></em><span style="font-family: Arial; color: #43443c; font-size: 10pt;" lang="EN-US"></span></p>
<p><span style="font-family: Arial; color: #43443c; font-size: 10pt;" lang="EN-US">PageRank is derived, roughly speaking, by gathering all the links that point to a particular page, adding up the PageRank that each one passes (based on the linking page&#8217;s own PageRank) and applying the calculations in the formula (see <a href="http://www.iprcom.com/papers/pagerank/" target="_blank"><span style="color: #135eb0;">Ian Rogers&#8217; explanation</span></a> for more details).</span></p>
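<p><span style="font-family: Arial; color: #43443c; font-size: 10pt;" lang="EN-US">The vote-passing idea described above can be sketched in a few lines of Python. This is only an illustration of the commonly published iterative form of PageRank, not Google&#8217;s production system. The function name <code>pagerank</code>, the damping factor of 0.85, the iteration count and the three-page example graph are all assumptions made for this example.</span></p>
<pre>
```python
# Illustrative sketch only: the damping factor and iteration count are
# assumed values, and the graph at the bottom is a made-up example.
def pagerank(links, damping=0.85, iterations=50):
    """Return an approximate PageRank score for every page in `links`.

    `links` maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    ranks = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the small "random jump" share.
        new_ranks = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current rank evenly among its outlinks.
                share = damping * ranks[page] / len(targets)
                for target in targets:
                    new_ranks[target] += share
            else:
                # A page with no outlinks spreads its rank over all pages.
                share = damping * ranks[page] / n
                for other in pages:
                    new_ranks[other] += share
        ranks = new_ranks
    return ranks


# Hypothetical three-page web: A and C both link to B, B links back to A.
example = {"A": ["B"], "B": ["A"], "C": ["B"]}
scores = pagerank(example)
```
</pre>
<p><span style="font-family: Arial; color: #43443c; font-size: 10pt;" lang="EN-US">In this toy graph, page B collects votes from both A and C and so ends up with the highest score, while page C, which nobody links to, ends up with the lowest.</span></p>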
<p style="text-align: center;" align="center"><span style="font-family: Arial; color: #43443c; font-size: 10pt;" lang="EN-US"><br />
Google&#8217;s toolbar (<a href="http://toolbar.google.com/" target="_blank"><span style="color: #135eb0;">available here</span></a>) includes an icon that shows a PageRank value from 0-10</span></p>
<p><span style="font-family: Arial; color: #43443c; font-size: 10pt;" lang="EN-US">PageRank, in essence, measures the brute link force of a site based on every other link that points to it without significant regard for quality, relevance or trust. Hence, in the modern era of SEO, the PageRank measurement in Google&#8217;s toolbar, directory or through sites that query the service is of limited value. Pages with PR8 can be found ranked 20-30 positions below pages with a PR3 or PR4. In addition, the toolbar numbers are updated only every 3-6 months by Google, making the values even less useful. Rather than focusing on PageRank, it&#8217;s important to think holistically about a link&#8217;s worth.</span></p>
<p><span style="font-family: Arial; color: #43443c; font-size: 10pt;" lang="EN-US">Here&#8217;s a small list of the most important factors search engines look at when attempting to value a link:</span></p>
<ul type="disc">
<li class="MsoNormal" style="margin: 0cm 0cm 0pt; color: #43443c; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-list: l0 level1 lfo1; tab-stops: list 36.0pt;"><strong><span style="font-family: Arial; font-size: 10pt;" lang="EN-US">The Anchor Text of the Link </span></strong><span style="font-family: Arial; font-size: 10pt;" lang="EN-US">&#8211; Anchor text describes the visible characters and words that hyperlink to another document or location on the web. For example, in the phrase, &#8220;<a href="http://www.cnn.com/" target="_blank"><span style="color: #800080;">CNN</span></a> is a good source of news, but I actually prefer <a href="http://news.bbc.co.uk/"><span style="color: #135eb0;">the BBC&#8217;s take on events</span></a>,&#8221; two unique pieces of anchor text exist &#8211; &#8220;CNN&#8221; is the anchor text pointing to <em><span style="font-family: Arial;">http://www.cnn.com</span></em>, while &#8220;the BBC&#8217;s take on events&#8221; points to <em><span style="font-family: Arial;">http://news.bbc.co.uk</span></em>. Search engines use this text to help them determine the subject matter of the linked-to document. In the example above, the links would tell the search engine that SEOmoz.org thinks that <em><span style="font-family: Arial;">http://www.cnn.com</span></em> is a relevant site for the term &#8220;CNN&#8221; and that <em><span style="font-family: Arial;">http://news.bbc.co.uk</span></em> is relevant to &#8220;the BBC&#8217;s take on events&#8221;. If hundreds or thousands of sites think that a particular page is relevant for a given set of terms, that page can manage to rank well even if the terms NEVER appear in the text itself (for example, see the BBC&#8217;s explanation of why Google ranks certain pages for the term &#8220;<a href="http://news.bbc.co.uk/2/hi/americas/3298443.stm" target="_blank"><span style="color: #135eb0;">Miserable Failure</span></a>&#8221;). </span></li>
</ul>
<ul type="disc">
<li class="MsoNormal" style="margin: 0cm 0cm 0pt; color: #43443c; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-list: l1 level1 lfo2; tab-stops: list 36.0pt;"><strong><span style="font-family: Arial; font-size: 10pt;" lang="EN-US">Global Popularity of the Site</span></strong><span style="font-family: Arial; font-size: 10pt;" lang="EN-US"> &#8211; More popular sites, as denoted by the number and power of the links pointing to them, provide more powerful links. Thus, while a link from SEOmoz may be a valuable vote for a site, a link from bbc.co.uk or cnn.com carries far more weight. This is one area where PageRank (assuming it were accurate) could be a good measure, as it&#8217;s designed to calculate global popularity. </span></li>
</ul>
<ul type="disc">
<li class="MsoNormal" style="margin: 0cm 0cm 0pt; color: #43443c; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-list: l3 level1 lfo3; tab-stops: list 36.0pt;"><strong><span style="font-family: Arial; font-size: 10pt;" lang="EN-US">Popularity of Site in Relevant Communities</span></strong><span style="font-family: Arial; font-size: 10pt;" lang="EN-US"> &#8211; In the example above, the weight or power of a site&#8217;s vote is based on its raw popularity across the web. As search engines became more sophisticated and granular in their approach to link data, they acknowledged the existence of &#8220;topical communities&#8221;; sites on the same subject that often interlink with one another, referencing documents and providing unique data on a particular topic. Sites in these communities provide more value when they link to a site/page on a relevant subject rather than a site that is largely irrelevant to their topic. </span></li>
</ul>
<ul type="disc">
<li class="MsoNormal" style="margin: 0cm 0cm 0pt; color: #43443c; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-list: l2 level1 lfo4; tab-stops: list 36.0pt;"><strong><span style="font-family: Arial; font-size: 10pt;" lang="EN-US">Text Directly Surrounding the Link</span></strong><span style="font-family: Arial; font-size: 10pt;" lang="EN-US"> &#8211; Search engines have been noted to weight the text directly surrounding a link as more important and relevant than the other text on the page. Thus, a link from inside an on-topic paragraph may carry greater weight than a link in the sidebar or footer. </span></li>
</ul>
<ul type="disc">
<li class="MsoNormal" style="margin: 0cm 0cm 0pt; color: #43443c; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-list: l4 level1 lfo5; tab-stops: list 36.0pt;"><strong><span style="font-family: Arial; font-size: 10pt;" lang="EN-US">Subject Matter of the Linking Page</span></strong><span style="font-family: Arial; font-size: 10pt;" lang="EN-US"> &#8211; The topical relationship between the subject of a given page and the sites/pages linked to on it may also factor into the value a search engine assigns to that link. Thus, it will be more valuable to have links from pages that are related to the site&#8217;s or page&#8217;s subject matter than from those that have little to do with the topic. </span></li>
</ul>
<p><span style="font-family: Arial; color: #43443c; font-size: 10pt;" lang="EN-US">These are only a few of the many factors search engines measure and weight when evaluating links. For a more complete list, see <a title="Rand's Site" href="http://www.seomoz.org/articles/search-ranking-factors.php#4"><span style="color: #135eb0;">SEOmoz&#8217;s search engine ranking factors article</span></a>.</span></p>
<p><span style="font-family: Arial; color: #43443c; font-size: 10pt;" lang="EN-US">Link metrics are in place so that search engines can find information to trust. In the academic world, more citations meant greater importance, but in a commercial environment, manipulation and conflicting interests interfere with the purity of citation-based measurements. Thus, on the modern WWW, the source, style and context of those citations are vital to ensuring high quality results.</span></p>
]]></content:encoded>
			<wfw:commentRss>http://semways.com/archives/18/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>How Search Engines Operate</title>
		<link>http://semways.com/archives/16</link>
		<comments>http://semways.com/archives/16#comments</comments>
		<pubDate>Sun, 31 May 2009 08:56:19 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Search Engine Related]]></category>
		<category><![CDATA[search engine]]></category>

		<guid isPermaLink="false">http://www.semways.com/?p=16</guid>
		<description><![CDATA[Search engines have a short list of critical operations that allows them to provide relevant web results when searchers use their system to find information.

Crawling the Web
Search engines run automated programs, called &#8220;bots&#8221; or &#8220;spiders&#8221; that use the hyperlink structure of the web to &#8220;crawl&#8221; the pages and documents that make up the World Wide [...]]]></description>
			<content:encoded><![CDATA[<p><span style="font-family: Arial; color: #43443c; font-size: 10pt;" lang="EN-US">Search engines have a short list of critical operations that allows them to provide relevant web results when searchers use their system to find information.</span></p>
<ol type="1">
<li class="MsoNormal" style="margin: 0cm 0cm 9pt; color: #43443c; mso-margin-top-alt: auto; mso-list: l0 level1 lfo1; tab-stops: list 36.0pt;"><strong><span style="font-family: Arial; font-size: 10pt;" lang="EN-US">Crawling the Web</span></strong><span style="font-family: Arial; font-size: 10pt;" lang="EN-US"><br />
Search engines run automated programs, called &#8220;bots&#8221; or &#8220;spiders&#8221;, that use the hyperlink structure of the web to &#8220;crawl&#8221; the pages and documents that make up the World Wide Web. Estimates are that, of the approximately 20 billion existing pages, search engines have crawled between 8 and 10 billion. </span></li>
<li class="MsoNormal" style="margin: 0cm 0cm 9pt; color: #43443c; mso-margin-top-alt: auto; mso-list: l0 level1 lfo1; tab-stops: list 36.0pt;"><strong><span style="font-family: Arial; font-size: 10pt;" lang="EN-US">Indexing Documents</span></strong><span style="font-family: Arial; font-size: 10pt;" lang="EN-US"><br />
Once a page has been crawled, its contents can be &#8220;indexed&#8221; &#8211; stored in a giant database of documents that makes up a search engine&#8217;s &#8220;index&#8221;. This index needs to be tightly managed, so that requests which must search and sort billions of documents can be completed in fractions of a second. </span></li>
<li class="MsoNormal" style="margin: 0cm 0cm 9pt; color: #43443c; mso-margin-top-alt: auto; mso-list: l0 level1 lfo1; tab-stops: list 36.0pt;"><strong><span style="font-family: Arial; font-size: 10pt;" lang="EN-US">Processing Queries</span></strong><span style="font-family: Arial; font-size: 10pt;" lang="EN-US"><br />
When a request for information comes into the search engine (hundreds of millions do each day), the engine retrieves from its index all the documents that match the query. A match is determined if the terms or phrase are found on the page in the manner specified by the user. For example, a search for <a href="http://www.google.com/search?q=car+and+drive+magazine" target="_blank"><span style="color: #135eb0;">car and driver magazine</span></a> at Google returns 8.25 million results, but a search for the same phrase in quotes (&#8220;<a href="http://www.google.com/search?q=%22car+and+driver+magazine%22" target="_blank"><span style="color: #135eb0;">car and driver magazine</span></a>&#8221;) returns only 166 thousand results. In the first search, which uses what is commonly called &#8220;Findall&#8221; mode, Google returned all documents that contained the terms &#8220;car&#8221;, &#8220;driver&#8221; and &#8220;magazine&#8221; (it ignores the term &#8220;<em><span style="font-family: Arial;">and</span></em>&#8221; because it&#8217;s not useful for narrowing the results), while in the second search, only those pages with the exact phrase &#8220;car and driver magazine&#8221; were returned. Other advanced operators (Google has a <a href="http://www.google.com/help/operators.html" target="_blank"><span style="color: #135eb0;">list of 11</span></a>) can change which results a search engine will consider a match for a given query. </span></li>
<li class="MsoNormal" style="margin: 0cm 0cm 9pt; color: #43443c; mso-margin-top-alt: auto; mso-list: l0 level1 lfo1; tab-stops: list 36.0pt;"><strong><span style="font-family: Arial; font-size: 10pt;" lang="EN-US">Ranking Results</span></strong><span style="font-family: Arial; font-size: 10pt;" lang="EN-US"><br />
Once the search engine has determined which results are a match for the query, the engine&#8217;s algorithm (a set of mathematical rules, used here for sorting) runs calculations on each of the results to determine which is most relevant to the given query. The engine sorts these on the results pages in order from most relevant to least so that users can make a choice about which to select. </span></li>
</ol>
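<p><span style="font-family: Arial; color: #43443c; font-size: 10pt;" lang="EN-US">The difference between the two kinds of match described under Processing Queries can be sketched in Python. This is a toy illustration, not how any real engine is implemented. The function names, the stop-word list and the three sample documents are all hypothetical and exist only for this example.</span></p>
<pre>
```python
# Illustrative sketch only: the stop-word list and sample documents are
# made up, and real engines use far more elaborate matching.
STOP_WORDS = {"and", "the", "of", "a"}


def tokenize(text):
    """Lower-case the text and split it into words."""
    return text.lower().split()


def matches_all_terms(query, document):
    """True if every non-stop-word query term occurs somewhere in the document."""
    doc_words = set(tokenize(document))
    terms = [word for word in tokenize(query) if word not in STOP_WORDS]
    return all(term in doc_words for term in terms)


def matches_exact_phrase(query, document):
    """True if the query words occur in the document consecutively, in order."""
    query_words = tokenize(query)
    doc_words = tokenize(document)
    width = len(query_words)
    return any(
        doc_words[start:start + width] == query_words
        for start in range(len(doc_words) - width + 1)
    )


documents = [
    "Subscribe to Car and Driver magazine today",
    "A magazine for every car owner and truck driver",
    "Used car listings",
]
query = "car and driver magazine"

all_terms_hits = [doc for doc in documents if matches_all_terms(query, doc)]
phrase_hits = [doc for doc in documents if matches_exact_phrase(query, doc)]
```
</pre>
<p><span style="font-family: Arial; color: #43443c; font-size: 10pt;" lang="EN-US">With these sample documents, the all-terms search matches the first two documents, while the exact-phrase search matches only the first, mirroring the way the quoted query returns far fewer results.</span></p>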
<p><span style="font-family: Arial; color: #43443c; font-size: 10pt;" lang="EN-US">Although this list of operations is short, systems like Google, Yahoo!, AskJeeves and MSN are among the most complex, processing-intensive computer systems in the world, managing millions of calculations each second and funneling demands for information to an enormous group of users. </span></p>
]]></content:encoded>
			<wfw:commentRss>http://semways.com/archives/16/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
