How to design a web crawler friendly website

A Brief History of Search Engine Optimization

Search engine optimization is the art and science of making web pages attractive to internet search engines. Some internet businesses consider search engine optimization to be a subset of search engine marketing.

In the mid-1990s, webmasters and search engine content providers started optimizing websites. At the time, all a webmaster had to do was submit a URL to a search engine, and the search engine would send out a web crawler. The web crawler would download the page, store it on the search engine’s server, and extract the links it contained. Once the page was stored, a second program, called an indexer, extracted additional information from the page and determined the weight of specific words. When this was complete, the page was ranked.

It didn’t take very long for people to understand the importance of being highly ranked.

In the beginning, search engines ranked pages using information that webmasters themselves provided about their web pages. It didn’t take webmasters very long to start abusing the system, which forced search engines to develop more sophisticated ranking methods. The search engines moved to a system that considered several factors: domain name, text within the title, URL directories, term frequency, HTML tags, on-page keyword proximity, alt attributes for images, on-page keyword adjacency, text within NOFRAMES tags, web content development, sitemaps, and on-page keyword sequence.


Google developed a new way of evaluating web pages called PageRank. PageRank scores a page based on the quantity and quality of the links pointing to it. This ranking method was so successful that Google quickly began to enjoy strong word of mouth and consistent praise.

To help discourage abuse by webmasters, several internet search engines, such as Google, Microsoft, Yahoo, and Ask.com, do not disclose the algorithms they use when ranking web pages. The signals typically used in search engine optimization today are: keywords in the title, link popularity, keywords in links pointing to the page, PageRank (Google), keywords that appear in the visible text, links from the page to its inner pages, and placing the punch line at the top of the page.

For the most part, registering a webpage or website with a search engine is a simple task. All Google requires is a link from a site that is already indexed; the web crawlers will then visit the site and begin to spider its contents. Normally, a few days after registration the main search engine spiders will begin to index the website.

Some search engines will guarantee spidering and indexing for a small fee, though they do not guarantee any specific ranking. Webmasters who don’t want web crawlers to index certain files and directories use a standard robots.txt file, placed in the site’s root directory. Occasionally a web crawler will still crawl a page even if the webmaster has indicated that the page should not be indexed.
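
For illustration only, a minimal robots.txt sketch might look like the following; the directory names are made-up examples, not recommendations:

    # Ask all crawlers to stay out of two example directories
    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /private/

Keep in mind that robots.txt is a voluntary convention, which is why an ill-behaved crawler may ignore it and fetch the pages anyway.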

The most successful online businesses all have one thing in common: they know how to make search engine optimization work for them.

Designing a web crawler friendly website

Search engine optimization is the art and science of making websites attractive to the internet’s search engines. The first step toward stellar search engine optimization is to lure the search engines’ web crawlers to your website. Web crawlers are computer programs that search engines use to gather data and index information from websites. The information the web crawlers gather is used to determine the ranking of a webpage.

One of the fastest ways to hamper a web crawler is to build a website with frames. Most search engines have crawlers that can’t penetrate frames; if they can’t get into a webpage to read it, that webpage remains unindexed and unranked. Two search engines, Google and Inktomi, have web crawlers that are capable of penetrating frames.
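
As a rough sketch (the site name and file names are invented for the example), a frame-based front page can at least offer crawlers a NOFRAMES fallback containing readable text and links:

    <html>
    <head><title>Acme Widgets - Home</title></head>
    <frameset cols="20%,80%">
      <frame src="menu.html">
      <frame src="content.html">
      <noframes>
        <body>
          <p>Acme Widgets makes industrial widgets. Visit the
             <a href="content.html">main content page</a> or the
             <a href="menu.html">site menu</a>.</p>
        </body>
      </noframes>
    </frameset>
    </html>

Even so, the safer course described next is to drop the frames entirely and give every page its own URL.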

If your site relies on frames, it is probably worth the effort to go back and rewrite it so that each page has its own frameless URL. Once you have rewritten your URLs, you may be surprised to find that the new addresses are easier on humans as well as on web crawlers; frameless URLs are easier to type and to use as links and references in documents.

Once you have rewritten your URLs, it is time to start submitting your website to search engines. Some webmasters like to use an automated search engine submission service. If you decide to go with a submission service, be aware that there will be a fee involved; the minimum fee is typically fifty-nine US dollars, which should keep a few URLs on the search engines for a year. Other webmasters prefer to avoid the fees by submitting their website to each search engine on their own.

Once your website is submitted to a search engine, you need to sit down and design a crawler page. A crawler page is a webpage that contains nothing except links to every single page of your website, using the title of each page as the link text. This also gives you some extra keywords that will help improve the ranking the search engines assign to your website. Think of the crawler page as a site map to the rest of your website.
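
A minimal sketch of such a crawler page might look like this; the page titles and file names are invented for the illustration:

    <html>
    <head><title>Site Map - Acme Widgets</title></head>
    <body>
      <h1>Site Map</h1>
      <ul>
        <li><a href="index.html">Acme Widgets - Home</a></li>
        <li><a href="products.html">Acme Widgets - Product Catalogue</a></li>
        <li><a href="ordering.html">Acme Widgets - How to Order</a></li>
        <li><a href="contact.html">Acme Widgets - Contact Us</a></li>
      </ul>
    </body>
    </html>

Each link’s text is simply the title of the page it points to, which is where the extra keywords come from.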

Typically, the crawler page won’t appear in the search results. This happens because the page doesn’t have enough text for the crawlers to give it a high ranking on its own; after all, it’s nothing more than a portal to the rest of your site, and your human users won’t need to use it. Don’t panic if crawlers don’t instantly appear to index your website. There are a lot of websites on the internet that need to be crawled, indexed, and ranked. It can sometimes take up to three months for a web crawler to get to yours.

  
