Google

I’ll start with the behemoth: Google. Here’s the quickest and easiest way to see what Google has in its index. Search Google, either at the site or through the Google toolbar (see blog 1) for this:

site:domain.com

Don’t type the www. piece, just the domain name. For instance, say your site’s domain name is RodentRacing.com. You’d search for this:

site:rodentracing.com

Google returns a list of pages it’s found on your site; at the top, on the blue bar, you see something like this:

Results 1 - 10 of about 256 from rodentracing.com

That’s it — quick and easy. You know how many pages Google has indexed on your site, and can even see which pages.
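If you run this check for several sites, you could script the search term rather than typing it each time. Here’s a minimal Python sketch; the function names are hypothetical, the Google search URL pattern is an assumption that may change, and the count parser only understands a results line worded like the one shown above.

```python
import re
from typing import Optional
from urllib.parse import quote_plus


def site_query(domain: str) -> str:
    """Build the site: search term, dropping any scheme, path, and leading www."""
    domain = re.sub(r"^https?://", "", domain.strip().lower())
    domain = domain.split("/")[0]
    if domain.startswith("www."):
        domain = domain[4:]
    return f"site:{domain}"


def google_search_url(term: str) -> str:
    """Turn a search term into a search URL (assumed URL pattern)."""
    return "https://www.google.com/search?q=" + quote_plus(term)


def parse_result_count(line: str) -> Optional[int]:
    """Pull the estimated total out of a 'Results 1 - 10 of about N ...' line."""
    match = re.search(r"of about ([\d,]+)", line)
    return int(match.group(1).replace(",", "")) if match else None


print(site_query("http://www.RodentRacing.com/"))
print(google_search_url(site_query("RodentRacing.com")))
print(parse_result_count("Results 1 - 10 of about 256 from rodentracing.com"))
```

The same search term can be pasted into any search site that supports the site: syntax.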

Here’s another way to see what’s in the index, in this case a particular page in your site. Open your browser and load a page at your site. Then follow these steps:

1. Click the i icon on the Google toolbar.

I’m assuming that you’re using Internet Explorer and have, as I suggest in blog 1, downloaded the Google toolbar (available at toolbar.google.com) to your computer. If you don’t have the toolbar, don’t worry; I explain a non-toolbar method in a moment.

2. Select Cached Snapshot of Page from the drop-down list that appears.

If you’re lucky, Google loads a page showing you what it has in its cache, so you know Google has indexed the page. (See Figure 2-1.) If you’re unlucky, Google tells you that it has nothing in the cache for that page. That doesn’t necessarily mean Google hasn’t indexed the page, though.

A cache is a temporary storage area in which a copy of something is placed. In the context of the Web, a cache stores a Web page. Google, Yahoo!, MSN, and Ask.com keep a copy of many of the pages they index, and all but Yahoo! even tell you the date that they indexed the cached pages.

If you don’t have the Google Toolbar, you can instead go to Google (www.google.com) and type the following into the Google search box:

cache:http://yourdomain.com/page.htm

Figure 2-1: A page stored in the Google cache.

Your One-Hour Search-Engine-Friendly Web Site Makeover

Replace yourdomain.com with your actual domain name, and page.htm with the actual page name, of course. When you click Search, Google checks to see if it has the page in its cache.

What if Google doesn’t have the page? Does that mean your page isn’t in Google? No, not necessarily. Google may not have gotten around to caching it. Sometimes Google grabs a little information from a page but not the entire page.

By the way, you can see a cached page saved by Google, Yahoo!, or MSN directly from the search results; look for the Cached or Cached page link after a search result.

Yahoo! and MSN


And now, here’s a bonus. The search syntax I used to see what Google had in its index for rodentracing.com — site:rodentracing.com — not only works on Google, but also on Yahoo! and MSN. That’s right, type the same thing into any of these search sites and you see how many pages on the Web site are in the index . . . with one caveat. MSN, at the time of writing at least, is a little flaky.

MSN reports a much higher number when you first run this search; but view a later page, and this number drops. For instance, MSN may show you this:

Page 1 of 456 results containing site:peterkentconsulting.com

At the bottom of the page you see the page-navigation numbers, like this:

1 2 3 4 5 Next

I suggest you click the number 5, to move to Page 5 in the results. Then you see this:

Previous 1 2 3 4 5 6 7 8 9 Next

Click page 9, and continue this way until you get to the last page or until the Page x of xxx line at the top changes. You then see the actual number of pages that MSN has in its index. Why is this? It’s just a bug. By the time you read this, MSN may have fixed the problem and the extra clicking may no longer be necessary, but the bug has been there for some time now.
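The click-through routine above is really a small loop: jump to the highest page number on offer, and stop when there’s no later page or when the reported total changes. Here’s a rough Python sketch of that logic. The fetch function is hypothetical; it stands in for however you load a results page, and is assumed to return the reported total and the highest page number shown in the navigation.

```python
from typing import Callable, Tuple

# (reported_total, highest_page_number_shown) for one results page
PageInfo = Tuple[int, int]


def settle_result_count(fetch: Callable[[int], PageInfo], max_hops: int = 50) -> int:
    """Jump to the last visible page until the reported total settles."""
    page = 1
    total, last_shown = fetch(page)
    for _ in range(max_hops):
        if last_shown <= page:
            break  # no later page to click, so this is the last page
        page = last_shown
        new_total, last_shown = fetch(page)
        if new_total != total:
            total = new_total  # the "Page x of xxx" line changed
            break
    return total


# Made-up data shaped like the example above: 456 at first, then a lower figure.
fake_pages = {1: (456, 5), 5: (456, 9), 9: (83, 9)}
print(settle_result_count(lambda n: fake_pages[n]))
```

The numbers in fake_pages are invented for illustration; only the 456 and the page-navigation values come from the example.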

You can search for a Web site at Google another way, too. Simply type the domain name into the Google search box and click Search. Google returns just that site’s home page. If you want to use the search box on the Google toolbar to do this, type the domain name and then click the binoculars button. (If you type the domain name and press Enter, Google simply redirects your browser to the specified domain name.)

Yahoo! Directory


You must check whether your site is listed in the Yahoo! Directory. You have to pay to get a commercial site into the Yahoo! Directory, so you may already know if you’re listed there. Perhaps you work in a large company and suspect that another employee may have registered the site with Yahoo! Here’s how to find out:

1. Point your browser to dir.yahoo.com.

This takes you directly to the Yahoo! Directory search page.

2. Type your site’s domain name into the Search text box.

All you need is yourdomain.com, not http://www. or anything else.

3. Make sure that the Directory option button is selected, then click Search.

If your site is in the Yahoo! Directory, your site’s information appears on the results page. You may see several pages, one for each category in which the site has been placed (though in most cases a site is placed into only one category).


Open Directory Project

You must also know if your site is listed in the Open Directory Project (www.dmoz.org). If it isn’t, it should be. Just type the domain name, without the www. piece. If your site is in the index, the Open Directory Project will tell you. If it isn’t, you’d better register it; see blog 12.

Taking Action if You’re Not Listed


First, if your site isn’t in Yahoo! Directory or the Open Directory Project, you have to go to those systems and register your site. See blog 12 for information. What if you search for your site in the search engines and can’t find it? If the site isn’t in Google, Yahoo!, and MSN, you have a huge problem.

Here are two possible reasons your site isn’t being indexed in the search engines:

The search engines haven’t found your site yet. The solution is relatively easy, though you won’t get it done in an hour.

The search engines have found your site, but can’t index it. This is a serious problem, though in some cases you can fix it quickly.

For the lowdown on getting your pages indexed in the search engines — to ensure that the search engines can find your site — see the section “Getting Your Site Indexed,” later in this blog. To find out how to make your pages search-engine-friendly — to ensure that once found, your site will be indexed well — check out the section “Examining Your Pages,” later in this blog. But first, let’s see how to check to see if your site can be indexed.

Is your site invisible?

Some Web sites are virtually invisible. A search engine might be able to find the site (by following a link, for instance), but when it gets to the site, it can’t read it or, perhaps, can read only parts of it. A client (before he was a client) built a Web site that had only three visible pages; all the other pages, including those with product information, were invisible.

How does a Web site become invisible? I talk about this subject in more detail in blog 7, but here’s a brief explanation:

The site is using some kind of navigation structure that the search engines can’t read, so they can’t find their way through the site.

The site is creating dynamic pages that the search engines choose not to read.

Unreadable navigation

Many sites have perfectly readable pages, with the exception that the searchbots (the programs the search engines use to index Web sites) can’t negotiate the site navigation. The searchbots can reach the home page, index it, and read it, but they can’t go any further. If, when you search Google for your pages, you find only the home page, this is likely the problem.

Why can’t the searchbots find their way through? The navigation system may have been created using JavaScript, and because search engines ignore JavaScript, they don’t find the links in the script. Look at this example:

<SCRIPT TYPE="text/javascript" SRC="/menu/menu.js"></SCRIPT>

In one site I reviewed, this was how the navigation bar was placed into each page: The page called an external JavaScript, held in menu.js in the menu subdirectory. The search engines won’t read menu.js , so they’ll never read the links in the script.
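To get a feel for what a searchbot that ignores JavaScript can actually follow, you could list the plain links in a page’s HTML. The sketch below uses Python’s standard html.parser module; the class and function names are made up for illustration, and real searchbots are certainly more sophisticated than this.

```python
from html.parser import HTMLParser


class LinkFinder(HTMLParser):
    """Collects what a crawler that ignores JavaScript could see."""

    def __init__(self):
        super().__init__()
        self.links = []    # href values from ordinary <a> tags
        self.scripts = []  # src values from external <script> tags

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # tag and attribute names arrive lowercased
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "script" and attrs.get("src"):
            self.scripts.append(attrs["src"])


def crawlable_links(html: str):
    """Return (plain links, external scripts) found in the HTML."""
    finder = LinkFinder()
    finder.feed(html)
    return finder.links, finder.scripts


page = (
    '<HTML><BODY>'
    '<SCRIPT TYPE="text/javascript" SRC="/menu/menu.js"></SCRIPT>'
    '<P>Welcome to Rodent Racing</P>'
    '</BODY></HTML>'
)
links, scripts = crawlable_links(page)
print("Plain links:", links)
print("External scripts:", scripts)
```

A page that reports no plain links but one or more external scripts is a good candidate for the navigation problem described above.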

Try these simple ways to help search engines find their way around your site, whether or not your navigation structure is hidden:

Create more text links throughout the site. Many Web sites have a main navigation structure and then duplicate the structure by using simple text links at the bottom of the page. You should do the same.

Add a sitemap page to your site. This page contains links to most or all of the pages on your Web site. Of course, you also want to link to the sitemap page from those little links at the bottom of the home page.
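A sitemap page of the kind described above is just a list of ordinary text links, so it’s easy to generate. Here’s a minimal Python sketch, assuming you already have a list of page addresses and link text; the function name and the sample pages are hypothetical.

```python
from html import escape


def build_sitemap_page(pages):
    """pages: list of (url, link_text) tuples; returns a plain HTML sitemap."""
    items = "\n".join(
        f'<LI><A HREF="{escape(url, quote=True)}">{escape(text)}</A></LI>'
        for url, text in pages
    )
    return (
        "<HTML>\n<HEAD>\n<TITLE>Site Map</TITLE>\n</HEAD>\n"
        f"<BODY>\n<UL>\n{items}\n</UL>\n</BODY>\n</HTML>"
    )


print(build_sitemap_page([
    ("/scores.html", "Rodent Racing Scores"),
    ("/schedules.html", "Rodent Racing Schedules"),
]))
```

Because every entry is a simple text link, a searchbot that can’t read the main navigation can still reach each page from here.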

Dealing with dynamic pages

In many cases, the problem is that the site is dynamic; that is, a page is created on the fly when a browser requests it. The data is pulled out of a database, pasted into a Web page template, and sent to the user’s browser. Search engines often won’t read such pages, for a variety of reasons explained in detail in blog 7.

How can you tell if this is a problem? Take a look at the URL in the browser’s location bar. Suppose that you see something like this:

http://www.yourdomain.edu/rodent-racing-scores/march/index.php

This address is okay. It’s a simple URL path made up of a domain name, two directory names, and a filename. Now look at this one:

http://www.yourdomain.edu/rodent-racing/scores.php?prg=1

The filename ends with ?prg=1 . This parameter is being sent to the server to let it know what information is needed for the Web page. If you have URLs like this, with just a single parameter, they’re probably okay, especially for Google; however, a few smaller search engines may not like them. Here’s another example:

http://yourdomain.com/products/index.html?&DID=18&CATID=13&ObjectGroup_ID=79

This one may be a real problem, depending on the search engine. This URL has too much weird stuff after the filename:

?&DID=18&CATID=13&ObjectGroup_ID=79. That’s three parameters (DID=18, CATID=13, and ObjectGroup_ID=79), which is too many. Some systems cannot or will not index this page. (My feeling is that Google tends to index “deeper” into dynamic sites than, for instance, Yahoo!)

Another problem is caused by session IDs — URLs that are different every time the page is displayed. Look at this example:

http://yourdomain.com/buyAHome.do;jsessionid=07D3CCD4D9A6A9F3CF9CAD4F9A728F44

Each time someone visits this site, the server assigns a special ID number to the visitor. That means the URL is never the same, so Google won’t index it.

Search engines may choose not to index pages with session IDs. If the search engine sees links to a page that appears to have a session ID, it doesn’t know whether the URL will change between sessions or whether many different URLs point to the same page. Search engines don’t want to overload the site’s server and don’t want garbage in their indexes.

If you have a clean URL with no parameters, the search engines should be able to get to it. If you have a single parameter in the URL, it’s probably fine. Two parameters may not be a problem, although they’re more likely to be a problem than a single parameter. Three parameters are almost certainly a problem with some search engines. If you think you have a problem, I suggest reading blog 7.
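The rule of thumb above can be turned into a quick check. This is a rough Python sketch using the standard urllib.parse module; the function name, the wording of the verdicts, and the list of session-ID names are assumptions for illustration, and the thresholds simply mirror the guidance above rather than any documented search-engine behavior.

```python
from urllib.parse import urlparse, parse_qsl

# Assumed, non-exhaustive list of parameter names that suggest a session ID.
SESSION_HINTS = ("jsessionid", "sessionid", "phpsessid", "sid")


def assess_url(url: str) -> str:
    """Classify a URL by parameter count and likely session IDs."""
    parts = urlparse(url)
    query_names = [
        name.lower() for name, _ in parse_qsl(parts.query, keep_blank_values=True)
    ]
    # A ";jsessionid=..." suffix ends up in parts.params, not parts.query.
    path_names = [p.split("=", 1)[0].lower() for p in parts.params.split(";") if p]
    if any(name in SESSION_HINTS for name in query_names + path_names):
        return "session ID: may not be indexed"
    count = len(query_names)
    if count == 0:
        return "clean: should be fine"
    if count == 1:
        return "one parameter: probably fine"
    if count == 2:
        return "two parameters: may be a problem"
    return "three or more parameters: likely a problem for some search engines"


for url in (
    "http://www.yourdomain.edu/rodent-racing-scores/march/index.php",
    "http://www.yourdomain.edu/rodent-racing/scores.php?prg=1",
    "http://yourdomain.com/products/index.html?&DID=18&CATID=13&ObjectGroup_ID=79",
    "http://yourdomain.com/buyAHome.do;jsessionid=07D3CCD4D9A6A9F3CF9CAD4F9A728F44",
):
    print(assess_url(url))
```

Run it over a handful of URLs copied from your browser’s location bar to see which parts of the site deserve a closer look.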


Picking Good Keywords

Getting search engines to recognize and index your Web site can be a problem, as the first part of this blog makes clear. Another huge problem, one that has little or nothing to do with the technological limitations of search engines, is that many companies have no idea what keywords (the words people are using to search for Web sites at the search engines) they should be using. They try to guess the appropriate keywords, without knowing what people are really using in the search engines.

In blog 5, I explain keywords in detail, but here’s how to do a quick keyword analysis:

1. Point your browser to http://searchmarketing.yahoo.com/rc/srch/ .

You see the Yahoo! Search Marketing Resource Center; Search Marketing is Yahoo!’s PPC (pay-per-click) division. See blog 17 for more about PPC.

2. Click the Advertiser Keyword Selector Tool link on the right side of the page.

A small window opens with a search box.

3. In the search box, type a keyword you think people may use to search for your products or services.

4. Press Enter.

The tool returns a list of keywords, showing you how often that term and related terms are used by people searching on Yahoo! and partner sites. See Figure 2-2.

I’m tired of looking for the Yahoo! keyword tool, and having to explain to people how to find it. It keeps moving! So I’ve placed a link on my site; go to http://searchenginebulletin.com/yahoo-keywords.html .

You may find that the keyword you guessed is perfect. Or you may discover better words, or, even if your guess was good, find several other great keywords. A detailed keyword analysis almost always turns up keywords or keyword phrases you need to know about.

Don’t spend a lot of time on this task. See if you can come up with some useful keywords in a few minutes and then move on; see blog 5 for details about this process.

Examining Your Pages

Making your Web pages “search-engine-friendly” was probably not uppermost in your mind when you sat down to design your Web site. That means your Web pages — and the Web pages of millions of others — probably have a few problems in the search-engine-friendly category. Fortunately, such problems are pretty easy to spot; you can fix some of them quickly, but others are more troublesome.

Using frames

In order to examine your pages for problems, you need to read the pages’ source code; remember, I said you’d need to be able to understand HTML!

In order to see the source code, choose View ➪ Source in your browser.

When you first peek at the source code for your site, you may discover that your site is using frames. (Of course, if you built the site yourself, you already know whether it uses frames. However, you may be examining a site built by someone else.) You may see something like this in the page:

<HTML>
<HEAD>
</HEAD>
<FRAMESET ROWS="20%,80%">
<FRAME SRC="navbar.html">
<FRAME SRC="content.html">
</FRAMESET>
<BODY>
</BODY>
</HTML>

When you choose View ➪ Source in Internet Explorer, you’re viewing the source of the frame-definition document, which tells the browser how to set up the frames. In the preceding example, the browser creates two frame rows, one taking up the top 20 percent of the browser and the other taking up the bottom 80 percent. In the top frame, the browser places content taken from the navbar.html file; content from content.html goes into the bottom frame.

Framed sites don’t index well. The pages in the internal frames get orphaned in the search engines; each page ends up in search results alone, without the navigation frames with which it was intended to be displayed.

Framed sites are bad news for many reasons. I discuss frames in more detail in blog 7, but here are a few quick fixes:

Add TITLE and DESCRIPTION tags between the <HEAD> and </HEAD> tags. (To see what these tags are and how they can help with your frame issues, check out the next two sections.)

Add <NOFRAMES> and </NOFRAMES> tags between the <BODY> and </BODY> tags, and place 200 to 300 words of keyword-rich content between the tags. The NOFRAMES text is designed to be displayed by browsers that can’t work with frames, and search engines will read this text, although they won’t rate it as high as normal text (because many designers have used NOFRAMES tags as a trick to get more keywords into a Web site).

Include a number of links, in the text between the NOFRAMES tags, to other pages in your site to help the search engines find their way through.
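If you’re examining a site built by someone else, a short script can tell you whether a page is a frame-definition document and whether the NOFRAMES fixes above are in place. This Python sketch uses simple regular expressions, which is a crude way to read HTML and only an approximation; the function name and the report fields are hypothetical.

```python
import re


def frame_report(html: str) -> dict:
    """Summarize frame usage and NOFRAMES content in a page's source."""
    frames = re.findall(r"<frame\b[^>]*\bsrc\s*=\s*[\"']?([^\"'\s>]+)", html, re.I)
    match = re.search(r"<noframes\b[^>]*>(.*?)</noframes\s*>", html, re.I | re.S)
    inner = match.group(1) if match else ""
    text = re.sub(r"<[^>]+>", " ", inner)  # drop tags, keep the visible words
    return {
        "uses_frames": bool(re.search(r"<frameset\b", html, re.I)),
        "frame_sources": frames,
        "has_noframes": match is not None,
        "noframes_words": len(text.split()),
        "noframes_links": len(re.findall(r"<a\b[^>]*\bhref\s*=", inner, re.I)),
    }


page = """<HTML>
<HEAD>
</HEAD>
<FRAMESET ROWS="20%,80%">
<FRAME SRC="navbar.html">
<FRAME SRC="content.html">
</FRAMESET>
<BODY>
</BODY>
</HTML>"""
print(frame_report(page))
```

For the frame-definition document shown earlier, the report shows two frame sources and no NOFRAMES text, which is exactly the situation the quick fixes address.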

Looking at the TITLE tags

TITLE tags tell a browser what text to display in the browser’s title bar, and they’re very important to search engines. Quite reasonably, search engines figure that the TITLE tags may indicate the page’s title — and therefore its subject.

Open your site’s home page and then choose View ➪ Source (in Internet Explorer) to view the page source. A text editor opens, showing you what the page’s HTML looks like. Here’s what you should see at the top of the page:

<HTML>
<HEAD>
<TITLE>Your title text is here</TITLE>

Here are a few problems you may have with your TITLE tags:

They’re not there. Many pages simply don’t have TITLE tags. If not, you’re not giving the search engines one of the most important pieces of information about the page’s subject matter.

They’re in the wrong position. Sometimes you find the TITLE tags, but they’re way down in the page. If they’re too low in the page, search engines may not find them.

They’re there, but they’re poor. The TITLE tags don’t contain the proper keywords.

Your TITLE tags should be immediately below the <HEAD> tag and should contain useful keywords. Have around 40 to 60 characters between the <TITLE> and </TITLE> tags (including spaces) and, perhaps, repeat the primary keywords once. (Find out more about keywords in blog 5.) If you’re working on your Rodent Racing Web site, for example, you might have something like this:

<TITLE>Rodent Racing Info. Rats, Mice, Gerbils, Stoats, all kinds of Rodent Racing</TITLE>
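The three TITLE problems listed above (missing, badly placed, or poor) can be checked mechanically. Here’s a minimal Python sketch built on the standard html.parser module. The class and function names are hypothetical, the 40-to-60-character range is taken from the guideline above, and the keyword test is a plain substring match, so treat the output as a hint rather than a verdict.

```python
from html.parser import HTMLParser


class TitleCheck(HTMLParser):
    """Records the first TITLE and the order of tags inside HEAD."""

    def __init__(self):
        super().__init__()
        self.title = None
        self.head_tags = []  # tags seen inside <head>, in order
        self._in_head = False
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self._in_head = True
        elif self._in_head:
            self.head_tags.append(tag)
        if tag == "title" and self.title is None:
            self._in_title = True
            self.title = ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "head":
            self._in_head = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def title_problems(html: str, keywords=()):
    """Return a list of problems found with the page's TITLE tags."""
    checker = TitleCheck()
    checker.feed(html)
    if not checker.title or not checker.title.strip():
        return ["missing TITLE tags"]
    title = " ".join(checker.title.split())
    problems = []
    if checker.head_tags[:1] != ["title"]:
        problems.append("TITLE is not immediately below <HEAD>")
    if not 40 <= len(title) <= 60:
        problems.append(f"title is {len(title)} characters; aim for 40 to 60")
    missing = [k for k in keywords if k.lower() not in title.lower()]
    if missing:
        problems.append("missing keywords: " + ", ".join(missing))
    return problems


page = "<HTML><HEAD><TITLE>Welcome</TITLE></HEAD><BODY></BODY></HTML>"
for problem in title_problems(page, keywords=["rodent racing"]):
    print(problem)
```

An empty list means the page passed these particular checks; it doesn’t mean the title is a good one.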

Examining the DESCRIPTION tag

The DESCRIPTION tag is important because search engines often index it (under the reasonable assumption that the description describes the contents of the page) and, in some cases, may use the DESCRIPTION tag to provide the site description on the search-results page.

These days, the major search engines usually don’t use the DESCRIPTION tag to provide the description in the search results. Instead, they typically find the search words in the page, grab a snippet of information from around the words, and use that as the description. (In some cases, Google and MSN may grab the description from the Open Directory Project, while Yahoo! may use the description from the Yahoo! Directory.)


In some cases, though, if the search engine can’t find the keywords in the page (if it finds the page based on its TITLE tag, for example, or links pointing at the page rather than page content), it may use the DESCRIPTION tag.

Open a Web page, then open the HTML “source” (select View ➪ Source from your browser’s menu), and take a quick look at the DESCRIPTION tag. It should look something like this:

<META NAME="description" CONTENT="your description goes here">

Sites often have the same problems with DESCRIPTION tags as they do with TITLE tags. The tags aren’t there, are hidden away deep down in the page, or simply aren’t very good.

Place the DESCRIPTION tag immediately below the TITLE tags (see Figure 2-3) and create a keyworded description of up to 250 characters (again, including spaces). Here’s an example:

<META NAME="description" CONTENT="Rodent Racing - Scores, Schedules, everything Rodent Racing. Whether you’re into mouse racing, stoat racing, rats, or gerbils, our site provides everything you’ll ever need to know about Rodent Racing and caring for your racers.">
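A similar quick check works for the DESCRIPTION tag. The sketch below, again using Python’s standard html.parser module with made-up names, looks for a description META tag and compares its length with the 250-character guideline above.

```python
from html.parser import HTMLParser


class DescriptionFinder(HTMLParser):
    """Grabs the CONTENT of the first description META tag."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # tag and attribute names arrive lowercased
        if (
            tag == "meta"
            and self.description is None
            and (attrs.get("name") or "").lower() == "description"
        ):
            self.description = attrs.get("content") or ""


def description_problems(html: str, limit: int = 250):
    """Return a list of problems found with the page's DESCRIPTION tag."""
    finder = DescriptionFinder()
    finder.feed(html)
    if not finder.description:
        return ["missing DESCRIPTION tag"]
    length = len(finder.description)
    if length > limit:
        return [f"description is {length} characters; keep it to {limit} or fewer"]
    return []


page = (
    "<HTML><HEAD><TITLE>Rodent Racing</TITLE>"
    '<META NAME="description" CONTENT="Rodent Racing - Scores, Schedules, '
    'everything Rodent Racing.">'
    "</HEAD></HTML>"
)
print(description_problems(page) or "No problems found")
```

As with the TITLE check, this only catches a missing or overlong tag; whether the description reads well and carries the right keywords is still your call.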
