Dealing with Dynamic URLs (SEO)

Wednesday, September 5, 2012
This is a slightly advanced topic on SEO.
What are Dynamic URLs?
Nowadays, a lot of sites are dynamic: the site draws information from somewhere (usually a database) and outputs it to your browser. Think of the pages as being created "on the fly". The biggest advantage is that you can create a lot of pages easily and consistently without much additional coding. Just use a template, then output different information into it. For example, if you browse this site, you will notice that the pages share the same layout and differ only in content. Dynamic pages can save webmasters a lot of time, effort and mistakes.
Thus, a site with a lot of pages and content will usually be dynamic, because it would be too tedious for the webmaster to manually edit and recreate the pages and the internal links. Such sites are usually built with server-side technologies like PHP or ASP, with a database like MySQL as their backend storage.
So now we know what a dynamic site is. But what about dynamic URLs? Dynamic sites usually have dynamic URLs. Let me give you an example: [http://somesite.com/index.php?area=search&browse=1&category=2&] . From this URL you can more or less tell that it points to some category (through a search function) in a directory on a dynamic site. The special characters (e.g. "?", "=", "&" and so on) are sometimes called STOP characters, because Search Engines used to stop reading a URL when they reached one. These are the characters you should be careful about.
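To see what the STOP characters actually separate, here is a minimal sketch in Python that splits the example URL into its script path and its variables. It only uses the standard library; the variable names `parts` and `params` are mine, not part of any site's code.

```python
from urllib.parse import urlsplit, parse_qsl

# The example dynamic URL from the paragraph above.
url = "http://somesite.com/index.php?area=search&browse=1&category=2&"

# "?" separates the script path from the query string.
parts = urlsplit(url)

# "&" separates the variables, "=" separates each name from its value.
# The empty piece after the trailing "&" is skipped by default.
params = parse_qsl(parts.query)

print(parts.path)  # /index.php
print(params)      # [('area', 'search'), ('browse', '1'), ('category', '2')]
```

Everything after the "?" is what makes the URL dynamic: the same index.php produces a different page for each combination of variables.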
Why don't Search Engines like dynamic URLs?
Simple: dynamic URLs can cause problems for a Search Engine's index, specifically duplicate content. Search Engine spiders are reluctant to index dynamic URLs because of this.
- A dynamic page may generate more pages, sometimes trapping the Search Engine bots in an endless loop and, worse, creating tons and tons of pages to index which are practically useless to the Search Engine's users.
- Search Engines like unique content, not duplicate or near-duplicate content. Dynamic sites may create pages which are similar (or very similar) in content with just different URLs. It's like having one page with a lot of different URLs pointing to it. Think of the poor sob who is searching for something, only to find a lot of links that all point to the same page!
- Session ids ("sid=") can cause duplicate content problems as well. Session ids are usually placed in the URL to tell the webserver who the user is. Think of them as cookies in your URL; a lot of websites use them. You have probably been to sites which require you to log in, and some of them use session ids. A different session id each time (and thus a different URL) may point to the same page, and this creates duplicate content for the Search Engine to deal with. Also, some URLs with session ids expire, and if the server is strict, this leaves a dead link. If a Search Engine has indexed this dead link, the next person who clicks on it in the search results lands on an error page, which everyone (that is you, me and the Search Engines) hates. Although Google does index URLs with session ids, they are still best avoided.
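The session id problem from the list above can be illustrated with a short Python sketch. It assumes the session variable is called "sid", as in the example; the function name `strip_session_id` and the sample id values are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: the site passes its session id in a variable named "sid".
SESSION_PARAMS = {"sid"}

def strip_session_id(url):
    """Return the URL with any session id variable removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))

# Two visits, two session ids, two different URLs...
first = strip_session_id("http://somesite.com/index.php?category=2&sid=abc123")
second = strip_session_id("http://somesite.com/index.php?category=2&sid=zzz999")

# ...but once the session id is removed, they are the same page.
print(first)            # http://somesite.com/index.php?category=2
print(first == second)  # True
```

A Search Engine that sees both original URLs has no easy way to know they are the same page, which is exactly how the duplicate content arises.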
Although Search Engines do read and index dynamic URLs (even those with session ids in them), they are best avoided. Your purpose as a webmaster is to HELP Search Engines index your site, so don't make it difficult for them by having dynamic URLs. In the past, a lot of Search Engines disregarded everything in the URL after a STOP character, which resulted in truncated URLs, so many different URLs ended up looking the same. For example, [http://somesite.com/index.php?area=search&browse=1&category=2&] might be truncated to [http://somesite.com/index.php]. Think of the problem this would create for the Search Engines and for you as a webmaster (the Search Engine indexes only one page!). Of course, that's in the past, but it doesn't mean dynamic URLs are no longer problematic for Search Engines. I definitely won't blame the Search Engines for not indexing dynamic URLs, given the problems they cause both for the Search Engines' indexes and for the users of the Search Engines.
What can I do?
Basically, you need to change your dynamic URLs to static URLs, or at least reduce the number of STOP characters in your dynamic URLs. A static URL looks like this: [http://somesite.com/cars/toyota/index.html] . You can see that it has no STOP characters in it, and it looks neat and juicy to Search Engine spiders.
- If your site is small, say 10 or 20 pages, do you really need a dynamic site? Yes, it may take some effort, but a static site can help Search Engine bots index your site faster and more efficiently. Best of all, you have full control over your layout and scripts, and the site is probably much easier to modify and loads faster as well.
- Use a URL rewrite script or feature. If your webserver runs Apache (a very popular webserver software), it has a feature called mod_rewrite which presents dynamic URLs as static-looking ones and can be configured in the .htaccess file. However, this requires a fair amount of expertise in Apache and mod_rewrite. Those on MS IIS servers also have a choice: there is a module which can create static URLs from dynamic ones. URL rewriting is the most popular and most effective way to get rid of dynamic URLs. Ask your programmer for help.
- Use robots.txt to block Search Engine bots from certain actions, like search and so on. These actions usually have dynamic URLs and can create duplicate content because they may lead to a page which can also be reached through another URL. For example, you may have a link which shows your visitors a random page, but that page can also be reached through a different link. Two different URLs for the same page equals duplicate content, so just block Search Engine spiders from following the random link. However, be careful not to block the bots from important parts of your site when you do this. This method will not completely remove the dynamic URL problem, but it will minimize the duplicate content problem.
- Reduce the number of STOP characters and variables in your URLs if possible. Ask your programmer to use as few of these characters as possible when coding.
- Use cookies instead of session ids. Although some users may disable cookies, most won't. However, you have to ensure that your pages still work for requests made without cookies, because Search Engines don't accept cookies. If your code requires the visitor to have a cookie, your site may not get indexed, as Search Engines won't be able to navigate it.
- Use sitemaps. Placing a sitemap on your site helps Search Engines index it more efficiently than the other internal links (which are dynamic) do. Submit the sitemap to Search Engines if possible. Oh, and make sure the link to your sitemap is static, by the way!
- Cloaking. Ask your programmer to write code especially for Search Engines, so that when a Search Engine bot (rather than an actual human visitor) requests a page, it is redirected to a page with a static URL. This way, you won't have to worry about session ids causing problems for you and the Search Engines. Note that Search Engines do not like cloaking. But I believe that if your purpose is only to prevent session id problems, and the cloaked page is the same as what your visitors see (just without the session id in the URL), Search Engines would not mind.
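To make the URL rewrite option from the list above more concrete, here is a rough Python sketch of what a rewrite rule does behind the scenes: the visitor and the Search Engine see a static-looking path, and the server quietly translates it into the dynamic URL. This is not mod_rewrite itself, only an illustration of the idea; the /category/<number>/ pattern and the `rewrite` function are hypothetical.

```python
import re
from urllib.parse import urlencode

# Hypothetical rule: a static-looking path such as /category/2/
# is really served by index.php with the usual variables.
RULE = re.compile(r"/category/([0-9]+)/?")

def rewrite(path):
    """Return the internal dynamic URL for a static-looking path, or None."""
    match = RULE.fullmatch(path)
    if match is None:
        return None  # no rule applies; the path is left alone
    query = urlencode({
        "area": "search",
        "browse": "1",
        "category": match.group(1),
    })
    return "/index.php?" + query

print(rewrite("/category/2/"))  # /index.php?area=search&browse=1&category=2
print(rewrite("/about.html"))   # None
```

The public URL contains no STOP characters at all, while your existing dynamic script keeps working unchanged. A real rule would live in your webserver configuration, so treat this only as a picture of the mapping.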
Another advantage of changing to static URLs is that you can put keywords into them, like [http://somesite.com/cars/toyota/service-center.html]. You can see that this URL is full of keywords: it has "cars", "toyota" and "service center". Some major Search Engines, Google included, take the URL into account when judging relevance.
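If your pages come from a database, the keywords for such a URL can be generated from the page's category and title. The following Python sketch shows one possible way; the helper names `slugify` and `static_url` are my own and the rules for cleaning up the text are just an assumption.

```python
import re

def slugify(text):
    """Lowercase the text and join its words with hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower())
    return slug.strip("-")

def static_url(base, *segments):
    """Build a keyword-rich static URL from a base and some keywords."""
    path = "/".join(slugify(segment) for segment in segments)
    return base.rstrip("/") + "/" + path + ".html"

print(static_url("http://somesite.com", "Cars", "Toyota", "Service Center"))
# http://somesite.com/cars/toyota/service-center.html
```

Hyphens are used between words here because that is how the example URL above separates "service" and "center".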