In the previous lesson, we explained that internal search engine optimization can be divided into two main areas: on-page optimization, which includes the individual elements on each web page, and on-site optimization, which includes the structure of the website as a whole.
In this lesson, we will analyze factors that affect the on-site search engine optimization.
Domain Name Structure
When setting up a website, one of the first decisions you will need to make is whether or not you will use the www. prefix. You could use either:
http://example.com
or
http://www.example.com
From an SEO perspective, it does not matter which version you choose. The important thing is to choose one and stick with it.
Leaving both versions active is possible, but not recommended. Google and other search engines tend to see sub-domains as completely different websites, and www.example.com is technically a sub-domain of the example.com website. In other words, if both the http://example.com and the http://www.example.com versions of your website are active, Google might see each of them as a separate website. The two websites will obviously contain exactly the same content, and as a consequence you could be penalized for duplicate content.
In addition, if both versions are active, you might end up with people linking to both of them, which will split your backlink portfolio and reduce the overall search engine trust that your main domain will have.
Once you have chosen the version that you want to use, therefore, you will need to redirect the other one using a 301 Permanent Redirect. The easiest way to do this is with the .htaccess file located in the root of your server. Simply open that file (or create one if you don’t have it), and paste the following code there:
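Options +FollowSymLinks
RewriteEngine On
# Match requests for the bare domain (case-insensitive)
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
# Send them to the www. version with a permanent (301) redirect
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]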
Remember to substitute your own domain name there. The code above will redirect all http://example.com requests to http://www.example.com. If you want it the other way around, simply swap the two domains.
If you are using a PHP-based CMS, like WordPress or Drupal, you can also include PHP code in the header of your website. The code will look like this:
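<?php
// Redirect non-www requests to the www. version with a 301.
// This must run before any output is sent to the browser.
if (substr($_SERVER['HTTP_HOST'], 0, 4) !== 'www.') {
    header('HTTP/1.1 301 Moved Permanently');
    header('Location: http://www.' . $_SERVER['HTTP_HOST']
        . $_SERVER['REQUEST_URI']);
    exit; // stop here so the rest of the page is not rendered
}
?>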
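The decision that both snippets make can be sketched in a few lines of Python. This is only an illustration of the logic, not a replacement for the server configuration. The function name `canonical_redirect` and the constant `CANONICAL_HOST` are hypothetical, and the sketch assumes the www. version was chosen as the main one.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit, urlunsplit

# Assumption: the www. version is the one you decided to keep.
CANONICAL_HOST = "www.example.com"


def canonical_redirect(
    url: str, canonical_host: str = CANONICAL_HOST
) -> Optional[Tuple[int, str]]:
    """Return (301, target_url) if the URL uses the non-canonical host, else None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    bare = canonical_host[4:] if canonical_host.startswith("www.") else canonical_host
    # Only the bare domain and its www. twin count as the same site;
    # other sub-domains (blog.example.com, ...) are left alone.
    if host in (bare, "www." + bare) and host != canonical_host:
        target = urlunsplit(
            (parts.scheme, canonical_host, parts.path, parts.query, parts.fragment)
        )
        return 301, target
    return None
```

Passing `canonical_host="example.com"` reverses the direction, which mirrors swapping the domains in the .htaccess rule. Note that this sketch drops any port number from the URL.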
URL
The URL, or Uniform Resource Locator, specifies where a certain resource is available. Practically speaking, URLs are the web addresses that you use to find specific websites, pages, and files. The structure of the URLs on your site can affect its search engine optimization, and below we will explain how.
File Names and Folders
This is an example of a file based URL:
http://www.example.com/file.html
This is a folder based URL:
http://www.example.com/folder/
And this is an example of the two combined:
http://www.example.com/folder/file.html
In the example shown above, the file has a .html extension, which is the file type served by the web server (in this case a simple HTML page). There are lots of other file types, including .php, .jsp, .asp and so on. Each of those extensions refers to the programming language used by the web server. From a search engine perspective, there’s no benefit to using one over the other. If you are using custom programming on your site, you should just be careful not to create new or unusual file extensions, as this can cause problems. An example of a nonstandard file type would be:
http://example.com/file.qxra
Search engines might not understand the output of that file, and they would be hesitant to list that page in the SERPs.
While there is no SEO benefit to choosing a specific file type, many site owners prefer to follow the W3C recommendation and not use file types at all. The benefit of this approach is a higher degree of consistency. For example, if your website is built using ASP, you could serve the following page:
http://www.example.com/file.asp
Should you need to change the technology behind the site to PHP, that page will start being served as:
http://www.example.com/file.php
The old URL would therefore become invalid, and you could also lose some backlinks. It is possible to solve such a problem with server redirects, but the process is not that simple. A better solution is to use a simple folder-based structure like this one:
http://www.example.com/file/
In this case, when the website changes from ASP to PHP, none of the URLs have to change, and you don’t have to maintain any legacy code for backward compatibility. What’s actually going on behind the scenes is that the web server is configured to display a default page for each directory, so the real address of the page would be:
http://www.example.com/file/index.asp
However, the web server serves that page without needing the “index.asp” part in the URL. The only thing you need to take care of with such a structure is to make sure that a request for http://www.example.com/file/index.asp is redirected to http://www.example.com/file/, to avoid creating two separate pages displaying exactly the same content. Most modern content management systems handle this redirect automatically.
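As a rough sketch of that redirect rule, the Python function below decides whether a requested path ends in a default document and, if so, which folder URL it should be redirected to. The name `default_document_redirect` and the list of default documents are assumptions made for the example; your server may be configured with different ones.

```python
from typing import Optional

# Assumption: the default documents the web server is configured to serve.
DEFAULT_DOCUMENTS = ("index.asp", "index.php", "index.html", "index.htm")


def default_document_redirect(path: str) -> Optional[str]:
    """Return the folder path to 301-redirect to, or None if no redirect is needed."""
    folder, _, filename = path.rpartition("/")
    if filename.lower() in DEFAULT_DOCUMENTS:
        return folder + "/"
    return None


print(default_document_redirect("/file/index.asp"))  # /file/
print(default_document_redirect("/file/"))           # None
```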
If your current website is using extensions, there’s no need to change things right away. However, when the next redesign or major platform update occurs, it might be something to consider doing.
Word Separator
The next aspect to be concerned with is the word separator and length. If you have more than one word in your file or folder name, you could use this configuration:
http://www.example.com/twowords/
If the words are common, well known, or frequently used, search engines can usually break them apart and understand what is going on. If they aren’t, however, search engines might have a problem. If you want to make sure that words will be identified in your URL, therefore, it is a good idea to use a separator. The most common separator is the hyphen, which would make your URL look like this:
http://www.example.com/two-words/
In recent years, search engines have improved the way they handle other separators, including the underscore. In theory, therefore, you could also use this URL:
http://example.com/two_words/
Some webmasters reported having problems with the underscore as a word separator, though, so if you want to be on the safe side, a hyphen is probably the best choice.
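If your file and folder names are generated from page titles, the separator choice can be applied automatically. The following is a minimal sketch; the function name `hyphenate` is hypothetical, and it assumes names contain only ASCII letters and digits.

```python
import re


def hyphenate(name: str) -> str:
    """Lowercase a name and join its words with hyphens."""
    # Spaces, underscores, and any other non-alphanumeric run act as separators.
    words = re.findall(r"[a-z0-9]+", name.lower())
    return "-".join(words)


print(hyphenate("Two Words"))  # two-words
print(hyphenate("two_words"))  # two-words
```

A name that was written without any separator, such as "twowords", is returned unchanged, because the function has no way of knowing where one word ends and the next begins.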
Keywords in the URL
When you are building your URLs, you want to include the main keywords of the page in question there, as this can help with the search rankings. You can also remove the connecting words and unrelated keywords to make the URL cleaner. For example, instead of using:
http://www.example.com/my-best-vacation-to-las-vegas/
you could use:
http://www.example.com/vacation-las-vegas/
Just like with title tags, you want to make your URLs as concise as possible. Most search engines also stop giving weight to words after the sixth or seventh one, so five or fewer is usually what you should aim for.
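The cleanup described above can be partly automated. The sketch below strips connecting words from a page title and keeps at most five keywords. The function name `keyword_slug` and the short word list are assumptions made for the example; a real list would be longer. Deciding which keywords are unrelated to the page (such as "best" in a vacation URL) is still an editorial choice that code cannot make for you.

```python
import re

# Assumption: a small illustrative list of connecting words.
CONNECTING_WORDS = {"a", "an", "and", "in", "my", "of", "on", "the", "to"}


def keyword_slug(title: str, max_words: int = 5) -> str:
    """Build a hyphenated URL slug from a title, dropping connecting words."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    keywords = [word for word in words if word not in CONNECTING_WORDS]
    # Keep only the first few keywords so the URL stays concise.
    return "-".join(keywords[:max_words])


print(keyword_slug("My Best Vacation to Las Vegas"))  # best-vacation-las-vegas
```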
Static URLs and URLs with Parameters
When you are building a website that has information contained in a database, there are different ways to retrieve that information. The easiest way is by passing parameters through a URL and making it dynamic:
http://www.example.com/file/?id=12345
While search engines have the ability to index and rank these pages, it is often desirable to make those URLs static, for example by using a folder structure:
http://example.com/file/12345/
The programming still takes the information from the database and builds the page in the same way, but this method makes indexing easier. If search engines see the following URLs, for instance:
http://example.com/file/?id=12345
http://example.com/file/?id=67890
they might not see them as two different pages, and therefore choose to index only one of them.
On the other hand, search engines will always see the following URLs as separate ones:
http://example.com/file/12345/
http://example.com/file/67890/
Another advantage of static URLs is that you can use them to add a keyword component, so instead of this:
http://example.com/file/?id=12345
you could have:
http://example.com/file/coffee/
As we explained before, having your main keywords in the URL is important, so try to use static URLs whenever possible.
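To make the mapping between the two URL styles concrete, here is a small sketch that turns a parameter-based URL into its folder-style equivalent. The function name `to_static_url` is hypothetical, and the sketch assumes a single parameter called `id`. The web server or CMS still has to be configured separately to route the static URL back to the right database record.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit


def to_static_url(url: str, param: str = "id") -> str:
    """Rewrite .../file/?id=12345 as .../file/12345/."""
    parts = urlsplit(url)
    values = parse_qs(parts.query).get(param)
    if not values:
        return url  # no such parameter, nothing to rewrite
    path = parts.path.rstrip("/") + "/" + values[0] + "/"
    # Any other query parameters are dropped in this simplified sketch.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


print(to_static_url("http://example.com/file/?id=12345"))
# http://example.com/file/12345/
```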
Site Architecture
Site architecture is an important and fairly complex aspect of building a website. In fact, entire books have been written about it. In this section we’ll address its key aspects. First of all, keep in mind the following basic principles:
- You want to divide your website into meaningful categories or sections.
- You want an architecture that is easily crawlable and therefore not complex.
- You want an architecture that is flat as opposed to deep.
- You want an architecture that allows you to expand easily.
We’ll be taking a more in-depth look at each of these aspects below.
Dividing Your Website into Meaningful Sections
When you begin thinking about building your website and doing keyword research, usually certain major topics or keyword groups emerge. These represent an efficient way to divide your website into sections. If you have a site about cars, for example, the different sections could be:
http://www.example.com/sport-cars/
http://www.example.com/family-cars/
http://www.example.com/luxury-cars/
Such a division would be easy for your visitors to understand and navigate.
In fact, it is important to create your internal sections with the end user in mind. Think about the way they segment things in your market or niche, and use that same division in your website. Sometimes businesses make the mistake of using website sections that reflect how they see the market, and not how the end users or clients see it.
If used properly, the different sections and categories in your site will also enrich your URLs with your main keywords.
Setting up a Crawlable Architecture
Another key aspect of your site architecture is how easily it can be crawled by the search engines. If they can’t discover all your internal pages easily and figure out the hierarchy that controls them, your search rankings might suffer.
Practically speaking, all of the main sections within a website should be interconnected with one another. Depending on the overall size of the site it may be advisable to interconnect even the second level of the website. This has to be decided case by case.
It is also good practice to make all your internal pages link back to the homepage, and possibly to your main sections, too.
Google has suggested keeping the number of links on a page below 100. While you can have more and Google will still crawl them, having too many can create problems.
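If you want to check how many links a page contains, you can count the anchor tags in its HTML. The sketch below uses Python's standard `html.parser` module. The names `LinkCounter` and `count_links` are hypothetical, and the HTML is assumed to be already downloaded into a string.

```python
from html.parser import HTMLParser


class LinkCounter(HTMLParser):
    """Counts <a> tags that carry an href attribute."""

    def __init__(self) -> None:
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # Anchors without href (named anchors) are not links, so skip them.
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self.count += 1


def count_links(html: str) -> int:
    parser = LinkCounter()
    parser.feed(html)
    parser.close()
    return parser.count


page = '<p><a href="/sport-cars/">Sport</a> <a href="/family-cars/">Family</a></p>'
print(count_links(page))  # 2
```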
One of the best ways to solve crawling problems is with one or more HTML sitemaps (not to be confused with XML sitemaps, which we will address in a different lesson). These are nothing more than pages inside your site that explain your site structure, with links to all the different sections and pages.
If you have a blog, for example, an “Archive” page containing links to all your published posts would act as an HTML sitemap. You could even group those posts by month, year or category. You can use the SGR Clean Archives WordPress plugin to achieve that.
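If you are not using a plugin, an HTML sitemap can be generated from a simple list of your pages. The following sketch builds a nested list grouped by section. The function name `build_html_sitemap` and the input shape (a dictionary mapping each section name to a list of title and URL pairs) are assumptions made for the example.

```python
from html import escape
from typing import Dict, List, Tuple


def build_html_sitemap(sections: Dict[str, List[Tuple[str, str]]]) -> str:
    """Return a nested <ul> with one entry per section and one link per page."""
    lines = ["<ul>"]
    for section, pages in sections.items():
        lines.append(f"  <li>{escape(section)}")
        lines.append("    <ul>")
        for title, url in pages:
            # Escape both values so titles or URLs with & or quotes stay valid HTML.
            lines.append(f'      <li><a href="{escape(url)}">{escape(title)}</a></li>')
        lines.append("    </ul>")
        lines.append("  </li>")
    lines.append("</ul>")
    return "\n".join(lines)


print(build_html_sitemap({
    "Sport Cars": [("Mustang", "/cars/mustang/")],
    "Family Cars": [("Minivans", "/family-cars/minivans/")],
}))
```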
Keeping Your Architecture Flat and Not Deep
Some websites are built with too many layers or levels, and this can create problems for search engines. As a rule of thumb, the farther a page is from the homepage, the harder it will be for robots to find it. Suppose a website about cars has the following architecture:
http://www.example.com/cars/
http://www.example.com/cars/ford/
http://www.example.com/cars/ford/sports-car/
http://www.example.com/cars/ford/sports-car/mustang/
This is an example of a deep website architecture, which is inefficient. A search robot would need to go through three levels of sub-folders to find the Mustang page, so it could have trouble finding and indexing it.
Ideally, you want to create a single page about the Mustang sitting at the top level like this:
http://www.example.com/cars/mustang/
Then you can link to the Mustang page from each of these sub-pages:
http://www.example.com/cars/ford/
http://www.example.com/cars/sports-car/
A secondary benefit of this flat structure is that if you decide to remove one of the sub-folders, say the “sports car” one, you won’t need to 301 redirect the Mustang page and every other sub page underneath that folder.
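A quick way to spot deep pages is to count the folders in each URL. The sketch below does that with a hypothetical function named `url_depth`. It treats the number of path segments as a rough proxy for how far a page is from the homepage, which is an assumption: the real distance depends on how your pages link to one another.

```python
from urllib.parse import urlsplit


def url_depth(url: str) -> int:
    """Count the non-empty path segments of a URL (the homepage has depth 0)."""
    path = urlsplit(url).path
    return len([segment for segment in path.split("/") if segment])


print(url_depth("http://www.example.com/cars/ford/sports-car/mustang/"))  # 4
print(url_depth("http://www.example.com/cars/mustang/"))                  # 2
```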
Setting up Your Architecture to Allow for Expansion
Another key mistake many people make when setting up a site architecture is not leaving themselves room for future expansion. To continue with our example of a car website, every year there will be new cars, so you would need to accommodate that. An architecture that wasn’t optimized for the future would look like this:
http://www.example.com/cars/2008/mustang/
http://www.example.com/cars/2007/mustang/
http://www.example.com/cars/2006/mustang/
With such a structure, every year the site owner would have to build new links to the new Mustang page to get it ranking well. A more efficient architecture for that situation would be:
http://www.example.com/cars/mustang/2008/
http://www.example.com/cars/mustang/2007/
http://www.example.com/cars/mustang/2006/
By adding the year as a sub-folder to the Mustang page you can keep any links and rankings from the main Mustang page and share them with the other pages. Secondly, any users who happen to come to the page looking for an older version will easily be able to reach it.
Action Points
- Make sure that only one version of your domain is active, and that the other one is being redirected. If you are using the http://www.example.com version, for example, you simply need to type http://example.com in your browser and see whether you will be redirected to the www. version or not.
- Analyze your URL structure to see if it reflects the optimization factors mentioned in this lesson. If you are not using folder based URLs, there is no need to change them right away. Just keep that point in mind for a future upgrade.
- If you don’t have one, consider creating an HTML sitemap for your website.
- Analyze your website architecture to make sure that it is not too deep, and evaluate whether or not you have enough room for expansion with the current structure.