How Google decides to index your content

I like to tell people that “Google doesn’t search the internet” because technically it doesn’t. Instead, Google makes a copy of as much of the internet as possible on their massive servers and searches that instead. Whatever it finds there will be presented to you as the results, and Google hopes/assumes that when you click through the results the current version of the page is similar to what they had on their server.

Indexing

Google does this through a process called indexing. It sends out spiders to crawl through websites and copy their content onto its servers. In theory, the more pages you can get into Google’s index, the better the chance that one of your pages will appear in a search result and bring you another visitor.

The thoughts in this post relate mostly to indexing, which is only half the battle. Once your pages are in the index, there is a lot you can do to help them rank better, which we often discuss in posts in the SEO category of this blog.

There are two facets of indexing that are important:

  • Speed: The quicker and more frequently Google indexes your content, the better.
  • Depth: Google won’t automatically index your entire site, especially if it’s new. Encouraging Google to go deeper is essential.

Get more indexed

There are a few key things to do to help get more of your content indexed.

  • Submit a sitemap to Google: The better that Google can understand your site, the more likely it is to index your pages. We typically recommend using the WordPress SEO by Yoast plugin to generate the sitemap, which you should then submit to Google Webmaster Tools (since renamed Google Search Console).
  • Increase your PageRank: PageRank is a measure of how important your site is, based on the number and quality of other sites that link to it. Back in the early days of Google, PageRank was king. Google even used to provide a public score from 0 to 10 for each site so you could see where things stood. Today, PageRank isn’t nearly as important for ranking but it’s a huge factor when it comes to indexing speed and depth. This can be a complicated subject, but simply getting quality websites to link to yours can help quite a bit.
  • Make your content easy to understand: At the end of the day, your primary job is to make your content easy to understand by humans. All of the SEO in the world is fruitless if it turns away people. However, you also need to make sure that Google has an easy time understanding things as well. Proper markup, solid code and fast loading pages will help Google to get through things quickly and hopefully stick around a bit longer.
  • Avoid duplicate content: While there is no actual Google penalty for duplicate content, Google isn’t too keen to index it. This applies not only to content that you’ve “borrowed” from another site, but also having duplicate content on your own site. Even trickier, you need to make sure another site doesn’t steal your content and then Google indexes them instead of you! A quick tip to help with that is the PuSHPress plugin, which notifies Google the instant you push “publish” so that they know you were first.
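
If you’re curious what a sitemap actually contains, here is a minimal sketch in Python that builds one by hand. It follows the public sitemaps.org format, but it is not how the Yoast plugin generates its sitemaps; the function name build_sitemap and the example.com URLs are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as a string for a list of (url, lastmod) pairs.

    build_sitemap is a hypothetical helper for this post. lastmod is an
    optional date string such as "2024-01-15"; pass None to leave it out.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        if lastmod:
            ET.SubElement(entry, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Placeholder URLs; swap in the pages you want Google to find.
    print(build_sitemap([
        ("https://example.com/", "2024-01-15"),
        ("https://example.com/about/", None),
    ]))
```

In practice a plugin will keep this file up to date for you, which is why we recommend one, but seeing the raw format makes it clear that a sitemap is simply a list of the URLs you want crawled.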

Make sure you don’t block Google

This is a small but critical tip: in WordPress there is an option in the “Settings” –> “Reading” panel to block search engines. It’s good to do this if you have a site under development on a separate domain, but it will hurt you badly if you leave it enabled on a live site.

This is a directive asking search engines not to index any pages on the site, and any reputable search engine (including Google, Bing and others) will respect it. Make sure it’s unchecked!

[Image: search engine visibility setting]
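
If you’d like to double-check a live site from the outside, the sketch below looks for the two common ways a “please don’t index me” request shows up: a robots meta tag containing noindex, and a robots.txt rule that disallows the page. Which of these your WordPress install emits depends on its version and plugins, so treat that part as an assumption. The helper names has_noindex and robots_txt_blocks are ours, and you would need to fetch the page HTML and robots.txt yourself.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        if (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html):
    """Return True if the page's robots meta tag asks not to be indexed."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def robots_txt_blocks(robots_txt, url, agent="Googlebot"):
    """Return True if the given robots.txt text disallows `url` for `agent`."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return not rules.can_fetch(agent, url)


if __name__ == "__main__":
    # Hand-written sample inputs, not taken from a real site.
    page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    print(has_noindex(page))
    print(robots_txt_blocks("User-agent: *\nDisallow: /", "https://example.com/post/"))
```

If either check comes back True on a site that is supposed to be public, the “Reading” setting is the first place to look.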

Track it

There are a lot of good reasons to use Google Webmaster Tools, and tracking your index status is one of them. There are a few good places to look in there.

Google Index –> Index Status
This shows how many pages Google has indexed over the past year, and ideally it shows a slow, steady climb. Look for any big jumps up or down, since they can signal a problem. Here is a good-looking index chart for our friends over at addONE Marketing, showing growth due to their blogging efforts over the past year.

[Image: index status chart]

Crawl –> Crawl Stats
Similar to the example above, this gives you an idea of how Google has been reacting to your site lately. This shows how many pages they’re crawling each day over the past 90 days. In a perfect world, they’d crawl every page on your site daily (if not twice or more per day), but what we’re looking for here is activity. If Google’s spiders are visiting every day and crawling around, that’s a good thing. Keep an eye on this from time to time to make sure they’re visiting often. Here is an example from our site showing solid crawling over the past few months.

[Image: crawl stats chart]
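
If you have access to your server’s access logs, you can get a rough second opinion on crawl activity yourself. The sketch below counts requests per day whose user agent mentions Googlebot. It assumes the common Apache/Nginx “combined” log format, and the function name googlebot_hits_per_day is ours. Keep in mind that a user agent string can be faked, so these numbers are only an approximation of what Google reports.

```python
import re
from collections import Counter

# Matches the date part of a combined-log timestamp,
# e.g. [10/Oct/2023:13:55:36 +0000] -> "10/Oct/2023"
DATE_RE = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}):")


def googlebot_hits_per_day(log_lines):
    """Count log lines per day whose text mentions Googlebot.

    Assumes each line carries a bracketed timestamp and the user agent,
    as in the combined log format. Lines without a timestamp are skipped.
    """
    hits = Counter()
    for line in log_lines:
        if "Googlebot" not in line:
            continue
        match = DATE_RE.search(line)
        if match:
            hits[match.group(1)] += 1
    return hits


if __name__ == "__main__":
    # "access.log" is a placeholder path; point it at your own log file.
    with open("access.log", encoding="utf-8", errors="replace") as log:
        for day, count in googlebot_hits_per_day(log).items():
            print(day, count)
```

As with the Crawl Stats chart, what matters is steady activity: days with zero Googlebot visits are worth a closer look.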

Keep an eye on it

As you can see above, there’s not much you can do to directly influence Google’s crawling of your site, but there are certainly some little things you can do to help. If you’re concerned that your site simply isn’t being indexed as well as it should be, reach out to us and we’ll be happy to help you get to the bottom of it.

About the Author

Mickey Mellen

Co-Founder and Technical Director
