Feb 18, 2013 - Land of lost clicks.

Google tracks clicks from its search pages. It does this with a small redirect, so any time you click a result it knows what was clicked. Because it uses an HTTP redirect, the click stats reported in Webmaster Tools and Analytics are pretty reliable (even if they are rounded up and down).

If you have Google Analytics you can also track the number of visits you get from Google for a given search term. This uses the Google Analytics JavaScript, which normally fires only after your page has loaded.

Due to the nature of the web these numbers will never match up entirely, but if you are seeing a big difference between the clicks reported by Google and the visits reported on your site, then you have an idea of the proportion of users who give up before the page has finished loading.* This is incredibly bad for a number of important reasons.

  1. You are throwing away money. Visitors from search are a source of income for many companies, and if people don't load the page you have lost a potential customer.
  2. Google uses such stats to rank search results. It is likely that the user has clicked back and found another link, and if they stay on that link Google will think it is higher quality than your site.
  3. Your brand is being tarnished before you have even had a chance to show a logo. Pretty pictures and fonts, good copy and even a brilliant product are not evaluated because your site didn't load. People may still remember the name, and associate it with a slow, bloated website.

How to check if this is happening to you?

  1. Open Google Analytics for your site and choose a suitable date range. Leave the last week off, as the Google stats are not always right up to date.
  2. Open a second tab with the same date range.
  3. In one tab open "Traffic sources", "Search Engine optimisation", "queries".
  4. In the other tab look for "Traffic sources", "search", "organic".
  5. If you look at a "query" in one tab, its number of "clicks" should be similar to the number of "visits" from the same keyword in the other.

If your number of "visits" is considerably lower than the number of "clicks" then you are losing people in this manner.
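To put a number on the gap, the comparison can be scripted. The following is a minimal sketch in PHP. The query names and figures are made up for illustration, and lost_click_share() is a hypothetical helper, not part of any Google tool. It assumes you have copied the "clicks" and "visits" per query out of the two reports by hand.

```php
<?php
// Hypothetical figures copied by hand from the two reports; not real data.
$clicks = array('blue widgets' => 200, 'widget repair' => 50);
$visits = array('blue widgets' => 120, 'widget repair' => 48);

/**
 * Returns, per query, the share of clicks that never became a recorded visit.
 */
function lost_click_share(array $clicks, array $visits) {
  $lost = array();
  foreach ($clicks as $query => $click_count) {
    if ($click_count <= 0) {
      continue; // No clicks reported, so there is nothing to compare.
    }
    $visit_count = isset($visits[$query]) ? $visits[$query] : 0;
    // Click counts are rounded, so visits can exceed them; clamp at zero.
    $lost[$query] = max(0, ($click_count - $visit_count) / $click_count);
  }
  return $lost;
}

foreach (lost_click_share($clicks, $visits) as $query => $share) {
  printf("%s: %.0f%% of clicks lost\n", $query, $share * 100);
}
```

With the sample figures this reports 40% lost for the first query and 4% for the second. A few percent is probably noise; a gap like the first one is worth investigating.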

Caveat: this is based on my understanding of and experience with Google's tools. I do not work for Google or have direct knowledge of how the tools work under the hood or how the reports are generated. There will always be some differences in the numbers, but if you are constantly getting much lower visits than clicks, something needs looking at, most likely the speed your pages load at and the page weight.

* Or for whom JavaScript is not working on the site.

Mar 6, 2012 - Thoughts on internationalization and SEO

I am looking at implementation strategies for a website, and it has been a while since I have been in i18n land. So I have been doing some research, and I am shocked at the state of things.

When web browsers request a page they send details of the language they would like to receive it in. "Accept-Language: en-us,en;" is what my Firefox sends with every request. This tells the server that I would like English content (US English, but that is another rant). This is part of the core HTTP standard.

What this would mean is that when I go to a website, the server could send back a response that depends on my browser's settings. http://www.example.com/ would show English content if I was browsing with en set as my language, French if I had fr set, and so on. This is a pretty good user experience. That page represents a single point in the web graph: you would want to link to it from anywhere and not care about the locale of the person following the link.

So it seems like a nice mechanism to create a multilingual site, right?
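As a rough illustration of that server-side choice, here is a minimal PHP sketch that picks a language from the header. The function name pick_language() and its parameters are this example's own inventions, and the matching is deliberately simpler than the full HTTP rules: it honours q values, tries the full tag first and then its primary part.

```php
<?php
/**
 * Picks the best language for a request from its Accept-Language header.
 *
 * A sketch only: $supported (languages the site has content for) and
 * $default are assumptions of this example.
 */
function pick_language($header, array $supported, $default) {
  $candidates = array();
  foreach (explode(',', (string) $header) as $position => $part) {
    $pieces = explode(';', $part);
    $tag = strtolower(trim($pieces[0]));
    if ($tag === '') {
      continue; // Empty header or stray comma.
    }
    $q = 1.0; // A tag without a q value counts as fully preferred.
    foreach (array_slice($pieces, 1) as $param) {
      $param = trim($param);
      if (strpos($param, 'q=') === 0) {
        $q = (float) substr($param, 2);
      }
    }
    $candidates[] = array('tag' => $tag, 'q' => $q, 'position' => $position);
  }

  // Highest q first; equal q values keep their order in the header.
  usort($candidates, function ($a, $b) {
    if ($a['q'] == $b['q']) {
      return $a['position'] - $b['position'];
    }
    return $a['q'] < $b['q'] ? 1 : -1;
  });

  foreach ($candidates as $candidate) {
    if ($candidate['q'] <= 0) {
      continue; // q=0 means "not acceptable".
    }
    // Try the full tag (en-us), then fall back to its primary part (en).
    $parts = explode('-', $candidate['tag']);
    foreach (array($candidate['tag'], $parts[0]) as $language) {
      if (in_array($language, $supported, true)) {
        return $language;
      }
    }
  }
  return $default;
}
```

In a page handler this might be called as pick_language($_SERVER['HTTP_ACCEPT_LANGUAGE'], array('en', 'fr'), 'en'), after checking that the header is set. A request that sends no header at all always falls through to the default language.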

WRONG.

Google makes this next to impossible. Googlebot doesn't send an Accept-Language header, which is valid HTTP, but it means that you only get one language indexed.

In fact Google even states:

"Keep the content for each language on separate URLs. Don’t use cookies to show translated versions of the page. Consider cross-linking each language version of a page. That way, a French user who lands on the German version of your page can get to the right language version with a single click."
So they are saying: don't track user preferences via cookies (or headers). They also say that they ignore language codes in a page and guess the language based on the content.

What Google would like, apparently, is:

http://www.example.com/ (shows a language picker page with no content)
http://www.example.com/en - shows the English version
http://www.example.com/fr - shows the French version
etc.

Why is this bad?

As a French user I may create a link to the French URL, and as an English user to the English one, even though both have equivalent content. A German user may not see a link to the page, or may assume there is no German translation because there isn't a link in their language.

Remember that Google also doesn't like duplicate content on a website, so you are going against Google's own wishes to do the language dance.

What can you do about it?

If you don't care about your alternate languages being indexed then you don't need to do anything: you can use the HTTP header (backed with a cookie preference override). If you have a web app (which won't be indexed) this is not a bad idea.

However, a lot of pages do need to be indexed, and there you are stuck with language-based URLs. There isn't even a language option when creating sitemaps.

You can avoid being marked down by Google for duplicate content by using rel="alternate" link tags:

<link rel="canonical" href="http://www.example.com/" />

<link rel="alternate" hreflang="en" href="http://www.example.com/" />

<link rel="alternate" hreflang="en-gb" href="http://en-gb.example.com/" />

<link rel="alternate" hreflang="en-us" href="http://en-us.example.com/" />

<link rel="alternate" hreflang="de" href="http://de.example.com/" />
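Writing these tags by hand on every page gets tedious, so they are usually generated. A minimal PHP sketch, assuming you keep a map of hreflang codes to URLs for each page; hreflang_links() is a hypothetical helper name, not a Drupal or Google API:

```php
<?php
/**
 * Builds rel="alternate" link tags from a map of hreflang code => URL.
 */
function hreflang_links(array $alternates) {
  $lines = array();
  foreach ($alternates as $hreflang => $url) {
    // Escape both values so a stray quote or ampersand cannot break the tag.
    $lines[] = sprintf(
      '<link rel="alternate" hreflang="%s" href="%s" />',
      htmlspecialchars($hreflang, ENT_QUOTES, 'UTF-8'),
      htmlspecialchars($url, ENT_QUOTES, 'UTF-8')
    );
  }
  return implode("\n", $lines);
}

// The same language versions as in the tags above.
echo hreflang_links(array(
  'en' => 'http://www.example.com/',
  'en-gb' => 'http://en-gb.example.com/',
  'en-us' => 'http://en-us.example.com/',
  'de' => 'http://de.example.com/',
)), "\n";
```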

I don't often have a go at the big G, but in this case I think they have done the web a disservice.

Mar 2, 2012 - Drupal Batch API as simple as possible.

Drupal's Batch API is a handy tool for processing multiple items: it gives you feedback on how a process is going and avoids page timeouts and memory issues.

All that you need to be able to make use of the Batch API is:

  • A batch of items to process
  • A function to process each item in the batch

So say you wanted to amend a subset of nodes: you could write a query to retrieve the NIDs, create a function to do your amendment, and use the Batch API.
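The query part might look like the following. This is a sketch assuming Drupal 7's db_query() with named placeholders; the helper name mymodule_nids_to_ammend() and the idea of selecting by content type are made up for the example, so substitute whatever condition defines your subset.

```php
<?php
/**
 * Returns the NIDs of every node of one content type.
 *
 * Hypothetical helper: the name and the content-type filter are
 * assumptions of this sketch. Relies on Drupal 7's db_query().
 */
function mymodule_nids_to_ammend($type) {
  return db_query(
    'SELECT nid FROM {node} WHERE type = :type ORDER BY nid',
    array(':type' => $type)
  )->fetchCol();
}

// Inside Drupal: $nid_list = mymodule_nids_to_ammend('article');
```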

There is only one key you need to know in the batch array, and that is operations. Each operation is an array holding the name of a callback and an array of the arguments to pass to it.

For the nodes you want to amend that would be something like:

<?php
$batch['operations'] = array(
  array('mymodule_ammend_node', array(1)),
  array('mymodule_ammend_node', array(2)),
  array('mymodule_ammend_node', array(3)),
);

Of course you are more likely to create the operations in a loop rather than hard coding them:

<?php
foreach ($nid_list as $nid) {
  // Each operation is array(callback name, array of arguments).
  $batch['operations'][] = array('mymodule_ammend_node', array($nid));
}

Once you have your operations array, running a batch is very simple. This is all you need:

<?php
batch_set($batch); // See the code above for how $batch is built.
batch_process();   // Not needed inside a form submit handler; the Form API calls it for you.
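The operations name a callback, mymodule_ammend_node, that is not shown here. A minimal sketch of what it could look like, assuming Drupal 7: the Batch API passes each operation's arguments to the callback, followed by a $context array taken by reference. The title clean-up is only a placeholder for whatever amendment you actually need.

```php
<?php
/**
 * Batch operation callback: amends a single node.
 *
 * Drupal calls this once per operation, passing the operation's arguments
 * followed by the batch context array.
 */
function mymodule_ammend_node($nid, &$context) {
  $node = node_load($nid);
  if ($node) {
    // Placeholder amendment: tidy up the title. Replace with your own change.
    $node->title = trim($node->title);
    node_save($node);
    // Results are handed to the batch's "finished" callback, if you set one.
    $context['results'][] = $nid;
  }
  // Shown to the user alongside the progress bar.
  $context['message'] = t('Processed node @nid', array('@nid' => $nid));
}
```

Loading and saving one node per operation keeps each step small, which is what lets the batch spread the work over several requests.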

There are a lot of other things that you can do with the batch API. But many times this simple case is all you need.