Jeremy French

Infrequently blogging since 2010

Mar 6, 2012 - Thoughts on internationalization and SEO

I am looking at implementation strategies for a website, and it has been a while since I have been in i18n land. So I have been doing some research, and I am shocked at the state of things.

When web browsers request a page, they send details of the language they would like to receive it in. Accept-Language: en-us,en;q=0.5 is what my Firefox sends with every request. This tells the server that I would like English content (US English, but that is another rant). This is part of the core HTTP standard.

What this would mean is that when I go to a website, the server could send back responses that depend on my browser's setting. http://www.example.com/ would show English content if I was browsing with an en language set, French if I had fr set, and so on. This is a pretty good user experience. That page represents a single point in the web graph: you would want to link to it from anywhere and not care about the locale of the person following the link.
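To make the idea concrete, here is a minimal sketch of server-side negotiation in PHP. The function names, the supported-language list and the fallback are all my own assumptions for illustration, and the matching is deliberately simple (full tag first, then the primary subtag) rather than a complete implementation of the HTTP rules.

```php
<?php
/**
 * Hypothetical helper: parse an Accept-Language header into language tags,
 * most preferred first. Tags with q=0 are dropped.
 */
function example_parse_accept_language($header) {
  $entries = array();
  foreach (explode(',', $header) as $position => $part) {
    $pieces = explode(';', $part);
    $tag = strtolower(trim($pieces[0]));
    if ($tag === '') {
      continue;
    }
    $q = 1.0; // A tag without an explicit q-value counts as 1.
    foreach (array_slice($pieces, 1) as $param) {
      if (preg_match('/^\s*q\s*=\s*([0-9.]+)\s*$/i', $param, $m)) {
        $q = (float) $m[1];
      }
    }
    if ($q > 0) {
      $entries[] = array('tag' => $tag, 'q' => $q, 'position' => $position);
    }
  }
  // Highest q first; ties keep the order the browser sent them in.
  usort($entries, function ($a, $b) {
    if ($a['q'] != $b['q']) {
      return ($a['q'] > $b['q']) ? -1 : 1;
    }
    return $a['position'] - $b['position'];
  });
  $tags = array();
  foreach ($entries as $entry) {
    $tags[] = $entry['tag'];
  }
  return $tags;
}

/**
 * Hypothetical helper: pick the best supported language for a header.
 */
function example_pick_language($header, array $supported, $default) {
  foreach (example_parse_accept_language($header) as $tag) {
    // Try the full tag (en-us) first, then its primary subtag (en).
    $parts = explode('-', $tag);
    foreach (array($tag, $parts[0]) as $candidate) {
      if (in_array($candidate, $supported, TRUE)) {
        return $candidate;
      }
    }
  }
  return $default;
}

// The header my browser sends, against a site offering English and French.
echo example_pick_language('en-us,en;q=0.5', array('en', 'fr'), 'en'), "\n"; // prints "en"
```

In a real request the header would come from $_SERVER['HTTP_ACCEPT_LANGUAGE'], which may be missing entirely, so the default matters.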

So it seems like a nice mechanism to create a multilingual site, right?

WRONG.

Google make this next to impossible. Googlebot doesn't send an Accept-Language header, which is valid HTTP, but it means that you only get one language indexed.

In fact Google even states:

"Keep the content for each language on separate URLs. Don’t use cookies to show translated versions of the page. Consider cross-linking each language version of a page. That way, a French user who lands on the German version of your page can get to the right language version with a single click."
So they are saying: don't track user preferences via cookies (or headers). They also say that they ignore language codes in a page and guess the language based on the content.

What Google would like, apparently, is:

http://www.example.com/ (shows a language picker page with no content)
http://www.example.com/en - shows the English version
http://www.example.com/fr - shows the French version
etc

Why is this bad?

As a French user I may create a link to the French URL, and as an English user to the English one, even though both pages have equivalent content. A German user may not see a link to the page, or may assume there is no German translation because there isn't a link in their language.

Remember, Google also doesn't like duplicate content on a website, so you are going against Google's own wishes to do the language dance.

What can you do about it?

If you don't care about your alternate languages being indexed then you don't need to do anything: you can use the HTTP header (backed with a cookie preference override). If you have a web app (which won't be indexed) this is not a bad idea.
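As a rough sketch of that precedence (explicit cookie choice first, then whatever the header negotiation produced, then a site default), something like the following would do. The cookie name "language", the function name and its parameters are assumptions of mine, not part of any framework.

```php
<?php
/**
 * Hypothetical helper: decide which language to serve.
 *
 * @param array $cookies     Request cookies, e.g. $_COOKIE.
 * @param mixed $negotiated  Language derived from Accept-Language, or NULL.
 * @param array $supported   Language codes the site can serve.
 * @param string $default    Fallback when nothing else matches.
 */
function example_choose_language(array $cookies, $negotiated, array $supported, $default) {
  // An explicit choice stored in a cookie overrides the browser setting.
  if (isset($cookies['language']) && in_array($cookies['language'], $supported, TRUE)) {
    return $cookies['language'];
  }
  // Otherwise use the language negotiated from the request header.
  if ($negotiated !== NULL && in_array($negotiated, $supported, TRUE)) {
    return $negotiated;
  }
  return $default;
}

// A browser asking for English whose user has picked French on the site.
echo example_choose_language(array('language' => 'fr'), 'en', array('en', 'fr'), 'en'), "\n"; // prints "fr"
```

Checking the cookie value against the supported list means a stale or tampered cookie simply falls through to the header.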

However, a lot of pages do need to be indexed. There you are stuck with language-based URLs; there isn't even a language option when creating sitemaps.

You can avoid being marked down by Google for duplicate content by using rel="alternate" link tags with an hreflang attribute:

<link rel="canonical" href="http://www.example.com/" />

<link rel="alternate" hreflang="en" href="http://www.example.com/" />

<link rel="alternate" hreflang="en-gb" href="http://en-gb.example.com/" />

<link rel="alternate" hreflang="en-us" href="http://en-us.example.com/" />

<link rel="alternate" hreflang="de" href="http://de.example.com/" />
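If you have more than a couple of languages you will probably want to generate these tags rather than write them by hand. Here is a minimal sketch in PHP; the function name and the map of language codes to URLs are made up for the example.

```php
<?php
/**
 * Hypothetical helper: build rel="alternate" link tags from a map of
 * hreflang code => URL. Values are escaped for use in HTML attributes.
 */
function example_hreflang_links(array $alternates) {
  $lines = array();
  foreach ($alternates as $hreflang => $url) {
    $lines[] = sprintf(
      '<link rel="alternate" hreflang="%s" href="%s" />',
      htmlspecialchars($hreflang, ENT_QUOTES, 'UTF-8'),
      htmlspecialchars($url, ENT_QUOTES, 'UTF-8')
    );
  }
  return implode("\n", $lines);
}

echo example_hreflang_links(array(
  'en' => 'http://www.example.com/',
  'de' => 'http://de.example.com/',
)), "\n";
```

Every language version of the page would carry the same set of tags, so one map per page is enough.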

I don't often have a go at the big G, but in this case I think they have done the web a disservice.

Mar 2, 2012 - Drupal Batch API as simple as possible.

Drupal's Batch API is a handy tool for processing multiple items. It gives you feedback on how a process is going and avoids page timeouts or memory issues.

All that you need to be able to make use of the Batch API is:

  • A batch of items to process
  • A function to process each item in the batch

So say you wanted to amend a subset of nodes, you could write a query to retrieve the NIDs, create a function to do your amendment and use the batch api.

There is only one parameter you need to know for the batch api array and that is operations.

For the nodes you want to amend that would be something like:

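<?php
// Each operation is array(callback name, array of arguments for the callback).
$batch['operations'] = array(
  array('mymodule_ammend_node', array(1)),
  array('mymodule_ammend_node', array(2)),
  array('mymodule_ammend_node', array(3)),
);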

Of course, you are more likely to create the operations in a loop than to hard-code them:

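<?php
// $nid_list is the array of NIDs returned by your query.
$batch = array('operations' => array());
foreach ($nid_list as $nid) {
  // The callback arguments must be wrapped in an array, even a single NID.
  $batch['operations'][] = array('mymodule_ammend_node', array($nid));
}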

Once you have your operations array created, running a batch is very simple. This is all you need:

<?php
batch_set($batch); //See code above for batch array creation
batch_process();
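For completeness, here is a rough sketch of what the operation callback might look like, assuming Drupal 7. The Batch API calls each operation with its own arguments followed by a $context array passed by reference. The amendment itself (trimming the title) is just a placeholder I made up; node_load(), node_save() and t() are the usual Drupal 7 functions.

```php
<?php
/**
 * Batch operation callback: amend a single node.
 *
 * $context is appended by the Batch API after the operation's own arguments.
 */
function mymodule_ammend_node($nid, &$context) {
  $node = node_load($nid);
  if (!$node) {
    // The node may have been deleted since the list of NIDs was built.
    return;
  }

  // Placeholder amendment for illustration: tidy up the title.
  $node->title = trim($node->title);
  node_save($node);

  // Keep track of what was done and update the progress message.
  $context['results'][] = $nid;
  $context['message'] = t('Amended node @nid', array('@nid' => $nid));
}
```

Because each operation handles one node, a failure or timeout on one item does not lose the work already done on the others.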

There are a lot of other things that you can do with the Batch API, but many times this simple case is all you need.

Feb 6, 2012 - Upgrading to Drupal 7

A couple of notes on this, which should be obvious.

Don't try and do it in a lunch break. Make sure your host gives you enough memory. 32MB won't cut it. Hopefully if you are seeing this it means that I have rectified these mistakes.