The Joomla! Community Magazine™

Dead Links Walking

Written by | Wednesday, 01 May 2013 00:00 | Published in 2013 May
A cautionary tale of how the past caught up with Martin Raja and me, and how we beat it senseless when it did.
Dead Links Walking image by @Helvecio

"Hey! This link isn't working anymore!"

Never what you want to hear after an upgrade under the best circumstances. And this was far from the best. The link in the address bar was completely scrambled, not recognizable as a link at all.

Where did the link come from? Great. It was a link from the really old format, containing "task=view" as the cue for Joomla to display the page. Old. Seriously old.

But the news kept getting worse. That link was still being sent out in emails today, and the link was supposed to work for the customer for a full year after we sent it. Which left us with two choices: fix the emails and somehow make this old link work, or just fix the emails going out and wait a full year to migrate off Joomla 1.5. The latter option clearly could not be taken seriously.

What to do? It seemed like an easy task for Apache's mod_rewrite to handle, so that's where Martin and I started. Our first few tries failed, and from what Google could tell us, no one else had done this before, either.

Friends advised us to write a custom router plugin for this purpose, but by now the challenge had been laid down; we were going to convince Apache to do this. We took turns hammering on the problem, each try moving us closer but not succeeding. It was Martin, my co-conspirator in this article, who found the final missing piece.

We had started with the idea of constructing a full regex to substitute "view=article" for "task=view", but after the first few tries didn't work, we simplified the task. We didn't need to drop the task parameter, after all; Joomla would merely ignore it. We only needed to add the view parameter. Doing this triggered Apache's propensity for looping; while that cost us time to debug, it also finally revealed the key to solving the puzzle.

Our final solution was just three lines for mod_rewrite. Feel free to experiment with it for your own purposes; it solved our problem (we only had content items to display with those links) and while ours may not be the best fit for your situation, the approach we used should be able to guide you to your solution.

To fully understand this, you're going to need a grasp of Apache’s mod_rewrite, as well as how regular expressions work. Others have already written good introductions to both, so rather than repeat what they wrote, I'll simply direct you to Apache’s mod_rewrite documentation and this introduction to regular expressions.

With that under your belt, grab your hard hat and let’s go exploring the rewrite code:

RewriteCond %{QUERY_STRING} ^(.*)task=view(.*)$

The RewriteCond directive sets a condition for a rewrite to happen. Every RewriteCond must evaluate to true before the RewriteRule that follows will be executed. This particular line evaluates to true if the query string of the incoming request has "task=view" anywhere in it. (The ^ represents the beginning of the string and the $ the end. The “.” matches any character and the “*” repeats that match for any number of characters, so with that expression we're processing the whole string. The two pairs of parentheses capture whatever comes before and after "task=view" so those pieces can be reused later.)
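To see what those two pairs of parentheses capture, here is a minimal sketch in Python rather than Apache. It assumes Python's `re` module treats this simple pattern the same way Apache's regex engine does, and the query string is a made-up example, not one of our real links.

```python
import re

# Same pattern as the RewriteCond above.
TASK_VIEW = re.compile(r"^(.*)task=view(.*)$")

# Hypothetical Joomla 1.0-style query string, invented for illustration.
query = "option=com_content&task=view&id=42&Itemid=7"

match = TASK_VIEW.search(query)
before = match.group(1)  # what Apache would later expose as %1
after = match.group(2)   # what Apache would later expose as %2

print(repr(before))  # 'option=com_content&'
print(repr(after))   # '&id=42&Itemid=7'

# A query string without "task=view" does not satisfy the condition at all.
print(TASK_VIEW.search("option=com_content&view=article&id=42"))  # None
```

Everything in front of "task=view" lands in the first group and everything behind it in the second, which is exactly what the final rule needs in order to rebuild the query string.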

RewriteCond %{QUERY_STRING} !(view=article)

This line turned out to be the key to the puzzle. Apache loves to loop; if it makes a change, it repeats itself. This was made very clear when we tried just adding the new term to the string; without this line, it would loop infinitely, locking up the server. (Or it would if Apache didn't recognize its addiction to looping. Instead Apache says "Nope, ain't gonna go there" and simply declines to execute any of the rewrites that might cause the loop.) This particular quality had bitten us in several earlier tries before Martin realized what was happening. This line guards against that: if the "view=article" parameter is present anywhere in the query string, this test returns false (the “!” means Not) and we exit this rewrite set, eliminating the loop.
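The effect of the guard can be sketched with a toy model in Python. This is not how Apache works internally; it only mimics our earlier attempt of appending "view=article" while leaving "task=view" in place, and the function names, the sample query string and the cut-off of 10 passes are all assumptions made for illustration.

```python
import re

TASK_VIEW = re.compile(r"^(.*)task=view(.*)$")
HAS_VIEW_ARTICLE = re.compile(r"view=article")


def add_view(query, guard):
    """Return the rewritten query string, or None if no rewrite applies."""
    if not TASK_VIEW.search(query):
        return None
    # The guard plays the role of the negated RewriteCond.
    if guard and HAS_VIEW_ARTICLE.search(query):
        return None
    # Model of the early attempt: keep task=view, just add the new term.
    return query + "&view=article"


def count_passes(query, guard, limit=10):
    """Count how many times the rule fires before it stops or hits the limit."""
    passes = 0
    while passes < limit:
        new_query = add_view(query, guard)
        if new_query is None:
            break
        query = new_query
        passes += 1
    return passes


sample = "option=com_content&task=view&id=42"  # hypothetical query string
print(count_passes(sample, guard=False))  # 10, stopped only by the limit
print(count_passes(sample, guard=True))   # 1
```

Without the guard the rule keeps matching its own output and only the arbitrary limit ends the run; with the guard it fires once and then steps aside.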

RewriteRule ^.*$ /$0?%1view=article%2 [R=301,L]

And this is where the work gets done. The first part of the line ("^.*$") simply selects the entire URI as it comes in. The second part creates the new, rewritten request. The $0 takes the string selected in the first part of the rule, and places it after a "/", giving us "/index.php" for the URI.
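The %1 and %2 are called backreferences, and they refer back to the two parenthesized parts of the first RewriteCond that surrounded the "task=view" -- this moves them to surround "view=article" instead. (The second RewriteCond is negated, so it captures nothing and leaves these backreferences pointing at the first.)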

The two arguments inside square brackets (“R=301” and “L”) mark the final disposition: do a 301 redirect, and make this the last operation for this set of rules.

This reconstructs the original request into the form of an article request in today's Joomla, which is what we needed.
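Putting the three lines together, the whole rule set can be approximated in a few lines of Python. Treat this as a rough sketch of the logic and not a replacement for testing on a real server: the `rewrite` function, the sample path and the sample query string are hypothetical, and the path is assumed to arrive without its leading slash, which matches the "/index.php" result described above.

```python
import re

TASK_VIEW = re.compile(r"^(.*)task=view(.*)$")   # first RewriteCond
VIEW_ARTICLE = re.compile(r"(view=article)")     # second RewriteCond, negated below
WHOLE_PATH = re.compile(r"^.*$")                 # RewriteRule pattern


def rewrite(path, query):
    """Return the redirect target, or None when the rule set does not apply."""
    cond = TASK_VIEW.search(query)
    if cond is None:
        return None
    if VIEW_ARTICLE.search(query):  # the "!" turns a match into a failure
        return None
    whole = WHOLE_PATH.match(path).group(0)  # stands in for $0
    # Stands in for: /$0?%1view=article%2
    return "/" + whole + "?" + cond.group(1) + "view=article" + cond.group(2)


old_query = "option=com_content&task=view&id=42&Itemid=7"
target = rewrite("index.php", old_query)
print(target)
# /index.php?option=com_content&view=article&id=42&Itemid=7

# The redirected request no longer contains task=view, so nothing fires again.
new_query = target.split("?", 1)[1]
print(rewrite("index.php", new_query))  # None
```

The second call returning None is the model's version of the redirect settling after a single hop.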

While this rewrite rule worked for the limited circumstances we had, it is not enough to handle every possible Joomla 1.0 request. But even if it’s not enough for you, it can be used as a guide for creating a rewrite that will fit your need.
