Is there a clever / automated way to tidy up a list of Redirects?

Jon
User offline. Last seen 4 hours 11 min ago. Offline
Joined: 27/08/2010

OK, my htaccess file is a mess. In the last 5 years I have changed the URL structure of my site about a hundred times.

I had static files in root, then created directories, then used Blogger FTP for the "blog", then added new sections with Blogger FTP, then converted all these sections to WordPress, then moved all the WordPress installs into one root installation. Then I had Panda.

This has resulted in a rather large htaccess file, as each time I was careful to ensure that everything was redirected and link juice wastage was limited.

I have about 4500 Redirect 301s.

Now, I heard that lots of redirects can slow a server down and also that Google steals juice on each redirect, something like 15% (although I am not sure if this is just between domains, and not within the same domain). I also get the impression from what one Google employee said that there is a trust element to multiple redirects to consider too.

I suspect that I have a lot of redirects that are not required and many that bounce from one URL format to another. I did actually attempt a clean up a while ago, so those 4500 are already a cut down version (I removed loads of Blogger labels a few weeks ago).

Obviously a lot of the redirects are probably totally redundant now: set up to redirect search queries after a restructure, but never had any links in place. I know all links on my site are good as I checked them all and "fixed" all site redirects. So a redirect only matters if there is a link from another site.

So I am thinking, probably not worth the time investment to go through that list and ensure that the final URL is correct. But, if there was a clever automated way of cleaning the list that would be great!

If there is one, it is not being shared with Google search ....

Any ideas?

User offline. Last seen 7 hours 30 min ago. Offline
Joined: 19/08/2010

When people talk about lots of redirects being bad it's chains of them, not the total number.

So...

pagea.html > newpage1.php, pageb.html > newpage2.php, pagec.html > newpage3.php... etc etc

is ok, but

pagea.html > newpage1.php, newpage1.php > newerpage2.php, newerpage2.php > andevennewerversion.php is sucky

However, the answer is no. There is no easy way. It all hurts.
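
The difference between the two lists in this post can at least be checked mechanically. Here is a minimal Python sketch, assuming every rule is a single line of the form `Redirect 301 /old /new`, that sources and targets are written the same way (both as paths on the same site), and that matching is exact rather than the prefix matching Apache actually does. `parse_rules` and `find_chained` are hypothetical names made up for this illustration, not part of any tool mentioned in the thread.

```python
import re

# Matches lines such as: Redirect 301 /pagea.html /newpage1.php
# ("Redirect permanent" is accepted too); every other line is ignored.
RULE = re.compile(r"^\s*Redirect\s+(?:301|permanent)\s+(\S+)\s+(\S+)\s*$", re.IGNORECASE)


def parse_rules(text):
    """Return a list of (source, target) pairs found in htaccess text."""
    rules = []
    for line in text.splitlines():
        match = RULE.match(line)
        if match:
            rules.append((match.group(1), match.group(2)))
    return rules


def find_chained(rules):
    """Return the rules whose target is itself the source of another rule."""
    sources = {src for src, _ in rules}
    return [(src, dst) for src, dst in rules if dst in sources]


if __name__ == "__main__":
    sample = "\n".join([
        "Redirect 301 /pagea.html /newpage1.php",
        "Redirect 301 /newpage1.php /newerpage2.php",
        "Redirect 301 /pageb.html /newpage2.php",
    ])
    # Only the first rule is reported, because its target is redirected again.
    print(find_chained(parse_rules(sample)))
```

An empty result would mean the file only contains one-hop redirects of the "ok" kind. If the targets in the real file are full URLs while the sources are paths, they would need to be normalised to the same form first.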

Jon
Joined: 27/08/2010

Actually, while eating my dinner I came up with a solution! Let's hope I can pull it off .....

I will transfer the redirects into a spreadsheet, then take the resultant column and paste it into a WordPress page. Then run Broken Link Checker on that page and fix all redirects. Then copy that back to the spreadsheet, then copy all 3 columns back to htaccess.

Simple ......

Jon
Joined: 27/08/2010

In fact, I may not even have to go via a spreadsheet as Broken Link Checker can also check and mend URLs written in text format. Will test with a few lines to check it does not try to do anything daft with the first relative URL. If I just paste into HTML then this should work .....

Jon
Joined: 27/08/2010

Yes, that worked. Simples.

In Broken Link Checker I set it to only look in Privately published pages and only to look at Plain Text URLs (although really I could have just left it at Private only). Then privately published a page, ran the checker, fixed all URLs, pasted the resultant page back to htaccess. Job done.
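
For comparison, the same "point every old URL at its final destination" step can be sketched without a CMS, using only what is written in the file. This is a rough Python sketch, not the method used above: it assumes one `Redirect 301 /old /new` rule per line, sources and targets written in the same form, and exact matching only. It also cannot spot targets that are dead or redirected by something outside the file, which a live link check would catch. `flatten_redirects` is a hypothetical name.

```python
import re

# Groups: (directive and status)(source)(whitespace)(target)
RULE = re.compile(r"^(\s*Redirect\s+(?:301|permanent)\s+)(\S+)(\s+)(\S+)\s*$", re.IGNORECASE)


def flatten_redirects(text):
    """Rewrite each redirect line so its target is the end of its chain."""
    lines = text.splitlines()

    # Map each source to its target; keep the first rule seen for a source.
    mapping = {}
    for line in lines:
        match = RULE.match(line)
        if match:
            mapping.setdefault(match.group(2), match.group(4))

    def resolve(src, dst):
        """Follow dst through the mapping until it is no longer redirected."""
        first = dst
        seen = {src}
        while dst in mapping:
            if dst in seen:
                return first  # loop detected: leave this rule as it was
            seen.add(dst)
            dst = mapping[dst]
        return dst

    out = []
    for line in lines:
        match = RULE.match(line)
        if match:
            prefix, src, gap, dst = match.groups()
            out.append(f"{prefix}{src}{gap}{resolve(src, dst)}")
        else:
            out.append(line)  # comments and other directives pass through
    return "\n".join(out)


if __name__ == "__main__":
    sample = "\n".join([
        "Redirect 301 /pagea.html /newpage1.php",
        "Redirect 301 /newpage1.php /newerpage2.php",
        "Redirect 301 /newerpage2.php /andevennewerversion.php",
    ])
    print(flatten_redirects(sample))
```

Running it on a copy of the file and diffing the result against the original before uploading would be the cautious way to use something like this.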

"However, the answer is no. There is no easy way. It all hurts."

Sometimes developers just don't think outside the box!

Joined: 19/08/2010

That was the painful way. I think your list was just shorter than you let on.

Jon
Joined: 27/08/2010

OK, maybe it is! Still working on it, doing it in lumps (is that a techy term?). Got a fair few lines to process ....

Jon
Joined: 27/08/2010

This is SO PAINFUL NOW!

You were right.

This is taking forever! Aarrrghghghhghghh!

Joined: 19/08/2010

:p

Jon
Joined: 27/08/2010

Another job done. Let's hope that it was worth the effort!