How can it be? Redirects, It's a RUM do Part Two...

Written by @marcemarc on Wednesday, 31 January 2018

So we've inherited a new client, they have two sites, it is a travel company, and as is so often the case, one site is an obvious clone of the first site, and you know somebody has just said we just want that again but with a different logo... and they are so close... twins almost, and although people still talk about the 'main' site... like it was a favourite child - well you still wouldn't expect them to have any major differences...

Redirects are important for this company. SEO is everything. The 301 redirect is king (or queen).

"we want to be able to have the same 'redirects thing' on this one, as we have on the main site"

"Sure.. no problem",  I figure I just install whatever package is used on the 'main site' on the twin site, it all should just work...

".. and actually the tracking thing's not setup properly on the main site, it works on the second site though, you know automatically creating a redirect if we rename a resort.."

"ahh...", alarm bells begin to ring...

So the 'main site' has the Simple 301 redirect package installed; the 'second site' doesn't.

...On the one hand, I can see how the lack of symmetry might really annoy some people...

So the 'second site', without the Simple 301 package, relies on the Core Redirect Url Management dashboard for its 301 Redirecting powers, and the editors have fathomed the 'rename-rename-remove' method from 'part 1' to create redirects.

(They have no connection though with the other company, how do editors pass these secrets across organisations?... it's like whale song or something).

If I install, as requested, the Simple 301 package on the 'second site' it will turn off the Redirect Url Management tracking functionality and any redirects already created with that mechanism will fail...

... I can make the core Redirect Url Management tracking happen on the 'main site' but only by uninstalling the Simple 301 package, and losing all the redirects that have been already created with that package...

... what? ...

I said it was a RUM do... Part Two...

Why don't existing 301 and Tracking packages play nicely with the RUM dashboard?

When we squeezed the Redirect Url Management tracking functionality and barebones dashboard into the release of 7.5 at that retreat I mentioned in Part 1, we had a horrible, 'end of day' thought... What if?

What if somebody already has installed one of the many existing 301 redirect management packages, and they upgrade to Umbraco 7.5 and their site suddenly has two redirect mechanisms... what will happen?

Well, if one contains a redirect to a nice url of /about-us from a not-so-nice url /aboutus, and the other mechanism contains a redirect from the nicer url /about-us back to the not-so-nice url /aboutus ...well...

FIGHT!

...a redirect loop will create a continuous game of redirect tennis™ between the two mechanisms until well, I guess, well the server melts... creates a mini black hole, unicorns die etc... the internet is broken.
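To see the tennis match in miniature, here is a rough sketch in plain C#. It is only a toy model: the RedirectTennis class and its Follow method are made up for illustration, the two dictionaries stand in for the two redirect mechanisms, and it ignores published content entirely.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy model, not Umbraco code.
public static class RedirectTennis
{
    // Follows redirects across two independent lookup tables, the way a
    // browser follows successive 301 responses. Returns the number of hops
    // taken, or -1 if the same url comes round again (a redirect loop).
    public static int Follow(
        string url,
        IDictionary<string, string> mechanismA,
        IDictionary<string, string> mechanismB)
    {
        var seen = new HashSet<string>();
        var hops = 0;

        // HashSet.Add returns false when the url has been visited before.
        while (seen.Add(url))
        {
            string target;
            if (mechanismA.TryGetValue(url, out target) ||
                mechanismB.TryGetValue(url, out target))
            {
                url = target;
                hops++;
            }
            else
            {
                // Neither mechanism redirects this url: the request settles.
                return hops;
            }
        }

        return -1;
    }
}
```

Give one table /aboutus -> /about-us and the other /about-us -> /aboutus, ask for /aboutus, and Follow returns -1. Empty the second table and the same request settles after a single hop.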

How, How can this be?

It all depends on the incoming request pipeline of Umbraco and where the different redirect mechanisms choose to look up and try to perform the redirect.

The Umbraco request pipeline is a series of IContentFinders... individual rules to find Umbraco content based upon the incoming url ... executed one after another in a queue, until content is found.

[DIAGRAM HERE]

A butterfly representing a content request flies through the Umbraco ContentFinder incoming request pipeline; each ContentFinder, represented by a butterfly net, tries to find the content associated with the request

(and you can add your own custom rules too, by implementing an IContentFinder, and using the ContentFinderResolver to insert your custom content finder rule into the pipeline)
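To make the queue of butterfly nets a bit more concrete, here is a minimal sketch in plain C#. It deliberately does not use Umbraco's real types: the Finder delegate and the PipelineSketch class are invented stand-ins for IContentFinder and the pipeline, and all they show is the 'first finder to find content wins' rule.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an IContentFinder: returns the name of the
// content it found for the url, or null if it found nothing.
public delegate string Finder(string url);

public static class PipelineSketch
{
    // Runs each finder in order and stops at the first one that finds content.
    public static string Resolve(string url, IEnumerable<Finder> finders)
    {
        foreach (var finder in finders)
        {
            var content = finder(url);
            if (content != null)
            {
                return content;
            }
        }

        // Nothing matched: on a real site this is where a 404 would happen.
        return null;
    }
}
```

The order of the list is the whole story: whichever finder sits earlier gets first go at the request, and later finders only see what the earlier ones could not catch.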

...and what we didn't know, at the very last minute, is quite where in the request pipeline each individual plugin/package/redirect mechanism was plumbed, in relation to our new ContentFinderByRedirectUrl implementation...

so very pragmatically this code was added...

// if any of these dlls are loaded we don't want to run our finder
var dlls = new[]
{
    "InfoCaster.Umbraco.UrlTracker",
    "SEOChecker",
    "Simple301",
    "Terabyte.Umbraco.Modules.PermanentRedirect",
    "CMUmbracoTools",
    "PWUrlRedirect"
};

// assuming all assemblies have been loaded already
// check if any of them matches one of the above dlls
var found = AppDomain.CurrentDomain.GetAssemblies()
    .Select(x => x.FullName.Split(',')[0])
    .Any(x => dlls.Contains(x));
if (found)
{
    ContentFinderResolver.Current.RemoveType<ContentFinderByRedirectUrl>();
}

Essentially, if you have one of the existing redirect plugin/package dlls in your site, the core Redirect Url Management dashboard does not want to play, it is turned off, it goes home in a sulk... and removes itself from the content finding pipeline...

... on the upside, nobody's servers melted after upgrading to 7.5! - but quite a few people were puzzled about this new functionality that plainly did not work... which was a shame, especially for those whose motivation for upgrading to 7.5 was the thought of 'ditching their reliance on a 3rd party tracking package and taking advantage of this new core thing' ...

Dilemma

So now you understand my dilemma for the twin travel sites? (two sites remember! -  not a single site dedicated to travel destinations for twins)...

... and why installing Simple 301 or any other package to service their SEO-fuelled 301 redirect cravings will break the core tracking functionality and vice versa...

So I figured I needed to do some investigation... just where in the request pipeline does each of these packages kick in?...was there really a problem or just a last minute 'what if'?

Why don't you just use UrlTracker, mate? It does tracking AND redirects...

If you do have UrlTracker installed, please make sure you either have 404 tracking turned off, or ignore common 404 requests in the app settings, or have a maintenance plan to remove entries if you are not going to redirect them...  or... this will happen:

[IMAGE HERE]

Basically, the 404s are tracked in the same db table as the 301s. Each tracked 404 is a new row, and if you are not careful these rows build up over time and cripple the speed of the dashboard. (UrlTracker appears not to be actively maintained but secretly it is! - You can find it here: https://www.nuget.org/packages/UrlTracker/ there are later versions on NuGet than on Our.Umbraco)

It has the ability to 'force redirects', which then occur before the Umbraco pipeline has its turn, and 'un-forced' redirects, which occur after the Umbraco pipeline has executed, and it has regex pattern matching too - but it wouldn't run nicely alongside the Redirect Url Management dashboard... we'd have Url changes tracked in two locations, and we could have the potential 'redirect loop meltdown problem' we're so worried about... it doesn't currently use an IContentFinder to hook into the Umbraco request pipeline but runs instead using an HttpModule on every request...

... if you have UrlTracker installed then it makes sense to turn off the core Redirect Url Management functionality but...

... but that's part of the core now, and the nub of this is people want to use the 'core thing', regardless of whether it fulfills their needs, simply because it is in the 'core' ... There is encouragement that this will always be maintained, work with Umbraco Deploy, Umbraco Cloud etc, and there is the expectation it will evolve... but yes, stop reading, just use UrlTracker, I'm not trying to talk you out of it!

What about Simple 301? 

It's simple - it doesn't try to do any automatic tracking of changes... and it's built in AngularJS, which makes its UI fit with the paradigms of Umbraco 7 neatly, although the colours are wrong... also it's using an IContentFinder to register itself in the pipeline, looking good! ... but oh wait... the IContentFinder is registered at the beginning of the pipeline, so on every content request, redirects are being looked for, albeit the database request is being cached... but...

... I don't understand why this IContentFinder is being registered first?

For me, you shouldn't be using a redirect system to override an existing Umbraco Url; the redirects should occur after attempts to find genuine published content have been made... and having this IContentFinder for Simple 301 execute first, and importantly before the RedirectUrlManagement IContentFinder, is unfortunately just the set of circumstances that would enable a set of redirect tennis™ to be played between the two...

...so I tried it...

[IMAGE HERE]

Not quite the end of the internet, no black holes, unicorns still exist, but pretty annoying for anyone visiting the URL and having the 301 loop cached in their browser... so I guess this is a no-go too. Am I going to have to write my own? [Insert link to part 3]

...and then I thought, but hang on, I've got access to the source of this, this is open source...

... so as an experiment I've forked Simple 301, and changed where the IContentFinder is registered, and now it's plumbed in after the ContentFinderByRedirectUrl finder...

... now the redirect loop meltdown cannot occur...

Why?

If we think about the /aboutus -> /about-us -> /aboutus example: Simple 301 can still hold a redirect from /aboutus to /about-us, and there can still be a redirect in the Redirect Url Management dashboard from /about-us back to /aboutus, but this can ONLY be to a published content item. Therefore the content-finding content finders, which sit ahead of the redirect finders in the content request pipeline, will find that published item first, and no loop can occur. If the item being redirected to is subsequently un-published, the redirect is removed from the Redirect Url Management dashboard, so again no loop!
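Here is that reasoning as a small sketch in plain C#, so you can swap the order around and watch what happens. It is a simplified model under my own assumptions, not the fork's code: the Step enum, the OrderingSketch class and the "loop" / "404" return values are all invented for illustration, and the real pipeline has more finders than these three.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of three kinds of finder in the pipeline.
public enum Step { PublishedContent, Simple301, CoreRedirects }

public static class OrderingSketch
{
    // Returns the url that finally serves content, "404" if nothing
    // matches, or "loop" if the redirects come back to a url already seen.
    public static string Request(
        string url,
        IList<Step> pipeline,
        ISet<string> published,
        IDictionary<string, string> simple301,
        IDictionary<string, string> coreRedirects)
    {
        var seen = new HashSet<string>();

        while (seen.Add(url))
        {
            string next = null;

            // Walk the pipeline in order; the first step that matches wins.
            foreach (var step in pipeline)
            {
                if (step == Step.PublishedContent && published.Contains(url))
                {
                    return url;
                }
                if (step == Step.Simple301 && simple301.TryGetValue(url, out next))
                {
                    break;
                }
                if (step == Step.CoreRedirects && coreRedirects.TryGetValue(url, out next))
                {
                    break;
                }
            }

            if (next == null)
            {
                return "404";
            }

            url = next; // follow the redirect and start the pipeline again
        }

        return "loop";
    }
}
```

With /aboutus published, a Simple 301 entry /aboutus -> /about-us and a core entry /about-us -> /aboutus, putting Simple301 first in the list gives "loop" for a request to /aboutus. Putting PublishedContent first, then CoreRedirects, then Simple301 (the order the fork aims for) gives "/aboutus" for requests to either url.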

But why isn't the core tracking turned off by the presence of the dll?

... umm, well I changed the dll name in the fork... 
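Why does a rename get past the check? The core code shown earlier compares the short assembly name exactly against its list, so anything that isn't a character-for-character match is ignored. Here is a self-contained sketch of the same comparison working on plain strings. The RedirectPackageCheck class is made up for illustration, and "Simple301.RumEdition" is a hypothetical name, not necessarily what the fork's dll is actually called.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper that mirrors the core check, but on plain strings.
public static class RedirectPackageCheck
{
    // The dll names listed in the core code quoted earlier.
    private static readonly string[] KnownDlls =
    {
        "InfoCaster.Umbraco.UrlTracker",
        "SEOChecker",
        "Simple301",
        "Terabyte.Umbraco.Modules.PermanentRedirect",
        "CMUmbracoTools",
        "PWUrlRedirect"
    };

    // Takes assembly full names such as
    // "Simple301, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null"
    // and reports whether any short name exactly matches the list.
    public static bool AnyKnownPackage(IEnumerable<string> assemblyFullNames)
    {
        return assemblyFullNames
            .Select(x => x.Split(',')[0])
            .Any(x => KnownDlls.Contains(x));
    }
}
```

Feed it a full name starting "Simple301, Version=..." and it says true; feed it one starting "Simple301.RumEdition, Version=..." and it says false, so the core finder stays in the pipeline.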

So the world of 'pragmatism' cheers and drowns out the orbiting satellites of 'best practice' and 'common sense' and I have the RUM edition of Simple 301 running on the twin travel sites, editors can add custom SEO driven 301 redirects to their hearts' content, and any re-names of published content are automatically tracked and 301 redirected to the new name by the core Redirect Url Management,

everything is more symmetrical... the twins are a little more identical...

... am I meant to say 'for the win' here? I've seen other people say that kind of thing, so try to imagine I have seamlessly done so too.

You can find the fork and RUM edition here.

So Part 2

So Part 2, not what you expected? It is hard in a trilogy for the middle episode to keep the plot of the overarching story arc moving and yet still manage to work as a standalone ep.

What I'm trying to say here in all this is: should this kind of redirection always be outside of the core? And if so, can we find a way to adopt one of the packages to work in harmony with the core tracking mechanism? We could then extend and improve the package if necessary, and just avoid the confusion of it not being clear why everything stopped working on one thing, when we installed another thing, because of its name.

Do we even need the additional level of redirection? Would the additions in Part 1, being able to add to the Redirect Url Management dashboard manually, cope with most redirection scenarios that editors need to manage via the CMS?

Where should our 'sorting out redirection in Umbraco' development energies lie? Core or Package or...

...is there a third way? There is a Part 3...[add link to part 3]

NB: thanks to Wade Kallhoff for creating the Simple 301 package, and apologies for experimenting with it like this.