What is Reverse Proxying?
Most modern websites are dynamic in nature. When you ask for /widgets/blue you are in fact getting content from a dynamic page such as page.php?type=widgets&colour=blue. The server contains rules (.htaccess, web.config, httpd.conf etc.) which basically say “they have asked for X, to get X we need to do Y and send the result back to them as X”.
These rules can do a massive range of things but most commonly they redirect users or rewrite requests. A rewrite is what happened above. The server doesn’t actually fetch X, it simply sends the result of Y back as if it were X. The user is none the wiser. A redirect is more along the lines of “they have asked for X, tell them that they actually want Y, they have now asked for Y, cool, send them Y”.
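To make the difference concrete, here is a small sketch in Apache mod_rewrite syntax. The first rule is a rewrite along the lines of the /widgets/blue example; the second is a redirect. The /old-widgets/ path is made up purely for illustration, and the patterns assume colour names are plain lowercase letters.

```apache
RewriteEngine on

# Rewrite: the server quietly runs page.php, the browser still shows /widgets/blue.
RewriteRule ^/?widgets/([a-z]+)/?$ /page.php?type=widgets&colour=$1 [L]

# Redirect: the browser is told to go and ask for a different URL instead.
# (/old-widgets/ is a hypothetical legacy path.)
RewriteRule ^/?old-widgets/([a-z]+)/?$ /widgets/$1 [R=301,L]
```

With the rewrite the address bar never changes. With the redirect the browser makes a second request and the address bar updates to the new URL.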
Anyhow, the important takeaway is that a rewrite masks what is actually happening and the URL the user has requested remains intact. Reverse proxying takes this one step further and in effect says “they have asked for X, to get X we need to do Y and Y is hosted on the domain Z rather than this webserver so we need to go there to get it before we send the result back to them as X”.
It’s as easy as that :)
What do I need to get Reverse Proxying working?
Reverse proxying can be pretty complicated. I’m going to avoid the complicated approaches and focus on the easier ones to get you used to the concepts etc. which once mastered should make more complicated set-ups easier to handle.
Let us presume you have a WordPress blog installed on a sub-domain (sub.yourwebsite.com) which sits on an Apache box. You also have your main domain (www.yourwebsite.com) hosted on a separate webserver (IIS or Apache).
Finally you have admin access to your WordPress installation and your IIS / Apache box hosting your main domain has the ability to use .htaccess (Apache) or something like ISAPI_Rewrite 3 (IIS).
So far that is all I have ever needed to get Reverse Proxying working using the following steps.
Getting Reverse Proxying working
The first thing you need to do is set up the rule to tell your main domain that any request for a blog page (X) needs to be processed using WordPress (Y) on the sub-domain (Z).
This is simply a case of adding the following:

#Statement to initialise the ability to use Mod_Rewrite. May already exist. If so ignore.
RewriteEngine on
#Rule to proxy all requests for blog URLs to the blog sub-domain. [P] is the important bit.
#The target must be a full URL (scheme included) for the proxy to work.
RewriteRule ^/?blog/(.*)$ http://sub.yourwebsite.com/$1 [NC,P]

The above rule basically says “if someone requests a URL starting with /blog/, send that request to sub.yourwebsite.com and get the result from there”.
The [NC] means ignore case and the [P] means proxy. With this set up you should now see your WordPress blog appearing when you browse to “www.yourwebsite.com/blog/”.
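One thing worth knowing if the rule appears to do nothing on Apache: the [P] flag hands the request over to mod_proxy, so mod_proxy and mod_proxy_http need to be loaded alongside mod_rewrite. Below is a rough sketch of how that might look in the main server config or virtual host for the main domain (not .htaccess). The module paths are assumptions and vary between installs, and the ProxyPassReverse line is an optional extra rather than part of the steps above.

```apache
# Assumed module paths; adjust to your install. Skip any that are already loaded.
LoadModule rewrite_module modules/mod_rewrite.so
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so

RewriteEngine on
# Proxy /blog/... to the blog sub-domain, mirroring the rule above.
RewriteRule ^/?blog/(.*)$ http://sub.yourwebsite.com/$1 [NC,P]

# Optional: rewrite Location headers in redirects issued by the sub-domain
# so they point back at /blog/ on the main domain.
ProxyPassReverse /blog/ http://sub.yourwebsite.com/
```

If you only have .htaccess access, the LoadModule and ProxyPassReverse lines are something your host would need to set for you.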
That can’t be it, surely?
Well technically yes, that’s it. You will however notice that once you are within /blog/ all of the internal links to other blog pages (categories etc.) use sub.yourwebsite.com rather than /blog/. This is not good!
To fix this you need to set the Site Address (URL) in your WordPress Admin Panel (under Settings, General) to “http://www.yourwebsite.com/blog”.
This will update all of your permalinks.
Duplicate content issue
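There may still be an outstanding issue with your set-up. In effect your blog is now accessible on both “sub.yourwebsite.com” and “www.yourwebsite.com/blog/”. You can’t 301 the sub-domain to /blog/ as that will put you in an infinite loop: every request for /blog/ is proxied to the sub-domain, which would answer with a redirect straight back to /blog/.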
You can however take certain steps to migrate any power to /blog/ while ensuring the sub-domain remains un-indexed (or is removed from the index if already present).
Firstly I would recommend installing the WordPress SEO plug-in by Yoast, details of which can be found here - “yoast.com/articles/wordpress-seo/”. Other SEO plug-ins are available which do similar things.
Anyway, this plug-in does loads of cool stuff including adding rel=canonical tags to your pages. With this installed plus the fact you are using permalinks (you really should be) you will automatically have a canonical tag present on each page pointing to the /blog/ version.
The next thing you can do is add a robots.txt file to the sub-domain that stops robots from crawling it. As Reverse Proxying keeps the requested URL, the /blog/ URLs will use the robots.txt from the main domain rather than the one on the sub-domain.
The final (and most extreme) thing you can do is to register the sub-domain in Google Webmaster Tools and request its removal from the index. If you are doing this, you need to do it in conjunction with robots.txt.
There you go, you now (hopefully) have a fully working WordPress blog running as a sub-folder on your domain. The above works for me and I really hope it works for you too. I have listed a few other alternatives below in case it doesn’t.
You can drop WordPress on to an IIS box if you can get PHP etc. working on it. More on this subject can be found here - http://codex.wordpress.org/Installing_on_Microsoft_IIS. I don’t like doing this as it makes me feel dirty so I will leave it there.
You can also install a different blog platform that is native to the architecture of the main domain, such as BlogEngine.NET, which can be found here - http://www.dotnetblogengine.net/. You will however need to migrate your old posts etc. if you already have a blog running elsewhere.
I’m sure there are more and if you know of any please add them as a comment. I’m not the fount of all knowledge and I do love learning new techie stuff. I really hope you have found this post useful and welcome any feedback.
Good luck and Godspeed.