Couldn't find an answer to this and thought it might be a quick answer.
My company, a local news site, is working on migrating to WordPress from a proprietary CMS. Part of the challenge is we are restructuring URLs. I will be utilizing 301 redirects but my issue is as follows:
Example Page name: Story Name: is "this"
Example Old CMS Page URL: /story-name--is--this-/
New CMS Page URL: /news/2012/09/12/story-name-is-this/
The old CMS turned special characters and spaces into hyphens. WordPress will be configured to instead ignore special characters and simply turn spaces into hyphens. Additionally, the old CMS did not include the date in the URL, and I'm not sure the best route to take regarding adding the date.
Thanks!
You're either going to have to write a script that takes each old link, looks it up in your database to find the new link, and redirects the browser there, or you'll have to enumerate the entire mapping of old links -> new links and create a 301 redirect for each of them (in either your vhost/server config or in an htaccess file):
Redirect 301 /story-name--is--this-/ /news/2012/09/12/story-name-is-this/
It's not clear what your real question is. I am also not sure what regular expressions have to do with the problem.
There is no information about what your old CMS is capable of. Assuming you can intercept requests for old articles before they are rendered, you can send a redirect back to the browser instead, generating the new URL dynamically with whatever programming mechanisms your proprietary CMS provides.
Again, assuming you have access to Java:
A. When generating the redirect URL, you can read the article's date and build the 2012/09/12 part from it. SimpleDateFormat can format a Date into a string using a pattern such as yyyy/MM/dd.
B. You can use a similar approach with the titles and remove the listed special characters from the title string. For example, StringUtils from Apache Commons Lang lets you specify a set of characters to look for, and any that are found are replaced with the target character (or removed).
C. You concatenate the output of A and B to create the target redirect URL and send it back to the browser instead of the article itself.
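Steps A to C can be sketched roughly as follows. This is a minimal Python sketch rather than Java, and the slug rules are inferred only from the single example URL in the question; old_slug, new_slug, and redirect_line are hypothetical helper names, and lowercasing the result is an assumption.

```python
import re
from datetime import date

def old_slug(title: str) -> str:
    # Assumed old-CMS rule: every non-alphanumeric character becomes a hyphen.
    return re.sub(r"[^A-Za-z0-9]", "-", title).lower()

def new_slug(title: str) -> str:
    # Assumed new rule: drop special characters, then turn spaces into hyphens.
    kept = re.sub(r"[^A-Za-z0-9\s-]", "", title)
    return re.sub(r"\s+", "-", kept.strip()).lower()

def redirect_line(title: str, published: date) -> str:
    # Step A: date part. Step B: cleaned title. Step C: concatenate them.
    date_part = published.strftime("%Y/%m/%d")
    return f"Redirect 301 /{old_slug(title)}/ /news/{date_part}/{new_slug(title)}/"

print(redirect_line('Story Name: is "this"', date(2012, 9, 12)))
```

Run over every article in the old database, something like this could generate the full list of Redirect lines, provided the real slug rules match the assumptions above.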
I'm trying to migrate my ASPX site to Kentico, and as part of my task I'm migrating URLs. I need to preserve my URL structure, so I need to keep URLs which look like : "foo.com/bar.aspx?pageid=1".
I checked the page's "URLs" property and tried wildcards and patterns like /bar/{pageid}, /bar/{?pageid?}, etc., but Kentico always replaces the question marks.
Is there a way to achieve that via the admin interface?
You don't need to do anything in order to use "foo.com/bar.aspx?pageid=1" url.
Create a page under the root and call it bar, so you'll get a page at foo.com/bar.aspx. Kentico and/or .NET does not care what you add to a URL after the question mark, so foo.com/bar.aspx?pageid=1 will work as well as foo.com/bar.aspx?someparam=sdf, or foo.com/bar.aspx?id=1&p=3&t=3.
You may (or may not) implement some functionality based on the query string (e.g. paging), in which case your code parses the query string and acts on it.
By default Kentico UI does not handle adding URL aliases with URL parameters like you show. There is an article on the DevNet for a URL Redirection module which has code you can import into your site to allow you to perform these redirects within the Kentico UI. I'd suggest using this approach.
Unfortunately, I can't share a code sample since it's an article but it also has a link to download the code too. This appears to only be coded for Kentico 8.2 right now but I'm guessing you could do some work to make it work for other versions if you needed.
I think you are mixing up a few concepts here. I will start with the patterns from your question.
/bar/{pageid} - {pageid} is a positional parameter in Kentico's terms, used when you choose dynamic URLs based on patterns. So if you have code that relies on the pageid parameter to fetch some data, Kentico will pass that value. E.g. in the case of /bar/420, it will pass pageid as 420 to the different web parts on your template.
/bar/{?pageid?} - This looks for the query string parameter "pageid" in the request URL and substitutes its value here. So if you passed foo.com/bar.aspx?pageid=366, the resulting URL will be /bar/366.
The first is a positional parameter, and the second is the way Kentico resolves query string macros.
I hope this clarifies.
I am facing some problem or maybe I am confused. I've followed the following link to generate dynamic Site Map:
MVCSiteMapProvider Dynamic Sitemap
I am confused by node.RouteValues.Add("id", album.AlbumId); in the class given in the above link. My website links are not in the form used in their example (their URLs are like mysite.com/controller?id=some id),
whereas my URLs are in the following format:
mysite.com/mycontroller/querystring1/querystring2/querystring3
How can I specify this kind of URL in node.RouteValues.Add(..., ...); so that it resolves to mysite.com/controller/querystring1/querystring2/querystring3?
Thanks
The format of the URLs makes no difference. Either way, they are transformed into a collection of route values, which is what MvcSiteMapProvider uses under the hood.
Therefore, the same line node.RouteValues.Add("id", album.AlbumId); can be used in either scenario to make the node match. The actual processing of the URL into route values is done by .NET routing, not MvcSiteMapProvider.
I updated my website CMS and the URL formats have changed. Where previously I had the URL /blog.aspx?Year=XXXX&Month=YY I now have /blog/XXXX/YY
Can someone help me create a regex for this?
Two additional notes:
it also has to support just the year (/blog.aspx?Year=XXXX)
the old Month urls use only 1 digit for single digit months (/blog.aspx?Year=2009&Month=2 instead of Month=02)
Here is what I came up with:
/blog.aspx[?]Year=([0-9]{4})([&]?)(Month=)?([0-9]*)
I can't seem to get it to work, as I still get a 404 on the page when I go to one of the above URLs.
Is this workable?
/blog.aspx\?Year=([0-9]{4})(?>\&?Month=?([0-9]{1,2})|)
works with these input
/blog.aspx?Year=1983&Month=2
/blog.aspx?Year=1983
/blog.aspx?Year=1983&Month=12
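It uses the (?>blabla|moomoo) syntax, an atomic group containing an alternation.
If it can't match blabla, it will try moomoo. Here the second alternative is empty, which is what makes the month optional.
Though I suspect the regex is not the root problem here. What CMS handles the redirect?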
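To check the mapping outside the CMS, here is a minimal Python sketch. It uses a plain optional non-capturing group instead of the atomic group, escapes the dot, and assumes the new URLs want a zero-padded two-digit month (the question's /blog/XXXX/YY suggests this but does not state it). new_blog_url is a hypothetical helper name.

```python
import re
from typing import Optional

# Year is required; "&Month=N" (one or two digits) is optional.
OLD_BLOG_URL = re.compile(r"^/blog\.aspx\?Year=([0-9]{4})(?:&Month=([0-9]{1,2}))?$")

def new_blog_url(old_url: str) -> Optional[str]:
    match = OLD_BLOG_URL.match(old_url)
    if match is None:
        return None  # not an old-style blog URL
    year, month = match.groups()
    if month is None:
        return f"/blog/{year}"
    # Assumption: pad single-digit months so Month=2 becomes 02.
    return f"/blog/{year}/{int(month):02d}"

for url in ("/blog.aspx?Year=1983&Month=2",
            "/blog.aspx?Year=1983",
            "/blog.aspx?Year=1983&Month=12"):
    print(url, "->", new_blog_url(url))
```

If the pattern behaves here but the site still returns a 404, the problem is more likely in how the CMS or server applies the rule than in the regex itself.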
I'm using a number of WordPress rewrite rules to allow for the injection of country-codes immediately at the beginning of the URL path, which are used to determine a timezone offset. An example:
add_rewrite_rule('^([A-Za-z]{2})/days/([0-9]+)/?$', 'index.php?geo=$matches[1]&m=$matches[2]&post_type=days','top');
This takes a request like www.daysoftheyear.com/days/2011/ (which would usually return all valid content for this request) and allows for, e.g., www.daysoftheyear.com/us/days/2011/ to return the same content but with support for a timezone offset based on the country-code.
This works fine in almost all places, with the exception of a single query type - one for 'days' custom post type pages, e.g., http://www.daysoftheyear.com/days/waffle-day/.
The rules I have in place are:
add_rewrite_rule('^([A-Za-z]{2})/?$', 'index.php?geo=$matches[1]','top');
add_rewrite_rule('^([A-Za-z]{2})/days/([0-9]+)/?$', 'index.php?geo=$matches[1]&m=$matches[2]&post_type=days','top');
add_rewrite_rule('^([A-Za-z]{2})/days/([0-9]+)/([0-9]+)/?$', 'index.php?geo=$matches[1]&m=$matches[2]$matches[3]&post_type=days','top');
add_rewrite_rule('^([A-Za-z]{2})/days/([0-9]+)/([0-9]+)/([0-9]+)/?$', 'index.php?geo=$matches[1]&m=$matches[2]$matches[3]$matches[4]&post_type=days','top');
add_rewrite_rule('^([A-Za-z]{2})/days/([A-Za-z\-].*)/?$', 'index.php?geo=$matches[1]&page=$matches[2]','top');
add_rewrite_rule('^([A-Za-z]{2})/([A-Za-z\-].*)/?$', 'index.php?geo=$matches[1]&pagename=$matches[2]','top');
The fifth rule should match http://www.daysoftheyear.com/gb/days/waffle-day/ in much the same way as above, but it redirects. I suspect that it's conflicting with the inbuilt rules which attempt to redirect to a correct URL if it's malformed (e.g., if I type a close structural match to a correct URL, it'll redirect me to the correct resource).
I can confirm that the 'raw' URL for this request works - e.g., http://www.daysoftheyear.com/index.php?geo=en&name=soup-month&post_type=days returns a valid and expected result.
I'm not convinced this is a regex issue, rather than a specific challenge with the way WP manages custom post types.
EDIT
Updated to allow for hyphens - no change in behaviour, though regexpal reports that the regex works against the example URL.
Updated after disabling WP canonical redirects functionality - now 404'ing rather than 301'ing to the page.
Updated to use 'page' rather than 'pagename', based on the information here: http://codex.wordpress.org/Class_Reference/WP_Query#Post_.26_Page_Parameters - no change in behaviour.
Updated the code, added a linebreak and clarified that I'm actually referencing line 5, rather than line 4.
This request http://www.daysoftheyear.com/days/waffle-day/ won't match your fourth rule since you didn't allow - inside the capture group: ([A-Za-z].*). Replace this group with ([A-Za-z\-].*) and it should match.
HTH
Resolved; it appears that the above ruleset now works correctly - thanks all!
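For anyone debugging a similar ruleset, the patterns can be checked outside WordPress. The following is a rough Python sketch, assuming the rules are tried in order against the request path without its leading slash and that the first match wins; first_match is a hypothetical helper, and Python's re module stands in for PHP's regex engine, which behaves the same for patterns this simple.

```python
import re

# The six patterns from the question, in the order they were registered.
RULES = [
    r"^([A-Za-z]{2})/?$",
    r"^([A-Za-z]{2})/days/([0-9]+)/?$",
    r"^([A-Za-z]{2})/days/([0-9]+)/([0-9]+)/?$",
    r"^([A-Za-z]{2})/days/([0-9]+)/([0-9]+)/([0-9]+)/?$",
    r"^([A-Za-z]{2})/days/([A-Za-z\-].*)/?$",
    r"^([A-Za-z]{2})/([A-Za-z\-].*)/?$",
]

def first_match(path):
    """Return (rule number, captured groups) for the first matching rule."""
    for number, pattern in enumerate(RULES, start=1):
        match = re.match(pattern, path)
        if match:
            return number, match.groups()
    return None

print(first_match("us/days/2011/"))
print(first_match("gb/days/waffle-day/"))
```

Under these assumptions the fifth rule does match gb/days/waffle-day/, but the greedy .* also swallows the trailing slash, so the second capture is "waffle-day/" rather than "waffle-day". A tighter group such as ([^/]+) would avoid that, if it matters for the query.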
For example:
http://stackoverflow.com/questions/698627/ms-access-properties
The number is part of the URL but is an argument to the web app as opposed to other options like:
http://www.google.com/firefox?client=firefox-a&rls=org.mozilla:en-US:official
where all the args come after the '?'. I have used the second form before and I'm only trying to learn about the first form.
I'm sure I can find what else I need once I know what that's called so I can Google it.
URL Rewriting, generally.
Edit: Here is a good introduction to URL Rewriting.
Variables passed in the form of a URL are called the Query String. In a url like:
http://examples.com?a=b&c=d&e=f
The query string is ?a=b&c=d&e=f
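As a small illustration, Python's standard library can pull the query string apart. Note that urlsplit reports the query without the leading '?'.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://examples.com?a=b&c=d&e=f"
query = urlsplit(url).query   # the part after '?', without the '?' itself
params = parse_qs(query)      # each name maps to a list of values

print(query)
print(params)
```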
In the Stackoverflow example, it uses URL Rewriting, specifically with MVC Routing to make 'pretty URLs'. There are other ways to do it in other languages. Some make use of Apache's mod_rewrite (example) while others parse the requested URI. In PHP a url like
http://example.com/index.php/test/path/info
can be parsed by reading $_SERVER['PATH_INFO'] which is /test/path/info.
Generally, they are using URL Rewriting to simulate the query string however. In the Stackoverflow example:
http://stackoverflow.com/questions/698711/what-is-the-name-for-that-thing-that-lets-part-of-the-url-be-an-argument
The important parts are the questions/698711. You can change the title of the question with impunity but the other two parts you cannot.
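The idea that only questions/698711 matters can be sketched as follows. This is a hypothetical illustration of path-based routing in Python, not how Stack Overflow actually implements it; question_id is an invented helper name.

```python
from typing import Optional
from urllib.parse import urlsplit

def question_id(url: str) -> Optional[int]:
    # Split the path into non-empty segments, e.g. ["questions", "698711", "some-title"].
    segments = [part for part in urlsplit(url).path.split("/") if part]
    if len(segments) >= 2 and segments[0] == "questions" and segments[1].isdigit():
        return int(segments[1])  # anything after the id (the title) is ignored
    return None

print(question_id("http://stackoverflow.com/questions/698711/what-is-the-name-for-that-thing-that-lets-part-of-the-url-be-an-argument"))
print(question_id("http://stackoverflow.com/questions/698711/any-other-title"))
```

Both calls yield the same id, which is why the title portion of such a URL can change freely.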
It's usually called the 'path info'.
That's just URL mapping. It lets you use pretty URLs instead of a large query string.
I believe the StackOverflow URL works that way because it is using MVC whereas your bottom example is using standard requests.
It is indeed done by URL rewriting.
Usually, web application frameworks do this automatically if you install it correctly on your server.
Check out CakePHP as an example.
It's called a URL parameter and uses the HTTP GET method. As others mentioned, it can be rewritten using URL rewriting so that the URL is easier to read and use. Some search keywords: "SEF URLs", "Apache Rewrite", "pretty URLs".