Get Tridion URI from a Page Publish Path?

I would like to get a Page URI from the Tridion Page PublishPath (URL from public site).
I know I can use the WebDAV URL to get the Page - but there are many cases where this does not match the publish path (such as when the Page's title is 'News Home' but its filename is 'index').
I don't think this is currently possible in Tridion, and have thought about the following solutions:
Store the Page Publish Path in the Tridion Solr Index (by extending it)
Customized Tridion Search Index Handler: Custom vs Standard field for page url?
Use the Event System and persist the PublishPath and Uri to an external DB / KeyValueStore.
Other ideas?
This code would be used in a script that updates many pages but also for Editors to open a Page using the Page URL.

This may be considered an odd approach, but could you query the Broker to get the PageMeta by URL, and then access the URI from the PageMeta object?
Just a thought - it is probably not ideal - but can you share some more background on the problem?
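To illustrate the Broker idea, here is a minimal sketch against the Tridion Content Delivery .NET API, assuming the page has been published. The method names are from memory, so verify them against your CD API version.

using Tridion.ContentDelivery.Meta;

public static string GetPageUriByUrl(int publicationId, string url)
{
    // Look the page up in the Broker by its published URL, e.g. "/news/index.html"
    var factory = new PageMetaFactory(publicationId);
    IPageMeta meta = factory.GetMetaByUrl(publicationId, url);
    if (meta == null)
    {
        return null; // not published, or the URL doesn't match the publish path
    }

    // Rebuild the TCM URI from the numeric ids (item type 64 = Page)
    return string.Format("tcm:{0}-{1}-64", meta.PublicationId, meta.Id);
}

This only works for published pages, of course, but for opening a Page from its public URL that may be exactly what you want.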

I'm assuming you need to do this on the CME side? Is performance a factor?
You could build a function with TOM.NET or the Core Service that splits the path you have and loops through the Structure Groups and Pages until you arrive at the correct content page, as in the sketch below. As long as the path information hasn't changed since the page was published, this should map together.
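A rough sketch of that path walk using the Core Service client. Treat the type and method names as pseudocode to verify against your client version; it assumes you already know the publication's root Structure Group URI and that Directory and FileName values still match the published URL.

using System.Linq;
using System.Xml.Linq;
using Tridion.ContentManager.CoreService.Client;

public static string FindPageUriByPath(SessionAwareCoreServiceClient client,
                                       string rootStructureGroupUri,
                                       string publishPath)
{
    // e.g. publishPath = "/news/index.html"
    string[] parts = publishPath.Trim('/').Split('/');
    string parentUri = rootStructureGroupUri;

    // Walk down the Structure Groups, matching each path segment on Directory
    foreach (string dir in parts.Take(parts.Length - 1))
    {
        var sgFilter = new OrganizationalItemItemsFilterData
        {
            ItemTypes = new[] { ItemType.StructureGroup }
        };
        parentUri = client.GetListXml(parentUri, sgFilter)
            .Elements()
            .Select(e => (StructureGroupData)client.Read(
                e.Attribute("ID").Value, new ReadOptions()))
            .First(sg => sg.Directory == dir)
            .Id;
    }

    // Match the page on FileName (last path segment minus the extension)
    string fileName = parts.Last();
    int dot = fileName.LastIndexOf('.');
    if (dot >= 0) fileName = fileName.Substring(0, dot);

    var pageFilter = new OrganizationalItemItemsFilterData
    {
        ItemTypes = new[] { ItemType.Page }
    };
    var page = client.GetListXml(parentUri, pageFilter)
        .Elements()
        .Select(e => (PageData)client.Read(e.Attribute("ID").Value, new ReadOptions()))
        .FirstOrDefault(p => p.FileName == fileName);

    return page == null ? null : page.Id;
}

Reading every sibling item to compare Directory/FileName is not fast, so for the bulk-update script you would want to cache results per Structure Group.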

Related

firebase storage custom html page for errors like permission denied?

Is there a way to show a custom HTML page, or redirect the user, when Storage throws errors such as the file not being available, insufficient permissions, or any other issue where the file can't be accessed/displayed?
So instead of the default error response, we need to show a custom page with our brand designs.
What are the options here?
No, there is not. You are using an API endpoint (download URL) meant for programmatic consumption or inline page insertion, not for loading into a browser to present a full web page.

Kentico Google Analytics page view

I'm looking at the GA traffic and I'm seeing page views for pages like this: /cms/getdoc/2d22c1db-ae83...angobjectlifetime=request
Is this page used when a user is viewing a document (PDF, Word, etc.)?
Not necessarily. It could be any page within the content tree (including files). The cms prefix means it requires authentication (it's usually within the administration interface), and the getdoc handler means the URL is a permanent one (it uses the GUID that follows), so you always get this page/file no matter where in the content tree it is (after you move it, for example).
Google Analytics records any hit to your website, whether it is used to access a legitimate page or a resource on your website. You should use filters in your reports to filter out this data.

Adding simple CMS functionality to an existing MVC application

ASP.NET 4.51, MVC 5
Have read Integrating a CMS into an established application-centric MVC website
We have a number of MVC applications that serve as public facing websites. The applications were built using MVC as that was the technology stack understood by the developers and primarily the content that was being delivered was based on business process data.
However more and more we are being asked to add "another page" to the websites which for all intents and purposes is a plain old static content page. This ultimately involves:
Adding a new route
Creating a view with the required HTML
We have various "home grown" solutions which now pull HTML from the database for these views. However, this means we are writing custom back-end data entry screens as well as doing 1 & 2 above.
So... there must be a better way. Has anyone got any practical experience or suggestions on how to add simple CMS functionality, plugged into our MVC application, that we can give to end users? We need to provide the following functionality to the end user:
Create new pages, edit pages using WYSIWYG
Add meta tags and canonical tags for SEO
Specify the URL portion of the URI for SEO purposes
All insights appreciated.
Is it feasible to do the following:
Have a database table to house the content for these pages. e.g. title, summary, description, url, meta, image(s) etc...
In the front end have a template for these pages. The database data fills in the placeholders within this template.
Perhaps hold all the pages on a base URL like www.yoursite.com/page/dynamic-page-url-from-db
You can use Remote attribute validation on the URL field to make sure the URLs are all unique in the database.
With this in mind, create a single route to catch the requests, and in the Page controller check the URL against the db to filter valid from invalid requests. If the page doesn't exist, throw new HttpException(404, "Page Not Found"); and have an error handler pick that up and deliver your 404.
META could be set via ViewBag or a dedicated section that alters the _Layout file at the point of rendering the view.
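A minimal sketch of that single-route setup, where IPageRepository and ContentPage are hypothetical names standing in for your own data access and entity:

using System.Web;
using System.Web.Mvc;
using System.Web.Routing;

public class ContentPage
{
    public string Title;
    public string MetaDescription;
    public string Html;
}

public interface IPageRepository
{
    ContentPage GetByUrl(string url); // hypothetical lookup by the URL field
}

public static class CmsRouteConfig
{
    public static void RegisterRoutes(RouteCollection routes)
    {
        routes.MapRoute(
            name: "CmsPage",
            url: "page/{*url}", // catches www.yoursite.com/page/dynamic-page-url-from-db
            defaults: new { controller = "Page", action = "Show" });
    }
}

public class PageController : Controller
{
    private readonly IPageRepository _repository;

    public PageController(IPageRepository repository)
    {
        _repository = repository;
    }

    public ActionResult Show(string url)
    {
        ContentPage page = _repository.GetByUrl(url);
        if (page == null)
        {
            // Picked up by your error handler, which delivers the 404 view
            throw new HttpException(404, "Page Not Found");
        }

        ViewBag.Title = page.Title;                     // META via ViewBag,
        ViewBag.MetaDescription = page.MetaDescription; // rendered in _Layout
        return View("ContentPage", page);               // one shared template
    }
}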
TinyMCE is a decent WYSIWYG editor. You can even add dynamic image gallery functionality to it if you want to embed images within the main body of the pages.
I'm working on turning a CMS currently used in a demanding production environment into a product. I've just (as of 20 Jan 2015) made a NuGet package which installs the CMS into an MVC project; it should be possible to add it to any existing MVC site without breaking it, and CMS functionality can then be added where needed. Currently I'm looking to work with some users to help them get the CMS into production on their sites, although this may have changed by the time you read this. Look at http://www.lynicon.com for more information and to sign up to a Slack community where I can give you access to the NuGet package.

Updateable Google Sitemap for ASP.NET 3.5 Web App Project

I am working on an ASP.NET 3.5 Web Application project in C#. I have manually added a Google-friendly sitemap which includes entries for every page in the project - this is not a CMS.
<url>
<loc>http://www.mysite.com/events.aspx</loc>
<lastmod>2009-11-17T20:45:46Z</lastmod>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url>
The client updates events using an admin back-end. Other than that, the site is relatively static. I'm trying to decide on the best way to update the <lastmod> values for a handful of pages that are regularly updated.
In particular, I am using the QueryStringField property of the DataPager control (with a ListView) to enhance SEO, as described here:
https://web.archive.org/web/20211029044137/https://www.4guysfromrolla.com/articles/010610-1.aspx
http://gsej.wordpress.com/2009/05/31/using-a-datapager-with-both-a-querystringfield-and-renderdisabledbuttonsaslabels/
When the QueryStringField property is set, the DataPager renders the paging interface as a series of hyperlinks which the crawler can follow and index. However, if Google crawled my list of events two days ago and the admin has added another dozen events in the meantime (say the page size is set to 6), the Google SERP links would now be pointing to the wrong pages. This is why I need to be sure that the sitemap reflects changes to the events page as soon as they happen.
I have already looked though other SO questions for info and didn't find what I needed. Can anyone offer some guidance or an alternative approach?
UPDATE:
Since this is a shared hosting environment, a directory watcher/service won't work:
How to create file watcher in shared webhosting environment
UPDATE:
Starting to realize that I may need to signify to Google that the containing page has been updated; perhaps by updating the last-modified HTTP header?
Rather than using a hand-coded sitemap, create a sitemap handler that will generate the sitemap on the fly. You can create a method in the handler that will grab pages from an existing navigation sitemap, from the database, or even from a hard-coded list of pages. You can create an XmlDocument from the list, and write the InnerXml of the document out to the handler response stream.
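A minimal sketch of such a handler, mapped to e.g. sitemap.ashx. GetPages() and SitemapEntry are hypothetical stand-ins for your navigation sitemap, database query, or hard-coded list.

using System;
using System.Collections.Generic;
using System.Web;
using System.Xml;

public class SitemapHandler : IHttpHandler
{
    public bool IsReusable { get { return true; } }

    public void ProcessRequest(HttpContext context)
    {
        const string ns = "http://www.sitemaps.org/schemas/sitemap/0.9";
        var doc = new XmlDocument();
        XmlElement urlset = doc.CreateElement("urlset", ns);
        doc.AppendChild(urlset);

        foreach (SitemapEntry page in GetPages())
        {
            XmlElement url = doc.CreateElement("url", ns);

            XmlElement loc = doc.CreateElement("loc", ns);
            loc.InnerText = page.Url;
            url.AppendChild(loc);

            XmlElement lastmod = doc.CreateElement("lastmod", ns);
            lastmod.InnerText = page.LastModified.ToString("yyyy-MM-ddTHH:mm:ssZ");
            url.AppendChild(lastmod);

            urlset.AppendChild(url);
        }

        // Write the InnerXml of the document out to the response stream
        context.Response.ContentType = "text/xml";
        context.Response.Write(doc.InnerXml);
    }

    private static IEnumerable<SitemapEntry> GetPages()
    {
        // Hypothetical: pull from a navigation sitemap, the database,
        // or a hard-coded list of pages
        yield return new SitemapEntry
        {
            Url = "http://www.mysite.com/events.aspx",
            LastModified = DateTime.UtcNow
        };
    }
}

public class SitemapEntry
{
    public string Url;
    public DateTime LastModified;
}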
Then, create a class with a method that will automatically ping search engines with the above handler's URL (like http://www.google.com/webmasters/tools/ping?sitemap=http://www.mysite.com/sitemap.ashx).
Whenever someone adds a new event, call the above method. This will ping Google with your latest sitemap (freshly generated by the above handler).
You want to make sure that the ping only works if the sitemap has actually been updated. You could use File.SetLastWriteTime on events.aspx in the AddNewEvent handler to signify that the containing page has been updated.
Also, be careful to make sure there have been no pings within the last hour (Google guidelines discourage pinging more than once per hour).
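A sketch of that ping method with a simple hourly throttle. The in-memory timestamp is an assumption for brevity; it won't survive an app restart, so the File.SetLastWriteTime approach above (or a database timestamp) would be more robust.

using System;
using System.Net;

public static class SitemapPinger
{
    private static DateTime _lastPing = DateTime.MinValue;

    public static void PingGoogle()
    {
        // Google guidelines discourage pinging more than once per hour
        if (DateTime.UtcNow - _lastPing < TimeSpan.FromHours(1))
            return;

        string pingUrl = "http://www.google.com/webmasters/tools/ping?sitemap="
            + Uri.EscapeDataString("http://www.mysite.com/sitemap.ashx");

        using (var client = new WebClient())
        {
            client.DownloadString(pingUrl); // fire the ping
        }

        _lastPing = DateTime.UtcNow;
    }
}

Call SitemapPinger.PingGoogle() from the AddNewEvent handler after the event has been saved.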
I actually plan to implement this in the following OSS project: http://cyclemania.codeplex.com. I will let you know once it's done and you can have a look.
If you let your users add events to the website, you are probably using a database.
This means you can generate the XML-Sitemap at runtime like this:
create a page where your sitemap will be available (this doesn't need to be sitemap.xml but can also be sitemap.aspx or even sitemap.ashx).
open a database connection
loop through all records and create an Xml Element for each record
This blog post should help you further: Build a Search Engine SiteMap in C#.
It does not use the new XElement API from .NET 3.5, but it will work fine.
You can put this in an .aspx page, but adding an HttpHandler is probably better, as described in a different post on the same blog (creating an HttpHandler for a sitemap).
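For comparison, here is roughly what the record loop looks like with the XElement API from .NET 3.5 that the linked post predates; EventRecord is a hypothetical shape for your database rows.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class EventRecord
{
    public string Url;
    public DateTime Modified;
}

public static class SitemapBuilder
{
    public static XElement Build(IEnumerable<EventRecord> records)
    {
        XNamespace ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

        // One <url> element per database record
        return new XElement(ns + "urlset",
            from r in records
            select new XElement(ns + "url",
                new XElement(ns + "loc", r.Url),
                new XElement(ns + "lastmod",
                    r.Modified.ToString("yyyy-MM-ddTHH:mm:ssZ"))));
    }
}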

Where content based websites store their content?

Sites like cnn.com or foxnews.com.
Where do they store all the articles? In html files? In database?
It seems more logical to store everything in a DB, but then how do you generate a static link to something that is inside the DB?
They don't have dynamic page loads like LoadArticle.aspx?ArticleID=123; every article has its own address.
Please explain how this is done.
They use a special content management library called VoodooLib.dll.
Seriously, when you write something to a database, you normally generate some kind of unique identifier (123, for example). It gets permanently associated with that record (the article content), and after that it is used to generate the same id as part of a URL at any time later.
As for the static link, it is a simple matter of Url Rewriting.
You generate static links to display on a page because they work much better for SEO. When a request for such a static URL hits the server, it is substituted for something "server friendly" and then processed.
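A sketch of how such a link might be generated from the record's id and title; the /articles/... pattern is illustrative, and the rewrite rule on the server maps it back to the id.

using System.Text.RegularExpressions;

public static class ArticleLinks
{
    public static string ArticleUrl(int id, string title)
    {
        // "Some Article Title!" -> "some-article-title"
        string slug = Regex.Replace(title.ToLowerInvariant(), "[^a-z0-9]+", "-")
                           .Trim('-');
        return string.Format("/articles/{0}/{1}", id, slug);
    }
}

The server then rewrites /articles/123/some-article-title back to something like LoadArticle.aspx?ArticleID=123 before processing it.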
They probably use some form of Content Management System (CMS). There are many different ones out there; most store the actual content in a database or as XML (some store XML in a database). They will then either publish that content as static HTML pages or, more commonly now, serve it as dynamic pages that are cached. Many use what are known as "friendly URLs": virtual addresses that are mapped to the actual physical file path using URL-rewriting techniques.
Note you can't tell whether a page is dynamic or static simply from the extension. It is quite possible to have dynamic pages that end in the .html extension.
Just because the URL looks "static" doesn't mean it is; they could be using something like mod_rewrite or an IIS ISAPI to make the URLs more search engine friendly.
For the high-volume news sites that you mention, however, they may very well generate the pages statically in order to prevent overloading the database with repeated requests for the same article.
Look at the URL of this page; it doesn't have xxx.aspx?some-query-string.
You are referring to friendly URLs.
To do something like that, one common way is to use URL Rewrite and/or some custom HTTPModule
Here's a good reference: http://weblogs.asp.net/scottgu/archive/2007/02/26/tip-trick-url-rewriting-with-asp-net.aspx
Just because a page has a normal URL does not mean that it isn't serving dynamic content. With the Apache mod_rewrite module, it is possible to manipulate URLs. So, for example, a page like http://www.domain.tld/permalink/12345/message-title-slug can be converted internally to http://www.domain.tld/permalink/index.php?id=12345&slug=message-title-slug.
I do not know exactly what cnn.com and foxnews.com use, but I would bet that they use a Content Management System (CMS) which serves all pages dynamically, with the content stored either in a database or on the filesystem, and with authoring/publishing all being performed through the particular CMS.
Just checking cnn.com, the article links contain:
Year
Location (US or WORLD/specificlocationid)
Month
Day
Article name.
All of this information together can be used to uniquely identify any article (even less of it is probably actually needed). The dynamic content loading page address could easily be hidden by some method of URL rewriting, and then the information in the requested URL is used to determine which article in the DB is to be served up.
I don't know why all the other answerers seem to assume that some form of URL rewriting is necessary to create friendly URLs. It's not true at all.
It's perfectly possible to write web serving code that splits a URL into parameters - eg year, month, title - and pass that directly to the code that gets the content from the database, without any need to rewrite the URL. Most modern web frameworks such as Django and Rails include this functionality out of the box.
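Since the rest of this thread is .NET-flavoured, here is the same idea expressed as an ASP.NET MVC route (names are illustrative): the route pattern itself splits the URL into parameters and hands them to the controller, with no rewriting involved.

using System.Web.Mvc;
using System.Web.Routing;

public static class ArticleRoutes
{
    public static void RegisterRoutes(RouteCollection routes)
    {
        // e.g. /2009/11/some-article-title
        //   -> ArticlesController.Show(year: 2009, month: 11, title: "some-article-title")
        routes.MapRoute(
            name: "Article",
            url: "{year}/{month}/{title}",
            defaults: new { controller = "Articles", action = "Show" },
            constraints: new { year = @"\d{4}", month = @"\d{2}" });
    }
}

The Show action then queries the database directly by year, month and title.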
This is done through mod_rewrite techniques.
Here's an article about the mod rewriting engine: http://httpd.apache.org/docs/1.3/mod/mod_rewrite.html
And here's their "guide": http://httpd.apache.org/docs/2.0/misc/rewriteguide.html
I hope that helps. It should make for a good starting point. Good luck.
