What's the best way to get a file's basename from a URI in Vala?

I can think of two ways: first, I could just manipulate the string itself; strip everything that precedes the last "/". Or, I could use the URI to get a File object, then call query_info().get_display_name().
The first doesn't feel right, while the second results in two objects being created. What is the best practice to follow here?

The second way (using GLib.File) is probably the most robust, but...
If what you have is really a path, not a URI (e.g., /home/foo/bar rather than file:///home/foo/bar), you can just use GLib.Path.get_basename:
GLib.Path.get_basename ("/home/foo/bar");
Because characters in a URI can be encoded (e.g., %20 instead of a space), if you really have a URI you may need to unescape the string first:
GLib.Path.get_basename (GLib.Uri.unescape_string ("file:///home/foo/bar%20baz"));
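One detail worth noting about the order of operations: if the string is unescaped before the basename is taken, an encoded slash (%2F) inside the file name becomes a real separator. The sketch below shows the same idea in JavaScript rather than Vala, purely as an illustration; uriBasename is a hypothetical helper name, and it splits first and decodes afterwards.

```javascript
// Hypothetical helper: returns the decoded last path segment of a URI.
function uriBasename(uri) {
  // The WHATWG URL parser keeps percent-escapes in pathname as-is.
  const { pathname } = new URL(uri);
  // Drop trailing slashes so "file:///home/foo/" yields "foo".
  const trimmed = pathname.replace(/\/+$/, "");
  const lastSegment = trimmed.slice(trimmed.lastIndexOf("/") + 1);
  // Decode after splitting, so an encoded slash (%2F) stays inside the name.
  return decodeURIComponent(lastSegment);
}

console.log(uriBasename("file:///home/foo/bar%20baz")); // "bar baz"
console.log(uriBasename("file:///home/foo/a%2Fb"));     // "a/b"
```

In Vala, the equivalent ordering would be to take the basename of the URI's path first and unescape the result.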

Related

Is it better to use a "?" or a ";" in a URL?

In my application, I redirect an HTTP request and also pass a parameter. Example:
http://localhost:9000/home;signup=error
Is it better to use a ; or shall I use a ? i.e. shall I do http://localhost:9000/home;signup=error or http://localhost:9000/home?signup=error?
Are the above two different from each other semantically?
The ? is a reserved character; I have read that this is both valid and invalid, but I have used it for 'slugs' when templating.
Should you choose to use it as literal data rather than as the query delimiter, percent-encode it as %3F, which is not human-readable but decodes back to ?. (Using an encoder is recommended.)
Perhaps you will find a more suitable solution for your redirects by adding an .htaccess file to your project.
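To see whether the two forms differ semantically, it can help to look at how a standard URL parser splits them. The following is a small sketch using the WHATWG URL class available in browsers and Node.js; the URLs are the ones from the question, and the variable names are illustrative only.

```javascript
const withSemicolon = new URL("http://localhost:9000/home;signup=error");
const withQuestionMark = new URL("http://localhost:9000/home?signup=error");

// With ";" everything stays in the path; there is no query at all.
console.log(withSemicolon.pathname);                   // "/home;signup=error"
console.log(withSemicolon.search);                     // ""
console.log(withSemicolon.searchParams.get("signup")); // null

// With "?" the path is just "/home" and signup is a query parameter.
console.log(withQuestionMark.pathname);                   // "/home"
console.log(withQuestionMark.searchParams.get("signup")); // "error"
```

So with this parser the two are not equivalent: the ; form requests a different path, and the server-side routing has to understand it.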

Parameter separator in URLs, the case of misused question mark

What I don't really understand is the benefit of using '?' instead of '&' in urls:
It makes nobody's life easier if we use a different character as the first separator character.
Can you come up with a reasonable explanation?
EDIT: after more research I found that "&" can be part of a file name (terms&conditions.html), so "?" is a good separator. But I still think using "?" for all separators would make life easier (from the point of view of URL generators and parsers):
Is there any advantage in using "&" which is not clear at the first glance?
From the URI spec's (RFC 3986) point of view, the only separator here is "?". The format of the query is opaque; the ampersands are just something that HTML happens to use for form submissions.
The answer's pretty much in this article - http://www.skorks.com/2010/05/what-every-developer-should-know-about-urls/ . To highlight it, here goes :
Query is the preferred way to send some parameters to a resource on
the server. These are key=value pairs and are separated from the rest
of the URL by a ? (question mark) character and are normally separated
from each other by & (ampersand) characters. What you may not know is
the fact that it is legal to separate them from each other by the ;
(semi-colon) character as well. The following URLs are equivalent:
http://www.blah.com/some/crazy/path.html?param1=foo&param2=bar
http://www.blah.com/some/crazy/path.html?param1=foo;param2
RFC 3986 (https://www.ietf.org/rfc/rfc3986.txt) defines general delimiters and sub-delimiters: '?' is a general delimiter, while '&' and ';' are sub-delimiters. The spec is pretty clear about that.
In this case, any later '?' characters are treated as part of the query. A query parser that follows the spec strictly passes the whole query on to the destination app. Whether that app then processes the query string in a way that treats ? as a delimiter between name-value pairs is up to the app's designers.
My guess is that this often 'just works' because code that splits the original URI and the query string matches against all delimiters: 1) the URI is first split on '?', then 2) the query string is parsed using a character list that includes '?' (for convenience only). This could already be happening in ubiquitous parsing libraries.
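As a concrete check of the points above, here is a short sketch of what one widely deployed parser does: the WHATWG URL and URLSearchParams classes in browsers and Node.js. It is only one implementation; server-side frameworks may split the query differently. The URL with two question marks is a made-up variation on the article's example.

```javascript
const url = new URL("http://www.blah.com/some/crazy/path.html?param1=foo?param2=bar");

// Only the first "?" delimits the query; the second stays inside it.
console.log(url.pathname);                   // "/some/crazy/path.html"
console.log(url.search);                     // "?param1=foo?param2=bar"
console.log(url.searchParams.get("param1")); // "foo?param2=bar"

// This parser splits pairs on "&" only, so ";" is not a pair separator here.
const ampersand = new URLSearchParams("param1=foo&param2=bar");
const semicolon = new URLSearchParams("param1=foo;param2=bar");
console.log(ampersand.get("param2")); // "bar"
console.log(semicolon.get("param1")); // "foo;param2=bar"
console.log(semicolon.get("param2")); // null
```

In other words, whether ; or a second ? separates parameters depends on the code that interprets the query, not on the URI syntax itself.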

In ASP.NET, why is there UrlEncode() AND UrlPathEncode()?

In a recent project, I had the pleasure of troubleshooting a bug that involved images not loading when spaces were in the filename. I thought "What a simple issue, I'll UrlEncode() it!" But, NAY! Simply using UrlEncode() didn't resolve the problem.
The new problem was that the HttpUtility.UrlEncode() method switched spaces ( ) to pluses (+) instead of %20 like the browser wanted. So file+image+name.jpg would return not-found while file%20image%20name.jpg was found correctly.
Thankfully, a coworker pointed out HttpUtility.UrlPathEncode() to me, which uses %20 for spaces instead of +.
WHY are there two ways of handling Url encoding? WHY are there two commands that behave so differently?
UrlEncode is useful for use with a QueryString as browsers tend to use a + here in place of a space when submitting forms with the GET method.
UrlPathEncode simply replaces all characters that cannot be used within a URL, such as < and >.
Both MSDN links include this quote:
You can encode a URL using with the UrlEncode method or the
UrlPathEncode method. However, the methods return different results.
The UrlEncode method converts each space character to a plus character
(+). The UrlPathEncode method converts each space character into the
string "%20", which represents a space in hexadecimal notation. Use
the UrlPathEncode method when you encode the path portion of a URL in
order to guarantee a consistent decoded URL, regardless of which
platform or browser performs the decoding.
So in a URL you have the path, then a ?, and then the parameters (i.e. http://some_path/page.aspx?parameters). URL paths encode spaces differently than URL parameters do; that's why there are two versions. For a long time spaces were not valid in the path of a URL, but were allowed (as +) in the parameters.
In other words, the formatting of URLs has changed over time. For a long time only ASCII characters could be in a URL, too.
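The same path-versus-query split exists outside ASP.NET. As a rough analogy (not the .NET API itself), JavaScript's encodeURIComponent produces %20 for spaces, while URLSearchParams serializes in the form-encoding style and produces +. The file name is the one from the question; the parameter name q is made up for the example.

```javascript
const name = "file image name.jpg";

// Percent-encoding, suitable for a path segment: spaces become %20.
const pathEncoded = encodeURIComponent(name);
console.log(pathEncoded); // "file%20image%20name.jpg"

// Form-style query encoding: spaces become "+".
const formEncoded = new URLSearchParams({ q: name }).toString();
console.log(formEncoded); // "q=file+image+name.jpg"

// A "+" is only a space under form decoding; plain percent-decoding keeps it.
console.log(decodeURIComponent("file+image+name.jpg")); // "file+image+name.jpg"
```

This is why a + in the path portion is looked up as a literal plus sign and the image is not found.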

dealing with an encrypted HttpUtility.UrlEncode parameter

I have a problem dealing with encrypted URL parameters when applying HttpUtility.UrlEncode or UrlDecode.
For a given URL string: ?fid=7kqguwhYMNw=&uid=YCRSGG71+58=
the plus sign that is part of the encrypted data in uid is stripped out and replaced with a space, so my attempts to decrypt it fail.
OK, so I know that + is reserved shorthand for a space in a query string (RFC 1630), but since I don't have much control over the value returned from encryption, how can I get around this?
EDIT:
OK, so good point brought up. Ignore the UrlEncode/UrlDecode part of the question. Request.QueryString(["uid"]) will still have the plus sign stripped out of it when I pass it to my decryption method.
I would suggest adding code to remove the = characters, replace + with -, and replace / with .
s = s.Replace("=", "").Replace("+", "-").Replace("/", ".")
If you need to process the resulting string, you can do the reverse:
s = s.Replace(".", "/").Replace("-", "+")
(there is no reason to put back the = characters... they are merely padding).
That way you don't need to worry about URL encoding and decoding and it avoids unnecessary expansion of your string. It also looks more professional to users if they end up seeing the URL... percent signs in URL are ugly and almost always unnecessary... it screams "amateur" whenever I see them.
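The substitution described in this answer can be sketched as a pair of functions. This is an illustrative JavaScript version, not the answerer's code; toUrlSafe and fromUrlSafe are hypothetical names. One assumption differs from the answer: the reverse step restores the = padding, because some Base-64 decoders (for example .NET's Convert.FromBase64String) reject input whose length is not a multiple of four.

```javascript
// Make a Base-64 string safe to place in a URL without percent-encoding.
function toUrlSafe(b64) {
  return b64
    .replace(/=+$/, "")   // drop trailing padding
    .replace(/\+/g, "-")  // "+" would be read as a space in a query string
    .replace(/\//g, "."); // "/" is a path separator
}

// Reverse the substitution and restore padding to a multiple of 4 characters.
function fromUrlSafe(s) {
  const b64 = s.replace(/\./g, "/").replace(/-/g, "+");
  const padding = (4 - (b64.length % 4)) % 4;
  return b64 + "=".repeat(padding);
}

console.log(toUrlSafe("YCRSGG71+58="));  // "YCRSGG71-58"
console.log(fromUrlSafe("YCRSGG71-58")); // "YCRSGG71+58="
```

Note that this only works if both the code that builds the URL and the code that reads it apply the same substitution.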
The Base-64 encoded value needs to be URL-encoded before it is put in the URL. If I do HttpUtility.UrlEncode("YCRSGG71+58=") then I get YCRSGG71%2b58%3d - which has no plus signs, and can be correctly decoded.
In other words, the code that is putting a base-64 value on the URL without encoding it first is wrong. If you control that code, you should change it. If you don't control that code, then don't try to decode something that wasn't url-encoded in the first place.
As a side remark, you should use HttpUtility.UrlEncode and HttpUtility.UrlDecode for this kind of work. However, even these won't help you, since the URL is malformed anyway.
So, don't use anything at all! Since it's not encoded, why decode it?
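The failure and the fix can both be reproduced with any form-style query parser. The sketch below uses JavaScript's URLSearchParams as a stand-in for Request.QueryString, which is an assumption about equivalent behaviour for this case, not the ASP.NET API; the value is the uid from the question.

```javascript
const raw = "YCRSGG71+58=";

// Unencoded: the parser treats "+" as a space, corrupting the value.
const broken = new URLSearchParams("uid=" + raw).get("uid");
console.log(broken); // "YCRSGG71 58="

// Encoded before being placed in the URL: the value survives the round trip.
const encoded = encodeURIComponent(raw);
console.log(encoded); // "YCRSGG71%2B58%3D"
const fixed = new URLSearchParams("uid=" + encoded).get("uid");
console.log(fixed); // "YCRSGG71+58="
```

The important point is that encoding has to happen where the URL is built; nothing on the receiving side can reliably tell a corrupted space from an intended one.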

regular expression for physical path

Can someone tell me the JavaScript regular expression for a physical path like the following?
1) The user should enter something like this in the textbox: (c://Folder1/). The drive may also be d: or e:.
2) After that, these are also acceptable:
a) (c://Folder1/Folder2/)
b) (d://Folder1/Folder2/Folder3/abc.txt)
c) (c://Folder1/Folder2/Folder3/abc.txt)
From the examples you've given, something like this should work:
[a-zA-Z]://(\w+/)+
ie:
[a-zA-Z] = a single letter (upper or lower case)
followed by
:// = the characters "://"
followed by:
(\w+/)+ = at least one "something/".
"something/" defined as :
\w+ = at least one word character (ie any alphanumeric), followed by
/ = literal character "/"
Hope this helps - my syntax may be a little off as I'm not fully up to speed on the javascript variant for regex.
Edit: put regex in code tags so it is visible! And tidy up explanation.
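For reference, here is how the pattern above might look as a JavaScript regex literal. Two things in this sketch are additions, not part of the answer: the ^ and $ anchors, so the whole input must match, and an optional trailing name.ext group so that the abc.txt examples from the question pass. Treat both as assumptions about what the asker wants.

```javascript
// Drive letter, "://", one or more "folder/" parts, optional "name.ext" file.
const drivePath = /^[a-zA-Z]:\/\/(\w+\/)+(\w+\.\w+)?$/;

const samples = [
  "c://Folder1/",
  "c://Folder1/Folder2/",
  "d://Folder1/Folder2/Folder3/abc.txt",
  "c://",        // no folder at all
  "Folder1/",    // no drive letter
  "c://Folder1", // missing trailing slash
];

for (const sample of samples) {
  console.log(sample, drivePath.test(sample));
}
```

Because \w only covers letters, digits and underscore, folder names containing spaces or hyphens are rejected by this sketch.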
This problem is actually trickier than you think. You're trying to validate a path, but paths can be surprisingly hard to validate properly. Are you properly handling UNC network paths, for example?
This is known as the canonicalization problem and is part of writing secure code. I suggest checking out some guidance from Microsoft for properly canonicalizing and validating the path in your application. The advantage of canonicalizing your path is that you also implicitly validate its format because the canonical form will be returned from a library call that will only return paths that are potentially valid (properly formatted). This means that you don't have to do any sort of regex validation at all. Just throw your string at the method that canonicalizes the path (Path.GetFullPath() probably) and handle the exception for an invalid path.
