Are there any symbols that come after all letters in Unix's alphabetical sorting? - unix

When I organize my directories I often want certain directories to stand out in ls. For example, I will sometimes have a directory called #backup# and this will end up in the top of the list of directories, rather than in between all directories starting in "b". Sometimes, though, I want a directory to be at the bottom of the list, but I haven't found any symbol that achieves this. (The closest I've come is z#name#z, but this doesn't quite cut it.) So: Are there any symbols that come after all letters in Unix's alphabetical sorting?

You can use any (e.g. ASCII or Unicode [it depends upon your encoding and localization]) character except NULL (used as the ending of filepath) and / (used to separate directories in file path). See path_resolution(7). You might consider using ~ because several utilities (see indent(1), mv(1)....) adopt the convention to backup file /home/nag/foo as /home/nag/foo~. AFAIK #foo# could be used by emacs to backup temporarily an edited (but unsaved) file foo.

Related

What is meaning of backslash aftera a folder directory?

What is the difference between "C:\Android\sdk" and "C:\Android\sdk\" when adding path in environment variables or any other place?
There is no difference. A trailing backslash (or slash in the case of UNIX) is ignored by the operating system. There are only two uses that I know of. First, dealing with broken programs that incorrectly assume a trailing slash/backslash is present when concatenating the value to another relative path. Second, a very small number of commands (e.g., scp) might assign special meaning to a trailing path delimiter.

Extract exactly one file (any) from each 7zip archive, in bulk (Unix)

I have 1,500 7zip archives, each archive contains 2 to 10 files, with no subdirectories.
Each file has the same extension, however the filename varies.
I only want one file out of each archive, but I'd like to perform this in bulk. I do not care which file is taken out, as long as only one file is taken out. It can be the first file, the newest, the biggest, the smallest, it doesn't matter.
Here's an example:
aa.7z {blah 56.smc, blah 57.smc, 1 blah 58.smc}
ab.7z {xx.smc, xx 1.smc, xx_2.smc}
ac.7z {1.smc}
I want to run something equivalent to:
7z e *.7z # But somehow only extract one file
Thank you!
Ultimately my solution was to extract all files and run the following in the directory:
for n in *; do echo "$n"; done > files.txt
I then imported that list into excel, and split the files by a special character that divided the title of the file with the qualifying data inside the filename (for example: Some Title (V1) [X2].smc), specifically I used a brackets delimiter.
Then I removed all duplicates, leaving me with only one edition of each from the zip. I finally remerged the columns (unfortunately the bracket was deleted during the splitting so wrote a function to add it back on the condition of whether there was content in the next column) and then resaved files.txt, after a bit of reviewing StackOverflow for answers, deleted files based on an input file (files.txt). A word of warning on this, spaces in filenames cause problems with rm and xargs so I had to encapsulate the variable with quotes.
Ultimately this still didn't serve me well enough so I just used a different resource entirely.
Posting this answer so others who find themselves in a similar predicament find an alternative resolution.

New line constant

Is there a new line constant that's platform independent in R? I'm used to C# and there's Environment.NewLine which will return \r\n on windows and \n otherwise. Searching turned up nothing, but I assume there has to be something somewhere so that scripts can be platform independent.
Related question: Is there a way to detect the platform a script is running on? This could be useful to know for other reasons (which I haven't thought of yet).
EDIT: Here's why I'm asking. I'm downloading files from an FTP server, but want to get a list of files and only download files that are on the server that don't exist locally. Here's how I'm getting the list of files:
filesonserver <- unlist(strsplit(getURL(basePath, ftp.use.epsv=F, dirlistonly=T), "\n"))
On windows, the files are separated by \r\n. On my mac (where I'm currently working), they're separated by \n. I was looking for a way to make this platform independent. I haven't tried just separating by \n on windows, which might work. There might also be a way to get the list of files as a vector without having to split them, which would avoid this entirely...
The package tryCatchLog has a function determine.platform.NewLine():
https://cran.r-project.org/package=tryCatchLog
https://github.com/aryoda/tryCatchLog/blob/master/R/platform_newline.R
If you consequently use this string instead of hard-coded "\n" your new lines will work platform-independently.
The answer to the initial question appears to be there isn't a new line constant like C# has. But it doesn't matter in my case, as the comments pointed out. It didn't occur to me until after I edited in the details that I probably didn't need to worry about it. Splitting by \n works fine on windows, even though the string containing the files names returned by getURL() is split by \r\n.

Difference between dir/**/* and dir/*/* in Unix glob pattern?

It seems that the output are the same when I echoed it.
I also tested other commands such as open, but the results from both are the same.
In traditional sh-style pattern matching, * matches zero or more characters in a component of the file name, so there is no difference between *, **, and ***, either on its own or as part of a larger pattern.
However, there are globbing syntaxes that assign a distinct meaning to **. Pattern matching implemented by the Z shell, for example, expands x/**/y to all file names beginning with x/ and ending in /y regardless of how many directories are in between, thus matching all of x/y, x/subdir/y, x/subdir1/subdir2/y, etc. This syntax was later implemented by bash, although only enabled when the globstar configuration option is set by the user.

Diff-command: doesn't print lines that are different but still says the two files are different

I'm using the diff command to compare two text files. They need to be literally matched.
So I use the diff:
diff binary.out binary.expected
(By the way, those files are NOT binary files. They are text file. I call them binary because that's the name of the project)
and got
Binary files binary.out and binary.expected differ
When I use another diff tool, the smartest of all (AKA human), and there's really nothing different between the two files.
Does anyone happen to know what's going on here?
Thanks.
diff from diffutils says the following about text/binary:
diff determines whether a file is text or binary by checking the
first few bytes in the file; the exact number of bytes is system
dependent, but it is typically several thousand. If every byte in
that part of the file is non-null, diff considers the file to be
text; otherwise it considers the file to be binary.
hence GNU diff have a quite open definition of what is text, and the use of the --text option to force it to treat the file as text should seldom be needed.
Have you checked if binary.out or binary.expected contains null characters? What version is your diff program?
Make sure to ignore white space in the diff options.
It may also see Unicode characters and interpret that as binary. See if your diff tool has an option to force text mode.

Resources