Mix of parameter and command substitutions - zsh

I found this snippet of code (simplified):
while read item; do
echo -n "${(q)item} "
done
It comes from https://github.com/junegunn/fzf/blob/master/shell/key-bindings.zsh#L12
I don't understand the expression "${(q)item} ".
What is the variable q? I couldn't find any declaration of it. Is it a command substitution? Why are parentheses used inside the curly braces? What does this construction mean?

Parentheses immediately after ${ specify parameter expansion flags. The q flag quotes special characters in the expansion. The zsh manual describes it as follows:
Quote characters that are special to the shell in the resulting words with backslashes; unprintable or invalid characters are quoted using the $'\NNN' form, with separate quotes for each octet.

Related

gsub remove backslash and numbers from string [duplicate]

I'm writing strings which contain backslashes (\) to a file:
x1 = "\\str"
x2 = "\\\str"
# Error: '\s' is an unrecognized escape in character string starting "\\\s"
x2="\\\\str"
write(file = 'test', c(x1, x2))
When I open the file named test, I see this:
\str
\\str
If I want to get a string containing 5 backslashes, should I write 10 backslashes, like this?
x = "\\\\\\\\\\str"
[...] If I want to get a string containing 5 \, should I write 10 \ [...]
Yes, you should. To write a single \ in a string, you write it as "\\".
This is because the \ is a special character, reserved to escape the character that follows it. (Perhaps you recognize \n as newline.) It's also useful if you want to write a string containing a single ". You write it as "\"".
The string \\\str is invalid because it is interpreted as \\ (which corresponds to a single \) followed by \s, which is not valid, since "escaped s" has no meaning.
Have a read of this section about character vectors.
In essence, it says that when you enter character string literals you enclose them in a pair of quotes (" or '). Inside those quotes, you can create special characters using \ as an escape character.
For example, \n denotes a newline, and \" can be used to enter a " without R thinking it's the end of the string. Since \ is an escape character, you need a way to enter an actual \. This is done by using \\. Escaping the escape!
Note that the doubling of backslashes is because you are entering the string at the command line and the string is first parsed by the R parser. You can enter strings in different ways, some of which don't need the doubling. For example:
> tmp <- scan(what='')
1: \\\\\str
2:
Read 1 item
> print(tmp)
[1] "\\\\\\\\\\str"
> cat(tmp, '\n')
\\\\\str
>

Substitution with backreferencing in Atom editor

I have a frequent pattern on my text, say
(Eq. \ref{XXXX})
where XXXX is some word, and I'd like to change all this simply to
\refp{XXXX}
I can't make it work through Ctrl+F, even with regex enabled. The syntax
\(Eq. \\ref{.*}\)
works for finding the occurrences (if with some bugs...) but the traditional backreferencing
\\refp{\1}
won't work for the replacement.
I tried to create a custom command with the atom-shell-commands package, the idea would be to use sed on the current selection. But the package won't accept octal escape sequences.
Any thoughts?
The replacement tokens use a $ sigil, not \. So you want $1, $2, $3, ...
The replacement in this case should be:
\\refp{$1}
As is common with regex matching, these tokens match the contents of paren groups, from left to right. So you need to add matching parens also. Your match string would be:
\(Eq. \\ref{(.*)}\)
Note there are parens around the .* match, so whatever is inside those parens is stored in $1. If there were a second and third set of parens, those would become $2 and $3.
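The same capture-group idea can be tried outside Atom. The sketch below is only an illustration using sed, not Atom's find-and-replace: sed writes the backreference as \1 where Atom writes $1. The sample sentence, the variable names, and the [^}]* pattern (used in place of .* so that each match stops at the first closing brace) are assumptions made here, not part of the answer above.

```shell
# Sample text with two occurrences of the pattern (made-up input).
input='see (Eq. \ref{energy}) and (Eq. \ref{mass}).'

# -E selects extended regular expressions (accepted by GNU and BSD sed).
# The escaped \( \) \{ \} match literal characters;
# the unescaped ( ) form capture group 1.
result=$(printf '%s\n' "$input" |
  sed -E 's/\(Eq\. \\ref\{([^}]*)\}\)/\\refp{\1}/g')

printf '%s\n' "$result"
```

With the sample input this should print see \refp{energy} and \refp{mass}. on one line.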

How to include multiple backslashes in Unix Korn Shell

I have the following variables in Unix Korn Shell
host=nyc43ksj
qry_dir='\test\mydoc\mds'
fullpath="\\$host\$qry_dir"
echo "$fullpath"
When I execute the above, I get output such as \nyc43ksj\qrydir.
It looks like the backslashes are used as escape characters.
I tried changing fullpath as follows:
fullpath="\\$host\\$qry_dir"
echo "$fullpath"
This time I get \nyc43ksj\test\mydoc\mds. However, the two backslashes at the beginning are not displayed as two backslashes. How can I get the fullpath as \\nyc43ksj\test\mydoc\mds (two backslashes at the beginning)?
In a string, the \ (backslash) character acts as an escape character (as you mention), and the second backslash instructs the shell to put in an actual backslash, as opposed to some special character.
If you want to have two actual backslash characters in the string in sequence, you will need to put \\\\ in the string, so:
fullpath="\\\\$host\\$qry_dir"
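One way to sidestep counting backslashes is to keep the literal backslashes inside single quotes, where the shell treats them as ordinary characters, and to print with printf '%s\n', which does not interpret escape sequences in its argument. This is a sketch of that alternative, not the approach from the answer above. Whether echo itself collapses \\ varies between shells and settings, which is why the sketch avoids it.

```shell
host=nyc43ksj
qry_dir='\test\mydoc\mds'

# Inside single quotes every backslash is literal, so '\\' is exactly
# two backslashes. qry_dir already starts with a backslash, so no extra
# separator is added after $host.
fullpath='\\'"$host$qry_dir"

# %s prints the argument verbatim; unlike some echo implementations,
# it does not process backslash escapes in the data.
printf '%s\n' "$fullpath"
```

This should print \\nyc43ksj\test\mydoc\mds regardless of how the shell's echo builtin handles backslashes.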

Unix syntax for the grep command for only an ending character

For the file james, when I run this command:
cat james | grep ["."]
I get only the lines that contain a dot.
How do I get only the lines that end with a dot?
To find lines that end with a . character:
grep '\.$' james
Your cat command is unnecessary; grep is able to read the file itself, and doesn't need cat to do that job for it.
A . character by itself is special in regular expressions, matching any one character; you need to escape it with a \ to match a literal . character.
And you need to enclose the whole regular expression in single quotes because the \ and $ characters are special to the shell. In a regular expression, $ matches the end of a line. (You're dealing with some characters that are treated specially by the shell, and others that are treated specially by grep; the single quotes get the shell out of the way so you can control what grep sees.)
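To see the pattern at work, the sketch below writes three made-up lines to a temporary file and keeps only those that end in a dot. The file contents and the variable names are assumptions for illustration.

```shell
# Build a small sample file; mktemp picks a unique temporary name.
sample=$(mktemp)
printf '%s\n' 'ends with a dot.' 'dot. in the middle' 'no dot at all' > "$sample"

# Single quotes pass \.$ to grep unchanged:
# \. is a literal dot, $ anchors the match to the end of the line.
matches=$(grep '\.$' "$sample")
printf '%s\n' "$matches"

rm -f "$sample"
```

Only the first sample line should be printed, since it is the only one whose last character is a dot.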
As for the square brackets you used in your question, that's another way to escape the ., but it's unusual. In a regular expression, [abc] matches a single character that's any of a, b, or c. [.] matches a single literal . character, since . loses its special meaning inside square brackets. The double quotes you used: ["."] are unnecessary, since . isn't a shell metacharacter -- but square brackets are special to the shell, with a similar meaning to their meaning in a regular expression. So your
grep ["."]
is equivalent to
grep [.]
The shell would normally expand [.] to a list of every visible file name that contains the single character .. There's always such a file, namely the current directory . -- but the shell's [] expansion ignores files whose names start with .. So since there's nothing to expand [.] to, it's left alone, and grep sees [.] as an argument, which just happens to work, matching lines that contain a literal . character. (Using a different shell, or the same shell with different settings, could mess that up.)
Note that the shell doesn't (except in some limited contexts) deal with regular expressions; rather it uses file matching patterns, which are less powerful.
You need to use $, which signals the end of the line:
cat james | grep ["."]$
This also works:
cat james |grep "\.$"
You can use a regular expression. This is what you need:
cat james | grep "\.$"
See the grep man page for more information about regular expressions.

Reading in text file with unmatched quotes

I have a large (>1GB) CSV file I'm trying to read into a data frame in R.
The non-numeric fields are enclosed in double-quotes so that internal commas are not interpreted as delimiters. That's well and good. However, there are also sometimes unmatched double-quotes in an entry, like "2" Nails".
What is the best way to work around this? My current plan is to use a text processor like awk to relabel the quoting character from the double-quote " to a non-conflicting character like pipe |. My heuristic for finding quoting characters would be double-quotes next to a comma:
gawk '{gsub(/(^\")|(\"$)/,"|");gsub(/,\"/,",|");gsub(/\",/,"|,");print;}' myfile.txt > newfile.txt
This question is related, but the solution (argument in read.csv of quote="") is not viable for me because my file has non-delimiting commas enclosed in the quotation marks.
Your idea of looking for quotes next to a comma is probably the best thing you can do; you could however try to turn it around and have the regex escape all the quotes that are not next to a comma (or start/end of line):
Search for
(?<!^|,)"(?!,|$)
and replace all the matches with "".
R might not be the best tool for this because its regex engine doesn't have a multiline mode, but in Perl it would be a one-liner:
$subject =~ s/(?<!^|,)"(?!,|$)/""/mg;
This would be a more foolproof variant of Tim's solution, in case non-boundary commas exist inside the cell:
(?<!,\s+)"(?!\s+,$)
I'm not sure if it would have any bugs though.
