I am trying to learn text processing units in Unix through hackerrank.com's practice problems.
One of the problems is this: https://www.hackerrank.com/challenges/text-processing-cut-4
But I don't understand how to take inputs from standard output in unix, so I'm having trouble submitting my answer. Can you please let me know how to get started with taking inputs? Thanks
In the question we don't have to take input, we have to just display the first four characters from each line of text.
solution:
cut -c1-4
i know that this is an old question, but I'll answer anyway, so everyone can have a response, if they need it.
As you can read on the man page, the synopsis is cut [OPTION]... [FILE]..., so the FILE argument is optional (because it's surrounded by square brackets).
In addition:
With no FILE, or when FILE is -, read standard input.
So if you simply call cut command withoud any file specification, it will read from STDIN.
Related
I have a CSV file and I want to remove the all line feeds (LF or \n) which are all coming in between the double quotes alone.
Can you please provide me an Unix script to perform the above task. I have given the input and expected output below.
Input :
No,Status,Date
1,"Success
Error",1/15/2018
2,"Success
Error
NA",2/15/2018
3,"Success
Error",3/15/2018
Expected output:
No,Status,Date
1,"Success Error",1/15/2018
2,"Success Error NA",2/15/2018
3,"Success Error",3/15/2018
I can't write everything for you, as I am not sure about your system as well as which bash version that is running on it. But here are a couple of suggestions that you might want to consider.
https://www.unix.com/shell-programming-and-scripting/31021-removing-line-breaks-shell-variable.html
https://www.unix.com/shell-programming-and-scripting/19484-remove-line-feeds.html
How to remove carriage return from a string in Bash
https://unix.stackexchange.com/questions/57124/remove-newline-from-unix-variable
Remove line breaks in Bourne Shell from variable
https://unix.stackexchange.com/questions/254644/how-do-i-remove-newline-character-at-the-end-of-file
https://serverfault.com/questions/391360/remove-line-break-using-awk
I have the following in my .zshrc:
setopt PROMPT_SUBST
precmd(){
echo""
LEFT="$fg[cyan]$USERNAME#$HOST $fg[green]$PWD"
RIGHT="$fg[yellow]$(date +%I:%M\ %P)"
RIGHTWIDTH=$(($COLUMNS-${#LEFT}))
echo $LEFT${(l:$RIGHTWIDTH:)RIGHT}
}
PROMPT="$ "
This gives me the following screenshot
The time part on the right is always not going all the way to the edge of the terminal, even when resized. I think this is due to the $(date +%I:%M\ %P) Anyone know how to fix this?
EDIT: Zoomed in screenshot
While your idea is commendable, the problem you suffer from is that your LEFT and RIGHT contains ANSI escape sequences (for colors), which should be zero-width characters, but are nevertheless counted toward the length of a string if you naively use $#name, or ${(l:expr:)name}.
Also, as a matter of style, you're better off using Zsh's builtin prompt expansion, which wraps a lot of common things people may want to see in their prompts in short percent escape sequences. In particular, there are builtin sequences for colors, so you don't need to rely on nonstandard $fg[blah].
Below is an approximate of your prompt written in my preferred coding style... Not exactly, I made everything super verbose so as to be understandable (hopefully). The lengths of left and right preprompts are calculated after stripping the escape sequences for colors and performing prompt expansion, which gives the correct display length (I can't possibly whip that up in minutes; I ripped the expression off pure).
precmd(){
local preprompt_left="%F{cyan}%n#%m %F{green}%~"
local preprompt_right="%F{yellow}%D{%I:%M %p}%f"
local preprompt_left_length=${#${(S%%)preprompt_left//(\%([KF1]|)\{*\}|\%[Bbkf])}}
local preprompt_right_length=${#${(S%%)preprompt_right//(\%([KF1]|)\{*\}|\%[Bbkf])}}
local num_filler_spaces=$((COLUMNS - preprompt_left_length - preprompt_right_length))
print -Pr $'\n'"$preprompt_left${(l:$num_filler_spaces:)}$preprompt_right"
}
PROMPT="$ "
Edit: In some terminal emulators, printing exactly $COLUMN characters might wrap the line. In that case, replace the appropriate line with
local num_filler_spaces=$((COLUMNS - preprompt_left_length - preprompt_right_length - 1))
End of edit.
This is very customizable, because you can put almost anything in preprompt_left and preprompt_right and still get the correct lengths — just remember to use prompt escape sequence for zero width sequences, e.g., %F{}%f for colors, %B%b for bold, etc. Again, read the docs on prompt expansion: http://zsh.sourceforge.net/Doc/Release/Prompt-Expansion.html.
Note: You might notice that %D{%I:%M %p} expands to things like 11:35 PM. That's because I would like to use %P to get pm, but not every implementation of strftime supports %P. Worst case scenario: if you really want lowercase but %P is not supported, use your original command subsitution $(date +'%I:%M %P').
Also, I'm using %~ instead of %/, so you'll get ~/Desktop instead of /c/Users/johndoe/Desktop. Some like it, some don't. However, as I said, this is easily customizable.
I'm trying to read a file line-by-line in Ada, it's a XML text file. I'm following the instructions here:
http://rosettacode.org/wiki/Read_a_file_line_by_line#Ada
However there's a problem that annoys me: the "Get_Line" function seems to be unaware of byte-order marks and reads them as part of the text itself, which means that when I raed the lines, the first one will always start with some extra bytes that should not be there.
While removing the extra bytes manually from the string is no big deal it seems strange to me that a function dedicated to text input/output is unaware of BOMs, there must be a way to read a text file in ada without having to worry about this... is there?
Ada.Text_IO is specified to handle ISO-8859-1 encoded text, so ignoring an UTF-8 feature is the proper thing to do.
If Ada.Wide_Text_IO and Ada.Wide_Wide_Text_IO also output the byte-order-mark, when asked to read UTF-8 encoded text, then you should consider reporting it as a bug to GCC - but as there is quite a lot of implementation defined details for the text I/O packages in Ada, you should be ready for a "wont fix" answer.
One possibility is using the stream attributes and making a UTF_8 file-type to handle the BOM reading-and-discarding.
I have a csv, and each line reads as follows:
"http://www.videourl.com/video,video title,video duration,thumbnail,<iframe src=""http://embed.videourl.com/video"" frameborder=0 width=510 height=400 scrolling=no> </iframe>,tag 1,tag 2",,,,,,,,,,,,,,,,,,,,,,,,,,
Is there a program I can use to clean this up? I'm trying to import it to wordpress and map it to current fields, but it isn't functioning properly. Any suggestions?
Just use search and replace in this case. remove the commas at the end and then replace the remaining commas with ",".
Should anyone else have the same issue. Know that this solution will only work with data much like the example giving. If data has a lot of text and there are commas within the text that need kept. Then search replacing comma will not work. Using regex would be the next option and that can be done in Notepad ++
However I think the regex pattern depends on the data so not much point creating an example.
PHP could be used to explode each line also. Remove values that match a regex out of many i.e. URL, money. Then what is left could be (depending on the data again) just a block of text. That approach may not work if there are two or more columns with a lot of text
I'm using the diff command to compare two text files. They need to be literally matched.
So I use the diff:
diff binary.out binary.expected
(By the way, those files are NOT binary files. They are text file. I call them binary because that's the name of the project)
and got
Binary files binary.out and binary.expected differ
When I use another diff tool, the smartest of all (AKA human), and there's really nothing different between the two files.
Does anyone happen to know what's going on here?
Thanks.
diff from diffutils says the following about text/binary:
diff determines whether a file is text or binary by checking the
first few bytes in the file; the exact number of bytes is system
dependent, but it is typically several thousand. If every byte in
that part of the file is non-null, diff considers the file to be
text; otherwise it considers the file to be binary.
hence GNU diff have a quite open definition of what is text, and the use of the --text option to force it to treat the file as text should seldom be needed.
Have you checked if binary.out or binary.expected contains null characters? What version is your diff program?
Make sure to ignore white space in the diff options.
It may also see Unicode characters and interpret that as binary. See if your diff tool has an option to force text mode.