First variable replacing others - python-3.6

I have the following code:
from a3_functions import convert_date
date=int(input('Please enter the date in the format: ddmmyyyy'))
days, months, years=convert_date(date)
print("{0:02d}/{0:02d}/{0:04d}".format(days, months, years))
print(days)
print(months)
print(years)
But when I run it outputs the following:
Please enter the date in the format: ddmmyyyy03061314
03/03/0003
3
6
1314
Why does the formatted line print the first variable three times, even though each variable prints its own value when I print them individually?

Your format string uses explicit positional indices, and all three of them are 0, so every placeholder takes the first argument (days).
Try this format string instead (I only removed the 0 in front of each :, so the placeholders are filled in order):
>>> print("{:02d}/{:02d}/{:04d}".format(days, months, years))
03/06/1314
If you want to keep the positional arguments, you could just increment the numbers like this:
>>> print("{0:02d}/{1:02d}/{2:04d}".format(days, months, years))
03/06/1314
Alternatively, you could also name the positions to use an even more readable format string:
>>> print("{day:02d}/{month:02d}/{year:04d}".format(day=days, month=months, year=years))
03/06/1314
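To compare the variants side by side, here is a minimal sketch. The convert_date below is a hypothetical stand-in (the real a3_functions.convert_date is not shown in the question) that assumes the date arrives as a ddmmyyyy integer. The f-string line is an extra option available from Python 3.6 and is not part of the original answer.

```python
def convert_date(date):
    """Hypothetical stand-in for a3_functions.convert_date: split a ddmmyyyy integer."""
    days, rest = divmod(date, 1_000_000)   # leading digits are the day
    months, years = divmod(rest, 10_000)   # next two digits are the month, last four the year
    return days, months, years


days, months, years = convert_date(int("03061314"))

# Every placeholder asks for argument 0, so days is printed three times.
repeated = "{0:02d}/{0:02d}/{0:04d}".format(days, months, years)

# Automatic numbering, explicit indices 0/1/2, and named fields all pick distinct arguments.
automatic = "{:02d}/{:02d}/{:04d}".format(days, months, years)
explicit = "{0:02d}/{1:02d}/{2:04d}".format(days, months, years)
named = "{day:02d}/{month:02d}/{year:04d}".format(day=days, month=months, year=years)

# f-string (Python 3.6+): the variable names go directly into the placeholders.
fstring = f"{days:02d}/{months:02d}/{years:04d}"

print(repeated)   # 03/03/0003
print(automatic)  # 03/06/1314
print(explicit, named, fstring)
```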
PyFormat has a great overview of the new-style format strings:
With new style formatting it is possible (and in Python 2.6 even
mandatory) to give placeholders an explicit positional index.
This allows for re-arranging the order of display without changing the
arguments.
This operation is not available with old-style formatting.
New: '{1} {0}'.format('one', 'two')
Output: two one
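As a small illustration of what explicit indices allow, the sketch below reorders and repeats arguments, and shows that automatic and explicit numbering cannot be mixed in one format string. The variable names are made up for this example.

```python
parts = ("one", "two")

# Explicit indices can reorder the arguments...
swapped = "{1} {0}".format(*parts)

# ...or reuse the same argument more than once, which is what happened with {0} in the question.
doubled = "{0} {0} {1}".format(*parts)

# Mixing automatic ({}) and explicit ({0}) numbering in one string is rejected.
try:
    "{} {0}".format(*parts)
    mixed_error = None
except ValueError as exc:
    mixed_error = exc

print(swapped)  # two one
print(doubled)  # one one two
print(type(mixed_error).__name__)  # ValueError
```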

Related

Why is df.to_string printing out weird labels?

If I run the code like so:
print(df['Col1'].to_string(index=False))
I get:
1
2
3
Now if I use the code like so (without print):
s = df['Col1'].to_string(index=False)
s
I get:
'1\n2\n3'
Where are the backslashes and 'n' characters coming from? What is the appropriate way of listing a single column, with the ultimate goal of assigning it to an array?
If you want to convert a data column to a list (array), use this code:
col_list = df['Col1'].values
or
col_list = list(df['Col1'])
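\n is the escape sequence for a newline, found in many languages that support escape sequences. The string returned by to_string contains a newline character between the values. When you evaluate s on its own in the interactive interpreter, it shows the string's repr, in which each newline appears as \n; print writes the string itself, so each newline becomes an actual line break.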
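The difference between the two displays can be reproduced without pandas. In this sketch, s is a plain string standing in for the result of to_string(index=False), which is an assumption made to keep the example self-contained.

```python
# Stand-in for df['Col1'].to_string(index=False); the real call needs pandas and a DataFrame.
s = "1\n2\n3"

# The interactive prompt shows repr(s), where each newline is written as the escape \n.
shown_at_prompt = repr(s)
print(shown_at_prompt)  # '1\n2\n3'

# print writes the characters themselves, so each newline becomes a line break.
print(s)

# The string holds 5 characters: three digits and two newlines, no backslashes.
length = len(s)

# If only the string is available, the values can be recovered by splitting on lines.
values = [int(line) for line in s.splitlines()]
print(values)  # [1, 2, 3]
```

When the DataFrame itself is available, reading the column directly (as in the lines above) avoids the round trip through a string.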

Two PASTE functions in a character vector

attach.files = c(paste("/users/joesmith/nosection_", currentDate,".csv",sep=""),
paste("/users/joesmith/withsection_", currentDate,".csv",sep=""))
Basically, if I wrote the file names out manually like this:
c("nosection_051418.csv", "withsection_051418.csv")
it would work fine, but since I'm automating this to run every day, I can't do that.
I'm trying to attach files in an automated email but when I structure it like this, it doesn't work. How can I recreate this so that the character vector accepts it?
I read your example as needing "parallel" inputs: the path stem, the first portion of the file name, and the date portion of each full path. Here is an illustration that uses a two-item vector and a one-item vector (produced by Sys.Date(), replacing your currentDate) to fill the %s positions in the sprintf string (as suggested by @Gregor). Note that Sys.Date() prints as yyyy-mm-dd; format(Sys.Date(), "%m%d%y") would give the 051418 style from your example.
sprintf("/users/joesmith/%s_%s.csv", c("nosection", "withsection"), Sys.Date() )
[1] "/users/joesmith/nosection_2018-05-14.csv" "/users/joesmith/withsection_2018-05-14.csv"

toString of HH:MM in R

I have a data frame with "15:15" in dataframe[14,3], but when I call toString on it, it prints:
911
What could be the problem here? If I print dataframe[14,3], it correctly prints 15:15.
I am trying to paste three variables together, and the one in this format appears as a whole number (I do not understand how it relates to the original string).
The problem most likely comes from the class of the column. If it is a factor, the value can be coerced to its underlying integer code (here 911) rather than its label when it is converted. One option is to convert the column to character first and then apply the function:
dataframe[,3] <- as.character(dataframe[,3])

Using grep() with Unicode characters in R

(strap in!)
Hi, I'm running into issues involving Unicode encoding in R.
Basically, I'm importing data sets that contain Unicode (UTF-8) characters, and then running grep() searches to match values. For example, say I have:
bigData <- c("foo","αβγ","bar","αβγγ (abgg)", ...)
smallData <- c("αβγ","foo", ...)
What I'm trying to do is take the entries in smallData and match them to entries in bigData. (The actual sets are matrixes with columns of values, so what I'm trying to do is find the indexes of the matches, so I can tell what row to add the values to.) I've been using
matches <- grepl(smallData[i], bigData, fixed=T)
which usually results in a vector of matches. For i=2, it would return 1, since "foo" is element 1 of bigData. This is peachy and all is well. But RStudio seems to not be dealing with unicode characters properly. When I import the sets and view them, they use the character IDs.
dataset <- read_csv("[file].csv", col_names = FALSE, locale = locale())
Using View(dataset) shows "aß<U+03B3>" instead of "αβγ." The same goes for
dataset[1]
A tibble: 1x1 <chr>
[1] aß<U+03B3>
print(dataset[1])
A tibble: 1x1 <chr>
[1] aß<U+03B3>
However, and this is why I'm stuck rather than just adjusting the encoding:
paste(dataset[1])
[1] "αβγ"
Encoding(toString(dataset[1]))
[1] "UTF-8"
So it appears that R is recognizing in certain contexts that it should display Unicode characters, while in others it just sticks to--ASCII? I'm not entirely sure, but certainly a more limited set.
In any case, regardless of how it displays, what I want to do is be able to get
grep("αβγ", bigData)
[1] 2 4
However, none of the following work:
grep("αβ", bigData) #(Searching the two letters that do appear to convert)
grep("<U+03B3>",bigData,fixed=T) #(Searching the code ID itself)
grep("αβ", toString(bigData)) #(converts the whole thing to one string)
grep("\\β", bigData) #(only mentioning because it matches, bizarrely, to ß)
The only solution I've found is:
grep("\u03B3", bigData)
[1] 2 4
Which is not ideal for a couple reasons, most jarringly that it doesn't look like it's possible to just take every <U+####> and replace it with \u####, since not every Unicode character is converted to the <U+####> format, but none of them can be searched. (i.e., α and ß didn't turn into their unicode keys, but they're also not searchable by themselves. So I'd have to turn them into their keys, then alter their keys to a form that grep() can use, then search.)
That means I can't just regex the keys into a searchable format--and even if I could, I have a lot of entries including characters that'd need to be escaped (e.g., () or ), so having to remove the fixed=T term would be its own headache involving nested escapes.
Anyway...I realize that a significant part of the problem is that my set apparently involves every sort of character under the sun, and it seems I have thoroughly entrapped myself in a net of regular expressions.
Is there any way of forcing a search with (arbitrary) unicode characters? Or do I have to find a way of using regular expressions to escape every ( and α in my data set? (coordinate to that second question: is there a method to convert a unicode character to its key? I can't seem to find anything that does that specific function.)

Subsetting different length strings by spaces in R

In R, I currently have a long vector of dates and times saved as strings. Depending on the date, a string can be 16, 17, or 18 characters long, so I cannot just take the first 8 or 10 characters, since that would not work for every date. But since there is a space between the date and time values, how can I subset each string so that I only get the characters before the space?
Just to show what the strings look like now, here are a couple of examples:
"4/18/1950 0:00:00"
"6/8/1951 0:00:00"
"11/15/1951 0:00:00"
I'm not sure if you are familiar with regular expressions. If not, they are worth learning, as they are extremely useful:
tutorial
As akrun pointed out you can use the "sub" command to remove the space and everything after it like this:
sub(" .*","",stringVar)
First argument is the regular expression code which matches the space and everything that follows.
Second argument is what you want to replace the match with, in this case nothing
Third argument is the input string
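Alternatively, you can split each string at the space and select the first piece using "strsplit". Because strsplit returns a list with one element per input string, take the first piece of each element:
sapply(strsplit(stringVar, " "), "[", 1)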
