On loading a yaml file with values such as 25.0, the .0 is ignored and what I get is 25. Is it possible to force yaml to consider the value as it is without manipulating the data? I have tried enclosing the values in single/double quotes, but that does not work.
[Edit]: I am using the yaml parser package for the R programming language. The data type returned is double. If I set the value to 25.2, I get back the same value. How can I force YAML/R to read the information in YAML as it is?
Your problem is that the parser recognises that these are floating point numbers and in R there is no difference between 25.0 and 25. Try this for example:
identical(25.0, 25)
25.0 and 25 are just two different representations of the same floating point number. If you want to retain the form in which the data is supplied you will have to read them in as strings (which you can later convert to numeric if you need to perform calculations). You can do this with a handler:
yaml.load("25.0", handlers=list("float#fix"=function(x) as.character(x)))
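The same trade-off shows up outside R. Below is a minimal sketch in Python (not the R yaml package; the variable names are made up for illustration) of why the trailing zero only survives while the value is kept as text and converted on demand:

```python
from decimal import Decimal

text = "25.0"              # the value exactly as written in the file

as_float = float(text)     # binary floating point: the ".0" carries no information
same_number = (as_float == 25)

# Keeping the text (or a decimal type built from it) retains the written form.
as_decimal = Decimal(text)
kept_form = str(as_decimal)

print(same_number)   # True: 25.0 and 25 are the same number
print(kept_form)     # 25.0
```

The string can still be converted with float(text) at the point where arithmetic is needed.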
Maybe this will help: http://tolstoy.newcastle.edu.au/R/help/06/05/28016.html
It suggests changing the digits setting, and possibly rounding the numbers as well, to avoid too many decimal places.
options(digits=2)
format(round(x, 2), nsmall = 2)
When I want to format a number with a fixed number of digits using sprintf(), I have to use format strings like "%.3f" or "%2d". The Qt manual says I should use the QString::arg() function instead of sprintf():
QString("%1").arg(QString::number(1.3));
So how do I specify the number of digits to be shown in the resulting string? Thanks :-)
QString::arg
You can specify the formatting with QString::arg like:
%.3f: QString("%1").arg(1.3, 0, 'f', 3), where the second argument is the field width (the minimum width of the resulting string, 0 here), the third is the number format (here 'f' means fixed-point notation, with no exponent), and the fourth is the precision (3 decimal places).
%2d: QString("%1").arg(42, 2).
Note: When using QString::arg you must be careful to pass the appropriate data type. For example, if you want to format the number 50 with one zero decimal, you must use QString("%1").arg(50.0, 0, 'f', 1). If you use QString("%1").arg(50, 0, 'f', 1) instead (note 50 is an integer), the code won't compile due to a conflict of arguments.
This is the preferred way to do it in Qt, especially if the formatting string has to be localized. One of the main reasons is that the placeholders for values have an index (%1, %2...), which allows them to appear in any order in the string while keeping their meaning (you may need to change the order in some languages). With sprintf-like functions the order of the arguments matters.
QString::asprintf
Nevertheless, although it is not recommended in new Qt code, you can use the sprintf-like QString::asprintf (do not use QString::sprintf, which is deprecated). For example, QString::asprintf("%.3f", 1.3).
I don't want the display format to look like this: 2.150209e+06
The format I want is 2150209
because when I export data, a format like 2.150209e+06 causes me a lot of trouble.
I did some searching and found that this function could help me:
formatC(numeric_summary$mean, digits=1,format="f")
I am wondering whether I can set an option to change this permanently. I don't want to apply this function to every variable of my data, because I have this problem very often.
One more question: can I change the class of all integer variables to numeric automatically? With the integer format, summing a whole column often causes trouble and reports "integer overflow - use sum(as.numeric(.))".
I don't need integer format, all I need is numeric format. Can I set options to change integer class to numeric please?
I don't know how you are exporting your data, but when I use write.csv with a data frame containing numeric data, I don't get scientific notation, I get the full number written out, including all decimal precision. Actually, I also get the full number written out even with factor data. Have a look here:
df <- data.frame(c1=c(2150209.123, 10001111),
c2=c('2150209.123', '10001111'))
write.csv(df, file="C:\\Users\\tbiegeleisen\\temp.txt")
Output file:
"","c1","c2"
"1",2150209.123,"2150209.123"
"2",10001111,"10001111"
Update:
It is possible that you are just dealing with a data rendering issue. What you see in the R console or in your spreadsheet does not necessarily reflect the precision of the underlying data. For instance, if you are using Excel, you can highlight a numeric cell, press CTRL + 1 and then change the format. You should then be able to see the full precision of the underlying data. Similarly, the number you see printed in the R console might use scientific notation only for ease of reading (scientific notation was invented partially for this very reason).
Thank you all.
For the example above, I tried this:
df <- data.frame(c1=c(21503413542209.123, 10001111),
c2=c('2150209.123', '100011413413111'))
c1 in df is scientific notation, c2 is not.
Then I ran write.csv(df, file="C:\\Users\\tbiegeleisen\\temp.txt").
It does output all digits.
Can I disable scientific notation in R, please? It still causes me trouble, even though all digits were exported to the txt file.
Sometimes I want to visually compare two big numbers. For example, if I run
df <- data.frame(c1=c(21503413542209.123, 21503413542210.123),
c2=c('2150209.123', '100011413413111'))
df will be
c1 c2
2.150341e+13 2150209.123
2.150341e+13 100011413413111
The two values for c1 are actually different, but I cannot tell them apart in R unless I export them to txt. The numbers here are fake, but I encounter the same problem every day.
I'm running into some problems with the R functions as.character() and paste(): they do not give back what they're being fed...
as.character(1415584236544311111)
## [1] "1415584236544311040"
paste(1415584236544311111)
## [1] "1415584236544311040"
What could be the problem, or is there a workaround to paste my number as a string?
update
I found that using the bit64 library allowed me to retain the extra digits I needed with the function as.integer64().
Remember that numbers are stored in a fixed number of bytes based upon the hardware you are running on. Can you show that your very big integer is treated properly by normal arithmetic operations? If not, you're probably trying to store a number too large to fit in the number of bytes your R install uses for it. The number you see is just what could fit.
You could try storing the number as a double which is technically less precise but can store larger numbers in scientific notation.
EDIT
Consider the answers in "long/bigint/decimal equivalent datatype in R", which list solutions including arbitrary-precision packages.
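To make the "what could fit" point concrete, here is a small sketch in Python rather than R, under the assumption that the value passes through a 64-bit IEEE 754 double (53 significant bits). It reproduces the rounded value from the question and shows that the digits survive when the number is kept as an exact integer or as text:

```python
import sys

original = 1415584236544311111      # Python ints are arbitrary precision, so this is exact

# Assumption: the number is stored as a 64-bit IEEE 754 double.
as_double = float(original)         # rounded to the nearest representable double
round_trip = int(as_double)

mantissa_bits = sys.float_info.mant_dig   # significant bits in a double
largest_exact = 2 ** mantissa_bits        # every integer up to here is exactly representable

print(round_trip)                   # 1415584236544311040
print(original > largest_exact)     # True: beyond the exactly representable integer range
print(str(original))                # 1415584236544311111, digits kept as an exact integer
```

At this magnitude neighbouring doubles are 256 apart, which is why the last few digits change.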
I am reading a csv file with some really big numbers like 1327707999760, but R automatically converts it into 1.32771e+12. I've tried to assign it a double class but it didn't work because it's already a rounded value.
I've checked other posts like "Preserving large numbers". People said "It's not in a "1.67E+12 format", it just won't print entirely using the defaults. R is reading it in just fine and the whole number is there." But when I tried to do some arithmetic on them, the results were just not right.
For example:
test[1,8]
[1] 1.32681e+12
test[2,8]
[1] 1.32681e+12
test[2,8]-test[1,8]
[1] 0
But I know they are different numbers!
That's not large. It is merely a representation problem. Try this:
options(digits=22)
options('digits') defaults to 7, which is why you are seeing what you do. All thirteen digits are being read and stored, but not printed by default.
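The difference between the stored value and the printed value can be sketched in Python as well (not R; the second number below is a made-up neighbour used only for illustration). Formatting with few significant digits makes two different doubles look identical, while the stored values still differ:

```python
a = 1327707999760.0    # the value from the question
b = 1327707999761.0    # hypothetical neighbouring value, for illustration only

# Roughly what a short default display shows: 6 significant digits.
short_a = format(a, ".6g")
short_b = format(b, ".6g")

# The full stored value, with no exponent.
full_a = format(a, ".0f")
full_b = format(b, ".0f")

diff = b - a

print(short_a, short_b)   # 1.32771e+12 1.32771e+12
print(full_a, full_b)     # 1327707999760 1327707999761
print(diff)               # 1.0
```

Both values are far below 2^53, so a double holds them exactly and the subtraction is exact too.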
Excel allows custom formats: Format/Cells/Custom and enter #0
I have a SQLite3 table with a column having format DECIMAL(7,2), but whenever I select rows with values not having a non-zero 2nd decimal place (eg. 3.00 or 3.10), the result always has trailing zero(s) missing (eg. 3 or 3.1). Is there any way that I can apply a formatting function in the SELECT statement so that I get the required 2dp? I have tried ROUND(), but this has no effect. Otherwise I have to keep converting the resulting column values into the required format for display (using Python in my case) every time I do a SELECT statement, which is a real pain.
I don't even mind if the result is string instead of numeric, as long as it has the right number of decimal places.
Any help would be appreciated.
Alan
SQLite internally uses IEEE binary floating point arithmetic, which truly does not lend itself well to maintaining a particular number of decimals. To get that type of decimal handling would require one of:
Fixed point math, or
IEEE decimal floating point (rather uncommon), or
Handling everything as strings.
Formatting the values (converting from floating point to string) after extraction is the simplest way to implement things. You could even hide that inside some sort of wrapper so that the rest of the code doesn't have to deal with the consequences. But if you're going to do arithmetic on the value afterwards then you're better off not formatting and instead working with the value as returned by the query, because the format and reconvert back to binary floating point (which Python uses, just like the vast majority of other modern languages) loses lots of information in the reduced precision.
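As a rough sketch of both options, assuming Python's built-in sqlite3 module and an SQLite build that provides the printf() SQL function (available in releases from recent years), the formatting can be done either in the SELECT or after extraction. The table name prices and column name amount are invented for this example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and column names, used only for this sketch.
conn.execute("CREATE TABLE prices (amount DECIMAL(7,2))")
conn.executemany("INSERT INTO prices VALUES (?)", [(3.00,), (3.10,), (3.14,)])

# Raw values as returned by the query: trailing zeros are not preserved.
raw = [row[0] for row in conn.execute("SELECT amount FROM prices ORDER BY rowid")]

# Option 1: format inside the SELECT; the result column is text.
in_sql = [row[0] for row in conn.execute(
    "SELECT printf('%.2f', amount) FROM prices ORDER BY rowid")]

# Option 2: keep the numbers and format only when displaying them.
in_python = [f"{value:.2f}" for value in raw]

conn.close()

print(in_sql)      # ['3.00', '3.10', '3.14']
print(in_python)   # ['3.00', '3.10', '3.14']
```

Option 2 leaves the numeric values untouched for later arithmetic, which matches the advice above; option 1 is convenient when the result is only ever displayed.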