How to format output with write.table - r

I apologize if this is an easy question. I am trying to use points generated by the MixSim package in R as the sampling points for an old Fortran program. I prefer the way MixSim creates sampling points, but I use the Fortran program to simulate vegetation data across many levels of beta diversity, alpha diversity, etc.
I generate my data in MixSim by:
d=MixSim(BarOmega=0.000,MaxOmega=0.000,K=4,p=3,ecc=0.99,int=c(10,90),PiLow=0.1)
m=simdataset(n=10,Pi=d$Pi,Mu=d$Mu,S=d$S)
If I use write.table, this is what I get:
write.table(m$X,file="example.txt",quote=F,row.names=F)
V1 V2 V3
87.540626647788 62.8444539443256 17.0026406651813
83.9939847940881 65.0069747775257 18.8676229149976
84.4477456535804 63.6892673685408 18.6384437248469
84.7684968694547 65.4610993744652 17.6252989584773
13.4600970937604 16.9988156469822 49.6810813619893
23.9952555783055 18.6598302958281 48.4204641715953
17.0523647853253 11.518037157788 43.0417655739052
57.5107395863171 40.4872578216636 24.938188234695
11.8320140526743 52.9077915021041 34.5723480775864
12.8754032313702 53.1795899126135 34.1309377040482
But I need my output to look exactly like this for the Fortran program to accept it.
***** SAMPLING PATTERN FILE
50 3 1 0.0000
50
87.54 62.84 17.00
83.99 65.00 18.86
84.44 63.68 18.63
84.76 65.46 17.62
13.46 16.99 49.68
23.99 18.65 48.42
17.05 11.51 43.04
57.51 40.48 24.93
11.83 52.90 34.57
12.87 53.17 34.13
I should note that I already know how to do the rounding in R:
m=round(m$X,digits=2)
Is my best bet simply to use write.table and then format the file "by hand"? Most of my models will then be created in a loop that I wrote in Fortran. If so, I will only need to generate maybe a few dozen models in MixSim and then format them. All models will have considerably more than 10 points.

(Tried a variety of things with write.table but always got undesired truncation of the decimal values when the trailing figures were nn.00.)
Use cat for the file preamble and write.fwf from the gdata package:
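# Header lines copied from the required format; dat is assumed to be the matrix of points (e.g. m$X)
top <- "***** SAMPLING PATTERN FILE\n50 3 1 0.0000\n50\n"
cat(top, file = 'out.txt')
# install.packages('gdata')  # run once if gdata is not installed
# signif(dat, 4) gives two decimals only for values between 10 and 100; round(dat, 2) covers other ranges
gdata::write.fwf(signif(dat, 4), file = "out.txt", append = TRUE, quote = FALSE, sep = "\t",
colnames = FALSE)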
-------result----------
***** SAMPLING PATTERN FILE
50 3 1 0.0000
50
87.54 62.84 17.00
83.99 65.01 18.87
84.45 63.69 18.64
84.77 65.46 17.63
13.46 17.00 49.68
24.00 18.66 48.42
17.05 11.52 43.04
57.51 40.49 24.94
11.83 52.91 34.57
12.88 53.18 34.13
If you need padding on the LHS you can use width=7 or 8.
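For comparison, here is a minimal base R sketch that avoids the extra package by formatting each value with sprintf. It assumes the three header lines can be written verbatim as shown in the question; the names pts, header, body and out_file are made up for illustration, and only two rows of the question's data are used. Note that values are rounded, not truncated.

```r
# Two illustrative rows taken from the question's data
pts <- matrix(c(87.540626647788, 62.8444539443256, 17.0026406651813,
                13.4600970937604, 16.9988156469822, 49.6810813619893),
              ncol = 3, byrow = TRUE)

# Header lines copied verbatim from the required format in the question
header <- c("***** SAMPLING PATTERN FILE", "50 3 1 0.0000", "50")

# "%.2f" always prints two decimals, so trailing zeros such as "17.00" are kept
body <- apply(pts, 1, function(r) paste(sprintf("%.2f", r), collapse = " "))

out_file <- tempfile(fileext = ".txt")  # hypothetical output path
writeLines(c(header, body), out_file)

cat(readLines(out_file), sep = "\n")
```

If the Fortran program reads fixed-width fields, a format such as "%7.2f" pads each value on the left in the same way as the width argument mentioned above.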

Related

Exporting R data - column classed as data.frame exports as blank with write_xlsx

I am new to R. I looked for other solutions, such as converting the data type or exporting as CSV (which generated weird formatting), but was unable to find one. I think I am overlooking something simple. Thank you in advance!
I exported my data frame (dfCCVul) to Excel via write_xlsx. The data exported fine, except for the column "logPopDens.PopDensity", which I had created by taking the log of another column (PopDensity). That column exports blank.
This is a snippet of the data :
PerPoverty PerNotWhit PerServWor logPopDens.PopDensity
13.1 42.5 12.92 6.288305
30.2 48.9 13.03 4.861129
10.1 17.1 9.16 4.819233
26.3 49.8 23.32 4.862599
16.6 42.8 20.24 5.02263
12.5 25.6 8.28 4.448282
15.3 20.3 5.89 5.048188
When I check the structure of the new column, it looks like a data frame nested inside the data frame:
$ logPopDens:'data.frame': 1315 obs. of 1 variable:
..$ PopDensity: num 3.52 3.07 2.64 1.16 2.27 ...
When I check the class, the output is:
> class(dfCCVul$logPopDens)
[1] "data.frame"
My thought was to convert the datatype, but I've received a series of errors after trying different syntax, for example:
> data$logPopDens <- as.numeric(as.character(data$logPopDens))
Error in data$col11 : object of type 'closure' is not subsettable
> data$logPopDens.PopDensity <- as.numeric(as.character(data$logPopDens.PopDensity))
Error in data$logPopDens.PopDensity :
object of type 'closure' is not subsettable
Is there another way to export the values of the logPopDens?
Thank you!
dfCCVul$logPopDens is a data frame, so convert it into a vector. One way is with unlist:
dfCCVul$logPopDens <- unlist(dfCCVul$logPopDens)
Or I think this should work as well.
dfCCVul$logPopDens <- dfCCVul$logPopDens$PopDensity
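A small sketch of how such a nested column can arise and how to flatten it. This assumes the column was created with single-bracket indexing, which the question does not confirm; the data frame df and its values are invented for illustration.

```r
# Hypothetical data standing in for dfCCVul
df <- data.frame(PopDensity = c(538.2, 129.2, 123.9))

# df["PopDensity"] is a one-column data frame, so log() returns a data frame
# and the new column ends up nested inside df
df$logPopDens <- log(df["PopDensity"])
was_nested <- is.data.frame(df$logPopDens)

# Pull out the inner numeric vector so the column is a plain numeric column
df$logPopDens <- df$logPopDens$PopDensity
str(df)
```

Creating the column with df$PopDensity or df[["PopDensity"]] instead of df["PopDensity"] would avoid the nesting in the first place.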

How to convert web scraped data into numeric?

I'm trying to convert data scraped from Book Depository's best-selling books page into numeric data so that I can graph it.
My code currently is:
selector <- ".rrp"
library(rvest)
url <- "https://www.bookdepository.com/bestsellers"
doc <- read_html(url)
prices <- html_nodes(doc, selector)
html_text(prices)
library(readr)
Spiral <- read_csv("C:/Users/Ellis/Desktop/INFO204/Spiral.csv")
View(Spiral)
My attempt to clean the data:
text <- gsub('[$NZ]', '', Spiral) # removes NZ$ from data
But the data now looks like this:
[1] "c(\"16.53\", \"55.15\", \"36.39\", \"10.80\", \"27.57\", \"34.94\",
\"27.57\", \"22.06\", \"22.00\", \"16.20\", \"22.06\", \"22.06\",
\"19.84\", \"19.81\", \"27.63\", \"22.06\", \"10.80\", \"27.57\",
\"22.06\", \"22.94\", \"16.53\", \"25.36\", \"27.57\", \"11.01\",
\"14.40\", \"15.39\")"
and when I try to run:
as.numeric(text)
I get:
Warning message:
NAs introduced by coercion
How do I clean the data up in such a way that NZ$ is removed from the price and I'm able to plot the 'cleaned data'?
You have a single string that contains code, not numbers. You need to evaluate the code first.
as.numeric(eval(parse(text=text)))
[1] 16.53 55.15 36.39 10.80 27.57 34.94 27.57 22.06 22.00 16.20 22.06 22.06 19.84
[14] 19.81 27.63 22.06 10.80 27.57 22.06 22.94 16.53 25.36 27.57 11.01 14.40 15.39
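The single string appears because gsub was applied to the whole data frame, which is coerced to character one column at a time. A minimal sketch of the difference, assuming the prices sit in a column called price (a hypothetical name, since the layout of Spiral.csv is not shown):

```r
# Hypothetical stand-in for the Spiral data frame
spiral <- data.frame(price = c("NZ$16.53", "NZ$55.15", "NZ$10.80"),
                     stringsAsFactors = FALSE)

# Applied to the data frame: one string per column, containing deparsed code
whole <- gsub("[$NZ]", "", spiral)
length(whole)

# Applied to the column: one cleaned string per price, ready for as.numeric()
cleaned <- as.numeric(gsub("[$NZ]", "", spiral$price))
cleaned
```

Cleaning the column directly gives the numbers without needing eval(parse()).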
Several options to get the desired outcome:
# option 1
as.numeric(gsub('(\\d+.\\d+).*', '\\1', html_text(prices)))
# option 2
as.numeric(gsub('\\s.*$', '', html_text(prices)))
# option 3
library(readr)
parse_number(html_text(prices))
all result in:
[1] 21.00 9.99 31.49 19.49 6.49 13.50 22.49 11.99 11.49 7.99 10.99 7.99 10.99 9.99 7.99 9.99 11.49 8.49 11.99 9.99 14.95 8.99 20.13 13.50 8.49 6.49
NOTES:
The result is a vector of prices in euros. Due to localisation, prices may differ when you scrape from another country.
When the decimal separator is a comma (,) in html_text(prices), the first two options can be changed to as.numeric(gsub('(\\d+),(\\d+).*', '\\1.\\2', html_text(prices))) to get the correct result. In that case the third option should be changed to: parse_number(html_text(prices), locale = locale(decimal_mark = ','))
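The comma-decimal case can also be handled in base R without readr. The strings below are made up for illustration and are not scraped output; the sketch assumes there is no thousands separator.

```r
# Hypothetical price strings with a comma as decimal mark
raw <- c("16,53 EUR", "9,99 EUR")

# Keep only digits and the comma, then swap the comma for a dot
digits <- gsub("[^0-9,]", "", raw)
prices_num <- as.numeric(sub(",", ".", digits, fixed = TRUE))
prices_num
```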

getSymbols.yahoo returns just the symbol name without any data

I am trying to download some stocks data but the quantmod functions don't seem to work. For example:
getSymbols.yahoo("F",env= globalenv(), return.class = 'xts',
from = "2017-01-01",
to = Sys.Date())
[1] "F"
The package is up to date, and I have set the locale for dates with Sys.setlocale("LC_TIME", "C"). I also tried getSymbols.google and changing the return class, but neither works.
getSymbols() currently (as-of 0.4-10) loads the data into an environment, just like the load() function. In quantmod 0.5-0, it will return the data, like read.table() and most other functions.
If you want getSymbols() to return the data, you can set auto.assign = FALSE.
Data <- getSymbols("F", from = "2017-01-01", to = Sys.Date(), auto.assign = FALSE)
Also note that you should not call getSymbols.yahoo() directly (as it says in ?getSymbols.yahoo).
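The comparison with load() can be tried in base R without a network connection. This sketch only illustrates the "assign into an environment and return the name" behaviour described above; it does not call quantmod, and the object name and values are illustrative.

```r
# Save an illustrative object to a temporary file, then remove it
prices <- c(12.59, 13.17, 12.77)
f <- tempfile(fileext = ".RData")
save(prices, file = f)
rm(prices)

# load() puts the object into the given environment and returns only its name
e <- new.env()
returned <- load(f, envir = e)
returned   # the name of the object, not the data
e$prices   # the data itself lives in the environment
```

With auto.assign = FALSE, getSymbols() behaves like an ordinary function call instead, and the result is whatever you assign it to.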
That's correct. Now if you want to see the historical data just type F:
> head(F)
F.Open F.High F.Low F.Close F.Volume F.Adjusted
2017-01-03 12.20 12.60 12.13 12.59 40510800 12.22555
2017-01-04 12.77 13.27 12.74 13.17 77638100 12.78876
2017-01-05 13.21 13.22 12.63 12.77 75628400 12.40034
2017-01-06 12.80 12.84 12.64 12.76 40315900 12.39063
2017-01-09 12.79 12.86 12.63 12.63 39183400 12.26440
2017-01-10 12.70 13.02 12.66 12.85 58703500 12.47803

txt to XTS format

I have a text file with close data that I am trying to convert to XTS format.
I am able to call it into R, but cannot figure out a way to convert this data to XTS format. Below is the sample data I am working with.
05/31/2017,32.78,FCOM
05/30/2017,32.72,FCOM
05/26/2017,32.56,FCOM
05/25/2017,32.57,FCOM
05/24/2017,32.47,FCOM
05/31/2017,35.63,FDIS
05/30/2017,35.71,FDIS
05/26/2017,35.67,FDIS
05/25/2017,35.54,FDIS
05/24/2017,35.23,FDIS
05/31/2017,18.17,FENY
05/30/2017,18.26,FENY
05/26/2017,18.53,FENY
05/25/2017,18.51,FENY
05/24/2017,18.90,FENY
05/31/2017,36.52,FHLC
05/30/2017,36.40,FHLC
05/26/2017,36.50,FHLC
05/25/2017,36.62,FHLC
05/24/2017,36.41,FHLC
05/31/2017,34.28,FIDU
05/30/2017,34.34,FIDU
05/26/2017,34.33,FIDU
05/25/2017,34.31,FIDU
05/24/2017,34.17,FIDU
05/31/2017,30.56,FMAT
05/30/2017,30.66,FMAT
05/26/2017,30.68,FMAT
05/25/2017,30.62,FMAT
05/24/2017,30.70,FMAT
05/31/2017,34.26,FNCL
05/30/2017,34.60,FNCL
05/26/2017,34.86,FNCL
05/25/2017,34.90,FNCL
05/24/2017,34.85,FNCL
05/31/2017,23.96,FREL
05/30/2017,23.96,FREL
05/26/2017,24.02,FREL
05/25/2017,24.21,FREL
05/24/2017,24.16,FREL
Thank you in advance for any assistance you can provide me with!
Use the split argument to read.zoo to indicate which column contains the data that should be used to create columns.
x <- read.zoo(text = "05/31/2017,32.78,FCOM
05/30/2017,32.72,FCOM
05/26/2017,32.56,FCOM
05/25/2017,32.57,FCOM
05/24/2017,32.47,FCOM
05/31/2017,35.63,FDIS
05/30/2017,35.71,FDIS
05/26/2017,35.67,FDIS
05/25/2017,35.54,FDIS
05/24/2017,35.23,FDIS
05/31/2017,18.17,FENY
05/30/2017,18.26,FENY
05/26/2017,18.53,FENY
05/25/2017,18.51,FENY
05/24/2017,18.90,FENY
05/31/2017,36.52,FHLC
05/30/2017,36.40,FHLC
05/26/2017,36.50,FHLC
05/25/2017,36.62,FHLC
05/24/2017,36.41,FHLC
05/31/2017,34.28,FIDU
05/30/2017,34.34,FIDU
05/26/2017,34.33,FIDU
05/25/2017,34.31,FIDU
05/24/2017,34.17,FIDU
05/31/2017,30.56,FMAT
05/30/2017,30.66,FMAT
05/26/2017,30.68,FMAT
05/25/2017,30.62,FMAT
05/24/2017,30.70,FMAT
05/31/2017,34.26,FNCL
05/30/2017,34.60,FNCL
05/26/2017,34.86,FNCL
05/25/2017,34.90,FNCL
05/24/2017,34.85,FNCL
05/31/2017,23.96,FREL
05/30/2017,23.96,FREL
05/26/2017,24.02,FREL
05/25/2017,24.21,FREL
05/24/2017,24.16,FREL", sep = ",", format = "%m/%d/%Y", split = 3)
Setting split = 3 tells read.zoo to use the 3rd column in the file to create columns. Then x is a zoo object:
R> x
FCOM FDIS FENY FHLC FIDU FMAT FNCL FREL
2017-05-24 32.47 35.23 18.90 36.41 34.17 30.70 34.85 24.16
2017-05-25 32.57 35.54 18.51 36.62 34.31 30.62 34.90 24.21
2017-05-26 32.56 35.67 18.53 36.50 34.33 30.68 34.86 24.02
2017-05-30 32.72 35.71 18.26 36.40 34.34 30.66 34.60 23.96
2017-05-31 32.78 35.63 18.17 36.52 34.28 30.56 34.26 23.96
You can convert x to xts using x <- as.xts(x).
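As a rough base R equivalent of the split step, the long data can be read with read.csv and spread into one column per symbol with tapply. The column names date, close and symbol are assumptions, and only a few rows of the sample are used. The xts conversion at the end runs only if the xts package is installed.

```r
txt <- "05/31/2017,32.78,FCOM
05/30/2017,32.72,FCOM
05/31/2017,35.63,FDIS
05/30/2017,35.71,FDIS"

# Column names are hypothetical; the file itself has no header
d <- read.csv(text = txt, header = FALSE,
              col.names = c("date", "close", "symbol"),
              stringsAsFactors = FALSE)
d$date <- as.Date(d$date, format = "%m/%d/%Y")

# One row per date, one column per symbol (each cell holds a single value)
wide <- tapply(d$close, list(format(d$date), d$symbol), function(v) v[1])
wide

# Optional conversion, skipped when xts is not available
if (requireNamespace("xts", quietly = TRUE)) {
  x <- xts::xts(wide, order.by = as.Date(rownames(wide)))
}
```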

Error on R at trying to get median from values in a text file

I used Python to create a simple text file with 100 real numbers between 0 and 10 (one value per line).
I then read it into a variable 'a' in R with the 'read.table()' function.
The mean() function works fine, but the median() function returns the following error when given 'a' as its argument (my base R is the PT_BR version, so I am translating the error messages into English; I don't know whether they match the original English wording):
#Error in median.default(a) : need numeric data
So I tried to convert it to numeric:
as.numeric(a)
#Error: object (a) cannot be coerced to type 'double'
So I tried to convert to a list and get the median
a <- as.list(a)
median(a)
#Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) :
#'x' must be atomic
Printing the list:
a
$V1
[1] 0.003 0.161 0.227 0.331 0.416 0.441 0.536 0.619 0.730 0.737 0.764 0.799
[13] 0.939 1.009 1.036 1.217 1.321 1.615 1.684 1.878 1.930 1.933 1.949 2.018
[25] 2.053 2.126 2.181 2.464 2.488 2.725 2.838 2.874 2.893 2.954 3.054 3.092
[37] 3.149 3.192 3.216 3.233 3.422 3.424 3.695 3.720 3.743 4.097 4.229 4.229
[49] 4.264 4.317 4.447 4.461 4.529 4.794 4.992 5.121 5.138 5.161 5.241 5.264
[61] 5.286 5.428 5.430 5.430 5.498 5.520 5.706 5.928 5.956 6.074 6.154 6.398
[73] 6.402 6.536 6.549 6.748 6.994 7.196 7.397 7.440 7.840 7.854 7.862 7.913
[85] 7.976 8.002 8.151 8.185 8.237 8.485 8.632 8.688 8.718 9.200 9.372 9.401
[97] 9.487 9.615 9.701 9.702
What is this $V1?
How do I get the median?
You have read the data in as a data frame: that means that the basic structure is a list of columns. Even though there's only one column in this data frame, you need to extract it before you can apply a numeric operation like computing the median. As you will see at ?"[[", there are a variety of ways of indexing a data frame.
median(a$V1)
median(a[[1]])
both pull out the first column.
median(unlist(a))
drops the list structure.
median(scan("data.txt"))
uses scan() instead, which reads the results in as a single vector rather than as a list of vectors (i.e. a data frame).
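The four variants can be checked side by side on a small file. This sketch writes five of the values from the question to a temporary file, whose path stands in for the real data file.

```r
# Write a few of the question's values to a temporary file, one per line
f <- tempfile(fileext = ".txt")
writeLines(c("0.003", "0.161", "0.227", "0.331", "0.416"), f)

a <- read.table(f)
class(a)   # a data frame with a single column V1

m1 <- median(a$V1)                    # extract the column by name
m2 <- median(a[[1]])                  # extract the column by position
m3 <- median(unlist(a))               # flatten the list structure
m4 <- median(scan(f, quiet = TRUE))   # read directly as a numeric vector

c(m1, m2, unname(m3), m4)
```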
