R creating timeseries with existing dates from file

I'm trying to create a time series plot in R: I obtain dates from a REST request and then want to group and count the date occurrences at a one-week interval. I followed the examples for ts() in R and tried some plots, which worked great, but I couldn't find any examples that show how to aggregate dates based on existing data. Can someone point me in the right direction?
This is a sample of my parsed REST data:
REST Response excerpt ....
"2014-01-16T14:51:50.000-0800"
"2014-01-14T15:42:55.000-0800"
"2014-01-13T17:29:08.000-0800"
"2014-01-13T16:19:31.000-0800"
"2013-12-16T16:56:39.000-0800"
"2014-02-28T08:11:54.000-0800"
"2014-02-28T08:11:28.000-0800"
"2014-02-28T08:07:02.000-0800"
"2014-02-28T08:06:36.000-0800"
....
Sincerely,
code B.

You can parse the date with as.Date() and then create a time series with xts, since it allows aggregating over any period of time.
library(xts)
# parse the leading date part of the ISO timestamps
REST$date <- as.Date(REST$date, format = "%Y-%m-%d")
# xts needs a data column, so attach a dummy variable
REST$variable <- seq(0, 2.4, by = 0.3)
ts <- xts(REST[, "variable"], order.by = REST[, "date"])
> to.monthly(ts)
ts.Open ts.High ts.Low ts.Close
Dec 2013 1.2 1.2 1.2 1.2
Jan 2014 0.6 0.9 0.0 0.0
Feb 2014 1.5 2.4 1.5 2.4
> to.weekly(ts)
ts.Open ts.High ts.Low ts.Close
2013-12-16 1.2 1.2 1.2 1.2
2014-01-16 0.6 0.9 0.0 0.0
2014-02-28 1.5 2.4 1.5 2.4
Not sure if this is what you needed. Is it?
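For the original goal of counting occurrences per week, no dummy variable is needed; base R's cut() on Date values can bin the sample timestamps directly (a sketch, with the timestamps pasted in by hand):

```r
# Sketch: count date occurrences per calendar week (weeks start on Monday).
stamps <- c("2014-01-16T14:51:50.000-0800", "2014-01-14T15:42:55.000-0800",
            "2014-01-13T17:29:08.000-0800", "2014-01-13T16:19:31.000-0800",
            "2013-12-16T16:56:39.000-0800", "2014-02-28T08:11:54.000-0800")
dates  <- as.Date(substr(stamps, 1, 10))       # keep only the date part
weekly <- table(cut(dates, breaks = "week"))   # counts per week, zeros included
weekly[["2014-01-13"]]                         # 3 events fall in that week
```

The resulting table is labelled by each week's start date, so it plots directly with barplot(weekly).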

Related

How to create two CSV files with the same name, one using upper-case letters and the other lower-case letters

I want to create multiple files for columns in a life table. I thought the easiest way to do this would be to save the files using their variable names (ax, Sx, lx, Lx, ...). However, I cannot get R to create two files based on the same name (one in lower case and one in upper case, e.g. lx.csv and Lx.csv).
To demonstrate the problem:
# write a csv as normal
write.csv(mtcars, "d.csv")
# next line seems to replace d.csv rather than create a new D.csv file
write.csv(iris, "D.csv")
# get iris when read back in
d <- read.csv("d.csv")
head(d)
# X Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1 1 5.1 3.5 1.4 0.2 setosa
# 2 2 4.9 3.0 1.4 0.2 setosa
# 3 3 4.7 3.2 1.3 0.2 setosa
# 4 4 4.6 3.1 1.5 0.2 setosa
# 5 5 5.0 3.6 1.4 0.2 setosa
# 6 6 5.4 3.9 1.7 0.4 setosa
Is this behavior normal, and is there a way to force the creation of a new file with the upper-case name?
I am using Windows and R 4.1.0
Update
Thanks to @tim for the answer. I had to go through the following steps in PowerShell (in admin mode):
1. Run Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux
2. Restart the PC
3. Run cd C:\folder to get to the location where I want to enable case-sensitive file names
4. Run (Get-ChildItem -Recurse -Directory).FullName | ForEach-Object {fsutil.exe file setCaseSensitiveInfo $_ enable}
I wanted to enable case-sensitive file names for all the subdirectories. If I had only needed it for a single folder, I think I could have used fsutil.exe file setCaseSensitiveInfo C:\folder enable instead of steps 3 and 4.
Windows' NTFS file system is case-insensitive by default. With the Windows 10 April 2018 Update, case sensitivity for specific folders was introduced:
https://www.howtogeek.com/354220/how-to-enable-case-sensitive-folders-on-windows-10/

One text file multiple data tables using R and Shiny

Using R/Shiny, how do I read from one text file with 3 sections into 3 separate data tables? Each section has its own column and row names as below. Thank you for your help!
s1c1 s1c2 s1c3
s1r1 1.0 2.3 3.0
s1r2 3.3 4.5 3.6
s2c1 s2c2 s2c3 s2c4
s2r1 1.0 2.3 3.0 4.2
s2r2 3.3 4.5 3.6 3.6
etc
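One possible approach (a sketch; it assumes the three sections are separated by blank lines, which the excerpt above doesn't show): split the lines on the blanks and feed each chunk to read.table(), which picks up the row names automatically because each header row has one field fewer than its data rows.

```r
# Sketch: split a multi-section text file into separate data frames.
# In a Shiny app this would come from readLines() on the uploaded file.
txt <- c("s1c1 s1c2 s1c3",
         "s1r1 1.0 2.3 3.0",
         "s1r2 3.3 4.5 3.6",
         "",
         "s2c1 s2c2 s2c3 s2c4",
         "s2r1 1.0 2.3 3.0 4.2",
         "s2r2 3.3 4.5 3.6 3.6")
grp    <- cumsum(txt == "")                    # section index per line
chunks <- split(txt[txt != ""], grp[txt != ""])
tables <- lapply(chunks, function(ch)
  read.table(text = paste(ch, collapse = "\n"), header = TRUE))
```

Each element of `tables` is then an independent data frame that can be rendered with its own renderTable()/renderDataTable() output.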

htmlTable is replacing dataframe contents with sequential numbers

I'm using R markdown to create an html document. I've written a function that produces the following data frame as its output:
April ($) April Growth (%) Current ($) Current Growth (%) Change (%)
1 2013:3 253,963.49 0.2 251,771.20 0.7 -0.9
2 2013:4 253,466.09 -0.8 251,515.26 -0.4 -0.8
3 2014:1 255,448.95 3.2 255,300.10 6.2 -0.1
4 2014:2 259,376.84 6.3 259,919.99 7.4 0.2
5 2014:3 261,398.85 3.2 262,486.91 4.0 0.4
6 2014:4 264,309.06 4.5 266,662.59 6.5 0.9
I'm then supplying this data frame to htmlTable as shown:
html.tab <- htmlTable(sample.df, rnames = FALSE)
print(html.tab)
However, when I knit the file, the table that is produced has the data frame's contents replaced with sequential numbers.
Can anyone explain what is happening? I thought perhaps it was the data class in the data frame but I didn't see anything in the htmlTable vignette saying it couldn't handle data of certain classes.
This is my first time working with R Markdown and htmlTables so hopefully I've just made some basic mistake but I haven't been able to find anyone else with the same problem.
Thanks to Benjamin for the suggestion. It turns out the problem was the data class. sample.df contained data of class factor which apparently htmlTable can't handle. By converting the data to characters the correct table is produced.
sample.df[] <- lapply(sample.df, as.character)
Perhaps someone more familiar with the package can explain why factors are a problem?
I knew it would be something basic like this!
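A likely explanation (my reading, not confirmed by the package authors): a factor is stored internally as integer level codes plus a table of labels, so any code path that coerces it with as.numeric() or as.integer() sees the codes rather than the labels, which is exactly the sequential-numbers symptom:

```r
# A factor's underlying data are integer codes into its (sorted) levels.
f <- factor(c("253,963.49", "253,466.09", "255,448.95"))
as.integer(f)    # 2 1 3 -- the codes, not the values
as.character(f)  # the original labels
```

This is also why as.character() (as in the fix above) recovers the intended output.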

St dev in periods defined by another variable

Here's what my data looks like (the structure):
omxh coalition.inpower
.01 2.4 (period begins)
.03 2.4
-.01 2.4
-.02 3.5 (another period begins)
.02 3.5
.05 3.5
.03 3.5
-.01 4.1 (again another period begins)
-.03 4.1
... ...
The first variable (being stock index returns) varies all the time but the other one (being the coalition in power) only changes once in a while. This is what it looks like then:
plot(lm(omxh ~ coalition.inpower))
abline(...)
So you can see that the volatility is different depending on the "block" of observations. How could I get the standard deviation for the first variable based on the periods defined by the other variable? The periods are not equally long.
Thanks. Something else you need to know?
You can use tapply or aggregate, e.g.:
tapply(df$omxh, df$coalition.inpower, sd)
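With made-up numbers mirroring the structure above, that looks like this (aggregate() is the alternative when a data frame result is preferred):

```r
# Sketch: per-coalition standard deviation of returns (data made up
# to match the excerpt in the question).
df <- data.frame(
  omxh              = c(.01, .03, -.01, -.02, .02, .05, .03, -.01, -.03),
  coalition.inpower = c(2.4, 2.4, 2.4, 3.5, 3.5, 3.5, 3.5, 4.1, 4.1))
tapply(df$omxh, df$coalition.inpower, sd)     # named vector, one sd per period
aggregate(omxh ~ coalition.inpower, data = df, FUN = sd)  # same, as a data frame
```

Because the grouping is by value rather than by position, the periods do not need to be equally long.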

Memory problems with large-scale social network visualization using R and Cytoscape

I'm relatively new to R and am trying to solve the following problem:
I work on a Windows 7 Enterprise platform with the 32bit version of R
and have about 3GB of RAM on my machine. I have large-scale social
network data (c. 7,000 vertices and c. 30,000 edges) which are
currently stored in my SQL database. I have managed to pull this data
(omitting vertex and edge attributes) into an R dataframe and then
into an igraph object. For further analysis and visualization, I would
now like to push this igraph into Cytoscape using RCytoscape.
Currently, my approach is to convert the igraph object into a
graphNEL object since RCytoscape seems to work well with this object
type. (The igraph plotting functions are much too slow and lack
further analysis functionality.)
Unfortunately, I always run into memory issues when running this
script. It has worked previously with smaller networks though.
Does anyone have an idea on how to solve this issue? Or can you
recommend any other visualization and analysis tools that work well
with R and can handle such large-scale data?
Sorry for taking several days to get back to you.
I just ran some tests in which
1) an adjacency matrix is created in R
2) an R graphNEL is then created from the matrix
3) (optionally) node & edge attributes are added
4) a CytoscapeWindow is created, displayed, laid out, and redrawn
(all times are in seconds)
nodes edges attributes? matrix graph cw display layout redraw total
70 35 no 0.001 0.001 0.5 5.7 2.5 0.016 9.4
70 0 no 0.033 0.001 0.2 4.2 0.5 0.49 5.6
700 350 no 0.198 0.036 6.0 8.3 1.6 0.037 16.7
1000 500 no 0.64 0.07 12.0 9.8 1.8 0.09 24.9
1000 500 yes 0.42 30.99 15.7 29.9 1.7 0.08 79.4
2000 1000 no 3.5 0.30 73.5 14.9 4.8 0.08 96.6
2500 1250 no 2.7 0.45 127.1 18.3 11.5 0.09 160.7
3000 1500 no 4.2 0.46 236.8 19.6 10.7 0.10 272.8
4000 2000 no 8.4 0.98 502.2 27.9 21.4 0.14 561.8
To my complete surprise, and chagrin, there is an exponential slowdown in 'cw' (the new.CytoscapeWindow method), which makes no sense at all. It may be that your memory exhaustion is related to that, and it is quite fixable.
I will explore this, and probably have a fix in the next week.
By the way, did you know that you can create a graph object directly from an adjacency matrix, and then coerce it to a graphNEL?
g <- new("graphAM", adjMat = adj.mat, edgemode = "directed")
gNEL <- as(g, "graphNEL")
Thanks, Ignacio, for your most helpful report. I should have done these timing tests long ago!
Paul
It has been a while since I used Cytoscape so I am not exactly sure how to do it, but the manual states that you can use text files as input using the "Table Import" feature.
In igraph you can use the write.graph() function to export a graph in a bunch of ways. This way you can circumvent having to convert to a graphNEL object which might be enough to not run out of memory.
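A minimal sketch of such an export using base R only (the column names and file name are made up; with igraph itself, write.graph() on the graph object serves the same purpose):

```r
# Sketch: write the edge list pulled from SQL as a tab-separated file
# that Cytoscape's "Table Import" feature can read (toy data below).
edges <- data.frame(source = c("a", "b", "c"),
                    target = c("b", "c", "a"))
out <- file.path(tempdir(), "network.txt")
write.table(edges, out, sep = "\t", row.names = FALSE, quote = FALSE)
```

Since the 7,000-vertex/30,000-edge network already sits in a data frame, this route never builds the graphNEL object at all, which should keep 32-bit R well within its memory limits.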
