Strange dates in plotting graph in Julia - julia

I get strange results when I try to plot a DataFrame.
Plotting the initial graph works fine.
using DataFrames, XLSX, StatsPlots, Indicators
df = DataFrame(XLSX.readtable("Demo-sv.xlsx", "Blad3")...)
df[!, :Closeprice] .= Float64.(df.Closeprice)
Example of the data
1×2 DataFrame
│ Row │ Date       │ Closeprice │
│     │ Any        │ Float64    │
├─────┼────────────┼────────────┤
│ 1   │ 2019-05-03 │ 169.96     │
Then I plot the data
@df df plot(:Date, :Closeprice)
But when I try to add a new series (a simple moving average, sma()), I get a really strange result. The dates aren't correct anymore, and the graph looks weird. I don't know whether my new plot is somehow overwriting the original, even though it should be added to the existing plot.
I have tried both of the calls below to add the new series, but both give the same result.
plot!(sma(df.Closeprice, n=200))
plot!(sma(sort(df, :Date).Closeprice, n=200))
Either way the graph looks strange, and I don't know what is causing the problem.

The reason is that plot!(sma(...)) is called without x values, so the new series is plotted against its index 1, 2, 3, ... On a date axis those indices are read as dates starting from Date(1), which is 0001-01-01.
You need to pass the dates explicitly: plot!(df.Date, sma(df.Closeprice, n=200)).
By the way, naming the column Date is not best practice, as Date is already the name of a DataType.

Related

Why is my csv export from R to Excel showing all the content from one row in a single cell?

I'm trying to export my data frame from R to Excel, so I used the most obvious way:
write.csv(dataframe, "path")
This works so far, but when I open the file in Excel it looks terrible. I have 600 rows with 18 columns, but Excel shows all the content of each row (all the columns with different numbers, but also longer string text) in the first cell of that row, so A1, A2, A3 etc. Why isn't it showing the content from row 1, column 1 in A1, then row 1, column 2 in B1, and so on?
I'm almost breaking down over this problem since this is the very last step of a quite long work and I don't know what to do. If anyone can help I'd be so grateful!

Grouping Values of a column into different categories in R

I am new to R and to this site as well. I am unable to upload an image of my dataset at the moment.
Here is my problem:
I have a dataset containing two columns that are of particular interest to me. One of them, Status, contains identifiers (1 and 2): 1 represents the variable Y1 and 2 represents the variable Y2. I need to run two separate regressions using Y1 and Y2 as dependent variables.
The other column, Y1andY2, contains the respective values of Y1 and Y2 merged into a single column. So I need a way of separating or grouping those values into Y1 and Y2. This would allow me to run the two separate regressions.
Status Y1andY2
1 1.521174
2 1.873917
2 2.116277
1 1.803262
1 3.725778
2 2.285313
1 2.732088
1 2.799842
2 2.976210
1 1.337500
1 1.259238
Your help would be greatly appreciated.
Thanks
Cheers
Ludov
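I think you want to convert your data to wide format, so that instead of a Status ('key') column and a Y1andY2 ('values') column you have one column with the Y1 values and one with the Y2 values. This gives your df roughly half as many rows. pivot_wider() needs an id column to pair the values up row by row, and the shorter group is filled with NA.
library(dplyr)
library(tidyr)  # pivot_wider() comes from tidyr, not dplyr
df %>% group_by(Status) %>% mutate(id = row_number()) %>% ungroup() %>%  # id = position within each Status
  pivot_wider(names_from = 'Status', values_from = 'Y1andY2', names_prefix = "Y")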
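If the goal is only to fit two separate regressions, splitting the values by Status may be enough. The following is a minimal base R sketch, assuming the data frame is called df (a name not given in the question) and using the sample values posted above:

```r
# Sample data copied from the question; the name `df` is an assumption.
df <- data.frame(
  Status  = c(1, 2, 2, 1, 1, 2, 1, 1, 2, 1, 1),
  Y1andY2 = c(1.521174, 1.873917, 2.116277, 1.803262, 3.725778, 2.285313,
              2.732088, 2.799842, 2.976210, 1.337500, 1.259238)
)

# split() returns a list with one vector per Status value ("1" and "2").
groups <- split(df$Y1andY2, df$Status)
Y1 <- groups[["1"]]
Y2 <- groups[["2"]]

# To keep any predictor columns alongside the response, subset whole rows instead.
df_y1 <- subset(df, Status == 1)
df_y2 <- subset(df, Status == 2)
```

Each subset can then be passed as the data argument of its own regression call. Note that the two groups need not be the same size.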

R coding, I'm trying to correctly order the variables in my dataframe from 1 to 13 but it goes like 201501, 2015010, 011,012,013, 02...09

I have a large dataframe sorted by fiscal year and fiscal period. I am trying to create a time plot starting at fiscal period 1 of 2015, ending at fiscal period 13 of 2019. I have two columns, one for FY, one for FP. They look like this.
I merged the two columns together separated by a 0 in a new column (C) using the code:
MarkP$C = paste(MarkP$FY, MarkP$FP, sep="0")
This ensures that my new column is a numeric variable.
It looks like this (check column C)
Then since I want to plot a time plot of total sales per period, I aggregated all sales to the level of C, so all rows ending with the same C aggregate together. I used this code for the aggregation.
MarkP11 <- MarkP %>%
group_by(C) %>%
summarise(Sales=sum(Sales))
This is what MarkP11 looks like.
The problem I'm having is that the rows are out of order, so when I plot them I get an incorrect plot. It has period 10 coming after period 1.
I've done some research and discovered that the sprintf function may work, but I'm not sure how to incorporate it into the code for my data frame.
The code below is how my C column is created by merging two columns. I believe I need to edit this line to use sprintf, but I'm not sure how to get that to work.
MarkP$C = paste(MarkP$FY, MarkP$FP, sep="0")
I expect the ordering of the MarkP dataframe to look something like this:
sprintf is indeed what you want:
sprintf("%0.0f%02.0f", 2019, c(1,10))
# [1] "201901" "201910"
This assumes that FP's range is 0-99. It would not be incorrect to use sprintf("%d%02d", 2019, c(1,10)) since you're intending to use integers, but sometimes I find that seemingly-integer values can trigger Error ... invalid format '%02d', so I just strong-arm it. You could also use as.integer on each set of values ... another workaround.
I was speaking with a colleague of mine and he helped me figure out the solution. Like r2evans commented, sprintf is the correct function. The syntax that worked for me was:
MarkP$C = paste(MarkP$FY, sprintf("%02d", MarkP$FP), sep="")
What that did in my code was concatenate the FY and FP values into a new column named "C":
- It first added my FY column to the new column.
- Then, since sep="", there was no separator character, so FY and FP were simply joined together.
- Because FP is wrapped in sprintf("%02d", ...), single-digit periods are left-padded with a zero before being appended to FY.
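To see why the padding matters for ordering, here is a small sketch on made-up data. The values below are hypothetical and only stand in for MarkP. as.integer() is applied first, as suggested above, to avoid the invalid format error that "%02d" can raise on non-integer doubles.

```r
# Hypothetical stand-in for MarkP; the real data is not shown in the question.
MarkP <- data.frame(
  FY    = c(2015, 2015, 2015, 2016),
  FP    = c(1, 10, 2, 1),
  Sales = c(10, 20, 30, 40)
)

# Original approach: sep = "0" puts period 10 ("2015010") before period 2 ("201502").
unpadded <- paste(MarkP$FY, MarkP$FP, sep = "0")

# Zero-pad FP to two digits so every key has the same width.
MarkP$C <- paste0(MarkP$FY, sprintf("%02d", as.integer(MarkP$FP)))

# Aggregate sales per period and order by the padded key.
totals <- aggregate(Sales ~ C, data = MarkP, FUN = sum)
totals <- totals[order(totals$C), ]
```

Because every key now has six characters, sorting C as text gives the same order as sorting by year and then by period.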

Create event frequency graph from a datetime column in google spreadsheet

I have a google spreadsheet containing a column of datetime values. There can be as many as 10 values that occur on a common date (at different times). Is it possible to create a graph that shows frequency of "events per day" such that the x axis is a date and the y axis is a numerical value from 0 to 10? There doesn't appear to be anything in the chart wizard that resembles this idea and my knowledge of spreadsheets is just about nil...
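It is possible, but I can't say I'm impressed by the results. If your column of datetime values starts in A2 and is sorted, please insert =int(A2) in B2 (this strips the time and keeps the date) and in C2:
=if(B1=B2,1+C1,1)
and copy both down to suit. Column C is then a running count of events within each date, so the last value for a date is that day's total. You might want to add an earlier date in B1.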

Read .txt-file reversed in R

I have a problem where I need to import a two column (first column being the x-axis and the second being the y-axis) .txt-file into R and I need to do this so that R reads it from bottom to top.
Here is what I did so far:
data<-read.table("data.txt",skip=1910,nrow=132982)
plot(data,type="l")
After this, I have the desired plot, but I wish this to be reversed horizontally. What would be the most convenient way to do this?
I tried
datar<-rev(data)
after import but it reversed the columns by switching the x-values to y-axis and y-values to x-axis. I wish to reverse the columns so that the last value in both columns will be the first in their columns without the columns switching places with each other.
I think the most convenient approach would be to reverse the file during import, as the file has over 130 000 rows and is very cumbersome to work with.
Thank you in advance!
I may be daft, but I don't see how the order in which the x-y data is read would affect the position of the x and y coordinates.
Maybe you want something like this:
data <- read.csv(text= "
x,y
1,1
2,2
3,4")
plot(data,xlim=c(3,1))
(x axis goes down from 3 left to 1 right)
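If the rows themselves need to be reversed, note that rev() on a data frame reverses its list of columns, which is why the x and y values swapped places. A minimal sketch of reversing the row order instead, using the same small example data:

```r
# Small example data matching the snippet above.
data <- data.frame(x = c(1, 2, 3), y = c(1, 2, 4))

# Index the rows from last to first; the columns keep their positions.
reversed <- data[nrow(data):1, ]
rownames(reversed) <- NULL  # renumber rows 1..n after reversing
```

For a line plot this alone does not mirror the picture, since the same points are connected either way; flipping the axis with xlim as shown above is what changes the appearance.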
