Varargs is giving key error in Julia


Consider the following table:
julia> using RDatasets, DataFrames
julia> anscombe = dataset("datasets","anscombe")
11x8 DataFrame
| Row | X1 | X2 | X3 | X4 | Y1 | Y2 | Y3 | Y4 |
|-----|----|----|----|----|-------|------|-------|------|
| 1 | 10 | 10 | 10 | 8 | 8.04 | 9.14 | 7.46 | 6.58 |
| 2 | 8 | 8 | 8 | 8 | 6.95 | 8.14 | 6.77 | 5.76 |
| 3 | 13 | 13 | 13 | 8 | 7.58 | 8.74 | 12.74 | 7.71 |
| 4 | 9 | 9 | 9 | 8 | 8.81 | 8.77 | 7.11 | 8.84 |
| 5 | 11 | 11 | 11 | 8 | 8.33 | 9.26 | 7.81 | 8.47 |
| 6 | 14 | 14 | 14 | 8 | 9.96 | 8.1 | 8.84 | 7.04 |
| 7 | 6 | 6 | 6 | 8 | 7.24 | 6.13 | 6.08 | 5.25 |
| 8 | 4 | 4 | 4 | 19 | 4.26 | 3.1 | 5.39 | 12.5 |
| 9 | 12 | 12 | 12 | 8 | 10.84 | 9.13 | 8.15 | 5.56 |
| 10 | 7 | 7 | 7 | 8 | 4.82 | 7.26 | 6.42 | 7.91 |
| 11 | 5 | 5 | 5 | 8 | 5.68 | 4.74 | 5.73 | 6.89 |
I have defined a function as follows:
julia> f1(df, matchval, matchfield, qfields...) = isempty(qfields)
WARNING: Method definition f1(Any, Any, Any, Any...) in module Main at REPL[314]:1 overwritten at REPL[317]:1.
f1 (generic function with 3 methods)
The problem shows up with the following call:
julia> f1(anscombe, 11, "X1")
ERROR: KeyError: key :field not found
in getindex at ./dict.jl:697 [inlined]
in getindex(::DataFrames.Index, ::Symbol) at /home/arghya/.julia/v0.5/DataFrames/src/other/index.jl:114
in getindex at /home/arghya/.julia/v0.5/DataFrames/src/dataframe/dataframe.jl:228 [inlined]
in f1(::DataFrames.DataFrame, ::Int64, ::String) at ./REPL[249]:2
What am I doing wrong? FYI, I'm using Julia version 0.5.2. How can I overcome this problem? Thanks in advance!

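There is nothing wrong with the code you posted; try running just that in a fresh session. Most likely you defined another f1 method earlier. If you come from R, you may assume that it is overwritten by f1(df, matchval, matchfield, qfields...) = isempty(qfields), while in fact you are only defining (or replacing) the method with that particular signature; the other methods of the f1 function stay in place. Your REPL output says f1 has 3 methods, and the stack trace points to f1(::DataFrames.DataFrame, ::Int64, ::String) at REPL[249], not to the varargs definition at REPL[317]. So the error is thrown by a 3-argument version you defined earlier, which Julia picks because it is more specific than the varargs method. Calling methods(f1) lists every method currently defined. Look at https://docs.julialang.org/en/stable/manual/methods/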

Related

Calculating weighted average buy and hold return per ID in R

Thanks to @langtang, I was able to calculate the Buy and Hold Return around the event date for each company (Calculating Buy and Hold return around event date per ID in R). But now I am facing a new problem.
Below is the data I currently have.
+----+------------+-------+------------+------------+----------------------+
| ID | Date | Price | EventDate | Market Cap | BuyAndHoldIndividual |
+----+------------+-------+------------+------------+----------------------+
| 1 | 2011-03-06 | 10 | NA | 109 | NA |
| 1 | 2011-03-07 | 9 | NA | 107 | -0.10000 |
| 1 | 2011-03-08 | 12 | NA | 109 | 0.20000 |
| 1 | 2011-03-09 | 14 | NA | 107 | 0.40000 |
| 1 | 2011-03-10 | 15 | NA | 101 | 0.50000 |
| 1 | 2011-03-11 | 17 | NA | 101 | 0.70000 |
| 1 | 2011-03-12 | 12 | 2011-03-12 | 110 | 0.20000 |
| 1 | 2011-03-13 | 14 | NA | 110 | 0.40000 |
| 1 | 2011-03-14 | 17 | NA | 100 | 0.70000 |
| 1 | 2011-03-15 | 14 | NA | 101 | 0.40000 |
| 1 | 2011-03-16 | 17 | NA | 107 | 0.70000 |
| 1 | 2011-03-17 | 16 | NA | 104 | 0.60000 |
| 1 | 2011-03-18 | 15 | NA | 104 | NA |
| 1 | 2011-03-19 | 16 | NA | 102 | 0.06667 |
| 1 | 2011-03-20 | 17 | NA | 107 | 0.13333 |
| 1 | 2011-03-21 | 18 | NA | 104 | 0.20000 |
| 1 | 2011-03-22 | 11 | NA | 105 | -0.26667 |
| 1 | 2011-03-23 | 15 | NA | 100 | 0.00000 |
| 1 | 2011-03-24 | 12 | 2011-03-24 | 110 | -0.20000 |
| 1 | 2011-03-25 | 13 | NA | 110 | -0.13333 |
| 1 | 2011-03-26 | 15 | NA | 107 | 0.00000 |
| 2 | 2011-03-12 | 48 | NA | 300 | NA |
| 2 | 2011-03-13 | 49 | NA | 300 | NA |
| 2 | 2011-03-14 | 50 | NA | 290 | NA |
| 2 | 2011-03-15 | 57 | NA | 296 | 0.14000 |
| 2 | 2011-03-16 | 60 | NA | 297 | 0.20000 |
| 2 | 2011-03-17 | 49 | NA | 296 | -0.02000 |
| 2 | 2011-03-18 | 64 | NA | 299 | 0.28000 |
| 2 | 2011-03-19 | 63 | NA | 292 | 0.26000 |
| 2 | 2011-03-20 | 67 | 2011-03-20 | 290 | 0.34000 |
| 2 | 2011-03-21 | 70 | NA | 299 | 0.40000 |
| 2 | 2011-03-22 | 58 | NA | 295 | 0.16000 |
| 2 | 2011-03-23 | 65 | NA | 290 | 0.30000 |
| 2 | 2011-03-24 | 57 | NA | 296 | 0.14000 |
| 2 | 2011-03-25 | 55 | NA | 299 | 0.10000 |
| 2 | 2011-03-26 | 57 | NA | 299 | NA |
| 2 | 2011-03-27 | 60 | NA | 300 | NA |
| 3 | 2011-03-18 | 5 | NA | 54 | NA |
| 3 | 2011-03-19 | 10 | NA | 50 | NA |
| 3 | 2011-03-20 | 7 | NA | 53 | NA |
| 3 | 2011-03-21 | 8 | NA | 53 | NA |
| 3 | 2011-03-22 | 7 | NA | 50 | NA |
| 3 | 2011-03-23 | 8 | NA | 51 | 0.14286 |
| 3 | 2011-03-24 | 7 | NA | 52 | 0.00000 |
| 3 | 2011-03-25 | 6 | NA | 55 | -0.14286 |
| 3 | 2011-03-26 | 9 | NA | 54 | 0.28571 |
| 3 | 2011-03-27 | 9 | NA | 55 | 0.28571 |
| 3 | 2011-03-28 | 9 | 2011-03-28 | 50 | 0.28571 |
| 3 | 2011-03-29 | 6 | NA | 52 | -0.14286 |
| 3 | 2011-03-30 | 6 | NA | 53 | -0.14286 |
| 3 | 2011-03-31 | 4 | NA | 50 | -0.42857 |
| 3 | 2011-04-01 | 5 | NA | 50 | -0.28571 |
| 3 | 2011-04-02 | 8 | NA | 55 | 0.00000 |
| 3 | 2011-04-03 | 9 | NA | 55 | NA |
+----+------------+-------+------------+------------+----------------------+
This time, I would like to add a new column called BuyAndHoldWeightedMarket, holding the weighted average (by Market Cap) Buy and Hold return across IDs for the window of -5 to +5 days around the event date. For example, for ID=1, starting from 2011-03-19, BuyAndHoldWeightedMarket is the sum over IDs of (price of that ID on day t / price of that ID on event date - 6, minus 1) multiplied by that ID's Market Cap on day t, divided by the sum of those IDs' Market Caps on day t.
Please check the picture below for the details. The equations are listed for each colored block.
Please note that for the uppermost BuyAndHoldWeightedMarket block, ID=2 and ID=3 are not involved because they begin later than 2011-03-06. For the third block (grey area), the weighted return only includes ID=1 and ID=2 because ID=3 begins later than 2011-03-14. Also, for the last block (mixed color), the first four rows use all three IDs, the blue area uses only ID=2 and ID=3 because ID=1 ends on 2011-03-26, and the yellow block uses only ID=3 because ID=1 and ID=2 end before 2011-03-28.
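To make the formula concrete, here is a minimal sketch in Python that reproduces a single cell of the desired table: the value for 2011-03-19 in ID=1's second event window, where the base date is 2011-03-18. The prices and market caps are copied from the table above; the dictionary layout and the function name weighted_buy_and_hold are only illustrative assumptions, not part of any existing solution.

```python
# Rows copied from the question's table:
# ID -> (price on base date 2011-03-18, price on 2011-03-19, market cap on 2011-03-19)
rows = {
    1: (15, 16, 102),
    2: (64, 63, 292),
    3: (5, 10, 50),
}

def weighted_buy_and_hold(rows):
    """Market-cap-weighted average of price(t) / price(base) - 1 across IDs."""
    weighted_sum = sum((p_t / p_base - 1) * cap for p_base, p_t, cap in rows.values())
    total_cap = sum(cap for _, _, cap in rows.values())
    return weighted_sum / total_cap

result = weighted_buy_and_hold(rows)
print(round(result, 5))  # 0.11765
```

This matches the 0.11765 shown for ID=1 on 2011-03-19 in the desired output. IDs without a price on the base date or on day t would simply be left out of rows.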
Eventually, I would like to get a nice data table that looks as below.
+----+------------+-------+------------+------------+----------------------+--------------------------+
| ID | Date | Price | EventDate | Market Cap | BuyAndHoldIndividual | BuyAndHoldWeightedMarket |
+----+------------+-------+------------+------------+----------------------+--------------------------+
| 1 | 2011-03-06 | 10 | NA | 109 | NA | NA |
| 1 | 2011-03-07 | 9 | NA | 107 | -0.10000 | -0.10000 |
| 1 | 2011-03-08 | 12 | NA | 109 | 0.20000 | 0.20000 |
| 1 | 2011-03-09 | 14 | NA | 107 | 0.40000 | 0.40000 |
| 1 | 2011-03-10 | 15 | NA | 101 | 0.50000 | 0.50000 |
| 1 | 2011-03-11 | 17 | NA | 101 | 0.70000 | 0.70000 |
| 1 | 2011-03-12 | 12 | 2011-03-12 | 110 | 0.20000 | 0.20000 |
| 1 | 2011-03-13 | 14 | NA | 110 | 0.40000 | 0.40000 |
| 1 | 2011-03-14 | 17 | NA | 100 | 0.70000 | 0.70000 |
| 1 | 2011-03-15 | 14 | NA | 101 | 0.40000 | 0.40000 |
| 1 | 2011-03-16 | 17 | NA | 107 | 0.70000 | 0.70000 |
| 1 | 2011-03-17 | 16 | NA | 104 | 0.60000 | 0.60000 |
| 1 | 2011-03-18 | 15 | NA | 104 | NA | NA |
| 1 | 2011-03-19 | 16 | NA | 102 | 0.06667 | 0.11765 |
| 1 | 2011-03-20 | 17 | NA | 107 | 0.13333 | 0.10902 |
| 1 | 2011-03-21 | 18 | NA | 104 | 0.20000 | 0.17682 |
| 1 | 2011-03-22 | 11 | NA | 105 | -0.26667 | -0.07924 |
| 1 | 2011-03-23 | 15 | NA | 100 | 0.00000 | 0.07966 |
| 1 | 2011-03-24 | 12 | 2011-03-24 | 110 | -0.20000 | -0.07331 |
| 1 | 2011-03-25 | 13 | NA | 110 | -0.13333 | -0.09852 |
| 1 | 2011-03-26 | 15 | NA | 107 | 0.00000 | 0.02282 |
| 2 | 2011-03-12 | 48 | NA | 300 | NA | NA |
| 2 | 2011-03-13 | 49 | NA | 300 | NA | NA |
| 2 | 2011-03-14 | 50 | NA | 290 | NA | NA |
| 2 | 2011-03-15 | 57 | NA | 296 | 0.14000 | 0.059487331 |
| 2 | 2011-03-16 | 60 | NA | 297 | 0.20000 | 0.147029703 |
| 2 | 2011-03-17 | 49 | NA | 296 | -0.02000 | -0.030094118 |
| 2 | 2011-03-18 | 64 | NA | 299 | 0.28000 | 0.177381404 |
| 2 | 2011-03-19 | 63 | NA | 292 | 0.26000 | 0.177461929 |
| 2 | 2011-03-20 | 67 | 2011-03-20 | 290 | 0.34000 | 0.24836272 |
| 2 | 2011-03-21 | 70 | NA | 299 | 0.40000 | 0.311954459 |
| 2 | 2011-03-22 | 58 | NA | 295 | 0.16000 | 0.025352941 |
| 2 | 2011-03-23 | 65 | NA | 290 | 0.30000 | 0.192911011 |
| 2 | 2011-03-24 | 57 | NA | 296 | 0.14000 | 0.022381918 |
| 2 | 2011-03-25 | 55 | NA | 299 | 0.10000 | 0.009823098 |
| 2 | 2011-03-26 | 57 | NA | 299 | NA | NA |
| 2 | 2011-03-27 | 60 | NA | 300 | NA | NA |
| 3 | 2011-03-18 | 5 | NA | 54 | NA | NA |
| 3 | 2011-03-19 | 10 | NA | 50 | NA | NA |
| 3 | 2011-03-20 | 7 | NA | 53 | NA | NA |
| 3 | 2011-03-21 | 8 | NA | 53 | NA | NA |
| 3 | 2011-03-22 | 7 | NA | 50 | NA | NA |
| 3 | 2011-03-23 | 8 | NA | 51 | 0.14286 | 0.178343199 |
| 3 | 2011-03-24 | 7 | NA | 52 | 0.00000 | 0.010691161 |
| 3 | 2011-03-25 | 6 | NA | 55 | -0.14286 | -0.007160905 |
| 3 | 2011-03-26 | 9 | NA | 54 | 0.28571 | 0.106918456 |
| 3 | 2011-03-27 | 9 | NA | 55 | 0.28571 | 0.073405953 |
| 3 | 2011-03-28 | 9 | 2011-03-28 | 50 | 0.28571 | 0.285714286 |
| 3 | 2011-03-29 | 6 | NA | 52 | -0.14286 | -0.142857143 |
| 3 | 2011-03-30 | 6 | NA | 53 | -0.14286 | -0.142857143 |
| 3 | 2011-03-31 | 4 | NA | 50 | -0.42857 | -0.428571429 |
| 3 | 2011-04-01 | 5 | NA | 50 | -0.28571 | -0.285714286 |
| 3 | 2011-04-02 | 8 | NA | 55 | 0.00000 | 0.142857143 |
| 3 | 2011-04-03 | 9 | NA | 55 | NA | NA |
+----+------------+-------+------------+------------+----------------------+--------------------------+
So far I have tried the following code, built with the help of the previous question, but I am having a hard time figuring out how to calculate the weighted Buy and Hold return that begins around a different event date for each ID.
#choose rows with no NA in event date and only show ID and event date
events = unique(df[!is.na(EventDate),.(ID,EventDate)])
#helper column
#:= is defined for use in j only. It adds or updates or removes column(s) by reference.
#It makes no copies of any part of memory at all.
events[, eDate:=EventDate]
#makes new column(temporary) lower and upper boundary
df[, `:=`(s=Date-6, e=Date+6)]
#non-equi match
bhr = events[df, on=.(ID, EventDate>=s, EventDate<=e), nomatch=0]
#Generate the BuyHoldReturn column, by ID and EventDate
bhr2 = bhr[, .(Date, BuyHoldReturnM1=c(NA, (Price[-1]/Price[1] -1)*MarketCap[-1])), by = .(Date)]
#merge back to get the full data
bhr3 = bhr2[df,on=.(ID,Date),.(ID,Date,Price,EventDate=i.EventDate,BuyHoldReturn)]
I would be grateful if you could help.
Thank you very much in advance!

R mutate new column based on range of values in other column

I have an R data frame in the following format:
+--------+---------------+--------------------+--------+
| time | Stress_ratio | shear_displacement | CX |
+--------+---------------+--------------------+--------+
| <dbl> | <dbl> | <dbl> | <dbl> |
| 50.1 | -0.224 | 4.9 | 0 |
| 50.2 | -0.219 | 4.98 | 0.0100 |
| . | . | . | . |
| . | . | . | . |
| 249.3 | -0.217 | 4.97 | 0.0200 |
| 250.4 | -0.214 | 4.96 | 0.0300 |
| 251.1 | -0.222 | 4.91 | 0.06 |
| 252.1 | -0.222 | 4.91 | 0.06 |
| 253.3 | -0.222 | 4.91 | 0.06 |
| 254.5 | -0.222 | 4.91 | 0.06 |
| 256.8 | -0.222 | 4.91 | 0.06 |
| . | . | . | . |
| . | . | . | . |
| 500.1 | -0.22 | 4.91 | 0.6 |
| 501.4 | -0.22 | 4.91 | 0.6 |
| 503.1 | -0.22 | 4.91 | 0.6 |
+--------+---------------+--------------------+--------+
and I want a new column that holds a repeated value for each chunk of the time column, where each chunk covers a range of 250. For example, new_column should be 1 for all rows in the first chunk of 250. Similarly, this 1 should change to 2 when the next chunk of 250 starts, and so on. So the new data frame should look like this:
+--------+---------------+--------------------+--------+------------+
| time | Stress_ratio | shear_displacement | CX | new_column |
+--------+---------------+--------------------+--------+------------+
| <dbl> | <dbl> | <dbl> | <dbl> | <dbl> |
| 50.1 | -0.224 | 4.9 | 0 | 1 |
| 50.2 | -0.219 | 4.98 | 0.0100 | 1 |
| . | . | . | . | 1 |
| . | . | . | . | 1 |
| 249.3 | -0.217 | 4.97 | 0.0200 | 1 |
| 250.4 | -0.214 | 4.96 | 0.0300 | 2 |
| 251.1 | -0.222 | 4.91 | 0.06 | 2 |
| 252.1 | -0.222 | 4.91 | 0.06 | 2 |
| 253.3 | -0.222 | 4.91 | 0.06 | 2 |
| 254.5 | -0.222 | 4.91 | 0.06 | 2 |
| 256.8 | -0.222 | 4.91 | 0.06 | 2 |
| . | . | . | . | . |
| . | . | . | . | . |
| 499.1 | -0.22 | 4.91 | 0.6 | 2 |
| 501.4 | -0.22 | 4.91 | 0.6 | 3 |
| 503.1 | -0.22 | 4.91 | 0.6 | 3 |
+--------+---------------+--------------------+--------+------------+

If I understand what you're trying to do, a base R solution could be:
df$new_column <- df$time %/% 250 + 1
The %/% operator is integer division (sort of the complement of the modulus operator) and tells you how many copies of 250 would fit into your number; we add 1 to get the value you want.
The tidyverse version:
library(dplyr)
df <- df %>%
  mutate(new_column = time %/% 250 + 1)
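To check the arithmetic outside R, here is a minimal sketch in Python, whose // operator does the same floor division as R's %/%. The times list is a handful of values copied from the question's expected output, and chunk_id is a hypothetical helper name chosen for this illustration.

```python
def chunk_id(time, width=250):
    """Return the 1-based chunk number: how many full widths fit into time, plus 1."""
    return int(time // width) + 1

# A few time values taken from the question's expected output table.
times = [50.1, 50.2, 249.3, 250.4, 256.8, 499.1, 501.4, 503.1]
ids = [chunk_id(t) for t in times]
print(ids)  # [1, 1, 1, 2, 2, 2, 3, 3]
```

The chunk boundaries fall at exact multiples of 250 (250, 500, ...), not at multiples of 250 counted from the first time value.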

A data.table alternative:
library(data.table)
setDT(df)[, new_column := rleid(time %/% 250)][]
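Note that rleid numbers runs of consecutive equal values, so it is not always identical to time %/% 250 + 1. The sketch below, in Python, emulates that run numbering with itertools.groupby and compares it with plain floor division. The times list is made up for this illustration (it deliberately skips the chunks between 500 and 1000), and the local rleid function is only a stand-in for the data.table function of the same name.

```python
from itertools import groupby

def rleid(values):
    """Number consecutive runs of equal values 1, 2, 3, ... in order of appearance."""
    ids = []
    for run_number, (_, run) in enumerate(groupby(values), start=1):
        ids.extend(run_number for _ in run)
    return ids

# Made-up, sorted time values with a gap: nothing falls between 500 and 1000.
times = [50.1, 249.3, 250.4, 499.1, 1001.0]
bins = [int(t // 250) for t in times]

by_division = [b + 1 for b in bins]
by_runs = rleid(bins)
print(by_division)  # [1, 1, 2, 2, 5]
print(by_runs)      # [1, 1, 2, 2, 3]
```

With sorted times and no empty chunk, and with the first time below 250 as in the question, both numberings agree; they differ once a chunk has no rows.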

How to plot following exponential function properly

Data
| x | Y |
| --------| --------|
| 26.88 | 3.16 |
| 28.57 | 4.21 |
| 30.94 | 2.97 |
| 33.90 | 3.06 |
| 37.24 | 2.87 |
| 39.76 | 2.95 |
| 41.89 | 2.70 |
| 44.37 | 1.25 |
| 27.20 | 5.04 |
| 26.54 | 6.69 |
| 29.21 | 4.42 |
| 33.26 | 3.15 |
| 34.80 | 3.20 |
| 37.87 | 3.11 |
| 41.88 | 2.95 |
| 44.13 | 2.26 |
| 26.42 | 7.07 |
| 24.02 | 8.72 |
| 29.73 | 6.38 |
| 31.10 | 3.85 |
| 33.16 | 3.00 |
| 36.76 | 3.28 |
| 43.26 | 3.18 |
| 42.06 | 2.73 |
| 26.73 | 9.44 |
| 23.03 | 9.72 |
| 27.07 | 6.98 |
| 29.04 | 4.67 |
| 31.83 | 3.55 |
| 36.29 | 3.89 |
| 39.45 | 3.55 |
| 42.17 | 3.37 |
| 23.51 | 10.44 |
| 21.98 | 10.90 |
| 27.21 | 8.13 |
| 28.63 | 5.76 |
| 30.92 | 3.96 |
| 35.57 | 3.94 |
| 38.33 | 3.88 |
| 40.91 | 3.58 |
| 25.15 | 13.05 |
| 19.44 | 15.91 |
| 25.94 | 10.37 |
| 28.03 | 5.17 |
| 31.25 | 4.04 |
| 35.31 | 4.24 |
| 37.02 | 4.31 |
| 38.89 | 3.99 |
| 25.12 | 15.66 |
| 18.36 | 19.86 |
| 25.05 | 12.82 |
| 27.58 | 6.07 |
| 28.83 | 4.11 |
| 33.76 | 4.17 |
| 34.48 | 4.30 |
| 37.32 | 3.97 |
| 21.27 | 20.49 |
| 16.61 | 25.53 |
| 22.68 | 16.58 |
| 25.63 | 6.34 |
| 28.15 | 4.40 |
| 32.80 | 3.99 |
| 35.27 | 4.59 |
| 36.75 | 4.35 |
Code
library(data.table)
library(readxl)
library(dplyr)
library(ggplot2)
library(patchwork)
library(ggplot2)
library(ggpubr)
library(ggpmisc)
setwd("E:/")
Data_2 <- read_excel("Data_2.xlsx")
model.0 <- lm(log(Strength) ~ Theoritical, data= Data_2)
alpha.0 <- exp(coef(model.0)[1])
beta.0 <- coef(model.0)[2]
# Starting parameters
start <- list(alpha = alpha.0, beta = beta.0)
start
model <- nls(Strength ~ alpha * exp((1/beta) * Theoritical) , data = Data_2, start = start)
summary(model)
# Plot fitted curve
plot(Data_2$Theoritical, Data_2$Strength)
line(Data_2$Theoritical, predict(model, list(x = Data_2$Theoritical)), col = 'skyblue')
When I draw my plot, I get the following image.
I need this kind of equation for my data:
y=a*e^(-x/b)
I also could not get the R^2 value, as shown in this picture.
Please correct my code as well. Kindly help me with good code for a best-fit graph of that equation. I am new to R programming.
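For reference, the target equation can be linearized by taking logarithms, which is what the lm(log(Strength) ~ Theoritical) step relies on for starting values. This is only a worked derivation, assuming every Strength value is positive so that the logarithm exists:

```latex
\begin{align*}
y &= a\,e^{-x/b} \\
\ln y &= \ln a - \frac{x}{b}
\end{align*}
```

Under that assumption, a straight-line fit of log(Strength) against Theoritical has intercept ln a and slope -1/b, so a = exp(intercept) and b = -1/slope.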

left_join with individual lag to new column

I need to merge two data frames probably with left_join, offset the joining observation by a specific amount and add it to a new column. The purpose is the preparation of a time-series analysis hence the different shifts in calendar weeks. I would like to stay in the tidyverse.
I read a few posts about a nested left_join() and lag(), but that's beyond my current capability.
MWE
library(tidyverse)
set.seed(1234)
df1 <- data.frame(
Week1 = sample(paste("2015", 20:40, sep = "."),10, replace = FALSE),
Qty = as.numeric(sample(1:10)))
df2 <- data.frame(
Week2 = paste0("2015.", 1:52),
Value = as.numeric(sample(1:52)))
df1 %>%
left_join(df2, by = c("Week1" = "Week2")) %>%
rename(Lag_0 = Value)
Current output
+----+---------+-------+-------+
| | Week1 | Qty | Lag_0 |
+====+=========+=======+=======+
| 1 | 2015.35 | 6.00 | 50.00 |
+----+---------+-------+-------+
| 2 | 2015.24 | 10.00 | 26.00 |
+----+---------+-------+-------+
| 3 | 2015.31 | 7.00 | 43.00 |
+----+---------+-------+-------+
| 4 | 2015.34 | 9.00 | 42.00 |
+----+---------+-------+-------+
| 5 | 2015.28 | 4.00 | 10.00 |
+----+---------+-------+-------+
| 6 | 2015.39 | 8.00 | 24.00 |
+----+---------+-------+-------+
| 7 | 2015.25 | 5.00 | 33.00 |
+----+---------+-------+-------+
| 8 | 2015.23 | 1.00 | 39.00 |
+----+---------+-------+-------+
| 9 | 2015.21 | 2.00 | 17.00 |
+----+---------+-------+-------+
| 10 | 2015.26 | 3.00 | 27.00 |
+----+---------+-------+-------+
It might be worthwhile pointing out that the target data frame does not hold the same number of week observations as the joining data frame.
Desired output
+----+---------+-------+-------+-------+-------+--------+
| | Week1 | Qty | Lag_0 | Lag_3 | Lag_6 | Lag_12 |
+====+=========+=======+=======+=======+=======+========+
| 1 | 2015.35 | 6.00 | 50.00 | 9.00 | | |
+----+---------+-------+-------+-------+-------+--------+
| 2 | 2015.24 | 10.00 | 26.00 | 17.00 | | |
+----+---------+-------+-------+-------+-------+--------+
| 3 | 2015.31 | 7.00 | 43.00 | 10.00 | | |
+----+---------+-------+-------+-------+-------+--------+
| 4 | 2015.34 | 9.00 | 42.00 | 43.00 | | |
+----+---------+-------+-------+-------+-------+--------+
| 5 | 2015.28 | 4.00 | 10.00 | 33.00 | | |
+----+---------+-------+-------+-------+-------+--------+
| 6 | 2015.39 | 8.00 | 24.00 | 13.00 | | |
+----+---------+-------+-------+-------+-------+--------+
| 7 | 2015.25 | 5.00 | 33.00 | 25.00 | | |
+----+---------+-------+-------+-------+-------+--------+
| 8 | 2015.23 | 1.00 | 39.00 | 38.00 | | |
+----+---------+-------+-------+-------+-------+--------+
| 9 | 2015.21 | 2.00 | 17.00 | 6.00 | | |
+----+---------+-------+-------+-------+-------+--------+
| 10 | 2015.26 | 3.00 | 27.00 | 39.00 | | |
+----+---------+-------+-------+-------+-------+--------+
Column Lag_3, which I added manually, contains the values from the matching df2 week value but offset by three rows. Lag_6 would be offset by six rows, etc.
I suppose the challenge is that the lag() would have to happen in the joining table after the matching but before the value is returned.
Hope this makes sense and thanks for the assistance.

We just need to create the lag columns in the second data frame first and then do the join:
library(dplyr)
df2 %>%
  mutate(Lag_3 = lag(Value, 3), Lag_6 = lag(Value, 6)) %>%
  left_join(df1, ., by = c("Week1" = "Week2")) %>%
  rename(Lag_0 = Value)
Output:
# Week1 Qty Lag_0 Lag_3 Lag_6
#1 2015.35 6 50 9 46
#2 2015.24 10 26 17 6
#3 2015.31 7 43 10 33
#4 2015.34 9 42 43 10
#5 2015.28 4 10 33 25
#6 2015.39 8 24 13 16
#7 2015.25 5 33 25 49
#8 2015.23 1 39 38 15
#9 2015.21 2 17 6 32
#10 2015.26 3 27 39 38
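The order of the two steps is the whole point: the lag is taken over the full, week-ordered lookup table, and only afterwards are the rows picked out by the join. A minimal sketch of that idea in plain Python follows; the week labels and values are made up, and lag here is a hypothetical helper that mimics the padding behaviour of dplyr's lag.

```python
def lag(values, n):
    """Shift values down by n positions, padding the start with None."""
    return ([None] * n + values)[:len(values)]

# Made-up stand-in for df2: eight consecutive weeks, already in week order.
weeks = [f"2015.{w}" for w in range(1, 9)]
values = [10, 20, 30, 40, 50, 60, 70, 80]

# Step 1: compute the lag on the complete lookup table.
lookup = {
    week: {"Lag_0": v0, "Lag_3": v3}
    for week, v0, v3 in zip(weeks, values, lag(values, 3))
}

# Step 2: left join, keeping the row order of the (made-up) df1 weeks.
df1_weeks = ["2015.6", "2015.4", "2015.2"]
missing = {"Lag_0": None, "Lag_3": None}
joined = [{"Week1": w, **lookup.get(w, missing)} for w in df1_weeks]

for row in joined:
    print(row)
```

Week 2015.6 picks up the value from week 2015.3 as Lag_3, while week 2015.2 gets None because there is no row three weeks earlier. This only works if the lookup table is sorted by week before the lag is computed.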

renaming a column in a dataframe using a value from the global environment in R

I have a data frame (summary_transposed_no_time) and I want to rename one of the columns to a name I have stored as a value.
summary_transposed_no_time looks like this:
| A | B | C | D
------ | ------ | ------ | ------ | ------
area_1 | 0.870 | 0.435 | 0.968 | 0.679
area_2 | 0.456 | 0.259 | 0.906 | 0.467
area_3 | 0.298 | 0.256 | 0.457 | 0.768
area_4 | 0.994 | 0.987 | 0.365 | 0.765
My value is called test and it is set to "B" so I have tried using the following code with no luck:
summary_transposed_no_time <- names(summary_transposed_no_time)[c(test_col)]<-c("test")
Desired output
| A | test | C | D
------ | ------ | ------ | ------ | ------
area_1 | 0.870 | 0.435 | 0.968 | 0.679
area_2 | 0.456 | 0.259 | 0.906 | 0.467
area_3 | 0.298 | 0.256 | 0.457 | 0.768
area_4 | 0.994 | 0.987 | 0.365 | 0.765

I think you need the following (I have replaced summary_transposed_no_time with x):
names(x)[match(test_col, names(x))] <- "test"
A reproducible example with the built-in trees data:
x <- trees[1:5, ]
# Girth Height Volume
#1 8.3 70 10.3
#2 8.6 65 10.3
#3 8.8 63 10.2
#4 10.5 72 16.4
#5 10.7 81 18.8
names(x)[match("Girth", names(x))] <- "test"
# test Height Volume
#1 8.3 70 10.3
#2 8.6 65 10.3
#3 8.8 63 10.2
#4 10.5 72 16.4
#5 10.7 81 18.8
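The idea is simply "find the position of the stored name, then overwrite the name at that position". As a language-neutral illustration, the same two steps look like this in Python; the column names come from the question, and test_col holding "B" is an assumption based on the question's description.

```python
# Column names from the question's data frame.
names = ["A", "B", "C", "D"]

# The stored value naming the column to rename (assumed to hold "B").
test_col = "B"

# Step 1: look up the position of that name; step 2: overwrite it.
position = names.index(test_col)
names[position] = "test"
print(names)  # ['A', 'test', 'C', 'D']
```

In this Python version, names.index raises a ValueError if the stored name is not among the columns, so a typo in test_col fails loudly.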
