R tvm financial package

I'm trying to estimate the present value of a stream of payments using the tvm function in the financial package.
y <- tvm(pv = NA, i = 2.5, n = 1:10, pmt = -c(5, 5, 5, 5, 5, 8, 8, 8, 8, 8))
The result that I obtain is:
y
Time Value of Money model
    I% #N    PV FV PMT Days #Adv P/YR C/YR
1  2.5  1  4.99  0  -5   30    0   12   12
2  2.5  2  9.97  0  -5   30    0   12   12
3  2.5  3 14.94  0  -5   30    0   12   12
4  2.5  4 19.90  0  -5   30    0   12   12
5  2.5  5 24.84  0  -5   30    0   12   12
6  2.5  6 47.65  0  -8   30    0   12   12
7  2.5  7 55.54  0  -8   30    0   12   12
8  2.5  8 63.40  0  -8   30    0   12   12
9  2.5  9 71.26  0  -8   30    0   12   12
10 2.5 10 79.09  0  -8   30    0   12   12
There is a jump in the PV between rows 5 and 6 (where the payment changes to 8) that appears to be incorrect. This affects the result in y[10,3], which is the value I'm interested in obtaining.
Excel's NPV formula produces similar results when the payments are the same throughout the whole stream; however, when the vector of payments is variable, the results from tvm and NPV differ. I need to obtain the same result that the NPV formula provides in Excel.
What should I do to make this work?

The cf function helps, but it is not always consistent with Excel. The jump itself is explained by how tvm works: each row is an independent annuity with a constant payment, so row 6 is the present value of six payments of 8, not of the mixed 5/8 stream.
I solved my problem using the following function:
npv <- function(cf, rate, t) sum(cf / (1 + rate)^t)  # cf: cash flows; t: period numbers
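For example, with the payment stream from the question and a rate of 2.5% per period (a sketch of the call; Excel's NPV discounts every payment starting one period out, which is what the 1:10 exponent vector does):
payments <- c(5, 5, 5, 5, 5, 8, 8, 8, 8, 8)
npv(payments, 0.025, 1:10)  # matches Excel's =NPV(2.5%, 5, 5, 5, 5, 5, 8, 8, 8, 8, 8)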


Choosing the correct fixed and random variables in a generalized linear mixed model (GLMM) in a longitudinal study (repeated measures)

I want to explore the relationship between the abundance of an organism and several possible explanatory factors. I have doubts about which variables should be treated as fixed or random in the GLMM.
I have a dataset with the number of snails at different sites within a national park (all sites are under the same climatic conditions), but there are local parameters whose effects on snail abundance haven't been studied yet.
This is a longitudinal study, with repeated measures over time (every month, for almost two years). The number of snails was counted in the field, always at the same 21 sites (each site has a 6x6 square metre plot, delimited with wooden stakes).
In case it could influence the analysis, note that some parameters may vary over time, such as the vegetation cover in each plot or the presence of the snail's natural predator (recorded as yes/no values). Others, however, are always the same, because they are specific to each site, such as the distance to the nearest riverbed or the type of soil.
Here is a subset of my data:
> snail.data
site time snails vegetation_cover predator type_soil distant_riverbed
1 1 1 9 NA n 1 13
2 1 2 7 0.8 n 1 13
3 1 3 13 1.4 n 1 13
4 1 4 14 0.6 n 1 13
5 1 5 12 1.6 n 1 13
10 2 1 0 NA n 1 136
11 2 2 0 0.0 n 1 136
12 2 3 0 0.0 n 1 136
13 2 4 0 0.0 n 1 136
14 2 5 0 0.0 n 1 136
19 3 1 1 NA n 2 201
20 3 2 0 0.0 n 2 201
21 3 3 0 0.0 y 2 201
22 3 4 3 0.0 n 2 201
23 3 5 2 0.0 n 2 201
28 4 1 0 NA n 2 104
29 4 2 0 0.0 n 2 104
30 4 3 0 0.0 y 2 104
31 4 4 0 0.0 n 2 104
32 4 5 0 0.0 n 2 104
37 5 1 1 NA n 3 65
38 5 2 0 2.4 n 3 65
39 5 3 3 2.2 n 3 65
40 5 4 2 2.2 n 3 65
41 5 5 4 2.0 y 3 65
46 6 1 1 NA n 3 78
47 6 2 2 3.0 n 3 78
48 6 3 7 2.8 n 3 78
49 6 4 3 1.8 n 3 78
50 6 5 6 1.2 y 3 78
55 7 1 14 NA n 3 91
56 7 2 21 2.8 n 3 91
57 7 3 16 2.6 n 3 91
58 7 4 15 1.6 n 3 91
59 7 5 8 2.0 n 3 91
So I'm interested in investigating whether the number of snails differs significantly between sites and whether those differences are related to some of the specific parameters.
So far the best statistical approach I have found is a generalized linear mixed model, but I'm struggling to choose the correct fixed and random variables. My reasoning is that although I'm checking for differences among sites (by comparing the number of snails), the focus of the study is on the other parameters mentioned above, so site would be a random factor.
My question, then, is: should 'site' and 'time' be considered random factors, with the local parameters as the fixed variables? Should I include interactions between time and the other factors?
I have set up my command as follows:
library(lme4)
mixed_model <- glmer(snails ~ vegetation_cover + predator + type_soil + distant_riverbed + (1 | site) + (1 | time), data = snail.data, family = poisson)
Would it be the correct syntax for what I have described?
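One detail worth double-checking before fitting (an assumption on my part, based only on the printed subset): if type_soil is a categorical code rather than an ordered quantity, it should be converted to a factor, otherwise glmer treats the 1/2/3 values as a continuous trend.
# Assumed preparation step: treat the coded categoricals as factors
snail.data$type_soil <- factor(snail.data$type_soil)
snail.data$predator <- factor(snail.data$predator)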

proper date format for time series precipitation analysis

I have 100 years of precipitation and temperature data that I would like to analyze using the 'seas' package. I have tried several formats for the date column and get the following error every time:
Error in seas.df.check(x, orig, var) :
a ‘date’ column must exist in ‘srs_precip’
Example data below, where I have tried to coerce the Gregorian dates into Date and also into POSIX, but I still get the error message from the 'seas' library.
Year yr.m.d Date date.J Inches mm_SRS Max_Temp_F Max_Temp_C Min_Temp_F Min_Temp_C Date.POSIX
1 1919 1919-01-01 1919-01-01 1 0 0 39 3.888889 26 -3.333333333 1919-01-01
2 1919 1919-01-02 1919-01-02 2 0 0 35 1.666667 19 -7.222222222 1919-01-02
3 1919 1919-01-03 1919-01-03 3 0 0 40 4.444444 14 -10 1919-01-03
4 1919 1919-01-04 1919-01-04 4 0 0 52 11.111111 20 -6.666666667 1919-01-04
5 1919 1919-01-05 1919-01-05 5 0 0 43 6.111111 20 -6.666666667 1919-01-05
6 1919 1919-01-06 1919-01-06 6 0 0 56 13.333333 31 -0.555555556 1919-01-06
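A minimal sketch of one thing to try, assuming the data frame is named srs_precip (the object named in the error) and the columns are as printed above: the check in seas.df.check looks for a column literally named date, so the capital-D Date column would not be found.
# Assumption: 'seas' wants a lower-case 'date' column of class Date
srs_precip$date <- as.Date(as.character(srs_precip$yr.m.d), format = "%Y-%m-%d")
str(srs_precip$date)  # should report class 'Date'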

X axis not ordering discrete value after melt of DF

I am fairly new to R and I have a problem using ggplot2 together with the melt function. In this specific case, I am trying to create a multi-line plot representing time gaps and their evolution during a race.
Say the data frame is the following (DF_TimeGap)
Lap Ath1 Ath2 Ath3 Ath4 Ath5
1 1 0 0 0 -1 1
2 2 0 0 14 0 28
3 3 0 -1 3 0 18
4 4 0 0 1 0 3
5 5 0 -8 1 -9 3
6 6 0 -22 0 -23 1
7 7 0 0 1 -19 3
8 8 0 -1 13 -2 13
9 9 0 -1 1 0 -1
10 10 0 5 7 8 10
I then melt it with
library(reshape2)
DFMelt_TimeGap = melt(DF_TimeGap, id.var="Lap")
names(DFMelt_TimeGap)[2] = "Rider"
names(DFMelt_TimeGap)[3] = "Gap"
and it looks like this (I'll just report the first two riders for space reasons)
Lap Rider Gap
1 1 Ath1 0
2 2 Ath1 0
3 3 Ath1 0
4 4 Ath1 0
5 5 Ath1 0
6 6 Ath1 0
7 7 Ath1 0
8 8 Ath1 0
9 9 Ath1 0
10 10 Ath1 0
11 1 Ath2 0
12 2 Ath2 0
13 3 Ath2 -1
14 4 Ath2 0
15 5 Ath2 -8
16 6 Ath2 -22
17 7 Ath2 0
18 8 Ath2 -1
19 9 Ath2 -1
20 10 Ath2 5
...
when I am trying to plot the multiline plot then
ggplot(DFMelt_TimeGap, aes(x = Lap, y = Gap, col= Rider, group = Rider)) +
geom_point()+
geom_line()+
xlab("Lap")+ ylab("Gap (s)")
what I obtain is the following graph (forget about the colour labels; I am leaving out unnecessary code),
which would be fine if not for the fact that the ordering on the x axis is
1 10 2 3 4 5 6 7 8 9
Is anyone aware of how to fix this sort of issue?
Thanks to everyone who is so keen to contribute.
In your melt process, Lap somehow gets transformed to a character. My guess is that in your real data Lap already contains characters (or, worse, is a factor). Then in your ggplot the x-axis is mapped to a character column, which uses alphabetical ordering by default.
You could verify that via str(DFMelt_TimeGap).
Best is to make sure that Lap is numeric to start with, so DF_TimeGap$Lap <- as.numeric(as.character(DF_TimeGap$Lap)) should fix it. I used as.numeric(as.character(.)) in case your Lap was originally formatted as a factor.
This will result in a numeric scale for your plot. You may like to add scale_x_continuous(breaks = 1:10) to have breaks at each Lap number.
If you want to stick with the factor/character variable, you have to manually adjust the ordering of the levels in DFMelt_TimeGap, which you could do via DFMelt_TimeGap$Lap <- factor(DFMelt_TimeGap$Lap, levels = 1:10)
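Putting the pieces together, a minimal sketch of the fixed pipeline (assuming DF_TimeGap as posted in the question):
library(reshape2)
library(ggplot2)

# Make sure Lap is numeric before melting (also handles a factor-coded Lap)
DF_TimeGap$Lap <- as.numeric(as.character(DF_TimeGap$Lap))

DFMelt_TimeGap <- melt(DF_TimeGap, id.vars = "Lap",
                       variable.name = "Rider", value.name = "Gap")

ggplot(DFMelt_TimeGap, aes(x = Lap, y = Gap, colour = Rider)) +
  geom_point() +
  geom_line() +
  scale_x_continuous(breaks = 1:10) +  # one break per lap
  xlab("Lap") + ylab("Gap (s)")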

transform values in data frame, generate new values as 100 minus current value

I'm currently working on a script which will eventually plot the accumulation of losses from cell divisions. First I generate a matrix of values, and then I count the number of times 0 occurs in each column; a 0 represents a loss.
However, I am now thinking that a nice plot would be a degradation curve. So, given the following example:
>losses_plot_data <- melt(full_losses_data, id=c("Divisions", "Accuracy"), value.name = "Losses", variable.name = "Size")
> full_losses_data
Divisions Accuracy 20 15 10 5 2
1 0 0 0 0 3 25
2 0 0 0 1 10 39
3 0 0 1 3 17 48
4 0 0 1 5 23 55
5 0 1 3 8 29 60
6 0 1 4 11 34 64
7 0 2 5 13 38 67
8 0 3 7 16 42 70
9 0 4 9 19 45 72
10 0 5 11 22 48 74
Is there a way I can easily turn this table into 100 minus the numbers shown? If I could plot that data instead of my current data, I would have a lovely curve of degradation from 100% down to however many cells have been lost.
Assuming you do not want to do that for the first column:
fld <- full_losses_data
fld[, -1] <- 100 - fld[, -1]  # every column except the first
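If the Accuracy column should stay untouched as well (a guess based on the printed data, where the loss counts start in the third column), exclude the first two columns instead:
fld[, -(1:2)] <- 100 - fld[, -(1:2)]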

Count of values in intervals of latitude and years

I have different data frames with a column containing the latitude of some records (latitude) and, in another column of the same data frame, the date of the records (datecollected).
I would like to count the records falling in the same intervals of latitude (5 degrees) and year (two years) and export the counts to a new data frame.
(Hint: you'll make it easier for us to answer by providing some sample data.)
dataset <- data.frame(
  datecollected = sample(as.Date("2000-01-01") + 0:3650, 1000, replace = TRUE),
  latitude = 90 * runif(1000)
)
We round datecollected down to the nearest even year:
year.index <- (as.POSIXlt(dataset$datecollected)$year %/% 2)*2+1900
Similarly, we round the latitude down to the nearest multiple of 5 degrees:
latitude.index <- (floor(dataset$latitude) %/% 5)*5
Then we simply build a table on the rounded years and latitudes:
table(year.index,latitude.index)
latitude.index
year.index 0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85
2000 12 9 15 7 11 10 11 14 9 13 11 10 8 11 13 25 10 18
2002 11 9 11 16 11 15 12 5 12 13 7 15 8 7 11 7 10 13
2004 8 12 9 10 12 16 12 13 9 7 16 11 6 13 4 15 12 10
2006 14 8 13 10 12 9 12 9 6 11 11 9 13 9 10 5 5 12
2008 8 12 17 12 12 8 12 8 14 12 11 11 10 10 14 16 17 13
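Since the question also asks to export the counts to a new data frame, the contingency table can be converted with as.data.frame (a sketch; the column and file names here are made up):
counts <- as.data.frame(table(year.index, latitude.index))
names(counts) <- c("year", "latitude", "records")
write.csv(counts, "record_counts.csv", row.names = FALSE)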
EDIT: after a bit of discussion in the comments, I'll post my current script. It seems like there may be an issue when you read the data into R. This is what I do and what I get:
rm(list=ls())
dataset <- read.csv("GADUS.csv",header=TRUE,sep=",")
year.index <- (as.POSIXlt(as.character(dataset$datecollected),format="%Y-%m-%d")$year
%/% 2)*2+1900
latitude.index <- (floor(dataset$latitude) %/% 5)*5
table(year.index,latitude.index)
latitude.index
year.index 0 5 20 35 40 45 50 55 60 65 70 75
1752 0 0 0 0 0 20 0 0 0 0 0 0
1754 0 0 0 0 0 27 0 3 0 0 0 0
1756 0 0 0 0 0 21 0 1 0 0 0 0
1758 0 0 0 0 0 46 0 2 0 0 0 0
...
Does this give the same result for you? If not, please edit your question and post the result of str(dataset[,c("datecollected","latitude")]).
