Simple Percentage formula calculation - percentage

Morning all
I need to calculate a percentage in Excel 2010
I have cell R15 as 17%
I have cell F15 as 197
How do I calculate 17% of 197 with a formula?
Very simple I know but I am useless
Thanks
Danny

In Excel percentages are stored as decimals so 75% is actually stored as 0.75.
Given this you should be able to multiply the 2 together to get the result
197 x .17 = 33.49
or =(F15*R15)
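The arithmetic behind both formulas can be checked outside Excel. Below is a minimal sketch in Python; the function name percent_of and the as_whole_number flag are illustrative only and are not Excel features.

```python
def percent_of(value, percent, as_whole_number=False):
    """Return `percent` of `value`.

    A cell formatted as a percentage holds a fraction (17% is 0.17), so a
    plain product is enough. If 17 was typed without the % sign, the product
    has to be divided by 100.
    """
    fraction = percent / 100 if as_whole_number else percent
    return value * fraction


# Mirrors =F15*R15 with F15 = 197 and R15 = 17%
result = percent_of(197, 0.17)

# Mirrors =F15*R15/100 with R15 typed as the plain number 17
result_plain = percent_of(197, 17, as_whole_number=True)

print(round(result, 2), round(result_plain, 2))
```

Both calls describe the same calculation; only the way the 17 was entered differs.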

Multiply both cells:
=F15*R15
If you wrote only 197 and 17 (without % symbol) do the following:
=F15*R15/100
Bye.

Rolling subset of data frame within for loop in R

Big picture explanation is I am trying to do a sliding window analysis on environmental data in R. I have PAR (photosynthetically active radiation) data for a select number of sequential dates (pre-determined based off other biological factors) for two years (2014 and 2015) with one value of PAR per day. See below the few first lines of the data frame (data frame name is "rollingpar").
par14 par15
1356.3242 1306.7725
NaN 1232.5637
1349.3519 505.4832
NaN 1350.4282
1344.9306 1344.6508
NaN 1277.9051
989.5620 NaN
I would like to create a loop (or use any other approach) to subset the data frame (both columns!) into two-week windows (14 rows) from start to finish, sliding from one window to the next by a week (7 rows). So the first window would include rows 1 to 14, the second window rows 8 to 21, and so forth. After subsetting, the data needs to be reshaped (currently with the melt function in the reshape2 package) so that the PAR values are in one column and the variable (par14 or par15) is in the other. Then I need to drop the NaN rows and finally perform a Wilcoxon rank sum test on each window, comparing PAR by year (par14 or par15). Below is the code I wrote as a proof of concept; for the first window it gives me exactly what I want.
library(reshape2)
par.sub=rollingpar[1:14, ]
par.sub=melt(par.sub)
par.sub=na.omit(par.sub)
par.sub$variable=as.factor(par.sub$variable)
wilcox.test(value~variable, par.sub)
#when melt flips a data frame the columns become value and variable...
#for this case value holds the PAR data and variable holds the year
#information
When I tried to write a for loop to iterate the process through the whole data frame (total rows = 139) I got errors every which way I ran it. Additionally, this loop doesn't even take into account the sliding by one week aspect. I figured if I could just figure out how to get windows and run analysis via a loop first then I could try to parse through the sliding part. Basically I realize that what I explained I wanted and what I wrote this for loop to do are slightly different. The code below is sliding row by row or on a one day basis. I would greatly appreciate if the solution encompassed the sliding by a week aspect. I am fairly new to R and do not have extensive experience with for loops so I feel like there is probably an easy fix to make this work.
wilcoxvalues=data.frame(p.values=numeric(0))
Upar=rollingpar$par14
for (i in 1:length(Upar)){
par.sub=rollingpar[[i]:[i]+13, ]
par.sub=melt(par.sub)
par.sub=na.omit(par.sub)
par.sub$variable=as.factor(par.sub$variable)
save.sub=wilcox.test(value~variable, par.sub)
for (j in 1:length(save.sub)){
wilcoxvalues$p.value[j]=save.sub$p.value
}
}
If anyone has a much better way to do this through a different package or function that I am unaware of, I would love to be enlightened. I did try rollapply but could not find a way to apply it to an entire data frame rather than a single column. I have searched the many other questions on subsetting, for loops, and rolling analysis, but can't quite find exactly what I need. Any help would be appreciated by a frustrated grad student :) and if I did not provide enough information please let me know.
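Consider lapply over a sequence of window start rows, one every 7 rows through the 365 days of the year (the last day is excluded so that the final group is not a single day). Each call returns a one-row data frame with the Wilcoxon test p-value and a week indicator. If the data frame is shorter than a year (139 rows in the question), end the sequence at nrow(rollingpar) - 13 so that every window is full. Afterwards, row-bind the list items into one final data frame: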
library(reshape2)
slidingWindow <- seq(1,364,by=7)
slidingWindow
# [1] 1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106 113 120 127
# [20] 134 141 148 155 162 169 176 183 190 197 204 211 218 225 232 239 246 253 260
# [39] 267 274 281 288 295 302 309 316 323 330 337 344 351 358
# LIST OF WILCOX P VALUES DFs FOR EACH SLIDING WINDOW (TWO-WEEK PERIODS)
wilcoxvalues <- lapply(slidingWindow, function(i) {
par.sub=rollingpar[i:(i+13), ]
par.sub=melt(par.sub)
par.sub=na.omit(par.sub)
par.sub$variable=as.factor(par.sub$variable)
data.frame(week=paste0("Week: ", i%/%7+1, "-", i%/%7+2),
p.values=wilcox.test(value~variable, par.sub)$p.value)
})
# SINGLE DF OF ALL P-VALUES
wilcoxdf <- do.call(rbind, wilcoxvalues)
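To see which rows each window covers, the indexing can be checked on its own before any test is run. Below is a minimal sketch in Python, assuming the 139 rows mentioned in the question; window_bounds is a hypothetical helper, and the bounds are 1-based and inclusive so they read like R row indices.

```python
def window_bounds(n_rows, width=14, step=7):
    """Return (first_row, last_row) pairs for every full window.

    Rows are 1-based and inclusive, matching R's i:(i + 13) indexing.
    Windows that would run past the last row are left out.
    """
    return [(start, start + width - 1)
            for start in range(1, n_rows - width + 2, step)]


bounds = window_bounds(139)
for first, last in bounds:
    print(first, last)
```

With 139 rows this gives 18 full windows, the last covering rows 120 to 133; rows 134 to 139 only ever appear in a partial window, which this sketch drops.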

Scilab Data stretching

I have a data file with 2 columns. First column runs from 0 to 1390 second column has different values. (1st column is X pixel coordinates 2nd is intensity values).
I would like to "stretch" the data so that the first column runs from 0 to 1516 and the second column gets linearly interpolated for these new datapoints.
Any simple way to do this in scilab?
Data looks like this:
0 300.333
1 289.667
2 273
...
1388 427
1389 393.667
1390 252
Interpolation
You can linearly interpolate using interpln. Following the demo implementation on the docs, this results in the below code.
Example code
x=[0 1 2 1388 1389 1390];
y=[300.333 289.667 273 427 393.667 252];
plot2d(x',y',[-3],"011"," ",[-10,0,1400, 500]);
yi=interpln([x;y],0:1390);
plot2d((0:1390)',yi',[3],"000");
Resulting plot
Extrapolation
I think you are thinking of extrapolation, since it is outside the known measurements and not in between.
You should decide whether you would rather fit the data, for example with datafit. For a tutorial see here or here.
The question was how to "stretch" the y vector from 1391 values to 1517 values. It is possible to do that with interpln as suggested by @user1149326, but we need to stretch the x vector before the interpolation:
x=[0 1 2 1388 1389 1390];
y=[300.333 289.667 273 427 393.667 252];
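// 1517 evenly spaced sample positions across the original range 0..1390,
// so that yi has the same length as x3 below
x2=linspace(0,1390,1517);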
yi=interpln([x;y],x2);
x3=0:1516;
plot2d(x3',yi',[3],"000");
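The resampling step itself is independent of Scilab. Below is a minimal sketch in Python of the same idea, assuming the intensity values sit on consecutive integer pixel positions starting at 0; stretch is a hypothetical helper name, not a Scilab or library function.

```python
def stretch(y, new_len):
    """Linearly resample `y` onto `new_len` evenly spaced points.

    New point j is mapped back to the fractional position
    j * (len(y) - 1) / (new_len - 1) in the original data, and the value
    there is interpolated between the two neighbouring samples.
    """
    if new_len < 2 or len(y) < 2:
        raise ValueError("need at least two points on both grids")
    last = len(y) - 1
    out = []
    for j in range(new_len):
        pos = j * last / (new_len - 1)
        i = int(pos)
        if i >= last:            # the final point lands exactly on the last sample
            out.append(float(y[last]))
            continue
        frac = pos - i
        out.append(y[i] * (1 - frac) + y[i + 1] * frac)
    return out


# Small illustrative input: 3 samples stretched to 5
demo = stretch([0.0, 10.0, 20.0], 5)
print(demo)
```

For the data in the question the call would be stretch(intensity, 1517), where intensity holds the 1391 values of the second column; the first and last values are preserved.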

Mismatching drawdown calculations

I would like to ask you to clarify the next question, which is of extreme importance to me, since a major part of my master's thesis relies on properly implementing the data calculated in the following example.
I have a list of financial time series, which look like this (AUDUSD example):
Open High Low Last
1992-05-18 0.7571 0.7600 0.7565 0.7598
1992-05-19 0.7594 0.7595 0.7570 0.7573
1992-05-20 0.7569 0.7570 0.7548 0.7562
1992-05-21 0.7558 0.7590 0.7540 0.7570
1992-05-22 0.7574 0.7585 0.7555 0.7576
1992-05-25 0.7575 0.7598 0.7568 0.7582
From this data I calculate log returns for the column Last to obtain something like this
Last
1992-05-19 -0.0032957646
1992-05-20 -0.0014535847
1992-05-21 0.0010573620
1992-05-22 0.0007922884
Now I want to calculate the drawdowns in the above presented time series, which I achieve by using (from package PerformanceAnalytics)
ddStats <- drawdownsStats(timeSeries(AUDUSDLgRetLast[,1], rownames(AUDUSDLgRetLast)))
which results in the following output (here are just the first 5 lines, but it returns every single drawdown, including also one day long ones)
From Trough To Depth Length ToTrough Recovery
1 1996-12-03 2001-04-02 2007-07-13 -0.4298531511 2766 1127 1639
2 2008-07-16 2008-10-27 2011-04-08 -0.4003839141 713 74 639
3 2011-07-28 2014-01-24 2014-05-13 -0.2254426369 730 652 NA
4 1992-06-09 1993-10-04 1994-12-06 -0.1609854215 650 344 306
5 2007-07-26 2007-08-16 2007-09-28 -0.1037999707 47 16 31
Now, the problem is the following: the depth of the worst drawdown (according to the output above) is -0.4298, whereas if I do the following calculation "by hand" I obtain
(AUDUSD[as.character(ddStats[1,1]),4]-AUDUSD[as.character(ddStats[1,2]),4])/(AUDUSD[as.character(ddStats[1,1]),4])
[1] 0.399373
To make things clearer, these are the two rows of the AUDUSD data frame for the From and Trough dates:
AUDUSD[as.character(ddStats[1,1]),]
Open High Low Last
1996-12-03 0.8161 0.8167 0.7845 0.7975
AUDUSD[as.character(ddStats[1,2]),]
Open High Low Last
2001-04-02 0.4858 0.4887 0.4773 0.479
Also, the other drawdown depths do not agree with the calculations "by hand". What am I missing? How come these two numbers, which should be the same, differ by a substantial amount?
I have tried replicating the drawdown via:
cumsum(rets) -cummax(cumsum(rets))
where rets is the vector of your log returns.
For some reason, when I calculate drawdowns smaller than about 20% I get the same results as table.Drawdowns() and drawdownsStats(), but for large drawdowns (over 35%, say) the maximum drawdown figures begin to diverge between the calculations. More specifically, table.Drawdowns() and drawdownsStats() are overstated (at least in what I have seen). I do not know why this is so, but it might help to use a confidence interval for large drawdowns (those over 35%) based on a standard error of the drawdown. I would use 0.4298531511/sqrt(1127), that is, the maximum drawdown divided by the square root of the length to trough (the ToTrough column). This yields +/- 0.01280437, or a drawdown between 0.4169956 and 0.4426044; the lower bound of 0.4169956 is much closer to your "by-hand" calculation of 0.399373. Hope it helps.
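One possible cause, offered as an assumption rather than a confirmed diagnosis of drawdownsStats(): a drawdown routine that compounds its input as simple returns (multiplying 1 + r) will report deeper drawdowns when it is handed log returns, and the gap widens as the drawdown grows. The sketch below is in Python with a made-up price series; all function names are hypothetical.

```python
import math


def max_drawdown_from_prices(prices):
    """Worst peak-to-trough loss as a fraction of the running peak price."""
    peak, worst = prices[0], 0.0
    for p in prices:
        peak = max(peak, p)
        worst = min(worst, p / peak - 1.0)
    return worst


def max_drawdown_from_wealth_steps(growth_factors):
    """Worst drawdown of a wealth index built from per-period growth factors."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for g in growth_factors:
        wealth *= g
        peak = max(peak, wealth)
        worst = min(worst, wealth / peak - 1.0)
    return worst


# Illustrative prices only, not taken from the AUDUSD data
prices = [100.0, 90.0, 80.0, 70.0, 60.0, 75.0]
log_rets = [math.log(b / a) for a, b in zip(prices, prices[1:])]

by_hand = max_drawdown_from_prices(prices)
# Correct treatment of log returns: each period grows wealth by exp(r)
from_logs = max_drawdown_from_wealth_steps([math.exp(r) for r in log_rets])
# Log returns mistaken for simple returns: each period grows wealth by 1 + r
mistaken = max_drawdown_from_wealth_steps([1.0 + r for r in log_rets])

print(by_hand, from_logs, mistaken)
```

Because 1 + r is always below exp(r) for a non-zero r, the mistaken version is deeper than the price-based figure, which matches the direction of the mismatch described above. Passing simple returns, or converting with exp(r) - 1 first, would be one way to test whether this explains the difference.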

Looping within a loop in R

I'm trying to build quite a complex loop in R.
I have a set of data set as an object called p_int (p_int is peak intensity).
For this example the structure of p_int i.e. str(p_int) is:
num [1:1599]
The size of p_int can vary i.e. [1:688], [1:1200] etc.
What I'm trying to do with p_int is to construct a loop that extracts the monoisotopic peaks. These are peaks with certain characteristics, which will be extracted into a second object, mono_iso:
Take the first eight results in p_int. Of these eight, find the result with the greatest score (this score also needs to be above 50).
Once this result has been found, record it into mono_iso.
The loop then fixes on the position of this result within the large dataset. From this position it skips the next result along the dataset before doing the same for the next set of 8 results.
So something similar to this:
16 Results: 100 120 90 66 220 90 70 30 70 100 54 85 310 200 33 41
** So, to begin with, the loop would take the first 8 results:
100 120 90 66 220 90 70 30
**It would then decide which peak is the greatest:
220
**It would determine whether 220 was greater than 50
IF YES: It would record 220 into "mono_iso"
IF NO: It would move on to the next set of 8 results
**220 is greater than 50... so records into mono_iso
The loop would then place its position at 220, skip the "90", and begin the same thing again for the next set of 8 results, starting at the next result in line, in this case the 70:
70 30 70 100 54 85 310 200
It would then record the "310" value (highest value) and do the same thing again etc etc until the end of the set of data.
Hope this makes perfect sense. If anyone could possibly help me out into making such a loop work with R-script, I'd very much appreciate it.
Use this:
mono_iso <- aggregate(p_int, by=list(group=((seq_along(p_int)-1)%/%8)+1), function(x)ifelse(max(x)>50,max(x),NA))$x
This will put NA for groups such that max(...)<=50. If you want to filter those out, use this:
mono_iso <- mono_iso[!is.na(mono_iso)]
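Note that grouping into fixed blocks of 8 does not reposition after each peak. The procedure as described in the question (restart two positions after the peak that was found) can be sketched as below. This is a minimal sketch in Python under two assumptions that the question does not settle: a window with no value above 50 advances by a full 8 positions, and a shorter window at the end of the data is still examined. The function name is hypothetical.

```python
def monoisotopic_peaks(p_int, window=8, threshold=50):
    """Collect the largest value of each window, restarting after each peak."""
    mono_iso = []
    start = 0
    while start < len(p_int):
        chunk = p_int[start:start + window]
        peak = max(chunk)
        if peak > threshold:
            mono_iso.append(peak)
            # Move to the peak, skip the single value after it,
            # and start the next window on the value after that.
            start += chunk.index(peak) + 2
        else:
            # Nothing above the threshold: move on to the next set of 8.
            start += window
    return mono_iso


results = [100, 120, 90, 66, 220, 90, 70, 30,
           70, 100, 54, 85, 310, 200, 33, 41]
mono_iso = monoisotopic_peaks(results)
print(mono_iso)
```

On the 16 example values this records 220 from the first window and 310 from the window that starts at the first 70, as in the walk-through in the question.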

Data dictionary packing in R

I am thinking of writing a data dictionary function in R which, taking a data frame as an argument, will do the following:
1) Create a text file which:
a. Summarises the data frame by listing the number of variables by class, number of observations, number of complete observations … etc
b. For each variable, summarise the key facts about that variable: mean, min, max, mode, number of missing observations … etc
2) Creates a pdf containing a histogram for each numeric or integer variable and a bar chart for each attribute variable.
The basic idea is to create a data dictionary of a data frame with one function.
My question is: is there a package which already does this? And if not, do people think this would be a useful function?
Thanks
There are a variety of describe functions in various packages. The one I am most familiar with is Hmisc::describe. Here's its description from its help page:
" This function determines whether the variable is character, factor, category, binary, discrete numeric, and continuous numeric, and prints a concise statistical summary according to each. A numeric variable is deemed discrete if it has <= 10 unique values. In this case, quantiles are not printed. A frequency table is printed for any non-binary variable if it has no more than 20 unique values. For any variable with at least 20 unique values, the 5 lowest and highest values are printed."
And an example of the output:
Hmisc::describe(work2[, c("CHOLEST","HDL")])
work2[, c("CHOLEST", "HDL")]
2 Variables 5325006 Observations
----------------------------------------------------------------------------------
CHOLEST
n missing unique Mean .05 .10 .25 .50 .75 .90
4410307 914699 689 199.4 141 152 172 196 223 250
.95
268
lowest : 0 10 19 20 31, highest: 1102 1204 1213 1219 1234
----------------------------------------------------------------------------------
HDL
n missing unique Mean .05 .10 .25 .50 .75 .90
4410298 914708 258 54.2 32 36 43 52 63 75
.95
83
lowest : -11.0 0.0 0.2 1.0 2.0, highest: 241.0 243.0 248.0 272.0 275.0
----------------------------------------------------------------------------------
Furthermore, on your point about getting histograms, the Hmisc::latex method for a describe object will produce histograms interleaved in the output illustrated above. (You do need a functioning LaTeX installation to take advantage of this.) I'm pretty sure you can find an illustration of the output either on Harrell's website or in the Amazon "Look Inside" preview of his book "Regression Modeling Strategies". The book has a ton of useful material regarding data analysis.
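For the plain-text part of the proposed data dictionary (observation counts, missing values, and mean, min, and max per variable), the logic is small enough to sketch directly. The following is a minimal illustration in Python using only the standard library; it is not how Hmisc::describe is implemented, and the function name and the dict-of-lists input format are assumptions made for the example.

```python
import math
from collections import Counter


def describe_columns(columns):
    """Summarise each column of a dict mapping column name to a list of values.

    Missing values are None or NaN. Numeric columns get mean/min/max,
    all other columns get a frequency table.
    """
    summary = {}
    for name, values in columns.items():
        present = [v for v in values
                   if v is not None
                   and not (isinstance(v, float) and math.isnan(v))]
        entry = {
            "n": len(present),
            "missing": len(values) - len(present),
            "unique": len(set(present)),
        }
        if present and all(isinstance(v, (int, float)) for v in present):
            entry["mean"] = sum(present) / len(present)
            entry["min"] = min(present)
            entry["max"] = max(present)
        else:
            entry["counts"] = dict(Counter(present))
        summary[name] = entry
    return summary


# Made-up example data, not the CHOLEST/HDL data shown above
data = {
    "hdl": [32.0, None, 52.0, 63.0],
    "sex": ["f", "m", None, "f"],
}
report = describe_columns(data)
for name, entry in report.items():
    print(name, entry)
```

Writing the entries to a text file and adding one plot per variable would complete the two outputs the question asks for.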
