Divide column values within a vector - r

I'm not sure my title properly expresses what I'm asking, but it should make sense by the end. I just started learning R, so I'm a newbie. I've been reading through tutorial series and PDFs I've found online.
I'm working on a data set and I created a data frame of just the year 2001 and the DAM value Bon. Here's a picture.
What I want to do now is create a matrix with 3 columns: Coho Adults, Coho Jacks, and, as the third column, the ratio of Coho Jacks to Adults. The ratio is the part I'm having trouble with.
If I do a line of code like this I get a normal output.
(cohoPassage <- matrix(fishPassage1995BON[c(5,6, 7)], ncol = 3))
The values are 259756, 6780, and 114934.
I figure that to get the ratio, I should divide column 5's value by column 6's. So basically 259756/6780 = 38.31
I've tried many things like:
(cohoPassage <- matrix(fishPassage1995BON[c(5,6, 5/6)], ncol = 3))
For some reason, this just outputs the value of the fifth column again instead of dividing.
I've tried this:
matrix(fishPassage1995BON[c(5,6)],fishPassage1995BON[,5]/fishPassage1995BON[,6], ncol = 3)
This gives me incorrect output.
I decided to break down the problem and divide the fifth and sixth columns separately and it gave the correct ratio.
If I create a matrix like this
matrix(fishPassage1995BON[,5]/fishPassage1995BON[,6])
It outputs the correct ratio of 38.31209. But when I try to combine everything, I just keep getting errors.
What can I do? Any help would be appreciated. Thank you.
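One possible approach, as a rough sketch: compute the ratio first, then bind it on as a third column with cbind. The data frame and its column names below (passage, col5, col6) are made up for illustration and only hold the two values quoted in the question. As for why the c(5, 6, 5/6) attempt does not divide anything: 5/6 is just the number 0.8333, and R truncates a fractional index to 0, which selects nothing, so matrix() is handed only columns 5 and 6 and probably recycles the fifth to fill the third slot.

```r
# Hypothetical stand-in for the asker's one-row data frame; the column
# names are invented, the two values are the ones quoted in the question.
passage <- data.frame(col5 = 259756, col6 = 6780)

# Divide the values themselves, not the column positions.
ratio <- passage$col5 / passage$col6

# Bind the two original columns and the ratio into a 1 x 3 matrix.
cohoPassage <- cbind(as.matrix(passage), ratio = ratio)
cohoPassage
```

With the real data, something like cbind(as.matrix(fishPassage1995BON[, 5:6]), ratio = fishPassage1995BON[, 5] / fishPassage1995BON[, 6]) should follow the same pattern, assuming both columns are numeric.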

Related

Create a simple range/values in range table

I am working on a simple project to help me get to know R, coming from JavaScript.
I have imported a list of numbers, and all I want to do is export a table that looks like the following:
"range","number"
"0.000-0.510",863
"0.510-1.020",21
"1.020-1.530",2
"1.530-2.040",2
"2.040-2.550",0
"2.550-3.059",2
"3.059-3.569",0
"3.569-4.079",3
"4.079->4.589",0
"4.589->5.099",1
where the ranges are in 10 steps, from the smallest to the largest value, the "range" and "number" are the top rows, and the columns going down are the different ranges and number of occurrences in this range.
This is my attempt so far:
list <- read.csv(file = "results/solarSystem.data")
table(list)
range <- (max(list) - min(list)) / 10
a1<-as.data.frame(table(cut(list,breaks=c(min(list),min(list)+1*range,min(list)+2*range,min(list)+3*range,min(list)+4*range,min(list)+5*range,min(list)+6*range,min(list)+7*range,min(list)+8*range,min(list)+9*range,max(list)))))
colnames(a1)<-c("range","freq")
a1
However, I get an error that
'Error in cut.default(list, breaks = c(min(list), min(list) + 1 * range...
'x' must be numeric'
This is the file I am importing, what looks like just a simple list of numbers, so I don't understand how it cannot be numeric?
https://gyazo.com/8fd00ce45c1c033f9dc9bf6c829195eb
Any advice on this would be appreciated!
Peter
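A likely cause, offered as a guess since the file itself isn't shown: read.csv returns a data frame, not a numeric vector, and cut only accepts a numeric vector, hence "'x' must be numeric". Pulling the single column out first should get around that. The sketch below uses a made-up vector called values in place of the imported data, and avoids the names list and range because both shadow built-in functions.

```r
# Hypothetical values standing in for the imported numbers.
values <- c(0.1, 0.2, 0.3, 0.6, 1.1, 1.6, 2.6, 3.6, 4.6, 5.0)

# With a real file, pull one column out of the data frame first, e.g.
# values <- read.csv("results/solarSystem.data", header = FALSE)[[1]]
# (header = FALSE is an assumption about the file layout).

# 11 equally spaced break points give 10 bins from min to max.
breaks <- seq(min(values), max(values), length.out = 11)

# include.lowest = TRUE keeps the minimum value in the first bin.
a1 <- as.data.frame(table(cut(values, breaks = breaks, include.lowest = TRUE)))
colnames(a1) <- c("range", "freq")
a1
```

From there, write.csv(a1, "some_file.csv", row.names = FALSE) would export the table; the file name is a placeholder.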

Getting a fixed number of zeroes randomly in a matrix in R

I have a matrix of size 2000x50. For the 50 places in each of the 2000 rows, I want exactly 6 of them to be 1 and the remaining 44 to be 0. This distribution needs to be random across each row. I have tried using the sample, rbinom functions but none of them seem to be helping. It is also possible that I might not be using them correctly. All thoughts and inputs regarding this will be appreciated.
Thank you.
Edit- Initially I wanted those 6 numbers to be one but now I want them to be sampled randomly from a gamma distribution with both shape and scale= 4. How do I make changes to the suggestions below to incorporate this? I am very new to R and these basic things seem to be troubling me. Thanks once again.
This will create the object you requested (rbind stacks each sampled vector as one row, giving a 2000 x 50 matrix):
do.call("rbind", lapply(1:2000, function(x) sample(c(rep(1, 6), rep(0, 44)))))
Based on the approach by @Gki in the comments, you can generate a matrix via replicate + t, i.e.,
m <- t(replicate(2000,sample(rep(0:1, c(44,6)))))
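For the edit about gamma values, one way to adapt the replicate approach is to pick 6 random positions per row and fill them with draws from rgamma. This is only a sketch: the names n_rows, n_cols and n_nonzero are illustrative, and set.seed is there just to make the run reproducible.

```r
set.seed(42)  # only so the sketch gives the same matrix each run

n_rows <- 2000
n_cols <- 50
n_nonzero <- 6

# Build one row at a time: start with zeros, then overwrite 6 randomly
# chosen positions with gamma draws (shape = 4, scale = 4).
m <- t(replicate(n_rows, {
  row <- numeric(n_cols)
  row[sample(n_cols, n_nonzero)] <- rgamma(n_nonzero, shape = 4, scale = 4)
  row
}))

dim(m)
```

replicate returns one column per repetition, so the t() is what turns the result into 2000 rows by 50 columns.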

How to manage factors with mixed data types

I'm afraid this question has two sub parts. My project is to determine which insurance carrier has the lowest cost based on CPT Codes. Since there are so many CPT Codes I wanted to group them using cut like this:
uCPTCode<- unique(data$CPTCode)
uCPTCode <- cut(uCPTCode,
breaks = c(-Inf, "01999", "69979", "79999", "89398", "99091", "99499", Inf),
labels = c("NA","Anesthesia", "Surgery", "Radiology", "Pathology&Laboratory", "Medicine","Evaluation&Management", "Temp"),
right = FALSE)
Not sure unique is required or wise, but seemed to make sense to me. The issue is that some codes have leading zeros and terminating letters like this
2608 Levels: 0014F 0159T 0164T 0191T 0195T 0232T 0319T 0326T 0513F 0517F 0518F
So question 1 is: what is the process to convert these ranges into integers corresponding to the labels I have in the cut function, so I can graph the grouped results on the x axis?
Question 2 is that I expected the ranges to be continuous, but they are not. How do I manage what happens around code 99000 through 99216, where previous groups (Medicine, Anesthesiology and Evaluation and Management) get combined? Here is a link to the CPT grouper file https://www.dropbox.com/s/wm55n17pufoacww/CPTGrouper.xlsx?dl=0
Here is a smattering of results to see where I am going with it
https://www.dropbox.com/s/h6sdnvm9yew6jdg/SampleStudyResults.xlsx?dl=0
Thanks very much for your time and attention
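On the mechanics only (not the actual CPT groupings): cut needs a numeric x and numeric breaks, and it needs exactly one fewer label than break points. The call above passes quoted breaks and 8 labels for 8 breaks, so it would fail on both counts. Below is a minimal sketch with made-up codes, breaks and labels; codes containing a letter are left as NA rather than forced into a range, which is an assumption about how they should be treated.

```r
# Hypothetical codes; the real ones would come from data$CPTCode.
codes <- factor(c("0014F", "0159T", "01999", "27447", "71020", "99213"))

# Convert factor -> character -> integer, but only for all-digit codes.
code_chr <- as.character(codes)
is_digits <- grepl("^[0-9]+$", code_chr)
code_num <- rep(NA_integer_, length(code_chr))
code_num[is_digits] <- as.integer(code_chr[is_digits])

# Illustrative breaks and labels only: 4 break points, 3 labels.
grp <- cut(code_num,
           breaks = c(-Inf, 2000, 70000, Inf),
           labels = c("low", "mid", "high"),
           right = FALSE)

# Integer codes matching the label order, usable as x-axis positions.
grp_int <- as.integer(grp)
data.frame(code = code_chr, group = grp, group_int = grp_int)
```

Because ranges that are not contiguous cannot be expressed with a single cut call, a lookup table built from the grouper file and joined with merge may be the more natural fit for question 2.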

Attributing row name of irregular number of rows (populations)

The GENELAND tutorial gives me this line for assigning population names to a dataset of populations of 60 individuals each:
pop.mbrship1<-rep(c(1,2,3), each=60)
However, my dataset comprises 10 populations of irregular sizes, which I would like to name 1,2,3,4,5,6,7,8,9,10. The distribution of my individuals (one row each) would be:
1:24,25:39,40:58,59:79,80:103,104:126,127:147,148:171,172:191,192:214
I'd be tempted to use each population number as number of repeats which would make it
pop.mbrship1<-rep[c(1,2,3,4,5,6,7,8,9,10), each=c(24,15,19,21,24,23,21,24,20,23)]
Or try their distribution...
pop.mbrship1<-rep[c(1,2,3,4,5,6,7,8,9,10),
c(1:24,25:39,40:58,59:79,80:103,104:126,127:147,148:171,172:191,192:214)]
In both cases, R gives me Error: unexpected '>' in ">"
I'm sure I'm really close to having it work, but I've spent a shameful amount of time on this and I definitely need a hand. Thanks a lot!
I'm looking at the GENELAND tutorial and I see that they have > at the beginning of the lines that you're copying/editing.
You are copying everything, including the console prompt >. All you need to copy/paste is:
# replicates each element 60 times
pop.mbrship1 <- rep(c(1,2,3),each=60)
# replicates each element, respectively
pop.mbrship2 <- rep(c(1,2,3),times=c(60,40,30))
Your answer is what Henrik said above, without a preceding >. Note that rep is a function, so it is called with round brackets, not square ones.
pop.mbrship1 <- rep(c(1,2,3,4,5,6,7,8,9,10), c(24,15,19,21,24,23,21,24,20,23))
# same as
pop.mbrship1 <- rep(c(...),times=c(...))
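As a quick sanity check, the group sizes can also be derived from the last row of each population rather than typed by hand. A small sketch, where ends and sizes are just illustrative names:

```r
# Last row index of each population, taken from the ranges in the question.
ends <- c(24, 39, 58, 79, 103, 126, 147, 171, 191, 214)

# Size of each population = difference between consecutive end rows.
sizes <- diff(c(0, ends))

# One label per individual: population i is repeated sizes[i] times.
pop.mbrship1 <- rep(seq_along(sizes), times = sizes)

length(pop.mbrship1)  # 214, one entry per row
table(pop.mbrship1)   # counts per population
```

The each argument only takes a single number, which is why the per-population sizes have to go in times.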

R Accumulate equity data - add time and price

I have some data formatted as below. I have done some analysis on this and would like to be able to plot the price development in the same graph as the analyzed data.
This requires me to have the same x-axes for the data.
So I would like to aggregate the "shares" column in increments of, say, 150, and add the "finalprice" and "time" to this.
The aggregation should include the latest time and price, so if the aggregation needs to occur over two or more rows of data then the last row should provide the price and time data.
My question is how to create a new vector with 150 shares per row.
The length of the vector will equal sum(shares)/150.
Is there an easy way to do this? Thanks in advance.
Edit:
I thought about expanding the observations using rep(finalprice, shares) and then getting each 150th value of the expanded vector.
Data sample:
"date","ord","shares","finalprice","time","stock"
20120702,E,2000,99.35,540.84753333,500
20120702,E,28000,99.35,540.84753333,500
20120702,E,50,99.5,542.03073333,500
20120702,E,13874,99.5,542.29411667,500
20120702,E,292,99.5,542.30191667,500
20120702,E,784,99.5,542.30193333,500
20120702,E,13300,99.35,543.04805,500
20120702,E,16658,99.35,543.04805,500
20120702,E,42,99.5,543.04805,500
20120702,E,400,99.4,546.17173333,500
20120702,E,100,99.4,547.07,500
20120702,E,2219,99.3,549.47988333,500
20120702,E,781,99.3,549.5238,500
20120702,E,50,99.3,553.4052,500
20120702,E,1500,99.35,559.86275,500
20120702,E,103,99.5,567.56726667,500
20120702,E,1105,99.7,573.93326667,500
20120702,E,4100,99.5,582.2657,500
20120702,E,900,99.5,582.2657,500
20120702,E,1024,99.45,582.43891667,500
20120702,E,8214,99.45,582.43891667,500
20120702,E,10762,99.45,582.43895,500
20120702,E,1250,99.6,586.86446667,500
20120702,E,5000,99.45,594.39061667,500
20120702,E,20000,99.45,594.39061667,500
20120702,E,15000,99.45,594.39061667,500
20120702,E,4000,99.45,601.34491667,500
20120702,E,8700,99.45,603.53608333,500
20120702,E,3290,99.6,609.23213333,500
I think I got it solved.
expand <- rep(finalprice, shares)
Increment <- expand[seq(from = 1, to = length(expand), by = 150)]
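A variation on the same idea that also carries the time along: expand row numbers instead of prices, then look up both columns for the last share in each block of 150. This is a sketch under a couple of assumptions. The data frame d holds just three rows copied from the sample above, and the sequence starts at 150 rather than 1 so that each block reports the price and time of its final share, as the question asks.

```r
# Three rows copied from the data sample (shares, finalprice, time).
d <- data.frame(
  shares     = c(42, 400, 100),
  finalprice = c(99.5, 99.4, 99.4),
  time       = c(543.04805, 546.17173333, 547.07)
)

block <- 150

# One entry per share, holding the row that share came from.
row_id <- rep(seq_len(nrow(d)), times = d$shares)

# Position of the last share in each complete block of 150.
block_ends <- seq(from = block, to = length(row_id), by = block)

# Price and time of the row that supplied each block's last share.
buckets <- d[row_id[block_ends], c("time", "finalprice")]
rownames(buckets) <- NULL
buckets
```

Any shares left over after the last complete block (here 542 - 450 = 92) are dropped, which may or may not be what is wanted.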
