How to make nested ifelse loop dynamic - r

I have a data frame abc as below:
Sr|VALUE
a |85
b |120
c |145
d |225
e |100
f |325
g |410
I am using the code below to create a count for each record such that it is 0 for VALUE < 100, 1 for VALUE in [100, 200), and 2 for VALUE >= 200:
Stepdif<-100
abc = within(abc, {
Count = ifelse(abc$VALUE>=Stepdif & abc$VALUE<2*Stepdif,1,ifelse(abc$VALUE>=2*Stepdif ,2,0))
})
to give result as
Sr|VALUE|Count
a |85 |0
b |120 |1
c |145 |1
d |225 |2
e |100 |1
f |325 |2
g |410 |2
Now I want code that assigns a count for each step of 100. I don't want to write code like this:
Count = ifelse(abc$VALUE>=Stepdif & abc$VALUE<2*Stepdif,1,ifelse(abc$VALUE>=2*Stepdif & abc$VALUE<3*Stepdif,2,ifelse(abc$VALUE>=3*Stepdif & abc$VALUE<4*Stepdif,3,ifelse(abc$VALUE>=4*Stepdif ,4,0))))
Rather, I want to make it dynamic so that if I change the number of steps from 4 to 6, I don't have to rewrite the code.
Expected result:
Sr|VALUE|Count
a |85 |0
b |120 |1
c |145 |1
d |225 |2
e |100 |1
f |325 |3
g |410 |4

Hope this helps:
funfun <- function(x, n) { breaks <- (1:n) * 100; findInterval(x, breaks) }
funfun(abc$VALUE, 2)
[1] 0 1 1 2 1 2 2
funfun(abc$VALUE, 4)
[1] 0 1 1 2 1 3 4
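The answer above hard-codes the step of 100. A minimal sketch that also takes the step as an argument might look like the following; the function name step_count and its parameters step and n_steps are illustrative names, not part of the original answer.

```r
# Sample data from the question.
abc <- data.frame(Sr = letters[1:7],
                  VALUE = c(85, 120, 145, 225, 100, 325, 410))

# Hypothetical helper: bin i starts at i * step, and everything at or above
# n_steps * step falls into the last bin.
step_count <- function(x, step, n_steps) {
  findInterval(x, step * seq_len(n_steps))
}

Stepdif <- 100
abc$Count <- step_count(abc$VALUE, Stepdif, 4)

# For non-negative values, capped integer division gives the same bins.
abc$Count2 <- pmin(abc$VALUE %/% Stepdif, 4)
```

Changing the number of steps from 4 to 6 then only means changing the last argument.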

Related

Find substrings that have a correlation greater than k in R

I have a big problem in R :(. We have a data frame named "hcmut" that shows the students' answers in a half-term test, like this:
hcmut
Code | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
2011 | B | D | A | A | C | B | A | B | C | C
2012 | A | D | AC | B | D | B | A | B | C | C
2013 | A | D | A | A | C | D | C | B | D | D
2014 | A | B | A | C | BC | D | D | D | D | D
Question 1: find the substrings that have a correlation greater than k.
I think k is in the range 0 to 1.
Question 2: find the substring that has the greatest correlation and show it (like "ABCD"...).
Could you help me with this problem in R?
:( :(

How to compare comma separated string in one column with the comma separated strings other dataframe

I have two df as below
df1:
M1 |
-------+
a,b,c |
a |
b,c |
c,b,a |
b,a,d |
d,a,b,c|
a,d,c |
b |
c,d |
d,a |
df2:
X1 |X2
--------+---
a |1
b |2
c |3
d |4
a,b |5
a,c |6
a,d |7
b,c |8
b,d |9
c,d |10
a,b,c |11
a,c,d |12
a,b,d |13
b,c,d |14
a,b,c,d |15
Can someone help me match the values in df1$M1 against df2$X1 and put the corresponding X2 value in a new column M2, as below?
df1:
M1 |M2
--------+---
a,b,c |11
a |1
b,c |8
c,b,a |11
b,a,d |13
d,a,b,c |15
a,d,c |12
b |2
c,d |10
d,a |7
Can someone help me
M1 and X1 have to be stored as character. You can check with str(df1) and, if necessary, convert with df1$M1 <- as.character(df1$M1); do the same for df2$X1.
Then, create new columns with the values in alphabetical order:
df1$Ordered <- sapply(lapply(strsplit(df1$M1, ","), sort),paste,collapse=",")
df2$Ordered <- sapply(lapply(strsplit(df2$X1, ","), sort),paste,collapse=",")
Then perform a join like so:
merge(df1, df2, by="Ordered")
If you want to include all the values in df1 regardless of whether they have a matching value in df2, add the all.x = TRUE argument. Same logic applies adding all = TRUE (include everything from both data frames), or all.y = TRUE for df2.
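Note that merge() sorts the result by the join key, so the original row order of df1 is not preserved. As a rough sketch of an alternative, the same sorted key can be looked up with match(), which keeps df1 in its original order. The helper name sort_items is made up for this example, and the data is retyped from the question without the padding spaces.

```r
df1 <- data.frame(
  M1 = c("a,b,c", "a", "b,c", "c,b,a", "b,a,d", "d,a,b,c", "a,d,c", "b", "c,d", "d,a"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  X1 = c("a", "b", "c", "d", "a,b", "a,c", "a,d", "b,c", "b,d", "c,d",
         "a,b,c", "a,c,d", "a,b,d", "b,c,d", "a,b,c,d"),
  X2 = 1:15,
  stringsAsFactors = FALSE
)

# Hypothetical helper: split each string on commas, sort the pieces,
# and paste them back together so that "c,b,a" and "a,b,c" compare equal.
sort_items <- function(x) {
  vapply(strsplit(x, ","), function(p) paste(sort(p), collapse = ","), character(1))
}

# Look up each normalized M1 in the normalized X1; unmatched rows would get NA.
df1$M2 <- df2$X2[match(sort_items(df1$M1), sort_items(df2$X1))]
```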

R: How to count rows with same factor levels and a numeric in a range

I've got data looking like this:
A | B | C
--------------
f | 1 | 1420h
f | 1 | 1540h
f | 3 | 600h
g | 2 | 900h
g | 2 | 930h
h | 1 | 700h
h | 3 | 400h
Now I want to create a new column which counts other rows in the data frame that meet certain conditions.
In this case I would like to know, for each row, how often the same combination of A and B occurred within a range of 100 around C.
So the result with this data would be:
A | B | C | D
------------------
f | 1 | 1420 | 0
f | 1 | 1540 | 0
f | 3 | 1321 | 0
g | 2 | 900 | 1
g | 2 | 930 | 1
h | 1 | 700 | 0
h | 3 | 400 | 0
I actually came to a solution using nested for loops, but R takes far too long to compute the results.
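for (i in seq_len(nrow(df))) {
  # count the other rows p that share A and B with row i and whose C lies within 100 of row i's C
  df[i, "D"] <- sum(sapply(seq_len(nrow(df)), function(p) {
    p != i &&
      df[p, "A"] == df[i, "A"] &&
      df[p, "B"] == df[i, "B"] &&
      df[i, "C"] + 100 > df[p, "C"] &&
      df[p, "C"] > df[i, "C"] - 100 })) }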
Is there a better way?
Thanks a lot!
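One possible way to avoid the nested loops is to split C by the A/B combination and compare values only within each group. This is a sketch under some assumptions: C is stored as a plain number (without the "h" suffix shown in the first table), the data is retyped from that table, and count_near is a made-up helper name.

```r
df <- data.frame(
  A = c("f", "f", "f", "g", "g", "h", "h"),
  B = c(1, 1, 3, 2, 2, 1, 3),
  C = c(1420, 1540, 600, 900, 930, 700, 400),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each element of x, count the OTHER elements
# that lie strictly within `width` of it (the -1 removes the element itself).
count_near <- function(x, width = 100) {
  rowSums(abs(outer(x, x, "-")) < width) - 1
}

# One key per A/B combination; split C by it, count within each group,
# and put the counts back in the original row order.
key <- paste(df$A, df$B, sep = "_")
df$D <- unsplit(lapply(split(df$C, key), count_near), key)
```

The pairwise comparison is still quadratic within each group, so this helps most when individual A/B groups are small compared with the whole data frame.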

R data frame - Include NAs in aggregation [duplicate]

This question already has an answer here:
Aggregate with na.action=na.pass gives unexpected answer
(1 answer)
Closed 6 years ago.
With a data frame df1 like below
+-----------------------------------------+
|reg |make |model |year|abs |gears|fm|
+-----------------------------------------+
|ax1234|Toyota|Corolla|1999|true |6 |0 |
|ax1235|Toyota|Corolla|1999|false|5 |0 |
|ax1236|Toyota|Corolla|1992|false|4 |NA|
|ax1237|Toyota|Camry |2001|true |7 |1 |
|ax1238|Honda |Civic |1994|true |5 |NA|
|ax1239|Honda |Civic |2000|false|6 |0 |
|ax1240|Honda |Accord |1992|false|4 |NA|
|ax1241|Nissan|Sunny |2001|true |6 |0 |
|ax1242| | |1998|false|6 |0 |
|ax1243|NA |NA |1992|false|4 |NA|
+-----------------------------------------+
When aggregating as below, I want to preserve the makes that are NA. How can I achieve this? It is fine if the empty make and NA are combined together.
> aggregate(reg ~ make, df1, length)
    make reg
1          1
2  Honda   3
3 Nissan   1
4 Toyota   4
We can use dplyr, which counts the NA group as well:
library(dplyr)
df1 %>%
group_by(make) %>%
summarise(reg = n())
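If adding a package is not an option, a base R sketch along the same lines is to count with table() and ask it to keep NA as its own category. Only the reg and make columns are retyped here from the question's table; the other columns do not matter for the count.

```r
df1 <- data.frame(
  reg  = paste0("ax", 1234:1243),
  make = c("Toyota", "Toyota", "Toyota", "Toyota", "Honda",
           "Honda", "Honda", "Nissan", "", NA),
  stringsAsFactors = FALSE
)

# useNA = "ifany" adds a separate NA category when NA values are present.
counts <- as.data.frame(
  table(make = df1$make, useNA = "ifany"),
  responseName = "reg",
  stringsAsFactors = FALSE
)
```

Here the empty string and NA remain separate groups; to combine them, one option is to set the empty makes to NA before counting.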

R: two data frame merge with 2 variables and several other conditions

I am a beginner in R. Here is an example of a data table (C) that I created using JMP. I joined tables A and B on the A1 and B;C columns to create C. In data table B, the rows whose OP column contains CLO are dropped during the join, while column J from data table A is updated during the join.
I am trying to create the dataframe C using the merge command in R. I used the following expression:
C <- merge(B,A, BY=c("A1","B;C"),all.x = TRUE) but I don't seem to get the Data frame C. I would appreciate any help from the community to solve this.
Data Frame A
A1 | B;C | D |E |F |G | H | I |J |K |L | M |
------|------|---|--|---|---|---|------------|---|----|----|---|
ABCD |SD;TH |HO |2 |FA | |ENG| 201808:SPR |54 |PRO |VAC |MAA|
JCBW |RF;TH |HO |2 |FU |VIN|FUT| 504278:SPR |4 |PRO |VAC |MAA|
TVGH |ED;UJ |HO |2 |FU |VIN|FUT| 504276:SPR |4 |PRO |VAC |MAA|
IGHE |WR;RE |HO |3 |IN | |SPE| 504278:SPR |73 |PRO |VAC |MAA|
UUUU |DF;TH |HO |3 |FU | |FUT| 357193:IT |13 |INT |VAC |MAA|
JFLD |YO;TH |HO |3 |CH |BRI|CHE| 476306:SPR |6 |PRO |VAC |MAA|
Data frame B
OWN|COM|OP |GR |J | A1 | B;C | D|E |F |G |H | I |K |L |M
---|---|---|---|--|-----|-----|--|--|--|---|---|-----------|---|---|----
SUP|X |CLO|ARE|16|59HUW|BB;TH|HO|8 |FA|MIC|SPE|90278:SPR |INT|VAC|MAA
SUP|X |OPE|ARE|75|ABCD |SD;TH|HO|8 |FU|MIC|ENG|201808:SPR |INT|VAC|MAA
SUP|X |CLO|ARE|4 |59HVG|BB;RE|HO|8 |FA|MIC|SPE|6074278:SPR|INT|VAC|MAA
PAD|X |CLO|PEN|30|9RHSG|BV;TH|HO|2 |FA| |SPE|201808:SPR |PRO|VAC|MAA
PAD|X |OPE|PEN|99|UUUU |DF;TH|HO|8 |FU|MIC|FUT|357193:IT |PRO|VAC|MAA
PAD|X |OPE|PEN|65|IGHE |WR;RE|HO|8 |IN| |SPE|504278:SPR |PRO|VAC|MAA
PAD|X |CLO|PEN|13|S9K7E|FN;TH|HO|8 |FA|MIC|FUT|394290:SPR |PRO|VAC|MAA
Data frame C
OWN|COM|OP |GR |J |A1 | B;C |D |E |F | G |H | I | K |L |M
---|---|---|---|---|----|-----|--|--|--|---|---|----------|---|---|----
SUP|x |OPE|ARE|99 |ABCD|SD;TH|HO|8 |FU|MIC|ENG|201808:SPR|INT|VAC|MAA
PAD|x |OPE|PEN|120|UUUU|DF;TH|HO|8 |FU|MIC|FUT|357193:IT |PRO|VAC|MAA
PAD|x |OPE|PEN|73 |IGHE|WR;RE|HO|8 |IN| |SPE|504278:SPR|PRO|VAC|MAA
| | | |4 |JCBW|RF;TH|HO|2 |FU|VIN|FUT|504278:SPR|PRO|VAC|MAA
| | | |25 |TVGH|ED;UJ|HO|2 |FU|VIN|FUT|504276:SPR|PRO|VAC|MAA
| | | |15 |JFLD|YO;TH|HO|3 |CH|BRI|CHE|476306:SPR|PRO|VAC|MAA
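One detail worth checking in the merge call above: R argument names are case-sensitive, and the key argument of merge() is spelled by in lower case. The sketch below uses a cut-down, retyped subset of A and B (only the A1, B;C, OP and J columns), so it illustrates the call rather than reproducing C. How the two J columns should be combined is not clear from the tables, so they are simply kept side by side under the suffixes .B and .A.

```r
# check.names = FALSE keeps the non-syntactic column name "B;C" as it is.
A <- data.frame(
  A1 = c("ABCD", "JCBW", "UUUU"),
  `B;C` = c("SD;TH", "RF;TH", "DF;TH"),
  J = c(54, 4, 13),
  check.names = FALSE, stringsAsFactors = FALSE
)
B <- data.frame(
  OP = c("CLO", "OPE", "OPE"),
  J = c(16, 75, 99),
  A1 = c("59HUW", "ABCD", "UUUU"),
  `B;C` = c("BB;TH", "SD;TH", "DF;TH"),
  check.names = FALSE, stringsAsFactors = FALSE
)

# Drop the CLO rows of B first, then keep every row of A (all.y = TRUE).
B_open <- B[B$OP != "CLO", ]
C <- merge(B_open, A, by = c("A1", "B;C"), all.y = TRUE,
           suffixes = c(".B", ".A"))
```

Rows of A without a partner in B (JCBW here) come back with NA in the columns that exist only in B.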

Resources