else {} statement ignored in a for loop [closed] - r

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 4 years ago.
I have a tibble, Agencies, with two columns as follows:
> head(Agencies, 10)
# A tibble: 10 x 2
   AgencyNumber State
          <int> <chr>
 1            1 AR
 2            2 Arkansas
 3            3 Texas
 4            4 Texas
 5            5 TX
 6            6 IL
 7            7 Illinois
 8            8 Illinois
 9            9 IL
10           10 IL
I'm trying to add a column (Agencies$STATE) with the full state name. If Agencies$State is an abbreviation, it should use the abbr2state function to save the full name to the new column. If Agencies$State already has the full name, it should store the value of Agencies$State to the new column.
I'm using the following code:
Agencies$STATE <- "NA"
for(i in 1:nrow(Agencies)) {
  if(nchar(Agencies$State[i] == 2)) {
    Agencies$STATE[i] <- abbr2state(Agencies$State[i])
  }
  else {
    Agencies$STATE[i] <- Agencies$State[i]
  }
}
The output is unexpected. It appears to evaluate the first if statement as expected, but ignores the else statement.
> head(Agencies, 10)
# A tibble: 10 x 3
   AgencyNumber State    STATE
          <int> <chr>    <chr>
 1            1 AR       Arkansas
 2            2 Arkansas <NA>
 3            3 Texas    <NA>
 4            4 Texas    <NA>
 5            5 TX       Texas
 6            6 IL       Illinois
 7            7 Illinois <NA>
 8            8 Illinois <NA>
 9            9 IL       Illinois
10           10 IL       Illinois
I'm a bit new to R so this may be an obvious error, but I'm missing it.
Any suggestions on why this isn't doing what I expect?
Thanks,
Jeff

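Your condition nchar(Agencies$State[i] == 2)
should be nchar(Agencies$State[i]) == 2
You misplaced the closing parenthesis. As written, the comparison Agencies$State[i] == 2 runs first and returns FALSE, nchar(FALSE) is 5, and if() treats any nonzero number as TRUE. The if branch therefore runs for every row, which is why the rows that already hold a full name end up as <NA> and the else branch is never reached.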
You can also use dplyr to avoid the loop:
library(dplyr)
Agencies %>%
  mutate(state = ifelse(stringi::stri_length(State) == 2, abbr2state(State), State))
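The effect of the misplaced parenthesis can be checked on a small vector. This is only a sketch in base R: `abbr_to_name` is a hypothetical stand-in for `abbr2state()` (built on the `state.abb` and `state.name` datasets that ship with R), and the sample values are taken from the question.

```r
# Hypothetical stand-in for abbr2state(), built on base R's state.abb and state.name
abbr_to_name <- function(abbr) state.name[match(abbr, state.abb)]

State <- c("AR", "Arkansas", "TX", "Illinois")

# Misplaced parenthesis: the comparison runs first, then nchar() counts
# the characters of the logical result ("FALSE" has 5 characters)
wrong <- nchar(State == 2)

# Intended test: count the characters first, then compare with 2
right <- nchar(State) == 2

# Vectorised replacement for the loop: translate abbreviations, keep full names
STATE <- ifelse(right, abbr_to_name(State), State)
print(STATE)
```

Because `wrong` is a nonzero number for every element, an `if()` on it always takes the first branch, which matches the behaviour described in the question.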

Related

Calculating mean with NA value present in a data.frame using R [closed]

Closed 1 year ago.
I have a data.frame and would like to average a column that contains NA values.
When performing the calculation, I noticed that R does not compute the average and returns NA instead.
Note: I cannot remove the rows with NA, because that would also remove values in other columns that interest me.
library(dplyr)

df1 <- read.table(text = "st date ph
1 01/02/2004 5
16 01/02/2004 6
2 01/02/2004 8
2 01/02/2004 8
2 01/02/2004 8
16 01/02/2004 6
1 01/02/2004 NA
1 01/02/2004 5
16 01/02/2004 NA
", sep = "", header = TRUE)

df2 <- df1 %>%
  group_by(st, date) %>%
  summarise(ph = mean(ph))
View(df2)
out
my expectation was this result:
You need to use na.rm = TRUE:
df2 <- df1 %>%
  group_by(st, date) %>%
  summarise(ph = mean(ph, na.rm = TRUE))
df2
# A tibble: 3 x 3
# Groups:   st [3]
     st date          ph
  <int> <chr>      <dbl>
1     1 01/02/2004     5
2     2 01/02/2004     8
3    16 01/02/2004     6
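The same behaviour can be seen without dplyr. The sketch below uses base R only, with the `st` and `ph` values from the question typed in as plain vectors; `tapply()` is used here as a base R alternative to `group_by()` plus `summarise()`, not as the method from the answer.

```r
# Values copied from the question's data
st <- c(1, 16, 2, 2, 2, 16, 1, 1, 16)
ph <- c(5, 6, 8, 8, 8, 6, NA, 5, NA)

# A single NA makes the plain mean NA
overall_plain <- mean(ph)

# na.rm = TRUE drops the NA values before averaging
overall_clean <- mean(ph, na.rm = TRUE)

# Group means per st, passing na.rm = TRUE through to mean()
by_group <- tapply(ph, st, mean, na.rm = TRUE)
print(by_group)
```

Only the NA entries of `ph` are skipped; the rows themselves stay in the data, so the other columns are not affected.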

Trying to Extract States and Counties using map_data Function [closed]

Closed 4 years ago.
I am trying to run the following:
library(ggplot2)
library(RColorBrewer)
state_df <- map_data('state')
county_df <- map_data('county')
transform_mapdata <- function(x){
  names(x)[5:6] <- c('state','county')
  for(u in c('state','county'){
    x[,u] <- sapply(x[,u],simpleCap)
  }
  return(x)
}
state_df <- transform_mapdata(state_df)
county_df <- transform_mapdata(county_df)
I keep getting this message:
Error in x[, u] : incorrect number of dimensions
> }
Error: unexpected '}' in " }"
The data seems OK, so I guess the problem has something to do with the transformation.
> head(state_df)
       long      lat group order  region subregion
1 -87.46201 30.38968     1     1 alabama      <NA>
2 -87.48493 30.37249     1     2 alabama      <NA>
3 -87.52503 30.37249     1     3 alabama      <NA>
4 -87.53076 30.33239     1     4 alabama      <NA>
5 -87.57087 30.32665     1     5 alabama      <NA>
6 -87.58806 30.32665     1     6 alabama      <NA>
> head(county_df)
       long      lat group order  region subregion
1 -86.50517 32.34920     1     1 alabama   autauga
2 -86.53382 32.35493     1     2 alabama   autauga
3 -86.54527 32.36639     1     3 alabama   autauga
4 -86.55673 32.37785     1     4 alabama   autauga
5 -86.57966 32.38357     1     5 alabama   autauga
6 -86.59111 32.37785     1     6 alabama   autauga
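The first issue is a missing closing parenthesis in:
for(u in c('state','county'){
It should be:
for(u in c('state','county')){
Once that is fixed, this error comes up:
Error in sapply(x[, u], simpleCap) : object 'simpleCap' not found
simpleCap is not a base R function, so it has to be defined (or the package that provides it loaded) before transform_mapdata is called.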
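One way to get past the second error is to define the capitalising helper yourself. The sketch below is an assumption about what `simpleCap` was meant to do (capitalise the first letter of each word); `simple_cap` and the two-row `demo` data frame are hypothetical names made up for illustration, so the example runs without `map_data()`.

```r
# Hypothetical helper: capitalise the first letter of every word.
# gsub() is vectorised and leaves NA values as NA.
simple_cap <- function(x) {
  gsub("(^|\\s)([a-z])", "\\1\\U\\2", x, perl = TRUE)
}

transform_mapdata <- function(x) {
  names(x)[5:6] <- c("state", "county")
  for (u in c("state", "county")) {   # note the closing parenthesis after c(...)
    x[, u] <- simple_cap(x[[u]])
  }
  x
}

# Small stand-in with the same column layout as map_data() output
demo <- data.frame(
  long = c(-87.46, -86.51), lat = c(30.39, 32.35),
  group = 1:2, order = 1:2,
  region = c("alabama", "new york"),
  subregion = c(NA, "st lawrence"),
  stringsAsFactors = FALSE
)
out <- transform_mapdata(demo)
print(out)
```

Because the helper is vectorised, the `sapply()` call from the original function is no longer needed.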

Should I use for loop? OR apply? [duplicate]

This question already has answers here:
Split dataframe by levels of a factor and name dataframes by those levels
(3 answers)
Closed 5 years ago.
This is my first post.
I have this data frame of the NHL draft.
What I would like to do is use some sort of recursive function to create 10 objects, one per year, by subsetting the NHL data frame by Year.
Here are the first 6 rows of the data set (nhl_draft):
  Year Overall                  Team
1 2000       1    New York Islanders
2 2000       2     Atlanta Thrashers
3 2000       3        Minnesota Wild
4 2000       4 Columbus Blue Jackets
5 2000       5    New York Islanders
6 2000       6   Nashville Predators
            Player    PS
1    Rick DiPietro  49.3
2     Dany Heatley  95.2
3   Marian Gaborik 103.6
4 Rostislav Klesla  34.5
5     Raffi Torres  28.4
6   Scott Hartnell  74.5
I want to create 10 objects by subsetting out the years 2000 to 2009.
I tried:
for (i in 2000:2009) {
  nhl_draft.i <- subset(nhl_draft, Year == "i")
}
But this doesn't do anything. What's the problem with this for loop? Can you suggest any other ways?
Please tell me if this is confusing; after all, this is my first post.
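The loop fails for two reasons: Year == "i" compares Year with the literal string "i" rather than the value of i, and nhl_draft.i is a single fixed object name that is overwritten on every iteration. The following code may fix your error.
# Create an empty list
nhl_list <- list()
for (i in 2000:2009) {
  # Subset the data frame based on Year
  nhl_draft_temp <- subset(nhl_draft, Year == i)
  # Assign the subset to the list
  nhl_list[[as.character(i)]] <- nhl_draft_temp
}
But you can consider split, which is more concise.
nhl_list <- split(nhl_draft, f = nhl_draft$Year)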
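A quick way to see what `split()` returns is to run it on a tiny data frame. In this sketch `nhl_demo`, `draft_env`, and the `nhl_draft_` name prefix are made up for illustration; the last step, which turns the list into separately named objects with `list2env()`, is optional and only shown in case individual objects are really needed.

```r
# Tiny stand-in for nhl_draft (values are illustrative only)
nhl_demo <- data.frame(
  Year = c(2000, 2000, 2001, 2002),
  Overall = c(1, 2, 1, 1)
)

# One data frame per year, named by the year
nhl_list <- split(nhl_demo, f = nhl_demo$Year)
print(names(nhl_list))
print(nhl_list[["2000"]])

# Optional: create one object per year in a separate environment
draft_env <- new.env()
list2env(setNames(nhl_list, paste0("nhl_draft_", names(nhl_list))), envir = draft_env)
print(ls(draft_env))
```

Keeping the subsets in a list is usually easier to work with than ten separate objects, because the list can be passed to `lapply()` in one call.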

Multiple conditions in R using a specific variable [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 6 years ago.
I have a simple question. I have a big data frame like this:
Name    AGE Order
Anna     25     1
Anna     28     2
Peter    10     1
Paul     15     1
Mary     14     1
John      8     1
Charlie  24     2
Robert   20     2
For rows with Order = 1 only, I need to keep those with AGE >= 10 & AGE <= 15. Rows with any other Order should stay as they are. So the output must be:
Name    AGE Order
Anna     28     2
Peter    10     1
Paul     15     1
Mary     14     1
Charlie  24     2
Robert   20     2
Could you help me, please?
We can use the vectorized ifelse.
For Order == 1, check whether AGE lies in the range 10 to 15; keep all other rows as they are.
df[ifelse(df$Order == 1, df$AGE >= 10 & df$AGE <= 15, TRUE), ]
#     Name AGE Order
#2    Anna  28     2
#3   Peter  10     1
#4    Paul  15     1
#5    Mary  14     1
#7 Charlie  24     2
#8  Robert  20     2
We can also consolidate this to:
subset(df, AGE >= 10 & AGE <= 15 | Order != 1)
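The two forms above select the same rows, which can be checked on the sample data. This sketch rebuilds the question's data frame by hand and compares the `ifelse()` version with a plain logical condition; the variable names `by_ifelse` and `by_logic` are only for illustration.

```r
# Sample data copied from the question
df <- data.frame(
  Name  = c("Anna", "Anna", "Peter", "Paul", "Mary", "John", "Charlie", "Robert"),
  AGE   = c(25, 28, 10, 15, 14, 8, 24, 20),
  Order = c(1, 2, 1, 1, 1, 1, 2, 2),
  stringsAsFactors = FALSE
)

# ifelse(): apply the age test only where Order is 1, keep everything else
by_ifelse <- df[ifelse(df$Order == 1, df$AGE >= 10 & df$AGE <= 15, TRUE), ]

# Same rule as one logical expression: keep a row if it is not Order 1,
# or if its AGE is within 10 to 15
by_logic <- df[df$Order != 1 | (df$AGE >= 10 & df$AGE <= 15), ]

print(by_logic)
```

The explicit parentheses around the age test are not strictly required, since `&` binds more tightly than `|` in R, but they make the intent easier to read.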

dplyr object not found error [closed]

Closed 6 years ago.
I am not quite sure why this piece of code isn't working.
Here's what my data looks like:
head(test)
  Fiscal.Year Fiscal.Quarter   Seller Product.Revenue Product.Quantity Product.Family Sales.Level.1          Group Fiscal.Week
1        2015         2015Q3 ABCD1234            4000                4      Paper cup      Americas Paper Division          32
2        2014         2014Q1  DDH1234             300                5   Paper tissue  Asia Pacific Paper Division          33
3        2015         2015Q1  PNS1234             298                6         Spoons          EMEA        Cutlery          34
4        2016         2016Q4  CCC1234             289                7         Knives        Africa        Cutlery          33
Now, my objective is to summarize revenue by year.
Here's the dplyr code I wrote:
test %>%
  group_by(Fiscal.Year) %>%
  select(Seller, Product.Family, Fiscal.Year) %>%
  summarise(Rev1 = sum(Product.Revenue)) %>%
  arrange(Fiscal.Year)
This doesn't work. I get the error:
Error: object 'Product.Revenue' not found
However, when I remove the select statement, it works, but then the output no longer shows Seller and Product.Family.
test %>%
  group_by(Fiscal.Year) %>%
  # select(Seller, Product.Family, Fiscal.Year) %>%
  summarise(Rev1 = sum(Product.Revenue)) %>%
  arrange(Fiscal.Year)
The output is :
# A tibble: 3 x 2
  Fiscal.Year  Rev1
        <dbl> <dbl>
1        2014   300
2        2015  4298
3        2016   289
This works well.
Any idea what's going on? It's been about 3 weeks since I started programming in R. So, I'd appreciate your thoughts. I am following this guide: https://cran.rstudio.com/web/packages/dplyr/vignettes/introduction.html
Also, I looked at similar threads on SO, but I believe they related to issues caused by the "+" sign: Error in dplyr group_by function, object not found
I am looking for the following output:
  Fiscal.Year  Rev1 Product Family Seller
        <dbl> <dbl> ...            ...
1        2014 ...
2        2015 ...
3        2016 ...
Many thanks
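OK, this did the trick. The select() call kept only Seller, Product.Family, and Fiscal.Year, so Product.Revenue no longer existed when summarise() ran. Adding Seller and Product.Family to group_by() instead keeps them in the output: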
test %>%
  group_by(Fiscal.Year, Seller, Product.Family) %>%
  summarise(Rev1 = sum(Product.Revenue)) %>%
  arrange(Fiscal.Year)
Output:
Source: local data frame [4 x 4]
Groups: Fiscal.Year, Seller [4]
  Fiscal.Year   Seller Product.Family  Rev1
        <dbl>    <chr>          <chr> <dbl>
1        2014  DDH1234   Paper tissue   300
2        2015 ABCD1234      Paper cup  4000
3        2015  PNS1234         Spoons   298
4        2016  CCC1234         Knives   289
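For comparison, the same idea can be sketched in base R without dplyr. The data frame below is retyped from the question (only the columns used here), and `aggregate()` is offered as an alternative technique, not as the approach from the thread. The first step shows why the original pipeline failed: once a column is dropped, it cannot be summed.

```r
# Columns retyped from the question's sample data
test <- data.frame(
  Fiscal.Year     = c(2015, 2014, 2015, 2016),
  Seller          = c("ABCD1234", "DDH1234", "PNS1234", "CCC1234"),
  Product.Revenue = c(4000, 300, 298, 289),
  Product.Family  = c("Paper cup", "Paper tissue", "Spoons", "Knives"),
  stringsAsFactors = FALSE
)

# Selecting three columns drops Product.Revenue, like select() in the question
selected <- test[, c("Seller", "Product.Family", "Fiscal.Year")]
has_revenue <- "Product.Revenue" %in% names(selected)

# Revenue per year, seller, and product family
rev_by_group <- aggregate(Product.Revenue ~ Fiscal.Year + Seller + Product.Family,
                          data = test, FUN = sum)
rev_by_group <- rev_by_group[order(rev_by_group$Fiscal.Year), ]

# Revenue per year only
rev_by_year <- aggregate(Product.Revenue ~ Fiscal.Year, data = test, FUN = sum)
print(rev_by_group)
print(rev_by_year)
```

Grouping by more columns gives one row per combination, so the yearly totals only appear when Fiscal.Year is the sole grouping variable.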
