Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 7 years ago.
I have this spreadsheet with all the movies I've seen during 2015.
I want to find an easy way to count the actors' names (in column G) and print the most common ones, showing both the name and the number of times it occurs.
Can I do that?
Here is the spreadsheet:
https://docs.google.com/spreadsheets/d/1jO6OTLRYW3-id6SynkNBeFSpYshZ_lTUWam3mS8CaKw/edit?usp=sharing
Select your column data ► Data ► Pivot Table...
Then in the "Rows - Add Field" add your column name.
Finally, in the "Values - Add Field" add the same column and choose COUNTA.
This might be your desired set-up:
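The same tally can also be sketched outside the spreadsheet, in R. This is only an illustration: the actors vector is made-up sample data standing in for column G, and it assumes one name per cell.

```r
# Hypothetical sample standing in for column G (one actor name per cell)
actors <- c("Tom Hanks", "Meryl Streep", "Tom Hanks",
            "Amy Adams", "Meryl Streep", "Tom Hanks")

# table() counts each distinct name; sort() puts the most common first
counts <- sort(table(actors), decreasing = TRUE)

# Show the two most common names with their counts
top <- head(counts, 2)
print(top)
```

Like the pivot table with COUNTA, this counts how often each distinct value appears; cells holding several names would need to be split first.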
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I'm working with a very large dataset with lots of columns (400+), and every time I create or add a new variable I have to reorder it. I want all the related variables to stay together, so I've been using dplyr::select() to reorder things. Yet there are times when I have to go back to an early point in my script and add a new variable. When I run the whole code after that, there tend to be one or two variables I forgot to put into the preceding select() calls, so they go missing.
I use select() because selecting all the columns between two variables and referencing them by name is super easy (e.g., Vfour:Vthreefifty). Do you have any tips for reordering datasets with lots of columns?
Without a reproducible example, but going by your two column names:
library(dplyr)
# Keep every column whose name starts with "V"
df %>%
  select(starts_with("V"))
You can then chain starts_with as needed.
Other options include:
ends_with, contains, matches
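Since the worry in the question is columns going missing, here is a minimal sketch of two ways to reorder without having to list every column. The data frame and its column names are hypothetical, and relocate() is a different dplyr verb from the one used above (it assumes dplyr 1.0.0 or later).

```r
library(dplyr)

# Hypothetical data frame; the column names are made up for illustration
df <- data.frame(id = 1:3, Vone = 4:6, score = 7:9, Vtwo = 10:12)

# Put the "V" columns first, then keep every remaining column after them
reordered <- df %>%
  select(starts_with("V"), everything())

# relocate() moves the named column and leaves all the others in place
moved <- df %>%
  relocate(score, .after = Vtwo)

names(reordered)
names(moved)
```

Because everything() and relocate() both carry along any column not named explicitly, a variable added earlier in the script is kept rather than silently dropped.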
Closed. This question needs debugging details. It is not currently accepting answers.
Closed 4 years ago.
I have a dataset named SPT with one column that contains four unique values: A, B, C and D. I need to create another column that contains 1 if the first column holds A or B, and 0 otherwise. How can I do this using the ifelse command?
Using ifelse:
SPT$new <- ifelse(SPT$col %in% c('A','B'),1,0)
Replace new with the name of your new variable, and col with the name of the column that stores the four letters.
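A small worked example may make this concrete. The rows of SPT below are invented for illustration, and the column names col and new are the same placeholders used above.

```r
# Hypothetical contents for SPT; the real data is not shown in the question
SPT <- data.frame(col = c("A", "C", "B", "D", "A"))

# %in% tests each entry against the set {A, B}; ifelse maps TRUE to 1, FALSE to 0
SPT$new <- ifelse(SPT$col %in% c("A", "B"), 1, 0)

print(SPT)
```

Here rows 1, 3 and 5 get a 1 and the rows holding C and D get a 0.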
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
I'm working on a project in R about baseball, using two CSV files. One file, CSV2 ("PitchingPost.csv"), holds all postseason pitching stats, and the column I'm looking at there is "teamID". I'm trying to evaluate regular season pitching stats in the other file, CSV1 ("pitching.csv"), but only for teams that made the postseason. So I'm trying to remove all rows of CSV1 except those whose "teamID" also occurs in the "teamID" column of CSV2.
Help?
To keep only the rows from your first file that share an ID with rows in your second file, you could try something like this:
pitch <- read.csv("pitching.csv")
pitch_post <- read.csv("PitchingPost.csv")
pitch <- pitch[pitch$teamID %in% unique(pitch_post$teamID),]
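The same filter can be tried without the CSV files. In this sketch the two data frames are small invented stand-ins for the files, and the ERA column and team codes are assumptions made only to have something to filter.

```r
# Hypothetical stand-ins for pitching.csv and PitchingPost.csv
pitch <- data.frame(
  teamID = c("NYA", "BOS", "CHN", "NYA", "SEA"),
  ERA    = c(3.1, 4.2, 3.8, 2.9, 4.5)
)
pitch_post <- data.frame(teamID = c("NYA", "CHN", "NYA"))

# Keep only regular-season rows whose team also appears in the postseason data
kept <- pitch[pitch$teamID %in% pitch_post$teamID, ]

print(kept)
```

Note that this matches on teamID alone. If the files also carry a season column, which the question does not say, matching on the team alone would keep every season of any team that ever reached the postseason.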
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
Without using a special function, can you do the following to replace the values in a column of an R data frame that are longer than 4 characters?
df[length(df$Column1)>4,"Column1"] = "replacement value"
This does not seem to work. Is there an alternative index style I can use, or do I need to use a function?
Thanks
The function to determine the length of an entry, such as a word in a data frame, is nchar(), not length(). The latter returns the number of entries in a vector, so length(df$Column1) > 4 yields a single TRUE or FALSE for the whole column rather than one value per row.
You could therefore try using:
df[nchar(df$Column1) > 4, "Column1"] <- "replacement value"
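To see the difference between the two functions, here is a short sketch on an invented data frame. The words in Column1 are sample data only.

```r
# Hypothetical data; stringsAsFactors = FALSE keeps Column1 as character
df <- data.frame(Column1 = c("cat", "horse", "bird", "elephant"),
                 stringsAsFactors = FALSE)

# length() counts the entries in the vector, so this is a single logical value
whole_column <- length(df$Column1) > 4

# nchar() counts characters per entry, so this is one logical value per row
per_row <- nchar(df$Column1) > 4

# Replace only the entries longer than 4 characters
df[per_row, "Column1"] <- "replacement value"

print(df)
```

Only "horse" and "elephant" are replaced; "cat" and "bird" stay as they are.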
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question appears to be off-topic because it lacks sufficient information to diagnose the problem. Describe your problem in more detail or include a minimal example in the question itself.
Closed 8 years ago.
I have a matrix
B = matrix(c(1,2,4,1,4,2,1,3.5,4.2,2,2.2,6.5,3,1.2,7.7,1,2.1,1.6,3,5.2,8.2),
nrow = 7)
with 3 columns: the color column (colors from 1-5, representing blue,red,green,yellow,black), the x column and the y column.
I want to plot every point (x,y) with the color from the color column.
First question: what software is best to do so?
Second question: how can I do it with R?
Simple question finds simple answer:
plot(B[,2], B[,3], col=c("blue","red","green","yellow","black")[B[,1]])
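The one-liner works because the color codes in the first column are used as positions in the vector of color names. A sketch that spells out this step, using the matrix from the question (the names palette_names and point_colors, and the pch and axis label arguments, are additions for illustration):

```r
# matrix() fills column by column, so with nrow = 7 the first seven
# numbers become the color column, the next seven x, the last seven y
B <- matrix(c(1,2,4,1,4,2,1,3.5,4.2,2,2.2,6.5,3,1.2,7.7,1,2.1,1.6,3,5.2,8.2),
            nrow = 7)

# Color names in the order given in the question (codes 1 to 5)
palette_names <- c("blue", "red", "green", "yellow", "black")

# Each code picks the name at that position in the vector
point_colors <- palette_names[B[, 1]]

# pch = 19 draws filled circles so the colors are easier to see
plot(B[, 2], B[, 3], col = point_colors, pch = 19, xlab = "x", ylab = "y")
```

For this matrix the codes are 1, 2, 4, 1, 4, 2, 1, so the points come out blue, red, yellow, blue, yellow, red, blue.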