Naming the number of the row in a data frame that contains a certain value - r

I've done some thorough research and I am struggling with an attempt to find a function that will name the number of the row (in my data frame the rows don't contain numbers) that contains a certain value. In this case a number.
e.g. Call the data frame = df
I don't know how to show a little image of the data frame but say that in row 5, column 4 the value was '162', is there a function I could use that will end with the return being '5' or 'row 5'?
I have used rowSums(df=="162")
which gives a long line of the rows: if a row contains the value there is a '1' under it, if not a '0'. But I need a function that simply states the row.
I couldn't figure out how to correctly use the 'which' function either.

which(df$col4=='162')
I am assuming that col4 is the name of the column number 4
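As a minimal sketch of the which() approach, assuming a small made-up data frame (the column names col1 and col4 and all the values are hypothetical), the following returns the row number for a single column, and also the row and column position when the value could be anywhere in the data frame:

```r
# Hypothetical data frame: the value 162 sits in row 5 of col4
df <- data.frame(col1 = 1:6, col4 = c(10, 20, 30, 40, 162, 60))

# Row number(s) where one specific column equals the value
rows_in_col <- which(df$col4 == 162)

# Row and column position(s) of the value anywhere in the data frame
hits <- which(df == 162, arr.ind = TRUE)

print(rows_in_col)
print(hits)
```

The arr.ind = TRUE form is useful when the column is not known in advance, since it returns a matrix with one "row" and one "col" entry per match.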

Related

How to split rows within a dataframe for a target column with multiple/nested values

With a dataframe that has, for example, one column x that has nested or multiple values for some rows, how would I, for those rows that have multiple values for x, append duplicate rows to the dataframe, except that each duplicate corresponds to one value within x?
To try to explain better, see "mock dataframe pre-transform", below. Row 1 has values "webui, cli, mobile" for column "module", and what I want is to append three near copies of row 1 to the dataframe, one with module value "webui", one with module value "cli" and one with module value "mobile". I also then want to remove the original row 1. A similar operation would occur for row 4, such that the final dataframe would have 7 rows (see "mock dataframe post-transform", below).
mock dataframe pre-transform
mock dataframe post-transform
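One possible way to do this in base R is sketched below. The mock data frames are not shown in the text, so the data here is invented: a hypothetical id column, and a module column where rows 1 and 4 hold comma-separated values. The idea is to split each module string, repeat every row once per piece, and then overwrite module with the individual pieces.

```r
# Hypothetical stand-in for the "pre-transform" data frame
df <- data.frame(
  id = 1:4,
  module = c("webui, cli, mobile", "cli", "webui", "cli, mobile"),
  stringsAsFactors = FALSE
)

# Split each module string on commas (and any following spaces)
parts <- strsplit(df$module, ",\\s*")

# Repeat each row once per value it contained, then fill in the single values
expanded <- df[rep(seq_len(nrow(df)), times = lengths(parts)), ]
expanded$module <- unlist(parts)
rownames(expanded) <- NULL

print(expanded)
```

Because the repeated rows replace the originals, there is no separate step for removing the original multi-value rows.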

How do I select specific vectors of a matrix to be plotted against each other (such as when using hexplom)?

Is there a quick way to code for those specific vectors? Like I only want to use every 4th column in my matrix then plot the selected columns. I'm very new to R and have absolutely no idea what I'm doing. I know how to select a single vector and how to select a certain number in a row but that doesn't really help.
If you're looking to extract every 4th column from a matrix you can use seq().
Here's an example. I made a dummy dataset: foo<-matrix(rep(c(4,3,2,7),25),nrow=10,ncol=10)
Then you can store the column indexes that you want from your matrix like so:
colsyouwant<-seq(from = 4, to = ncol(foo), by = 4)
from is whatever column you'd like to start from, in your case the 4th. to is where you'd like to stop, so I used the ncol function to count how many columns are in the matrix. In this case the number of columns (10) isn't a multiple of 4, but that doesn't matter because seq stops at the last value that doesn't exceed to. by=4 because you want to select every fourth column.
colsyouwant now equals 4 8. Simply use brackets and the name of your variable to get the columns you want: foo[,colsyouwant]. The brackets specify which part of the matrix to return, in the order [rows,columns]. Since I want all the rows I leave that spot blank, and then I specify the columns using the colsyouwant variable, in other words 4 8.
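Put together, the steps above can be sketched as follows. The dummy matrix is built from exactly 100 values here, and the variable name selected is only illustrative:

```r
# Dummy 10 x 10 matrix, filled column by column
foo <- matrix(rep(c(4, 3, 2, 7), 25), nrow = 10, ncol = 10)

# Indexes of every 4th column, starting at column 4
colsyouwant <- seq(from = 4, to = ncol(foo), by = 4)

# All rows, only the chosen columns
selected <- foo[, colsyouwant]

print(colsyouwant)
print(dim(selected))
```

The resulting selected matrix can then be passed to whichever plotting function is needed in place of the full matrix.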

Missing values when excluding rows

I have a data frame of about 10,000,000 entries. There are only two columns: 'value' and 'deleted'. The values usually range from 1 to 1800, but there are also some odd strings. Deleted is a boolean indicating whether the value was deleted. If I copy this data frame with the condition
deletedFrame <- df[df$deleted!=0, ]
the resulting data frame reduces to 283 entries. However, it doesn't copy over any of the corresponding values. That column is there but is left blank. Any ideas on what I'm doing wrong?
It could be a case where the deleted column contains NA along with the boolean values. One way to handle that would be to use
df[df$deleted!=0 & !is.na(df$deleted), ]
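To see why NA values could produce blank-looking rows, here is a small sketch with invented data. Comparing NA with != gives NA, and indexing a data frame with an NA in the logical vector returns a row filled entirely with NA. Whether this is what happened in the original data is an assumption.

```r
# Hypothetical data: two rows have a missing 'deleted' flag
df <- data.frame(
  value = c("12", "1800", "abc", "7", "45"),
  deleted = c(1, 0, NA, 1, NA),
  stringsAsFactors = FALSE
)

# NA in the condition yields all-NA rows in the result
with_na <- df[df$deleted != 0, ]

# Excluding NA explicitly keeps only the genuinely deleted rows
deletedFrame <- df[df$deleted != 0 & !is.na(df$deleted), ]

# which() drops NA on its own, so this selects the same rows
deletedFrame2 <- df[which(df$deleted != 0), ]

print(with_na)
print(deletedFrame)
```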

Filling in values in a blank data frame

I have a data frame with a number of columns I read in, and now I want to add certain pieces only to certain columns.
For example, the variable periodicnumber exists in the data frame called df and I want to give the first six rows the values 1 through 6. I thought the code below would work but I get the error:
periodicnumber=seq(1,6)
df$periodicnumber=periodicnumber
Error in `$<-.data.frame`(`*tmp*`, "periodicnumber", value = 1:6) :
replacement has 6 rows, data has 0
As in, were this in Excel, I would write the numbers 1 through 6 only on the periodicnumber column.
If you only want to change the first six rows of df, you need to specify that in the assignment:
periodicnumber=seq(1,6)
df$periodicnumber[1:6]<-periodicnumber
More generally:
df$column[1:length(x)]<-x
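A small sketch of that assignment, assuming a made-up data frame that already has ten rows (the symbol column is hypothetical). Here seq_along(x) stands in for 1:length(x), which gives the same indexes for a non-empty x:

```r
# Hypothetical data frame with ten empty rows
df <- data.frame(
  periodicnumber = rep(NA_integer_, 10),
  symbol = rep(NA_character_, 10),
  stringsAsFactors = FALSE
)

x <- seq(1, 6)

# Fill only the first length(x) rows of the column
df$periodicnumber[seq_along(x)] <- x

print(df$periodicnumber)
```

This assumes df already has at least six rows. The "data has 0" part of the error quoted in the question suggests the data frame there had no rows at all, in which case the rows would need to exist before any of them can be filled.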

extract columns that don't have a header or name in R

I need to extract the columns from a dataset without header names.
I have a ~10000 x 3 data set and I need to plot the first column against the second two.
I know how to do it when the columns have names ~ plot(data$V1, data$V2) but in this case they do not. How do I access each column individually when they do not have names?
Thanks
Why not give them sensible names?
names(data)=c("This","That","Other")
plot(data$This,data$That)
That's a better solution than using the column number, since names are meaningful and if your data changes to have a different number of columns your code may break in several places. Give your data the correct names and as long as you always refer to data$This then your code will work.
I usually select columns by their position in the matrix/data frame.
e.g.
dataset[,4] to select the 4th column.
The 1st number in brackets refers to rows, the second to columns. Here, I didn't use a "1st number" so all rows of column 4 are selected, i.e., the whole column.
This is easy to remember since it stems from matrix calculations. E.g., a 4x3 dimensional matrix has 4 rows and 3 columns. Thus when I want to select the 1st row of the third column, I could do something like matrix[1,3]
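Both suggestions can be sketched together as below. The three-row data set is invented and read from a string purely for illustration. The plot call is left as a comment so the snippet runs without a graphics device.

```r
# Hypothetical headerless data, three columns
data <- read.table(text = "1 2 3\n2 4 6\n3 6 9", header = FALSE)

# Access by position: [rows, columns], or [[ ]] for a single column
first <- data[, 1]
second <- data[[2]]

# Or assign meaningful names and use them from then on
names(data) <- c("This", "That", "Other")

# plot(data$This, data$That)
print(data)
```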
