How to combine values of different variables into new variable - r

I have several numerical variables whose values I want to add (+) into a new variable. Everything I tried gave me the error message "non-numeric argument to binary operator". How do I add the values?
I have also tried no_pp <- c("Suppliers" + "Producers" + "Buyers"), but it doesn't show up as a new variable.
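The error usually means that at least one operand is not numeric; quoted names such as "Suppliers" are character strings, not the columns themselves. A minimal sketch of one way to do the addition, assuming the variables are columns of a data frame (the name `df` and all values below are made up for illustration):

```r
# Hypothetical data frame; the column names are taken from the question,
# the values are invented for illustration.
df <- data.frame(
  Suppliers = c(1, 2, 3),
  Producers = c(10, 20, 30),
  Buyers    = c(100, 200, NA)
)

# Refer to the columns themselves, not to quoted strings, and assign the
# result to a new column so it shows up as a variable in the data frame.
df$no_pp <- df$Suppliers + df$Producers + df$Buyers

# rowSums() gives the same totals and can optionally skip missing values.
df$no_pp_narm <- rowSums(df[, c("Suppliers", "Producers", "Buyers")], na.rm = TRUE)

# If a column was read in as text, convert it before adding.
# as.numeric() turns anything it cannot parse into NA (with a warning).
df$Suppliers <- as.numeric(as.character(df$Suppliers))
```

With `+`, a row containing an NA yields NA; `rowSums(..., na.rm = TRUE)` treats the missing value as absent instead. Which behaviour is appropriate depends on the data.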

Related

Does `R` language Override variable values?

I'm creating a variable dataset and assigning a value to it like this:
dataset = iris
Now I assign a different value to the same variable like this:
dataset = read.csv(filename, header = FALSE)
Does R override the previous value of dataset? Can anyone explain how this works, and can we assign more than one value to the same variable?
Yes, R will override the previously assigned value of the variable.
On a side note: unlike many other languages, R conventionally uses <- as its assignment operator, so to make your code more readable to other R users, consider using that instead of =.
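A short sketch of the behaviour described above. The second assignment uses a plain numeric vector instead of read.csv() so the example runs without a file; the name `datasets` is hypothetical:

```r
dataset <- iris                # dataset now refers to the built-in iris data frame
class(dataset)                 # "data.frame"

dataset <- c(1.5, 2.5)         # reassignment replaces the old value entirely
class(dataset)                 # "numeric"

# A name holds one object at a time. To keep several values together,
# store them in a container such as a vector or a list.
datasets <- list(first = iris, second = c(1.5, 2.5))
names(datasets)
```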

Setting values to NA based on other variable

I'm doing analyses and am getting some doubtful results. However, one of my variables is not normally distributed. To check if this is the problem, I've tried to create a new, normally distributed variable that has the same mean and standard deviation as the original.
However, the original also has a lot of NAs, which I also want to mirror in my new variable (so that I can be certain that any differences between the original and the new variable can be attributed to the normal nature of the variable).
I've tried several ways to do it, but keep getting the warning that "the condition has length > 1 and only the first element will be used".
Can anyone help me? Below is the code I used to create a new variable!
data$var_normal <- rnorm(data$var_original, mean = 0.6200154, sd = 0.3555574)
And basically what I want to do:
if(data$var_original==NA) data$var_normal <- NA
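One possible approach, sketched under the assumption that `data` is a data frame with a column `var_original` (the values below are invented): test for missingness with is.na() and use logical indexing, rather than if() with ==.

```r
set.seed(1)  # only to make the illustration reproducible

# Hypothetical stand-in for the data frame in the question.
data <- data.frame(var_original = c(0.2, NA, 0.9, 0.5, NA))

# Draw one normal value per row; mean and sd are the ones quoted in the question.
data$var_normal <- rnorm(nrow(data), mean = 0.6200154, sd = 0.3555574)

# x == NA always evaluates to NA, so test for missingness with is.na().
# Logical indexing applies the replacement to every matching row at once,
# which avoids the length > 1 condition that if() complains about.
data$var_normal[is.na(data$var_original)] <- NA
```

Note that if() expects a single TRUE or FALSE, which is why it is not suited to element-wise work on a whole column.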

What's the easiest way to ignore one row of data when creating a histogram in R?

I have this csv with 4000+ entries and I am trying to create a histogram of one of the variables. Because of the way the data was collected, there was a possibility that if data was uncollectable for that entry, it was coded as a period (.). I still want to create a histogram and just ignore that specific entry.
What would be the best or easiest way to go about this?
I tried making it so that the histogram would only use the data for every entry except the one with the period by doing
newlist <- data1$var[1:3722]+data1$var[3724:4282]
where 3723 is the entry with the period, but R said that + is not meaningful for factors. I'm not sure if I went about this the right way. My intention was to create a vector, list, or table joining those two subsets into one bigger list called newlist.
Your problem is deeper than you realize. When R read in the data and saw the lone . it interpreted that column as a factor (categorical variable).
You need to either convert the factor back to a numeric variable (this is FAQ 7.10) or reread the data, forcing R to read that column as numeric. If you are using read.table or one of the functions that call read.table, you can set the colClasses argument to specify a numeric column.
Once the column is a numeric variable, a negative subscript or !is.na will work (and some functions will automatically ignore the missing value).
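The two routes mentioned above can be sketched as follows. The inline CSV text and the names `csv_text`, `repaired`, and `values` are hypothetical; the na.strings argument is used here as an alternative to colClasses for marking "." as missing at read time.

```r
# Hypothetical miniature of the CSV, with "." marking an uncollectable entry.
csv_text <- "id,var\n1,2.5\n2,.\n3,4.0\n4,3.5"

# Option 1: tell the reader that "." means missing, so the column stays numeric.
data1 <- read.csv(text = csv_text, na.strings = ".")

# Option 2: repair a column that has already been read as a factor (R FAQ 7.10).
f <- factor(c("2.5", ".", "4.0", "3.5"))
# suppressWarnings() hides the expected "NAs introduced by coercion" warning.
repaired <- suppressWarnings(as.numeric(as.character(f)))

# Drop the missing entry explicitly before plotting.
values <- data1$var[!is.na(data1$var)]
hist(values, main = "Histogram of var", xlab = "var")
```

Converting with as.numeric(f) alone would return the internal level codes rather than the original numbers, which is why the intermediate as.character() step is there.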

My data is stored as a matrix and as a list at the same time?

I am using the tabular() function to produce tables in r (tables library).
I want to compute CIs from the data in the output (let mytable be the output from tabular()). Simple enough, I thought, except that when I try to use a value from the matrix, I get the error Error in mytable[1, i] - 1 : non-numeric argument to binary operator. I thought this was odd, because when I call up a particular cell of the matrix (is.matrix returned TRUE for mytable), for example mytable[1, i] for some i, I get an integer. I then ran is.list on mytable and got TRUE as well, so I am not sure what this means. I guess the tabular() function stores the results as a special kind of matrix.
I am only trying to pull out the mean, sdev, and n, which I can do just by typing the cell location; for example, mytable[1, i] would return 86. However, when I try to use the value in qt(.975, df = (mytable[1, i] - 1)), for example, I get the error above. I'm not really sure how to approach this except to manually enter the values into another matrix (which I would like to avoid). Alternatively, if I can compute CIs directly in the tabular() function, that would work too. Cheers.
I shall quote for you the Value section of the documentation on the function ?tabular:
An object of S3 class "tabular". This is a matrix of mode list, whose
entries are computed summary values, with the following attributes:
rowLabels - A matrix of labels for the rows. This will have the same
number of rows as the main matrix, but may have multiple columns for
different nested levels of labels. If a label covers multiple rows, it
is entered in the first row, and NA is used to fill following rows.
colLabels - Like rowLabels, but labelling the columns.
table - The original table expression being displayed. A list of the
original format specifications are attached as a "fmtlist" attribute.
formats - A matrix of the same shape as the main result, containing NA
for default formatting, or an index into the format list.
As the documentation says, each element of the matrix is a list. If your tabular object is called tab, type tab[1,1] and you should see a list containing one of your table values. If I wanted to modify that value, I would probably do something like:
tab[1,1]$term <- value
just like you would modify values in any other list.
Type attributes(tab) and you'll see the items listed above, containing a lot of the formatting information and row/col headers.
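To illustrate what "a matrix of mode list" means, here is a minimal base-R sketch. It builds an ordinary list matrix as a stand-in and does not use the tables package, so it is an assumption that the same idea carries over to a real tabular object; the names `m`, `res`, and `df_t` and the cell values are hypothetical.

```r
# Base-R stand-in for a "matrix of mode list"; this is not a tabular object.
m <- matrix(list(86, 12.3, 4.1, 90), nrow = 2)

is.matrix(m)   # TRUE
is.list(m)     # TRUE

# Single brackets return a list of length one, so arithmetic on it fails
# with "non-numeric argument to binary operator".
res <- try(m[1, 1] - 1, silent = TRUE)
inherits(res, "try-error")   # TRUE

# Double brackets extract the value stored in the cell.
df_t <- m[[1, 1]] - 1
qt(0.975, df = df_t)
```

This is why a cell can print as a plain number yet still be rejected by arithmetic: what is printed is the content of a one-element list.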

R: partimat function doesn't recognize my classes

I am a relatively novice R user and am attempting to use the partimat() function in the klaR package to plot decision boundaries for a linear discriminant analysis, but I keep encountering the same error. I have tried inputting the arguments in several different ways according to the manual, but keep getting the following error:
Error in partimat.default(x, grouping, ...) :
at least two classes required
Here is an example of the input I've given:
partimat(sources1[,c(3:19)],grouping=sources1[,2],method="lda",prec=100)
where my data table is loaded in under the name "sources1" with columns 3 through 19 containing the explanatory variables and column 2 containing the classes. I have also tried doing it by entering the formula like so:
partimat(sources1$group~sources1$tio2+sources1$v+sources1$cr+sources1$co+sources1$ni+sources1$rb+sources1$sr+sources1$y+sources1$zr+sources1$nb+sources1$la+sources1$gd+sources1$yb+sources1$hf+sources1$ta+sources1$th+sources1$u,data=sources1)
with these being the column headings.
I have successfully run an LDA on this same data set without issue so I'm not quite sure what is wrong.
The source code of partimat.default (viewable with getAnywhere(partimat.default)) contains:
if (nlevels(grouping) < 2)
stop("at least two classes required")
So perhaps your grouping column is not defined as a factor. If you try summary(sources1[,2]), what do you get? If it's not a factor, try
sources1[,2] <- as.factor(sources1[,2])
Or, in method 2, try removing the "sources1$" from each variable name in the formula, since the data argument already specifies the data frame in which to look for those names. I think you are effectively specifying the data frame twice, and it might be looking, for instance, for
"sources1$sources1$groups"
Rather than
"sources1$groups"
Without further error messages or a reproducible example (i.e. include some data in your post) it's hard to say really.
HTH
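A small sketch of why the nlevels() check quoted above can fail even when the column clearly contains several classes. The class labels are invented for illustration:

```r
# Hypothetical grouping column stored as character rather than factor.
grouping <- c("basalt", "andesite", "basalt", "rhyolite")

nlevels(grouping)              # 0, because a character vector has no levels

grouping <- as.factor(grouping)
nlevels(grouping)              # 3
```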
