I have 7909 observations in a data frame. I want to tabulate engineer ID against a logical value (TRUE/FALSE) indicating whether the engineer arrived on site early or late:
> head(table(f_SOMs$ENG.ID, f_SOMs$EARLY))
                FALSE TRUE
  4PXO18            0    0
  5BZS21            0    0
  5DAD37            0    0
  6YJM50 6YGP51     0    0
  7JVE00            0    0
  7KAM01            0    0
The problem is that R doesn't print the whole table and ends the output with this message:
[ reached getOption("max.print") -- omitted 5061 rows ]
How can I get all the values? Is there an easier way?
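One possible approach, sketched on made-up data: the message comes from R's print limit, so the table can either be stored as a data frame (and inspected or exported from there) or printed after raising `max.print`. The toy `f_SOMs` below and the limit of 100000 are assumptions for illustration, not values from the real data.

```r
# Toy stand-in for the real f_SOMs data frame (hypothetical values)
f_SOMs <- data.frame(
  ENG.ID = c("4PXO18", "4PXO18", "5BZS21", "7JVE00"),
  EARLY  = c(TRUE, FALSE, TRUE, TRUE)
)

# Cross-tabulate engineer ID against the EARLY flag
tab <- table(f_SOMs$ENG.ID, f_SOMs$EARLY)

# Keep the counts as a data frame: one row per engineer,
# with columns named "FALSE" and "TRUE"
counts <- as.data.frame.matrix(tab)

# Raise the print limit for this session (the value is an arbitrary
# assumption), print everything, then restore the previous setting
old <- options(max.print = 100000)
print(counts)
options(old)
```

Once the counts are in a data frame they can also be written to a file with `write.csv()` instead of being read off the console.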
Related
A sample of my data can be seen below. The data contains information about ties between organizations (over 2000 organizations; the CSV file has 0s, 1s, and empty cells).
      A2654 B0004 B0188 B1278 B1372 B1722 B2503
A2654     0     1     0     0     0     1     0
B0004     1     0     0     0     0     1     0
B0188     0     0     0     0     0     0     0
B1278     0     0     0     0     0     0     0
B1372     0     0     0     0     0     0     0
B1722     1     1     0     0     0     0     0
(1) The first problem is that I can't import this data (.csv) into R properly.
I run the following code:
dt <- read_csv2("Org_ties.csv")
The problem is that the first column header is left empty in the CSV file (as it should be), but when the file is read into R, read_csv() generates the label "X1" for this column. I read the file in order to run the next line, which should produce a graph:
g=graph_from_adjacency_matrix(dtmtrx, mode="directed", weighted = T)
However, I get the error message below. I think it is because the file is not being read properly.
graph.adjacency.dense(adjmatrix, mode = mode, weighted = weighted, :
not a square matrix
In addition: Warning message:
In mde(x) : NAs introduced by coercion
(2) Another puzzling thing is that I cannot seem to transform the current data structure into an edge list. How can I do that? The edge list looks something like this
V1 V2 weight
A2654 B0004 1
A2654 B0188 0
A2654 B1278 0
A2654 B1372 0
A2654 B1722 1
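A rough base R sketch of both steps, under some assumptions: the file is semicolon-separated (as the use of `read_csv2` suggests), the first column holds the organization IDs, and empty cells mean "no tie". The inline `csv_text` is a made-up three-organization stand-in for Org_ties.csv. Reading the first column as row names keeps it out of the matrix, which is one way to avoid the extra "X1" column that makes the matrix non-square.

```r
# Made-up stand-in for Org_ties.csv: semicolon-separated, empty first
# header cell, and one empty cell in the body
csv_text <- paste(
  ";A2654;B0004;B0188",
  "A2654;0;1;0",
  "B0004;1;0;0",
  "B0188;0;;0",
  sep = "\n"
)

# Use the first column as row names instead of as a data column
dt <- read.csv2(text = csv_text, row.names = 1, check.names = FALSE)

# Numeric matrix; treating empty cells (read as NA) as 0 is an assumption
m <- as.matrix(dt)
m[is.na(m)] <- 0
stopifnot(nrow(m) == ncol(m), all(rownames(m) == colnames(m)))

# Edge list in row-major order: every (row, column) pair with its weight
edges <- data.frame(
  V1     = rep(rownames(m), each = ncol(m)),
  V2     = rep(colnames(m), times = nrow(m)),
  weight = as.vector(t(m))
)

# Drop self-pairs, as in the desired output
edges <- edges[edges$V1 != edges$V2, ]
```

For the real file, `text = csv_text` would be replaced by the file name. The square matrix `m` is the kind of object the `graph_from_adjacency_matrix()` call in the question expects.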
I am currently working with about 301 rows of data and want to determine the earliest row at which only a few particular columns are nonzero. However, I also want to ensure that this remains true for every later row. For example, if two columns are nonzero while all other columns are zero, but further down the data frame other columns become nonzero as well, then I have to pick a later row as the "correct" one.
I have the data:
    1         x        y z xx xy          xz
292 0 -8.965140 9.596890 0  0  0 -0.03147483
293 0 -9.079889 9.645991 0  0  0 -0.02722520
294 0 -8.967767 9.597826 0  0  0           0
295 0 -9.090561 9.650230 0  0  0 -0.02685287
296 0 -9.081568 9.646105 0  0  0 -0.02716237
297 0  0.000000 0.000000 0  0  0  0.00000000
298 0  0.000000 0.000000 0  0  0  0.00000000
299 0 -9.098568 9.628576 0  0  0 -0.02654466
300 0 -9.089815 9.646099 0  0  0 -0.02681748
301 0 -8.998078 9.605140 0  0  0           0
As you can see, only the variables x and y are nonzero in row 294; however, the xz variable contains nonzero values in later rows, up to row 300. Is it possible to write a function that tells me the earliest row from which only x and y are nonzero and it stays that way until the final row of the data frame?
I'm sorry if the question is difficult to understand; I found it hard to describe exactly what I am trying to do.
EDIT: I presume I could use something like
which(df$x != 0 & df$y != 0 &
      df[, 1] == 0 & df[, 4] == 0)
but then I need to somehow extend the "equal to zero" condition to all the other columns of df.
Thanks in advance.
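One way this might be done, as a sketch rather than a definitive answer: flag every row where the chosen columns are all nonzero and all remaining columns are zero, then return the position just after the last row that breaks the pattern. The function name `first_stable_row`, its `keep` argument, and the toy data are made up for illustration, and rows where x and y are themselves zero are assumed to break the pattern.

```r
# Returns the position of the earliest row from which only the `keep`
# columns are nonzero through to the last row, or NA if the last row
# itself does not fit the pattern
first_stable_row <- function(df, keep = c("x", "y")) {
  others <- setdiff(names(df), keep)
  # TRUE where all `keep` columns are nonzero and all other columns are zero
  ok <- rowSums(df[keep] != 0) == length(keep) &
    rowSums(df[others] != 0) == 0
  bad <- which(!ok)
  if (length(bad) == 0) return(1L)              # pattern holds everywhere
  if (max(bad) == nrow(df)) return(NA_integer_) # last row breaks the pattern
  max(bad) + 1L
}

# Toy data loosely modelled on the rows shown above (values are made up)
df <- data.frame(
  x  = c(-8.97, -8.97, -9.09, 0, -9.00, -8.99),
  y  = c( 9.60,  9.60,  9.65, 0,  9.61,  9.60),
  z  = 0,
  xz = c(-0.03, 0, -0.027, 0, 0, 0)
)

first_stable_row(df)
```

The result is a row position, not a row name, so for a data frame whose row names are not 1..n it would be looked up with `rownames(df)[first_stable_row(df)]`.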
I have generated a heatmap with pheatmap and, for some reason, I want the rows to appear in a predefined order.
I saw in previous posts that the solution is to set the parameter cluster_rows to FALSE and to sort the matrix into the order we want, like this in my case:
        Otu0085 Otu0086 Otu0087 Otu0088 Otu0091
AB200         0       0       0       0       0
2            91       0       2       1       0
20CF360       0       1       0       1       0
19CF359       0       0       0       2       0
11VP12        0       0       0       0     155
11VP04        4       1       0       0     345
However, when I do:
pheatmap(shared,cluster_rows = F)
My rows are sorted alphabetically, like this:
10CF278a
11
11AA07
11CF278b
11VP03
11VP04
11VP05
11VP06
11VP08
11VP09
Any suggestions would be welcome.
Thanks in advance.
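A small check that may help narrow this down, assuming `shared` is a matrix with row names: reorder it explicitly by a vector of row names and confirm the order just before plotting. The toy matrix and the `row_order` vector below are hypothetical; only the `pheatmap()` call is taken from the question, and it is left commented out so the sketch runs without the package.

```r
# Toy matrix whose rows start out in alphabetical order (made-up values)
shared <- matrix(
  c(4, 1,
    0, 1,
    0, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("11VP04", "20CF360", "AB200"),
                  c("Otu0085", "Otu0086"))
)

# Desired row order (hypothetical)
row_order <- c("AB200", "20CF360", "11VP04")

# Guard against typos in the requested names, then reorder by name
stopifnot(all(row_order %in% rownames(shared)))
shared_ordered <- shared[row_order, , drop = FALSE]

# Confirm the order that will actually be passed to the plot
rownames(shared_ordered)

# pheatmap(shared_ordered, cluster_rows = F)
```

If the printed row names are already alphabetical at this point, the reordering was lost before the plotting call rather than inside pheatmap.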
I have multiple response questions which have 5 categories (values). I want to find the respondents who answered only one category.
For example, respondents who answered category 1 and none of categories 2, 3, 4 and 5.
I want only the "A" mentions, i.e. those who checked category A alone. I need a count of these.
Help, please.
The following solution assumes the data has 5 dichotomous variables, one for each of the multiple response categories.
* creating some sample data to demonstrate on.
data list list/cat1 to cat5.
begin data
1 0 0 0 1
0 1 1 0 0
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 0 1
1 0 0 0 0
1 1 1 0 0
end data.
* now checking in which cases only category 1 was chosen.
compute NumCats=sum(cat1 to cat5).
if cat1=1 and NumCats=1 onlyCat1=1.
execute.
* if instead you wish to do the same check for each of the 5 categories,
use `do repeat` this way.
do repeat cat=cat1 to cat5/only=only1 to only5.
compute only=(cat=1 and NumCats=1).
end repeat.
execute.
But ditch the EXECUTE commands. They just cause a useless data pass in this case except for immediately updating the Data Editor (instead of updating on the next data pass).
What's the 'nature' of a table.ff object in R? dim of table.ff is NULL, and typically it is used for frequency measures. I could not find any function to add all columns together in order to do some statistics on the resulting numeric vector. str of my example of table.ff is num [1:215558488] 0 0 0 0 0 0 0 0 0 0 ...
Thanks ahead for any thoughts!