Scilab - program to find position of zero

255 255 255 255 255 255 0 0 0 0 255 255 255
255 255 255 255 255 0 255 255 0 0 0 255 255
255 255 255 255 0 255 255 255 0 0 0 0 255
255 255 255 0 255 255 255 255 0 0 0 0 0
255 255 255 0 0 0 0 0 0 0 0 0 0
255 255 255 0 0 0 0 0 0 255 255 255 0
255 255 255 255 0 0 0 0 0 255 255 0 255
255 255 255 255 0 0 0 0 0 255 0 0 255
255 255 255 255 255 255 0 0 0 0 255 255 255
How can I use Scilab functions to find the position of a zero, for example the one at row 5, column 9?

Assuming your matrix is saved in "data.dat", you can first read the matrix from the file and store it in a variable (here "M") using:
M = fscanfMat('/<add filepath here>/data.dat');
The next step is to find the matrix elements that equal 0. This can be done using:
[row, column] = find(M == 0);
Here "row" is a (1-by-n) vector containing the row indices of the elements equal to zero, and "column" contains the corresponding column indices.
If you are interested in how many zeros were found, you can use
n = size(row, 'c')
which tells you that 53 zeros were found.

Related

How can I extract distinct elements from a matrix's rows in R?

I have a matrix as shown below, and I want to extract from it another matrix without any duplicated elements in each row.
This is the input matrix
head(Data_Achat2)
ID_Achat 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
1 1349 433 405 451 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2 4890 405 405 416 416 388 464 416 388 392 405 393 405 433 453 392 416 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3 7881 405 384 390 395 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4 8081 442 405 405 475 464 405 442 405 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
5 9465 457 417 416 391 441 441 392 441 401 441 432 388 395 466 464 399 475 466 464 481 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6 10626 432 390 433 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
In other words, for the second row I want to get something like this:
2 4890 405 416 388 464 392 393 433 453
Each row of the new matrix then holds only the distinct elements of the corresponding input row, and all results sit in one matrix (which also includes 0 values for missing values).
I would row-wise apply a function that only retains the m unique values and then "pad" that vector to a length N with zeros, by adding N - m zeros to the unique values:
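N <- ncol(Data_Achat2)
t(apply(Data_Achat2, 1, function(x) {
  # distinct values of the row, in order of first appearance
  uniques <- unique(x)
  # pad with zeros so every row keeps the original length N
  c(uniques, rep(0, N - length(uniques)))
}))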
Which results in:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] --- [,36] [,37]
1 1349 433 405 451 0 0 0 0 0 0 0 0 0 0 0 0 0 --- 0 0
2 4890 405 416 388 464 392 393 433 453 0 0 0 0 0 0 0 0 --- 0 0
3 7881 405 384 390 395 0 0 0 0 0 0 0 0 0 0 0 0 --- 0 0
4 8081 442 405 475 464 0 0 0 0 0 0 0 0 0 0 0 0 --- 0 0
5 9465 457 417 416 391 441 392 401 432 388 395 466 464 399 475 481 0 --- 0 0
6 10626 432 390 433 0 0 0 0 0 0 0 0 0 0 0 0 0 --- 0 0
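The unique-then-pad idea can be tried on a small, self-contained matrix. This is only a sketch: `purchases` and `dedupe_row` are hypothetical names, and unlike the answer above it drops the zeros before calling unique(), so zeros that appear between values would also be moved to the end.

```r
# Hypothetical 3 x 6 purchase matrix; 0 marks an empty slot
purchases <- matrix(c(433, 405, 451,   0,   0,   0,
                      405, 405, 416, 416, 388, 405,
                      442, 405, 405, 442,   0,   0),
                    nrow = 3, byrow = TRUE)

# Keep the distinct non-zero values of one row, then pad with zeros
dedupe_row <- function(x, width) {
  kept <- unique(x[x != 0])              # first-seen order is preserved
  c(kept, rep(0, width - length(kept)))  # back to the original row length
}

# apply() returns one column per input row, hence the transpose
result <- t(apply(purchases, 1, dedupe_row, width = ncol(purchases)))
print(result)
```

Because every row is padded back to the same width, apply() can always bind the results into a rectangular matrix.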

Generate an image with specific dimensions from a data frame in R

I have a data frame in R with the following dimensions [15750,93]. I want to construct an image using this data such that there are 3 row coordinates and 31 column coordinates in the image. Each column in the data frame corresponds to data from one coordinate position in the image. The columns in the data frame have been arranged based on their respective coordinates in the following manner [1,1], [2,1], [3,1], [1,2], [2,2], [3,2] ......... [1,31],[2,31],[3,31]
To generate the image, for each column I would like to have an average of all values, a sum of all values and the highest value in each column. This way there will be exactly one value corresponding to a coordinate. And, with the 3 variations, I should get three types of images - average, sum and highest value.
Can someone help me in generating an overall image using this data or can guide me using data with smaller dimensions?
Some demo data below:
Dimensions of the data frame are [11, 15]
0 0 0 0 0 46 0 0 0 0 0 0 0 78 0
0 734 0 0 0 0 932 0 0 56 0 0 0 0 0
0 0 0 115 0 0 0 0 0 0 64 0 0 0 0
0 67 0 0 0 45 0 0 0 0 0 546 0 12 0
0 0 0 0 65 5 56 0 54 0 0 0 0 0 0
667 0 430 0 0 0 0 456 0 0 787 0 0 467 0
0 0 0 0 54 0 0 0 0 0 0 456 90 0 0
778 45 0 0 0 0 24 913 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 26 0 0 0
234 0 0 620 0 0 0 0 0 106 0 0 901 0 0
0 0 0 0 0 0 45 0 34 0 0 0 0 0 0
I would like to have an image with the dimensions [3,5]. The columns in the above data frame have been arranged based on their respective coordinates in the following manner: [1,1], [2,1], [3,1], [1,2], [2,2], [3,2] ... and so on.
The image coordinate arrangement
[1,1] [1,2] [1,3] [1,4] [1,5]
[2,1] [2,2] [2,3] [2,4] [2,5]
[3,1] [3,2] [3,3] [3,4] [3,5]
This function reads in your dataset and finds the mean (or max or sum) of each column (yielding a series of numbers, one per column). It then reshapes that series into your desired output dimensions and displays as an image.
df <- read.table(header=FALSE,text="
0 0 0 0 0 46 0 0 0 0 0 0 0 78 0
0 734 0 0 0 0 932 0 0 56 0 0 0 0 0
0 0 0 115 0 0 0 0 0 0 64 0 0 0 0
0 67 0 0 0 45 0 0 0 0 0 546 0 12 0
0 0 0 0 65 5 56 0 54 0 0 0 0 0 0
667 0 430 0 0 0 0 456 0 0 787 0 0 467 0
0 0 0 0 54 0 0 0 0 0 0 456 90 0 0
778 45 0 0 0 0 24 913 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 26 0 0 0
234 0 0 620 0 0 0 0 0 106 0 0 901 0 0
0 0 0 0 0 0 45 0 34 0 0 0 0 0 0
")
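img <- function(data, op, tall, wide) {
  # one summary value per column, filled column-wise: [1,1], [2,1], [3,1], [1,2], ...
  m <- matrix(sapply(data, op), nrow = tall, ncol = wide)
  # image() draws matrix rows along the x-axis, bottom-up, so flip the rows and transpose
  image(t(m[tall:1, , drop = FALSE]), col = gray((0:32) / 32))
}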
img(df, mean, 3, 5)
img(df, max, 3, 5)
img(df, sum, 3, 5)
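The reshaping step can be checked numerically before plotting. This is a minimal sketch, assuming a hypothetical 2-by-6 data frame (`df_small`) whose columns map to a 3 x 2 image in the order [1,1], [2,1], [3,1], [1,2], ... Filling a matrix with nrow equal to the image height reproduces that column-wise order. Reversing the rows before transposing is the usual way to make image() draw row 1 at the top.

```r
# Hypothetical data: 2 observations, 6 columns (a 3 x 2 image)
df_small <- data.frame(matrix(1:12, nrow = 2))

# One value per column; swap sum for mean or max as needed
col_sums <- sapply(df_small, sum)

# Column-wise fill matches the [1,1], [2,1], [3,1], [1,2], ... ordering
m <- matrix(col_sums, nrow = 3, ncol = 2)
print(m)

# Orientation that image() needs to show m the way it prints
z <- t(m[nrow(m):1, , drop = FALSE])
# image(z, col = gray((0:32) / 32))  # uncomment to draw
```

Printing `m` first makes it easy to confirm that each coordinate holds the value from the intended column.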

Remove duplicate header lines in dataframe

My raw data contains numeric values, with the header lines repeated every 20 lines.
I wish to remove the repeated header lines with R. I know this is quite easy with the sed command, but I want the R script to handle all the steps of tidying the data.
> raw <- read.delim("./vmstat_archiveadm_s.txt")
> head(raw)
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr s2 s3 vc -- in sy cs us sy id
0 0 0 100097600 97779056 285 426 53 0 0 0 367 86 6 0 0 1206 7711 2630 1 0 99
0 0 0 96908192 94414488 7 31 0 0 0 0 0 120 0 0 0 2782 5775 5042 2 0 97
0 0 0 96889840 94397152 0 1 0 0 0 0 0 122 0 0 0 2737 5591 4958 2 0 97
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr s2 s3 vc -- in sy cs us sy id
0 0 0 100065744 97745448 282 422 52 0 0 0 363 89 6 0 0 1233 7690 2665 1 0 99
0 0 0 96725312 94222040 7 31 0 0 0 0 0 604 69 0 0 5269 5703 7910 2 1 97
0 0 0 96668624 94170784 0 0 0 0 0 0 0 155 53 0 0 3047 5505 5317 2 0 97
0 0 0 96595104 94086816 0 0 0 0 0 0 0 174 0 0 0 2879 5567 5068 2 0 97
1 0 0 96521376 94025504 0 0 0 0 0 0 0 121 0 0 0 2812 5471 5105 2 0 97
0 0 0 96503256 93994896 0 0 0 0 0 0 0 121 0 0 0 2731 5621 4981 2 0 97
(...)
Try this, where df is the data frame:
x = seq(6,100,21)
df = df[-x,]
seq() generates the numbers from 6 to 100 in steps of 21, which in this case gives:
6 27 48 69 90
Those rows are then removed from the data frame with
df[-x,]
EDIT:
To do this for the entire data frame, replace 100 with the number of rows, i.e.
seq(6,nrow(df),21)
Instead of processing the output in R, I would clean it at the generation level:
$ vmstat 1 | egrep -v '^ kthr|^ r'
0 0 0 154831904 153906536 215 471 0 0 0 0 526 33 32 0 0 1834 14171 5253 0 0 99
1 0 0 154805632 153354296 9 32 0 0 0 0 0 0 0 0 0 1463 610 739 0 0 100
1 0 0 154805632 153354696 0 4 0 0 0 0 0 0 0 0 0 1408 425 634 0 0 100
0 0 0 154805632 153354696 0 0 0 0 0 0 0 0 0 0 0 1341 381 658 0 0 100
0 0 0 154805632 153354696 0 0 0 0 0 0 0 0 0 0 0 1299 353 610 0 0 100
1 0 0 154805632 153354696 0 0 0 0 0 0 0 0 0 0 0 1319 375 638 0 0 100
0 0 0 154805632 153354640 0 0 0 0 0 0 0 0 0 0 0 1308 367 614 0 0 100
0 0 0 154805632 153354640 0 0 0 0 0 0 0 0 0 0 0 1336 395 650 0 0 100
1 0 0 154805632 153354640 0 0 0 0 0 0 0 44 44 0 0 1594 378 878 0 0 100
0 0 0 154805632 153354640 0 0 0 0 0 0 0 66 65 0 0 1763 382 1015 0 0 100
0 0 0 154805632 153354640 0 0 0 0 0 0 0 0 0 0 0 1312 411 645 0 0 100
0 0 0 154805632 153354640 0 0 0 0 0 0 0 0 0 0 0 1342 390 647 0 0 100
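The same pattern-based filtering can also be done inside R, which avoids depending on the exact spacing of the repeated headers. The sketch below is an assumption-laden illustration: `lines` is a hypothetical stand-in for readLines("./vmstat_archiveadm_s.txt"), and it treats every line whose first non-blank character is a digit as data.

```r
# Hypothetical stand-in for readLines("./vmstat_archiveadm_s.txt")
lines <- c(
  " kthr memory page disk faults cpu",
  " r b w swap free re mf pi po fr de sr s2 s3 vc -- in sy cs us sy id",
  " 0 0 0 100097600 97779056 285 426 53 0 0 0 367 86 6 0 0 1206 7711 2630 1 0 99",
  " 0 0 0 96908192 94414488 7 31 0 0 0 0 0 120 0 0 0 2782 5775 5042 2 0 97",
  " kthr memory page disk faults cpu",
  " r b w swap free re mf pi po fr de sr s2 s3 vc -- in sy cs us sy id",
  " 0 0 0 100065744 97745448 282 422 52 0 0 0 363 89 6 0 0 1233 7690 2665 1 0 99"
)

# Data lines start with a digit after optional blanks; header lines do not
is_data <- grepl("^[[:space:]]*[0-9]", lines)
clean <- read.table(text = lines[is_data], header = FALSE)

# Reuse the second header line for column names; make.names() repairs
# "--" and the duplicated "sy"
names(clean) <- make.names(scan(text = lines[2], what = "", quiet = TRUE),
                           unique = TRUE)
print(clean[, c("swap", "free", "id")])
```

Filtering on the line content rather than on row positions keeps working if the header interval changes.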

R - replace values in data frame using lookup table

I was having some trouble lately trying to replace specific values in a data frame or matrix by using a lookup-table.
So this represents the original.data to be modified ...
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14
1 255 255 255 255 255 255 255 255 255 255 255 255 255 255
2 255 255 255 255 255 255 255 255 3 3 255 255 255 255
3 255 255 255 255 255 1 3 3 3 3 3 255 255 255
4 255 255 5 5 5 1 3 3 4 4 3 255 255 255
5 255 5 5 5 5 1 3 4 4 4 4 255 255 255
6 255 5 5 5 1 3 3 3 4 4 3 3 255 255
7 255 255 5 1 3 3 3 3 6 6 6 3 255 255
8 255 255 1 1 1 1 2 2 3 3 6 3 255 255
9 255 255 1 1 1 2 2 2 2 2 3 3 3 255
10 255 255 255 1 2 2 2 2 2 2 2 3 3 255
11 255 255 255 2 2 2 2 2 7 7 7 2 255 255
12 255 255 255 2 2 8 8 8 7 255 255 255 255 255
13 255 255 255 255 8 8 255 255 255 255 255 255 255 255
14 255 255 255 255 255 255 255 255 255 255 255 255 255 255
... and following may be the lookup.table (rows=1:9, column1="Sub", column2="Main"):
Sub Main
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
6 255 255
7 6 3
8 7 2
9 8 2
The aim is to compare e.g.
original.data[11,11] [7] with lookup.table[8,"Sub"] [7]
... and write a new matrix
modified.data[11,11] with lookup.table[8,"Main"] [2].
Until now all I came up with is using for-loops and an if-statement,
for (i in 1:ncol(original.data)){
for (j in 1:nrow(lookup.table)){
if (original.data[i,i]==lookup.table[j,1]){
origingal.data[j,i]<-lookup.table[j,2]
}
}
}
which leads to
Error in origingal.data[j, i] <- lookup.table[j, 2] :
object 'origingal.data' not found
but I cannot figure out the error in my reasoning.
I'd love to get some hints.
Thanks
PROBLEM SOLVED
for (i in 1:ncol(original.data)){
  for (j in 1:nrow(original.data)){
    for (x in 1:nrow(lookup.table)){
      if (original.data[j,i]==lookup.table[x,1]){
        original.data[j,i]<-lookup.table[x,2]
      }
    }
  }
}
... works, but this is a much faster method:
# write into a copy so that already replaced values are not matched again
original.data_modified <- original.data
for(i in 1:nrow(lookup.table)){
  c<-lookup.table[i,1]   # value to look for ("Sub")
  d<-lookup.table[i,2]   # replacement value ("Main")
  original.data_modified[original.data == c] <- d
}
You can try:
# x the original.data (a matrix)
# y the lookup.table
x2 <- y[match(x, y[,1]),2]
dim(x2) <- dim(x)
table(x, x2)
x2
x 1 2 3 4 5 255
1 13 0 0 0 0 0
2 0 22 0 0 0 0
3 0 0 29 0 0 0
4 0 0 0 8 0 0
5 0 0 0 0 11 0
6 0 0 4 0 0 0
7 0 4 0 0 0 0
8 0 5 0 0 0 0
255 0 0 0 0 0 100
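The match()-based lookup can be verified on a tiny example. This is a sketch only: the 2 x 3 matrix is made up for illustration, while the lookup table repeats the Sub/Main pairs from the question.

```r
# Lookup table from the question: Sub values and the Main value they map to
lookup.table <- data.frame(Sub  = c(1:5, 255, 6, 7, 8),
                           Main = c(1:5, 255, 3, 2, 2))

# Hypothetical small matrix standing in for original.data
original.data <- matrix(c(255, 3, 7, 8, 6, 1), nrow = 2)

# match() gives, for each cell, the row of the lookup table with that Sub value
idx <- match(original.data, lookup.table$Sub)
modified.data <- lookup.table$Main[idx]

# Indexing drops the matrix shape, so restore it
dim(modified.data) <- dim(original.data)
print(modified.data)
```

Any value that does not occur in the Sub column comes back as NA, so checking anyNA(modified.data) afterwards is a cheap sanity test.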

How do you silently save an inspect object in R's tm package?

When I save the inspect() object in R's tm package it prints to screen. It does save the data that I want in the data.frame, but I have thousands of documents to analyze and the printing to screen is eating up my memory.
library(tm)
data("crude")
matrix <- TermDocumentMatrix(crude,control=list(removePunctuation = TRUE,
stopwords=TRUE))
out= data.frame(inspect(matrix))
I have tried every trick that I can think of. capture.output() changes the object (not the desired effect), as does sink(). dev.off() does not work. invisible() does nothing. suppressWarnings(), suppressMessages(), and try() unsurprisingly do nothing. There are no silent or quiet options in the inspect command.
The closest that I can get is
out= capture.output(inspect(matrix))
out= data.frame(out)
which notably does not give the same data.frame, but pretty easily could be if I need to go down this route. Any other (less hacky) suggestions would be helpful. Thanks.
Windows 7
64-bit R 3.0.1
tm package is the most recent version (0.5-9.1).
Assign inside the capture then:
capture.output(out <- data.frame(inspect(matrix))) -> .null # discarding this
But really, inspect is for visual inspection, so maybe try
as.data.frame(as.matrix(matrix))
instead (btw matrix is a very unfortunate name for a variable, as that's a base function).
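The assign-inside-capture trick can be tried without tm. In this sketch `noisy()` is a hypothetical stand-in for inspect(): it prints its argument and returns it. The printed text is swallowed by capture.output(), while the assignment still takes effect in the calling environment.

```r
# Hypothetical stand-in for inspect(): prints, then returns its input
noisy <- function(x) {
  print(x)
  invisible(x)
}

# The printed lines end up in 'printed'; 'out' still receives the data frame
printed <- capture.output(out <- data.frame(noisy(matrix(1:4, nrow = 2))))

str(out)
```

Nothing from `noisy()` reaches the console here, and `printed` can simply be discarded.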
Using this input (variable name changed from your question, as a variable named "matrix" can be confusing):
library(tm)
data("crude")
tdm <- TermDocumentMatrix(crude,control=list(removePunctuation = TRUE,
stopwords=TRUE))
Then this will avoid printing to screen
m <- as.matrix(tdm)
and then I would personally do something like
require(data.table)
data.table(m, keep.rownames=TRUE)
# rn 127 144 191 194 211 236 237 242 246 248 273 349 352 353 368 489 502 543 704 708
# 1: 100000 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
# 2: 108 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
# 3: 111 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
# 4: 115 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
# 5: 12217 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
# ---
# 996: yesterday 0 0 0 0 0 0 0 3 0 0 1 0 0 0 0 0 0 0 0 0
# 997: yesterdays 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
# 998: york 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0
# 999: zero 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0
# 1000: zone 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0
