Unexpectedly transposed flipped output from R "image" function - r

Say I have a matrix:
m<-matrix(1:5,4,5)
m
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    5    4    3    2
[2,]    2    1    5    4    3
[3,]    3    2    1    5    4
[4,]    4    3    2    1    5
Now, when I do
image(m)
I get unexpected output. And so I need to do:
image(t(m)[,4:1])
to get it the "right" way. What is the point?

Others have pointed out that what you are seeing is consistent with the documentation. Here are a couple of thoughts on why it works that way.
The image function was not originally designed to plot images/graphics, but to represent tabular information graphically. The ordering was therefore intended to be consistent with other graphing conventions rather than to make sure that clipart looked correct. This means that rotating or mirroring the image does not make it "wrong"; it is just a different view, and the view that followed the plotting rules was chosen.
It also tried to be consistent with other graphing functions and the philosophy that they were based on. For a scatter plot we use plot(x,y) with x being the horizontal axis, but when we do table(x,y) the x variable forms the rows of the resulting table. Both of these facts are consistent with common practice (the explanatory variable is generally the row variable in a table since numbers are easier to compare vertically). So the image function uses the rows of the matrix (the x variable if it came from the table function) as the predictor/explanatory variable on the horizontal axis. It is also customary for values in plots to increase going left to right and bottom to top (but in tables it is more common to increase going top to bottom).

From the help file:
Notice that image interprets the z matrix as a table of f(x[i], y[j])
values, so that the x axis corresponds to row number and the y axis to
column number, with column 1 at the bottom, i.e. a 90 degree
counter-clockwise rotation of the conventional printed layout of a
matrix.
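The rotation described in the help file can be undone with a small wrapper. This is only a minimal sketch; the helper name as_printed is hypothetical and not part of base R:

```r
# Hypothetical helper: reorient a matrix so that image() draws it the way
# print() shows it (row 1 at the top, column 1 on the left).
as_printed <- function(m) t(m[nrow(m):1, , drop = FALSE])

m <- matrix(1:5, 4, 5)
z <- as_printed(m)        # same matrix as t(m)[, 4:1] from the question
image(z, axes = FALSE)    # rows of m now run top to bottom
```

Reversing the rows before transposing is equivalent to transposing first and then reversing the columns, which is what the question's workaround does.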


How to use text function in R?

I am learning graphical analysis using R. Here is some code that I cannot understand.
barplotVS <- barplot(table(mtcarsData$vs), xlab="Type of engine")
text(barplotVS,table(mtcarsData$vs)/2,table(mtcarsData$vs),cex=1.25)
I cannot understand what text() does here. I googled the text() function, which says that the x and y parameters of text(x,y) are numeric vectors of coordinates where the text labels should be written. Can anyone tell me what barplotVS, table(mtcarsData$vs)/2, table(mtcarsData$vs) and cex=1.25 mean in my code?
barplotVS <- barplot(table(mtcarsData$vs), xlab="Type of engine")
print(barplotVS)
outputs:
[,1]
[1,] 0.7
[2,] 1.9
These are the x-axis positions of the centers of the bars in the barplot.
print(table(mtcarsData$vs))
outputs:
0 1
18 14
In this output, the bottom row gives the number of occurrences of each value present in mtcarsData$vs, and the top row gives the values being counted.
When you run the function:
text(barplotVS,table(mtcarsData$vs)/2,table(mtcarsData$vs),cex=1.25)
the first argument gives the x positions of the labels (i.e. 0.7 and 1.9); the second gives the y positions, set here to the counts divided by two (i.e. 9 and 7) so that the labels sit halfway up the bars; the third gives the labels themselves (i.e. 18 and 14); and finally cex is a value that scales the font size.
In any case, R generally has good documentation, which you can call up with the ? operator (as suggested in the comments). To understand the code, try running it and checking what each variable contains with the print or str functions. If you use an IDE (e.g. RStudio), the contents of the variables are shown in a graphical panel, so you don't even need to print them.
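The same call can be written with named arguments, which makes the role of each value easier to see. This sketch assumes that mtcarsData is a copy of the built-in mtcars data set, so it uses mtcars directly; the variable names counts and mids are illustrative only:

```r
# Assumption: mtcarsData in the question is the built-in mtcars data set.
counts <- table(mtcars$vs)                       # number of cars per engine type
mids <- barplot(counts, xlab = "Type of engine") # returns the bar midpoints

# x: bar midpoints, y: half the bar height, labels: the counts themselves
text(x = mids, y = counts / 2, labels = counts, cex = 1.25)
```

Because barplot() returns the midpoints, the labels stay centered even if the number or width of the bars changes.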

R circular "linked" list: adding +1 to last index brings you to first index

I'm trying to implement movement through four points, while recording which points I visit. Think of it as a square. I can move from corner to corner or diagonally.
If you 'unwrap' the square you get a straight line with four points, which can be thought of as 1-2-3-4- where after 4 it goes back to 1. So if I'm at point 2 I can move to 1 and 3 directly or 4 diagonally. I'd implement that as 2-1 / 2+1 for corner-to-corner or 2+/-2 for diagonally. The problem occurs when I'm at 2 and will try to subtract 2 where I'll end up outside of the list.
The thought I've had is that if I could somehow translate my "out of bounds" numbers to in bounds this would be solved. One solution is hard coding that:
0=4
-1=3
5=1
6=2
but I'm pretty sure there is a better way to do this, however I can't seem to find it.
It seems to me all you want is modular arithmetic (bless the lord for math)
magicFun <- function (x) x %% 4
Here is a simple test run
> magicFun(0:6)
[1] 0 1 2 3 0 1 2
Addendum
It's more about math, but the reason it works for negatives is that in Z/nZ ("the world where n is equal to 0") n is "identified" with 0.
This means you can add n as many times as you wish to a given number without changing its "value".
Also, by convention the numbers in Z/nZ are listed as {0, 1, ..., n-1}.
So suppose n = 4 and x = -6, by the above x = x + 2*4 = 2.
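Note that magicFun returns values in 0..3, whereas the question numbers its points 1..4. One way to keep the 1-based labels is to shift before and after taking the remainder. A minimal sketch, with wrap1 as a hypothetical helper name:

```r
# Hypothetical helper: wrap any integer onto the labels 1..n.
wrap1 <- function(x, n = 4) ((x - 1) %% n) + 1

wrap1(c(0, -1, 5, 6))   # the out-of-bounds cases listed in the question
wrap1(2 - 2)            # moving diagonally from point 2
```

This relies on R's %% returning a non-negative result for a positive modulus, which is also why the original answer works for negative inputs.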

Split matrix into 4 sub-matrices with lowest difference between their sum

I have to split a matrix A into 4 sub-matrices in such a way that the difference between the sums of the sub-matrices is as low as possible.
For example, for a matrix A,
3 0 2 -8 -8
5 3 2 2 3
2 5 2 1 4
3 4 -1 4 2
-3 6 2 4 3
I could split it like this:
3 | 0 2 -8 -8
5 | 3 2 2 3
2 | 5 2 1 4
-------------------
3 4 -1 | 4 2
-3 6 2 | 4 3
The sum of all elements within each sub-matrix gives the following result:
10 | 8
-------
11 | 13
Afterwards, I compute all the possible absolute differences between the sums, i.e.
abs(10 - 8) = 2
abs(10 - 11) = 1
abs(10 - 13) = 3
abs(8 - 11) = 3
abs(8 - 13) = 5
abs(11 -13) = 2
Finally, I choose the maximum distance, which is 5.
However, if I split the matrix A in any other way, I get a different maximum distance, which I don't want. I need to find just the 5, but brute force spends too much time going through all the possibilities. Does this problem have a name, or can you maybe give me a hint?
ADDED
The allowable splits are a horizontal split followed by a vertical split above and a possibly different vertical split below the horizontal split. In the example, there are 4 x 4 x 4 = 64 allowable partitions of the matrix.
The max difference between the submatrices of a particular partition is formed by considering all pairs of the 4 submatrices of that partition (there will be 6 such pairs) and taking the largest difference between the sums of the elements of one of the submatrices of the pair and the sum of the elements of the other submatrix of the pair. We wish to find the minimum over all max differences.
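The largest of the six pairwise differences is simply the largest block sum minus the smallest one, so each partition can be scored without enumerating pairs. The following is a brute-force sketch for the 5 x 5 example only; the names partition_score, h, v_top and v_bot are illustrative, and this approach is far too slow for a 4000 x 4000 matrix:

```r
A <- matrix(c( 3, 0,  2, -8, -8,
               5, 3,  2,  2,  3,
               2, 5,  2,  1,  4,
               3, 4, -1,  4,  2,
              -3, 6,  2,  4,  3), nrow = 5, byrow = TRUE)

# Largest pairwise difference between the four block sums of one partition.
# h: last row of the top half; v_top / v_bot: last column of the left block
# above / below the horizontal cut.
partition_score <- function(A, h, v_top, v_bot) {
  n <- nrow(A); m <- ncol(A)
  sums <- c(sum(A[1:h, 1:v_top]),       sum(A[1:h, (v_top + 1):m]),
            sum(A[(h + 1):n, 1:v_bot]), sum(A[(h + 1):n, (v_bot + 1):m]))
  max(sums) - min(sums)   # equals the maximum of all six absolute differences
}

partition_score(A, h = 3, v_top = 1, v_bot = 3)   # the split from the question

# Score every allowable partition (4 x 4 x 4 of them for this matrix).
cuts <- expand.grid(h = 1:(nrow(A) - 1),
                    v_top = 1:(ncol(A) - 1),
                    v_bot = 1:(ncol(A) - 1))
scores <- mapply(function(h, v_top, v_bot) partition_score(A, h, v_top, v_bot),
                 cuts$h, cuts$v_top, cuts$v_bot)
best <- cuts[which.min(scores), ]   # one partition with the smallest score
```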
The actual matrix may be up to 4000 x 4000.
There are some speed-ups over brute force. First of all, by accumulating sums along rows and then down columns you can build a table giving, for each point, the total sum of all points, including that one, no further up than it and no further right than it. Then you can compute the sum in any rectangle by subtracting at most four of these sub-totals: roughly speaking the sum from the top right corner plus the sum from the bottom left corner minus the sums from the other two corners.
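A table of cumulative sums of this kind can be sketched as follows. As an assumption for readability, this version accumulates from the top-left corner rather than from the bottom-left as described above; either orientation works the same way. The names P and rect_sum are illustrative:

```r
A <- matrix(c( 3, 0,  2, -8, -8,
               5, 3,  2,  2,  3,
               2, 5,  2,  1,  4,
               3, 4, -1,  4,  2,
              -3, 6,  2,  4,  3), nrow = 5, byrow = TRUE)

# P[i + 1, j + 1] holds sum(A[1:i, 1:j]); the first row and column are
# zero padding so that no special cases are needed at the edges.
P <- matrix(0, nrow(A) + 1, ncol(A) + 1)
P[-1, -1] <- t(apply(apply(A, 2, cumsum), 1, cumsum))

# Sum of A[r1:r2, c1:c2] from four table lookups.
rect_sum <- function(P, r1, r2, c1, c2) {
  P[r2 + 1, c2 + 1] - P[r1, c2 + 1] - P[r2 + 1, c1] + P[r1, c1]
}

# The four blocks of the split shown in the question.
c(rect_sum(P, 1, 3, 1, 1), rect_sum(P, 1, 3, 2, 5),
  rect_sum(P, 4, 5, 1, 3), rect_sum(P, 4, 5, 4, 5))
```

Building P takes time proportional to the number of cells, after which every rectangle sum costs a constant number of operations.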
For the split pattern the OP has diagrammed, with a horizontal line splitting the entire matrix followed by different vertical lines splitting in each half, the vertical splits must be the most even vertical split of their half. If the most extreme difference between sums is within a vertical split, evening the vertical splits can only improve it. If the most extreme difference between sums is between (for example) a high sum from the top left and a low sum on the bottom right, then evening out either vertical split will either bring the high sum down or the low sum up, evening out the most extreme difference. This means you need only consider the best split in the top half and the best split in the bottom half - you don't need to consider all pairs of splits.
For the case where you have two vertical splits on the same side of a horizontal split, you do not have to try all pairs of positions for the vertical splits: you can start with the leftmost split at the far left, and adjust the rightmost split to cut the remainder as evenly as possible in two. Then move the leftmost split slowly to the right and, as you do so, the rightmost split can be repeatedly adjusted to the right so as to keep splitting the remainder as evenly as possible.
Using these ideas, it seems to me that, for each possible split pattern, once the position of the longest line in that pattern is fixed, you can find the minimum-cost split in O(N) time for a square matrix of side N. With N positions for the longest line, that is O(N^2) per pattern, which is about the same as the time needed to build the table of sums of points below and to the left of each point: linear in the total number of cells, or O(N^2) for a square matrix of side N. It is annoying, though, that there seem to be six different split patterns.

R: Outer, Matrices and vectorizing

I'd like to understand better how outer works and how to vectorize functions. Below is a minimal example of what I am trying to do. I have a set of numbers 2, 3, 4. For each combination (a,b) of these numbers, I create a diagonal matrix with a a b b b on the diagonal and then do something with it, e.g. calculate its determinant (this is just for demonstration purposes). The results of the calculation should be written to a 3 by 3 matrix, one field for each combination.
The code below isn't working: apparently, outer (or my.func) doesn't understand that I don't want the whole lambdas vector to be applied at once. You can see that this is the case when you uncomment the print command.
lambdas <- c(1:2)
my.func <- function(lambda1, lambda2) {
  # print(diag(c(rep(lambda1, 2), rep(lambda2, 2))))
  det(diag(c(rep(lambda1, 2), rep(lambda2, 2))))
}
det.summary <- outer(lambdas, lambdas, FUN = "my.func")
How do I need to modify my function or the call of outer so things behave like I'd like to?
I guess I need to vectorize my function somehow, but I don't know how, and in which way the outer call would be processed differently.
Edit:
I've changed the size of the matrices to make it a bit less messy. I'd like to generate 4 diagonal 4 by 4 matrices with the following diagonals; in brackets are the corresponding parameters lambda1, lambda2:
1 1 1 1 (1,1), 1 1 2 2 (1,2), 2 2 1 1 (2,1), 2 2 2 2 (2,2).
Then, I want to calculate their determinants (which is an arbitrary choice here) and put the results into a matrix, whose first column corresponds to lambda1=1, the second to lambda1=2, and the rows correspond to the choice of lambda2. det.summary should be a 2 by 2 matrix with the following values:
1 4
4 16
as these are the determinants of the diagonal matrices listed above.
What do you know, there is a Vectorize function (capital "V")!
outer(lambdas,lambdas, Vectorize(my.func))
# [,1] [,2]
# [1,] 1 4
# [2,] 4 16
As you figured out (and as it took me a while to figure out) outer requires the function to be vectorized. In some ways, it is the opposite of the *pply functions which effectively vectorize an operation by feeding the operator/function each value in turn. But this is easily dealt with, as shown above.
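To see why the original call fails, it helps to spell out roughly what outer does: it expands both inputs to full-length vectors and calls the function once on them. The sketch below mimics that expansion by hand and then applies my.func pairwise with mapply, which is what Vectorize wraps around the function; the names X, Y and res are illustrative:

```r
lambdas <- 1:2
my.func <- function(lambda1, lambda2) {
  det(diag(c(rep(lambda1, 2), rep(lambda2, 2))))
}

# outer() passes these two expanded vectors to FUN in a single call,
# so an unvectorized my.func sees all values at once and returns one number.
X <- rep(lambdas, times = length(lambdas))   # 1 2 1 2
Y <- rep(lambdas, each = length(lambdas))    # 1 1 2 2

# Calling my.func once per (X[k], Y[k]) pair gives one result per cell.
res <- matrix(mapply(my.func, X, Y), nrow = length(lambdas))
res
```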

R spline function given a fixed space

So, I need to generate a spline function to feed into another program, which only accepts a fixed spacing between consecutive points. I used the spline function in R with a given number of points to generate the spline; however, the floating-point cutoff makes the spacing between the points variable, for example:
spline(d$V1, d$V2, n=(max(d$V1)-min(d$V1))/0.0200)
> head(t.spl, 7)
x y
1 2.3000 -3.0204
2 2.3202 -3.0204
3 2.3404 -3.0204
4 2.3606 -3.0204
5 2.3807 -3.0204
6 2.4009 -3.0204
7 2.4211 -3.0204
So the spacing between the 1st and 2nd rows is 0.0202, while between the 4th and 5th it is 0.0201. Because of this, the other program that I am feeding this spline into doesn't accept it. Is there any way to make this work?
As an aside: please provide a reproducible example next time (I can't copy/paste your code in because I don't have d or t.spl)
I think you'll find that the different intervals (0.0202 vs 0.0201) are an artifact of the number of digits being printed on the screen, not of the spline function.
It seems R is printing 4 digits after the decimal point for you for neatness, so it's doing the rounding only for the purposes of displaying the results to you.
You can see how many digits are displayed with options('digits')$digits, and adjust it with options(digits=new_number_of_digits) (see ?options for details).
For example:
options(digits=4)
pi
# 3.142
options(digits=10)
pi
# 3.141592654
In summary, when you feed the values in to your other program, make sure you print the values with enough decimal points that the other program accepts the intervals as being "equal".
If you are writing to a file, for example, just make sure you write enough digits out. If you are copy-pasting from the R console, make sure you adjust R to print out enough digits.
MathematicalCoffee is probably right. I'm just adding an alternative for the sake of wordiness.
myspline <- splinefun(d$V1, d$V2)
mydata.y <- myspline(desired_x_values,deriv=0)
This will guarantee the uniform x-spacings you desire, as long as desired_x_values is itself evenly spaced.
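Putting both answers together, one possible approach is to build the x grid explicitly with a fixed step and then write the values out with enough decimal places. Note that spline() with n points spreads them evenly over the whole range, so the step it produces is the range divided by n - 1, not by n. The data below are hypothetical stand-ins for d$V1 and d$V2, and the output format is an assumption about what the other program expects:

```r
# Hypothetical data standing in for d$V1 and d$V2.
x <- c(2.3, 2.9, 3.4, 4.1, 4.7)
y <- c(-3.02, -2.50, -1.10, 0.40, 0.20)

f <- splinefun(x, y)

# Fixed step of 0.02; the last grid point may fall short of max(x)
# when the range is not a multiple of the step.
grid_x <- seq(min(x), max(x), by = 0.02)
grid_y <- f(grid_x, deriv = 0)

# Format with a fixed number of decimals so that rounding in the printed
# text cannot make the spacing look uneven (assumed format: "x y" per line).
out <- sprintf("%.4f %.6f", grid_x, grid_y)
head(out, 3)
```

The formatted lines can then be written to a file with writeLines(out, ...) instead of being copied from the console.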
