Creating a matrix with values 1-4 in a random placement with limitations - R

I have an experiment where we are going to plant 4 different bushes in two different-sized square plantings (8x8 bushes and 12x12 bushes), and we want their placement in the plantings to be randomized. I thought of creating a matrix in R for this purpose, since it's the software I'm used to. The issue is that the randomization has to follow a few limitations, and I'm not that experienced in creating matrices. The limitations are:
The four corners of the planting have to contain one of each species (each species in exactly one corner).
The two outer rows of the square have to contain an equal number of each species.
The inner 4x4 quadrant of the square has to contain an equal number of each species.
Plants of the same species can't be planted next to each other, but it's fine if they are placed diagonally from each other.
I hope the issue is clear, otherwise ask!
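In case it helps, here is a small R sketch (the function name and layouts are mine, not an established solution) of how the "no identical neighbours" rule could be checked; a rejection- or swap-based search could then use it alongside separate checks for the corner, outer-row and inner-quadrant rules:

# Hedged sketch: checks only the adjacency rule; the corner, outer-row and
# inner 4x4 rules would need their own checks before accepting a layout.
no_adjacent_same <- function(m) {
  all(m[-nrow(m), ] != m[-1, ]) &&   # vertically adjacent cells differ
    all(m[, -ncol(m)] != m[, -1])    # horizontally adjacent cells differ
}

# Tiny illustration on 4x4 layouts with species coded 1-4.
good <- matrix(c(1, 2, 3, 4,
                 2, 3, 4, 1,
                 3, 4, 1, 2,
                 4, 1, 2, 3), nrow = 4, byrow = TRUE)
bad <- good
bad[1, 2] <- 1                       # create two identical orthogonal neighbours
no_adjacent_same(good)  # TRUE
no_adjacent_same(bad)   # FALSE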

Related

How to analyse spatial data using grid codes from a map

I would like to analyse movement data from a semi-captive animal population. We record their location every 5 minutes using a location code which corresponds to a map of the reserve we have made ourselves. Each grid square represents 100 square meters and has a letter and a number to identify it, e.g. H5 or L6 (letters correspond to columns, numbers to rows). I would like to analyse differences in space use between three different periods of time, to answer questions such as: do the animals move around more in certain periods, or are they more restricted in their space use in other periods? Can someone give me any indication of how to go about this? I have looked into spatial analysis in RStudio but haven't come across anything that doesn't use official maps or location coordinates. I've not done this type of analysis before, so any help would be greatly appreciated! Thanks so much.
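Not a full answer, but a sketch of the data-handling side in R, assuming a data frame obs with columns code and period, single-letter column codes, and fixes already ordered in time within each period (all names here are placeholders):

library(dplyr)

obs <- obs %>%
  mutate(
    col = match(toupper(substr(code, 1, 1)), LETTERS),   # "H5" -> column 8
    row = as.integer(substring(code, 2)),                # "H5" -> row 5
    x   = (col - 0.5) * 10,   # cell centres in metres (each cell is 10 m x 10 m)
    y   = (row - 0.5) * 10
  )

# Crude space-use summaries per period: distinct cells visited and mean
# straight-line distance between consecutive fixes.
obs %>%
  group_by(period) %>%
  summarise(
    cells_used = n_distinct(code),
    mean_step  = mean(sqrt(diff(x)^2 + diff(y)^2))
  )

Once the codes are converted to x/y coordinates like this, home-range packages such as adehabitatHR could also be used to compare space use between periods.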

Moving points to a regular grid

I need to evenly distribute clumped 3D data; 2D solutions would also be terrific. There can be up to many millions of data points.
I am looking for the best method to evenly distribute clumped 3D or 2D data [i.e. fully populate a correctly sized grid].
The method currently used is sorting in numerous directions numerous times, with an occasional shake to separate clumps a little. It is known to be far from optimal; in general, sorting is no good because it spreads/flattens clumps of points across a single surface.
Triangulation would seemingly be best [de-warp back to a regular grid], but I could never get a proper hull and had other problems.
Pressure-equalization-type methods seem over the top.
Can anybody point me in the direction of information on this?
Thanks for your time.
Currently used [inadequate] code:
1 - allocates indexes for sorting in various directions [side to side, then on diagonals];
2 - performs the sorts independently;
3 - allocates 2D locations from the sorts;
4 - averages the locations obtained from the different sorts;
5 - shakes [attempted side-to-side & up/down movement of the whole dataset, leaving duplicates static] to declump;
6 - repeats as required, up to 11 times.
I presume the "best" result would be the minimum total movement from the original locations to the final gridded locations.
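For reference, a minimal one-pass R sketch of the sort-to-grid idea (my own illustration, not the code described above): rank points by x to assign grid columns, then rank by y within each column to assign rows. It still flattens clumps along one axis, which is exactly the weakness mentioned above.

points_to_grid <- function(pts) {          # pts: matrix with x in column 1, y in column 2
  n    <- nrow(pts)
  side <- ceiling(sqrt(n))                                        # target grid side length
  gcol <- ceiling(rank(pts[, 1], ties.method = "first") / side)   # grid column from x-rank
  grow <- integer(n)
  for (k in unique(gcol)) {
    idx       <- which(gcol == k)
    grow[idx] <- rank(pts[idx, 2], ties.method = "first")         # grid row from y-rank within the column
  }
  cbind(col = gcol, row = grow)
}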

Merge neighbouring areas together, if tm_fill for one area is too small

I have made a map of the zip codes of a town. The fill is the ratio of cases to non-cases. But some zip codes have very few observations in total, so outliers distort the map.
Is there a way to merge the polygons and data of two neighboring areas based on their n automatically?
And if that is not possible, how can I merge rows of my sf/df without losing ID?
And I guess the simplest option would be just to set those zip codes to NA.
It depends on what you mean by "automatically". Here's a simple algorithm:
repeat:
  Find the region with the smallest population.
  If that's more than your threshold, stop.
  Find that region's neighbours and pick one (at random, or the one with the smallest population).
  Merge that neighbour with that region.
Finding neighbours and merging can all be done with either the sf package or the sp package and friends (like spdep and rgeos).
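As a rough sketch of that loop with sf and spdep, assuming an sf object zips with count columns n and cases and a minimum size threshold (the object, column names and function name are placeholders):

library(sf)
library(spdep)

merge_small_regions <- function(zips, threshold) {
  repeat {
    i <- which.min(zips$n)                       # region with the smallest population
    if (zips$n[i] >= threshold) break            # nothing left below the threshold
    nbs <- setdiff(poly2nb(zips)[[i]], 0L)       # indices of i's neighbours (0 = none)
    if (length(nbs) == 0) break                  # isolated region: cannot merge
    j <- nbs[which.min(zips$n[nbs])]             # neighbour with the smallest population
    # merge geometry and counts of region i into region j, then drop i
    st_geometry(zips)[j] <- st_union(st_geometry(zips)[c(i, j)])
    zips$n[j]     <- zips$n[j] + zips$n[i]
    zips$cases[j] <- zips$cases[j] + zips$cases[i]
    zips <- zips[-i, ]
  }
  zips
}

Recomputing the neighbour list on every pass is slow but keeps the sketch simple; the adjacency could also be updated incrementally.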
Equally, this can be considered a clustering algorithm using a distance metric based on adjacency. You could do a full hierarchical clustering and then cut the tree at a point such that all clusters had N>threshold.
Now, whether this is a good idea statistically is another question, and it depends on what your goal is here. If you are worried about whether an underlying risk is, say, greater than 0.5, and you are getting positives "by chance" because you have a population of 3 with 2 positives (a small sample of 3), then you need to model your data and work out the probability of exceeding 0.5 given the data. Then map that, which will take the small sample size into account.

Finding one coin of N in q steps

Subject: Finding One Coin of 13 in 3 Steps
There is a pile of thirteen coins, all of equal size. Twelve are of equal weight. One is of a different weight. In three weighings (using scales), find the unequal coin and determine if it is heavier or lighter.
I scratched my head on this one. I have found an answer, but for 12. Is it possible to do it for 13?
So, if it is possible, can we end up with a method that calculates the number of steps needed to find the unequal coin in a pile of N? Pseudocode is fair enough.
NOTE: Do not forget we do not know if the coin is lighter or heavier.
PS: Solution for 12 and some interesting thoughts here.
No, we cannot find a method that is guaranteed to determine which coin is not equal to the others and whether it is heavier or lighter than the others, not with the restrictions you lay out.
One weighing of coins has three possible results: left pan down and right pan up (so the total of weights on the left is greater than the total of weights on the right), left pan up and right pan down (so the total of weights on the left is less than the total of weights on the right), or the pans balance (so the total of weights on the left is equal to the total of weights on the right). If we want to distinguish between four or more possibilities with just one weighing, we may fail since we can guarantee only three. Similarly, two weighings can distinguish between at most nine possibilities, and three weighings can handle at most 27 possibilities. The problem has 13 coins, each of which may be light or heavy, so there are 26 possibilities to begin. It looks like we may be able to handle them.
However, the problem comes at the first weighing. What happens if we place four or fewer coins on each pan? If the pans balance, all we know is that the special coin is among the five or more coins we did not weigh. That is at least 10 possibilities: light or heavy, for each of five or more coins. Two more weighings are therefore not guaranteed to distinguish between them.
Now, what happens if we place five or more coins on each pan for the first weighing? If the left pan rises, either one of the five or more coins on the left is light or one of the five or more coins on the right is heavy. That is again at least 10 possibilities, so two more weighings are not guaranteed to distinguish between them.
Either way, we can end up with 10 or more possibilities to resolve in only two weighings, which spoils any solution. Since each weighing has only three possible results, no way of loading the pans gets around this.
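As for the general question of how many steps a pile of N needs: the same counting argument suggests the smallest q with 3^q >= 2N + 3, i.e. the classic N <= (3^q - 3)/2 bound for this variant (odd coin possibly heavy or light, no known-good reference coin). A short R sketch, offered as pseudocode rather than a proof that the bound is always attainable:

# Smallest q such that N <= (3^q - 3)/2, i.e. 3^q >= 2N + 3.
min_weighings <- function(N) {
  q <- 1
  while (3^q < 2 * N + 3) q <- q + 1
  q
}
min_weighings(12)  # 3
min_weighings(13)  # 4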

Problem with Principal Component Analysis

I'm not sure this is the right place but here I go:
I have a database of 300 pictures in high resolution. I want to compute the PCA on this database, and so far here is what I do:
- reshape every image as a single column vector
- create a matrix of all my data (500x300)
- compute the average column and subtract it from my matrix; this gives me X
- compute the correlation C = X'X (300x300)
- find the eigenvectors V and eigenvalues D of C
- the PCA matrix is given by XV*D^-1/2, where each column is a principal component
This is great and gives me the correct components.
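For reference, a minimal R sketch of that procedure (the "snapshot" trick of eigendecomposing X'X); the function and variable names are mine, and imgs is assumed to be a pixels-by-images matrix with one image per column:

snapshot_pca <- function(imgs) {
  X <- imgs - rowMeans(imgs)                     # subtract the average image from every column
  C <- crossprod(X)                              # C = X'X, images-by-images
  e <- eigen(C, symmetric = TRUE)                # eigenvectors V, eigenvalues D
  keep <- e$values > 1e-10 * max(e$values)       # drop numerically zero modes
  V <- e$vectors[, keep, drop = FALSE]
  pcs <- sweep(X %*% V, 2, sqrt(e$values[keep]), "/")   # X V D^(-1/2)
  list(components = pcs, values = e$values[keep])
}

One thing to keep in mind when comparing the high-res and low-res runs: each eigenvector, and hence each component image, is only defined up to sign, so a column and its negative represent the same component.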
Now I'm doing the same PCA on the same database, except that the images have a lower resolution.
Here are my results, low-res on the left and high-res on the right. As you can see, most of them are similar, but SOME images are not the same (the ones I circled).
Is there any way to explain this? For my algorithm I need the two sets of component images to be the same, one computed in high-res and the other in low-res; how can I make this happen?
thanks
It is very possible that the downsampling filter you used has affected some of the components. After all, lower-resolution images don't contain the higher frequencies that also contribute to which components you get. If the component weights (lambdas) for those images are small, there's also a good chance of numerical errors.
I'm guessing your component images are sorted by weight. If they are, I would try to use a different pre-downsampling filter and see if it gives different results (essentially obtain lower resolution images by different means). It is possible that the components that come out differently have lots of frequency content in the transition band of that filter. It looks like images circled with red are nearly perfect inversions of each other. Filters can cause such things.
If your images are not sorted by weight, I wouldn't be surprised if the ones you circled have very little weight, and that could simply be a computational precision error or something of that sort. In any case, we would probably need a little more information about how you downsample and how you sort the images before displaying them. Also, I wouldn't expect all images to be extremely similar, because you're essentially getting rid of quite a few frequency components. I'm pretty sure it wouldn't have anything to do with the fact that you're stretching the images out into vectors to compute PCA, but try stretching them out in the other direction (take columns instead of rows or vice versa) and see if that changes anything. If it changes the result, then perhaps you might want to perform the PCA somewhat differently, though I'm not sure how.

Resources