Is it Possible to Solve this
Suppose I only have 5 numbers (for example): 8, 12, 37, 202, and 7
and the only things I know are that:
Their sum is 266.
I have used 5 numbers.
Is it possible to figure out all those 5 numbers, by using a mathematical procedure?
This is an interesting question, and there are at least a few ways to look at it. The short answer is no, you can't figure out the five numbers knowing only their sum. But there is more to say about it. Take a look at "underdetermined system of equations" and "partitions of integers" in your favorite search engine.
If you only know that there are 5 numbers that add up to 266, you will not be able to calculate/figure out what those 5 individual numbers are. If someone is inputting those numbers somewhere, you can programmatically store those numbers, but the basic answer to your question is no.
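To get a feel for how underdetermined this is, here is a small Python sketch that counts how many ordered lists of 5 numbers add up to 266. It assumes the numbers are positive whole numbers (the question does not say so), and the function names are made up for illustration. Under that assumption the count is the binomial coefficient C(265, 4), which is on the order of 200 million, so the sum alone cannot single out one list.

```python
from itertools import product
from math import comb

def count_compositions(total, parts):
    """Number of ordered ways to write `total` as a sum of `parts`
    positive integers (stars-and-bars formula)."""
    return comb(total - 1, parts - 1)

def brute_force_count(total, parts):
    """Same count by direct enumeration; only practical for small inputs."""
    return sum(
        1
        for combo in product(range(1, total + 1), repeat=parts)
        if sum(combo) == total
    )

# Sanity check of the formula on a small case, then the case from the question.
print(brute_force_count(10, 3), count_compositions(10, 3))
print(count_compositions(266, 5))
```

If zero, negative, or non-integer values are allowed, there are even more candidates (infinitely many in the last two cases).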
Related
I need some help with a problem I've been facing.
Suppose I have an array = [3,4,1,5,6,1,3].
I need the permutations in which the duplicate 3s do not sit next to each other, and the same for the 1s.
How am I supposed to solve this? I've watched a ton of YouTube and googled it, but no luck.
Thanks in advance for the help.
Are you looking for a general-case solution or just one for that particular array? If you are looking for the more general case, I think you should specify the restrictions, or the problem becomes too complex. The same applies if you want to write code. Some languages (like Python) have libraries that make this kind of work relatively simple, but the time complexity can get ugly.
Here's a mathematical approach to the problem:
Step 1: Suppose all the elements are different a = [3,4,5,6,1]
In this case we will have 5! different options (You have 5 options to choose the first element and 4 options to choose the second and so on)
Step 2: Suppose you have one repeated element a = [1,3,4,5,6,1]
In this case we have 6!/2! different options (6! comes from Step 1, and we divide it by 2! because swapping the two copies of the repeated element does not change the array).
Now you want to exclude options where the repeated elements appear next to each other. The trick is to treat them as one element. So now we have a = [(1,1), 3, 4, 5, 6]. There are 5! different options. We subtract this from the total, that is, 6!/2! - 5! will give you the answer.
Step 3: (your case) Two repeated elements a = [3,4,1,5,6,1,3]
We continue with the same logic. In total we have 7!/(2!x2!) options. From Step 2, if we want to exclude cases where 1 appears next to 1, we treat (1,1) as one element. That leaves 6 elements in which 3 is still repeated, so we have to subtract 6!/2! from the total. The 3 appears twice too, so by the same argument we subtract another 6!/2!. Unfortunately, we have subtracted some cases twice (can you guess which?). If we find which cases we subtracted twice and add them back, we will get the answer.
The cases that we subtracted twice are those where 1 comes right after 1 and, at the same time, 3 comes right after 3, that is a = [(1,1),4,5,6,(3,3)]. We have subtracted those options for both one and three. There are 5! cases like that (can you guess why?).
To sum it up, we get 7!/(2!x2!) - 2x6!/2! + 5! = 1260 - 720 + 120 = 660.
If you are not looking for a general solution, those numbers are not big, so you can write brute-force code (to save some time/space, convert the array to a string).
I might have missed something in calculations but if you follow the logic you will get the answer. Also, if you want to understand why those things work try it with small data to get the intuition. If you need code, let me know. I will update the solution.
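As a cross-check, here is a minimal brute-force sketch in Python. It enumerates every distinct arrangement of the array, keeps those with no equal neighbours, and compares the count with the inclusion-exclusion expression 7!/(2!x2!) - 2x6!/2! + 5!, where each merged-pair term is 6!/2! because the other value is still repeated. The function name is hypothetical, and the approach only suits short arrays, since it walks through all n! orderings.

```python
from itertools import permutations
from math import factorial

def count_no_adjacent_equal(items):
    """Count distinct arrangements of `items` in which no two
    neighbouring elements are equal (brute force)."""
    distinct = set(permutations(items))  # collapse duplicate orderings
    return sum(
        all(a != b for a, b in zip(p, p[1:]))
        for p in distinct
    )

arr = [3, 4, 1, 5, 6, 1, 3]
brute = count_no_adjacent_equal(arr)

# Inclusion-exclusion: all arrangements, minus those with the 1s together,
# minus those with the 3s together, plus those with both pairs together.
formula = (
    factorial(7) // (factorial(2) * factorial(2))
    - 2 * (factorial(6) // factorial(2))
    + factorial(5)
)
print(brute, formula)
```

Trying the same function on the smaller arrays from Steps 1 and 2 is a quick way to build intuition for why each term appears.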
Not sure if someone could help me with this problem.
I have 5 lists of values of different lengths.
Note: The same value can be present in different lists.
Does anyone know how to get the combination of 3 lists that will provide the most total unique values?
Thanks in advance,
Miguel
I do not really have an answer to your question, which seems to be more of a combinatorics question than a programming one. My sense is that if you want an exact solution you will have to try all the possible combinations of 3 lists out of 5 (there are 10 of them). One thing to remember if you go that way is that to get the number of unique elements in the concatenation of 3 lists you do not necessarily have to do length(unique(c(l1,l2,l3))), which I imagine could be inefficient if you have very long lists. You can use the inclusion-exclusion formula for the size of the union of 3 sets, which you can find for example at https://math.stackexchange.com/questions/669249/probability-of-the-union-of-3-events .
This will require you only to compute the sizes of the lists and of all their possible intersections. It could be a completely academic exercise: as I said, I am not offering an answer, but if you are not familiar with that formula it is worth reading about, since it is relevant to the problem of finding the size of a set.
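The exhaustive search over all 10 triples can be sketched as follows. This is an illustration in Python rather than R, the five lists are invented example data, and the function names are hypothetical. The second function applies the inclusion-exclusion formula for three sets so the two ways of counting can be compared.

```python
from itertools import combinations

def best_combination(lists, k=3):
    """Try every choice of k lists and return (indices, unique_count)
    for the choice whose union has the most distinct values."""
    sets = [set(lst) for lst in lists]

    def union_size(idx):
        return len(set().union(*(sets[i] for i in idx)))

    best = max(combinations(range(len(sets)), k), key=union_size)
    return best, union_size(best)

def union_size_inclusion_exclusion(a, b, c):
    """|A u B u C| computed from set sizes and intersection sizes only."""
    a, b, c = set(a), set(b), set(c)
    return (
        len(a) + len(b) + len(c)
        - len(a & b) - len(a & c) - len(b & c)
        + len(a & b & c)
    )

# Invented example data: five lists of different lengths with shared values.
example = [
    [1, 2, 3, 4],
    [3, 4, 5],
    [5, 6, 7, 8, 9],
    [1, 9],
    [10, 11, 2],
]
indices, count = best_combination(example)
print(indices, count)
```

With only 10 combinations to check, the direct set union is likely fast enough in practice, and the formula is mostly useful as a cross-check.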
I have a basic question in regards to the R programming language.
I'm at a beginner's level and I wish to understand the meaning behind two lines of code I found online. Here is the code:
as.data.frame(y[1:(n-k)])
as.data.frame(y[(k+1):n])
... where y and n are given. I do understand that the results are transformed into a data frame by the function as.data.frame(), but what about the rest? I'm still at a beginner's level, so pardon me if this question is off-topic or irrelevant in this forum. Thank you in advance, I appreciate every answer :)
Looks like you understand the as.data.frame() function so let's look at what is happening inside of it. We're looking at y[1:(n-k)]. Here, y is a vector which is a collection of data points of the same type. For example:
> y <- c(1,2,3,4,5,6)
Try running that and then calling back y. What you get are those numbers listed out. Now, consider the case you want to just call out the number 1 in that vector. How would you do that? Well, this is where the brackets come into play. If you wanted to just call the number 1 in y:
> y[1]
[1] 1
Therefore, the brackets are a way of calling out or indexing specific items in the vector. Note that the indexing starts at the value 1 and goes up to the number of items in the vector, or length. One last thing before we go back to the example you gave. What if we want to index the numbers 1, 2, and 3 from the vector but not the rest?
> y[1:3]
[1] 1 2 3
This is where the colon comes into play. It allows us to reference a subset of the numbers. However, it will reference all the numbers between the index left of the colon and right of it. Try this out for yourself in R! Play around and see what happens.
Finally going back to your example:
y[1:(n-k)]
How would this work based on what we discussed? Well, the colon means that we are indexing all values in the vector y from two index values. What are those values? Well, they are the numbers to the left and right of the colon. Therefore, we are asking R to give us the values from the first position (index of 1) to the (n-k) position. Therefore, it's important to know what n and k are. If n is 4 and k is 1 then the command becomes:
y[1:3]
The same logic applies to the second as.data.frame() command in your question: y[(k+1):n] picks the values from position (k+1) to position n. Essentially, R is picking out different subsets of the vector y and converting each one into a data frame.
Hope this helps. The best way to learn R is to play around with a command, throw different numbers at it, guess what will happen, and then see what happens!
I and my coworkers enter data in turns. One day I do, the next week someone else does and we always enter 50 observations at a time (into an Excel sheet). So I can be pretty sure that I entered the cases from 101 to 150, and 301 to 350. We then read the data into R to work with it. How can I select only the cases I entered?
Now I know that I can do that by copying from the excel sheet, however, I wonder if it is doable in R?
I checked several documents about subsetting data with R, also tried things like
data<-data[101:150 & 301:350,]
but didn't work. I appreciate if someone would guide me to a more comprehensive guide answering this question.
The answer to the specific example you gave is
data[c(101:150,301:350),]
Can you be more specific about which cases you want? Is it the first 50 of each 100, or the first 50 of each 300, or ... ? To get the indices for the first n of each m cases you could use something like
c(outer(0:4,seq(1,100,by=10),"+"))
(here n=5, m=10); outer is a generalized outer product. An alternate (and possibly more intuitive) solution would use rep, e.g.
rep(0:4,10) + rep(seq(1,100,by=10),each=5)
Because R automatically recycles vectors where necessary you could actually shorten this to:
0:4 + rep(seq(1,100,by=10),each=5)
but I would recommend the slightly longer formulation as more understandable.
I have about 42,000 lists of 24 random numbers, all in the range [0, 255]. For example, the first list might be [32, 15, 26, 27, ... 11]. The second list might be [44, 44, 18, 19, .. 113]. How can I choose one number from each of the lists (so I will end up with a new list of about 42,000 numbers) such that this new list is most compressible using ZIP?
-- this question has to do with math, data compression
The ZIP file format uses DEFLATE for its compression algorithm. So you need to consider how that algorithm works and pick data that the algorithm finds easy to compress. According to the Wikipedia article, there are two stages of compression. The first uses LZ77 to find repeated sections of data and replace them with short references. The second uses Huffman coding to take the remaining data and strip out redundancy across the whole block. This is called entropy coding: if the information isn't very random (has low entropy), the code replaces common symbols with short codes, increasing the entropy of the output.
In general, then, lists with lots of repeated runs (e.g., [111,2,44,93,111,2,44,93...]) will compress well in the first pass. Lists with lots of repeated numbers within other random stuff (e.g., [111,34,43,50,111,34,111,111,2,34,22,60,111,98,2], where 34 and 111 show up often) will compress well in the second pass.
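A quick way to see both effects is to compress a few synthetic byte strings with Python's zlib module, which implements DEFLATE. This is only a sketch: the three inputs are made up, the seed is arbitrary, and a real ZIP archive adds its own headers on top of the DEFLATE stream.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 42000

# Repeated runs: LZ77 can replace almost everything with back-references.
repeated_runs = bytes([111, 2, 44, 93] * (n // 4))

# A few values in random order: few long repeats, but Huffman coding
# can give the frequent values short codes.
skewed = bytes(rng.choice([111, 34, 111, 34, 2, 98]) for _ in range(n))

# Uniformly random bytes: essentially incompressible.
uniform = bytes(rng.randrange(256) for _ in range(n))

def deflated_size(data):
    """Size in bytes after zlib (DEFLATE) compression at the highest level."""
    return len(zlib.compress(data, 9))

for name, data in [("runs", repeated_runs), ("skewed", skewed), ("uniform", uniform)]:
    print(name, deflated_size(data))
```

The exact sizes depend on the zlib version, but the ordering (runs smallest, uniform largest) is what matters here.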
To find suitable numbers, I think the easiest thing to do is just sort each list, then merge them, keeping the merge sorted, until you get to 42000 output numbers. You'll get runs as they happen. This won't be optimal, you might have the number 255 in each input list and you'd miss them using this technique, but it would be easy.
Another approach would be to histogram the numbers into 256 bins. Any bins that stand out indicate numbers that should be grouped. After that, I guess you have to search for sequences. Again, sorting the inputs will probably make this easier.
I just noticed you had the constraint that you have to pick one number from each list. So in both cases you could sort each list then remove duplicates.
Additionally, Huffman codes can be generated using a tree, so I wonder if there's some magic tree structure you could put the numbers into that would automatically give the right answer.
This smells NP-complete to me, but I am nowhere near able to prove it. On the outside, there are approximately 7.45e+57968 (!) possible configurations to test. It doesn't seem that you can opt out of a particular configuration early, as an incompressible initial section could be greatly compressible later on.
My best guess for "good" compression would be to count the number of occurrences of each number across the entire million-element set and select from each list the number with the most occurrences. For example, if every list has 42 present in it, selecting only that would give you a very compressible array of 42,000 instances of the same value.
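That frequency heuristic could be sketched like this in Python. It is a greedy approximation, not an optimal solution; the function name is hypothetical, the data is randomly generated for illustration, and only 2,000 lists are used to keep the run short. zlib stands in for ZIP here, since both use DEFLATE.

```python
import random
import zlib
from collections import Counter

def pick_most_common(lists):
    """From each list, pick the value that occurs most often across
    all lists combined (ties go to the first such value in the list)."""
    counts = Counter(v for lst in lists for v in lst)
    return [max(lst, key=lambda v: counts[v]) for lst in lists]

rng = random.Random(0)  # fixed seed so the run is repeatable
lists = [[rng.randrange(256) for _ in range(24)] for _ in range(2000)]

greedy = pick_most_common(lists)
baseline = [lst[0] for lst in lists]  # naive choice: first element of each list

greedy_size = len(zlib.compress(bytes(greedy), 9))
baseline_size = len(zlib.compress(bytes(baseline), 9))
print(greedy_size, baseline_size)
```

Because the selection ignores ordering, it mainly helps the Huffman stage; it does nothing to create the long repeated runs that LZ77 exploits.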