I have tried this exercise for a few sets of distinct primes, e.g. {2, 3, 5} and {5, 11}, where the XOR of the elements is not 0. My intuition suggests that the XOR will always be non-zero, but I am unable to prove it. I searched on the net but didn't find anything. Any help will be appreciated.
Condensed from comments to your question:
For two distinct primes, bit-wise XOR will not be zero. In general, XOR of two numbers is zero if and only if the two numbers are equal.
For three distinct primes, consider the least significant bit only: three odd numbers XOR to an odd number, so not all three primes can be odd. There exists only one even prime, namely 2, so one of the three must be 2. The remaining primes, both odd, must then XOR to 2, i.e. they must be identical on every bit except the second-least significant bit, where they must differ. They are therefore twin primes of the form 4k+1 and 4k+3 (note that twin primes of the form 4k-1 and 4k+1 do not satisfy the requirement, since they differ in more than one bit). So the solutions for three primes are { 2, 5, 7 }; { 2, 17, 19 }; { 2, 29, 31 }; { 2, 41, 43 }; ....
For four primes, there are simply so many ways it can happen. From a simple program listing all occurrences with all primes under 50, I get: { 3, 5, 11, 13 }; { 5, 7, 17, 19 }; { 3, 5, 17, 23 }; { 11, 13, 17, 23 }; { 3, 7, 19, 23 }; { 7, 11, 17, 29 }; { 5, 11, 19, 29 }; { 3, 13, 19, 29 }; { 7, 13, 23, 29 }; { 5, 11, 17, 31 }; { 3, 13, 17, 31 }; { 7, 11, 19, 31 }; { 3, 11, 23, 31 }; { 5, 13, 23, 31 }; { 5, 7, 29, 31 }; { 17, 19, 29, 31 }; { 7, 11, 37, 41 }; { 17, 29, 37, 41 }; { 19, 31, 37, 41 }; { 5, 11, 37, 43 }; { 3, 13, 37, 43 }; { 19, 29, 37, 43 }; { 17, 31, 37, 43 }; { 5, 7, 41, 43 }; { 17, 19, 41, 43 }; { 29, 31, 41, 43 }; { 7, 13, 37, 47 }; { 23, 29, 37, 47 }; { 3, 5, 41, 47 }; { 11, 13, 41, 47 }; { 17, 23, 41, 47 }; { 3, 7, 43, 47 }; { 19, 23, 43, 47 }; ....
For five primes, again not all can be odd, so one has to be 2. But there are still a lot of ways it can happen (based on brute force search).
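The brute-force search mentioned above can be sketched in a few lines of Python. This is not the program originally used for the lists in this answer, only a minimal reconstruction; the helper names `primes_below` and `zero_xor_subsets` are made up for illustration.

```python
from functools import reduce
from itertools import combinations
from operator import xor


def primes_below(limit):
    """Return all primes p with 2 <= p < limit (trial division, fine for small limits)."""
    return [n for n in range(2, limit)
            if all(n % d for d in range(2, int(n ** 0.5) + 1))]


def zero_xor_subsets(primes, size):
    """Return every subset of the given size whose bitwise XOR is zero."""
    return [c for c in combinations(primes, size) if reduce(xor, c) == 0]


primes = primes_below(50)
pairs = zero_xor_subsets(primes, 2)    # empty: distinct numbers never XOR to 0
triples = zero_xor_subsets(primes, 3)  # every triple contains 2 and a 4k+1 / 4k+3 twin pair
quads = zero_xor_subsets(primes, 4)    # many solutions, none of them containing 2

print(triples)
print(len(quads))
```

Changing the `size` argument to 5 lists the five-prime solutions, each of which contains 2 by the parity argument above.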
Not sure if this helps your intuition.
I created a graph G whose node view is as follows: <0, 1, 2, ..., 100>
I randomly removed 20 nodes, and the node view of the new graph is missing the nodes I removed. For example, the new graph has gaps where nodes were removed:
node view <0,1,3,5,6,7,9 ...100>
However, I want the new graph to have a contiguous node view such as the following:
<0,1,2....80>
Is there any solution? I tried relabeling and copying the graph, but neither worked.
PS. My nodes have an attribute label equal to either 0 or 1,
and I want to preserve it.
Here is one approach you can take. After removing your nodes from the graph you can relabel the remaining nodes using nx.relabel_nodes to get the node view you want. See example below:
import networkx as nx
import numpy as np
#Creating random graph
N_nodes=50
G=nx.erdos_renyi_graph(N_nodes,p=0.25)
#Removing random nodes
N_del_nodes=10
del_node_list=np.random.choice(N_nodes,size=N_del_nodes,replace=False)
G.remove_nodes_from(del_node_list)
print('Node view without relabelling:' +str(G.nodes))
#Relabelling graph
label_mapping={old:new for new,old in enumerate(G.nodes)}
G_rel=nx.relabel_nodes(G, label_mapping)
print('Node view with relabelling:' +str(G_rel.nodes))
And the output gives:
Node view without relabelling:[0, 1, 2, 5, 6, 8, 9, 10, 11, 12, 13, 14, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 27, 28, 30, 31, 32, 33, 34, 36, 37, 38, 40, 41, 44, 45, 46, 47, 48, 49]
Node view with relabelling:[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39]
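Since the question also asks about keeping the node attribute, here is a small sketch showing that nx.relabel_nodes carries node attributes over to the relabelled graph. The graph size, the seed, and the way the 0/1 `label` values are assigned are assumptions made only for this illustration.

```python
import random

import networkx as nx

random.seed(0)
G = nx.erdos_renyi_graph(20, 0.25, seed=1)

# Assumed for illustration: a binary 'label' attribute on every node.
for n in G.nodes:
    G.nodes[n]["label"] = n % 2

# Remove five random nodes, which leaves gaps in the node ids.
G.remove_nodes_from(random.sample(sorted(G.nodes), 5))

# Map the surviving ids, in sorted order, onto 0..len(G)-1.
mapping = {old: new for new, old in enumerate(sorted(G.nodes))}
H = nx.relabel_nodes(G, mapping)

print(sorted(H.nodes))
print(all(H.nodes[new]["label"] == G.nodes[old]["label"]
          for old, new in mapping.items()))
```

nx.convert_node_labels_to_integers(G) should give a similar result without building the mapping by hand.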
I have the following dataframe:
> dput(master_credit)
structure(c(10, 10, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 12,
11, 11, 11, 11, 11, 12, 11, 11, 12, 11, 11, 11, 11, 11, 12, 12,
12, 11, 12, 12, 12, 11, 11, 11, 12, 11, 12, 12, 12, 12, 13, 12,
12, 12, 12, 12, 12, 11, 12, 12, 11, 12, 12, 14, 13, 12, 13, 13,
14, 13, 13, 12, 25, 26, 3, 21, 5, 9, 43, 15, 2, 6, 4, 27, 44,
1, 10, 31, 16, 12, 37, 7, 23, 54, 18, 19, 20, 14, 35, 52, 36,
32, 29, 50, 51, 30, 42, 24, 17, 63, 8, 62, 38, 34, 33, 49, 59,
58, 57, 60, 28, 61, 40, 41, 22, 11, 47, 13, 48, 45, 46, 65, 64,
53, 39, 56, 55), .Dim = c(65L, 2L), .Dimnames = list(NULL, c("master",
"credit")))
on which I am running the code:
library(InformationValue)
> somersD(master_credit[,"master"], master_credit[,"credit"])
[1] NaN
Why does it return NaN?
The documentation of somersD says somersD(actuals, predictedScores), where actuals are binary flags which are either 1 or 0.
actuals: The actual binary flags for the response variable. It can take a numeric vector containing values of either 1 or 0, where 1 represents the 'Good' or 'Events' while 0 represents 'Bad' or 'Non-Events'.
Internally, while calculating the Somers' D statistic, the somersD function counts the rows containing 1 and the rows containing 0 in the actuals column, and these counts are used in a division. Since your data has no such rows (the master column holds values from 10 to 14), you are effectively dividing zero by zero, hence it returns NaN.
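To make the division by zero concrete, here is a rough Python sketch of Somers' D for binary actuals. It is not the InformationValue implementation, only an illustration of the idea; the function name `somers_d` is made up. Python raises an error on 0/0, whereas R yields NaN, so the sketch returns NaN explicitly in that case.

```python
import math


def somers_d(actuals, scores):
    """Somers' D for binary actuals: (concordant - discordant) / number of pairs."""
    events = [s for a, s in zip(actuals, scores) if a == 1]
    non_events = [s for a, s in zip(actuals, scores) if a == 0]
    total_pairs = len(events) * len(non_events)
    if total_pairs == 0:
        # No (event, non-event) pair exists, so the ratio is 0/0.
        return math.nan
    concordant = sum(e > n for e in events for n in non_events)
    discordant = sum(e < n for e in events for n in non_events)
    return (concordant - discordant) / total_pairs


# Binary actuals: well defined.
print(somers_d([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.3]))

# Actuals like the 'master' column (no 0s or 1s at all): undefined.
print(somers_d([10, 10, 11, 12], [25, 26, 3, 21]))
```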
I have read network data into igraph (R) and would like to store the nodes in a list. Here's what I have done:
G = read_graph("somegraph.graphml",format="graphml")
x = list(V(G))
> x
+ 15/15 vertices, from ecb3920:
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
My question is: how do I get the true value, i.e. the actual node ID in my data, from V(G)? Thanks.
> dput(G)
structure(list(15, FALSE, c(13, 7, 9, 14, 10, 5, 4, 11, 6, 7,
14, 4, 13, 9, 10, 5, 5, 13, 9, 6, 7, 14, 12, 10, 14, 10, 11,
13, 9, 10, 12, 14, 8, 7, 11, 12, 8, 13, 14, 9, 11, 13, 13, 12,
14, 10, 13, 12, 14, 12, 13, 13, 14, 14), c(0, 0, 2, 2, 2, 2,
2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 6,
6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8, 9, 9, 9, 9, 10,
10, 10, 11, 11, 12, 12, 13), c(6, 11, 5, 15, 16, 8, 19, 1, 9,
20, 33, 32, 36, 2, 13, 18, 28, 39, 4, 14, 23, 25, 29, 45, 7,
26, 34, 40, 22, 30, 35, 43, 47, 49, 0, 12, 17, 27, 37, 41, 42,
46, 50, 51, 3, 10, 21, 24, 31, 38, 44, 48, 52, 53), c(1, 0, 6,
5, 2, 4, 3, 11, 15, 8, 9, 13, 14, 7, 12, 10, 16, 19, 20, 18,
23, 22, 17, 21, 25, 24, 33, 32, 28, 29, 26, 30, 27, 31, 36, 39,
34, 35, 37, 38, 40, 41, 45, 43, 42, 44, 47, 46, 48, 49, 50, 51,
52, 53), c(0, 0, 0, 0, 0, 2, 5, 7, 11, 13, 18, 24, 28, 34, 44,
54), c(0, 2, 2, 7, 16, 24, 26, 34, 40, 42, 46, 49, 51, 53, 54,
54), list(c(1, 0, 1), structure(list(), .Names = character(0)),
structure(list(id = c("1351920706", "500102244", "1454425532",
"1625050630", "510838353", "1262640078", "681721364", "1351920717",
"1260750116", "1524975171", "1070293410", "727198538", "715215233",
"1351920666", "500920034")), .Names = "id"), list()), <environment>), class = "igraph")
Just for closure (and to summarise from our chat): based on the sample data you give, you can extract additional data for every vertex by accessing the corresponding vertex attribute with $.
So
V(G)$id
returns
#[1] "1351920706" "500102244" "1454425532" "1625050630" "510838353"
#[6] "1262640078" "681721364" "1351920717" "1260750116" "1524975171"
#[11] "1070293410" "727198538" "715215233" "1351920666" "500920034"
I have a dataset with 50 thousand rows that I want to sort according to the values in one of the columns. The numbers in the column go from 1 to 30, and when I do the following
data=data[order(data$columnname),]
it gets sorted so that the order of the values is like this
1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 3, 30, 4, 5, 6, 7, 8, 9
how could I sort it so that it is like this
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30
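It seems that your column is not numeric. It is probably stored as character or factor, so it is sorted alphabetically. Try converting it first (going through as.character also handles the factor case, where as.numeric alone would return the internal level codes):
data$columnname<-as.numeric(as.character(data$columnname))
data=data[order(data$columnname),]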
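The same effect is easy to reproduce outside R. The short Python sketch below (variable names are arbitrary) sorts the numbers 1 to 30 once as text and once as numbers, which shows where the 1, 10, 11, ... order comes from.

```python
values = [str(n) for n in range(1, 31)]

# Sorting text compares character by character, so "10" comes before "2".
as_text = sorted(values)

# Converting to numbers for the comparison gives the expected order.
as_numbers = sorted(values, key=int)

print(as_text[:5])     # ['1', '10', '11', '12', '13']
print(as_numbers[:5])  # ['1', '2', '3', '4', '5']
```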
Suppose I want to make a plot with the following data:
pairs <- c(1, 2, 2, 3, 2, 4, 2, 5, 2, 6, 2, 7, 2, 8, 2, 9, 2, 10, 2, 11, 4,
14, 4, 15, 6, 13, 6, 19, 6, 28, 6, 36, 7, 16, 7, 23, 7, 26, 7, 33,
7, 39, 7, 43, 8, 35, 8, 40, 9, 21, 9, 22, 9, 25, 9, 27, 9, 33, 9,
38, 10, 12, 10, 18, 10, 20, 10, 32, 10, 34, 10, 37, 10, 44, 10, 45,
10, 46, 11, 17, 11, 24, 11, 29, 11, 30, 11, 31, 11, 33, 11, 41, 11,
42, 11, 47, 14, 50, 14, 52, 14, 54, 14, 55, 14, 56, 14, 57, 14, 58,
14, 59, 14, 60, 14, 61, 15, 48, 15, 49, 15, 51, 15, 53, 15, 62, 15,
63)
g <- graph( pairs )
plot( g,layout = layout.reingold.tilford )
I get a plot like the one below:
As you can see the spaces between some of the vertices are so small that these vertices overlap.
1. I wonder if there is a way to change the spacing between vertices.
2. In addition, is the spacing between vertices arbitrary? For example, Vertices 3, 4, and 5 are very close to each other, but 5 and 6 are far apart.
EDIT:
For my 2nd question, I guess the spacing is dependent on the number of nodes below. E.g., 10 and 11 are farther from each other than 8 and 9 are because there are more children below 10 and 11 than there are below 8 and 9.
I bet there is a better solution, but I cannot find it. Here is my approach. Since a general parameter for width seems to be missing, you have to adjust several parameters manually to obtain the desired output.
My approach is to resize some elements of the plot and adjust the margins to use the available space as well as possible. The most important parameter here is asp, which controls the aspect ratio of the plot (this plot is better wide than tall, so an aspect ratio even below 0.5 works well). Other tricks are to reduce the size of the vertices and fonts. Here is the code:
plot( g, layout = layout.reingold.tilford,
edge.width = 1,
edge.arrow.width = 0.3,
vertex.size = 5,
edge.arrow.size = 0.5,
vertex.size2 = 3,
vertex.label.cex = 1,
asp = 0.35,
margin = -0.1)
That produces this plot:
Another approach would be to set the graphics device to PDF (or JPEG etc.) and then set rescale to FALSE. With the RStudio viewer this cuts off a large part of the plot, but with other graphics devices it might work well (no guarantee).
In any case, if you are unsure how to use these parameters (they can be tricky), type help(igraph.plotting).
For the second part of the question, I am not sure. From looking inside the function I cannot give a precise answer, but my guess is that the space between elements on the same level is calculated from the child elements they have: vertices with children and sub-children require more space around them.
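To illustrate that guess, here is a toy tree layout in Python. It is not the Reingold-Tilford algorithm that igraph uses, only a much simpler scheme (leaves get consecutive slots, each parent is centred over its children), and the function name and the example tree are made up. It still shows the same effect: siblings with larger subtrees end up farther apart.

```python
def leaf_based_layout(children, root):
    """Assign an x position to every node: leaves get 0, 1, 2, ...;
    a parent sits at the mean x of its children."""
    x = {}
    next_slot = 0

    def place(node):
        nonlocal next_slot
        kids = children.get(node, [])
        if not kids:
            x[node] = float(next_slot)
            next_slot += 1
        else:
            for kid in kids:
                place(kid)
            x[node] = sum(x[k] for k in kids) / len(kids)

    place(root)
    return x


# Hypothetical tree: 2 and 3 are leaves, 4 and 5 each have three children.
children = {1: [2, 3, 4, 5], 4: [6, 7, 8], 5: [9, 10, 11]}
x = leaf_based_layout(children, 1)

print(x[3] - x[2])  # gap between two leaves
print(x[5] - x[4])  # gap between two siblings that both have children
```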