CSV File Histogram Generator : 'x' must be numeric - r

I'm trying to plot a histogram in R. My dataset has a single column whose first row is the text "data"; the remaining rows are all numeric values. The problem is that when I use the hist() function, I am unable to visualize the data.
I've already looked over the solutions:
Solution 1
Solution 2
Solution 3
Solution 4
My Data Set:
V1
1 \357\273\277data
2 256
3 256
4 256
5 256
6 64
7 64
8 128
9 128
10 128
11 128
12 128
13 128
14 1024
15 1024
16 1024
17 1024
18 1024
19 1024
20 1024
21 1024
22 1024
23 1024
24 1024
25 1024
26 32
27 32
28 32
29 32
30 32
31 32
32 32
33 32
34 32
35 32
36 32
37 32
38 32
39 32
40 32
41 32
42 32
43 32
44 32
45 32
46 32
47 32
48 32
49 32
50 512
51 512
52 512
53 512
54 512
55 512
56 512
57 512
58 512
59 512
60 512
61 512
62 512
63 512
64 512
65 512
66 512
67 512
68 512
69 512
70 2
71 2
72 2
73 2
74 2
75 2
76 2
77 2
78 2
79 2
Code :
TD2 = read.csv("/Users/somename/Desktop/TD.csv",head=TRUE)
TD2 -- Result above
Also Tried :
data <- read.table("/Users/somename/Desktop/TD.csv", sep="\t")
TDR = read.csv("/Users/somename/Desktop/TD.csv",header = FALSE,sep = ",")
Result :
hist(TD2)
Error in hist.default(TD2) : 'x' must be numeric
hist(data)
Error in hist.default(data) : 'x' must be numeric
hist(TDR)
Error in hist.default(TDR) : 'x' must be numeric

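You need to read the data with stringsAsFactors set to FALSE. Assuming df is the data frame read without a header (so its first row holds the text "data"), the plot can then be obtained by dropping that row and converting the rest to numbers:
hist(as.numeric(df[-1,]))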
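For what it's worth, the \357\273\277 in front of "data" is the octal spelling of the bytes EF BB BF, i.e. a UTF-8 byte-order mark at the start of the file. Here is a minimal sketch in Python (not R) of what is going on, assuming a small made-up stand-in for TD.csv; the byte string and variable names are mine, not from the question:

```python
import collections

# Hypothetical stand-in for TD.csv: UTF-8 byte-order mark, the header "data", then values.
raw = b"\xef\xbb\xbfdata\n256\n256\n64\n128\n"

# Decoding as plain UTF-8 keeps the BOM as an invisible character in the first cell.
plain = raw.decode("utf-8").splitlines()
# The "utf-8-sig" codec strips a leading BOM if one is present.
clean = raw.decode("utf-8-sig").splitlines()

print(repr(plain[0]))  # first cell still carries the BOM character
print(repr(clean[0]))  # first cell is just the header text

# Drop the header row and convert the rest to numbers before counting,
# which is the same idea as as.numeric(df[-1,]) in the answer.
values = [int(v) for v in clean[1:]]
counts = collections.Counter(values)
print(sorted(counts.items()))
```

The point is only that the column stays text as long as a non-numeric first row is part of it; once that row is removed and the rest converted, a histogram can be computed.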


Problems with column labels after importing a csv file

I'm trying to import an ANOVA data set from a CSV file into R using the read.csv function. When I import it, the data ends up in a single column labelled X......., even though in the CSV file the column labels are clearly Person, gender, etc.
I don't know why this is. I've copied the code below. Any help would be appreciated
read.csv("/Users/Desktop/R /anova data set.csv")
X.......
1 ;Person;gender;Age;Height;pre.weight;Diet;weight6weeks
2 ;25; ;41;171;60;2;60
3 ;26; ;32;174;103;2;103
4 ;1;0;22;159;58;1;54.2
5 ;2;0;46;192;60;1;54
6 ;3;0;55;170;64;1;63.3
7 ;4;0;33;171;64;1;61.1
8 ;5;0;50;170;65;1;62.2
9 ;6;0;50;201;66;1;64
10 ;7;0;37;174;67;1;65
11 ;8;0;28;176;69;1;60.5
12 ;9;0;28;165;70;1;68.1
13 ;10;0;45;165;70;1;66.9
14 ;11;0;60;173;72;1;70.5
15 ;12;0;48;156;72;1;69
16 ;13;0;41;163;72;1;68.4
17 ;14;0;37;167;82;1;81.1
18 ;27;0;44;174;58;2;60.1
19 ;28;0;37;172;58;2;56
20 ;29;0;41;165;59;2;57.3
21 ;30;0;43;171;61;2;56.7
22 ;31;0;20;169;62;2;55
23 ;32;0;51;174;63;2;62.4
24 ;33;0;31;163;63;2;60.3
25 ;34;0;54;173;63;2;59.4
26 ;35;0;50;166;65;2;62
27 ;36;0;48;163;66;2;64
28 ;37;0;16;165;68;2;63.8
29 ;38;0;37;167;68;2;63.3
30 ;39;0;30;161;76;2;72.7
31 ;40;0;29;169;77;2;77.5
32 ;52;0;51;165;60;3;53
33 ;53;0;35;169;62;3;56.4
34 ;54;0;21;159;64;3;60.6
35 ;55;0;22;169;65;3;58.2
36 ;56;0;36;160;66;3;58.2
37 ;57;0;20;169;67;3;61.6
38 ;58;0;35;163;67;3;60.2
39 ;59;0;45;155;69;3;61.8
40 ;60;0;58;141;70;3;63
41 ;61;0;37;170;70;3;62.7
42 ;62;0;31;170;72;3;71.1
43 ;63;0;35;171;72;3;64.4
44 ;64;0;56;171;73;3;68.9
45 ;65;0;48;153;75;3;68.7
46 ;66;0;41;157;76;3;71
47 ;15;1;39;168;71;1;71.6
48 ;16;1;31;158;72;1;70.9
49 ;17;1;40;173;74;1;69.5
50 ;18;1;50;160;78;1;73.9
51 ;19;1;43;162;80;1;71
52 ;20;1;25;165;80;1;77.6
53 ;21;1;52;177;83;1;79.1
54 ;22;1;42;166;85;1;81.5
55 ;23;1;39;166;87;1;81.9
56 ;24;1;40;190;88;1;84.5
57 ;41;1;51;191;71;2;66.8
58 ;42;1;38;199;75;2;72.6
59 ;43;1;54;196;75;2;69.2
60 ;44;1;33;190;76;2;72.5
61 ;45;1;45;160;78;2;72.7
62 ;46;1;37;194;78;2;76.3
63 ;47;1;44;163;79;2;73.6
64 ;48;1;40;171;79;2;72.9
65 ;49;1;37;198;79;2;71.1
66 ;50;1;39;180;80;2;81.4
67 ;51;1;31;182;80;2;75.7
68 ;67;1;36;155;71;3;68.5
69 ;68;1;47;179;73;3;72.1
70 ;69;1;29;166;76;3;72.5
71 ;70;1;37;173;78;3;77.5
72 ;71;1;31;177;78;3;75.2
73 ;72;1;26;179;78;3;69.4
74 ;73;1;40;179;79;3;74.5
75 ;74;1;35;183;83;3;80.2
76 ;75;1;49;177;84;3;79.9
77 ;76;1;28;164;85;3;79.7
78 ;77;1;40;167;87;3;77.8
79 ;78;1;51;175;88;3;81.9
colnames(aov)
[1] "X......."
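The printed rows suggest the fields are separated by semicolons rather than commas, so a comma-based reader sees each line as one field. In R that would correspond to passing sep = ";" to read.csv. As a rough illustration of the same idea in Python, assuming a few lines reconstructed from the output above (the sample string and the handling of the leading empty field are my assumptions, not the actual file):

```python
import csv
import io

# Hypothetical reconstruction of a few lines of the file, based on the printed output.
# Each line starts with a semicolon, so the first field is empty.
sample = (
    ";Person;gender;Age;Height;pre.weight;Diet;weight6weeks\n"
    ";4;0;33;171;64;1;61.1\n"
    ";5;0;50;170;65;1;62.2\n"
)

# Tell the reader that the delimiter is a semicolon, not a comma.
reader = csv.reader(io.StringIO(sample), delimiter=";")
header, *rows = list(reader)

# Drop the leading empty field from the header and from every row.
header = header[1:]
records = [dict(zip(header, row[1:])) for row in rows]

print(header)
print(records[0])
```

With the right delimiter the labels come out as separate columns instead of one long string.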

Translate this R geometric problem using numpy random geometric

How can I translate this geometric distribution problem to numpy?
Products produced by a machine have a 3% defective rate.
What is the probability that the first defective occurs in the fifth item inspected?
P(X = 5) = P(first 4 non-defective) P(5th defective) = (0.97^4)(0.03)
In R:
> dgeom(x = 4, prob = .03)
[1] 0.02655878
The convention in R is to record X as the number of failures that occur before the first success.
Is my numpy code OK?
result = np.random.geometric(p=0.03, size=1000)
print(result);
result = (result == 5).sum() / 1000.
print(result * 1000,"%");
I get 17 % as a result with numpy. Is that OK? It seems wrong because there is only a 3% defect rate.
This is the numpy result array:
""" [ 31 20 37 9 47 31 22 7 44 15 52 15 4 14 36 45 26 27
9 48 30 5 7 17 7 24 121 22 23 49 2 26 25 8 4 5
3 27 70 71 3 1 19 22 103 18 14 20 34 45 8 169 11 63
29 71 30 79 75 19 56 9 5 8 15 44 8 12 40 29 46 2
144 69 65 1 4 90 20 187 100 52 46 76 3 105 12 110 31 3
113 18 6 15 127 22 6 7 3 18 123 41 69 104 13 18 2 8
52 35 54 27 74 22 31 27 3 15 21 26 13 3 32 10 131 20
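I guess that 31 is the number of items inspected before the first defective one, then 20, 37, etc.
This is what I would do:
import numpy as np
np.random.seed(1)
# 1 marks a defective item; each row is a run of 5 inspected items with a 3% defect rate
tests = np.random.choice([0,1], size=(1000,5), p=[0.97,0.03])
# fraction of runs whose first defective item is the fifth one
((np.argmax(tests, axis=1) == 4) & (tests[:,4] == 1)).mean()
# estimates 0.97**4 * 0.03, about 0.0266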
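As a cross-check on the numbers in the question, here is a small sketch that compares the closed-form value with a simulation. NumPy's geometric distribution counts the number of trials up to and including the first success (support 1, 2, ...), so `== 5` in NumPy corresponds to `dgeom(x = 4, ...)` in R. The seed and sample size are arbitrary choices of mine:

```python
import numpy as np

p = 0.03                    # defect rate from the question
exact = (1 - p) ** 4 * p    # four non-defective items, then one defective

rng = np.random.default_rng(0)             # arbitrary seed
draws = rng.geometric(p, size=1_000_000)   # trials needed to see the first defective
estimate = (draws == 5).mean()             # fraction of runs where it is the 5th item

print(f"exact    = {exact:.8f}")
print(f"estimate = {estimate:.4f}")
print(f"as a percentage: {100 * estimate:.2f} %")
```

Note that the question's code divides by 1000 and then multiplies by 1000 before printing a percent sign, so the 17 it prints appears to be a count out of 1000 (1.7 %), not 17 %. With only 1000 draws the estimate is also fairly noisy, so 1.7 % against an exact 2.66 % is not alarming.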

igraph reorder vertices to sorted order

currently, I read in a graph from an edgelist as follows:
>> require(igraph) # i have igraph 1.1.0
>> g1 <- read_graph(graphname, format='ncol')
>> V(g1)
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 38 40 42 44 46 47 48 49 50 52 56 57 58
[50] 59 60 61 62 63 64 65 67 68 41 69 43 53 37 39 45 51 54 55 66 70
As you can see, the vertex ordering is completely wrong, despite the fact that the vertices have incredibly, incredibly basic naming convention (they are all just integers). This is incredibly problematic, because the ordering of the get.adjacency function in igraph (returning me a 70x70 matrix) depends on the ordering of the vertices in V(g1), so when I try to compare to some g2 with the same set of vertices, they are similarly in a ridiculously nonsensical ordering (yet distinct from the one here) leading to inconsistent graph vertices in the sample of graphs I have despite them all having the same vertex labels. Is there a way to correct this issue, such that I can easily reorder the vertices in my graph so that the resulting adjacency matrices have sensible orderings?
EDIT: note I have already tried permuting the vertices with the permute.vertices function:
>> gtest <- permute.vertices(g1, as.numeric(V(g1))) # permute vertex ids by the ordering returned by V()
>> V(gtest) # too bad it doesn't work...
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 38 40 42 44 46 47 48 49 50 52 56 57 58
[50] 59 60 61 62 63 64 65 67 68 41 69 43 53 37 39 45 51 54 55 66 70
I managed to get it working when I instead read my graph in as:
>> g1 <- read_graph(graphname, format='ncol', predef=1:70)
>> V(g1)
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
[50] 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70
But this seems a bit ludicrous if this really is the only way to do it. Does anybody have any other suggestions?
Thanks!
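Independent of igraph, the underlying operation is just applying the same permutation to the rows and columns of the adjacency matrix. A minimal NumPy sketch of that idea, assuming you can get the vertex names in the order igraph stored them together with the matching adjacency matrix (the 4-vertex graph below is made up for illustration):

```python
import numpy as np

# Hypothetical example: vertex names in the order the reader assigned them.
names = np.array([1, 2, 4, 3])

# Adjacency matrix in that same order; the edges are 1-4 and 2-3.
adj = np.array([
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
])

# Permutation that puts the names in ascending order.
perm = np.argsort(names)

# Apply the permutation to rows and columns at once.
adj_sorted = adj[np.ix_(perm, perm)]
names_sorted = names[perm]

print(names_sorted)
print(adj_sorted)
```

Doing this for every graph in the sample gives matrices whose rows and columns all refer to the same vertex labels, regardless of the order in which each file was read.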

J Get first N columns of a Matrix

I know that, given a matrix M of size NxN, I can get the first m rows using (i.m){M. I would like to know how to get the first n columns from M.
I assume that having something like
rows =: (i.m){M
giving a matrix of size mxN the same approach would be taken to get the first n columns of this new matrix.
edit:
I am trying to use code like this:
(i.n)"1{(i.m){M
However, it is not working, as it only returns the first element of the n columns in the first row of M; I need to get n columns.
You already have several answers from Dan. This one is just to explain why you might prefer using take instead of from. If you run into a case where your n is greater than the number of columns in your M, take will give you fill where from will produce an error.
$M
10 10
(i. 3){"1 M
0 1 2
10 11 12
20 21 22
30 31 32
40 41 42
50 51 52
60 61 62
70 71 72
80 81 82
90 91 92
3{."1 M
0 1 2
10 11 12
20 21 22
30 31 32
40 41 42
50 51 52
60 61 62
70 71 72
80 81 82
90 91 92
(i. 12){"1 M
|index error
| (i.12) {"1 M
12{."1 M
0 1 2 3 4 5 6 7 8 9 0 0
10 11 12 13 14 15 16 17 18 19 0 0
20 21 22 23 24 25 26 27 28 29 0 0
30 31 32 33 34 35 36 37 38 39 0 0
40 41 42 43 44 45 46 47 48 49 0 0
50 51 52 53 54 55 56 57 58 59 0 0
60 61 62 63 64 65 66 67 68 69 0 0
70 71 72 73 74 75 76 77 78 79 0 0
80 81 82 83 84 85 86 87 88 89 0 0
90 91 92 93 94 95 96 97 98 99 0 0
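For readers who don't know J, the same contrast can be sketched with NumPy: indexing with an explicit list of column numbers fails when a column does not exist (like from), while a small helper can pad with a fill value (like take). The helper `take_columns` is my own name for illustration, not a NumPy function:

```python
import numpy as np

M = np.arange(100).reshape(10, 10)  # same 10x10 matrix as in the J session


def take_columns(matrix, n, fill=0):
    """Return the first n columns, padding with `fill` when n exceeds the width."""
    rows, cols = matrix.shape
    out = np.full((rows, n), fill, dtype=matrix.dtype)
    k = min(n, cols)
    out[:, :k] = matrix[:, :k]
    return out


first3 = M[:, np.arange(3)]    # comparable to (i. 3){"1 M
padded = take_columns(M, 12)   # comparable to 12{."1 M

try:
    M[:, np.arange(12)]        # comparable to (i. 12){"1 M
    index_failed = False
except IndexError:
    index_failed = True

print(first3[1], padded[1], index_failed)
```

One difference worth knowing: a plain NumPy slice such as `M[:, :12]` neither pads nor fails, it silently returns the 10 columns that exist.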

SHA1 encoding to hex has 40 characters and 160 bits

An SHA1 digest should be 160 bits long. Still, it is normally represented as a string with 40 characters. Considering 8-bit bytes and that 1 char corresponds to 1 byte, it seems to me the SHA1 digest should have 20 bytes and its hex representation 40 bytes.
For example, using OpenSSL I could get the following results (after manually removing extra information added):
PLAIN MESSAGE: The only possible revolution is inside us
openssl dgst -sha1 -hex dgsttxt &> sha1_hex
32 64 66 61 33 35 66 62 35 37 34 65 36 62 65 36 32 33 62 37 63 36 31 61 63 61 32 63 61 31 65 66 39 30 36 62 39 63 38 34
openssl dgst -sha1 -binary dgsttxt &> sha1_binary
2D FA 35 FB 57 4E 6B E6 23 B7 C6 1A CA 2C A1 EF 90 6B 9C 84
Applying a wc in each file I get
wc sha1_binary sha1_hex
0 1 20 sha1_binary
0 1 40 sha1_hex
0 2 60 total
So I have two questions:
Why are there 20 more characters in the hex dump?
How are those extra bits inserted? I could note each byte in the hex dump starts with either 3 or 6. Is there a particular reason for that?
I have already seen a similar question here but I am not sure if I am too stupid to understand the answers or if they are really poor. Any help is appreciated.
160 bits / 8 = 20 bytes; a byte in hex is 2 characters (00 to FF) and 2 * 20 = 40 hex characters.
The longer output is the hexadecimally encoded version of the hexadecimal encoded hash.
Quite what the point of that is, who knows.
var s = "32 64 66 61 33 35 66 62 35 37 34 65 36 62 65 36 32 33 62 37 63 36 31 61 63 61 32 63 61 31 65 66 39 30 36 62 39 63 38 34".split(" ");
for (var i = 0; i < s.length; i++)
{
document.write( String.fromCharCode(parseInt(s[i], 16)) );
}
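The same check can be done in Python. This sketch takes the two dumps exactly as posted in the question, decodes the longer one as ASCII text, and compares it with the hex form of the shorter one. It also shows the 20-byte versus 40-character relationship with hashlib; the digest it computes may differ from the one in the question, since I don't know whether the original file ended with a newline:

```python
import hashlib

# Byte values of sha1_hex and sha1_binary as shown in the question.
hex_dump = ("32 64 66 61 33 35 66 62 35 37 34 65 36 62 65 36 32 33 62 37 "
            "63 36 31 61 63 61 32 63 61 31 65 66 39 30 36 62 39 63 38 34")
binary_dump = "2D FA 35 FB 57 4E 6B E6 23 B7 C6 1A CA 2C A1 EF 90 6B 9C 84"

# The 40 bytes in sha1_hex are ASCII characters: the digits of the hex string.
as_text = bytes.fromhex(hex_dump).decode("ascii")
# The 20 bytes in sha1_binary are the raw digest; write them out as hex digits.
as_hex = bytes.fromhex(binary_dump).hex()

print(as_text)
print(as_hex)
print(as_text == as_hex)

# Raw digest versus hex string for a message hashed directly.
h = hashlib.sha1(b"The only possible revolution is inside us")
print(len(h.digest()), len(h.hexdigest()))
```

So nothing extra is inserted: each raw byte is written as two characters from 0-9 and a-f, and those characters have ASCII codes 0x30-0x39 and 0x61-0x66, which is why every byte in the hex dump starts with 3 or 6.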
