Rule Learning using SBRL in R

I'm trying to use the Scalable Bayesian Rule Lists Model for creating some rule lists in R.
Link to package: SBRL Package R
I read data into a list, split into train and test and plug into the function
sbrl_model <- sbrl(data_train,iters=20000, pos_sign="1", neg_sign="0",)
which gives me the following error:
Error in asMethod(object) :
column(s) 1, 2, 4, 6 not logical or a factor. Discretize the columns first.
When I convert the data_train into a factor and try using:
data_train <- sapply(data_train, as.factor)
sbrl_model <- sbrl::sbrl(data_train, iters=20000, pos_sign="1", neg_sign="0",)
I get the following error:
Error in data_train$label : $ operator is invalid for atomic vectors
My data has the following columns:
state, amounts, timestamp, code, risk, vendor, label
The label is 0 or 1. I need to create rules for detecting what data leads to a 1.
I'm new to R, so this is confusing: if I don't convert to factors, it complains; if I do, the "$" operator can't be used. Any ideas what I'm doing wrong? Thank you
> dput(data_train)
structure(c("PR", "PR", "PR", "PR", "MA", "MA", "NH", "NH", "ME",
"ME", "ME", "VT", "VT", "CT", "CT", "NJ", "NJ", "NY", "NY", "NY",
"NY", "NY", "NY", "NY", "PA", "PA", "PA", "PA", "PA", "PA", "PA",
"PA", "PA", "DE", "VA", "VA", "VA", "WV", "WV", "WV", "WV", "WV",
"WV", "WV", "WV", "WV", "WV", "WV", "WV", "WV", "WV", "WV", "WV",
"WV", "WV", "WV", "GA", "GA", "FL", "FL", "FL", "FL", "FL", "FL",
"AL", "AL", "AL", "TN", "TN", "TN", "MS", "MS", "MS", "KY", "KY",
"KY", "KY", "KY", "KY", "KY", "KY", "KY", "OH", "OH", "OH", "OH",
"OH", "OH", "OH", "OH", "OH", "OH", "OH", "OH", "OH", "OH", "IN",
"IA", "IA", "IA", "IA", "WI", "MN", "MN", "MN", "MN", "MN", "SD",
"SD", "ND", "ND", "ND", "ND", "ND", "MO", "MO", "MO", "MO", "MO",
"MO", "MO", "MO", "MO", "MO", "MO", "MO", "KS", "KS", "KS", "KS",
"KS", "KS", "KS", "16441", "92946", "8970", "19937", "94589",
"50615", "75915", "50005", "23037", "14835", "83678", "66263",
"60818", "82760", "42137", "32888", "35385", "20242", "98269",
"16216", "76562", "49327", "30699", "1866", "91301", "75125",
"34016", "88673", "78612", "85008", "91030", "57276", "96772",
"79568", "59489", "14154", "71655", "78163", "41673", "19942",
"19364", "34004", "79349", "1611", "8875", "19673", "5422", "42395",
"11899", "26967", "73499", "79916", "71015", "73640", "39759",
"7735", "84853", "31662", "43183", "44787", "79001", "82999",
"17031", "88109", "62215", "56040", "66592", "59148", "20786",
"30106", "46561", "9125", "83512", "60031", "65233", "49512",
"8893", "46275", "11362", "29867", "61573", "46363", "91510",
"19267", "45554", "41193", "54267", "8045", "28089", "62450",
"69082", "66685", "80769", "15446", "62589", "42875", "74723",
"2934", "18540", "96540", "60812", "50636", "90924", "60556",
"90009", "15287", "35529", "28702", "82102", "96967", "5296",
"64804", "48743", "10867", "60914", "83678", "77883", "97631",
"97175", "48103", "63128", "46774", "18285", "74512", "69313",
"80414", "32394", "51103", "51155", "28672", "38460", "89024",
"49443", "2016-01-23 12:14:07", "2016-01-17 19:22:37", "2016-01-23 22:41:32",
"2016-01-27 09:58:34", "2016-01-30 08:40:06", "2016-01-28 01:41:40",
"2016-01-27 08:22:27", "2016-01-28 00:13:48", "2016-01-20 12:31:12",
"2016-01-17 08:25:30", "2016-01-28 13:01:36", "2016-01-20 12:10:46",
"2016-01-25 07:32:01", "2016-01-23 02:13:11", "2016-01-24 11:14:46",
"2016-01-16 20:59:35", "2016-01-19 20:12:58", "2016-01-19 06:38:06",
"2016-01-27 10:15:48", "2016-01-26 14:00:30", "2016-01-28 01:54:45",
"2016-01-27 05:43:58", "2016-01-25 22:07:06", "2016-01-18 09:58:05",
"2016-01-20 05:56:54", "2016-01-26 08:05:32", "2016-01-28 14:18:45",
"2016-01-22 06:25:48", "2016-01-27 18:05:50", "2016-01-16 11:33:47",
"2016-01-22 03:31:52", "2016-01-23 05:41:37", "2016-01-27 00:55:22",
"2016-01-16 17:19:51", "2016-01-18 10:05:42", "2016-01-22 10:20:16",
"2016-01-26 21:07:20", "2016-01-17 19:12:00", "2016-01-19 17:59:45",
"2016-01-28 08:50:18", "2016-01-16 09:31:52", "2016-01-24 14:50:13",
"2016-01-17 14:02:36", "2016-01-20 17:08:29", "2016-01-25 16:42:03",
"2016-01-19 04:18:27", "2016-01-20 03:05:13", "2016-01-26 23:34:33",
"2016-01-26 13:44:56", "2016-01-16 07:09:41", "2016-01-26 06:43:12",
"2016-01-26 20:22:25", "2016-01-23 05:58:38", "2016-01-19 23:21:00",
"2016-01-16 08:36:10", "2016-01-30 01:21:00", "2016-01-23 11:10:06",
"2016-01-27 15:29:30", "2016-01-30 15:50:38", "2016-01-19 08:32:33",
"2016-01-19 18:18:02", "2016-01-21 14:20:47", "2016-01-17 13:19:59",
"2016-01-20 05:49:06", "2016-01-16 15:54:17", "2016-01-21 09:15:42",
"2016-01-16 07:32:39", "2016-01-28 03:49:00", "2016-01-26 00:19:56",
"2016-01-25 10:29:44", "2016-01-23 06:26:45", "2016-01-29 08:03:34",
"2016-01-22 14:24:34", "2016-01-16 18:44:43", "2016-01-26 00:00:51",
"2016-01-20 17:38:03", "2016-01-17 22:38:47", "2016-01-30 10:12:01",
"2016-01-21 17:00:43", "2016-01-22 08:43:30", "2016-01-27 12:04:58",
"2016-01-25 21:09:40", "2016-01-27 16:35:42", "2016-01-27 20:09:03",
"2016-01-27 09:52:40", "2016-01-26 16:12:37", "2016-01-28 16:57:29",
"2016-01-30 13:48:47", "2016-01-30 19:15:03", "2016-01-24 19:33:56",
"2016-01-28 06:57:55", "2016-01-22 18:21:40", "2016-01-16 02:54:57",
"2016-01-23 08:18:44", "2016-01-20 13:47:54", "2016-01-24 16:23:39",
"2016-01-24 19:15:09", "2016-01-22 14:59:14", "2016-01-30 10:21:43",
"2016-01-27 11:54:39", "2016-01-30 15:19:59", "2016-01-24 19:21:48",
"2016-01-27 07:20:14", "2016-01-25 07:11:55", "2016-01-24 22:33:42",
"2016-01-26 14:30:57", "2016-01-16 13:12:46", "2016-01-28 11:25:45",
"2016-01-28 14:44:25", "2016-01-23 03:25:10", "2016-01-26 13:45:49",
"2016-01-19 06:14:21", "2016-01-25 22:12:29", "2016-01-25 12:13:07",
"2016-01-22 23:56:39", "2016-01-24 07:51:51", "2016-01-24 10:50:30",
"2016-01-21 07:02:41", "2016-01-21 09:52:54", "2016-01-26 22:35:52",
"2016-01-19 06:48:13", "2016-01-19 15:18:21", "2016-01-20 12:20:37",
"2016-01-16 07:04:34", "2016-01-24 10:20:05", "2016-01-25 09:01:09",
"2016-01-21 17:02:29", "2016-01-21 11:52:00", "2016-01-27 19:39:16",
"2016-01-19 18:33:35", "2016-01-18 06:00:23", "2016-01-17 01:27:11",
"2016-01-18 10:27:57", "3355", "4935", "5454", "9555", "5938",
"5855", "4888", "3885", "8533", "4359", "5339", "5554", "5894",
"8598", "5448", "9535", "3495", "3358", "3485", "3344", "8489",
"8553", "3354", "5889", "5948", "8455", "5988", "5595", "9354",
"8485", "4559", "4838", "5585", "5585", "8554", "8598", "5535",
"5355", "5844", "3485", "5885", "8833", "8558", "9889", "9885",
"8555", "3938", "8343", "8558", "5484", "3558", "3545", "8394",
"9933", "3853", "4598", "3855", "5845", "5588", "5495", "8585",
"9584", "3385", "8858", "9445", "8488", "8558", "5838", "5848",
"8845", "8848", "8945", "4599", "8585", "8858", "4598", "5358",
"5395", "9485", "4893", "4455", "8493", "9358", "5395", "8958",
"5888", "8888", "8555", "4885", "3538", "8998", "4445", "4838",
"9885", "3559", "5584", "9594", "8558", "3844", "5434", "8558",
"9898", "4395", "9585", "3858", "4858", "5895", "9383", "9858",
"8385", "5585", "4884", "8359", "8893", "3484", "8383", "5338",
"3544", "9859", "9454", "3539", "3583", "8455", "5983", "4345",
"4943", "5548", "8353", "8993", "8594", "8994", "3958", "3989",
"W sWn ae", "o gogynh ", " ntsnagWe", "aiatteaav", "shiytWngg",
"vvmthethW", "Wynhvrrht", "tttnheviv", "itg oiWhe", "a enotisn",
"ehaothe h", "stmeathng", "i emranth", "tersggtnh", "oeiehvhh ",
"sngeeetvg", "gyyhWatge", "ritnhengs", "etihi s e", "aoeertyWn",
"eeytitys ", "nmnmegome", "n vitsnot", " h i eoht", "ahghtangh",
"ehgn hynh", "ener aeig", "t niaat g", "agtWh eah", "vehi amae",
"enhnnn hg", "ennWhgnea", "tay hnaah", "igntyvrtv", "niesehahn",
" eoavongr", "hi ehhimm", "yovgianWi", "e tnehngg", "eyehtte n",
"at nimnrg", "enesgennW", "mhahnhyet", "tt amtgna", "hehtsoish",
"hyvtanggv", "et v nssn", "inhnahe h", "onahhraWn", "mn iiahsy",
" mymisnsg", "magWoshgr", "i t eneve", "nghy naen", "eyhsyehea",
"i ihntvea", "ththnWyri", "vntv yran", "ynaieere ", "yenre htW",
"ehyWga g ", "ngeagmenh", " nW ytito", "ermhaagvr", "eeWvtr eg",
"etreaehon", "thtWyerme", "hnveWnrta", "htmr ohee", "stitnthsi",
"snthhWh a", "ehhth iny", "shgoovema", " mseynWee", "netmiitnt",
"nvi eao", "t seWWay", "yngnerarm", "ggenitaeh", "n eaogiag",
"mitnetmnh", "not sine ", "ghmhnyhne", "eattnatgh", "vhatngtts",
"tntmegten", "hreyatert", "ggmneheri", "g y en he", "igrt ggrh",
"mehnssith", "gigstgnym", "iathWh ii", "h atynin ", "eiieWmetg",
"noyggtive", " iotneng ", "oveieteen", "shnagrhti", "itooo aWv",
"toreytnny", " henaaWvn", "shehnrh W", "ttrntehgi", "oWait tn ",
"hhshhnthh", "nogeamnme", "iraah thh", "eto ngvgr", "Wno tseie",
"ehnato eW", "anservnhn", "htsyyoarv", "n aththe", "vaneav h",
"tmttvniri", "gtmhgrtgv", "h tmtnvgt", " nnaiygnr", "httot ami",
"hehnheeis", "ihtaneito", "eogh h yg", "eWgeiimv ", "sgnyisihh",
"r ngangW", "teihyaeee", "hrytWnhgi", "nniaeavmh", "iotrWehn ",
" gnvgorht", "vyinaaen ", "tgniiseae", "14", "86", "51", "54",
"90", "15", "23", "49", "6", "45", "65", "55", "53", "52", "55",
"84", "74", "74", "45", "88", "4", "76", "65", "41", "77", "40",
"66", "39", "80", "6", "35", "56", "40", "57", "90", "66", "59",
"30", "98", "31", "55", "12", "29", "67", "85", "16", "94", "87",
"61", "55", "94", "95", "68", "10", "45", "41", "93", "55", "13",
"12", "80", "45", "59", "23", "45", "1", "68", "89", "86", "68",
"46", "50", "57", "78", "85", "40", "53", "26", "67", "75", "29",
"78", "91", "35", "37", "10", "90", "36", "9", "14", "36", "31",
"5", "57", "90", "65", "48", "80", "20", "13", "92", "62", "72",
"71", "52", "50", "16", "92", "79", "9", "97", "78", "69", "50",
"84", "96", "82", "95", "44", "2", "76", "13", "1", "16", "65",
"75", "91", "30", "60", "62", "97", "86", "82", "0", "0", "0",
"0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "1", "0", "0",
"0", "0", "0", "0", "0", "0", "0", "0", "1", "0", "0", "0", "0",
"0", "0", "0", "0", "1", "0", "0", "0", "0", "0", "0", "0", "0",
"0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0",
"0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0",
"0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0",
"0", "0", "0", "0", "0", "0", "0", "1", "0", "0", "0", "0", "0",
"0", "0", "0", "0", "0", "1", "0", "0", "0", "0", "0", "0", "0",
"0", "1", "0", "0", "0", "1", "0", "0", "0", "0", "0", "0", "0",
"0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "1"
), .Dim = c(133L, 7L), .Dimnames = list(NULL, c("state", "amounts",
"timestamp", "code", "vendor", "risk", "label")))

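The problem is that data_train is a character matrix (note the .Dim attribute in your dput output), not a data.frame, and sapply(data_train, as.factor) runs over every single cell of it instead of over its columns. That resulted in an atomic vector full of junk, and the "$" operator does not work on atomic vectors, hence the error message you received. Convert the matrix to a data.frame first, then turn each column into a factor.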
This works:
data_train <- as.data.frame(data_train)
# every column must be a factor before calling sbrl()
data_train$state <- as.factor(data_train$state)
data_train$amounts <- as.factor(as.character(data_train$amounts))
data_train$timestamp <- as.factor(data_train$timestamp)
data_train$code <- as.factor(data_train$code)
data_train$vendor <- as.factor(data_train$vendor)
data_train$risk <- as.factor(data_train$risk)
data_train$label <- as.factor(data_train$label)
sbrl_model <- sbrl(data_train, iters=20000, pos_sign="1", neg_sign="0")
create itemset ...
set transactions ...[48 item(s), 8 transaction(s)] done [0.00s].
sorting and recoding items ... [48 item(s)] done [0.00s].
creating sparse bit matrix ... [48 row(s), 8 column(s)] done [0.00s].
writing ... [48 set(s)] done [0.00s].
Creating S4 object ... done [0.00s].
Eclat
parameter specification:
tidLists support minlen maxlen target ext
FALSE 0.1 1 1 frequent itemsets FALSE
algorithmic control:
sparse sort verbose
7 -2 TRUE
Absolute minimum support count: 12
create itemset ...
set transactions ...[469 item(s), 125 transaction(s)] done [0.00s].
sorting and recoding items ... [4 item(s)] done [0.00s].
creating sparse bit matrix ... [4 row(s), 125 column(s)] done [0.00s].
writing ... [4 set(s)] done [0.00s].
Creating S4 object ... done [0.00s].
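To see why the sapply() call fails and what a whole-table conversion could look like, here is a minimal sketch on a toy matrix. The values in m and the choice of three bins in cut() are assumptions made for illustration; they do not come from the sbrl documentation. Binning is worth considering because a column such as amounts, with nearly one factor level per row, is unlikely to reach the minimum support needed to form a rule.

```r
# Toy character matrix shaped like the dput() output (hypothetical values).
m <- matrix(c("PR", "MA", "NH", "NY",
              "16441", "92946", "8970", "19937",
              "0", "1", "0", "0"),
            ncol = 3,
            dimnames = list(NULL, c("state", "amounts", "label")))

# sapply() over a matrix visits every cell, so the result is a flat
# atomic vector, and `$` cannot be used on it.
flat <- sapply(m, as.factor)
is.atomic(flat)
is.data.frame(flat)

# Convert to a data.frame first; keep strings as strings for now.
df <- as.data.frame(m, stringsAsFactors = FALSE)

# Discretize the numeric column into a few ranges (3 bins is an assumption).
df$amounts <- cut(as.numeric(df$amounts), breaks = 3)

# Turn every remaining column into a factor while keeping the data.frame.
df[] <- lapply(df, as.factor)
str(df)
```

After this, df contains only factor columns, so it could be passed to sbrl() in the same way as data_train in the answer above.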

Related

R looping matrix elements

I'm trying to loop over matrix columns.
date <- rbind("2000-01-01", "2000-01-02", "2000-01-03", "2000-01-04", "2000-01-05", "2000-01-06", "2000-01-07", "2000-01-08", "2000-01-09", "2000-01-10", "2000-01-11", "2000-01-12")
a1 <- rbind("0", "0", "0", "0", "6421", "41", "5667", "44", "1178", "0", "1070", "1")
b1 <- rbind("1", "1", "1", "1", "6421", "41", "5667", "44", "1178", "0", "1070", "1")
hb1 <- rbind("2", "2", "2", "2", "6421", "41", "5667", "44", "1178", "0", "1070", "1")
a2 <- rbind("0", "0", "0", "0", "6421", "41", "5667", "44", "1178", "0", "1070", "1")
b2 <- rbind("1", "1", "1", "1", "6421", "41", "5667", "44", "1178", "0", "1070", "1")
hb2 <- rbind("2", "2", "2", "2", "6421", "41", "5667", "44", "1178", "0", "1070", "1")
a3 <- rbind("0", "0", "0", "0", "6421", "41", "5667", "44", "1178", "0", "1070", "1")
b3 <- rbind("1", "1", "1", "1", "6421", "41", "5667", "44", "1178", "0", "1070", "1")
hb3 <- rbind("2", "2", "2", "2", "6421", "41", "5667", "44", "1178", "0", "1070", "1")
a4 <- rbind("0", "0", "0", "0", "6421", "41", "5667", "44", "1178", "0", "1070", "1")
b4 <- rbind("1", "1", "1", "1", "6421", "41", "5667", "44", "1178", "0", "1070", "1")
hb4 <- rbind("2", "2", "2", "2", "6421", "41", "5667", "44", "1178", "0", "1070", "1")
info_mat <- cbind(date, a1, b1, hb1, a2, b2, hb2, a3, b3, hb3, a4, b4, hb4)
print(info_mat)
I want to compute an evolution rate (V+1 - V)/V between the months for each variable
(evolution from January to Feb, Feb to March, ..., for a1, ..., hb4)
and get the result in a matrix that I will name "evolution_matrix"
I tried the following, but it doesn't work.
Note that i here stands for the fact that I want to compute the evolution for every variable. I think of it as:
Evolution(January to February for variable a1) =
(value of a1 in February - value of a1 in January)/(value of a1 in January).
I don't know how to express that in code, so I wrote i, but it doesn't refer to anything in the matrix.
for(row in 1:nrow(info_mat)) {
for(col in 1:ncol(info_mat)) {
evolution[[i]] = (info_mat[i+1] - info_mat[i] )/info_mat[i]
print(evolution[[i]])
}
}
Help please!
Why use a matrix? Your matrix holds only character (string) values, but you want to use them as numbers. I think a data.frame is a better idea.
Base R has the function lapply, which applies a function to each column of a data.frame and returns the results as a list; the pipe %>% and the lag function used below come from the dplyr package. We don't want to apply the evolution function to the date column, so it is dropped first.
library(dplyr)  # provides %>%, lag() and bind_cols()
evolution <- as.data.frame(info_mat, stringsAsFactors = FALSE)[, -1] %>%
lapply(function(x) {x <- as.numeric(x); (x - lag(x)) / lag(x)}) %>%
as.data.frame()
In the last line I convert the list to a data.frame (for nicer printing).
But we forgot about the date column. Let's add it to our data.frame.
evolution <- bind_cols(data.frame(date = date), evolution)
That is all. But if you want to do it by loop you can use this code:
evolution <- matrix(NA, nrow(info_mat), ncol(info_mat))
evolution[, 1] <- date
for(row in 2:nrow(info_mat)) {
  for(col in 2:ncol(info_mat)) {
    evolution[row, col] = as.numeric(info_mat[row, col])/as.numeric(info_mat[row - 1, col]) - 1
  }
}
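If the values are already numeric, the same rate (V+1 - V)/V can also be computed without a loop. The following is only a sketch in base R, using two of the columns from the question as plain numeric vectors; the name values is made up for the example.

```r
# Two of the question's columns as numeric vectors.
a1 <- c(0, 0, 0, 0, 6421, 41, 5667, 44, 1178, 0, 1070, 1)
b1 <- c(1, 1, 1, 1, 6421, 41, 5667, 44, 1178, 0, 1070, 1)
values <- cbind(a1, b1)

# diff() gives the change from each row to the next, column by column;
# head(values, -1) drops the last row so both pieces have 11 rows.
evolution_matrix <- diff(values) / head(values, -1)
print(evolution_matrix)
```

Note that a previous value of 0 produces NaN (0/0) or Inf (x/0) in the result, so those rows may need special handling.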
Comments about your example code:
You have no variable i, and you never use the loop variables row and col.
What is the type of the evolution variable?
info_mat[i+1] is not numeric, so you cannot divide it by info_mat[i].
What does info_mat[i] mean? Yes, info_mat[row, col] is equal to info_mat[(col - 1) * 12 + row] (12 being the number of rows), but info_mat[i] and info_mat[i + 1] can be in different columns.
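The point about single-index access can be checked on a small matrix. This is just an illustration; m, i_row and j_col are made-up names, and the matrix has 12 rows like info_mat.

```r
# 12 rows, 2 columns, filled column by column: column 1 is 1..12, column 2 is 13..24.
m <- matrix(1:24, nrow = 12)

i_row <- 12
j_col <- 1

# A single index counts down the columns, so [row, col] maps to (col - 1) * nrow + row.
linear <- (j_col - 1) * nrow(m) + i_row
same_cell <- m[i_row, j_col] == m[linear]

# The next single index is no longer in column 1: it is the first row of column 2.
next_is_other_column <- m[linear + 1] == m[1, 2]

print(c(same_cell, next_is_other_column))
```

So stepping from i to i + 1 at the bottom of a column silently moves to the next variable, which is why a [row, col] index is safer here.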
And if you want to create a data.frame with your data, use this code:
df = data.frame(
date = c("2000-01-01", "2000-01-02", "2000-01-03", "2000-01-04", "2000-01-05", "2000-01-06", "2000-01-07", "2000-01-08", "2000-01-09", "2000-01-10", "2000-01-11", "2000-01-12"),
a1 = c(0, 0, 0, 0, 6421, 41, 5667, 44, 1178, 0, 1070, 1),
b1 = c(1, 1, 1, 1, 6421, 41, 5667, 44, 1178, 0, 1070, 1)
)

scale_x_continuous : Discrete value supplied to continuous scale

I'm a beginner in R. I get this error
Error: Discrete value supplied to continuous scale
when I try to use
scale_x_continuous(breaks=1:10)
The plot I get is the following one. As you can see, the axis needs to be reduced...
I get my data from a csv file
dput(head(Data, 20))
structure(list(Pmanche = structure(1:20, .Label = c("0", "0,1",
"0,2", "0,3", "0,4", "0,5", "0,6", "0,7", "0,8", "0,9", "1",
"1,1", "1,2", "1,3", "1,4", "1,5", "1,6", "1,7", "1,8", "1,9",
"10", "10,1", "10,2", "10,3", "10,4", "10,5", "10,6", "10,7",
"10,8", "10,9", "100", "11", "11,1", "11,2", "11,3", "11,4",
"11,5", "11,6", "11,7", "11,8", "11,9", "12", "12,1", "12,2",
"12,3", "12,4", "12,5", "12,6", "12,7", "12,8", "12,9", "13",
"13,1", "13,2", "13,3", "13,4", "13,5", "13,6", "13,7", "13,8",
"13,9", "14", "14,1", "14,2", "14,3", "14,4", "14,5", "14,6",
"14,7", "14,8", "14,9", "15", "15,1", "15,2", "15,3", "15,4",
"15,5", "15,6", "15,7", "15,8", "15,9", "16", "16,1", "16,2",
"16,3", "16,4", "16,5", "16,6", "16,7", "16,8", "16,9", "17",
"17,1", "17,2", "17,3", "17,4", "17,5", "17,6", "17,7", "17,8",
"17,9", "18", "18,1", "18,2", "18,3", "18,4", "18,5", "18,6",
"18,7", "18,8", "18,9", "19", "19,1", "19,2", "19,3", "19,4",
"19,5", "19,6", "19,7", "19,8", "19,9", "2", "2,1", "2,2", "2,3",
"2,4", "2,5", "2,6", "2,7", "2,8", "2,9", "20", "20,1", "20,2",
"20,3", "20,4", "20,5", "20,6", "20,7", "20,8", "20,9", "21",
"21,1", "21,2", "21,3", "21,4", "21,5", "21,6", "21,7", "21,8",
"21,9", "22", "22,1", "22,2", "22,3", "22,4", "22,5", "22,6",
"22,7", "22,8", "22,9", "23", "23,1", "23,2", "23,3", "23,4",
"23,5", "23,6", "23,7", "23,8", "23,9", "24", "24,1", "24,2",
"24,3", "24,4", "24,5", "24,6", "24,7", "24,8", "24,9", "25",
"25,1", "25,2", "25,3", "25,4", "25,5", "25,6", "25,7", "25,8",
"25,9", "26", "26,1", "26,2", "26,3", "26,4", "26,5", "26,6",
"26,7", "26,8", "26,9", "27", "27,1", "27,2", "27,3", "27,4",
"27,5", "27,6", "27,7", "27,8", "27,9", "28", "28,1", "28,2",
"28,3", "28,4", "28,5", "28,6", "28,7", "28,8", "28,9", "29",
"29,1", "29,2", "29,3", "29,4", "29,5", "29,6", "29,7", "29,8",
"29,9", "3", "3,1", "3,2", "3,3", "3,4", "3,5", "3,6", "3,7",
"3,8", "3,9", "30", "30,1", "30,2", "30,3", "30,4", "30,5", "30,6",
"30,7", "30,8", "30,9", "31", "31,1", "31,2", "31,3", "31,4",
"31,5", "31,6", "31,7", "31,8", "31,9", "32", "32,1", "32,2",
"32,3", "32,4", "32,5", "32,6", "32,7", "32,8", "32,9", "33",
"33,1", "33,2", "33,3", "33,4", "33,5", "33,6", "33,7", "33,8",
"33,9", "34", "34,1", "34,2", "34,3", "34,4", "34,5", "34,6",
"34,7", "34,8", "34,9", "35", "35,1", "35,2", "35,3", "35,4",
"35,5", "35,6", "35,7", "35,8", "35,9", "36", "36,1", "36,2",
"36,3", "36,4", "36,5", "36,6", "36,7", "36,8", "36,9", "37",
"37,1", "37,2", "37,3", "37,4", "37,5", "37,6", "37,7", "37,8",
"37,9", "38", "38,1", "38,2", "38,3", "38,4", "38,5", "38,6",
"38,7", "38,8", "38,9", "39", "39,1", "39,2", "39,3", "39,4",
"39,5", "39,6", "39,7", "39,8", "39,9", "4", "4,1", "4,2", "4,3",
"4,4", "4,5", "4,6", "4,7", "4,8", "4,9", "40", "40,1", "40,2",
"40,3", "40,4", "40,5", "40,6", "40,7", "40,8", "40,9", "41",
"41,1", "41,2", "41,3", "41,4", "41,5", "41,6", "41,7", "41,8",
"41,9", "42", "42,1", "42,2", "42,3", "42,4", "42,5", "42,6",
"42,7", "42,8", "42,9", "43", "43,1", "43,2", "43,3", "43,4",
"43,5", "43,6", "43,7", "43,8", "43,9", "44", "44,1", "44,2",
"44,3", "44,4", "44,5", "44,6", "44,7", "44,8", "44,9", "45",
"45,1", "45,2", "45,3", "45,4", "45,5", "45,6", "45,7", "45,8",
"45,9", "46", "46,1", "46,2", "46,3", "46,4", "46,5", "46,6",
"46,7", "46,8", "46,9", "47", "47,1", "47,2", "47,3", "47,4",
"47,5", "47,6", "47,7", "47,8", "47,9", "48", "48,1", "48,2",
"48,3", "48,4", "48,5", "48,6", "48,7", "48,8", "48,9", "49",
"49,1", "49,2", "49,3", "49,4", "49,5", "49,6", "49,7", "49,8",
"49,9", "5", "5,1", "5,2", "5,3", "5,4", "5,5", "5,6", "5,7",
"5,8", "5,9", "50", "50,1", "50,2", "50,3", "50,4", "50,5", "50,6",
"50,7", "50,8", "50,9", "51", "51,1", "51,2", "51,3", "51,4",
"51,5", "51,6", "51,7", "51,8", "51,9", "52", "52,1", "52,2",
"52,3", "52,4", "52,5", "52,6", "52,7", "52,8", "52,9", "53",
"53,1", "53,2", "53,3", "53,4", "53,5", "53,6", "53,7", "53,8",
"53,9", "54", "54,1", "54,2", "54,3", "54,4", "54,5", "54,6",
"54,7", "54,8", "54,9", "55", "55,1", "55,2", "55,3", "55,4",
"55,5", "55,6", "55,7", "55,8", "55,9", "56", "56,1", "56,2",
"56,3", "56,4", "56,5", "56,6", "56,7", "56,8", "56,9", "57",
"57,1", "57,2", "57,3", "57,4", "57,5", "57,6", "57,7", "57,8",
"57,9", "58", "58,1", "58,2", "58,3", "58,4", "58,5", "58,6",
"58,7", "58,8", "58,9", "59", "59,1", "59,2", "59,3", "59,4",
"59,5", "59,6", "59,7", "59,8", "59,9", "6", "6,1", "6,2", "6,3",
"6,4", "6,5", "6,6", "6,7", "6,8", "6,9", "60", "60,1", "60,2",
"60,3", "60,4", "60,5", "60,6", "60,7", "60,8", "60,9", "61",
"61,1", "61,2", "61,3", "61,4", "61,5", "61,6", "61,7", "61,8",
"61,9", "62", "62,1", "62,2", "62,3", "62,4", "62,5", "62,6",
"62,7", "62,8", "62,9", "63", "63,1", "63,2", "63,3", "63,4",
"63,5", "63,6", "63,7", "63,8", "63,9", "64", "64,1", "64,2",
"64,3", "64,4", "64,5", "64,6", "64,7", "64,8", "64,9", "65",
"65,1", "65,2", "65,3", "65,4", "65,5", "65,6", "65,7", "65,8",
"65,9", "66", "66,1", "66,2", "66,3", "66,4", "66,5", "66,6",
"66,7", "66,8", "66,9", "67", "67,1", "67,2", "67,3", "67,4",
"67,5", "67,6", "67,7", "67,8", "67,9", "68", "68,1", "68,2",
"68,3", "68,4", "68,5", "68,6", "68,7", "68,8", "68,9", "69",
"69,1", "69,2", "69,3", "69,4", "69,5", "69,6", "69,7", "69,8",
"69,9", "7", "7,1", "7,2", "7,3", "7,4", "7,5", "7,6", "7,7",
"7,8", "7,9", "70", "70,1", "70,2", "70,3", "70,4", "70,5", "70,6",
"70,7", "70,8", "70,9", "71", "71,1", "71,2", "71,3", "71,4",
"71,5", "71,6", "71,7", "71,8", "71,9", "72", "72,1", "72,2",
"72,3", "72,4", "72,5", "72,6", "72,7", "72,8", "72,9", "73",
"73,1", "73,2", "73,3", "73,4", "73,5", "73,6", "73,7", "73,8",
"73,9", "74", "74,1", "74,2", "74,3", "74,4", "74,5", "74,6",
"74,7", "74,8", "74,9", "75", "75,1", "75,2", "75,3", "75,4",
"75,5", "75,6", "75,7", "75,8", "75,9", "76", "76,1", "76,2",
"76,3", "76,4", "76,5", "76,6", "76,7", "76,8", "76,9", "77",
"77,1", "77,2", "77,3", "77,4", "77,5", "77,6", "77,7", "77,8",
"77,9", "78", "78,1", "78,2", "78,3", "78,4", "78,5", "78,6",
"78,7", "78,8", "78,9", "79", "79,1", "79,2", "79,3", "79,4",
"79,5", "79,6", "79,7", "79,8", "79,9", "8", "8,1", "8,2", "8,3",
"8,4", "8,5", "8,6", "8,7", "8,8", "8,9", "80", "80,1", "80,2",
"80,3", "80,4", "80,5", "80,6", "80,7", "80,8", "80,9", "81",
"81,1", "81,2", "81,3", "81,4", "81,5", "81,6", "81,7", "81,8",
"81,9", "82", "82,1", "82,2", "82,3", "82,4", "82,5", "82,6",
"82,7", "82,8", "82,9", "83", "83,1", "83,2", "83,3", "83,4",
"83,5", "83,6", "83,7", "83,8", "83,9", "84", "84,1", "84,2",
"84,3", "84,4", "84,5", "84,6", "84,7", "84,8", "84,9", "85",
"85,1", "85,2", "85,3", "85,4", "85,5", "85,6", "85,7", "85,8",
"85,9", "86", "86,1", "86,2", "86,3", "86,4", "86,5", "86,6",
"86,7", "86,8", "86,9", "87", "87,1", "87,2", "87,3", "87,4",
"87,5", "87,6", "87,7", "87,8", "87,9", "88", "88,1", "88,2",
"88,3", "88,4", "88,5", "88,6", "88,7", "88,8", "88,9", "89",
"89,1", "89,2", "89,3", "89,4", "89,5", "89,6", "89,7", "89,8",
"89,9", "9", "9,1", "9,2", "9,3", "9,4", "9,5", "9,6", "9,7",
"9,8", "9,9", "90", "90,1", "90,2", "90,3", "90,4", "90,5", "90,6",
"90,7", "90,8", "90,9", "91", "91,1", "91,2", "91,3", "91,4",
"91,5", "91,6", "91,7", "91,8", "91,9", "92", "92,1", "92,2",
"92,3", "92,4", "92,5", "92,6", "92,7", "92,8", "92,9", "93",
"93,1", "93,2", "93,3", "93,4", "93,5", "93,6", "93,7", "93,8",
"93,9", "94", "94,1", "94,2", "94,3", "94,4", "94,5", "94,6",
"94,7", "94,8", "94,9", "95", "95,1", "95,2", "95,3", "95,4",
"95,5", "95,6", "95,7", "95,8", "95,9", "96", "96,1", "96,2",
"96,3", "96,4", "96,5", "96,6", "96,7", "96,8", "96,9", "97",
"97,1", "97,2", "97,3", "97,4", "97,5", "97,6", "97,7", "97,8",
"97,9", "98", "98,1", "98,2", "98,3", "98,4", "98,5", "98,6",
"98,7", "98,8", "98,9", "99", "99,1", "99,2", "99,3", "99,4",
"99,5", "99,6", "99,7", "99,8", "99,9"), class = "factor"), Pcsge = structure(1:20, .Label = c("0",
"0,1", "0,2", "0,3", "0,4", "0,5", "0,6", "0,7", "0,8", "0,9",
"1", "1,1", "1,2", "1,3", "1,4", "1,5", "1,6", "1,7", "1,8",
"1,9", "10", "10,1", "10,2", "10,3", "10,4", "10,5", "10,6",
"10,7", "10,8", "10,9", "100", "11", "11,1", "11,2", "11,3",
"11,4", "11,5", "11,6", "11,7", "11,8", "11,9", "12", "12,1",
"12,2", "12,3", "12,4", "12,5", "12,6", "12,7", "12,8", "12,9",
"13", "13,1", "13,2", "13,3", "13,4", "13,5", "13,6", "13,7",
"13,8", "13,9", "14", "14,1", "14,2", "14,3", "14,4", "14,5",
"14,6", "14,7", "14,8", "14,9", "15", "15,1", "15,2", "15,3",
"15,4", "15,5", "15,6", "15,7", "15,8", "15,9", "16", "16,1",
"16,2", "16,3", "16,4", "16,5", "16,6", "16,7", "16,8", "16,9",
"17", "17,1", "17,2", "17,3", "17,4", "17,5", "17,6", "17,7",
"17,8", "17,9", "18", "18,1", "18,2", "18,3", "18,4", "18,5",
"18,6", "18,7", "18,8", "18,9", "19", "19,1", "19,2", "19,3",
"19,4", "19,5", "19,6", "19,7", "19,8", "19,9", "2", "2,1", "2,2",
"2,3", "2,4", "2,5", "2,6", "2,7", "2,8", "2,9", "20", "20,1",
"20,2", "20,3", "20,4", "20,5", "20,6", "20,7", "20,8", "20,9",
"21", "21,1", "21,2", "21,3", "21,4", "21,5", "21,6", "21,7",
"21,8", "21,9", "22", "22,1", "22,2", "22,3", "22,4", "22,5",
"22,6", "22,7", "22,8", "22,9", "23", "23,1", "23,2", "23,3",
"23,4", "23,5", "23,6", "23,7", "23,8", "23,9", "24", "24,1",
"24,2", "24,3", "24,4", "24,5", "24,6", "24,7", "24,8", "24,9",
"25", "25,1", "25,2", "25,3", "25,4", "25,5", "25,6", "25,7",
"25,8", "25,9", "26", "26,1", "26,2", "26,3", "26,4", "26,5",
"26,6", "26,7", "26,8", "26,9", "27", "27,1", "27,2", "27,3",
"27,4", "27,5", "27,6", "27,7", "27,8", "27,9", "28", "28,1",
"28,2", "28,3", "28,4", "28,5", "28,6", "28,7", "28,8", "28,9",
"29", "29,1", "29,2", "29,3", "29,4", "29,5", "29,6", "29,7",
"29,8", "29,9", "3", "3,1", "3,2", "3,3", "3,4", "3,5", "3,6",
"3,7", "3,8", "3,9", "30", "30,1", "30,2", "30,3", "30,4", "30,5",
"30,6", "30,7", "30,8", "30,9", "31", "31,1", "31,2", "31,3",
"31,4", "31,5", "31,6", "31,7", "31,8", "31,9", "32", "32,1",
"32,2", "32,3", "32,4", "32,5", "32,6", "32,7", "32,8", "32,9",
"33", "33,1", "33,2", "33,3", "33,4", "33,5", "33,6", "33,7",
"33,8", "33,9", "34", "34,1", "34,2", "34,3", "34,4", "34,5",
"34,6", "34,7", "34,8", "34,9", "35", "35,1", "35,2", "35,3",
"35,4", "35,5", "35,6", "35,7", "35,8", "35,9", "36", "36,1",
"36,2", "36,3", "36,4", "36,5", "36,6", "36,7", "36,8", "36,9",
"37", "37,1", "37,2", "37,3", "37,4", "37,5", "37,6", "37,7",
"37,8", "37,9", "38", "38,1", "38,2", "38,3", "38,4", "38,5",
"38,6", "38,7", "38,8", "38,9", "39", "39,1", "39,2", "39,3",
"39,4", "39,5", "39,6", "39,7", "39,8", "39,9", "4", "4,1", "4,2",
"4,3", "4,4", "4,5", "4,6", "4,7", "4,8", "4,9", "40", "40,1",
"40,2", "40,3", "40,4", "40,5", "40,6", "40,7", "40,8", "40,9",
"41", "41,1", "41,2", "41,3", "41,4", "41,5", "41,6", "41,7",
"41,8", "41,9", "42", "42,1", "42,2", "42,3", "42,4", "42,5",
"42,6", "42,7", "42,8", "42,9", "43", "43,1", "43,2", "43,3",
"43,4", "43,5", "43,6", "43,7", "43,8", "43,9", "44", "44,1",
"44,2", "44,3", "44,4", "44,5", "44,6", "44,7", "44,8", "44,9",
"45", "45,1", "45,2", "45,3", "45,4", "45,5", "45,6", "45,7",
"45,8", "45,9", "46", "46,1", "46,2", "46,3", "46,4", "46,5",
"46,6", "46,7", "46,8", "46,9", "47", "47,1", "47,2", "47,3",
"47,4", "47,5", "47,6", "47,7", "47,8", "47,9", "48", "48,1",
"48,2", "48,3", "48,4", "48,5", "48,6", "48,7", "48,8", "48,9",
"49", "49,1", "49,2", "49,3", "49,4", "49,5", "49,6", "49,7",
"49,8", "49,9", "5", "5,1", "5,2", "5,3", "5,4", "5,5", "5,6",
"5,7", "5,8", "5,9", "50", "50,1", "50,2", "50,3", "50,4", "50,5",
"50,6", "50,7", "50,8", "50,9", "51", "51,1", "51,2", "51,3",
"51,4", "51,5", "51,6", "51,7", "51,8", "51,9", "52", "52,1",
"52,2", "52,3", "52,4", "52,5", "52,6", "52,7", "52,8", "52,9",
"53", "53,1", "53,2", "53,3", "53,4", "53,5", "53,6", "53,7",
"53,8", "53,9", "54", "54,1", "54,2", "54,3", "54,4", "54,5",
"54,6", "54,7", "54,8", "54,9", "55", "55,1", "55,2", "55,3",
"55,4", "55,5", "55,6", "55,7", "55,8", "55,9", "56", "56,1",
"56,2", "56,3", "56,4", "56,5", "56,6", "56,7", "56,8", "56,9",
"57", "57,1", "57,2", "57,3", "57,4", "57,5", "57,6", "57,7",
"57,8", "57,9", "58", "58,1", "58,2", "58,3", "58,4", "58,5",
"58,6", "58,7", "58,8", "58,9", "59", "59,1", "59,2", "59,3",
"59,4", "59,5", "59,6", "59,7", "59,8", "59,9", "6", "6,1", "6,2",
"6,3", "6,4", "6,5", "6,6", "6,7", "6,8", "6,9", "60", "60,1",
"60,2", "60,3", "60,4", "60,5", "60,6", "60,7", "60,8", "60,9",
"61", "61,1", "61,2", "61,3", "61,4", "61,5", "61,6", "61,7",
"61,8", "61,9", "62", "62,1", "62,2", "62,3", "62,4", "62,5",
"62,6", "62,7", "62,8", "62,9", "63", "63,1", "63,2", "63,3",
"63,4", "63,5", "63,6", "63,7", "63,8", "63,9", "64", "64,1",
"64,2", "64,3", "64,4", "64,5", "64,6", "64,7", "64,8", "64,9",
"65", "65,1", "65,2", "65,3", "65,4", "65,5", "65,6", "65,7",
"65,8", "65,9", "66", "66,1", "66,2", "66,3", "66,4", "66,5",
"66,6", "66,7", "66,8", "66,9", "67", "67,1", "67,2", "67,3",
"67,4", "67,5", "67,6", "67,7", "67,8", "67,9", "68", "68,1",
"68,2", "68,3", "68,4", "68,5", "68,6", "68,7", "68,8", "68,9",
"69", "69,1", "69,2", "69,3", "69,4", "69,5", "69,6", "69,7",
"69,8", "69,9", "7", "7,1", "7,2", "7,3", "7,4", "7,5", "7,6",
"7,7", "7,8", "7,9", "70", "70,1", "70,2", "70,3", "70,4", "70,5",
"70,6", "70,7", "70,8", "70,9", "71", "71,1", "71,2", "71,3",
"71,4", "71,5", "71,6", "71,7", "71,8", "71,9", "72", "72,1",
"72,2", "72,3", "72,4", "72,5", "72,6", "72,7", "72,8", "72,9",
"73", "73,1", "73,2", "73,3", "73,4", "73,5", "73,6", "73,7",
"73,8", "73,9", "74", "74,1", "74,2", "74,3", "74,4", "74,5",
"74,6", "74,7", "74,8", "74,9", "75", "75,1", "75,2", "75,3",
"75,4", "75,5", "75,6", "75,7", "75,8", "75,9", "76", "76,1",
"76,2", "76,3", "76,4", "76,5", "76,6", "76,7", "76,8", "76,9",
"77", "77,1", "77,2", "77,3", "77,4", "77,5", "77,6", "77,7",
"77,8", "77,9", "78", "78,1", "78,2", "78,3", "78,4", "78,5",
"78,6", "78,7", "78,8", "78,9", "79", "79,1", "79,2", "79,3",
"79,4", "79,5", "79,6", "79,7", "79,8", "79,9", "8", "8,1", "8,2",
"8,3", "8,4", "8,5", "8,6", "8,7", "8,8", "8,9", "80", "80,1",
"80,2", "80,3", "80,4", "80,5", "80,6", "80,7", "80,8", "80,9",
"81", "81,1", "81,2", "81,3", "81,4", "81,5", "81,6", "81,7",
"81,8", "81,9", "82", "82,1", "82,2", "82,3", "82,4", "82,5",
"82,6", "82,7", "82,8", "82,9", "83", "83,1", "83,2", "83,3",
"83,4", "83,5", "83,6", "83,7", "83,8", "83,9", "84", "84,1",
"84,2", "84,3", "84,4", "84,5", "84,6", "84,7", "84,8", "84,9",
"85", "85,1", "85,2", "85,3", "85,4", "85,5", "85,6", "85,7",
"85,8", "85,9", "86", "86,1", "86,2", "86,3", "86,4", "86,5",
"86,6", "86,7", "86,8", "86,9", "87", "87,1", "87,2", "87,3",
"87,4", "87,5", "87,6", "87,7", "87,8", "87,9", "88", "88,1",
"88,2", "88,3", "88,4", "88,5", "88,6", "88,7", "88,8", "88,9",
"89", "89,1", "89,2", "89,3", "89,4", "89,5", "89,6", "89,7",
"89,8", "89,9", "9", "9,1", "9,2", "9,3", "9,4", "9,5", "9,6",
"9,7", "9,8", "9,9", "90", "90,1", "90,2", "90,3", "90,4", "90,5",
"90,6", "90,7", "90,8", "90,9", "91", "91,1", "91,2", "91,3",
"91,4", "91,5", "91,6", "91,7", "91,8", "91,9", "92", "92,1",
"92,2", "92,3", "92,4", "92,5", "92,6", "92,7", "92,8", "92,9",
"93", "93,1", "93,2", "93,3", "93,4", "93,5", "93,6", "93,7",
"93,8", "93,9", "94", "94,1", "94,2", "94,3", "94,4", "94,5",
"94,6", "94,7", "94,8", "94,9", "95", "95,1", "95,2", "95,3",
"95,4", "95,5", "95,6", "95,7", "95,8", "95,9", "96", "96,1",
"96,2", "96,3", "96,4", "96,5", "96,6", "96,7", "96,8", "96,9",
"97", "97,1", "97,2", "97,3", "97,4", "97,5", "97,6", "97,7",
"97,8", "97,9", "98", "98,1", "98,2", "98,3", "98,4", "98,5",
"98,6", "98,7", "98,8", "98,9", "99", "99,1", "99,2", "99,3",
"99,4", "99,5", "99,6", "99,7", "99,8", "99,9"), class = "factor")), row.names = c(NA,
20L), class = "data.frame")
At first, I thought it was a problem with the class of the variables (numeric, factor...), but even when I convert to numeric it doesn't work...
Data$Pmanche <- levels(Data$Pmanche)[Data$Pmanche]
Data$Pcsge <- levels(Data$Pcsge)[Data$Pcsge]
Thanks for your time!
Here is my code:
## set the working directory
setwd(dir="C:/Users/F596028/Documents/nouveau dossier/Optimisation")
##############################
# Packages #
##############################
# Graphics
#install.packages("ggplot2")
library(ggplot2)
##############################
# Opening the Excel file     #
##############################
## Files with the .csv extension in the directory
files <- list.files(pattern = "\\.csv$")
files <- sort(files)
## Read the column headers
headers <- read.csv("Classeur1.csv", header = F, nrows = 1, as.is = T, sep=";")
## Read the data
Sub1 <- read.csv(files, skip=1, header=F, sep=";")
colnames(Sub1)=headers
Pmanche<-Sub1[,c(3)]
Pcsge<-Sub1[,c(4)]
Data <- data.frame(Pmanche,Pcsge)
##############################
# Plots #
##############################
#Data$Pmanche <- levels(Data$Pmanche)[Data$Pmanche]
#Data$Pcsge <- levels(Data$Pcsge)[Data$Pcsge]
p<- ggplot(Data, aes(Pmanche,Pcsge))
p + geom_point() + scale_x_continuous(breaks=1:10)
OK, I solved it.
The problem was the data source.
I learned that we need to be careful when taking data from Excel: decimals must be written with "." and not ",". With "," as the decimal mark, R reads the columns as factors (text) rather than numbers.
Thanks
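For data that already uses a decimal comma, an alternative to editing the file is to tell R about the decimal mark when reading it. The following is a minimal sketch, assuming a semicolon-separated file like the one in the question; the inline `csv_text` and its values are made up for illustration.

```r
# Hypothetical inline data standing in for the semicolon-separated CSV file
csv_text <- "Pmanche;Pcsge\n80,2;81,5\n82;83,7"

# With the default dec = ".", values such as "80,2" are not parsed as numbers
wrong <- read.csv(text = csv_text, sep = ";")

# dec = "," tells R that the comma is the decimal mark
right <- read.csv(text = csv_text, sep = ";", dec = ",")

# read.csv2() uses sep = ";" and dec = "," by default
right2 <- read.csv2(text = csv_text)

# Repairing a column that was already read as text or as a factor
fixed <- as.numeric(sub(",", ".", as.character(wrong$Pmanche), fixed = TRUE))

str(right)
```

With numeric columns, `scale_x_continuous()` in the plot code should no longer complain about a discrete value.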

Drawing slope graph in R using ggplot, Error: Aesthetics must be either length 1 or the same as the data

I want to create a slope graph in R using ggplot, like the one shown here:
https://rud.is/b/2013/01/11/slopegraphs-in-r/
After cleaning the data and melting the data frame, I ran into this error:
Error: Aesthetics must be either length 1 or the same as the data (182): x, y, group, colour, label
There are no NAs in my data. Any ideas? Much appreciated!
Here's the code:
#Read file as numeric data
betterlife<-read.csv("betterlife.csv",skip=4,stringsAsFactors = F)
num_data <- data.frame(data.matrix(betterlife))
numeric_columns <- sapply(num_data,function(x){mean(as.numeric(is.na(x)))<0.5})
final_data <- data.frame(num_data[,numeric_columns],
betterlife[,!numeric_columns])
## rescale selected columns data frame
final_data <- data.frame(lapply(final_data[,c(3,4,5,6,7,10,11)], function(x) scale(x, center = FALSE, scale = max(x, na.rm = TRUE)/100)))
## Add country names as indicator
final_data["INDICATOR"] <- NA
final_data$INDICATOR <- betterlife$INDICATOR
employment.data <- final_data[5:30,]
indicator <- employment.data$INDICATOR
## Melt data to draw graph
employment.melt <- melt(employment.data)
#plot
sg = ggplot(employment.melt, aes(factor(variable), value,
group = indicator,
colour = indicator,
label = indicator)) +
theme(legend.position = "none",
axis.text.x = element_text(size=5),
axis.text.y=element_blank(),
axis.title.x=element_blank(),
axis.title.y=element_blank(),
axis.ticks=element_blank(),
axis.line=element_blank(),
panel.grid.major.x = element_line("black", size = 0.1),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
panel.background = element_blank())
sg1
This is the data I'm working with
dput(betterlife)
structure(list(X = c("", "ISO3", "AUS", "AUT", "BEL", "CAN",
"CHL", "CZE", "DNK", "EST", "FIN", "FRA", "DEU", "GRC", "HUN",
"ISL", "IRL", "ISR", "ITA", "JPN", "KOR", "LUX", "MEX", "NLD",
"NZL", "NOR", "POL", "PRT", "SVK", "SVN", "ESP", "SWE", "CHE",
"TUR", "GBR", "USA", "OECD", "", ""),
INDICATOR = c("UNIT", "COUNTRY",
"Australia", "Austria", "Belgium", "Canada", "Chile", "Czech Republic",
"Denmark", "Estonia", "Finland", "France", "Germany", "Greece",
"Hungary", "Iceland", "Ireland", "Israel", "Italy", "Japan",
"Korea", "Luxembourg", "Mexico", "Netherlands", "New Zealand",
"Norway", "Poland", "Portugal", "Slovak Republic", "Slovenia",
"Spain", "Sweden", "Switzerland", "Turkey", "United Kingdom",
"United States", "OECD average", "", "n.a. : not available"),
Rooms.per.person = c("Average number of rooms shared per person in a dwelling",
"", "2.4", "1.7", "2.3", "2.5", "1.3", "1.3", "1.9", "1.2",
"1.9", "1.8", "1.7", "1.2", "1", "1.6", "2.1", "1.1", "1.4",
"1.8", "1.3", "1.9", "1.566666667", "2", "2.3", "1.9", "1",
"1.5", "1.1", "1.1", "1.9", "1.8", "1.7", "0.7", "1.8", "1.605208333",
"1.6", "", ""),
Dwelling.without.basic.facilities = c("% of people without indoor flushing toilets in their home",
"", "3.425714286", "1.3", "0.6", "2.722", "9.36", "0.7",
"0", "12.2", "0.8", "0.8", "1.2", "1.8", "7.1", "0.3", "0.3",
"2.52", "0.2", "6.4", "7.46", "0.8", "6.6", "0", "2.984285714",
"0.1", "4.8", "2.4", "1.1", "0.6", "0", "0", "0.1", "17.1",
"0.5", "0", "2.82", "", ""),
Household.disposable.income = c("USD (PPPs adjusted)",
"", "27,039", "27,670", "26,008", "27,015", "8,712", "16,690",
"22,929", "13,486", "24,246", "27,508", "27,665", "21,499",
"13,858", "19,621", "24,313", "22,539", "24,383", "23,210",
"16,254", "19,621", "12,182", "25,977", "18,819", "29,366",
"13,811", "18,540", "15,490", "19,890", "22,972", "26,543",
"27,542", "21,030", "27,208", "37,685", "22,284", "", ""),
Employment.rate = c("% of the working age population (15-64)",
"", "72.3", "71.73", "62.01", "71.68", "59.32", "65", "73.44",
"61.02", "68.15", "63.99", "71.1", "59.55", "55.4", "78.17",
"59.96", "59.21", "56.89", "70.11", "63.31", "65.21", "60.39",
"74.67", "72.34", "75.31", "59.26", "65.55", "58.76", "66.2",
"58.55", "72.73", "78.59", "46.29", "69.51", "66.71", "64.52",
"", ""),
Long.term.unemployment.rate = c("% of people, aged 15-64, who are not working but have been actively seeking a job for over a year",
"", "1", "1.13", "4.07", "0.97", "2.98375", "3.19", "1.44",
"7.84", "2.01", "3.75", "3.4", "5.73", "5.68", "1.35", "6.74",
"1.85", "4.13", "1.99", "0.01", "1.29", "0.13", "1.24", "0.6",
"0.34", "2.49", "5.97", "8.56", "3.21", "9.1", "1.42", "1.49",
"3.11", "2.59", "2.85", "2.74", "", ""),
Quality.of.support.network = c("% of people who have friends or relatives to rely on in case of need",
"", "95.4", "94.6", "92.6", "95.3", "85.2", "88.9", "96.8",
"84.6", "93.4", "93.9", "93.5", "86.1", "88.6", "97.6", "97.3",
"93", "86", "89.7", "79.8", "95", "87.1", "94.8", "97.1",
"93.1", "92.2", "83.3", "89.6", "90.7", "94.1", "96.2", "93.2",
"78.8", "94.9", "92.3", "91.1", "", ""),
Educational.attainment = c("% of people, aged 15-64, having at least an upper-secondary (high-school) degree",
"", "69.72", "81.04", "69.58", "87.07", "67.97", "90.9",
"74.56", "88.48", "81.07", "69.96", "85.33", "61.07", "79.7",
"64.13", "69.45", "81.23", "53.31", "87", "79.14", "67.94",
"33.55", "73.29", "72.05", "80.7", "87.15", "28.25", "89.93",
"82.04", "51.23", "85.04", "86.81", "30.31", "69.63", "88.7",
"72.95", "", ""),
Students.reading.skills = c("Average reading performance of students aged 15, according to PISA",
"", "515", "470", "506", "524", "449", "478", "495", "501",
"536", "496", "497", "483", "494", "500", "496", "474", "486",
"520", "539", "472", "425", "508", "521", "503", "500", "489",
"477", "483", "481", "497", "501", "464", "494", "500", "493",
"", ""),
Air.pollution = c("Average concentration of particulate matter (PM10) in cities with population larger than 100 000, measured in micrograms per cubic meter",
"", "14.28", "29.03", "21.27", "15", "61.55", "18.5", "16.26",
"12.62", "14.87", "12.94", "16.21", "32", "15.6", "14.47",
"12.54", "27.57", "23.33", "27.14", "30.76", "12.63", "32.69",
"30.76", "11.93", "15.85", "35.07", "21", "13.14", "29.03",
"27.56", "10.52", "22.36", "37.06", "12.67", "19.4", "21.99",
"", ""),
Consultation.on.rule.making = c("Composite index, increasing with the number of key elements of formal consultation processes",
"", "10.5", "7.13", "4.5", "10.5", "2", "6.75", "7", "3.25",
"9", "3.5", "4.5", "6.5", "7.88", "5.13", "9", "2.5", "5",
"7.25", "10.38", "6", "9", "6.13", "10.25", "8.13", "10.75",
"6.5", "6.63", "10.25", "7.25", "10.88", "8.38", "5.5", "11.5",
"8.25", "7.28", "", ""),
Voter.turnout = c("Number of people voting as % of the registered population ",
"", "95", "82", "91", "60", "88", "64", "87", "62", "74",
"84", "78", "74", "64", "84", "67", "65", "81", "67", "63",
"57", "59", "80", "79", "77", "54", "64", "55", "63", "75",
"82", "48", "84", "61", "90", "72", "", ""),
Life.expectancy = c("Average number of years a person can expect to live",
"", "81.5", "80.5", "79.8", "80.7", "77.8", "77.3", "78.8",
"73.9", "79.9", "81", "80.2", "80", "73.8", "81.3", "79.9",
"81.1", "81.5", "82.7", "79.9", "80.6", "75.1", "80.2", "80.4",
"80.6", "75.6", "79.3", "74.8", "78.8", "81.2", "81.2", "82.2",
"73.6", "79.7", "77.9", "79.2", "", ""),
Self.reported.health = c("% of people reporting their health to be \"good or very good\"",
"", "84.9", "69.6", "76.7", "88.1", "56.2", "68.2", "74.3",
"56.3", "67.7", "72.4", "64.7", "76.4", "55.2", "80.6", "84.4",
"79.7", "63.4", "32.7", "43.7", "74", "65.5", "80.6", "89.7",
"80", "57.7", "48.6", "31.1", "58.8", "69.8", "79.1", "80.95",
"66.8", "76", "88", "69", "", ""),
Life.Satisfaction = c("Average self-evaluation of life satisfaction, on a scale from 0 to 10",
"", "7.5", "7.3", "6.9", "7.7", "6.6", "6.2", "7.8", "5.1",
"7.4", "6.8", "6.7", "5.8", "4.7", "6.9", "7.3", "7.4", "6.4",
"6.1", "6.1", "7.1", "6.8", "7.5", "7.2", "7.6", "5.8", "4.9",
"6.1", "6.1", "6.2", "7.5", "7.5", "5.5", "7", "7.2", "6.7",
"", ""),
Homicide.rate = c("Average number of reported homicides per 100 000 people",
"", "1.2", "0.5", "1.8", "1.7", "8.1", "2", "1.4", "6.3",
"2.5", "1.4", "0.8", "1.1", "1.5", "0", "2", "2.4", "1.2",
"0.5", "2.3", "1.5", "11.6", "1", "1.3", "0.6", "1.2", "1.2",
"1.7", "0.5", "0.9", "0.9", "0.7", "2.9", "2.6", "5.2", "2.1",
"", ""),
Assault.rate = c("% of people who report having been assaulted in the previous year",
"", "2.1", "3", "7.3", "1.4", "9.5", "3.5", "3.9", "6.2",
"2.4", "4.9", "3.6", "3.8", "3.8", "2.7", "2.7", "3.1", "4.7",
"1.6", "2.1", "4.3", "14.8", "5", "2.3", "3.3", "2.2", "6.2",
"3.5", "3.9", "4.2", "5.2", "4.2", "6", "1.9", "1.6", "4.1",
"", "")),
.Names = c("X", "INDICATOR", "Rooms.per.person", "Dwelling.without.basic.facilities",
"Household.disposable.income", "Employment.rate",
"Long.term.unemployment.rate", "Quality.of.support.network",
"Educational.attainment", "Students.reading.skills", "Air.pollution",
"Consultation.on.rule.making", "Voter.turnout", "Life.expectancy",
"Self.reported.health", "Life.Satisfaction", "Homicide.rate",
"Assault.rate"), class = "data.frame", row.names = c(NA, -39L))
Did I melt the data frame incorrectly? The row indices do not appear to be in the correct order.
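One likely cause, offered as a sketch rather than a confirmed diagnosis: the aesthetics refer to the standalone vector `indicator` (26 values, one per country), while the melted data frame has 182 rows (26 countries x 7 variables). Keeping the country name as an id column in the long table and mapping the aesthetics to that column avoids the length mismatch. The toy values below are made up, the reshape is written in base R as a stand-in for `melt(employment.data, id.vars = "INDICATOR")`, and the plotting part only runs if ggplot2 is installed.

```r
# Toy stand-in for employment.data: 3 countries, 2 rescaled indicators
employment.data <- data.frame(
  Employment.rate = c(92, 91, 79),
  Voter.turnout   = c(100, 86, 96),
  INDICATOR       = c("Australia", "Austria", "Belgium"),
  stringsAsFactors = FALSE
)
indicator <- employment.data$INDICATOR  # one value per country, as in the question

# Base-R stand-in for melt(employment.data, id.vars = "INDICATOR"):
# one row per (country, variable) pair, with the country name repeated
value_cols <- setdiff(names(employment.data), "INDICATOR")
employment.melt <- data.frame(
  INDICATOR = rep(employment.data$INDICATOR, times = length(value_cols)),
  variable  = factor(rep(value_cols, each = nrow(employment.data)),
                     levels = value_cols),
  value     = unlist(employment.data[value_cols], use.names = FALSE)
)

nrow(employment.melt)  # rows in the long table
length(indicator)      # shorter than the long table, hence the error

# Map the aesthetics to the INDICATOR column of the long table instead
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  sg <- ggplot(employment.melt,
               aes(variable, value, group = INDICATOR,
                   colour = INDICATOR, label = INDICATOR)) +
    geom_line() +
    theme(legend.position = "none")
  # print(sg) draws the plot
}
```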

Counting the Occurence of Hexadecimal Numbers - R

So I have a file which contains a large number of hexadecimal digits in pairs, with "??" as the 'NA'/missing-data symbol.
A4 BB 08 6F E7 88 D9 10 11 12 AC CB C8 CC #Row of data in the file.
?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? #Row of missing data in the file.
I'm attempting to read all of that in and get some insight into the frequency of each hexadecimal value from 00 to FF (0 to 255). So far I have read it into a structure using read.table (call it test), and I'm really not sure what to do from there. I've tried a number of different things to drop the lines with "??" in any column, convert the rest to hex values, and get something useful out of them. If anyone can point me towards the tools I need to complete this task, I'd much appreciate it.
Edit:
As requested, here is the output of dput:
structure(list(V2 = structure(c(88L, 209L, 124L, 91L, 132L, 235L
), .Label = c("??", "00", "01", "02", "03", "04", "05", "06",
"07", "08", "09", "0A", "0B", "0C", "0D", "0E", "0F", "10", "11",
"12", "13", "14", "15", "16", "17", "18", "19", "1A", "1B", "1C",
"1D", "1E", "1F", "20", "21", "22", "23", "24", "25", "26", "27",
"28", "29", "2A", "2B", "2C", "2D", "2E", "2F", "30", "31", "32",
"33", "34", "35", "36", "37", "38", "39", "3A", "3B", "3C", "3D",
"3E", "3F", "40", "41", "42", "43", "44", "45", "46", "47", "48",
"49", "4A", "4B", "4C", "4D", "4E", "4F", "50", "51", "52", "53",
"54", "55", "56", "57", "58", "59", "5A", "5B", "5C", "5D", "5E",
"5F", "60", "61", "62", "63", "64", "65", "66", "67", "68", "69",
"6A", "6B", "6C", "6D", "6E", "6F", "70", "71", "72", "73", "74",
"75", "76", "77", "78", "79", "7A", "7B", "7C", "7D", "7E", "7F",
"80", "81", "82", "83", "84", "85", "86", "87", "88", "89", "8A",
"8B", "8C", "8D", "8E", "8F", "90", "91", "92", "93", "94", "95",
"96", "97", "98", "99", "9A", "9B", "9C", "9D", "9E", "9F", "A0",
"A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9", "AA", "AB",
"AC", "AD", "AE", "AF", "B0", "B1", "B2", "B3", "B4", "B5", "B6",
"B7", "B8", "B9", "BA", "BB", "BC", "BD", "BE", "BF", "C0", "C1",
"C2", "C3", "C4", "C5", "C6", "C7", "C8", "C9", "CA", "CB", "CC",
"CD", "CE", "CF", "D0", "D1", "D2", "D3", "D4", "D5", "D6", "D7",
"D8", "D9", "DA", "DB", "DC", "DD", "DE", "DF", "E0", "E1", "E2",
"E3", "E4", "E5", "E6", "E7", "E8", "E9", "EA", "EB", "EC", "ED",
"EE", "EF", "F0", "F1", "F2", "F3", "F4", "F5", "F6", "F7", "F8",
"F9", "FA", "FB", "FC", "FD", "FE", "FF"), class = "factor"),
There are a number of other columns as well. I left them off as they have the same ~257 values for labels give or take a hex value here or there.
as.hexmode(names(test)) resulted in the same issue, couldn't coerce 'x' to hexmode.
Edit: Okay I had some success and I got it to do what I wanted it to do more or less.
First I wanted to merge the columns as I just wanted an overall count of the occurrences (this may even have been unnecessary)
test2 <-
c(as.character(test[,1]),as.character(test[,2]),as.character(test[,3]),as.character(test[,4]),
as.character(test[,5]), as.character(test[,6]), as.character(test[,7]),
as.character(test[,8]), as.character(test[,9]), as.character(test[,10]),
as.character(test[,11]), as.character(test[,12]), as.character(test[,13]),
as.character(test[,14]), as.character(test[,15]), as.character(test[,16]))
Then I just wanted the counts of each value:
table(test2)
No conversion to integers or any such shenanigans necessary. I feel more than a little dumb, but oh well. I am still curious though if there's a better way to get the overall count across all rows and columns of each value as the way I did it seems clunky.
Edit:
The ultimate answer was (going with my original naming convention):
table(unlist(lapply(test, as.character)))
Thank you BondedDust.
See if you get some success with:
as.hexmode ( names(test) )
The output you offer suggests that a table object has been created and that the first row would be the names (in character mode) of the entries seen below those hex characters. It remains unclear whether you are showing the content of an external text file or output on the console, so this may be a wild guess.
> res <- scan(what="")
1: A4 BB 08 6F E7 88 D9 10 11 12 AC CB C8 CC
15:
Read 14 items
> as.hexmode(res)
[1] "a4" "bb" "08" "6f" "e7" "88" "d9" "10" "11" "12" "ac" "cb" "c8" "cc"
> dput( as.hexmode(res) )
structure(c(164L, 187L, 8L, 111L, 231L, 136L, 217L, 16L, 17L,
18L, 172L, 203L, 200L, 204L), class = "hexmode")
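Putting the pieces of this thread together, the counting step can be sketched as follows. The small data frame `test` is a made-up stand-in for the result of read.table, and the name `all_levels` is introduced here only for illustration. The sketch drops the "??" marker, counts every byte value from 00 to FF (including those that never occur), and shows how to get integer values with `strtoi()` if they are needed.

```r
# Toy stand-in for the data frame returned by read.table()
test <- data.frame(
  V1 = c("A4", "??", "08"),
  V2 = c("BB", "??", "A4"),
  stringsAsFactors = TRUE
)

# Flatten every column into one character vector
bytes <- unlist(lapply(test, as.character), use.names = FALSE)

# Drop the missing-data marker
bytes <- bytes[bytes != "??"]

# Count each value; fixing the levels to 00..FF keeps zero counts for unseen bytes
all_levels <- sprintf("%02X", 0:255)
counts <- table(factor(bytes, levels = all_levels))

counts[counts > 0]         # only the values that actually occur
strtoi(bytes, base = 16L)  # integer value of each byte, if needed
```

If the file uses lower-case hex digits, applying `toupper()` to `bytes` first keeps them aligned with `all_levels`.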

Computing angle between two vectors (with one vector having a specific X,Y position)

I am trying to compute the angle between two vectors, where one vector is fixed and the other is constantly moving. I already know the math behind this, and I found this code earlier:
theta <- acos( sum(a*b) / ( sqrt(sum(a * a)) * sqrt(sum(b * b)) ) )
I tried defining my a as:
a<-c(503,391)
and my b as:
b <- NM[, c("X","Y")]
When I apply the theta function I get:
Warning message:
In acos(sum(a * b)/(sqrt(sum(a * a)) * sqrt(sum(b * b)))) : NaNs produced
I would appreciate help to solve this.
And here is my sample data:
structure(list(A = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label =
c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12",
"13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23",
"24", "25", "26", "27", "28", "29", "30", "31", "32", "33", "34",
"35", "36", "37", "38", "39", "40", "41", "42", "43", "44", "45",
"46", "47", "48", "49", "50", "51", "52", "53", "54", "55", "56",
"57", "58", "59", "60", "61", "62", "63", "64", "65", "66", "67",
"68", "69", "70", "71", "72", "73", "74", "75", "76", "77", "78",
"79", "80", "81", "82", "83", "84", "85", "86", "87", "88", "89",
"90", "91", "92", "93", "94", "95", "96", "97", "98", "99", "100",
"101", "102", "103", "104", "105", "106", "107", "108", "109",
"110"), class = "factor"), T = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6 ), X =
c(528.04, 528.04, 528.04, 528.04, 528.04, 528.04), Y = c(10.32,
10.32, 10.32, 10.32, 10.32, 10.32), V = c(0, 0, 0, 0, 0, 0),
GD = c(0, 0, 0, 0, 0, 0), ND = c(NA, 0, 0, 0, 0, 0), ND2 = c(NA,
0, 0, 0, 0, 0), TID = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("t1",
"t10", "t100", "t101", "t102", "t103", "t104", "t105", "t106",
"t107", "t108", "t109", "t11", "t110", "t12", "t13", "t14",
"t15", "t16", "t17", "t18", "t19", "t2", "t20", "t21", "t22",
"t23", "t24", "t25", "t26", "t27", "t28", "t29", "t3", "t30",
"t31", "t32", "t33", "t34", "t35", "t36", "t37", "t38", "t39",
"t4", "t40", "t41", "t42", "t43", "t44", "t45", "t46", "t47",
"t48", "t49", "t5", "t50", "t51", "t52", "t53", "t54", "t55",
"t56", "t57", "t58", "t59", "t6", "t60", "t61", "t62", "t63",
"t64", "t65", "t66", "t67", "t68", "t69", "t7", "t70", "t71",
"t72", "t73", "t74", "t75", "t76", "t77", "t78", "t79", "t8",
"t80", "t81", "t82", "t83", "t84", "t85", "t86", "t87", "t88",
"t89", "t9", "t90", "t91", "t92", "t93", "t94", "t95", "t96",
"t97", "t98", "t99"), class = "factor")), .Names = c("A", "T", "X", "Y", "V", "GD", "ND", "ND2", "TID"), row.names = c(NA, 6L),
class = "data.frame")
Your function is not vectorized. Try this:
# Angle (in radians) between the fixed vector x and each row of Y
theta <- function(x, Y) {
  apply(Y, 1, function(y) acos(sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))))
}
a<-c(503,391)
b <- DF[, c("X","Y")]
theta(a,b)
# 1 2 3 4 5 6
#0.6412264 0.6412264 0.6412264 0.6412264 0.6412264 0.6412264
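The same row-wise calculation can also be written without apply(). This is a sketch of an alternative, not the answer's own code, and the name `theta_rows` is hypothetical. It also clamps the cosine to [-1, 1], because rounding can push it slightly outside that range for (nearly) parallel vectors, in which case acos() returns NaN.

```r
# Angle (in radians) between the fixed vector a and each row of B
theta_rows <- function(a, B) {
  B <- as.matrix(B)
  # Row-wise dot products divided by the product of the vector lengths
  cosang <- as.vector(B %*% a) / (sqrt(sum(a^2)) * sqrt(rowSums(B^2)))
  # Clamp to [-1, 1] so rounding error cannot produce NaN in acos()
  acos(pmin(pmax(cosang, -1), 1))
}

a <- c(503, 391)
b <- data.frame(X = c(528.04, 528.04), Y = c(10.32, 10.32))
theta_rows(a, b)
```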
The acos and atan functions are a problem in this application because neither covers the full circle: acos only returns angles in [0, pi] and atan only angles in (-pi/2, pi/2). In 2D you need two values to specify a vector, and likewise you need two values (sin and cos) to pin down its direction over the full range up to 2*pi. Here is an example of the acos problem:
plot(seq(1,10,pi/20)) ## A sequence of numbers
plot(cos(seq(1,10,pi/20))) ## Their cosines
plot(acos(cos(seq(1,10,pi/20)))) ## NOT Back to the original sequence
Here's an idea:
angle <- circular::coord2rad(x, y)
plot(angle)
where the point (x, y) lies in the direction "angle", and
as.numeric(angle)
gives the angle in radians (0 to 2*pi). To report geographical directions, convert to degrees, and so on, you can pass extra parameters to the circular function, e.g.:
x <- coord2rad(ea,eo, control.circular = list(type = "directions",units = "degrees"))
plot(x)
as.numeric(x)
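If installing the circular package is not an option, base R's `atan2()` also covers the full circle. The following is a sketch of that alternative, not part of the answer above; the function name `signed_angle` and the sample rows are made up for illustration. It returns the signed angle from the fixed vector to each row, in (-pi, pi], and the last line maps it to [0, 2*pi).

```r
# Signed angle (radians, in (-pi, pi]) from vector a to each row of B
signed_angle <- function(a, B) {
  B <- as.matrix(B)
  cross <- a[1] * B[, 2] - a[2] * B[, 1]  # z-component of the 2D cross product
  dot   <- as.vector(B %*% a)             # dot product of a with each row
  atan2(cross, dot)
}

a <- c(503, 391)
b <- data.frame(X = c(528.04, -391), Y = c(10.32, 503))
ang <- signed_angle(a, b)

ang              # negative values mean a clockwise turn from a
ang %% (2 * pi)  # the same angles expressed in [0, 2*pi)
```

The absolute value of the signed angle equals the acos-based result, so the two approaches agree whenever the direction of rotation does not matter.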
