R: understanding the behaviour of as.matrix() and matrix() while plotting an image
I am reading data from the MNIST handwritten digit dataset. Its first column is the digit label and the remaining 784 columns (28 x 28) are pixel values in the range [0, 255]. Thus, there are 1 + 784 = 785 columns in total. To plot a row as an image, I have to convert the 784 pixel values into a 28 x 28 matrix. In this conversion, I am not clear about the behaviour of the as.matrix() and matrix() functions. The code and comments are as follows:
> train <- read.csv("train.csv", header=TRUE)
# Read 2nd row. Ignore the label col and read rest of 784 columns
> data<-train[2,2:785]
# Convert data into matrix of 28X28
> data<-as.matrix(data,nrow=28,ncol=28)
> dim(data)
[1] 1 784 <= Failed to convert to 28X28
> class(data)
[1] "matrix" <= But class is matrix
# as.matrix() failed. Use matrix()
> data<-matrix(data,nrow=28,ncol=28)
> dim(data)
[1] 28 28 <= Matrix conversion success
Probably, the solution is to ignore as.matrix() altogether and just use matrix(). But, it so happens that in plotting the image as.matrix() does serve as a necessary intermediary. For example, the following code works to plot the image of digit zero:
train <- read.csv("train.csv", header=TRUE)
data<-train[2,2:785]
data<-as.matrix(data,nrow=28,ncol=28)
data<-matrix(data,nrow=28,ncol=28)
##Color ramp def.
colors<-c('white','black')
cus_col<-colorRampPalette(colors=colors)
image(1:28,1:28,data,main="IInd row",col=cus_col(256))
But, in the following code where I have removed as.matrix(), the code gives an error:
> train <- read.csv("train.csv", header=TRUE)
> data<-train[2,2:785]
> data<-matrix(data,nrow=28,ncol=28)
> ##Color ramp def.
> colors<-c('white','black')
> cus_col<-colorRampPalette(colors=colors)
> image(1:28,1:28,data,main="IInd row",col=cus_col(256))
Error in is.finite(z) : default method not implemented for type 'list'
I am unable to understand the role of as.matrix() and the meaning of this error. Kindly do let me know what is wrong.
EDITED
Here is what the first three rows of data look like:
label,pixel0,pixel1,pixel2,pixel3,pixel4,pixel5,pixel6,pixel7,pixel8,pixel9,pixel10,pixel11,pixel12,pixel13,pixel14,pixel15,pixel16,pixel17,pixel18,pixel19,pixel20,pixel21,pixel22,pixel23,pixel24,pixel25,pixel26,pixel27,pixel28,pixel29,pixel30,pixel31,pixel32,pixel33,pixel34,pixel35,pixel36,pixel37,pixel38,pixel39,pixel40,pixel41,pixel42,pixel43,pixel44,pixel45,pixel46,pixel47,pixel48,pixel49,pixel50,pixel51,pixel52,pixel53,pixel54,pixel55,pixel56,pixel57,pixel58,pixel59,pixel60,pixel61,pixel62,pixel63,pixel64,pixel65,pixel66,pixel67,pixel68,pixel69,pixel70,pixel71,pixel72,pixel73,pixel74,pixel75,pixel76,pixel77,pixel78,pixel79,pixel80,pixel81,pixel82,pixel83,pixel84,pixel85,pixel86,pixel87,pixel88,pixel89,pixel90,pixel91,pixel92,pixel93,pixel94,pixel95,pixel96,pixel97,pixel98,pixel99,pixel100,pixel101,pixel102,pixel103,pixel104,pixel105,pixel106,pixel107,pixel108,pixel109,pixel110,pixel111,pixel112,pixel113,pixel114,pixel115,pixel116,pixel117,pixel118,pixel119,pixel120,pixel121,pixel122,pixel123,pixel124,pixel125,pixel126,pixel127,pixel128,pixel129,pixel130,pixel131,pixel132,pixel133,pixel134,pixel135,pixel136,pixel137,pixel138,pixel139,pixel140,pixel141,pixel142,pixel143,pixel144,pixel145,pixel146,pixel147,pixel148,pixel149,pixel150,pixel151,pixel152,pixel153,pixel154,pixel155,pixel156,pixel157,pixel158,pixel159,pixel160,pixel161,pixel162,pixel163,pixel164,pixel165,pixel166,pixel167,pixel168,pixel169,pixel170,pixel171,pixel172,pixel173,pixel174,pixel175,pixel176,pixel177,pixel178,pixel179,pixel180,pixel181,pixel182,pixel183,pixel184,pixel185,pixel186,pixel187,pixel188,pixel189,pixel190,pixel191,pixel192,pixel193,pixel194,pixel195,pixel196,pixel197,pixel198,pixel199,pixel200,pixel201,pixel202,pixel203,pixel204,pixel205,pixel206,pixel207,pixel208,pixel209,pixel210,pixel211,pixel212,pixel213,pixel214,pixel215,pixel216,pixel217,pixel218,pixel219,pixel220,pixel221,pixel222,pixel223,pixel224,pixel225,pixel226,pixel227,pixel228,pixel229,pixel230,pixel231,pixel232,pixel23
3,pixel234,pixel235,pixel236,pixel237,pixel238,pixel239,pixel240,pixel241,pixel242,pixel243,pixel244,pixel245,pixel246,pixel247,pixel248,pixel249,pixel250,pixel251,pixel252,pixel253,pixel254,pixel255,pixel256,pixel257,pixel258,pixel259,pixel260,pixel261,pixel262,pixel263,pixel264,pixel265,pixel266,pixel267,pixel268,pixel269,pixel270,pixel271,pixel272,pixel273,pixel274,pixel275,pixel276,pixel277,pixel278,pixel279,pixel280,pixel281,pixel282,pixel283,pixel284,pixel285,pixel286,pixel287,pixel288,pixel289,pixel290,pixel291,pixel292,pixel293,pixel294,pixel295,pixel296,pixel297,pixel298,pixel299,pixel300,pixel301,pixel302,pixel303,pixel304,pixel305,pixel306,pixel307,pixel308,pixel309,pixel310,pixel311,pixel312,pixel313,pixel314,pixel315,pixel316,pixel317,pixel318,pixel319,pixel320,pixel321,pixel322,pixel323,pixel324,pixel325,pixel326,pixel327,pixel328,pixel329,pixel330,pixel331,pixel332,pixel333,pixel334,pixel335,pixel336,pixel337,pixel338,pixel339,pixel340,pixel341,pixel342,pixel343,pixel344,pixel345,pixel346,pixel347,pixel348,pixel349,pixel350,pixel351,pixel352,pixel353,pixel354,pixel355,pixel356,pixel357,pixel358,pixel359,pixel360,pixel361,pixel362,pixel363,pixel364,pixel365,pixel366,pixel367,pixel368,pixel369,pixel370,pixel371,pixel372,pixel373,pixel374,pixel375,pixel376,pixel377,pixel378,pixel379,pixel380,pixel381,pixel382,pixel383,pixel384,pixel385,pixel386,pixel387,pixel388,pixel389,pixel390,pixel391,pixel392,pixel393,pixel394,pixel395,pixel396,pixel397,pixel398,pixel399,pixel400,pixel401,pixel402,pixel403,pixel404,pixel405,pixel406,pixel407,pixel408,pixel409,pixel410,pixel411,pixel412,pixel413,pixel414,pixel415,pixel416,pixel417,pixel418,pixel419,pixel420,pixel421,pixel422,pixel423,pixel424,pixel425,pixel426,pixel427,pixel428,pixel429,pixel430,pixel431,pixel432,pixel433,pixel434,pixel435,pixel436,pixel437,pixel438,pixel439,pixel440,pixel441,pixel442,pixel443,pixel444,pixel445,pixel446,pixel447,pixel448,pixel449,pixel450,pixel451,pixel452,pixel453,pixel454,pixel455,
pixel456,pixel457,pixel458,pixel459,pixel460,pixel461,pixel462,pixel463,pixel464,pixel465,pixel466,pixel467,pixel468,pixel469,pixel470,pixel471,pixel472,pixel473,pixel474,pixel475,pixel476,pixel477,pixel478,pixel479,pixel480,pixel481,pixel482,pixel483,pixel484,pixel485,pixel486,pixel487,pixel488,pixel489,pixel490,pixel491,pixel492,pixel493,pixel494,pixel495,pixel496,pixel497,pixel498,pixel499,pixel500,pixel501,pixel502,pixel503,pixel504,pixel505,pixel506,pixel507,pixel508,pixel509,pixel510,pixel511,pixel512,pixel513,pixel514,pixel515,pixel516,pixel517,pixel518,pixel519,pixel520,pixel521,pixel522,pixel523,pixel524,pixel525,pixel526,pixel527,pixel528,pixel529,pixel530,pixel531,pixel532,pixel533,pixel534,pixel535,pixel536,pixel537,pixel538,pixel539,pixel540,pixel541,pixel542,pixel543,pixel544,pixel545,pixel546,pixel547,pixel548,pixel549,pixel550,pixel551,pixel552,pixel553,pixel554,pixel555,pixel556,pixel557,pixel558,pixel559,pixel560,pixel561,pixel562,pixel563,pixel564,pixel565,pixel566,pixel567,pixel568,pixel569,pixel570,pixel571,pixel572,pixel573,pixel574,pixel575,pixel576,pixel577,pixel578,pixel579,pixel580,pixel581,pixel582,pixel583,pixel584,pixel585,pixel586,pixel587,pixel588,pixel589,pixel590,pixel591,pixel592,pixel593,pixel594,pixel595,pixel596,pixel597,pixel598,pixel599,pixel600,pixel601,pixel602,pixel603,pixel604,pixel605,pixel606,pixel607,pixel608,pixel609,pixel610,pixel611,pixel612,pixel613,pixel614,pixel615,pixel616,pixel617,pixel618,pixel619,pixel620,pixel621,pixel622,pixel623,pixel624,pixel625,pixel626,pixel627,pixel628,pixel629,pixel630,pixel631,pixel632,pixel633,pixel634,pixel635,pixel636,pixel637,pixel638,pixel639,pixel640,pixel641,pixel642,pixel643,pixel644,pixel645,pixel646,pixel647,pixel648,pixel649,pixel650,pixel651,pixel652,pixel653,pixel654,pixel655,pixel656,pixel657,pixel658,pixel659,pixel660,pixel661,pixel662,pixel663,pixel664,pixel665,pixel666,pixel667,pixel668,pixel669,pixel670,pixel671,pixel672,pixel673,pixel674,pixel675,pixel676,pixel677,pi
xel678,pixel679,pixel680,pixel681,pixel682,pixel683,pixel684,pixel685,pixel686,pixel687,pixel688,pixel689,pixel690,pixel691,pixel692,pixel693,pixel694,pixel695,pixel696,pixel697,pixel698,pixel699,pixel700,pixel701,pixel702,pixel703,pixel704,pixel705,pixel706,pixel707,pixel708,pixel709,pixel710,pixel711,pixel712,pixel713,pixel714,pixel715,pixel716,pixel717,pixel718,pixel719,pixel720,pixel721,pixel722,pixel723,pixel724,pixel725,pixel726,pixel727,pixel728,pixel729,pixel730,pixel731,pixel732,pixel733,pixel734,pixel735,pixel736,pixel737,pixel738,pixel739,pixel740,pixel741,pixel742,pixel743,pixel744,pixel745,pixel746,pixel747,pixel748,pixel749,pixel750,pixel751,pixel752,pixel753,pixel754,pixel755,pixel756,pixel757,pixel758,pixel759,pixel760,pixel761,pixel762,pixel763,pixel764,pixel765,pixel766,pixel767,pixel768,pixel769,pixel770,pixel771,pixel772,pixel773,pixel774,pixel775,pixel776,pixel777,pixel778,pixel779,pixel780,pixel781,pixel782,pixel783
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,188,255,94,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,191,250,253,93,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,123,248,253,167,10,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,80,247,253,208,13,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,29,207,253,235,77,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,54,209,253,253,88,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,93,254,253,238,170,17,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,23,210,254,253,159,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,16,209,253,254,240,81,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,27,253,253,254,13,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,20,206,254,254,198,7,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,168,253,253,196,7,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,20,203,253,248,76,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,22,188,253,245,93,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,103,253,253,191,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,89,240,253,195,25,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,15,220,253,253,80,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,94,253,253,253,94,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,89,251,253,250,131,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,214,218,95,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,18,30,137,137,192,86,72,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,13,86,250,254,254,254,254,217,246,151,32,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,16,179,254,254,254,254,254,254,254,254,254,231,54,15,0,0,0,0,0,0,0,0,0,0,0,0,0,0,72,254,254,254,254,254,254,254,254,254,254,254,254,104,0,0,0,0,0,0,0,0,0,0,0,0,0,61,191,254,254,254,254,254,109,83,199,254,254,254,254,243,85,0,0,0,0,0,0,0,0,0,0,0,0,172,254,254,254,202,147,147,45,0,11,29,200,254,254,254,171,0,0,0,0,0,0,0,0,0,0,0,1,174,254,254,89,67,0,0,0,0,0,0,128,252,254,254,212,76,0,0,0,0,0,0,0,0,0,0,47,254,254,254,29,0,0,0,0,0,0,0,0,83,254,254,254,153,0,0,0,0,0,0,0,0,0,0,80,254,254,240,24,0,0,0,0,0,0,0,0,25,240,254,254,153,0,0,0,0,0,0,0,0,0,0,64,254,254,186,7,0,0,0,0,0,0,0,0,0,166,254,254,224,12,0,0,0,0,0,0,0,0,14,232,254,254,254,29,0,0,0,0,0,0,0,0,0,75,254,254,254,17,0,0,0,0,0,0,0,0,18,254,254,254,254,29,0,0,0,0,0,0,0,0,0,48,254,254,254,17,0,0,0,0,0,0,0,0,2,163,254,254,254,29,0,0,0,0,0,0,0,0,0,48,254,254,254,17,0,0,0,0,0,0,0,0,0,94,254,254,254,200,12,0,0,0,0,0,0,0,16,209,254,254,150,1,0,0,0,0,0,0,0,0,0,15,206,254,254,254,202,66,0,0,0,0,0,21,161,254,254,245,31,0,0,0,0,0,0,0,0,0,0,0,60,212,254,254,254,194,48,48,34,41,48,209,254,254,254,171,0,0,0,0,0,0,0,0,0,0,0,0,0,86,243,254,254,254,254,254,233,243,254,254,254,254,254,86,0,0,0,0,0,0,0,0,0,0,0,0,0,0,114,254,254,254,254,254,254,254,254,254,254,239,86,11,0,0,0,0,0,0,0,0,0,0,0,0,0,0,13,182,254,254,254,254,254,254,254,254,243,70,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,76,146,254,255,254,255,146,19,15,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
Difference is explained in the documentation.
https://stat.ethz.ch/R-manual/R-devel/library/base/html/matrix.html
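Also, all elements of a matrix must have the same type (numeric, character, etc.). In a data frame, each column can have its own type. When the columns of a data frame have different types, as.matrix() coerces them all to one common type (for example, everything becomes character), which may not be what you want.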
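To make the difference concrete, here is a minimal sketch that uses a made-up 1 x 784 data frame in place of a real MNIST row (row_df and its random pixel values are assumptions for illustration, not the asker's data). As far as I can tell, the data frame method of as.matrix() has no nrow/ncol arguments, so they are silently ignored, while matrix() treats the data frame as a list and builds a list-matrix, which would explain the "type 'list'" error from image().

```r
# Hypothetical stand-in for one MNIST row: a 1 x 784 data frame of pixel values.
set.seed(1)
row_df <- as.data.frame(matrix(sample(0:255, 784, replace = TRUE), nrow = 1))

# as.matrix() keeps the data frame's own shape; nrow/ncol are not used.
m1 <- as.matrix(row_df, nrow = 28, ncol = 28)
dim(m1)       # 1 784
typeof(m1)    # "integer"

# matrix() on a data frame reshapes the underlying list of columns.
m2 <- matrix(row_df, nrow = 28, ncol = 28)
dim(m2)       # 28 28
typeof(m2)    # "list"

# Flatten to a plain vector first, then reshape in one step.
m3 <- matrix(unlist(row_df, use.names = FALSE), nrow = 28, ncol = 28)
typeof(m3)    # "integer"
```

A matrix built like m3 is numeric, so it can be passed to image() in the same way as in the question, without the intermediate as.matrix() call.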
Related
R studio doesn't find objects in my function
I’m new to programming and I’m currently writing a function to go through hundreds of csv files in the working directory. The files have tons of NA values in them. The function (which I call corr) has two parameters: the directory and a threshold value (a numeric vector of length 1 indicating the number of complete cases). The purpose of the function is to take the complete cases for two columns, sulfate and nitrate (the second and third columns in the spreadsheet), and calculate the correlation between them if the number of complete cases is greater than the threshold parameter. The function should return a vector with the correlations for the files that meet the threshold requirement (the default threshold value is 0). When I run the code I get one of the following: 1. a + sign in the console, or 2. the objects I created in the function can't be found. Any help would be much appreciated. Thank you in advance!
corr <- function(directory, threshold=0){
  filelist2<- data.frame(list.files(path=directory, pattern=".csv", full.names=TRUE))
  corvector <- numeric()
  for(i in 1:length(filelist2)){
    data <-data.frame(read.csv(filelist2[i]))
    removedNA<-complete.cases(data)
    newdata<-data[removedNA,2:3]
    if(nrow(removedNA) > threshold){
      corvector<-c(corvector, cor(data$sulfate, data$nitrate ))
    }
  }
  corvector
}
I don't think your nrow(removedNA) does what you think it does. To replicate the example I use the mtcars dataset.
data <- mtcars # create dataset
data[2:4, 2] <- NA # create some missings in column 2
data[15:17, 3] <- NA # create some missings in column 3
removedNA <- complete.cases(data)
table(removedNA) # 6 missings indeed
nrow(removedNA) # NULL: removedNA is a logical vector, not a data.frame, so nrow() doesn't work
newdata <- data[removedNA, 2:3] # this works though
nrow(newdata) # and this shows the number of rows in 'newdata'
#---- therefore instead of nrow(removedNA) try
if(nrow(data)-nrow(newdata) < threshold) { ... }
NB: I changed the > to < in the line with threshold. I guess it depends on whether you want the threshold to be an absolute minimum number of complete rows (in which case you could simply use nrow(newdata) > threshold), or whether you want the threshold to reflect the difference in the number of rows between the original data and the 'new' data.
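Putting the points above together, a corrected version of the function might look like the sketch below. It assumes, as the question describes, that every csv file has columns named sulfate and nitrate, and it reads the threshold as a minimum number of complete rows; the anchored pattern "\\.csv$" and the variable names files and complete are my own choices, not part of the original code.

```r
# A minimal sketch, assuming each file has 'sulfate' and 'nitrate' columns.
corr <- function(directory, threshold = 0) {
  # Keep the file names as a plain character vector, not a data frame.
  files <- list.files(path = directory, pattern = "\\.csv$", full.names = TRUE)
  corvector <- numeric()
  for (f in files) {
    data <- read.csv(f)
    # Rows without any NA; nrow() works here because this is a data frame.
    complete <- data[complete.cases(data), ]
    if (nrow(complete) > threshold) {
      corvector <- c(corvector, cor(complete$sulfate, complete$nitrate))
    }
  }
  corvector
}
```

Note that the correlation is computed on the complete rows only; calling cor() on the unfiltered data, as in the question, would return NA whenever a file contains missing values.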
R in counting data
Right now I'm trying to draw a bell curve for a file called output9.csv. Here is my code. I want to use the z-score to detect outliers, using the difference between each value and the mean of the data set. The difference is compared with the standard deviation to find the outliers.
#DATA LOAD
data <- read.csv('output9.csv')
height <- data$Height
hist(height) #histogram
#POPULATION PARAMETER CALCULATIONS
pop_sd <- sd(height)*sqrt((length(height)-1)/(length(height)))
pop_mean <- mean(height)
But I get this error after I try the histogram part:
> hist(height)
Error in hist.default(height) : 'x' must be numeric
How should I fix this?
Since I don't have your data I can only guess. Can you provide it? Or at least a portion of it?
What class is your data? You can use class(data) to find out. The most common way is to have table-like data in data.frames. To extract one of your columns for use with hist you can use the $ operator. Be sure you pick a column that actually exists. You can use names(data) (if data is a data.frame) to find out what columns exist in your data. Use nrow(data) to find out how many rows there are in your data.
After extracting your height you can go further. First check that your height object is numeric and has something in it. You can use class(height) to find out.
As you posted in your comment you have the following names:
names(data)
# [1] "Host" "TimeStamp" "TimeZone" "Command" "RequestLink" "HTTP"
# [7] "ReplyCode" "Bytes"
Therefore you can extract your height with
height <- data$Bytes
Did you try to convert it to numeric? as.numeric(height) might do the trick. as.numeric() coerces values that are stored as characters but actually represent numbers. Try as.numeric("3") as an example.
Here is an example I made up.
height <- c(1,1,2,3,1)
class(height)
# [1] "numeric"
hist(height)
This works just fine, because the data is numeric. In the following the data are numbers but stored as characters.
height_char <- c("1","1","2","3","1")
class(height_char)
# [1] "character"
hist(height_char)
# Error in hist.default(height_char) : 'x' must be numeric
So you have to coerce it first:
hist(as.numeric(height_char))
...and then it works fine.
For future questions: try to give Minimal, Complete, and Verifiable Examples.
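One related pitfall is worth a short sketch: if the column was read in as a factor rather than as character (I am assuming this could be the case here; the question does not say), as.numeric() returns the internal level codes instead of the values. The vector height_fac below is invented for illustration.

```r
# Hypothetical column: numbers that ended up stored as a factor.
height_fac <- factor(c("150", "160", "160", "175"))

# Direct coercion gives the level codes, not the heights.
codes <- as.numeric(height_fac)
codes        # 1 2 2 3

# Going through character first recovers the actual values.
height_num <- as.numeric(as.character(height_fac))
height_num   # 150 160 160 175
is.numeric(height_num)   # TRUE
```

Once is.numeric() returns TRUE, the vector can be passed to hist() as in the examples above.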
xgb.DMatrix Error: The length of labels must equal to the number of rows in the input data
I am using xgboost in R. I created the xgb matrix fine using a matrix as input, but when I reduce the number of columns in the matrix data, I receive an error.
This works:
> dim(ctt1)
[1] 6401 5901
> xgbmat1 <- xgb.DMatrix( Matrix(data.matrix(ctt1)), label = as.matrix(as.numeric(data$V2)) - 1 )
This does not:
> dim(ctt1[,nr])
[1] 6401 1048
> xgbmat1 <- xgb.DMatrix( Matrix(data.matrix(ctt1[,nr])), label = as.matrix(as.numeric(data$V2)) - 1)
Error in xgb.setinfo(dmat, names(p), p[[1]]) : The length of labels must equal to the number of rows in the input data
In my case I fixed this error by changing the assignment of the labels: labels <- df_train$target_feature
It turns out that after removing some columns, some rows contain only 0s and cannot contribute to the model.
For sparse matrices, the xgboost R interface uses the CSC format creation method. The problem currently is that this method automatically determines the number of rows from the existing non-sparse values, so any completely sparse rows at the end are not counted. A similar loss of completely sparse columns at the end can happen with the CSR sparse format. For more details see xgboost issue #1223 and also Wikipedia on the sparse matrix formats.
The proper way to create the DMatrix is something like:
xgtrain <- xgb.DMatrix(data = as.matrix(X_train[,-5]), label = X_train$item_cnt_month)
Drop the label column from the data parameter and use the same data set to supply the label. In my data, column five is item_cnt_month, so I drop it at run time and refer to the same data set for the label column.
Before splitting your data, you need to turn it into a data frame. For example:
data <- read.csv(...)
data = as.data.frame(data)
Now you can set up your train data and test data to use in your "sparse.model.matrix" and "xgb.DMatrix".
Removing/parsing rows from a matrix in R
I'm trying to parse out specific rows from a data matrix. The actual data is numeric and comprises a single column. I've used this method before for other data, and I cannot figure out why this isn't working.
csize = data.matrix(wc$Csize)
length(csize)
[1] 134
csize[-111,][-110,][-107,][-105,][-104,][-94,][-88,][-68,][-58,][-57,][-56,][-30,][-22,][,1]
Error in csize[-111, ][-110, ] : incorrect number of dimensions
Here is the code that does work for me with other data:
w.pc.res <- prcomp(sizeshapew)
w.pcdata <- w.pc.res$x
length(w.pcdata)
[1] 11792
w.pcdata[-111,][-110,][-107,][-105,][-104,][-94,][-88,][-68,][-58,][-57,][-56,][-30,][-22,][,1]
I don't think it likes the multiple subscripting. Just provide the subscripts in a single vector, e.g. csize[c(-111, -110, ...),]
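A likely reason the chained version fails, sketched below with an invented stand-in for csize: indexing a one-column matrix with [-111, ] drops the result to a plain vector, so the next [-110, ] has no second dimension to index. The values in csize and the name drop_rows are assumptions for illustration; the row numbers are the ones from the question.

```r
# Hypothetical stand-in for the 134 x 1 matrix in the question.
csize <- matrix(seq_len(134), ncol = 1)

# With a single column, the result is no longer a matrix.
is.null(dim(csize[-111, ]))   # TRUE

# Remove all unwanted rows in one step; drop = FALSE keeps the matrix shape.
drop_rows <- c(22, 30, 56, 57, 58, 68, 88, 94, 104, 105, 107, 110, 111)
kept <- csize[-drop_rows, , drop = FALSE]
dim(kept)                     # 121 1
```

The same chained call works on w.pcdata presumably because that matrix has several columns, so each step still returns a matrix.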
R what does 2 commas mean?
I'm looking at an example for the knnflex package and they set up a training and test set using the following:
train <- rbind(iris3[1:25,,1], iris3[1:25,,2], iris3[1:25,,3])
test <- rbind(iris3[26:50,,1], iris3[26:50,,2], iris3[26:50,,3])
My question is how this differs from:
train <- rbind(iris3[1:25,1], iris3[1:25,2], iris3[1:25,3])
test <- rbind(iris3[26:50,1], iris3[26:50,2], iris3[26:50,3])
Two commas means there were more than two dimensions and you selected all of the items in the dimension that could have been specified between the two commas. For example, imagine a cube instead of a square, with all of the data in it. You can select row, height, and depth. If you select [row,,depth], then you will have selected an entire column in the cube at that row and depth. The principle is the same up to larger dimensions but harder to describe.
Why don't you just try?
> train <- rbind(iris3[1:25,,1], iris3[1:25,,2], iris3[1:25,,3])
> test <- rbind(iris3[26:50,,1], iris3[26:50,,2], iris3[26:50,,3])
> train <- rbind(iris3[1:25,1], iris3[1:25,2], iris3[1:25,3])
Error in iris3[1:25, 1] : incorrect number of dimensions
> test <- rbind(iris3[26:50,1], iris3[26:50,2], iris3[26:50,3])
Error in iris3[26:50, 1] : incorrect number of dimensions
More generally, leaving an index unspecified selects all entries for that index:
> mtx<-matrix(c(1,2,3,4),nrow=2)
> mtx
     [,1] [,2]
[1,]    1    3
[2,]    2    4
> mtx[1,]
[1] 1 3
> mtx[,1]
[1] 1 2
The difference is between iris and iris3. iris3 is a 3-dimensional array that contains the same data as iris but stores it differently. iris, by contrast, is a two-dimensional data frame. Check the link below: https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/iris.html
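The shapes involved can be checked directly. The sketch below uses only the built-in iris3 dataset; the variable names first_species and train are just for illustration.

```r
# iris3 is a 50 x 4 x 3 array: observations x measurements x species.
dim(iris3)                 # 50 4 3

# The empty middle index selects all four measurements for the first species.
first_species <- iris3[1:25, , 1]
dim(first_species)         # 25 4

# Stacking the three 25 x 4 slices gives a 75 x 4 training matrix.
train <- rbind(iris3[1:25, , 1], iris3[1:25, , 2], iris3[1:25, , 3])
dim(train)                 # 75 4
```

With only one comma, as in iris3[1:25, 1], R is asked to index a three-dimensional array with two subscripts, which is why it reports an incorrect number of dimensions.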