Moving the last column to the nth place in R [duplicate] - r

This question already has answers here:
How does one reorder columns in a data frame?
(12 answers)
Closed 2 years ago.
Good Day
I am trying to move the last column of a data frame in R so that it becomes the third column, and I was wondering what the most efficient way to do this would be.
My DataFrame structure is as follows:
str(HR)
'data.frame': 2940 obs. of 36 variables:
$ EmployeeNumber : int 1 2 3 4 5 6 7 8 9 10 ...
$ Attrition : Factor w/ 2 levels "No","Yes": 2 1 2 1 1 1 1 1 1 1 ...
$ Age : int 41 49 37 33 27 32 59 30 38 36 ...
$ BusinessTravel : Factor w/ 3 levels "Non-Travel","Travel_Frequently",..: 3 2 3 2 3 2 3 3 2 3
$ DailyRate : int 1102 279 1373 1392 591 1005 1324 1358 216 1299 ...
$ Department : Factor w/ 3 levels "Human Resources",..: 3 2 2 2 2 2 2 2 2 2 ...
$ DistanceFromHome : int 1 8 2 3 2 2 3 24 23 27 ...
$ Education : int 2 1 2 4 1 2 3 1 3 3 ...
$ EducationField : Factor w/ 6 levels "Human Resources",..: 2 2 5 2 4 2 4 2 2 4 ...
$ EmployeeCount : int 1 1 1 1 1 1 1 1 1 1 ...
$ EnvironmentSatisfaction : int 2 3 4 4 1 4 3 4 4 3 ...
$ Gender : Factor w/ 2 levels "Female","Male": 1 2 2 1 2 2 1 2 2 2 ...
$ HourlyRate : int 94 61 92 56 40 79 81 67 44 94 ...
$ JobInvolvement : int 3 2 2 3 3 3 4 3 2 3 ...
$ JobLevel : int 2 2 1 1 1 1 1 1 3 2 ...
$ JobRole : Factor w/ 9 levels "Healthcare Representative",..: 8 7 3 7 3 3 3 3 5 1 ...
$ JobSatisfaction : int 4 2 3 3 2 4 1 3 3 3 ...
$ MaritalStatus : Factor w/ 3 levels "Divorced","Married",..: 3 2 3 2 2 3 2 1 3 2 ...
$ MonthlyIncome : int 5993 5130 2090 2909 3468 3068 2670 2693 9526 5237 ...
$ MonthlyRate : int 19479 24907 2396 23159 16632 11864 9964 13335 8787 16577 ...
$ NumCompaniesWorked : int 8 1 6 1 9 0 4 1 0 6 ...
$ Over18 : Factor w/ 1 level "Y": 1 1 1 1 1 1 1 1 1 1 ...
$ OverTime : Factor w/ 2 levels "No","Yes": 2 1 2 2 1 1 2 1 1 1 ...
$ PercentSalaryHike : int 11 23 15 11 12 13 20 22 21 13 ...
$ PerformanceRating : int 3 4 3 3 3 3 4 4 4 3 ...
$ RelationshipSatisfaction: int 1 4 2 3 4 3 1 2 2 2 ...
$ StandardHours : int 80 80 80 80 80 80 80 80 80 80 ...
$ StockOptionLevel : int 0 1 0 0 1 0 3 1 0 2 ...
$ TotalWorkingYears : int 8 10 7 8 6 8 12 1 10 17 ...
$ TrainingTimesLastYear : int 0 3 3 3 3 2 3 2 2 3 ...
$ WorkLifeBalance : int 1 3 3 3 3 2 2 3 3 2 ...
$ YearsAtCompany : int 6 10 0 8 2 7 1 1 9 7 ...
$ YearsInCurrentRole : int 4 7 0 7 2 7 0 0 7 7 ...
$ YearsSinceLastPromotion : int 0 1 0 3 2 3 0 0 1 7 ...
$ YearsWithCurrManager : int 5 7 0 0 2 6 0 0 8 7 ...
$ AttritionB : num 1 0 1 0 0 0 0 0 0 0 ...
I am trying to have AttritionB come right after Attrition.
I have tried the code below, but it drops the rest of the columns:
HRCorForm = HR[,c(1,2,36:35)]
Kind Regards
Rehaan

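36:35 expands to c(36, 35), so your code selects only columns 1, 2, 36 and 35. List the remaining columns explicitly and you will get all of them: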
HRCorForm = HR[,c(1,2,36,3:35)]
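The same idea can be written without hard-coding 36. What follows is only a minimal sketch: the toy data frame `df` and the helper `move_last_to()` are made up for illustration and are not from the question, and the helper assumes the target position is smaller than the number of columns.

```r
# Toy stand-in for HR (hypothetical data, only five columns)
df <- data.frame(
  EmployeeNumber = 1:3,
  Attrition      = c("Yes", "No", "Yes"),
  Age            = c(41, 49, 37),
  DailyRate      = c(1102, 279, 1373),
  AttritionB     = c(1, 0, 1)
)

# Hypothetical helper: move the last column of x to position pos.
# Assumes 1 <= pos < ncol(x).
move_last_to <- function(x, pos) {
  n <- ncol(x)
  stopifnot(pos >= 1, pos < n)
  # columns before pos, then the last column, then everything else in order
  x[, c(seq_len(pos - 1), n, pos:(n - 1)), drop = FALSE]
}

reordered <- move_last_to(df, 3)
names(reordered)
```

If dplyr (version 1.0.0 or later) is available, relocate(HR, AttritionB, .after = Attrition) should express the same move by column name instead of by position.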

Related

group by and sum not working as expected in R

Hi, I have a simple data frame with this structure:
> str(allvalues)
'data.frame': 150 obs. of 8 variables:
$ seriesId : Factor w/ 1 level "2021-02-28T00:00:00Z": 1 1 1 1 1 1 1 1 1 1 ...
$ forecastPoint : Factor w/ 30 levels "790","791","792",..: 1 2 3 4 5 6 7 8 9 10 ...
$ rowId : Factor w/ 30 levels "2021-03-01T00:00:00.000000Z",..: 1 2 3 4 5 6 7 8 9 10 ...
$ timestamp : Factor w/ 65 levels "1842.6640625",..: 7 8 9 11 14 4 1 16 12 18 ...
$ predictionValues: Factor w/ 1 level "total_visits (actual)": 1 1 1 1 1 1 1 1 1 1 ...
$ forecastDistance: Factor w/ 30 levels "1","10","11",..: 1 12 23 25 26 27 28 29 30 2 ...
$ prediction : num 2111 2130 2258 2276 2298 ...
$ scenario : Factor w/ 5 levels "0 0 10 10 10",..: 4 4 4 4 4 4 4 4 4 4 ...
I want to group by "scenario" and sum "prediction",
but when I use
> allvalues %>% group_by(scenario) %>% summarise(cond_disp = sum(prediction))
cond_disp
1 351940.8
it does not group by scenario. There should be 5 rows, one per scenario with its sum.
Any help on what I am doing wrong?
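For reference, the expected one-row-per-group result can be sketched in base R. The toy data frame `allvalues_toy` is invented for illustration. The comment about masking is an assumption, since the question does not show which packages are loaded: a single-row result like the one above is the kind of output produced when plyr's summarise() masks dplyr's, because plyr's version ignores grouping.

```r
# Toy stand-in for allvalues (hypothetical values)
allvalues_toy <- data.frame(
  scenario   = factor(c("a", "a", "b", "b", "c")),
  prediction = c(1, 2, 3, 4, 5)
)

# Base R: one row per scenario with the sum of prediction
sums <- aggregate(prediction ~ scenario, data = allvalues_toy, FUN = sum)
sums

# With dplyr, calling the function through its namespace rules out masking
# (not run here, assumes dplyr is installed):
# allvalues %>% dplyr::group_by(scenario) %>%
#   dplyr::summarise(cond_disp = sum(prediction))
```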

Correlation with discrete and categoric variables in R

I am analyzing this dataset, which has numeric and factor variables. I would like to know the correlations so that I can choose the best variables.
str(data)
$ Ag : num [1:1470] 41 49 37 33 27 32 59 30 38 36 ...
$ Ay : Factor w/ 2 levels "No","Yes": 2 1 2 1 1 1 1 1 1 1 ...
$ Bu : Factor w/ 3 levels "Non-Travel","Travel_Frequently",..: 3 2 3 2 3 2 3 3 2 3 ...
$ Di : num [1:1470] 1 8 2 3 2 2 3 24 23 27 ...
$ Ed : num [1:1470] 2 1 2 4 1 2 3 1 3 3 ...
$ Ep : num [1:1470] 1 1 1 1 1 1 1 1 1 1 ...
$ Em : num [1:1470] 1 2 4 5 7 8 10 11 12 13 ...
$ Ge : Factor w/ 2 levels "Female","Male": 1 2 2 1 2 2 1 2 2 2 ...
$ Ho : num [1:1470] 94 61 92 56 40 79 81 67 44 94 ...
$ J1 : num [1:1470] 3 2 2 3 3 3 4 3 2 3 ...
$ J2 : num [1:1470] 2 2 1 1 1 1 1 1 3 2 ...
When I execute this (although I want correlations for all the data, not only the numeric columns):
cor(data[sapply(data, is.numeric)])
I get this message:
Warning message:
In cor(data[sapply(data, is.numeric)]) :
the standard deviation is zero
It just politely lets you know that you set out to calculate a correlation where one of the variables is constant. This is often pointless.
Just filter those columns out as well:
x1 <- data[sapply(data,is.numeric)]
x2 <- x1[sapply(x1,sd)!=0]
cor(x2)
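The filter can be tried on a small made-up data frame. The column names are borrowed from the question, but the values in `toy` are assumptions chosen so that `Ep` is constant.

```r
# Toy data: Ep is constant, Ge is a factor (hypothetical values)
toy <- data.frame(
  Ag = c(41, 49, 37, 33),
  Ep = c(1, 1, 1, 1),
  Di = c(1, 8, 2, 3),
  Ge = factor(c("Female", "Male", "Male", "Female"))
)

num  <- toy[sapply(toy, is.numeric)]   # keep numeric columns only
keep <- sapply(num, sd) != 0           # flag columns that actually vary
cm   <- cor(num[keep])                 # Ep is dropped, so no warning
colnames(cm)
```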

Why do the levels of just one variable change after correctly combining two data frames, and how should I deal with it?

I have two data frames. The first contains 16 different Lines (genotypes). Because the number of plants per line differs in my experiment, str() shows 145 observations and 16 levels for my Line variable, as you can see here:
'data.frame': 145 obs. of 15 variables:
$ Plate.NO. : int 1 1 1 1 1 1 1 1 1 1 ...
$ Line : Factor w/ 16 levels "L000049","L000154",..: 15 15 15 15 15 7 7 7 7 7 ...
$ Strain : Factor w/ 2 levels "AF1","V31-2": 1 1 1 1 1 1 1 1 1 1 …
$ Plant.number: int 1 2 3 4 5 1 2 3 4 5 ...
$ X0DPI : num 0 0 0 0 0 0 0 0 0 0 ...
$ X7DPI : num 0 0 0 0 0 0 0 0 0 0 ...
$ X10DPI : num 0 0 0 0 1 0 0 0 2 0 ...
$ X12DPI : num 0.5 0 0 0 2 3 2.5 2.5 2 3 ...
$ X14DPI : num 2.5 1 0 0 2 3 2.5 2.5 2.5 3 ...
$ X17DPI : num 4 1 1 0 3 4 2.5 4 3 3 ...
$ X19DPI : num 4 1 1 1 4 4 2.5 4 3 4 ...
$ X21DPI : num 4 1.5 2 1 4 4 3.5 4 4 4 ...
$ X24DPI : num 4 3 2 1 4 4 4 4 4 4 ...
$ X26DPI : num 4 3 2 1 4 4 4 4 4 4 ...
$ X28DPI : num 4 3.5 2.5 1.5 4 4 4 4 4 4 ...
I also have a second data frame, which contains complementary information for 252 Lines. Here is the str() result for the second data frame:
'data.frame': 252 obs. of 7 variables:
$ ID : Factor w/ 252 levels "HM001 ","HM002 ",..: 1 2 3 4 5 6 7 8 9 10 ...
$ Line : Factor w/ 252 levels "A10","A20","CADL",..: 31 38 175 207 206 169 197 ...
$ Population.of.Origin: Factor w/ 252 levels "A10 ","A17_Varma ",..: 157 167 55 53 51 110 ...
$ Country.of.Origin : Factor w/ 19 levels "Algeria ","Cyprus ",..: 16 2 14 1 1 3 3 1 5 8 ...
$ Category : Factor w/ 16 levels "alfalfa ","CC144 ",..: 7 7 7 7 7 7 7 7 3 3 ...
$ Seeds.From : Factor w/ 6 levels "Charlie_Brummer,UGA ",..: 4 4 4 4 4 4 4 4 4 4 ...
$ Status : Factor w/ 2 levels "Failed.QA","Processed": 2 2 2 2 2 2 2 2 2 2 …
Out of these 252 Lines I used only 16 for part of my experiment, and I want to combine the two data frames.
The first data frame object is "Rep1" (the one with only 16 Lines) and the second is called "hap" (the one with 252 Lines).
I used this series of commands:
inner <- inner_join(Rep1, hap, by = "Line")
left <- left_join(Rep1, hap, "Line")
right <- right_join(hap, Rep1, "Line")
The join works without any problem and I get just the rows for my 16 Lines, but surprisingly the str() output shows 252 levels for Line instead of 16, while the number of observations is correct.
Here is the str() output of my data frame after the join:
'data.frame': 145 obs. of 21 variables:
$ ID : Factor w/ 252 levels "HM001 ","HM002 ",..: 1 1 1 1 1 1 2 2 2 2 ...
$ Line : Factor w/ 252 levels "A10","A20","CADL",..: 31 31 31 31 31 31 38 38 38 38 ...
$ Population.of.Origin: Factor w/ 252 levels "A10 ","A17_Varma ",..: 157 157 157 157 157 157 167 167 167 167 ...
$ Country.of.Origin : Factor w/ 19 levels "Algeria ","Cyprus ",..: 16 16 16 16 16 16 2 2 2 2 ...
$ Category : Factor w/ 16 levels "alfalfa ","CC144 ",..: 7 7 7 7 7 7 7 7 7 7 ...
$ Seeds.From : Factor w/ 6 levels "Charlie_Brummer,UGA ",..: 4 4 4 4 4 4 4 4 4 4 ...
$ Status : Factor w/ 2 levels "Failed.QA","Processed": 2 2 2 2 2 2 2 2 2 2 ...
$ Plate.NO. : int 2 2 2 5 5 5 1 1 1 1 ...
$ Strain : Factor w/ 2 levels "AF1","V31-2": 1 1 1 2 2 2 1 1 1 1 ...
$ Plant.number : int 1 2 3 1 2 3 1 2 3 4 ...
$ X0DPI : num 0 0 0 0 0 0 0 0 0 0 ...
$ X7DPI : num 0 0 0 0 0 0 0 0 0 0 ...
$ X10DPI : num 0 0.5 3 3 2 1 1 0.5 0 0 ...
$ X12DPI : num 0 1.5 3 3 3 3 1 3 0 0 ...
$ X14DPI : num 0.5 3 4 3 3.5 4 2.5 3 1 0 ...
$ X17DPI : num 1 4 4 3 4 4 3 4 1.5 0 ...
$ X19DPI : num 1.5 4 4 4 4 4 4 4 1.5 0 ...
$ X21DPI : num 2 4 4 4 4 4 4 4 1.5 0 ...
$ X24DPI : num 2 4 4 4 4 4 4 4 1.5 1 ...
$ X26DPI : num 3 4 4 4 4 4 4 4 1.5 1 ...
$ X28DPI : num 3.5 4 4 4 4 4 4 4 2 1 ...
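A factor keeps its full set of levels even when only some of them still occur in the rows, which is consistent with the 252 levels shown above. As a hedged sketch on invented data (`hap_toy` and its values are not from the question), base R's droplevels() removes the levels that are no longer used:

```r
# Toy stand-in for hap: four Lines, all declared as factor levels
hap_toy <- data.frame(
  Line   = factor(c("A10", "A20", "CADL", "HM001")),
  Status = "Processed"
)

# Keep only the Lines that were used (two of the four)
used <- hap_toy[hap_toy$Line %in% c("A10", "CADL"), ]
levels_before <- nlevels(used$Line)   # unused levels are still attached

# Drop unused levels from every factor column
used <- droplevels(used)
levels_after <- nlevels(used$Line)

c(before = levels_before, after = levels_after)
```

Applied to the joined data frame, the same call would be droplevels(right), assuming the extra levels are not needed later.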

binomial regression model produces glm.fit error

I have data like the following:
'data.frame': 1460 obs. of 81 variables:
$ Id : int 1 2 3 4 5 6 7 8 9 10 ...
$ MSSubClass : int 60 20 60 70 60 50 20 60 50 190 ...
$ MSZoning : Factor w/ 5 levels "C (all)","FV",..: 4 4 4 4 4 4 4 4 5 4 ...
$ LotFrontage : int 65 80 68 60 84 85 75 NA 51 50 ...
$ LotArea : int 8450 9600 11250 9550 14260 14115 10084 10382 6120 7420 ...
$ Street : Factor w/ 2 levels "Grvl","Pave": 2 2 2 2 2 2 2 2 2 2 ...
$ Alley : Factor w/ 2 levels "Grvl","Pave": NA NA NA NA NA NA NA NA NA NA ...
$ LotShape : Factor w/ 4 levels "IR1","IR2","IR3",..: 4 4 1 1 1 1 4 1 4 4 ...
$ LandContour : Factor w/ 4 levels "Bnk","HLS","Low",..: 4 4 4 4 4 4 4 4 4 4 ...
$ Utilities : Factor w/ 2 levels "AllPub","NoSeWa": 1 1 1 1 1 1 1 1 1 1 ...
$ LotConfig : Factor w/ 5 levels "Corner","CulDSac",..: 5 3 5 1 3 5 5 1 5 1 ...
$ LandSlope : Factor w/ 3 levels "Gtl","Mod","Sev": 1 1 1 1 1 1 1 1 1 1 ...
$ Neighborhood : Factor w/ 25 levels "Blmngtn","Blueste",..: 6 25 6 7 14 12 21 17 18 4 ...
$ Condition1 : Factor w/ 9 levels "Artery","Feedr",..: 3 2 3 3 3 3 3 5 1 1 ...
$ Condition2 : Factor w/ 8 levels "Artery","Feedr",..: 3 3 3 3 3 3 3 3 3 1 ...
$ BldgType : Factor w/ 5 levels "1Fam","2fmCon",..: 1 1 1 1 1 1 1 1 1 2 ...
$ HouseStyle : Factor w/ 8 levels "1.5Fin","1.5Unf",..: 6 3 6 6 6 1 3 6 1 2 ...
$ OverallQual : int 7 6 7 7 8 5 8 7 7 5 ...
$ OverallCond : int 5 8 5 5 5 5 5 6 5 6 ...
$ YearBuilt : int 2003 1976 2001 1915 2000 1993 2004 1973 1931 1939 ...
$ YearRemodAdd : int 2003 1976 2002 1970 2000 1995 2005 1973 1950 1950 ...
$ RoofStyle : Factor w/ 6 levels "Flat","Gable",..: 2 2 2 2 2 2 2 2 2 2 ...
$ RoofMatl : Factor w/ 8 levels "ClyTile","CompShg",..: 2 2 2 2 2 2 2 2 2 2 ...
$ Exterior1st : Factor w/ 15 levels "AsbShng","AsphShn",..: 13 9 13 14 13 13 13 7 4 9 ...
$ Exterior2nd : Factor w/ 16 levels "AsbShng","AsphShn",..: 14 9 14 16 14 14 14 7 16 9 ...
$ MasVnrType : Factor w/ 4 levels "BrkCmn","BrkFace",..: 2 3 2 3 2 3 4 4 3 3 ...
$ MasVnrArea : int 196 0 162 0 350 0 186 240 0 0 ...
$ ExterQual : Factor w/ 4 levels "Ex","Fa","Gd",..: 3 4 3 4 3 4 3 4 4 4 ...
$ ExterCond : Factor w/ 5 levels "Ex","Fa","Gd",..: 5 5 5 5 5 5 5 5 5 5 ...
$ Foundation : Factor w/ 6 levels "BrkTil","CBlock",..: 3 2 3 1 3 6 3 2 1 1 ...
$ BsmtQual : Factor w/ 4 levels "Ex","Fa","Gd",..: 3 3 3 4 3 3 1 3 4 4 ...
$ BsmtCond : Factor w/ 4 levels "Fa","Gd","Po",..: 4 4 4 2 4 4 4 4 4 4 ...
$ BsmtExposure : Factor w/ 4 levels "Av","Gd","Mn",..: 4 2 3 4 1 4 1 3 4 4 ...
$ BsmtFinType1 : Factor w/ 6 levels "ALQ","BLQ","GLQ",..: 3 1 3 1 3 3 3 1 6 3 ...
$ BsmtFinSF1 : int 706 978 486 216 655 732 1369 859 0 851 ...
$ BsmtFinType2 : Factor w/ 6 levels "ALQ","BLQ","GLQ",..: 6 6 6 6 6 6 6 2 6 6 ...
$ BsmtFinSF2 : int 0 0 0 0 0 0 0 32 0 0 ...
$ BsmtUnfSF : int 150 284 434 540 490 64 317 216 952 140 ...
$ TotalBsmtSF : int 856 1262 920 756 1145 796 1686 1107 952 991 ...
$ Heating : Factor w/ 6 levels "Floor","GasA",..: 2 2 2 2 2 2 2 2 2 2 ...
$ HeatingQC : Factor w/ 5 levels "Ex","Fa","Gd",..: 1 1 1 3 1 1 1 1 3 1 ...
$ CentralAir : Factor w/ 2 levels "N","Y": 2 2 2 2 2 2 2 2 2 2 ...
$ Electrical : Factor w/ 5 levels "FuseA","FuseF",..: 5 5 5 5 5 5 5 5 2 5 ...
$ X1stFlrSF : int 856 1262 920 961 1145 796 1694 1107 1022 1077 ...
$ X2ndFlrSF : int 854 0 866 756 1053 566 0 983 752 0 ...
$ LowQualFinSF : int 0 0 0 0 0 0 0 0 0 0 ...
$ GrLivArea : int 1710 1262 1786 1717 2198 1362 1694 2090 1774 1077 ...
$ BsmtFullBath : int 1 0 1 1 1 1 1 1 0 1 ...
$ BsmtHalfBath : int 0 1 0 0 0 0 0 0 0 0 ...
$ FullBath : int 2 2 2 1 2 1 2 2 2 1 ...
$ HalfBath : int 1 0 1 0 1 1 0 1 0 0 ...
$ BedroomAbvGr : int 3 3 3 3 4 1 3 3 2 2 ...
$ KitchenAbvGr : int 1 1 1 1 1 1 1 1 2 2 ...
$ KitchenQual : Factor w/ 4 levels "Ex","Fa","Gd",..: 3 4 3 3 3 4 3 4 4 4 ...
$ TotRmsAbvGrd : int 8 6 6 7 9 5 7 7 8 5 ...
$ Functional : Factor w/ 7 levels "Maj1","Maj2",..: 7 7 7 7 7 7 7 7 3 7 ...
$ Fireplaces : int 0 1 1 1 1 0 1 2 2 2 ...
$ FireplaceQu : Factor w/ 5 levels "Ex","Fa","Gd",..: NA 5 5 3 5 NA 3 5 5 5 ...
$ GarageType : Factor w/ 6 levels "2Types","Attchd",..: 2 2 2 6 2 2 2 2 6 2 ...
$ GarageYrBlt : int 2003 1976 2001 1998 2000 1993 2004 1973 1931 1939 ...
$ GarageFinish : Factor w/ 3 levels "Fin","RFn","Unf": 2 2 2 3 2 3 2 2 3 2 ...
$ GarageCars : int 2 2 2 3 3 2 2 2 2 1 ...
$ GarageArea : int 548 460 608 642 836 480 636 484 468 205 ...
$ GarageQual : Factor w/ 5 levels "Ex","Fa","Gd",..: 5 5 5 5 5 5 5 5 2 3 ...
$ GarageCond : Factor w/ 5 levels "Ex","Fa","Gd",..: 5 5 5 5 5 5 5 5 5 5 ...
$ PavedDrive : Factor w/ 3 levels "N","P","Y": 3 3 3 3 3 3 3 3 3 3 ...
$ WoodDeckSF : int 0 298 0 0 192 40 255 235 90 0 ...
$ OpenPorchSF : int 61 0 42 35 84 30 57 204 0 4 ...
$ EnclosedPorch: int 0 0 0 272 0 0 0 228 205 0 ...
$ X3SsnPorch : int 0 0 0 0 0 320 0 0 0 0 ...
$ ScreenPorch : int 0 0 0 0 0 0 0 0 0 0 ...
$ PoolArea : int 0 0 0 0 0 0 0 0 0 0 ...
$ PoolQC : Factor w/ 3 levels "Ex","Fa","Gd": NA NA NA NA NA NA NA NA NA NA ...
$ Fence : Factor w/ 4 levels "GdPrv","GdWo",..: NA NA NA NA NA 3 NA NA NA NA ...
$ MiscFeature : Factor w/ 4 levels "Gar2","Othr",..: NA NA NA NA NA 3 NA 3 NA NA ...
$ MiscVal : int 0 0 0 0 0 700 0 350 0 0 ...
$ MoSold : int 2 5 9 2 12 10 8 11 4 1 ...
$ YrSold : int 2008 2007 2008 2006 2008 2009 2007 2009 2008 2008 ...
$ SaleType : Factor w/ 9 levels "COD","Con","ConLD",..: 9 9 9 9 9 9 9 9 9 9 ...
$ SaleCondition: Factor w/ 6 levels "Abnorml","AdjLand",..: 5 5 5 1 5 5 5 5 1 5 ...
$ SalePrice : int 208500 181500 223500 140000 250000 143000 307000 200000 129900 118000 ...
I would like to make a GLM to predict SalePrice from all of the other variables.
After I write this:
cena_nieruchomości.lm <- glm(SalePrice~.,
data=nieruchimości,family=binomial(logit))
I am getting an error:
contrasts can be applied only to factors with 2 or more levels.
I have read that it might occur because of NA values in my data. So I tried:
cena_nieruchomości.lm <- glm(SalePrice~.,
data=nieruchimości,family=binomial("logit"), na.action=na.pass)
Then I get the next error:
Error in glm.fit(x = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
Could someone please tell me what I'm doing wrong and how to avoid this error? Could it be because SalePrice is an int (should it be a factor)?
SalePrice is an interval/continuous variable. family=binomial('logit') in your glm() call fits a logistic regression, which assumes a dependent variable that takes on only two values.
Given your dependent variable, logistic regression is not the right choice. You would do better to estimate a linear model with lm():
cena_nieruchomości.lm <- lm(SalePrice~.,
data=nieruchimości)
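If glm() is preferred for some reason, the Gaussian family with its default identity link should give the same coefficients as lm(). The sketch below uses a tiny data frame `houses` built from the first six values of three columns shown in the question. It is only an illustration on six rows, not a model of the full data.

```r
# First six values of three columns from the str() output above
houses <- data.frame(
  LotArea     = c(8450, 9600, 11250, 9550, 14260, 14115),
  OverallQual = c(7, 6, 7, 7, 8, 5),
  SalePrice   = c(208500, 181500, 223500, 140000, 250000, 143000)
)

# Ordinary linear model for a continuous response
fit_lm <- lm(SalePrice ~ ., data = houses)

# The same model expressed as a GLM with the Gaussian family
fit_glm <- glm(SalePrice ~ ., data = houses, family = gaussian())

# Both fits should agree up to numerical tolerance
same_coefs <- isTRUE(all.equal(coef(fit_lm), coef(fit_glm)))
same_coefs
```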

C5.0 decision tree - input string 1 is invalid in this locale

I have read the related questions, but I still cannot solve my problem. My training data does not have missing values, so I don't know what went wrong.
Another problem is that the tree size is 1 and all predicted results are 0 (the label is 0 or 1). I know this is an extremely unbalanced case (the 0 label takes up 98%); how do I solve this problem?
model_boost<-C5.0(train,train_label)
Error:
c50 code called exit with value 1
Warning message:
In strsplit(Z$output, "\n"): input string 1 is invalid in this locale
training data:
str(train)
'data.frame': 7500 obs. of 148 variables:
$ CI_CUSTYPE : Factor w/ 4 levels "个人","家庭",..: 2 2 2 2 2 2 2 2 1 2 ...
$ CI_COUNTRY_FLAG : Factor w/ 3 levels "1","2","3": 3 2 3 2 2 2 2 2 2 1 ...
$ CI_AGE : int -1 44 31 53 58 -1 -1 46 43 61 ...
$ CI_GENDER : Factor w/ 3 levels "男","女","未知": 3 1 1 2 2 3 3 2 2 1 ...
$ CI_CITY : Factor w/ 21 levels "阿坝","巴中",..: 16 18 9 3 3 4 5 1 3 19 ...
$ CI_TENURE : int 4 44 205 92 92 26 9 110 24 48 ...
$ IS_DUAL_MODE : Factor w/ 4 levels "0","1","2","3": 2 2 2 1 2 1 4 4 4 2 ...
$ PD_CDMA_PAYMODE : Factor w/ 2 levels "1","2": 2 1 2 2 2 1 1 2 1 1 ...
$ PD_CDMA_TENURE : int 49 43 64 39 19 36 8 52 15 47 ...
$ VO_MOU_TOTAL_AVG : int 9520 344 2287 253 460 249 3 885 623 457 ...
train_label:
str(train_label)
Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 .
print(head(train_label))
[1] 0 0 0 0 0 0
Levels: 0 1
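One thing that may be worth ruling out is the non-ASCII factor levels in train, since the warning complains about a string that is invalid in the current locale. This is an assumption, not a confirmed diagnosis. The helper `ascii_levels()` below is hypothetical and simply relabels every factor level with an ASCII placeholder so that the model can be refitted on the recoded copy.

```r
# Hypothetical helper: replace the levels of every factor column with
# ASCII placeholders of the form <column>_<index>. It recodes all factors,
# including ones whose levels were already ASCII.
ascii_levels <- function(df) {
  for (nm in names(df)) {
    if (is.factor(df[[nm]])) {
      levels(df[[nm]]) <- paste0(nm, "_", seq_along(levels(df[[nm]])))
    }
  }
  df
}

# Toy stand-in for train (hypothetical values, non-ASCII levels via escapes)
lv <- c("\u4e2a\u4eba", "\u5bb6\u5ead")
train_toy <- data.frame(CI_AGE = c(44L, 31L, 53L))
train_toy$CI_CUSTYPE <- factor(lv[c(2, 2, 1)], levels = lv)

recoded <- ascii_levels(train_toy)
levels(recoded$CI_CUSTYPE)
```

If the error disappears on the recoded data, the locale or encoding of the original levels is the likely culprit. If it does not, the cause lies elsewhere.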
