In R: addDataFrame Excel package function

When I create an Excel workbook and use the addDataFrame function to add my dataset, Excel shows an extra column that numbers the observations in the dataset.
Say my dataset has a single column:
My1stCol: val1 val2 val3
Then in Excel I get two columns:
The extra column, with values: 1 2 3
My1stCol: val1 val2 val3
How can I get rid of this automatically added 1, 2, 3 column, which I don't need?
Looking forward to your reply.
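One possible fix, assuming the addDataFrame in question is the one from the xlsx package: that function has a row.names argument that defaults to TRUE, and the extra 1, 2, 3 column is most likely the data frame's row names. A minimal sketch follows. The data frame my_data, the sheet name and the temporary output file are made up for illustration, and the xlsx calls are skipped when the package is not installed.

```r
# Hypothetical one-column dataset matching the question
my_data <- data.frame(My1stCol = c("val1", "val2", "val3"))

# Illustrative output path; replace with a real file name
out_file <- tempfile(fileext = ".xlsx")

if (requireNamespace("xlsx", quietly = TRUE)) {
  wb <- xlsx::createWorkbook(type = "xlsx")
  sheet <- xlsx::createSheet(wb, sheetName = "Sheet1")
  # row.names = FALSE suppresses the extra column of row names
  xlsx::addDataFrame(my_data, sheet, row.names = FALSE)
  xlsx::saveWorkbook(wb, out_file)
}
```

If the workbook is written with a different package, look for an equivalent row-names option in that package's writer function.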

Related

How to count specific occurrences of strings within chr data into a new column

I have a dt similar to the one below, with chr data held in the Description column. I need to count the number of times certain strings of characters occur in that column, and put the total in the Occurrences column.
In the table below, it would be counting the number of times "A18" or "A19" appears.
ID Date Description Occurrences
1 2020-01-01 A1901,A1804,A2008,AB06 2
2 2020-01-14 A1402,A1805,A1902 2
3 2018-02-25 A1702 0
I'm very new to R and datatables, so haven't tried much. I've searched, but only found how to count occurrences of whole strings, not within them.
Use str_count:
library(stringr)
library(dplyr)
# Count every match of "A18" or "A19" within each Description value
df %>%
  mutate(Occurrences2 = str_count(Description, "A18|A19"))
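For readers without stringr, the same count can be sketched in base R. This is only an illustration: the data frame below is rebuilt from the table in the question, and the column name Occurrences2 is chosen here so the original Occurrences column is not overwritten.

```r
# Sample data rebuilt from the question's table
df <- data.frame(
  ID = 1:3,
  Description = c("A1901,A1804,A2008,AB06", "A1402,A1805,A1902", "A1702"),
  stringsAsFactors = FALSE
)

# gregexpr() locates every match of the pattern in each string;
# regmatches() extracts them, and lengths() counts them per row
matches <- gregexpr("A18|A19", df$Description)
df$Occurrences2 <- lengths(regmatches(df$Description, matches))
```

Rows with no match get a count of 0, because regmatches returns an empty character vector for them.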

How do I write back results of a count query to a column in R?

I would like to count the instances of an Employee ID in a column and write the results back to a new column in my data frame. So far I am able to count the instances and display the results in the RStudio console, but I'm not sure how to write the results back. Here is what I have tested successfully:
ids<-BAR$`Employee ID`
counts<-data.frame(table(ids))
counts
And here are the returned results:
1 00000018 1
2 00000179 1
3 00001045 1
4 00002729 1
5 00003095 2
6 00003100 1
Thanks!
If we need to create a column, use add_count:
library(dplyr)
BAR1 <- BAR %>%
  add_count(`Employee ID`)
table returns the summarised output. To create a column in the original data instead, index the table by each row's ID:
BAR$n <- as.vector(table(ids)[as.character(BAR$`Employee ID`)])
With data.table you can do this quickly, especially on larger datasets, using .N to count the number of rows in each group defined by the by argument.
# Load data.table
library(data.table)
# Convert data to a data.table
setDT(BAR)
# Count and assign counts per level of Employee ID
BAR[, count := .N, by = "Employee ID"]
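The same per-row count can also be sketched in base R with ave, which applies a function within each group and returns a vector aligned with the original rows. The small BAR data frame below is invented for illustration, and the column name n is an arbitrary choice.

```r
# Hypothetical data with one repeated Employee ID
BAR <- data.frame(
  `Employee ID` = c("00000018", "00003095", "00000179", "00003095"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# For each row, count how many rows share its Employee ID
BAR$n <- ave(seq_along(BAR$`Employee ID`), BAR$`Employee ID`, FUN = length)
```

This avoids extra packages, at the cost of being slower than data.table on large inputs.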

Is there an R function for renaming the variables of a column?

I want to replace values in a column using R. The data frame is DT and the column is named Data. It contains the values 1 and 2 repeatedly. I want code that replaces every 1 with Asia and every 2 with India.
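One way this could be done is with a named lookup vector in base R. The sketch below assumes DT is an ordinary data frame whose Data column holds only 1 and 2; the sample values and the name lookup are made up for illustration.

```r
# Hypothetical data frame matching the description
DT <- data.frame(Data = c(1, 2, 1, 2, 2))

# Named vector mapping each old value (as a string) to its replacement
lookup <- c("1" = "Asia", "2" = "India")

# Index the lookup by the old values; unname() drops the leftover names
DT$Data <- unname(lookup[as.character(DT$Data)])
```

Any value missing from lookup would become NA, which makes unexpected codes easy to spot.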

Include variable value on data frame name

I'm trying to figure out how I can add a suffix to the name of a data frame df based on a variable (i.e. a date), ending up with a data frame named df_17 if the variable is equal to 2017, for example.
The reason why I want this is because I'm importing datasets from several years and quarters, and I would like to make sure that they are named according to the year variable they have. Each dataset only has 1 date. I know I can do it manually but it would take me less time to automate it.
I know how to do it with columns and rows, but I can't figure it out for objects.
EDIT:
Example 1:
Data frame name "df"
A B Date
1 4 2017
2 3 2017
New data frame name "df_2017"
Example 2:
Data frame name "df"
A B Date
1 4 2016
2 3 2016
New data frame name "df_2016"
The assign function should do what you want. A solution could look like
assign(paste0("df_", year), dataframe_read_from_file, pos = 1)
If you use assign inside a function or a loop, make sure that you set the pos option correctly.
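As a rough sketch of how this might look with the question's first example, the year can be taken from the Date column itself. The sample data is copied from Example 1, and envir = globalenv() is used here instead of pos as one explicit way of choosing where the new object is created.

```r
# Data frame from Example 1 in the question
df <- data.frame(A = c(1, 2), B = c(4, 3), Date = c(2017, 2017))

# Each dataset is said to contain a single date, so check that first
year <- unique(df$Date)
stopifnot(length(year) == 1)

# Create an object named df_2017 in the global environment
assign(paste0("df_", year), df, envir = globalenv())
```

When many files are imported in a loop, storing the data frames in a named list (for example datasets[[paste0("df_", year)]] <- df) is often easier to work with than creating separate objects.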

How can I import all the decimal places in a mixed-type column with XLConnect and R?

I am trying to import many Excel spreadsheets into R using XLConnect. Two columns interest me, one containing variable names, and the other containing values. These values vary between being characters or numeric. The authors of the spreadsheet have set the numeric values to show varied numbers of decimals depending on the cell, although I need all numeric values with all decimal places. However, because the column contains both characters and numbers, readWorksheet converts everything to characters, and therefore seems to only read the visible decimal places displayed in Excel inside the cells.
How can I import a column so that both the character fields and the numbers are read with their full entered values (as a character vector), rather than as displayed?
Apologies for the lack of MWE, due to the requirement of a spreadsheet.
From my interpretation of your description of the worksheet, I've created a small test worksheet that looks like
Name Value
a abc
b 12.3
c 1.2
d 0.1
where the numbers in the Value column have more significant figures than shown. You seem to have two problems: first, the one you ask about, which is how to read this; second, how to store the results in R. All values in a column of an R data frame must be of the same data type, and a list would be a bit messy. One way to do this is to read the worksheet twice: the first time to get all the data in the Value column as character, and the second time using the forceConversion option of the readWorksheet function to force the column to be read as numeric, storing the result in a new column. The code looks like:
library(XLConnect)
wb <- loadWorkbook("test.xlsx")
df <- readWorksheet(wb, "test" )
df <- cbind(df, Value_Num=readWorksheet(wb, "test", drop="Name", colTypes="numeric", forceConversion=TRUE)$Value)
This should give df as
Name Value Value_Num
1 a abc NA
2 b 12.3 12.300
3 c 1.2 1.230
4 d 0.1 0.123
where the data in the Value column are character and the data in the Value_Num column are numeric. From there you'll have to sort out what to do with the two Value columns.
