How to remove column name - r

I have an easy question I can't seem to figure out. I am trying to remove a column name. I am making some tables with formattable and I don't like the column name that is there. Here is what the table looks like now.
df <- data.frame(
"test" = c("Average age of Diagnosis", "Average Life Space", "Average Number of Dogs Diagnosised with MR"),
"2008-2018" = c(28, 27, 30),
"Sample size" = c(12,23,34),
stringsAsFactors = FALSE, check.names=FALSE)
I want to remove the name of the first column, so I tried this:
df <- data.frame(
"" = c("Average age of Diagnosis", "Average Life Space", "Average Number of Dogs Diagnosised with MR"),
"2008-2018" = c(28, 27, 30),
"Sample size" = c(12,23,34),
stringsAsFactors = FALSE, check.names=FALSE)
Error: attempt to use zero-length variable name
which is clearly related to me removing the column name. Is there a way to fix this? I just wanted to remove the name for test.
Thanks in advance!

Have you tried this?
colnames(df)[1] <- ''

To solve this issue I used Kath's suggestion (see the comment on the original question). I ended up with the following:
df <- data.frame(
" " = c("Average age of Diagnosis", "Average Life Space", "Average Number of Dogs Diagnosised with MR"),
"2008-2018" = c(28, 27, 30),
"Sample size" = c(12,23,34),
stringsAsFactors = FALSE, check.names=FALSE)
It is subtly different from my original post: there is a space between the quotes, so the column name is " " rather than "".
I am using this to make some figures and I didn't want anything visible in that column name.
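The two suggestions in this thread can be compared side by side. The sketch below uses a shortened copy of the data frame from the question (two rows instead of three, which is an assumption made only to keep it short) and renames the first column after construction. How formattable renders either name is not checked here.

```r
# Shortened copy of the data frame from the question.
df <- data.frame(
  "test" = c("Average age of Diagnosis", "Average Life Space"),
  "2008-2018" = c(28, 27),
  "Sample size" = c(12, 23),
  stringsAsFactors = FALSE, check.names = FALSE
)

# Option 1: rename after construction. An empty string is rejected as an
# argument name inside data.frame(), but it is accepted by colnames<-.
blank <- df
colnames(blank)[1] <- ""

# Option 2: use a single space as the name, as in the accepted fix.
spaced <- df
colnames(spaced)[1] <- " "

names(blank)
names(spaced)
```

Both variants leave the data untouched; only the names attribute changes.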


Flextable: change cell value if it meets a condition and formatting

I have a dataset (called example) like the following one.
mic <- rep(c("One", "Two", "Tree", "Four"), each = 3)
pap <- rep(c("1", "2", "3", "4"), each = 3)
ref <- rep(c("Trial 1", "Trial 2", "Trial 3", "Trial 4"), each = 3)
prob <- c(rep(NA,4), "Nogood", NA, "Bad", "Nogood", "Norel", NA, "Bad", "Nogood")
example <- data.frame(Micro = mic, Paper = pap, Reference = ref, Problem = prob)
example
[Image: Example]
I would like to merge cells vertically when consecutive cells have identical
values so I use flextable merge_v() function.
ft_example <- example %>%
flextable() %>%
merge_v(j = ~ Micro + Paper + Reference + Problem) %>%
theme_vanilla()
ft_example
I obtain the following table when knitting in Word:
[Image: Table obtained]
Is there a way to:
1. Insert, after the fact, the value "None identified" in the empty cells of the "Problem" field that are merged together; and
2. Remove the inappropriate horizontal lines in the "Problem" field when there are one or more non-empty cells and some empty cells, so that one horizontal line clearly separates each combination of Micro, Paper and Reference, and horizontal lines separate only the non-empty cells in the Problem field?
You can see the desired result below:
[Image: Table desired]
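No answer is recorded for this question. For the first point only, one possible workaround (an assumption, not something confirmed in the thread) is to fill the missing values in the data frame before the flextable is built, rather than afterwards. The sketch below is base R only and does not touch the border lines asked about in the second point.

```r
# Rebuild the two relevant columns from the question.
mic <- rep(c("One", "Two", "Tree", "Four"), each = 3)
prob <- c(rep(NA, 4), "Nogood", NA, "Bad", "Nogood", "Norel", NA, "Bad", "Nogood")
example <- data.frame(Micro = mic, Problem = prob, stringsAsFactors = FALSE)

# Replace every missing Problem value with the desired label.
example$Problem[is.na(example$Problem)] <- "None identified"

example
```

The filled data frame could then be passed to the same flextable pipeline as in the question.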

String replacement from one dataframe to another

Given a dataframe of words like this:
df_words <- data.frame(words = c("4 Google", "5Amazon", "4sec"))
how can I find those words in the rows of a dataframe like this:
df <- data.frame(id = c(1,2,4), text = c("Increase for 4 Google", "There is a slight decrease for 5Amazon", "I will need 4sec more"), stringsAsFactors = FALSE)
and replace each word listed in df_words with a specific word, like this?
"4 Google|5Amazon" -> "stock"
"4sec" -> "time"
Example of expected output:
data.frame(id = c(1,2,4), text = c("Increase for stock", "There is a slight decrease for stock", "I will need time more"), stringsAsFactors = FALSE)
I recommend the stringi library. Example:
library(stringi)
strings = c("Increase for 4 Google", "There is a slight decrease for 5Amazon", "I will need 4sec more")
patterns = c("4 Google", "5Amazon", "4sec")
replacements = c("stock", "stock", "time")
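# vectorize_all = FALSE applies every pattern/replacement pair to every string
strings = stri_replace_all_fixed(strings, patterns, replacements, vectorize_all = FALSE)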
However, you probably want to handle many stocks and many times, so you might be better off doing something like this:
stocks = c("4 Google", "5Amazon")
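# vectorize_all = FALSE so that both stock names are replaced in every string
strings = stri_replace_all_fixed(strings, stocks, 'stock', vectorize_all = FALSE)
# backslashes must be doubled inside an R string, and the replacement must be quoted
strings = stri_replace_all_regex(strings, '\\b[0-9]+sec\\b', 'time')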
\b[0-9]+sec\b is a regular expression meaning:
word boundary
one or more number characters
"sec"
word boundary
This will include strings such as "2sec" but exclude those such as "1sector"
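The same two-step idea can also be sketched without stringi, using base R's gsub. This is an alternative to the answer above, not the answerer's own code; the extra string "1sector stays" is added only to illustrate the word-boundary behaviour.

```r
text <- c("Increase for 4 Google",
          "There is a slight decrease for 5Amazon",
          "I will need 4sec more",
          "1sector stays")

# Literal stock names: fixed = TRUE turns off regex interpretation.
stocks <- c("4 Google", "5Amazon")
for (s in stocks) {
  text <- gsub(s, "stock", text, fixed = TRUE)
}

# Any "<digits>sec" token; \\b is the word boundary described above.
text <- gsub("\\b[0-9]+sec\\b", "time", text, perl = TRUE)

text
```

The result can be written back with df$text <- text once the vector lines up with the rows of df.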

Convert column values to percentages

I am reading in multiple Excel files into lists using read.xlsx from the openxlsx package. I append the lists with rbind and perform some data manipulation.
What I need to do is convert the values in columns 18 and 19 to percentages. Currently the values show as .90, .85, etc. (I could also force the user to enter them as 90, 85, etc.), and I need them displayed as 90%, 85%. I have tried to do this inside the data.frame and also using createStyle. So far, nothing has worked: each attempt either corrupts my data or simply does nothing.
Here is what I have tried...
openxlsx Style
# Create percent style
pct = createStyle(numFmt = "0%")
# Apply style
addStyle(wb, sheet = "filename", style = pct, cols = 18, rows = 102, gridExpand = TRUE)
str_replace
allData <- str_replace(allData$'Content', pattern = "%", "")
allData$'Content' <- as.numeric(allData)/100
sapply (even just trying to convert the data type to numeric didn't work; it was still set to General)
allData[, c(18)] <- sapply(allData[, c(18)], as.numeric)
Any help would be greatly appreciated!
Figured this out sometime ago but forgot to post the answer. For those who are interested...
# Create a percent style
pct = createStyle(numFmt = "0%")
# Add percent style
addStyle(wb, sheet = "my_filename", style = pct, cols = c(18, 19), rows = 2:(nrow(allData)+1), gridExpand = TRUE)
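Two details of the answer are easy to get wrong: the row range passed to addStyle, and the scale of the stored values. The base R sketch below uses a small hypothetical stand-in for allData (columns 2 and 3 play the role of columns 18 and 19). It assumes the sheet was written with a header in row 1, and that a column whose values exceed 1 was typed as whole percentages; both are assumptions, not facts from the post.

```r
# Hypothetical stand-in for allData.
allData <- data.frame(
  item  = c("a", "b", "c"),
  rate1 = c(90, 85, 70),        # entered as whole percentages
  rate2 = c(0.90, 0.85, 0.70)   # already stored as fractions
)
pct_cols <- c(2, 3)

# A "0%" number format displays 0.90 as 90%, so the cells should hold
# fractions. Scale down any column that looks like whole percentages.
for (j in pct_cols) {
  if (max(allData[[j]], na.rm = TRUE) > 1) {
    allData[[j]] <- allData[[j]] / 100
  }
}

# The header occupies sheet row 1, so the data sit in rows 2 .. nrow + 1.
style_rows <- 2:(nrow(allData) + 1)
style_rows
```

These values correspond to the rows and cols arguments in the addStyle call above.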

Stacked table spreading and merging

I downloaded the SKOS schema tables from W3C to prepare for a vocabulary mapping task. Here is an example produced with dput:
> dput(skosc)
structure(list(X1 = c("skos:Collection", "URI:", "Definition:",
"Label:", "Disjoint classes:", "skos:Concept", "URI:", "Definition:",
"Label:", "Disjoint classes:", "skos:ConceptScheme", "URI:",
"Definition:", "Label:", "Disjoint classes:", "skos:OrderedCollection",
"URI:", "Definition:", "Label:", "Super-classes:"), X2 = c("skos:Collection",
"http://www.w3.org/2004/02/skos/core#Collection", "Section 9. \r\n Concept Collections",
"Collection", "skos:Conceptskos:ConceptScheme", "skos:Concept",
"http://www.w3.org/2004/02/skos/core#Concept", "Section 3. The \r\n skos:Concept Class",
"Concept", "skos:Collectionskos:ConceptScheme", "skos:ConceptScheme",
"http://www.w3.org/2004/02/skos/core#ConceptScheme", "Section 4. \r\n Concept Schemes",
"Concept Scheme", "skos:Collectionskos:Concept", "skos:OrderedCollection",
"http://www.w3.org/2004/02/skos/core#OrderedCollection", "Section 9. \r\n Concept Collections",
"Ordered Collection", "skos:Collection")), .Names = c("X1", "X2"
), class = "data.frame", row.names = c(NA, -20L))
Besides the subtitle row at the top of each small table (such as "skos:Collection", "skos:Concept" and so on), there is one oddity in this stacked table that must be noted: the row names are not all the same. Row 20 in the example is named "Super-classes:", not "Disjoint classes:" as in the small tables above it.
My plan is to split this stacked table and transpose it as follows:
Before:
After:
"dplyr" and "tidyr" are both good at manipulating tables, so I chose the spread function, which reshapes a table from long and narrow to short and wide. Unfortunately, it failed:
> skosns<-"http://www.w3.org/2009/08/skos-reference/skos.html"
> require(rvest)
Loading required package: rvest
Loading required package: xml2
> skospg<-read_html(skosns, encoding = "UTF-8", options = c("RECOVER", "NOERROR", "NSCLEAN"))
> skosnd<-html_nodes(skospg, "table")
> skosc<-html_table(skosnd[[1]], header = FALSE, trim = TRUE, fill = FALSE, dec = ".")
> skosp<-html_table(skosnd[[2]], header = FALSE, trim = TRUE, fill = FALSE, dec = ".")
> require(tidyr)
Loading required package: tidyr
> spread(skosc, key = X1, value = X2)
Error: Duplicate identifiers for rows (3, 8, 13, 18), (5, 10, 15), (4, 9, 14, 19), (2, 7, 12, 17)
The error message didn't tell me much about the cause, but I guess the odd row may be what leads to this error. Can we ignore the differences among the small tables and simply spread identical values into separate columns?
Question Updated:
The code posted by akrun in the comments is very helpful. I learned that if a key occurs more than once in the column, we need to group by it and add a row index with mutate first; then the data frame can be spread. Thanks to akrun!
Now comes the last step: remove the columns of vocabulary names (such as "skos:Collection") and move their values to the corresponding rows. But writing functions is my weak point, so the program failed, unsurprisingly:
> require(rvest)
> skospg<-read_html(skosns, encoding = "UTF-8", options = c("RECOVER", "NOERROR", "NSCLEAN"))
> skosnd<-html_nodes(skospg, "table")
> skosc<-html_table(skosnd[[1]], header = FALSE, trim = TRUE, fill = FALSE, dec = ".")
> require(dplyr)
> skosc_g<-group_by(skosc, X1)
> skosc_m<-mutate(skosc_g, n = row_number())
> require(tidyr)
Loading required package: tidyr
> skosc_t<-spread(skosc_m, key = X1, value = X2)
> vocn<-select_all(skosc_t, funs(colnames=grep("[[:alpha:]]+:[[:alpha:]]+")))
Error in grep("[[:alpha:]]+:[[:alpha:]]+") :
argument "x" is missing, with no default
> merge.data.frame(vocn, skosc_t, by=c("Collection", "Concept", "ConceptScheme"))
Error in as.data.frame(x) : object 'vocn' not found
The plan for this step is to extract the columns whose values are the vocabulary names (skosc_t[c(5,6,7,8),]), then merge them with the data frame from which those columns have been removed (skosc_t[c(2,3,4,9,10),]).
What is the correct way to do this? Thanks a lot.
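No answer is recorded here. One possible way to do the whole split-and-transpose in base R, offered as a sketch rather than as akrun's method, is to treat every row where X1 equals X2 as the subtitle of a small table, label the rows below it with that vocabulary name, and then reshape. The data below is a trimmed-down copy of the dput output (two of the four small tables, with the "Definition:" rows omitted), and the column name vocabulary is made up for illustration.

```r
# Trimmed-down copy of the stacked table from the dput() output above.
skosc <- data.frame(
  X1 = c("skos:Collection", "URI:", "Label:", "Disjoint classes:",
         "skos:OrderedCollection", "URI:", "Label:", "Super-classes:"),
  X2 = c("skos:Collection",
         "http://www.w3.org/2004/02/skos/core#Collection",
         "Collection", "skos:Conceptskos:ConceptScheme",
         "skos:OrderedCollection",
         "http://www.w3.org/2004/02/skos/core#OrderedCollection",
         "Ordered Collection", "skos:Collection"),
  stringsAsFactors = FALSE
)

# A subtitle row repeats the vocabulary name in both columns.
is_header <- skosc$X1 == skosc$X2

# Number the small tables and label every row with its vocabulary name.
block <- cumsum(is_header)
skosc$vocabulary <- skosc$X2[is_header][block]

# Drop the subtitle rows, then spread X1 into columns: one row per vocabulary.
body <- skosc[!is_header, ]
wide <- reshape(body, idvar = "vocabulary", timevar = "X1", direction = "wide")
names(wide) <- sub("^X2\\.", "", names(wide))
rownames(wide) <- NULL

wide
```

Because each row is identified by its vocabulary name, the repeated keys no longer collide, and a key that exists in only one small table (such as "Super-classes:") simply becomes NA for the others.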

create variables from different columns with a for loop

I have a data frame, imported from Excel with library(readxl). It contains long columns of data, each with its own column title. Now I need to store specific values in new variables. I stored the column titles in the vector "titles" and want to extract certain values from a specific row, e.g. 151, and store each in a new variable.
I tried with the code below. I am really new to R and tried a lot and failed...
example <- data.frame(c('N 1','N 2'), c(50, 60), c(70, 80))
titles <- c('N 1', 'N 2')
for (i in titles) {
(paste("nkorrigiert",i)) <- as.numeric(example[[paste(i)]][3])
}
dput(head(example))
and get this
Error in (paste("nkorrigiert", i)) <- as.numeric(example[[paste(i)]][3]) :
target of assignment expands to non-language object
> dput(head(example))
structure(list(c..N.1....N.2.. = structure(1:2, .Label = c("N 1",
"N 2"), class = "factor"), c.50..60. = c(50, 60), c.70..80. = c(70,
80)), .Names = c("c..N.1....N.2..", "c.50..60.", "c.70..80."),
row.names = 1:2, class = "data.frame")
What am I doing wrong?
You can use the assign command
example <- data.frame(c('N 1','N 2'), c(50, 60), c(70, 80))
titles <- c('N 1', 'N 2')
for (i in titles) {
assign(paste("nkorrigiert",i), as.numeric(example[[paste(i)]][3]))
}
dput(head(example))
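The original code fails because the left-hand side of <- must be a name; paste() only returns a character string, so R does not understand that you want to create a new variable. assign() takes the name as a string instead.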
With your help and the post suggested by @lmo I was able to solve it. Thank you guys! :D
Now I have my first code almost running, yay! It worked with this line inside the for loop:
as.numeric(assign(paste("nkorrigiert",i), example[3, i]))
Now I just need to find out how to calculate with the values stored under these variable names in a for loop! :D
All the best, Sebastian
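For the follow-up about calculating with those values, here is a minimal sketch. It assumes the column titles really are "N 1" and "N 2" (the example data frame in the question does not name its columns that way) and uses made-up numbers. It shows the assign/get pair, plus a named vector as an alternative that avoids building variable names from strings.

```r
# Hypothetical data whose column titles match the titles vector.
example <- data.frame("N 1" = c(50, 60, 70),
                      "N 2" = c(80, 90, 100),
                      check.names = FALSE)
titles <- c("N 1", "N 2")

# Variant 1: one variable per title, created with assign() and read with get().
for (i in titles) {
  assign(paste("nkorrigiert", i), as.numeric(example[3, i]))
}
doubled_first <- get("nkorrigiert N 1") * 2

# Variant 2: a single named vector, indexed by title.
nkorrigiert <- vapply(titles, function(i) as.numeric(example[3, i]), numeric(1))
doubled_all <- nkorrigiert * 2

doubled_first
doubled_all
```

With the named vector, calculations inside a loop reduce to ordinary indexing such as nkorrigiert[["N 2"]].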
