Related
I am using the R package DT to create a table. This table contains hyperlinks and the issue that I am having is that when I put rownames = FALSE to remove the row numbers, the formatting on the hyperlinks goes away. I was wondering if anyone had a solution to this problem?
Example data:
structure(list(school = structure(c(2L, 3L, 1L, 4L), .Label = c("Linfield",
"OSU", "UO", "Willamette"), class = "factor"), mascot = structure(c(2L,
3L, 4L, 1L), .Label = c("bearcats", "beavers", "ducks", "wildcats"
), class = "factor"), website = structure(c(1L, 3L, 2L, 4L), .Label = c("oregonstate.edu",
"linfield.edu", "uoregon.edu",
"willamette.edu"), class = "factor"),
School_colors = structure(c(2L, 1L, 3L, 4L), .Label = c("<span style=\"color:green\">green & yellow</span>",
"<span style=\"color:orange\">orange & black</span>", "<span style=\"color:purple\">purple and red</span>",
"<span style=\"color:red\">red and yellow</span>"), class = "factor")), class = "data.frame", row.names = c(NA,
-4L))
Code used to generate table WITH row names
datatable(df,escape = c(1,2,3))
Code used to generate table WITHOUT row names
datatable(df, rownames = FALSE,escape = c(1,2,3))
As you can see, with the second example code, the formatting in the third column is no longer there. What I want to do is create a table without row numbers but also keep the formatting of the hyperlinks
Since your deleted the rownames the indexes of your columns changed as well.
Therefore you should change your escape argument to only 1 and 2.
datatable(df, rownames = FALSE,escape = c(1,2))
A simple escape = FALSE also works for you.
datatable(df, rownames = FALSE, escape = FALSE)
Hello all and thank you in advance.
I would like to add a new column to my pre-existing data frame where the values sourced from a second data frame based on certain conditions. The dataset I wish to add the new column to ("data_melt") has many different sample IDs (sample.#) under the variable column. Using a second dataset ("metadata") I want to add the pond names to the "data_melt" new column based on the sample-ids. The sample IDs are the same in both datasets.
My gut tells me there's an obvious solution but my head is pretty fried. Here is a toy example of my data_melt df (since its 25,000 observations):
> dput(toy)
structure(list(gene = c("serA", "mdh", "fdhB", "fdhA"), process = structure(c(1L,
1L, 1L, 1L), .Label = "energy", class = "factor"), category = structure(c(1L,
1L, 1L, 1L), .Label = "metabolism", class = "factor"), ko = structure(1:4, .Label = c("K00058",
"K00093", "K00125", "K00148"), class = "factor"), variable = structure(c(1L,
2L, 3L, 3L), .Label = c("sample.10", "sample.19", "sample.72"
), class = "factor"), value = c(0.00116, 2.77e-05, 1.84e-05,
0.0125)), row.names = c(NA, -4L), class = "data.frame")
And here is a toy example of my metadata df:
> dput(toy)
structure(list(sample = c("sample.10", "sample.19", "sample.72",
"sample.13"), pond = structure(c(2L, 2L, 1L, 1L), .Label = c("lower",
"upper"), class = "factor")), row.names = c(NA, -4L), class = "data.frame")
Thank you again!
We can use match from base R to create a numeric index to replace the values
toy$pond <- with(toy, out$pond[match(variable, out$sample)])
I believe merge will work here.
sss <- structure(list(gene = c("serA", "mdh", "fdhB", "fdhA"), process = structure(c(1L,
1L, 1L, 1L), .Label = "energy", class = "factor"), category = structure(c(1L,
1L, 1L, 1L), .Label = "metabolism", class = "factor"), ko = structure(1:4, .Label = c("K00058",
"K00093", "K00125", "K00148"), class = "factor"), variable = structure(c(1L,
2L, 3L, 3L), .Label = c("sample.10", "sample.19", "sample.72"
), class = "factor"), value = c(0.00116, 2.77e-05, 1.84e-05,
0.0125)), row.names = c(NA, -4L), class = "data.frame")
ss <- structure(list(sample = c("sample.10", "sample.19", "sample.72",
"sample.13"), pond = structure(c(2L, 2L, 1L, 1L), .Label = c("lower",
"upper"), class = "factor")), row.names = c(NA, -4L), class = "data.frame")
ssss <- merge(sss, ss, by.x = "variable", by.y = "sample")
You can use left_join() from the dplyr package after renaming sample to variable in the metadata data frame.
library(tidyverse)
data_melt <- structure(list(gene = c("serA", "mdh", "fdhB", "fdhA"),
process = structure(c(1L, 1L, 1L, 1L),
.Label = "energy",
class = "factor"),
category = structure(c(1L, 1L, 1L, 1L),
.Label = "metabolism",
class = "factor"),
ko = structure(1:4,
.Label = c("K00058", "K00093", "K00125", "K00148"),
class = "factor"),
variable = structure(c(1L, 2L, 3L, 3L),
.Label = c("sample.10", "sample.19", "sample.72"),
class = "factor"),
value = c(0.00116, 2.77e-05, 1.84e-05, 0.0125)),
row.names = c(NA, -4L),
class = "data.frame")
metadata <- structure(list(sample = c("sample.10", "sample.19", "sample.72", "sample.13"),
pond = structure(c(2L, 2L, 1L, 1L),
.Label = c("lower", "upper"),
class = "factor")),
row.names = c(NA, -4L),
class = "data.frame") %>%
# Renaming the column, so we can join the two data sets together
rename(variable = sample)
data_melt <- data_melt %>%
left_join(metadata, by = "variable")
This question already has answers here:
Reshaping multiple sets of measurement columns (wide format) into single columns (long format)
(8 answers)
Closed 4 years ago.
I have a data set with a single identifier and five columns that repeat 18 times. I want to restructure the data into long format keeping the first five column headings as the column headings. Below is a sample with just two repeats:
structure(list(Response.ID = 1:2, Task = structure(c(1L, 1L), .Label = "task1", class = "factor"),
Freq = structure(c(1L, 1L), .Label = "Daily", class = "factor"),
Hours = c(3L, 2L), Value = c(10L, 8L), Mood = structure(1:2, .Label = c("Engaged",
"Neutral"), class = "factor"), Task.1 = structure(c(1L, 1L
), .Label = "task2", class = "factor"), Freq.1 = structure(c(1L,
1L), .Label = "Weekly", class = "factor"), Hours.1 = c(4L,
4L), Value.1 = c(10L, 6L), Mood.1 = structure(c(2L, 1L), .Label = c("Neutral",
"Optimistic"), class = "factor")), .Names = c("Response.ID", "Task", "Freq", "Hours", "Value", "Mood", "Task.1", "Freq.1", "Hours.1", "Value.1", "Mood.1"), class = "data.frame", row.names = c(NA, -2L))
I attempted using the melt and patterns functions, which appears to approximate my desired outcome without the desired column headings:
df = melt(df1, id.vars = c("Response.ID"), measure.vars = patterns("^Task", "^Freq","^Hours","^Mood"))
Here is the result:
structure(list(Response.ID = c(1L, 2L, 1L, 2L), variable = structure(c(1L, 1L, 2L, 2L), class = "factor", .Label = c("1", "2")), value1 = c("task1", "task1", "task2", "task2"), value2 = c("Daily", "Daily", "Weekly", "Weekly"), value3 = c(3L, 2L, 4L, 4L), value4 = c("Engaged", "Neutral", "Optimistic", "Neutral")), .Names = c("Response.ID", "variable", "value1", "value2", "value3", "value4"), row.names = c(NA, -4L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x0000000000330788>)
When I tried to specify names with value.name() below I receive an error:
df = melt(df1, id.vars = c("Response.ID"),measure.vars = patterns("^Task", "^Freq","^Hours","^Mood"), value.name=c("Task", "Freq", "Hours", "Value","Mood"))
My desired result would look like this:
structure(list(Response.ID = c(1L, 2L, 1L, 2L), Task = structure(c(1L, 1L, 2L, 2L), .Label = c("task1", "task2"), class = "factor"),
Freq = structure(c(1L, 1L, 2L, 2L), .Label = c("Daily", "Weekly"
), class = "factor"), Hours = c(3L, 2L, 4L, 4L), Value = c(10L,
8L, 10L, 6L), Mood = structure(c(1L, 2L, 3L, 2L), .Label = c("Engaged",
"Neutral", "Optimistic"), class = "factor")), .Names = c("Response.ID", "Task", "Freq", "Hours", "Value", "Mood"), class = "data.frame", row.names = c(NA, -4L))
It looks to me like you embarked on a difficult journey by using melt: this function is well named in the sense that trying to use it will probably melt your brain. Joke aside, the function melt has lots of underlying computations and its use could be inefficient if you have a large dataset.
I would instead solve the problem manually with rbindlist (from the excellent package data.table, which also ships with an optimized version of melt if you really want to use it), to manually concatenates groups of columns. This also preserves the column names:
> rbindlist(lapply(1:2, function(i) df1[,c(1,((i-1)*5+2):((i-1)*5+6))]))
Response.ID Task Freq Hours Value Mood
1: 1 task1 Daily 3 10 Engaged
2: 2 task1 Daily 2 8 Neutral
3: 1 task2 Weekly 4 10 Optimistic
4: 2 task2 Weekly 4 6 Neutral
This works on your example: replace the indices 1:2 by the number of repetitions to make it work with the real dataset (so, lapply(1:18)).
I've got a data set that looks like this:
date, location, value, tally, score
2016-06-30T09:30Z, home, foo, 1,
2016-06-30T12:30Z, work, foo, 2,
2016-06-30T19:30Z, home, bar, , 5
I need to aggregate these rows together, to obtain a result such as:
date, location, value, tally, score
2016-06-30, [home, work], [foor, bar], 3, 5
There are several challenges for me:
The resulting row (a daily aggregate) must include the rows for this day (2016-06-30 in my above example
Some rows (strings) will result in an array containing all the values present on this day
Some others (ints) will result in a sum
I've had a look at dplyr, and if possible I'd like to do this in R.
Thanks for your help!
Edit:
Here's a dput of the data
structure(list(date = structure(1:3, .Label = c("2016-06-30T09:30Z",
"2016-06-30T12:30Z", "2016-06-30T19:30Z"), class = "factor"),
location = structure(c(1L, 2L, 1L), .Label = c("home", "work"
), class = "factor"), value = structure(c(2L, 2L, 1L), .Label = c("bar",
"foo"), class = "factor"), tally = c(1L, 2L, NA), score = c(NA,
NA, 5L)), .Names = c("date", "location", "value", "tally",
"score"), class = "data.frame", row.names = c(NA, -3L))
mydat<-structure(list(date = structure(1:3, .Label = c("2016-06-30T09:30Z",
"2016-06-30T12:30Z", "2016-06-30T19:30Z"), class = "factor"),
location = structure(c(1L, 2L, 1L), .Label = c("home", "work"
), class = "factor"), value = structure(c(2L, 2L, 1L), .Label = c("bar",
"foo"), class = "factor"), tally = c(1L, 2L, NA), score = c(NA,
NA, 5L)), .Names = c("date", "location", "value", "tally",
"score"), class = "data.frame", row.names = c(NA, -3L))
mydat$date <- as.Date(mydat$date)
require(data.table)
mydat.dt <- data.table(mydat)
mydat.dt <- mydat.dt[, lapply(.SD, paste0, collapse=" "), by = date]
cbind(mydat.dt, aggregate(mydat[,c("tally", "score")], by=list(mydat$date), FUN = sum, na.rm=T)[2:3])
which gives you:
date location value tally score
1: 2016-06-30 home work home foo foo bar 3 5
Note that if you wanted to you could probably do it all in one step in the reshaping of the data.table but I found this to be a quicker and easier way for me to achieve the same thing in 2 steps.
I am creating a sweave document that uses xtable to create table and puts into a pdf file. It works but table is not fitting the document and some text are missing. Is there a way to text align in xtable/fully fit an xtable to a pdf file?
This is my data:
dput(x)
structure(list(App = structure(c(3L, 2L, 1L), .Label = c("AppServer",
"Db", "Web"), class = "factor"), Group = structure(c(2L, 1L,
1L), .Label = c("Back", "Front"), class = "factor"), Owner = structure(c(1L,
1L, 1L), .Label = "Infrasructure", class = "factor"), Server = structure(1:3, .Label = c("ServerA",
"ServerB", "ServerC"), class = "factor"), NumberCPU = c(64L,
120L, 120L), Description = structure(c(1L, 3L, 2L), .Label = c("Front End server to server web traffic",
"Hold Web templates to generate dynamic content", "Keeps customer data and login information"
), class = "factor"), Cost = structure(1:3, .Label = c("$200,000 ",
"$400,000 ", "$500,000 "), class = "factor")), .Names = c("App",
"Group", "Owner", "Server", "NumberCPU", "Description", "Cost"
), class = "data.frame", row.names = c(NA, -3L))
this is the code to put the table in pdf:
print(xtable(x, caption=paste("Summary of applications"),table.placement="!h",caption.placement="top", align=c('l', 'p{1.5in}', rep('c',6) )))
I recommend checking out the xtable gallery, there are lots of examples that are useful. Basically, If you don't want to adjust your table by shorting the strings, I see two options:
Use a smaller font.
Use landscape mode.
Here, I use a combination of both:
\documentclass{article}
\usepackage{rotating}
\begin{document}
<<Data,echo=FALSE>>=
library(xtable)
x <- structure(list(App = structure(c(3L, 2L, 1L), .Label = c("AppServer",
"Db", "Web"), class = "factor"), Group = structure(c(2L, 1L,
1L), .Label = c("Back", "Front"), class = "factor"), Owner = structure(c(1L,
1L, 1L), .Label = "Infrasructure", class = "factor"), Server = structure(1:3, .Label = c("ServerA",
"ServerB", "ServerC"), class = "factor"), NumberCPU = c(64L,
120L, 120L), Description = structure(c(1L, 3L, 2L), .Label = c("Front End server to server web traffic",
"Hold Web templates to generate dynamic content", "Keeps customer data and login information"
), class = "factor"), Cost = structure(1:3, .Label = c("$200,000 ",
"$400,000 ", "$500,000 "), class = "factor")), .Names = c("App",
"Group", "Owner", "Server", "NumberCPU", "Description", "Cost"
), class = "data.frame", row.names = c(NA, -3L))
#
<<tab,echo=FALSE,results='asis'>>=
print(xtable(x, caption=paste("Summary of applications"),
caption.placement="top",
align=c('l', 'p{1.5in}', rep('c',6))),
size="footnotesize",
floating.environment="sidewaystable")
#
\end{document}
Note that you have to use the LaTex package rotating. This should give you something like this:
An alternative solution would be to use some markdown backend (knitr, markdown or pander packages) and split the table automatically to 80 character (or other user specified width) with pander. E.g.:
> library(pander)
> res <- structure(list(App = structure(c(3L, 2L, 1L), .Label = c("AppServer", "Db", "Web"), class = "factor"), Group = structure(c(2L, 1L, 1L), .Label = c("Back", "Front"), class = "factor"), Owner = structure(c(1L, 1L, 1L), .Label = "Infrasructure", class = "factor"), Server = structure(1:3, .Label = c("ServerA", "ServerB", "ServerC"), class = "factor"), NumberCPU = c(64L, 120L, 120L), Description = structure(c(1L, 3L, 2L), .Label = c("Front End server to server web traffic", "Hold Web templates to generate dynamic content", "Keeps customer data and login information"), class = "factor"), Cost = structure(1:3, .Label = c("$200,000 ", "$400,000 ", "$500,000 "), class = "factor")), .Names = c("App", "Group", "Owner", "Server", "NumberCPU", "Description", "Cost"), class = "data.frame", row.names = c(NA, -3L))
> pander(res)
----------------------------------------------------
App Group Owner Server NumberCPU
--------- ------- ------------- -------- -----------
Web Front Infrasructure ServerA 64
Db Back Infrasructure ServerB 120
AppServer Back Infrasructure ServerC 120
----------------------------------------------------
Table: Table continues below
---------------------------------------
Description Cost
------------------------------ --------
Front End server to server web $200,000
traffic
Keeps customer data and login $400,000
information
Hold Web templates to generate $500,000
dynamic content
---------------------------------------
And the result can be converted to pdf or LaTeX easily then with pandoc or with pander directly from R.