This is what I've written for my task:
1:12
12:1
rep(1:2,12)
rep("Red",6)
c(rep("Red",6),rep("Blue",6))
sample(1:12,4)
sample(1:12)
sample(c(rep("Red",6),rep("Blue",6)))
alder <- data$alder
respons <- data$respons
plot(alder, respons)
behandling <- rep(c("FB2M","Placebo"), each = 15)
behandling <- factor(behandling)
data$behandling <- behandling
boxplot(respons ~ behandling, data=data)
data$randomisering <- sample(data$behandling)
boxplot(respons ~ data$randomisering, data=data)
Menn <- data[data$kjonn == 'Mann',]
table(Menn$randomisering)
But when I try File -> Compile Report to MS Word, this is what happens:
Quitting from lines 3-33 (Oving3.spin.Rmd)
Error in data$alder : object of type 'closure' is not subsettable
Calls: <Anonymous> ... withVisible -> eval_with_user_handlers -> eval -> eval
Execution halted
Can someone help me?
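One likely cause, assuming `data` was only loaded in the interactive session and is never created in the script itself: Compile Report runs the script in a fresh R session, where the name `data` resolves to the built-in function `utils::data`. Subsetting a function with `$` produces this error. A minimal sketch; the data frame values below are made up and stand in for however the real file is read (for example a `read.csv()` call at the top of the script):

```r
# In a fresh session, `data` is the built-in function utils::data (a "closure").
fresh_session_data <- utils::data
msg <- tryCatch(fresh_session_data$alder,
                error = function(e) conditionMessage(e))
print(msg)  # the "closure" error message, in the session's language

# Creating the data frame inside the script makes the report self-contained.
# Hypothetical values; the real script would read the actual file here.
data <- data.frame(alder = c(34, 51, 46), respons = c(2.1, 3.4, 2.9))
alder <- data$alder
respons <- data$respons
print(alder)
```

Using a name other than `data` for the data frame also helps, because a missing object then fails with a clearer "object not found" message.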
I'm trying to implement a simple query on Spark using gapply, but I'm running into trouble.
This code works well.
library(SparkR)
library(dplyr)
df <- createDataFrame(iris)
createOrReplaceTempView(df, "iris")
display(SparkR::sql("SELECT *, COUNT(*) OVER(PARTITION BY Species) AS RowCount FROM iris"))
But I can't get the same result with gapply:
display(df %>% SparkR::group_by(df$Species)
%>% gapply(function(key, x) { y <- data.frame(x, SparkR::count()) },
"Sepal_Length double, Sepal_Width double, Petal_Length double, Petal_Width double, Species string, RowCount integer"))
It returns this error:
SparkException: R unexpectedly exited. Caused by: EOFException:
org.apache.spark.SparkException: Job aborted due to stage failure:
Task 0 in stage 235.0 failed 4 times, most recent failure: Lost task
0.3 in stage 235.0 (TID 374) (10.150.202.5 executor 1): org.apache.spark.SparkException: R unexpectedly exited. R worker
produced errors: Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘count’ for signature
‘"missing"’ Calls: compute ... computeFunc -> data.frame ->
-> Execution halted
Is it possible to implement the window function "count" with gapply using pipes from dplyr?
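Just a small mistake: inside gapply you should use base::nrow instead of SparkR::count. The function receives each group as a local R data.frame x, so nrow(x) gives the group's row count, whereas SparkR::count() called with no argument has no matching method.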
display(df %>% SparkR::group_by(df$Species)
%>% gapply(function(key, x) { y <- data.frame(x, nrow(x)) },
"Sepal_Length double, Sepal_Width double, Petal_Length double, Petal_Width double, Species string, RowCount integer"))
You can also do this with the SparkR API alone, using SparkR::windowPartitionBy, so there is no need to create a UDF:
(
  df %>%
    SparkR::select(
      c(
        SparkR::columns(df),
        SparkR::over(
          SparkR::count(SparkR::lit(1)),
          SparkR::windowPartitionBy(SparkR::column("Species"))
        ) %>% SparkR::alias("RowCount")
      )
    ) %>%
    display()
)
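To see why nrow works inside gapply without needing a Spark cluster: each group reaches the function as an ordinary local data.frame. The sketch below only imitates that with base R split() and lapply() on the built-in iris data. It is not SparkR code, and the names groups, per_group and result are made up for illustration.

```r
# Imitate gapply's per-group call locally: split iris by Species, then
# apply the same function body to each plain data.frame.
groups <- split(iris, iris$Species)
per_group <- lapply(groups, function(x) data.frame(x, RowCount = nrow(x)))

# Stack the per-group results back into one data frame.
result <- do.call(rbind, per_group)
print(table(result$Species, result$RowCount))
```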
I would like to convert my R file to HTML, but it always returns this error, even though the code runs fine interactively.
Error:
Quitting from lines 71-90 (HW1.Rmd)
Error in eval(quote(list(...)), env) : object 'MAD' not found
Calls: <Anonymous> ... [.data.frame -> order -> standardGeneric -> eval -> eval -> eval
Execution halted
The code that triggers the error:
ss <- data.frame(
MED = apply(exprs(ESET),1,median),
MAD = apply(exprs(ESET),1,mad)
)
sort(ss$MAD, decreasing = TRUE)[1:101]
highlight_df <- ss %>%
filter(MAD>3.800260)
ss %>%
ggplot(aes(x=MED,y=MAD)) +
geom_point(pch=20) +
geom_point(data=highlight_df,
aes(x=MED,y=MAD,color = top100),
color='red',
size=3)
top <- ss[order(-MAD,-MED),]
top[1:10,]
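A possible explanation, assuming vectors named MAD and MED happened to exist in the interactive workspace: in `ss[order(-MAD,-MED),]` the names MAD and MED are looked up as standalone objects, not as columns of ss, and knitting starts from an empty workspace. A minimal sketch of referring to the columns explicitly; the values are made up and only stand in for the real per-row medians and MADs:

```r
# Made-up values standing in for the real MED/MAD columns.
ss <- data.frame(MED = c(5, 7, 6), MAD = c(1.2, 4.1, 4.1))

# Refer to the columns through the data frame, so no global MAD/MED is needed.
top <- ss[order(-ss$MAD, -ss$MED), ]
print(top)

# Equivalent form: with() evaluates the expression inside ss.
top2 <- ss[with(ss, order(-MAD, -MED)), ]
```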
I've got an RMarkdown script that works fine if I run the chunks manually, either one at a time or with "Run All". But when I try to use knitr to generate HTML or a PDF, I'm getting an error: Error in select(responses, starts_with("Q1 ") & !contains("None")) %>% : could not find function "%>%"
The actual full line reads:
cols <- select(responses, starts_with("Q1 ") & !contains("None") ) %>% colnames()
I'm working with data from a survey, where a lot of questions were "select as many as apply" type questions, and there was an open ended "None of the above" option. At this point, I'm pulling out exactly the columns I want (all the Q1 responses, but not Q10 or Q11 responses, and not the open ended response) so I can use pivot_longer() and summarize the responses. It works fine in the script: I get a list of the exact column names that I want, and then count the values.
But when I try to use knitr() it balks on the %>%.
processing file: 02_Survey2020_report.Rmd
|.... | 6%
ordinary text without R code
|......... | 12%
label: setup (with options)
List of 1
$ include: logi FALSE
|............. | 19%
ordinary text without R code
|.................. | 25%
label: demographics gender
Quitting from lines 28-46 (02_Survey2020_report.Rmd)
Error in select(responses, starts_with("Q1 ") & !contains("None")) %>% :
could not find function "%>%"
Calls: <Anonymous> ... handle -> withCallingHandlers -> withVisible -> eval -> eval
Execution halted
A simplified reproducible example gets the same results. I run the following and get what I expect, a tidy table with the number of times each answer was selected:
example <- data.frame("id" = c(009,008,007,006,005,004,003,002,001,010), "Q3_Red" = c("","","","Red","","","","Red","Red","Red"), "Q3_Blue" = c("","","","","","Blue","Blue","Blue","",""),
"Q3_Green" = c("","Green","Green","","","","","Green","",""), "Q3_Purple" = c("","Purple","","","Purple","","Purple","","Purple","Purple"),
"Q3_None of the above" = c(009,008,"Verbose explanation that I don't want to count." ,006,005,004,003,002,"Another verbose entry.",010)
)
cols <- select(example, starts_with("Q3") & !contains("None") ) %>% colnames()
example %>%
pivot_longer(cols = all_of(cols),
values_to = "response") %>%
filter(response != "") %>%
count(response)
But when I use Ctrl+Shift+K to output a document, I get the same error:
processing file: 00a_reproducible_examples.Rmd
Quitting from lines 9-25 (00a_reproducible_examples.Rmd)
Error in select(example, starts_with("Q3") & !contains("None")) %>% colnames() :
could not find function "%>%"
Calls: <Anonymous> ... handle -> withCallingHandlers -> withVisible -> eval -> eval
Execution halted
Why is knitr balking at a pipe?
I recently ran into a similar problem.
I'm not sure whether there is a more sophisticated solution, but loading the library in each chunk of code worked for me.
To display the result of the code without the messages from loading the library, add message = FALSE.
Example:
```{r, echo=FALSE, message = FALSE}
library(dplyr)
>>your code with dplyr<<
```
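Loading the package in every chunk works, but it should not be necessary. All chunks of one document run in the same R session, so attaching the packages once in the first chunk is usually enough. The error typically appears when the packages were attached in the interactive console but no chunk in the document itself calls library(). A minimal sketch of what a first chunk could contain, assuming dplyr and tidyr are installed; the tiny data frame is made up for illustration:

```r
# Contents of a first chunk (for example the one labelled "setup").
library(dplyr)  # select(), filter(), count(), and the %>% pipe
library(tidyr)  # pivot_longer()

# Any later chunk can now use the pipe without calling library() again.
example <- data.frame(id = 1:3,
                      Q3_Red = c("", "Red", "Red"),
                      Q3_Blue = c("Blue", "", "Blue"))
counts <- example %>%
  pivot_longer(cols = starts_with("Q3"), values_to = "response") %>%
  filter(response != "") %>%
  count(response)
print(counts)
```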
I am trying to run kNN on a data set composed of numerical and categorical data.
I need to convert my categorical data to numerical, so I wrote the following code:
summary(factor(data$gender))
data$gender <- as.numeric(data$gender == 'Male')
data$Partner <- as.numeric(data$Partner == 'No')
data$Dependents <- as.numeric(data$Dependents == 'No')
data$PhoneService <- as.numeric(data$PhoneService == 'No')
data$PaperlessBilling <- as.numeric(data$PaperlessBilling == 'No')
summary(data[c("gender","Partner","Dependents","PhoneService","PaperlessBilling")])
When I run the code interactively everything works fine, but when I try to knit it to HTML I get the following error:
Quitting from lines 32-39 (HW-kNN-MariamJallouli_Numercial_Categorical.Rmd)
Error in data$gender : object of type 'closure' is not subsettable
Calls: <Anonymous> ... withCallingHandlers -> withVisible -> eval -> eval -> summary -> factor