I am relatively new to R and am struggling to remove the column names from this graph. Here is a small sample of the 4,417-row data set, which contains 3 trials and 8 tests. I have used row.names=FALSE, but it doesn't remove the names from the graph.
Test TestNumber Display Trial TrueValue Subject Response
Vertical Distance, Aligned 1 1 B 0.6 1 0.6
Vertical Distance, Aligned 1 1 B 0.6 2 0.55
Vertical Distance, Aligned 1 1 B 0.6 3 0.7
Vertical Distance, Aligned 1 1 B 0.6 4 0.6
Vertical Distance, Aligned 1 1 B 0.6 5 0.65
Vertical Distance, Aligned 1 1 B 0.6 6 0.6
Vertical Distance, Aligned 1 1 B 0.6 7 0.5
Vertical Distance, Aligned 1 1 B 0.6 8 0.65
Vertical Distance, Aligned 1 1 B 0.6 9 0.5
ggplot(ds, aes(x=factor(Response),
               y=TrueValue,
               row.names=FALSE,
               color=Trial,sd(x))) +
  geom_boxplot(notch=FALSE) +
  scale_y_continuous("Response") +
  scale_x_discrete('Trial') +
  theme_bw() +
  theme(axis.text.x=element_text(angle = -90, hjust = 0)) +
  theme(text=element_text(size=10, family="Arial")) +
  ggtitle('Trial Median Comparison \n to Look for Over Estimation')
My data frame looks like this.
data=data.frame(group=c("A","B","C","A","B","C","A","B","C"),
                time= c(rep(1,3),rep(2,3), rep(3,3)),
                value=c(0.2,1,1,0.1,10,20,10,20,30))
group time value
1 A 1 0.2
2 B 1 1.0
3 C 1 1.0
4 A 2 0.1
5 B 2 10.0
6 C 2 20.0
7 A 3 10.0
8 B 3 20.0
9 C 3 30.0
I would like to focus on time point 1 and, based on the values at that
time point, filter out entire groups (including their later time points) that do not fulfil a condition.
I would like to delete the groups whose value at time point 1 is bigger than 0.5
or smaller than 0.1.
I want my data.frame to look like this.
group time value
1 A 1 0.2
2 A 2 0.1
3 A 3 10.0
Any help is highly appreciated.
You can select groups where value at time = 1 is between 0.1 and 0.5.
library(dplyr)
data %>%
  group_by(group) %>%
  filter(between(value[time == 1], 0.1, 0.5))
# group time value
# <chr> <dbl> <dbl>
#1 A 1 0.2
#2 A 2 0.1
#3 A 3 10
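For comparison, here is a rough base R sketch of the same selection, assuming each group has exactly one row at time 1. The names at_t1, keep and result are illustrative and do not come from the original post; the bounds are inclusive, as they are with between().

```r
# Sample data as printed in the question
data <- data.frame(group = c("A", "B", "C", "A", "B", "C", "A", "B", "C"),
                   time  = c(rep(1, 3), rep(2, 3), rep(3, 3)),
                   value = c(0.2, 1, 1, 0.1, 10, 20, 10, 20, 30))

# Rows observed at time point 1 (assumed: one row per group)
at_t1 <- data[data$time == 1, ]

# Groups whose time-1 value lies in [0.1, 0.5]
keep <- at_t1$group[at_t1$value >= 0.1 & at_t1$value <= 0.5]

# Keep every time point of the selected groups
result <- data[data$group %in% keep, ]
rownames(result) <- NULL
print(result)
```

If a group could have several rows at time 1, the condition would need to be reduced to one value per group first, for example with all() or any().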
The wording of my question may not be great so I will try and illustrate via this example.
A Rank
0.5 1
0.6 2
0.7 3
0.8 4
0.9 5
1.0 6
I would like to add a column with a result of this formula:
(i/m) * Q
Where:
i = the rank of each value in column A (e.g 1, 2, 3, 4, 5, 6)
m = 6 (total rows)
Q = 0.05
The result would be a new column holding the value of that equation for each row. I have 730 rows, so I would like to compute this programmatically instead of doing it manually for each one.
A Rank C
0.5 1 (1/6)0.05
0.6 2 (2/6)0.05
0.7 3 (3/6)0.05
0.8 4 (4/6)0.05
0.9 5 (5/6)0.05
1.0 6 (6/6)0.05
Many Thanks
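One way this could be sketched in R, taking the product of (i/m) and Q as written in the expected table above. The data frame name df is an assumption, and Rank is recomputed with rank() only to show where it could come from; if the column already exists it can be used directly.

```r
# Hypothetical data frame holding the example column A
df <- data.frame(A = c(0.5, 0.6, 0.7, 0.8, 0.9, 1.0))

Q <- 0.05        # constant from the question
m <- nrow(df)    # total number of rows (6 here, 730 in the real data)

# i: rank of each value in A; ties.method = "first" is an assumption
df$Rank <- rank(df$A, ties.method = "first")

# Vectorised calculation, one result per row
df$C <- (df$Rank / m) * Q
print(df)
```

Because the arithmetic is vectorised, the same three lines apply unchanged to all 730 rows. If division by Q is what is actually wanted, replace `* Q` with `/ Q`.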
I got a plot where the x-axis shows the negative values running from -0.15 to -1, but I need them to run from -1 to 0.
I plotted values (both positive and negative) with geom_bar in ggplot.
Could you help me fix it?
The data frame looks like this:
id value33333
<dbl> <chr>
1 -0.6
2 -0.8
3 -1
4 -0.2
5 -1
6 0.4
7 -1
8 -1
9 -0.6
10 0.1
11 -0.6
12 -1
13 0.1
14 0.15
15 0.5
16 0.4
17 -0.95
18 0.5
19 -0.6
20 0.05
I need to plot value33333 on the x-axis and the percentage on the y-axis.
Thanks a lot!
ggplot(data = value33333) +
  geom_bar(mapping = aes(x = value33333, y = ..prop.., group = 1), stat = "count") +
  scale_y_continuous(labels = scales::percent_format()) + theme_bw()
Using xlim(-1.1,0) (-1.1 to include the last bar) works without errors.
head(value33333)
interviewer internalID value
1 Nuriya 3 -0.6
2 Nuriya 5 -0.8
3 Nuriya 7 -1.0
4 Nuriya 9 -0.2
5 Nuriya 11 -1.0
6 Nuriya 13 0.4
ggplot(data = value33333) +
geom_bar(aes(x = value, y = ..prop.., group = 1), stat = "count") +
scale_y_continuous(labels = scales::percent_format()) + theme_bw() +
xlim(-1.1,0)
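Note that the question's printout lists value33333 as <chr>. If the column really is stored as text, ggplot2 treats the axis as discrete and orders the categories as strings rather than as numbers, which may explain the unexpected order. Below is a minimal sketch assuming that is the cause; responses and value_num are hypothetical names, and after_stat(prop) is the newer spelling of ..prop.. (ggplot2 3.3.0 or later).

```r
# Hypothetical reconstruction of the first rows, with the values stored as text
responses <- data.frame(
  id = 1:6,
  value33333 = c("-0.6", "-0.8", "-1", "-0.2", "-1", "0.4"),
  stringsAsFactors = FALSE
)

# Convert the text column to numbers so the x-axis becomes continuous
responses$value_num <- as.numeric(responses$value33333)

# Numeric sorting now runs from -1 upwards
sorted_values <- sort(unique(responses$value_num))
print(sorted_values)

# Plot only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(responses) +
    geom_bar(aes(x = value_num, y = after_stat(prop), group = 1)) +
    scale_y_continuous(labels = scales::percent_format()) +
    theme_bw()
  print(p)
}
```

With a numeric column, xlim() can then be used to restrict the visible range, as in the answer above.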
I have looked at all the barplot questions on the site but still couldn't figure out what to do with my dataset. I don't know if this is a duplicate, but any help would be much appreciated.
My dataset
Region Scenario HC NPV1 NPV2
C 1 0.1 10 5
C 2 0.2 8 4
C 3 0.3 7 3
C 4 0.4 6 2
N 1 0.1 10 5
N 2 0.2 8 4
N 3 0.3 7 3
N 4 0.4 6 2
W 1 0.1 10 5
W 2 0.2 8 4
W 3 0.3 7 3
W 4 0.4 6 2
I want to create a bar plot with HC and Scenario on the x-axis, where the bar heights are NPV1 and NPV2 and the two are distinguished by different patterns. Each region's name should appear once, centered under its 4 scenarios.
Thanks a lot.
Expected output is something like this.
Further to my above comments, I'm quite unclear about how you'd like to visualise your data. What exactly would you like to show on the x axis?
As a start, perhaps you are after something like this?
library(tidyverse)
df %>%
  gather(key, val, -Region, -Scenario, -HC) %>%
  unite(x, Region, Scenario, HC) %>%
  ggplot(aes(x, val, fill = key)) +
  geom_col()
Here categories on the x-axis are of the form <Region>_<Scenario>_<HC>.
Update
To achieve a plot similar to the one you're showing you can do the following
library(tidyverse)
df %>%
  gather(key, val, -Region, -Scenario, -HC) %>%
  ggplot(aes(HC, val, fill = key)) +
  geom_col(position = "dodge2") +
  facet_wrap(~Region, nrow = 1, strip.position = "bottom") +
  theme_minimal() +
  theme(strip.placement = "outside")
Explanation: strip.position = "bottom" ensures that strip labels are at the bottom, and strip.placement = "outside" ensures that strip labels are below the axis labels (to be precise, between the axis labels and axis title).
Sample data
df <- read.table(text =
"Region Scenario HC NPV1 NPV2
C 1 0.1 10 5
C 2 0.2 8 4
C 3 0.3 7 3
C 4 0.4 6 2
N 1 0.1 10 5
N 2 0.2 8 4
N 3 0.3 7 3
N 4 0.4 6 2
W 1 0.1 10 5
W 2 0.2 8 4
W 3 0.3 7 3
W 4 0.4 6 2
", header = T)
I am trying to plot a stacked bar chart with multiple facets using the code below:
library(reshape2)  # provides melt()
library(ggplot2)
dat <- read.csv(file="./fig1.csv", header=TRUE)
dat2 <- melt(dat, id.var = c("id", "col1", "label"))
ggplot(dat2, aes(x=id, y=value, fill = variable)) +
  geom_bar(stat="identity") +
  scale_x_discrete(limits=dat2$label) +
  facet_grid(. ~ col1) +
  geom_col(position = position_stack(reverse = TRUE))
and here is a minimal example of what my data looks like:
id label col1 col2 col3 col4 col5
1 3 1 0.2 0.1 0.1 0.1
2 3 1 0.2 0.1 0.2 0.1
3 4 1 0.2 0.2 0.2 0.1
4 4 1 0.1 0.1 0.2 0.1
5 7 2 0.1 0.1 0.1 0.2
6 8 2 0.2 0.1 0.1 0.1
7 9 2 0.2 0.1 0.2 0.1
8 9 2 0.2 0.2 0.2 0.1
9 9 2 0.1 0.1 0.2 0.1
The problem is that the labels do not show up as I expect. The labels for the facet where col1 is 1 are repeated in the facet where col1 is 2, which means the labels (7,8,9,9,9) are ignored. Also, when consecutive labels are the same, they appear only once: for instance, after the first label (3) appears, the second label, which is also 3, is dropped. Does anyone know how I can show the labels exactly as they are listed in the label column?
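One possible approach, offered as a sketch rather than a confirmed fix: keep id as the x position, so that every bar has its own slot, and use the label column only for the tick text. Passing the labels through limits makes them the axis categories, and a discrete axis keeps each distinct category only once, which would account for the collapsed duplicates. The name tick_labels is hypothetical, the long format is built in base R here to stand in for melt(), and scales = "free_x" is assumed to be what lets each facet show only its own ids.

```r
# Sample data from the question
dat <- data.frame(
  id    = 1:9,
  label = c(3, 3, 4, 4, 7, 8, 9, 9, 9),
  col1  = c(1, 1, 1, 1, 2, 2, 2, 2, 2),
  col2  = c(0.2, 0.2, 0.2, 0.1, 0.1, 0.2, 0.2, 0.2, 0.1),
  col3  = c(0.1, 0.1, 0.2, 0.1, 0.1, 0.1, 0.1, 0.2, 0.1),
  col4  = c(0.1, 0.2, 0.2, 0.2, 0.1, 0.1, 0.2, 0.2, 0.2),
  col5  = c(0.1, 0.1, 0.1, 0.1, 0.2, 0.1, 0.1, 0.1, 0.1)
)

# Long format: one row per id and value column
value_cols <- c("col2", "col3", "col4", "col5")
dat2 <- data.frame(
  id       = rep(dat$id, times = length(value_cols)),
  col1     = rep(dat$col1, times = length(value_cols)),
  variable = rep(value_cols, each = nrow(dat)),
  value    = unlist(dat[value_cols], use.names = FALSE)
)

# Named vector mapping each id (as text) to the label shown under its bar
tick_labels <- setNames(as.character(dat$label), dat$id)

# Plot only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(dat2, aes(x = factor(id), y = value, fill = variable)) +
    geom_col(position = position_stack(reverse = TRUE)) +
    scale_x_discrete(labels = tick_labels) +
    facet_grid(. ~ col1, scales = "free_x", space = "free_x")
  print(p)
}
```

Because the ticks are keyed by id, repeated labels such as 3, 3 or 9, 9, 9 each get their own tick. This sketch uses a single geom_col() layer, since geom_bar(stat = "identity") would draw the same bars a second time.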