I am just a beginner in R and I have a set of historical monthly yield data that looks like the sample data below:
dates rt
1 1990-01 0.0790
2 1990-02 0.0800
3 1990-03 0.0817
4 1990-04 0.0804
5 1990-05 0.0801
These data are stored in a data frame called 'data'.
I then attempted to plot the graph as follows.
#First attempt:
data <- data.frame(dates, rt)
matplot(data, type = 'l', xlab = 'month', ylab = 'US 3M rates',
main = 'US short rates 1990 - 2019')
The resulting graph showed the x-axis labels as 0, 50, 100, etc. (I have a total of 350 rows of data.)
#Second attempt:
data <- data.frame(dates, rt)
matplot(data, type = 'l', xlab = 'month', ylab = 'US 3M rates',
main = 'US short rates 1990 - 2019', xaxt = 'n')
axis(1, at = data$dates)
The resulting graph left the x-axis blank with the following warning/error messages:
Warning in xy.coords(x, y, xlabel, ylabel, log = log, recycle = TRUE) :NAs introduced by coercion
Warning in xy.coords(x, y, xlabel, ylabel, log) :NAs introduced by coercion
Warning in axis(1, at = data$dates) : NAs introduced by coercion
Error in axis(1, at = data$dates) : no locations are finite
#Third attempt:
data <- data.frame(dates, rt)
matplot(data, type = 'l', xlab = 'month', ylab = 'US 3M rates', main = 'US short rates 1990 - 2019', xaxt = 'n')
axis(1, at = format(data$dates, "%y-%m"))
This also left the x-axis blank, with the following error message:
Error in format.default(data$dates, "%y-%m") : invalid 'trim' argument
How should the code be changed so that the x-axis shows the dates as in the sample data, i.e. 1990-01 and so on?
The problem is that the 'dates' column is just a character vector, and you have to tell R what '1990-01' means.
The easiest way would be to convert the dates column to Date format and then to use plot instead of matplot:
data$dates <- as.Date(paste0(data$dates, "-01"))
plot(data, type = 'l', xlab = 'month', ylab = 'US 3M rates',
main = 'US short rates 1990 - 2019')
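If the goal is tick labels in the original 1990-01 style, one possible follow-up is to suppress the default axis and draw it with axis.Date(). This is only a sketch: the rates below are simulated placeholders, not the asker's data, and the five-year tick spacing is an arbitrary choice.

```r
# Hypothetical stand-in for the asker's data: 350 monthly rows, made-up rates
dates <- format(seq(as.Date("1990-01-01"), by = "month", length.out = 350), "%Y-%m")
set.seed(1)
rt <- 0.08 + cumsum(rnorm(350, sd = 0.001))
data <- data.frame(dates, rt)

# Append a day so as.Date() can parse the year-month strings
data$dates <- as.Date(paste0(data$dates, "-01"))

# Plot without the default x-axis, then add one formatted as year-month
plot(data$dates, data$rt, type = "l", xaxt = "n",
     xlab = "month", ylab = "US 3M rates")
ticks <- seq(min(data$dates), max(data$dates), by = "5 years")
axis.Date(1, at = ticks, format = "%Y-%m")
```

Labelling all 350 months would overlap, which is why the sketch labels only every fifth year.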
Related
I am plotting the rolling mean, standard deviation, skewness and autocorrelation of a dataset, but so far I have only been able to plot them using xaxt='n'. When I try to add the x-axis, I am told that the x and y lengths differ. I am not sure whether the rollapply function is changing the lengths in some way: length(BK_W$Year) and length(BK_W$Yield) are both 157, but after I use rollapply the lengths of BK_W_Roll_Mean, SD, Skew and Auto all come out as 128. I am trying to understand why rollapply changes the length, and whether there is an easier way to plot this data than what I am doing. I just want the x-axis to be labeled with years, from year1 to year2.
window.size = 30
BK_W_Roll_Mean <- rollapply(BK_W$Yield,window.size,mean, na.rm = T)
BK_W_Roll_SD <- rollapply(BK_W$Yield,window.size,StdDev, na.rm = T)
BK_W_Roll_Skew <- rollapply(BK_W$Yield,window.size,skewness, na.rm = T)
# Moving windows with 'rollapply' function for autocorrelation of lag 1
BK_W_Roll_Auto <- rollapply(BK_W$Yield, window.size, FUN=function(x) acf(x,lag.max = 1, type = "correlation", na.action = na.pass, plot = FALSE)$acf[2])
# Moving windows plots
x11(width=40,height=20)
par(mfrow=c(2,2))
plot(BK_W$Year, BK_W_Roll_Mean, type = "l", col = "orange", lwd=2, lty=1, main="example", xlab="Time", ylab="Mean")
plot(BK_W$Year, BK_W_Roll_SD, type = "l", col = "red", lwd=2, lty=1, main="example", xlab="Time", ylab="Std. Deviation")
plot(BK_W$Year, BK_W_Roll_Skew, type = "l", col = "purple", lwd=2, lty=1, main="example", xlab="Time", ylab="Skewness")
plot(BK_W$Year, BK_W_Roll_Auto, type = "l", col = "blue", lwd=2, lty=1, main="example", xlab="Time", ylab="Lag 1 Autocorrelation")
I tried harmonizing the data using [1:length(BK_W$year)] for each of the mean, sd, skew and auto sets, which gave a graph with a labeled x-axis, but the data was off and not aligned with the right year values.
I also tried adding xlim(year1,year2), but the graph comes up without the trend line. I also tried just running it without xaxt='n', but it came up with a plot from 0-n data points instead of the desired time range in years.
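The length change is window arithmetic: a window of width 30 only fits completely at 157 - 30 + 1 = 128 positions. The sketch below shows this in base R with embed() instead of zoo, on made-up data (the years and yields are placeholders). With zoo itself, rollapply() takes fill = NA and align arguments, which pad the result back to the input length.

```r
# Hypothetical stand-in for BK_W: 157 yearly observations, made-up values
set.seed(42)
BK_W <- data.frame(Year = 1861:2017, Yield = rnorm(157))
window.size <- 30

# Each row of embed() holds one complete 30-observation window
windows <- embed(BK_W$Yield, window.size)
roll_mean <- rowMeans(windows)
length(roll_mean)  # 128, i.e. 157 - 30 + 1

# Option 1: plot against the year that closes each window
end_year <- BK_W$Year[window.size:nrow(BK_W)]
plot(end_year, roll_mean, type = "l", xlab = "Time", ylab = "Mean")

# Option 2: pad with NA so the result lines up with BK_W$Year
roll_mean_padded <- c(rep(NA, window.size - 1), roll_mean)
plot(BK_W$Year, roll_mean_padded, type = "l", xlab = "Time", ylab = "Mean")
```

Whether each value belongs to the first, middle or last year of its window is a labelling choice. The sketch uses the last year, which corresponds to a right-aligned window.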
I'm still new to R and have a project I'm trying to complete for my course. I'm looking to create a chart that shows average patient ratings versus average hospital ratings based on topics.
DATA FILE can be found in my GitHub repository: https://github.com/rachh8283/pt-satisfaction-r-tableau
df <- import("clean_file.xlsx")
### Average ratings by topic using SQL
avg_ratings <- sqldf('SELECT Topic, AVG(PtRating) AS MeanPtRate, AVG(HospRating) AS MeanHospRate
FROM df
WHERE Ownership == "Proprietary"
GROUP BY Topic')
### Remove missing values from dataset.
I have also tried doing this for the avg_ratings object (data frame?), but that doesn't work either.
df <- na.omit(df)
### Here's where I'm getting errors.
I'm trying to plot the two variables based on topic (another variable).
plot(avg_ratings$Topic, avg_ratings$MeanPtRate, main="Average Patient Rating vs Average Hospital Rating by Topic", type = "b",
pch=19, col = "red", xlab="Topic", ylab="Rating")
lines(avg_ratings$Topic, avg_ratings$MeanHospRate, type = "b", pch=18, col = "blue", lty=1)
legend("topright", legend=c("Patient Rating", "Hospital Rating"), col=c("red", "blue"),
lty=1, cex=0.8)
Here's the error I'm getting:
> plot(avg_ratings$Topic, avg_ratings$MeanPtRate, main="Average Patient Rating vs Average Hospital Rating by Topic", type = "b",
+ pch=19, col = "red", xlab="Topic", ylab="Rating")
Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
2: In min(x) : no non-missing arguments to min; returning Inf
3: In max(x) : no non-missing arguments to max; returning -Inf
> lines(avg_ratings$Topic, avg_ratings$MeanHospRate, type = "b", pch=18, col = "blue", lty=1)
Warning message:
In xy.coords(x, y) : NAs introduced by coercion
> legend("topright", legend=c("Patient Rating", "Hospital Rating"), col=c("red", "blue"),
+ lty=1, cex=0.8)
Thanks in advance for the help!
Ok, this answer will need some editing pending some clarification. I'm not using the sqldf function in the following. I'm also using ggplot2 because you have that in a tag:
library(tidyverse)
hospital_data <- readxl::read_xlsx("pt-satisfaction-r-tableau-main/clean_file.xlsx")
str(hospital_data) # Topic needs to be a factor
# Convert Topic to Factor:
hospital_data <- hospital_data %>%
mutate(Topic = forcats::as_factor(Topic))
# Summarise means
hospital_data_summarise <- hospital_data %>%
filter(Ownership == "Proprietary") %>%
group_by(Topic) %>%
summarise(ave_pt_rating = mean(PtRating),
ave_hosp_rating = mean(HospRating)) %>%
ungroup()
hospital_data_summarise
# Going with geom point
ggplot(hospital_data_summarise, aes(x = Topic, y = ave_pt_rating)) +
geom_point() +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
Is this roughly what you are after? You may need to adjust the file path in line 2 (the read_xlsx() call).
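For the base-graphics route taken in the question, the "need finite 'xlim' values" error comes from handing plot() a character Topic column, which is coerced to NA. A minimal sketch of one workaround is to plot against integer positions and put the topic names on the axis afterwards. The summary table below is a made-up placeholder, not the output of the real query.

```r
# Hypothetical stand-in for avg_ratings (topics and numbers are made up)
avg_ratings <- data.frame(
  Topic        = c("Cleanliness", "Communication", "Discharge"),
  MeanPtRate   = c(3.1, 3.4, 2.9),
  MeanHospRate = c(3.0, 3.6, 3.2)
)

pos <- seq_len(nrow(avg_ratings))  # numeric x positions, one per topic
y_range <- range(avg_ratings$MeanPtRate, avg_ratings$MeanHospRate)

# Draw both series against the positions, with the default x-axis suppressed
plot(pos, avg_ratings$MeanPtRate, type = "b", pch = 19, col = "red",
     xaxt = "n", ylim = y_range, xlab = "Topic", ylab = "Rating")
lines(pos, avg_ratings$MeanHospRate, type = "b", pch = 18, col = "blue")

# Label the positions with the topic names
axis(1, at = pos, labels = as.character(avg_ratings$Topic))
legend("topright", legend = c("Patient Rating", "Hospital Rating"),
       col = c("red", "blue"), lty = 1, cex = 0.8)
```

Setting ylim from both columns keeps the second series from being clipped when its range differs from the first.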
This is the complete code:
require(RCurl)
require(foreign)
require(tidyverse)
x = getURL("https://raw.githubusercontent.com/RInterested/PLOTS/master/drinks_csv.csv")
data <- read.csv(textConnection(x))
data <- data[,c(1:5,8)]
interest <- c(1,2,3,7,9,10,11,14,16,17,21,24,26,29,30,31,34,35,36,38,
39,40,42,44,47,48,49,50,55,57,61,62,63,65,69,70,71,72,
73,74,75,76,77,78,79,80,81,82,83,84,85,89,91,92,93,94,
100,102,103,104,105,106,108,110,111,112,113,114,115,116,
118,119,121,122,125,126,127,134,141,142,143,144,152,154,
155,158,159,160,162,169)
data <- data[interest,]
data <- droplevels(data)
data <- data[with(data, order(data$wine_servings)),]
row.names(data) <- 1:nrow(data)
plot(data$country,data$wine_servings,las=2, xlab="", ylab="", xaxt = 'n', yaxt = 'n')
axis(1, at = 1:length(data$country), labels = data$country, cex.axis = 0.6,las = 2)
The x-axis should display the countries in the order of the value of the y axis. Therefore the plot should be increasing in values from left to right. Yet, this is not what I get, resulting in a misleading and incorrect plot.
I presume that even though I relabeled the rows after sorting, it is still using the initial row values...
In the code you have above:
head(data$country)
[1] afghanistan bangladesh india indonesia iran iraq
90 Levels: afghanistan albania algeria argentina australia ... zimbabwe
This is a factor. When you plot it, the x positions follow the factor's level order (alphabetical here) rather than the row order of the data frame, and all the levels are dragged along, for example:
plot(head(data$country),rep(1,6))
Your data frame is ok, we can just do:
plot(1:nrow(data),data$wine_servings,las=2, xlab="", ylab="", xaxt = 'n', yaxt = 'n')
axis(1, at = 1:length(data$country), labels = as.character(data$country), cex.axis = 0.6,las = 2)
In short, be careful when a column is a factor, because base R's plot() treats a factor differently from a numeric or character vector.
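The point about levels can be checked on a tiny example. The three rows below are made-up values, already sorted by wine_servings. Sorting the rows does not reorder the factor levels, which is why plotting against row positions and labelling with as.character() gives the intended order.

```r
# Made-up three-row example, already sorted by wine_servings
data <- data.frame(country = factor(c("spain", "chile", "france")),
                   wine_servings = c(112, 172, 370))

levels(data$country)      # "chile" "france" "spain": still alphabetical
as.integer(data$country)  # 3 1 2: level codes, not row order

# Plot against row positions and label them with the country names
pos <- seq_len(nrow(data))
plot(pos, data$wine_servings, xaxt = "n", xlab = "", ylab = "wine_servings")
axis(1, at = pos, labels = as.character(data$country), las = 2, cex.axis = 0.6)
```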
I have a database table from which I retrieve a column and plot it using the plot function.
Here is column of table
Profit
1 21200
2 28000
3 29600
4 30200
5 33000
6 26800
7 32600
8 30000
9 28000
10 34000
There are 60 rows in total, but I am showing only 10 here.
When I try to plot the graph, I get a straight line parallel to the x-axis. Profit is changing, so I don't think the line should be parallel to the x-axis. Since the table is in a database on AWS, I first retrieve the Profit column from the table and then plot it using the plot function. Here is the plotting code:
choices = dbGetQuery(pool,"select Profit from input11;")
plot(choices, type = "l", lwd = 3, main = "Profit", col = "green",
xlab = "Number of Overbooking", ylab = "Profit")
I am also getting these warning messages:
Warning messages:
1: In plot.window(xlim, ylim, log, ...) :
graphical parameter "type" is obsolete
2: In axis(side = side, at = at, labels = labels, ...) :
graphical parameter "type" is obsolete
3: In title(xlab = xlab, ylab = ylab, ...) :
graphical parameter "type" is obsolete
When I remove type = "l", the warning messages disappear, but I want the plot drawn as a line.
Based on this R Help thread, the Profit column is probably of class factor. Let's test:
Below works fine, when numeric:
plot(1:10, type = "l")
With a factor, it plots, but with warnings:
plot(factor(1:10), type = "l")
Warning messages:
1: In plot.window(xlim, ylim, log = log, ...) :
graphical parameter "type" is obsolete
2: In axis(if (horiz) 2 else 1, at = at.l, labels = names.arg, lty = axis.lty, :
graphical parameter "type" is obsolete
3: In title(main = main, sub = sub, xlab = xlab, ylab = ylab, ...) :
graphical parameter "type" is obsolete
4: In axis(if (horiz) 1 else 2, cex.axis = cex.axis, ...) :
graphical parameter "type" is obsolete
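A minimal sketch of one way to get the line plot, assuming the query result is a one-column data frame whose Profit values are either numeric or a factor with numeric labels. The data frame below is a stand-in built from the ten rows shown in the question. Passing a plain numeric vector, rather than the whole data frame or a factor, lets plot() draw an ordinary line.

```r
# Stand-in for the dbGetQuery() result, using the ten rows from the question
choices <- data.frame(Profit = c(21200, 28000, 29600, 30200, 33000,
                                 26800, 32600, 30000, 28000, 34000))

# Extract the column; going through as.character() means a factor
# is converted via its labels rather than its internal level codes
profit <- as.numeric(as.character(choices$Profit))

plot(profit, type = "l", lwd = 3, main = "Profit", col = "green",
     xlab = "Number of Overbooking", ylab = "Profit")
```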
I have an issue with creating custom axes in the base plotting system in R. I have the following data frame, for which I want to plot a trend showing the change for each year:
year <- c(2000, 2002, 2005, 2009)
values <- c(7332967, 5332780, 5135760, 3464206)
x <- data.frame(year, values)
## year values
## 1 2000 7332967
## 2 2002 5332780
## 3 2005 5135760
## 4 2009 3464206
My first attempt is:
plot(x$year, x$value,
xlab = "Year",
ylab = "Value",
type = "b")
However, that gives me a skewed x and y axis for the four values I have in the data frame. I would like for the x axis to only contain the four values under the "year" column and y axis to only contain the four values under the "values" column.
For this purpose I tried to create custom x and y axis but that resulted in errors:
plot(x$year, x$value,
type = "b",
xaxt = "n",
yaxt = "n",
xlab = "Year",
ylab = "Values",
axis(1, at = 1:nrow(x), labels = x$year),
axis(2, at = 1:nrow(x), labels = x$value))
"Error in plot.window(...) : invalid 'xlim' value"
and:
plot(x$year, x$value,
type = "b",
xaxt = "n",
yaxt = "n",
xlab = "Year",
ylab = "Values",
axis(1, at = 1:nrow(x), labels = x$year),
axis(2, at = 1:nrow(x), labels = x$value),
xlim = c(min(data_plot$year), max(data_plot$year)),
ylim = c(min(data_plot$Emissions), max(data_plot$Emissions)))
"Error in strsplit(log, NULL) : non-character argument"
I am quite new to R and tried searching for solutions on various sites, however, nothing seems to solve the issue so any help provided would be much appreciated.
axis is a separate function, not an argument to plot, so try the following:
# First make some extra space on the left for the long numeric axis labels
par(mar=c(5, 6, 1, 1))
# Now plot the points, but suppress the axes
plot(x$year, x$values, xaxt='n', yaxt='n', xlab='Year', ylab='', type='b')
# Add the axes
axis(1, at=x$year, labels=x$year, cex.axis=0.8)
axis(2, at=x$values, labels=x$values, las=1, cex.axis=0.8)
# Add the y label a bit further away from the axis
title(ylab='Value', line=4)