Groups with fewer than two data points have been dropped. ggplot r

A couple of weeks ago, I drew a ggplot density plot in R. It worked fine. Then yesterday, I revisited it and ran the same code. There was absolutely no change in the body of the code (the only changes were to the R Markdown formatting, like changing the font size of text, echo = FALSE, etc.). I simply re-ran the same code, and for some reason it no longer works.
Here's the code for the dataset:
m_hospitals = m_hospitals %>%
  group_by(Provider.Name) %>%
  summarise(accSum = sum(Average.Covered.Charges),
            atpSum = sum(Average.Total.Payments),
            ampSum = sum(Average.Medicare.Payments),
            accMean = mean(Average.Covered.Charges),
            atpMean = mean(Average.Total.Payments),
            ampMean = mean(Average.Medicare.Payments))
#Take a sample
set.seed(1219)
sample_m_hospitals = m_hospitals[sample(nrow(m_hospitals), 30), ]
And here's the code for the plot that worked before but doesn't anymore:
ggplot(data = sample_m_hospitals) +
  aes(x = Provider.Name, y = accMean) +
  geom_density(alpha = .75) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
It gives me this message 30 times: "Groups with fewer than two data points have been dropped".
It is true that each of the 30 rows has only 1 observation, because they are summarised. The funny thing is that the message did not pop up before; no data points were dropped and the plot worked fine. I even have a screenshot of the plot that was drawn.
One change I am suspicious about is updating R (from 3.5.0 to 3.5.1, before re-plotting), which erased all my libraries, so I had to re-install all of them. But I'm not sure whether the update has anything to do with this issue. It worked fine, so why is it suddenly not working? I don't understand. Please help! I just want to plot this in a similar form, however possible.
Update, following your comments: I have added screenshots of the devtools::session_info() output and of sample_m_hospitals.

I had a similar issue which was resolved by converting columns. Numeric columns were being imported as strings.
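As a minimal sketch of the conversion described above (the data frame below is made up for illustration and is not the asker's data), a numeric column that arrived as text can be checked with str() and converted with as.numeric():

```r
# Hypothetical data frame: accMean was imported as character instead of numeric
df <- data.frame(
  Provider.Name = c("A", "B", "C"),
  accMean = c("1200.5", "980", "1510.25"),
  stringsAsFactors = FALSE
)

# Inspect the column types; accMean shows up as chr here
str(df)

# Convert the text column to numeric.
# If the column were a factor, use as.numeric(as.character(x)) instead,
# because as.numeric() on a factor returns the internal level codes.
df$accMean <- as.numeric(df$accMean)

str(df)
```

Whether this applies to the question above depends on what str(sample_m_hospitals) reports, which the post does not show.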

Related

How to scatterplot in RStudio

I am trying to create a scatterplot in RStudio with my data. I am new to RStudio and having a hard time understanding it. The code I have found says to use plot().
This is what I used: plot(pa2_wti2$ï..Approving, pa2_wti2$ï..Price)
Even when I tried a single variable, it didn't give me a scatterplot. I have tried ggplot, but it says it cannot find what I have put as x.
I have tried:
ggplot(pa2_wti2) +
geom_point(aes(x = Price, y= Approving))
ggplot(pa2_wti2) +
geom_point(aes(x = ï..Price, y= ï..Approving))
Any help is welcomed. Thanks!
Here is some additional information:
wti2 <- read.csv("C:/Users/thomp/OneDrive/Desktop/FQM/Data Project 2/WTI Price Only Take 2.csv")
summary(wti2)
table(wti2)
sd(wti2$ï..Price)
wti2_prices2 = rnorm(wti2$ï..Price, mean=43.92, sd=28.1762)
pa2 <- read.csv("C:/Users/thomp/OneDrive/Desktop/FQM/Data Project 2/PA Approval Only Take 2.csv")
summary(pa2)
table(pa2)
sd(pa2$ï..Approving)
pa2_approve = rnorm(pa2$ï..Approving, mean=50.09, sd=11.46188 )
plot(pa2_wti2$ï..Approving)
plot(pa2_wti2$ï..Approving, pa2_wti2$ï..Price)
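A column name such as ï..Price usually means the CSV file starts with a UTF-8 byte order mark that ended up in the first header. The sketch below, which uses a small made-up file rather than the asker's data, shows one way to read such a file so that the column keeps its plain name; fileEncoding = "UTF-8-BOM" is a standard read.csv option, while the file contents and values are assumptions.

```r
# Write a small hypothetical CSV that starts with a UTF-8 byte order mark
tmp <- tempfile(fileext = ".csv")
bom <- as.raw(c(0xEF, 0xBB, 0xBF))
writeBin(c(bom, charToRaw("Price,Approving\n43.9,50.1\n60.2,47.3\n")), tmp)

# Tell read.csv about the BOM so it is not glued onto the first column name
with_bom <- read.csv(tmp, fileEncoding = "UTF-8-BOM")
names(with_bom)

# Both variables now live in one data frame and can be plotted against each other
plot(with_bom$Approving, with_bom$Price)
```

Note that the code in the question reads two separate files into wti2 and pa2 but plots from pa2_wti2, which is never created in the code shown.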

Histogram runs into object not found error

I am a student who is quite new to R and am having difficulty linking my dataset to the actual workspace. In particular, I am trying to create a histogram to show what life expectancy looks like across zip codes of a state, but nothing is showing up.
This is what my code looks like:
install.packages("ggplot2")
library(ggplot2)
ggplot(data = df_mo, aes(x = life_expectancy)) + geom_histogram(color = "tomato")
Here is what my error message in the console states:
>ggplot(data = df_mo, aes(x = life_expectancy)) + geom_histogram(color = "tomato")
Error in FUN(X[[i]], ...) : object 'life_expectancy' not found
>
Here is what my dataset looks like:
This may be quite an elementary problem, I imagine, but I don't have a clue and have been at this for an hour. I've tried to look this problem up, but everything I've seen has some additional bells and whistles added to the code, or they are receiving a different error message.
Thank you in advance.
The problem is that the data frame doesn't have the column names that you want. As shown in the picture, the names are V1, V2, .... Something like:
# use the first row as column names, then drop that row
colnames(df_mo) = as.character(unlist(df_mo[1, ]))
df_mo = df_mo[-1, ]
should do the trick. You should also reconsider the way you are loading the data into R, so that it uses the first line as column names.
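As a small sketch of that last point (the file and the zipcode column are made up for illustration), reading the file with header = TRUE gives the intended column names directly and keeps numeric columns numeric:

```r
# Hypothetical CSV whose first line holds the column names
tmp <- tempfile(fileext = ".csv")
writeLines(c("zipcode,life_expectancy", "63101,74.2", "63102,78.9"), tmp)

# Without a header, R invents the names V1, V2 and treats the header as data
no_header <- read.csv(tmp, header = FALSE)
names(no_header)

# With header = TRUE the first line becomes the column names
with_header <- read.csv(tmp, header = TRUE)
names(with_header)
str(with_header)
```

With the header read this way, aes(x = life_expectancy) can find the column, and no manual renaming or type conversion is needed.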

How to change or reset parameters in the plot(ACF)-device within R-studio

I have estimated a two-intercept multilevel mixed model using the lme function of the R package nlme.
After that, I checked for autocorrelation by visual inspection using plot(ACF(...)).
The first time I plotted, I specified maxlag=16.
Now I have two problems. First, the maxlag parameter seems to be stuck somehow, i.e. all further plots are drawn with maxlag=16 even when maxlag is set to other values. Second, the plot is cropped at y=0.8 even though the value at lag 0 is obviously 1.
Below I share the reprex in the hope of getting answers or input on how to solve these two issues.
Link to the dataset (if preferred, it can also be copy-pasted into the following script):
#read.dataset:
datafclr <-read.csv("datafclr.csv", header = TRUE, sep = ",", dec = ".", fill = TRUE)
#required packages:
library("Matrix")
library("nlme")
#model-estimation:
tim2 <- lme(fixed = EERTmn ~ male + female +
              (male:time7c) + (female:time7c) +
              (male:IERT_Cp) + (female:IERT_Cp) +
              (male:IERT_Cp_Partner) + (female:IERT_Cp_Partner) - 1,
            control = list(maxIter = 100000), data = datafclr,
            random = ~ male + female - 1 | dyade/female,
            correlation = corAR1(), na.action = na.omit)
summary(tim2)
#checking for autocorrelation:
plot(ACF(tim2, maxlag = 16), alpha = 0.01)
This results in the following plot:
When I change the maxlag:
plot(ACF(tim2, maxlag = 10), alpha = 0.01)
it results in the same plot.
Many thanks in advance!
Best,
Patrick
Joes Schwartz helped me solve these issues in the RStudio Community. In case someone runs into the same difficulties, I'm sharing his answers here:
First issue: maxlag needs to be typed maxLag (capital L); then the function works fine.
Second issue: detailed help is available at the following link:
https://community.rstudio.com/t/resetting-plotting-settings-plot-acf-data/19441
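To illustrate the first point, here is a minimal sketch using the Ovary data that ships with nlme instead of the asker's dataset; the model is a simplified stand-in, not the model from the question. R argument names are case-sensitive, so a misspelled maxlag is presumably absorbed by the ... argument of ACF() and ignored, which would explain why the plot never changed.

```r
library(nlme)

# Stand-in model on a dataset bundled with nlme (not the asker's model)
fm <- lme(follicles ~ sin(2 * pi * Time) + cos(2 * pi * Time),
          data = Ovary, random = ~ 1 | Mare)

# Correct spelling: maxLag with a capital L
acf5  <- ACF(fm, maxLag = 5)
acf10 <- ACF(fm, maxLag = 10)

# The result has one row per lag, starting at lag 0
nrow(acf5)
nrow(acf10)

plot(acf10, alpha = 0.01)
```

Comparing nrow() for different maxLag values is a quick way to confirm that the argument is actually being picked up.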

Server Error: Invalid plot index and many duplicate plots

I am not sure why, but it seems as if my code is plotting LOTS of plots in RStudio. I am new to R and RStudio, so I can't figure out how many are being plotted, but I think there are duplicates, previous plots, and the plots I want, all within the Plots tab in RStudio. Also, when I try to scroll to see which plots were created, I get popups.
I am expecting 5 plots, one for each state, but it seems as though I am getting a lot more.
library(ggplot2)
try(data('midwest', package='ggplot2'))
for (s in unique(midwest$state)) {
  state_data = subset(midwest, state == s)
  print(
    ggplot(state_data, aes(x=county, y=percprof)) +
      geom_bar(stat='identity') +
      labs(title=paste(s)) +
      xlab('Counties') + ylab('Percentage') +
      theme(axis.text.x = element_text(angle = 45, hjust = 1))
  )
}
Your code runs fine for me. Try restarting R. It may be a server issue, but there's nothing wrong with your code.
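One way to check how many plots the code itself produces, independent of whatever is left over in the Plots tab from earlier runs, is to build the plots into a list and count them. This is only a sketch of that idea; geom_col() is used here as the equivalent of geom_bar(stat = 'identity').

```r
library(ggplot2)
data("midwest", package = "ggplot2")

states <- unique(midwest$state)

# Build one plot per state and keep them in a named list instead of printing
plots <- lapply(states, function(s) {
  ggplot(subset(midwest, state == s), aes(x = county, y = percprof)) +
    geom_col() +
    labs(title = s, x = "Counties", y = "Percentage") +
    theme(axis.text.x = element_text(angle = 45, hjust = 1))
})
names(plots) <- states

# Number of plots the code creates: one per state
length(plots)

# Print a single plot on demand
print(plots[["IL"]])
```

If length(plots) matches the number of states, any extra plots in the Plots tab come from previous runs rather than from the loop.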

Automatically adjust plot title width using ggplot

I am fairly new to R/ggplot2 and still learning on the go. Hopefully I am not missing something obvious!
I am trying to create several different plots using ggplot2, which I arrange with the plot_grid function from the cowplot package to show the plots side by side and add plot numbering and captions. The problem is that if the generated plots are displayed in a small window, or I have many plots beside one another, the titles of adjacent plots sometimes overlap. To solve this problem, I tried to automatically insert line breaks into titles that are too long, using code I found in another thread, since I wanted the text size of the titles to stay constant.
Using the following code, I can easily insert the necessary line breaks automatically to make my title a specific width, but the problem is that I always need to enter a numeric value for the width. Depending on the number of plots I am inserting, this value would of course change. I could go through my code and manually set the width for each set of plots until it is the correct value, but I was hoping to automate this process so that the title width is adjusted automatically to match the width of the x-axis. Is there any way to implement this in R?
#automatically line break and add titles
myplot_theme1 = function (plot, x.title = NULL, y.title = NULL, plot.title = NULL) {
plot +
labs(title = paste(strwrap(plot.title, width = 50), collapse = "\n"),
x = x.title,
y = y.title)
}
# generate an example plot
data_plot <- data.frame(x = rnorm(1000), y = rnorm (1000))
plot1 <- ggplot(data_plot, aes(x = x, y = y)) + geom_point()
title <- "This is a title that is very long and does not display nicely"
myplot_theme1(plot1, plot.title = title)
I have tried searching, but I haven't found any solutions that seem to address what I am looking for. The only solution I did find that looked promising was based on the package gridDebug. This package doesn't seem to be supported by my operating system anymore, though (macOS Sierra Version 10.12.6), since when I try to install it I get the following error message:
Warning in install.packages: dependencies ‘graph’, ‘Rgraphviz’ are not available
And on the CRAN package documentation it states that the package is not even available for macOS El Capitan which was my previous operating system. If someone knows what is causing this issue so that I could try the solution from the above thread that would of course be great as well.
One idea (but perhaps not an ideal solution) is to adjust the size of text based on the number of characters in the title. You can adjust ggplot properties using theme and in this case you want to adjust plot.title (the theme property, not your variable). plot.title has elements size and horizontal justification hjust, the latter is in range [0,1].
# generate an example plot
data_plot <- data.frame(x = rnorm(1000), y = rnorm (1000))
plot1 <- ggplot(data_plot, aes(x = x, y = y)) + geom_point()
title1 <- "This is a title that is very long and does not display nicely"
title2 <- "I'm an even longer sentence just test me out and see if I display the way you want or you'll be sorry"
myplot_theme1 = function (plot, x.title = NULL, y.title = NULL, plot.title = NULL) {
plot +
labs(title = plot.title,
x = x.title,
y = y.title) +
theme(plot.title = element_text(size=800/nchar(plot.title), hjust=0.5)) # 800 is arbitrarily chosen
}
myplot_theme1(plot1, plot.title = title1)
myplot_theme1(plot1, plot.title = title2)
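If the text size should stay constant, as the question asks, another rough option is to derive the wrap width from the number of plots placed side by side. The helper below is a hypothetical sketch: wrap_title, n_plots and base_width are names invented here, and base_width = 80 is an arbitrary starting value that would need tuning for a given figure size. It does not measure the actual x-axis width.

```r
# Hypothetical helper: wrap a title to a width that shrinks as more plots
# share a row. base_width is an arbitrary assumption, not a measured value.
wrap_title <- function(title, n_plots = 1, base_width = 80) {
  width <- max(10, floor(base_width / n_plots))
  paste(strwrap(title, width = width), collapse = "\n")
}

title <- "This is a title that is very long and does not display nicely"

# One plot per row: the title fits on a single line
cat(wrap_title(title, n_plots = 1), "\n")

# Three plots per row: the title is broken into several shorter lines
cat(wrap_title(title, n_plots = 3), "\n")
```

The wrapped string can then be passed to labs(title = ...) in place of the fixed width = 50 call in the original function.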
