I would like to make a bubble plot of two ordinal variables plotted against each other in SAS, with a loess line drawn through it. Could somebody help me with this?
More specifically:
The two variables contain scores between 0 and 10.
My data looks pretty much like this:
data dataset;
Obs var1 var2
1 0 4
2 3 2
3 3 2
4 2 5
5 6 9
6 7 9
7 1 7
8 7 9
What I'm doing right now is just making a scatterplot and drawing a loess line through it. But since a scatterplot of this kind of data only gives you a grid-like graph, I would like to make a bubble plot instead, so that the bubble size represents the frequency of each case (in my example, the bubbles at (3,2) and (7,9) would be a bit bigger than the rest).
Afterwards, however, I would still like to be able to draw that loess line through it.
Not exact, but hopefully enough to get you started:
data dataset;
input obs var1 var2;
cards;
1 0 4
2 3 2
3 3 2
4 2 5
5 6 9
6 7 9
7 1 7
8 7 9
;
run;
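/* Count how often each (var1, var2) pair occurs; the counts land in COUNT */
proc freq data=dataset noprint;
table var1*var2/out=data2;
run;
/* Bubble size shows the frequency. Note that the loess fit here uses one
   point per distinct (var1, var2) pair, not the raw observations. */
proc sgplot data=data2;
bubble x=var1 y=var2 size=count;
loess x=var1 y=var2 / nomarkers; /* nomarkers: do not draw scatter markers on top of the bubbles */
run; quit;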
I have a dataframe that contains pixel coordinates and the RGB and CIELAB values of each pixel.
These values were extracted from an image. After changing some RGB/CIELAB values in this dataframe, I would like to turn the data back into an image.
I include a sample with the variables r, g, b, x, and y. r, g, and b contain the RGB values of each pixel. x and y give the pixel's coordinates.
So basically, I would like to create a picture with three color channels (RGB) from this dataframe, but I have no idea how to implement the process. Extracting RGB values from an image is easy. Reversing the process, however, is quite difficult.
r g b x y
1 0.91373 0.72157 0.45098 1 1
2 0.86275 0.59216 0.21961 2 1
3 0.84314 0.56471 0.18039 3 1
4 0.83922 0.56078 0.17647 4 1
5 0.84314 0.56471 0.18039 5 1
6 0.84706 0.56863 0.18431 6 1
7 0.85098 0.57255 0.18824 7 1
8 0.85490 0.57647 0.19216 8 1
9 0.85490 0.57647 0.19216 9 1
10 0.85098 0.57255 0.18824 10 1
Update:
I tried to use as.cimg function
my_cimg <- as.cimg(unlist(rgb_image[1:3]), x=length(unique(rgb_image$x)), y=length(unique(rgb_image$y)),cc = 3)
And it works!!!
Thanks!
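For readers without the imager package, the same reconstruction can be sketched in base R. This is only an illustration under stated assumptions: the data frame below copies the first four sample rows, coordinates are assumed to be 1-based, and channel values are assumed to lie in [0, 1]. The names `img`, `idx`, `height`, and `width` are hypothetical. Because each pixel is placed by its own (y, x) index, the result does not depend on the row order of the data frame.

```r
# First four rows of the sample: one row per pixel, channels in [0, 1]
rgb_image <- data.frame(
  r = c(0.91373, 0.86275, 0.84314, 0.83922),
  g = c(0.72157, 0.59216, 0.56471, 0.56078),
  b = c(0.45098, 0.21961, 0.18039, 0.17647),
  x = c(1, 2, 3, 4),
  y = c(1, 1, 1, 1)
)

# Assumption: x is the column index and y is the row index, both 1-based
height <- max(rgb_image$y)
width  <- max(rgb_image$x)

# A height x width x 3 array, the layout as.raster() expects
img <- array(NA_real_, dim = c(height, width, 3))
idx <- cbind(rgb_image$y, rgb_image$x)

# Matrix indexing writes each channel value to its (row, column, channel) cell
img[cbind(idx, 1)] <- rgb_image$r
img[cbind(idx, 2)] <- rgb_image$g
img[cbind(idx, 3)] <- rgb_image$b

# Draw the reconstructed image
plot(as.raster(img))
```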
I have a dataset of customer ages and I want to make a frequency distribution with age bins that are 9 years wide.
Ages=c(83,51,66,61,82,65,54,56,92,60,65,87,68,64,51,
70,75,66,74,68,44,55,78,69,98,67,82,77,79,62,38,88,76,99,
84,47,60,42,66,74,91,71,83,80,68,65,51,56,73,55)
My desired outcome would be similar to the table shared below; the variable names can differ (as you wish).
Could I use binCounts for this? If yes, could you help me with the code, as I am not sure what bx and idxs mean?
binCounts(x, idxs = NULL, bx, right = FALSE) ??
Age Count
38-46 3
47-55 7
56-64 7
65-73 14
74-82 10
83-91 6
92-100 3
Much Appreciated!
I don't know about binCounts, or even which package it is in, but here is a base R solution:
data.frame(table(cut(Ages,0:7*9+37)))
Var1 Freq
1 (37,46] 3
2 (46,55] 7
3 (55,64] 7
4 (64,73] 14
5 (73,82] 10
6 (82,91] 6
7 (91,100] 3
To exactly duplicate your results:
lowerlimit=c(37,46,55,64,73,82,91,100)#break points; each bin is (lower, upper]
Labels=paste(head(lowerlimit,-1)+1,lowerlimit[-1],sep="-")#I add one to get 38, 47, etc.
group=cut(Ages,lowerlimit,Labels)#Determine which group each age belongs to
tab=table(group)#Form a frequency table
as.data.frame(tab)#Transform the table into a data frame
group Freq
1 38-46 3
2 47-55 7
3 56-64 7
4 65-73 14
5 74-82 10
6 83-91 6
7 92-100 3
All this can be combined as:
data.frame(table(cut(Ages,s<-0:7*9+37,paste(head(s+1,-1),s[-1],sep="-"))))
I want to plot a lot of boxplots in one particular style to compare them.
But when a group is empty, that group isn't plotted.
Let's say I have a dataframe:
a b
1 1 5
2 1 4
3 1 6
4 1 4
5 2 9
6 2 8
7 2 9
8 3 NaN
9 3 NaN
10 3 NaN
11 4 2
12 4 8
and I use boxplot to plot it:
boxplot(b ~ a , df)
Then I get the plot without group 3
(which I can't show because I did not have "10 reputation")
I found some solutions for removing empty groups via Google, but my problem is the other way around.
I also found a solution using at=c(1,2,4), but since I generate the R script with Python and different groups can be empty, I would prefer that the groups aren't dropped at all.
Also, I don't think I have the time to grapple with additional packages.
Therefore I would be thankful for solutions without them.
You can keep the empty group on the x-axis with
boxplot(b ~ a , df, na.action=na.pass)
Or
boxplot(b~factor(a), df)
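A quick way to check both variants without looking at the plot is to ask boxplot for its statistics only. This is a minimal sketch that rebuilds the data frame from the question; `s1` and `s2` are hypothetical names. With `plot = FALSE`, boxplot returns the group names and per-group counts instead of drawing.

```r
# The data frame from the question; group 3 contains only NaN
df <- data.frame(
  a = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4),
  b = c(5, 4, 6, 4, 9, 8, 9, NaN, NaN, NaN, 2, 8)
)

# Variant 1: keep the NaN rows so that group 3 survives as an empty group
s1 <- boxplot(b ~ a, df, na.action = na.pass, plot = FALSE)

# Variant 2: a factor remembers all its levels even after NaN rows are dropped
s2 <- boxplot(b ~ factor(a), df, plot = FALSE)

# Group labels and number of non-missing values per group
print(s1$names)
print(s1$n)
print(s2$names)
print(s2$n)
```

In both cases group 3 should be listed with a count of zero, which is what reserves its slot on the x-axis.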
The following are the first 15 rows of my data:
> head(df,15)
frame.group class lane veh.count mean.speed
1 [22,319] 2 5 9 23.40345
2 [22,319] 2 4 9 24.10870
3 [22,319] 2 1 11 14.70857
4 [22,319] 2 3 8 20.88783
5 [22,319] 2 2 6 16.75327
6 (319,616] 2 5 15 22.21671
7 (319,616] 2 2 16 23.55468
8 (319,616] 2 3 12 22.84703
9 (319,616] 2 4 14 17.55428
10 (319,616] 2 1 13 16.45327
11 (319,616] 1 1 1 42.80160
12 (319,616] 1 2 1 42.34750
13 (616,913] 2 5 18 30.86468
14 (319,616] 3 3 2 26.78177
15 (616,913] 2 4 14 32.34548
'frame.group' contains time intervals, 'class' is the vehicle class (1=motorcycles, 2=cars, 3=trucks), and 'lane' contains lane numbers. I want to create 3 scatterplots, one for each class, with frame.group on the x-axis and mean.speed on the y-axis. Within the scatterplot for one vehicle class, e.g. cars, I want 5 plots, one for each lane. I tried the following:
cars <- subset(df, class==2)
by(cars, lane, FUN = plot(frame.group, mean.speed))
There are two problems:
1) R does not plot as expected, i.e. 5 plots for 5 different lanes.
2) Only one plot is drawn, and it is a boxplot, probably because I used intervals instead of numbers on the x-axis.
How can I fix the above issues? Please help.
Each time a new plot command is issued, R replaces the existing plot with the new plot. You can create a grid of plots by calling par(mfrow=c(1,5)), which gives 1 row with 5 plots (other values give other numbers of rows and columns). If you want a scatterplot instead of a boxplot, you can use plot.default.
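The base-graphics approach can be sketched as follows. This is an illustration only: `cars_df` is a hypothetical stand-in built from the first ten sample rows (all class 2), and the interval factor is converted to its integer codes explicitly so that the panels are scatterplots with the interval labels restored on the axis.

```r
# Hypothetical stand-in for the 'cars' subset, copied from the sample rows
cars_df <- data.frame(
  frame.group = factor(rep(c("[22,319]", "(319,616]"), each = 5),
                       levels = c("[22,319]", "(319,616]")),
  lane = c(5, 4, 1, 3, 2, 5, 2, 3, 4, 1),
  mean.speed = c(23.40345, 24.10870, 14.70857, 20.88783, 16.75327,
                 22.21671, 23.55468, 22.84703, 17.55428, 16.45327)
)

lanes <- sort(unique(cars_df$lane))
n_groups <- nlevels(cars_df$frame.group)

# One row of panels, one panel per lane; remember the old layout
old_par <- par(mfrow = c(1, length(lanes)))

for (ln in lanes) {
  d <- cars_df[cars_df$lane == ln, ]
  # Integer codes of the factor give a scatterplot rather than a boxplot
  plot(as.integer(d$frame.group), d$mean.speed,
       xlim = c(0.5, n_groups + 0.5), ylim = range(cars_df$mean.speed),
       xaxt = "n", xlab = "frame.group", ylab = "mean.speed",
       main = paste("Lane", ln))
  # Put the interval labels back on the x-axis
  axis(1, at = seq_len(n_groups), labels = levels(cars_df$frame.group))
}

# Restore the previous layout
par(old_par)
```

Sharing `xlim` and `ylim` across panels keeps the lanes directly comparable, which the default per-panel scaling would not.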
It is easier to do all this with the ggplot2 library instead of the base graphics, and the resulting plot will look much nicer:
library(ggplot2)
ggplot(cars,aes(x=frame.group,y=mean.speed))+geom_point()+facet_wrap(~lane)
See the ggplot2 documentation for more details: http://docs.ggplot2.org/current/
I have binned data that looks like this:
(8.048,18.05] (-21.95,-11.95] (-31.95,-21.95] (18.05,28.05] (-41.95,-31.95]
81 76 18 18 12
(-132,-122] (-122,-112] (-112,-102] (-162,-152] (-102,-91.95]
6 6 6 5 5
(-91.95,-81.95] (-192,-182] (28.05,38.05] (38.05,48.05] (58.05,68.05]
5 4 4 4 4
(78.05,88.05] (98.05,108] (-562,-552] (-512,-502] (-482,-472]
4 4 3 3 3
(-452,-442] (-412,-402] (-282,-272] (-152,-142] (48.05,58.05]
3 3 3 3 3
(68.05,78.05] (118,128] (128,138] (-582,-572] (-552,-542]
3 3 3 2 2
(-532,-522] (-422,-412] (-392,-382] (-362,-352] (-262,-252]
2 2 2 2 2
(-252,-242] (-142,-132] (-81.95,-71.95] (148,158] (-1402,-1392]
2 2 2 2 1
(-1372,-1362] (-1342,-1332] (-942,-932] (-862,-852] (-822,-812]
1 1 1 1 1
(-712,-702] (-682,-672] (-672,-662] (-632,-622] (-542,-532]
1 1 1 1 1
(-502,-492] (-492,-482] (-472,-462] (-462,-452] (-442,-432]
1 1 1 1 1
(-432,-422] (-352,-342] (-332,-322] (-312,-302] (-302,-292]
1 1 1 1 1
(-202,-192] (-182,-172] (-172,-162] (-51.95,-41.95] (88.05,98.05]
1 1 1 1 1
(108,118] (158,168] (168,178] (178,188] (298,308]
1 1 1 1 1
(318,328] (328,338] (338,348] (368,378] (458,468]
1 1 1 1 1
How can I plot this data so that the bin is sorted from most negative on the left to most positive on the right? Currently my graph looks like this. Notice that it is not sorted at all. In particular the second bar (value = 76) is placed to the right of the first:
(8.048,18.05] (-21.95,-11.95]
81 76
This is the command I use to plot:
barplot(x,ylab="Number of Unique Tags", xlab="Expected - Observed")
I really want to help answer your question, but I have to tell you, I can't make heads or tails of your data. I see a lot of opening parentheses but no closing ones. The data looks sorted in descending order by the values on the bottom of each row. I have no idea what to make of a value like "(8.048,18.05]".
Am I missing something obvious? Can you make a simpler example where your data structure is not a factor?
I would generally expect a data frame or a matrix with two columns, one for the X and one for the Y.
See if this example of sorting helps (I'm sort of shooting in the dark here):
tN <- table(Ni <- rpois(100, lambda=5))
r <- barplot(tN)
#stop here and examine the plot
#the next bit converts the matrix to a data frame,
# sorts it, and plots it again
df<-data.frame(tN)
df2<-df[order(df$Freq),]
barplot(df2$Freq)
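If the labels are interval names of the form "(lower,upper]", one possible way to order the bars from most negative to most positive is to parse the lower bound out of each name and sort on that. This is a sketch under assumptions: `x` is taken to be a named numeric vector (or one-dimensional table) like the one printed in the question, only its first five entries are reproduced here, and `lower` and `x_sorted` are hypothetical names.

```r
# First five bins from the question, in the order they were printed
x <- c("(8.048,18.05]" = 81, "(-21.95,-11.95]" = 76, "(-31.95,-21.95]" = 18,
       "(18.05,28.05]" = 18, "(-41.95,-31.95]" = 12)

# Drop the opening bracket, keep everything up to the comma: the lower bound
lower <- as.numeric(sub("^.([^,]+),.*$", "\\1", names(x)))

# Reorder the counts by lower bound, most negative first
x_sorted <- x[order(lower)]

barplot(x_sorted, ylab = "Number of Unique Tags", xlab = "Expected - Observed",
        las = 2)  # las = 2 turns the long bin labels vertical
```

If the counts were produced with table(cut(...)) and then sorted by frequency, skipping that sort may be enough, since the factor levels created by cut are already in increasing order.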