Displaying a large sequence of numbers on the x axis in barplot - r

When plotting a bar graph, I want to label the x axis with a long sequence of numbers. I figured out that I can use names.arg=c(), but to display a long sequence such as 1 to 50 on the x axis I have to type every number, e.g. names.arg=c("1","2","3","4",...,"50").
Is there a way to generate such a sequence, for example with 1:50, so that I don't need to type all the numbers?
Can anyone help me solve this problem?
Thanks in advance.

You can use names.arg=1:50 or, if you'd like the labels to have a text datatype, names.arg=as.character(1:50).
(follow up from comments)
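A minimal sketch of the suggestion above. The bar heights are made up for illustration (the question does not show its data), and cex.names and las are only there so that 50 labels have a better chance of fitting on the axis.

```r
# Hypothetical data: 50 random bar heights (not from the question)
set.seed(1)
heights <- sample(1:20, 50, replace = TRUE)

# names.arg takes one label per bar; 1:50 generates the whole sequence,
# so nothing has to be typed out by hand
mids <- barplot(heights, names.arg = 1:50,
                cex.names = 0.6,  # shrink the labels
                las = 2)          # draw them perpendicular to the axis
```

If some labels still get dropped, reducing cex.names further or widening the plot window usually helps.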

Related

How to use text function in R?

I am learning graphical analysis using R. Here is some code that I cannot understand:
barplotVS <- barplot(table(mtcarsData$vs), xlab="Type of engine")
text(barplotVS,table(mtcarsData$vs)/2,table(mtcarsData$vs),cex=1.25)
The output is like below. I cannot understand what text() does here. I googled the text() function and found that the x and y parameters of text(x,y) are numeric vectors of coordinates where the text labels should be written. Can anyone tell me what barplotVS, table(mtcarsData$vs)/2, table(mtcarsData$vs) and cex=1.25 mean in my code?
barplotVS <- barplot(table(mtcarsData$vs), xlab="Type of engine")
print(barplotVS)
outputs:
[,1]
[1,] 0.7
[2,] 1.9
These are the x-axis positions of the centers of the bars in the barplot.
print(table(mtcarsData$vs))
outputs:
0 1
18 14
The numbers in the bottom row are the number of occurrences of each value present in mtcarsData$vs, and the numbers in the top row are the values being counted.
When you run the function:
text(barplotVS,table(mtcarsData$vs)/2,table(mtcarsData$vs),cex=1.25)
the first argument gives the x positions of the labels (i.e. 0.7 and 1.9); the second gives the y positions, set here to the counts divided by two (i.e. 9 and 7) so that the labels sit halfway up the bars; the third gives the labels themselves (i.e. 18 and 14); and finally cex is a value that scales the size of the font.
In any case, R generally has good documentation, which you can call up with the ? operator (as suggested in the comments), e.g. ?text. To understand code like this, try running it and checking what each variable contains with the print or str functions. If you use an IDE (e.g. RStudio), the contents of the variables are shown in a graphical panel, so you don't even need to print them.
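The same call can be written with named arguments, which makes the role of each one easier to see. This is only a sketch: it assumes mtcarsData is a copy of the built-in mtcars data set, and the names counts and mids are introduced here for illustration.

```r
# Assumption: mtcarsData in the question is the built-in mtcars data set
counts <- table(mtcars$vs)   # how often each value of vs occurs (18 and 14)

# barplot() draws the bars and returns the x positions of their centers
mids <- barplot(counts, xlab = "Type of engine")

text(x = mids,                      # horizontal position: center of each bar
     y = as.vector(counts) / 2,     # vertical position: halfway up each bar
     labels = as.vector(counts),    # what to write: the count itself
     cex = 1.25)                    # font size relative to the default
```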

Is there a way to print random values

In R programming, can we print a random value at any given point? For example, unique(iris$Species) shows 3 categories. Can we print any one of those categories at a given point in time?
Use sample() from base R:
sample(unique(iris$Species),1)
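A slightly expanded sketch of the same idea. The set.seed() call is optional and only there to make the draw repeatable; the variable names are introduced here for illustration.

```r
species <- unique(iris$Species)   # factor holding the 3 distinct species

set.seed(42)                      # optional: makes the random draw repeatable
picked <- sample(species, 1)      # one category chosen at random
as.character(picked)              # plain text instead of a factor

# Drawing more than one value: without replacement by default,
# so the two results are always different species
picked_two <- sample(species, 2)
```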

Making a histogram

This sounds pretty basic, but every time I try to make a histogram, R says x needs to be numeric. I've been looking everywhere but can't find anything relating to my problem. I have data with 240 obs. of 5 variables:
Nipper length
Number of Whiskers
Crab Carapace
Sex
Estuary location
There are 3 locations and I'm trying to make a histogram of nipper length by location.
I've tried making new factors and levels, with the 80 obs in each location, but it's not working.
Crabs.data <-read.table(pipe("pbpaste"),header = FALSE)##Mac
names(Crabs.data)<-c("Crab Identification","Estuary Location","Sex","Crab Carapace","Length of Nipper","Number of Whiskers")
Crabs.data<-Crabs.data[,-1]
attach(Crabs.data)
hist(`Length of Nipper`~`Estuary Location`)
Error in hist.default(Length of Nipper ~ Estuary Location) :
'x' must be numeric
This error appears instead of the expected histogram.
hist() doesn't seem to like taking more than one variable: it expects a single numeric vector, not a formula.
I think you'd have the best luck subsetting the data, that is, making a vector of nipper lengths for all crabs in a given estuary.
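crabs.data <- read.table("whatever you're calling it")  # placeholder for your own file or connection
# set names(crabs.data) as you have it in your code
# "Location" below is a placeholder for the name of one estuary
Estuary1 <- as.vector(unlist(subset(crabs.data, `Estuary Location` == "Location", select = `Length of Nipper`)))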
hist(Estuary1)
Repeat the last two lines for your other two estuaries. You may not need the unlist() command, depending on your table. I've tended to need it for Excel files, but I don't know what format your table is in (that would've been helpful).
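To avoid repeating the subsetting by hand, the lengths can be split by location in one step. This is a sketch on made-up data, not the asker's table: the data frame crabs, its column names and the location labels A, B and C are all hypothetical.

```r
# Hypothetical stand-in for the crab data: 3 estuaries, 80 crabs each
set.seed(1)
crabs <- data.frame(
  estuary       = rep(c("A", "B", "C"), each = 80),
  nipper_length = rnorm(240, mean = 10, sd = 2)
)

# split() returns a list with one numeric vector per estuary
by_estuary <- split(crabs$nipper_length, crabs$estuary)

# Draw the three histograms side by side, one per estuary
op <- par(mfrow = c(1, 3))
for (loc in names(by_estuary)) {
  hist(by_estuary[[loc]], main = loc, xlab = "Length of nipper")
}
par(op)  # restore the previous plot layout
```

Each call to hist() still receives a single numeric vector, which is what the error message is asking for.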

How to manage factors with mixed data types

I'm afraid this question has two sub parts. My project is to determine which insurance carrier has the lowest cost based on CPT Codes. Since there are so many CPT Codes I wanted to group them using cut like this:
uCPTCode<- unique(data$CPTCode)
uCPTCode <- cut(uCPTCode,
breaks = c(-Inf, "01999", "69979", "79999", "89398", "99091", "99499", Inf),
labels = c("NA","Anesthesia", "Surgery", "Radiology", "Pathology&Laboratory", "Medicine","Evaluation&Management", "Temp"),
right = FALSE)
I'm not sure unique is required or wise, but it seemed to make sense to me. The issue is that some codes have leading zeros and terminating letters, like this:
2608 Levels: 0014F 0159T 0164T 0191T 0195T 0232T 0319T 0326T 0513F 0517F 0518F
So question 1 is: what is the process for converting these ranges into integers corresponding to the labels I have in the cut function, so that I can graph the grouped results on the x axis?
Question 2 is that I expected the ranges to be continuous, but they are not. How do I manage what happens around codes 99000 through 99216, where previous groups (Medicine, Anesthesiology and Evaluation and Management) get combined? Here is a link to the CPT grouper file https://www.dropbox.com/s/wm55n17pufoacww/CPTGrouper.xlsx?dl=0
Here is a smattering of results to see where I am going with it
https://www.dropbox.com/s/h6sdnvm9yew6jdg/SampleStudyResults.xlsx?dl=0
Thanks very much for your time and attention

R spline function given a fixed space

So, I need to generate a spline function to feed into another program that only accepts a fixed spacing between consecutive points. I used the spline function in R with a given number of points to generate the spline; however, the floating-point cutoff makes the spacing between the points variable, for example:
spline(d$V1, d$V2, n=(max(d$V1)-min(d$V1))/0.0200)
> head(t.spl, 7)
x y
1 2.3000 -3.0204
2 2.3202 -3.0204
3 2.3404 -3.0204
4 2.3606 -3.0204
5 2.3807 -3.0204
6 2.4009 -3.0204
7 2.4211 -3.0204
So the spacing between the 1st and 2nd rows is 0.0202, while between the 4th and 5th it is 0.0201. Because of this, the other program that I am feeding this spline into doesn't accept it. Is there any way to make this work?
As an aside: please provide a reproducible example next time (I can't copy/paste your code in because I don't have d or t.spl)
I think you'll find that the different intervals (0.0202 vs 0.0201) are an artifact of the number of digits being printed on the screen, not of the spline function.
It seems R is printing 4 digits after the decimal point for you for neatness, so it's doing the rounding only for the purposes of displaying the results to you.
You can see how many digits are displayed with options('digits')$digits, and adjust it with options(digits=new_number_of_digits) (see ?options for details).
For example:
options(digits=4)
pi
# 3.142
options(digits=10)
pi
# 3.141592654
In summary, when you feed the values into your other program, make sure you print them with enough decimal places that the other program accepts the intervals as being "equal".
If you are writing to a file, for example, just make sure you write enough digits out. If you are copy-pasting from the R console, make sure you adjust R to print out enough digits.
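A quick way to check this is to look at the differences between consecutive x values at full precision. The sketch below uses made-up x and y vectors in place of d$V1 and d$V2 (which I don't have), and the "%.10f" format is just one possible choice for writing the values out.

```r
# Hypothetical data standing in for d$V1 and d$V2
x <- c(2.3, 2.9, 3.4, 4.1, 4.7)
y <- sin(x)

s <- spline(x, y, n = 120)   # 120 equally spaced points from min(x) to max(x)

# At full precision the step sizes agree up to floating-point noise
steps <- diff(s$x)
range(steps)

# When exporting, write enough decimal places so the spacing is not rounded
lines_out <- sprintf("%.10f %.10f", s$x, s$y)
head(lines_out, 3)
```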
MathematicalCoffee is probably right. I'm just adding an alternative for the sake of wordiness.
myspline <- splinefun(d$V1, d$V2)
mydata.y <- myspline(desired_x_values, deriv = 0)
This guarantees the uniform x-spacing you desire, since you choose desired_x_values yourself.
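A sketch of how that might look with an explicit grid. The data are made up (d is not available), and the 0.02 step is taken from the spacing the question seems to be aiming for; grid_x and grid_y are names introduced here for illustration.

```r
# Hypothetical data standing in for d$V1 and d$V2
x <- c(2.3, 2.9, 3.4, 4.1, 4.7)
y <- sin(x)

myspline <- splinefun(x, y)   # returns a function that evaluates the spline

# Build the x grid yourself, so the step is exactly the one you ask for
grid_x <- seq(min(x), max(x), by = 0.02)
grid_y <- myspline(grid_x, deriv = 0)

# Check: every step equals 0.02 up to floating-point noise
all(abs(diff(grid_x) - 0.02) < 1e-9)
```

Note that spline(x, y, n = ...) places n points across the whole range, so its step is the range divided by n - 1 rather than by n. Choosing the grid with seq(by = ...) avoids that off-by-one.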
