How do I print text along with a calculated value in R

At the end of the following code I want to print n_species, but I first want to print "The number of species is" and then the value. How can I do this?
n_species <- 0
n_invisibility <- 0
for(i in Species) {
n_species <- n_species + 1
for(i in Invisibility){
if(i == "Y") {
n_invisibility <- n_invisibility + 1
}
else {
n_invisibility <- n_invisibility + 0
}
}
}
print(n_species)
print(n_invisibility)

Another option, which makes it easy to control the formatting of numbers:
sprintf("The number of species is %i.", n_species)

print(paste("The number of species is",n_species))
paste() also takes the arguments sep (the separator between the pasted terms, default sep=" ") and collapse, which joins the elements of a vector into a single string.
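As a rough illustration of the options mentioned above, the sketch below compares paste(), sprintf() and cat(). The value assigned to n_species is made up for the example.

```r
n_species <- 12  # illustrative value, not from the question

# paste() joins its arguments with sep (a single space by default)
msg_paste <- paste("The number of species is", n_species)

# sprintf() gives control over number formatting
msg_sprintf <- sprintf("The number of species is %d.", n_species)
msg_rounded <- sprintf("Mean per site: %.2f", 3.14159)

# collapse joins the elements of a vector into one string
msg_collapse <- paste(c("a", "b", "c"), collapse = ", ")

print(msg_paste)      # prints with [1] and quotes
cat(msg_sprintf, "\n")  # prints the bare text
```

print() shows the string with quotes and an index, while cat() writes only the text, which is often what is wanted for messages.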

Related

Function to abbreviate scientific names

Could you please help me?
I'm trying to modify an R function written by a colleague. This function receives a character vector with scientific names (Latin binomials), just like this one:
Name
Cerradomys scotti
Oligoryzomys sp
Philander frenatus
Byrsonima sp
Campomanesia adamantium
Cecropia pachystachya
Cecropia sp
Erythroxylum sp
Ficus sp
Leandra aurea
Then, it should abbreviate the scientific names, using only the first three letters of the genus (first term) and the epithet (second term) to make a short code. For instance, Cerradomys scotti should become Cersco.
This is the original function:
AbbreviatedNames <- function(vector) {
abbreviations <- character(length = length(vector))
splitnames <- strsplit(vector, " ")
for (i in 1:length(vector)) {
vector[i] <- if(splitnames[[i]][2] == "^sp") {
paste(substr(splitnames[[i]][1],1,3),
splitnames[[i]][2], sep = "")
}
else {
paste(substr(splitnames[[i]][1],1,3),
substr(splitnames[[i]][2],1,3), sep = "")
}
}
vector
}
With a simple list like that one, the function works perfectly. However, when the list has some missing or extra elements, it does not work. The loop stops when it meets the first row that does not match the pattern. Let's take this more complex list as an example:
Name
Cerradomys scotti
Oligoryzomys sp
Philander frenatus
Byrsonima sp
Campomanesia adamantium
Cecropia pachystachya
Cecropia sp
Erythroxylum sp
Ficus sp
Leandra aurea
Morfosp1
Vismia cf brasiliensis
See that Morfosp1 has only 1 term. And Vismia cf brasiliensis has an additional term (cf) in the middle.
I've tried adapting the function, for instance, this way:
AbbreviatedNames <- function(vector) {
abbreviations <- character(length = length(vector))
splitnames <- strsplit(vector, " ")
for (i in 1:length(vector)) {
vector[i] <- if(splitnames[[i]][2] == "^sp" & is.na(splitnames[[i]][2])) {
paste(substr(splitnames[[i]][1],1,3),
splitnames[[i]][2], sep = "")
}
else {
paste(substr(splitnames[[i]][1],1,3),
substr(splitnames[[i]][2],1,3), sep = "")
}
}
vector
}
Nevertheless, it does not work. I get this error message:
Error in if (splitnames[[i]][2] == "^sp" & is.na(splitnames[[i]][2])) { :
missing value where TRUE/FALSE needed
How could I make the function:
Deal also with names that have only 1 term?
Expected outcome: Morfosp1 -> Morfosp1 (stays the same)
Deal also with names that have an additional term in the middle?
Expected outcome: Vismia cf brasiliensis -> Visbra (term in the middle is ignored)
Thank you very much!
Something like this is pretty concise:
test <- c("Cerradomys scotti", "Oligoryzomys sp", "Latingstuff", "Latin staff more")
# function to truncate a given name
trunc_str <- function(latin_name) {
# split it on a space
name_split <- unlist(strsplit(latin_name, " ", fixed = TRUE))
# if one name, just return it
if (length(name_split) == 1) return(name_split)
# truncate to first 3 letters
name_trunc <- substr(name_split, 1, 3)
# paste the first and last term together (skipping any middle ones)
paste0(head(name_trunc, 1), tail(name_trunc, 1))
}
# iterate over all
vapply(test, trunc_str, "")
# Cerradomys scotti Oligoryzomys sp Latingstuff Latin staff more
# "Cersco" "Olisp" "Latingstuff" "Latmor"
If you don't want a named vector output, you can use USE.NAMES = FALSE in vapply(). Or feel free to use a loop here.
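For instance, the same helper can be applied to names from the question with USE.NAMES = FALSE. This is only a sketch: trunc_str is repeated so the block runs on its own, and the species vector is a shortened, illustrative selection.

```r
# Same helper as above, repeated so the block is self-contained
trunc_str <- function(latin_name) {
  name_split <- unlist(strsplit(latin_name, " ", fixed = TRUE))
  if (length(name_split) == 1) return(name_split)
  name_trunc <- substr(name_split, 1, 3)
  paste0(head(name_trunc, 1), tail(name_trunc, 1))
}

species <- c("Cerradomys scotti", "Oligoryzomys sp",
             "Morfosp1", "Vismia cf brasiliensis")

# character(1) states that each call must return exactly one string
codes <- vapply(species, trunc_str, character(1), USE.NAMES = FALSE)
codes
# "Cersco" "Olisp" "Morfosp1" "Visbra"
```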
AbbreviatedNames <- function(vector) {
abbreviations <- character(length = length(vector))
splitnames <- strsplit(vector, " ")
for (i in 1:length(vector)){
# One name: keep it unchanged
if(length(splitnames[[i]])==1){
vector[i] <- splitnames[[i]][1]
}
# Two names
else if(length(splitnames[[i]])==2){
vector[i] <- if(splitnames[[i]][2] == "^sp") {
paste(substr(splitnames[[i]][1],1,3),
splitnames[[i]][2], sep = "")
}
else {
paste(substr(splitnames[[i]][1],1,3),
substr(splitnames[[i]][2],1,3), sep = "")
}
}
# Three names
else if(length(splitnames[[i]])==3){
vector[i] <- paste(substr(splitnames[[i]][1],1,3),
substr(splitnames[[i]][3],1,3), sep = "")
# Assuming that the unwanted word is always in the middle
}
}
return(vector)
}
I tested it on the list you gave and it seems to work. Tell me if you need more general code.
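As a side note on the error in the question, the sketch below shows where the missing value comes from and one possible way to guard against it. The variable names are illustrative only.

```r
parts <- strsplit("Morfosp1", " ")[[1]]
second <- parts[2]               # NA: there is no second term

comparison <- second == "^sp"    # NA, and if (NA) raises the error above

# == compares strings literally; "^sp" is not treated as a regular expression
literal_match <- "sp" == "^sp"     # FALSE
regex_match <- grepl("^sp", "sp")  # TRUE

# && short-circuits, so grepl() is only reached when a second term exists
has_sp <- !is.na(second) && grepl("^sp", second)
has_sp
# FALSE
```

Checking length(parts), as both answers do, avoids the problem in the same way.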
Thank you very much for the help, Ricardo and Adam! I've made the code available on GitHub to other people who work with interaction networks, and need to abbreviate scientific names to be used in graphs.

I try to give value for a 2 dimensional matrix using for loop in R, however it gives me unexpected NA values

I want to calculate the moving sum with varying window sizes of 1:15.
a <- matrix(0,257,15)
b <- c(1:257)
for(j in 1:15) {
for(i in j:257) {
a[i,j] <- sum(b[i-j+1:i])
}
}
However, the above code confuses me, as it yields NA after the 129th row in every column. What could be the reason for such behaviour?
Wrap the start of the range in parentheses, b[(i-j+1):i], so that the index really runs from i-j+1 to i. The full code then reads:
a <- matrix(0,257,15)
b <- c(1:257)
for (j in 1:15) {
for (i in j:257) {
a[i,j] <- sum(b[(i-j+1):i])
}
}
As an example of the importance of the parentheses, compare the order of evaluation in the following three cases:
> (1+1):2
[1] 2
> 1+1:2
[1] 2 3
> 1+(1:2)
[1] 2 3
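Applied to the loop from the question, the same precedence rule explains the NA values. The sketch below picks i = 130 and j = 1 as an illustrative case.

```r
b <- 1:257
i <- 130
j <- 1

idx_wrong <- i - j + 1:i    # parsed as (i - j) + (1:i), i.e. 130..259
idx_right <- (i - j + 1):i  # the intended window, here just 130

range(idx_wrong)            # 130 259: runs past length(b) = 257
sum_wrong <- sum(b[idx_wrong])  # NA, because b[258] and b[259] do not exist
sum_right <- sum(b[idx_right])  # 130
```

Indexing a vector beyond its length returns NA rather than an error, which is why the loop runs without complaint.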
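I can reproduce exactly the same results. This is not a bug in R: the : operator binds more tightly than + and -, so i-j+1:i is evaluated as (i-j)+(1:i), and the resulting indices run past the end of b, which gives NA. The start indices themselves are correct (I added bCoordinates to check them). The results are fixed by wrapping the start of the range in parentheses: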
a <- matrix(0,257,15)
bCoordinates <- matrix(0,257,15) #added just for validation - not needed for results
b <- c(1:257)
for(j in 1:15) {
for(i in j:257) {
a[i,j] <- sum(b[(i-j+1):i]) #closing column calc to brackets fixes the issue
bCoordinates[i,j] <- i-j+1 #added just for validation - not needed for results
}
}

Input in R console

a is more than 1 and b is less than 1000. How do I enter a and b in the R console instead of defining them in the R script? I have read about the readline function but don't really understand it.
a <- 3
b <- 4
y <- a*b
y
if((y %% 2) == 0) {
print(paste(y,"is Even"))
} else {
print(paste(y,"is Odd"))
}
You can use the readline() function.
Example:
my.name <- readline(prompt="Enter name: ")
my.age <- readline(prompt="Enter age: ")
# convert character into integer
my.age <- as.integer(my.age)
print(paste("Hi,", my.name, "next year you will be", my.age+1, "years old."))
By changing just your first two lines to use readline() and wrapping the entire thing in {}, you can combine your script into a single block.
{
a <- as.numeric(readline(prompt = "Enter a: ")) # Read in from console and change to number
b <- as.numeric(readline(prompt = "Enter b: ")) # Read in from console and change to number
y <- a*b
y
if((y %% 2) == 0) {
print(paste(y,"is Even"))
} else {
print(paste(y,"is Odd"))
}
}
This allows you to run the whole thing from top to bottom and take your inputs consecutively. You can also make this into a function.
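A function version could look roughly like the sketch below. The names parity_message and read_number are made up for this example, and the prompt is only run in an interactive session because readline() returns an empty string otherwise.

```r
# Builds the message from two numbers (no console input involved)
parity_message <- function(a, b) {
  y <- a * b
  if ((y %% 2) == 0) paste(y, "is Even") else paste(y, "is Odd")
}

# Reads one number from the console and stops on non-numeric input
read_number <- function(prompt) {
  value <- suppressWarnings(as.numeric(readline(prompt = prompt)))
  if (is.na(value)) stop("Please enter a number.")
  value
}

if (interactive()) {
  a <- read_number("Enter a: ")
  b <- read_number("Enter b: ")
  print(parity_message(a, b))
}

parity_message(3, 4)
# "12 is Even"
```

Keeping the calculation separate from the input makes it possible to call parity_message() directly from a script as well.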

Dynamic variable names in plots, files and compatibility with loop

I am trying to write a function that makes a plot and saves it into a file automatically.
The part I struggle with is to do both dynamically [plotname = varname & filename = varname],
and to make it compatible with calling it from a loop.
# Create data
my_df = cbind(uni=runif (100),norm=rnorm (100),bino=rbinom(100,20, 0.5)); head (my_df)
my_vec = my_df[,'uni'];
# How to make plot and file-name meaningful if you call the variable in a loop?
# if you call by name, the plotname is telling. It is similar what I would like to see.
hist(my_df[,'bino'])
for (plotit in colnames(my_df)) {
hist(my_df[,plotit])
print (plotit)
# this is already not meaningful
}
# step 2 write it into files
hist_auto <- function(variable, col ="gold1", ...) {
if ( length (variable) > 0 ) {
plotname = paste(substitute(variable), sep="", collapse = "_"); print (plotname); is (plotname)
# I would like to define plotname, and later tune it according to my needs
FnP = paste (getwd(),'/',plotname, '.hist.pdf', collapse = "", sep=""); print (FnP)
hist (variable, main = plotname)
#this is apparently not working: I do not get my_df[, "bino"] or anything similar
dev.copy2pdf (file=FnP )
} else { print ("var empty") }
}
hist_auto (my_vec)
# name works, and is meaningful [as much as the var name ... ]
hist_auto (my_df[,'bino'])
# name sort of works, but falls apart
assign (plotit, my_df[,'bino'])
hist_auto (get(plotit))
# name works, but meaningless
# Now in a loop
for (plotit in colnames(my_df)) {
my_df[,plotit]
hist(my_df[,plotit])
## name works, but meaningless and NOT UNIQUE > overwritten by next
}
for (plotit in colnames(my_df)) {
hist_auto(my_df[,plotit])
## name works, but meaningless and NOT UNIQUE > overwritten by next
}
for (plotit in colnames(my_df)) {
assign (plotit, my_df[,plotit])
hist_auto (get(plotit))
## name works, but meaningless and NOT UNIQUE > overwritten by next
}
My aim is to have a function that iterates over eg. columns of a matrix, plots and saves each with a unique and meaningful name.
The solution will probably involve a smart combination of substitute(), parse(), eval() and paste(), but lacking a solid understanding I failed to figure it out.
My basis of experimentation was:
how to dynamically call a variable?
How about something like this? You may need to install.packages("ggplot2")
library(ggplot2)
my_df <- data.frame(uni=runif(100),
norm=rnorm(100),
bino=rbinom(100, 20, 0.5))
get_histogram <- function(df, varname, binwidth=1, save=T) {
stopifnot(varname %in% names(df))
title <- sprintf("Histogram of %s", varname)
p <- (ggplot(df, aes_string(x=varname)) +
geom_histogram(binwidth=binwidth) +
ggtitle(title))
if(save) {
filename <- sprintf("histogram_%s.png", gsub(" ", "_", varname))
ggsave(filename, p, width=10, height=8)
}
return(p)
}
for(var in names(my_df))
get_histogram(my_df, var, binwidth=0.5) # If you want to save them
get_histogram(my_df, "uni", binwidth=0.1, save=F) # If you want to look at a specific one
So I ended up with 2 functions: one that can iterate over data frames, and another that takes a single vector. Using parts of Adrian's [thanks!] solution:
hist_dataframe <- function(df, colName, col ="gold1", ...) {
stopifnot(colName %in% colnames(df))
variable = df[,colName]
stopifnot(length (variable) >1 )
plotname = paste(substitute(df),'__', colName, sep="")
FnP = paste (getwd(),'/',plotname, '.hist.pdf', collapse = "", sep=""); print (FnP)
hist (variable, main = plotname)
dev.copy2pdf (file=FnP )
}
And the one for simple vectors stays as in Q.
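For comparison, a base R loop that builds the plot title and file name from the column name might look like the sketch below. It is not the author's final code: save_histograms, the prefix argument and the use of tempdir() as output folder are all assumptions made for the example.

```r
# Saves one histogram per column; file and plot names come from the column name
save_histograms <- function(df, prefix = "my_df", out_dir = tempdir()) {
  files <- character(0)
  for (col_name in colnames(df)) {
    plot_name <- paste0(prefix, "__", col_name)
    file_path <- file.path(out_dir, paste0(plot_name, ".hist.pdf"))
    pdf(file_path)                 # open a PDF device for this column
    hist(df[, col_name], main = plot_name, xlab = col_name)
    dev.off()                      # close the device so the file is written
    files <- c(files, file_path)
  }
  files
}

set.seed(1)
my_df <- cbind(uni = runif(100), norm = rnorm(100), bino = rbinom(100, 20, 0.5))
saved <- save_histograms(my_df)
basename(saved)
```

Passing the column name as a string avoids substitute() altogether, since the name is already available inside the loop.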

combine results from loop in one file in R (some results were missing)

I want to combine the results from a for loop into 1 txt file and I have written my code based on suggestion from this link
combine results from a loop in one file
There is one problem. I am supposed to get 8 results (rows) but I ended up with only 5. Somehow the other results did not get into the file. I think the problem is with the if statement but I don't know how to fix it.
Here is my code
prob <- c(0.10, 0.20)
for (j in seq(prob)) {
range <- c(2,3)
for (i in seq(range)) {
sample <- c(10,20)
for (k in seq(sample)) {
data <- Simulation(X =1,Y =range[i], Z=sample[k] ,p = prob[j])
filename <- paste('file',i,'txt')
if (j == 1) {
write.table(data, "Desktop/file2.txt", col.names= TRUE)
} else {
write.table(data,"Desktop/file2.txt", append = TRUE, col.names = FALSE)
}
}
}
}
That's because the if ( j == 1 ) bit is meant to check whether this is the first time you've written to the file or not.
If it is the first time, then it will write the column names (i.e. X, Y, Z, p) into the file (see the col.names=TRUE?).
If it isn't the first time, then it won't write the column names, but will just append the data.
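Since you have multiple nested loops, that condition won't work so well for you: when j==1 (i.e. for prob=0.1) the inner loops run 4 times, and each of those 4 writes overwrites the file because append is not set. Only the last of them survives; together with the 4 results appended when j==2, that leaves 5 rows instead of 8.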
I'd recommend initialising a variable count that counts how many times you've performed Simulation, and then changing that line to if ( count == 1 ):
count <- 1
prob <- c(0.10,0.20)
# .... code as before
data <- Simulation(X =1,Y =range[i], Z=sample[k] ,p = prob[j])
if ( count == 1 ) {
write.table(data, "Desktop/file2.txt", col.names=T)
} else {
write.table(data, "Desktop/file2.txt", append=T, col.names=F)
}
# increment count
count <- count + 1
}}}
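Put together, the whole loop could look like the sketch below. The Simulation function here is a hypothetical stand-in that returns a one-row data frame, the output path is a temporary file rather than Desktop/file2.txt, and row.names = FALSE is added so the file can be read back easily.

```r
# Hypothetical stand-in for the asker's Simulation(); returns one row per call
Simulation <- function(X, Y, Z, p) {
  data.frame(X = X, Y = Y, Z = Z, p = p)
}

out_file <- tempfile(fileext = ".txt")  # assumed output path
prob <- c(0.10, 0.20)
range_vals <- c(2, 3)
sample_sizes <- c(10, 20)

count <- 1
for (j in seq_along(prob)) {
  for (i in seq_along(range_vals)) {
    for (k in seq_along(sample_sizes)) {
      data <- Simulation(X = 1, Y = range_vals[i], Z = sample_sizes[k], p = prob[j])
      if (count == 1) {
        # first write: create the file and include the header
        write.table(data, out_file, col.names = TRUE, row.names = FALSE)
      } else {
        # later writes: append without repeating the header
        write.table(data, out_file, append = TRUE,
                    col.names = FALSE, row.names = FALSE)
      }
      count <- count + 1
    }
  }
}

result <- read.table(out_file, header = TRUE)
nrow(result)
# 8
```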
