I'm having trouble with simply writing a function that turns a string representation of a number into a decimal representation. To boil the issue down to essentials, consider the following function:
f <- function(x) {
y <- as.numeric(x)
return(y)
}
When I apply this function to the string "47.418" I get back 47.42, but what I want to get back is 47.418. It seems like the return value is being rounded for some reason.
Any suggestions would be appreciated.
You have done something to your print options. I get no rounding:
> f <- function(x) { y <- as.numeric(x); return(y) }
> f("47.418")
[1] 47.418
?options
The default value for digits is 7:
> options("digits")
$digits
[1] 7
Further questions should be accompanied by dput() on the object in question.
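To see that only the display is affected and not the stored value, here is a minimal sketch. It assumes the cause is a reduced digits setting (for example options(digits = 4)); the variable names x and old are only for illustration.

```r
x <- as.numeric("47.418")

# The stored value is unchanged; only the number of significant
# digits used for display differs.
format(x, digits = 4)   # "47.42"
format(x, digits = 7)   # "47.418"

# Temporarily lower the global setting, then restore the previous one.
old <- options(digits = 4)
print(x)                # [1] 47.42
options(old)
print(x)                # [1] 47.418 if the previous setting was the default 7
```

If the second print still shows 47.42, the digits option was already lowered somewhere else in the session (for example in a startup file).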
I'm working a practice exercise for a class, and I've reached an impasse. The instructions state:
Write a function that takes a string of text and counts the number of characters. The function should return "There are xx characters in that string."
This is what I have thus far:
w <- "I hope everyone has a good weekend"
answer <- function (nchar) {
statement <- paste("There are", nchar, "characters in that string")
}
I've tried plugging "w" into the function to see if it works, but I'm getting no results. Please bear in mind that I'm new to R.
But I've been wracking my brain over this. Can someone give me a clue as to what I'm missing? Many thanks for any help provided.
nchar is the function that counts the number of characters in a string. If you don't want to count whitespace, you can use gsub to remove the spaces from your string and then count the characters again. You could use the following code:
w <- "I hope everyone has a good weekend"
answer <- function (x) {
statement <- paste("There are", nchar(x), "characters in that string")
statement
}
answer(w)
#> [1] "There are 34 characters in that string"
answer2 <- function (x) {
statement <- paste("There are", nchar(gsub(" ", "", x)),
"characters in that string")
statement
}
answer2(w)
#> [1] "There are 28 characters in that string"
Created on 2023-02-03 with reprex v2.0.2
You are confusing the function nchar() with your function's input.
Look at the following:
w <- "I hope everyone has a good weekend"
answer <- function (myInputString) {
statement <- paste("There are", nchar(myInputString), "characters in that string")
return(statement)
}
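Note that you also forgot to return a value at the end of your function to specify what the output should be. Because the last expression in your version is an assignment, its value is returned invisibly, which is why nothing is printed.
Good luck with your journey into coding ;)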
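A small sketch of why the original version appeared to give "no results": a function whose last expression is an assignment still returns the value, but invisibly. The function names quiet and loud are made up for this illustration.

```r
w <- "I hope everyone has a good weekend"

# Last expression is an assignment: the value is returned invisibly.
quiet <- function(x) {
  statement <- paste("There are", nchar(x), "characters in that string")
}

# Last expression is the value itself: the result auto-prints.
loud <- function(x) {
  paste("There are", nchar(x), "characters in that string")
}

quiet(w)                        # prints nothing at the console
loud(w)                         # prints the sentence
withVisible(quiet(w))$visible   # FALSE
withVisible(loud(w))$visible    # TRUE
identical(quiet(w), loud(w))    # TRUE: same value, different visibility
```

Wrapping the call in print(), or assigning it and inspecting the variable, also shows the value of the quiet version.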
Just for a bit of fun - and for you to try to work out what is going on - here are some alternative functions that give the same answer as the built-in nchar but don't actually use it...
This one splits it into a list of single characters, converts it to a vector, and returns the length...
library(stringr)  # provides str_split
nchar1 <- function(s) length(unlist(str_split(s, "")))
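This one converts it into raw format (a vector of the byte values that are used to encode the string) and returns the length. Note that this counts bytes, so it matches nchar only when every character is a single byte, as in plain ASCII text...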
nchar2 <- function(s) length(charToRaw(s))
This one uses a while loop to see at which point the substring function substr returns an empty string...
nchar3 <- function(s){
i <- 0
while(substr(s, i+1, i+2) != ""){
i <- i+1
}
return(i)
}
This one uses a similar approach to count how many times we can remove the first character before getting to an empty string...
nchar4 <- function(s){
i <- 0
while(s != ""){
s <- sub(".", "", s)
i <- i + 1
}
return(i)
}
This one might make your head hurt a bit. It uses a similar technique to the last one but uses Recall to call itself until it gets to the point (a blank string) at which it returns an answer.
nchar5 <- function(s, n = 0){
if(s == "") {
return(n)
} else {
Recall(sub(".", "", s), n + 1)
}
}
nchar1("Good luck!")
[1] 10
nchar2("Good luck!")
[1] 10
nchar3("Good luck!")
[1] 10
nchar4("Good luck!")
[1] 10
nchar5("Good luck!")
[1] 10
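One caveat worth checking for the charToRaw approach: it counts bytes rather than characters, so the two can disagree for non-ASCII text. The sketch below assumes a UTF-8 encoded string; the variable s is just an example.

```r
# "\u00e9" is e with an acute accent; enc2utf8 makes the encoding explicit.
s <- enc2utf8("caf\u00e9")

nchar(s, type = "chars")   # 4 characters
nchar(s, type = "bytes")   # 5 bytes in UTF-8
length(charToRaw(s))       # 5, the byte count again
```

For plain ASCII input such as "Good luck!" all three values agree, which is why the example above returns 10.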
I am a beginner in R and am attempting the following question:
Create a function in R which takes as its
input a natural number N and returns as an output the list of
all perfect numbers between 1 and N.
There are 3 steps here:
1. Check the list of factors
2. Check whether it is a perfect number
3. Check from 1 to 10000
factorlist<-function(n){
if(n<2){return("Invalid Input")}
if(n%%1!=0){return("Invalid Input")}
vec<-0
for(i in 1:(n-1)){
if(n%%i==0){
vec[length(vec)]<-i
vec<-c(vec,0)
}
}
vec<-vec[-length(vec)]
return(vec)
}
perfectcheck<-function(n){
if(n-sum(factorlist(n)) ==0) {return("Perfect Number")}
else{return("Not Perfect Number")}
}
perfectcheckN<-function(N){
for(i in 1:N){
if(perfectcheck(i)=="Perfect Number"){
vec[length(vec)]<-i
vec<-c(vec)
}
}
vec<-vec[-length(vec)]
return(vec)
}
and I got the following error for my third step:
Error in sum(factorlist(n)) : invalid 'type' (character) of argument
I spent a few hours on this and still could not figure out my mistake. Please help. Thanks!
The output of factorlist(i) is the character string "Invalid Input" when i==1, and sum() cannot add up a character value.
There are a lot of loops and ifs in your code. You can just do
facs <- function (x) {
x <- as.integer(x)
div <- seq_len(abs(x) - 1L)
div[x%%div == 0L]
}
perfectcheckN <- function(N){
out <- 1:N
out[sapply(out, function(x) x == sum(facs(x)))]
}
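As a usage check, here is a self-contained sketch of the same vectorised idea. The names proper_divisors and perfect_upto are made up here, and vapply is swapped in for sapply so that the result type is fixed even when N is 0.

```r
# Proper divisors of x: every d in 1..(x - 1) with x %% d == 0.
proper_divisors <- function(x) {
  div <- seq_len(x - 1L)
  div[x %% div == 0L]
}

# All perfect numbers between 1 and N.
perfect_upto <- function(N) {
  candidates <- seq_len(N)
  is_perfect <- vapply(
    candidates,
    function(x) x == sum(proper_divisors(x)),
    logical(1)
  )
  candidates[is_perfect]
}

perfect_upto(500)
# [1]   6  28 496
```

For 1 the set of proper divisors is empty, so the sum is 0 and 1 is correctly rejected without a special case.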
I am trying to find the equivalent of the ANYALPHA SAS function in R. This function searches a character string for an alphabetic character and returns the first position at which the character is found.
Example: looking at the following string '123456789A', the ANYALPHA function would return 10 since first alphabetic character is at position 10 in the string. I would like to replicate this function in R but have not been able to figure it out. I need to search for any alphabetic character regardless of case (i.e. [:alpha:])
Thanks for any help you can offer!
Here's an anyalpha function. I added a few extra features. You can specify the maximum number of matches you want in the n argument; it defaults to 1. You can also specify whether you want the position or the value itself with value=TRUE:
anyalpha <- function(txt, n=1, value=FALSE) {
txt <- as.character(txt)
indx <- gregexpr("[[:alpha:]]", txt)[[1]]
ret <- indx[1:(min(n, length(indx)))]
if(value) {
mapply(function(x,y) substr(txt, x, y), ret, ret)
} else {ret}
}
#test
x <- '123A56789BC'
anyalpha(x)
#[1] 4
anyalpha(x, 2)
#[1] 4 10
anyalpha(x, 2, value=TRUE)
#[1] "A" "B"
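If only the first position is needed, a vectorised variant can be sketched with regexpr. The name first_alpha is made up, and returning 0 when no alphabetic character is found is an assumption about the desired behaviour, not something stated in the question.

```r
first_alpha <- function(txt) {
  # regexpr gives the position of the first match, or -1 if there is none.
  pos <- as.integer(regexpr("[[:alpha:]]", as.character(txt)))
  pos[pos < 0L] <- 0L   # assumption: report "no match" as 0
  pos
}

first_alpha("123456789A")
# [1] 10
first_alpha(c("123A56789BC", "12345", "abc"))
# [1] 4 0 1
```

Unlike the gregexpr version above, this works on a whole character vector at once but cannot return more than one match per string.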
I have a vector of a binary string:
a<-c(0,0,0,1,0,1)
I would like to convert this vector into decimal.
I tried using the compositions package and the unbinary() function, however, this solution and also most others that I have found on this site require g-adic string as input argument.
My question is how can I convert a vector rather than a string to decimal?
To illustrate the problem:
library(compositions)
unbinary("000101")
[1] 5
This gives the correct solution, but:
unbinary(a)
unbinary("a")
unbinary(toString(a))
produces NA.
You could try this function
bitsToInt<-function(x) {
packBits(rev(c(rep(FALSE, 32-length(x)%%32), as.logical(x))), "integer")
}
a <- c(0,0,0,1,0,1)
bitsToInt(a)
# [1] 5
Here we skip the character conversion. This only uses base functions.
It is likely that
unbinary(paste(a, collapse=""))
would have worked should you still want to use that function.
There is a one-liner solution:
Reduce(function(x,y) x*2+y, a)
Explanation:
Expanding the application of Reduce results in something like:
Reduce(function(x,y) x*2+y, c(0,1,0,1,0)) = (((0*2 + 1)*2 + 0)*2 + 1)*2 + 0 = 10
For each new bit, we double the value accumulated so far and then add the new bit to it.
Please also see the description of Reduce() function.
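The intermediate values of that expansion can be inspected with the accumulate argument of Reduce. This is only an illustration of the explanation above; the names bits and steps are made up.

```r
bits <- c(0, 1, 0, 1, 0)

# accumulate = TRUE keeps every partial result instead of only the last one.
steps <- Reduce(function(acc, bit) acc * 2 + bit, bits, accumulate = TRUE)
steps
# [1]  0  1  2  5 10
```

The last element is the decimal value; each earlier element is the value of the bits read so far.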
If you'd like to stick to using compositions, just convert your vector to a string:
library(compositions)
a <- c(0,0,0,1,0,1)
achar <- paste(a,collapse="")
unbinary(achar)
[1] 5
This function will do the trick.
bintodec <- function(y) {
# find the decimal number corresponding to binary sequence 'y'
if (! (all(y %in% c(0,1)))) stop("not a binary sequence")
res <- sum(y*2^((length(y):1) - 1))
return(res)
}
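As a quick cross-check, the approaches in this thread can be compared on the same input. The sketch below also uses base R's strtoi, which is not mentioned above and is included here only as one more way to verify the result.

```r
a <- c(0, 0, 0, 1, 0, 1)

# Fold from the left: double the running value, then add the next bit.
via_reduce <- Reduce(function(acc, bit) acc * 2 + bit, a)

# Weight each bit by its power of two (most significant bit first).
via_powers <- sum(a * 2^(rev(seq_along(a)) - 1))

# Collapse to a string and parse it as a base-2 integer.
via_strtoi <- strtoi(paste(a, collapse = ""), base = 2)

c(via_reduce, via_powers, via_strtoi)
# [1] 5 5 5
```

Note that strtoi returns an integer, so it is limited to values that fit in R's 32-bit integer type, whereas the two arithmetic versions work in doubles.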
I have a FASTA format file in which I have to keep only those nodes whose length is less than 100. However, the problem I am currently facing is that I am able to separate the nodes, but I am not able to put the characters of each node into a separate variable whose length I can then check, so that I can separate the requisite nodes from the longer ones.
So what I mean is that I am able to read the headings and separate the nodes, but how do I put the characters within each node into a variable?
This is a sample of my data:
>NODE_1
GTTGGCCGAGCCCCAGGACGCGTGGTTGTTGAACCAGATCAGGTCCGGGCTCCACTGCAC
GTAGTCCTCGTTGGACAGCAGCGGGGCGTACGAGGCCAGCTTGACCACGTCGGCGTTGCG
CTCGAGCCGGTCATGAACGCGGCCTCGGCGAGGGCGTTCTTCCAGGCGTTGCCCTGGGAA
>NODE_2
CCTCCGGCGGCACCACGGTCGGCGAGGCCCTCAACATCCTGGAGCGCACCGACCTGTCCA
CCGCGGACAAGGCCGGTTACCTGCACCGCTACATCGAGGCCAGCCGCATCGCGTTCGCGG
ACCGCGGGCGCTGGGTCGGCGACCCCGCCTTCGAGGACGTAC
>NODE_3
CCTCCGGCGGCACCACGGTCGGCGAGGCCCTCAACATCCTGGAGCGCACCGACCTGTCCA
CCGCGGACAAGGCCGGTTACCTGCACCGCTACATCGAGGCCAGCCGCATCGCGTTCGCGG
ACCGCGGGCGCTGGGTCGGCGACCCCGCCTTCGAGGACGTACATCATTCCTTAATCTTCC
My code:
x <- readLines("1.fa", n = -1L, ok = TRUE, warn = TRUE)
for (i in 1:length(x)) {
if (substr(x[i],1,1)=='>') {
head <- c(head,x[i])
q <- x[i+1]
if (q=!0) {
contig <- c(contig,q)
print(contig)
contig.length <- c(contig.length, nchar(q))
} else {
break
}
} else {
z <- paste(z,x[i], sep=" ")
}
}
You should use Bioconductor for that. You're actually trying to parse a FASTA file into some kind of list. Bioconductor's Biostrings package has a function readDNAStringSet() that does just that and returns an object from which you can get the lengths and so on (read.fasta() is the equivalent in the CRAN package seqinr). Learning Bioconductor is definitely worth the hassle if you work with sequences.
To do it in base R, you'll need to work with lists, something like :
Split.Fasta <- function(x){
out <- list()
for(i in x){
if(substr(i,1,1)==">") {
name <- gsub(">","",i)
out[[name]] <- character(0)
} else if (grepl("\\w",i)){
out[[name]] <- paste(out[[name]],gsub("\\W","",i),sep="")
}
}
out
}
Which works like :
zz <- textConnection(">NODE_1
GTTGGCCGAGCCCCAGGACGCGTGGTTGTTGAACCAGATCAGGTCCGGGCTCCACTGCAC
GTAGTCCTCGTTGGACAGCAGCGGGGCGTACGAGGCCAGCTTGACCACGTCGGCGTTGCG
CTCGAGCCGGTCATGAACGCGGCCTCGGCGAGGGCGTTCTTCCAGGCGTTGCCCTGGGAA
>NODE_2
CCTCCGGCGGCACCACGGTCGGCGAGGCCCTCAACATCCTGGAGCGCACCGACCTGTCCA
CCGCGGACAAGGCCGGTTACCTGCACCGCTACATCGAGGCCAGCCGCATCGCGTTCGCGG
ACCGCGGGCGCTGGGTCGGCGACCCCGCCTTCGAGGACGTAC
>NODE_3
CCTCCGGCGGCACCACGGTCGGCGAGGCCCTCAACATCCTGGAGCGCACCGACCTGTCCA
CCGCGGACAAGGCCGGTTACCTGCACCGCTACATCGAGGCCAGCCGCATCGCGTTCGCGG
ACCGCGGGCGCTGGGTCGGCGACCCCGCCTTCGAGGACGTACATCATTCCTTAATCTTCC")
X <- readLines(zz,n=-1L,ok=TRUE,warn=TRUE)
close(zz)
Y <- Split.Fasta(X)
$`NODE_1 `
[1] "GTTGGCCGAGCCCCAGGACGCGTGGTTGTTGAACCAGATCA...
$`NODE_2 `
[1] "CCTCCGGCGGCACCACGGTCGGCGAGGCCCTCAACATCCTGGAGC...
$`NODE_3 `
[1] "CCTCCGGCGGCACCACGGTCGGCGAGGCCCTCAACATCCTGGAGCGCAC...
It returns a list which you can use later on to check lengths and so on :
sapply(Y,nchar)
NODE_1 NODE_2 NODE_3
180 162 180
Still, learn to use BioConductor, you'll thank yourself for that.
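To finish the original task in base R (keeping only nodes shorter than 100), here is a minimal sketch. It uses a small made-up input instead of the real file and assumes every header line is followed by at least one sequence line; the variable names are illustrative.

```r
# Tiny made-up FASTA content; in practice this would come from readLines().
fasta_lines <- c(
  ">NODE_1", "ACGTACGT", "ACGT",
  ">NODE_2", strrep("A", 60), strrep("C", 60),
  ">NODE_3", "GGCC"
)

is_header <- startsWith(fasta_lines, ">")
group <- cumsum(is_header)   # record number for every line

# Join the sequence lines of each record into one string.
chunks <- split(fasta_lines[!is_header], group[!is_header])
seqs <- vapply(chunks, paste, character(1), collapse = "")
names(seqs) <- sub("^>", "", fasta_lines[is_header])

lens <- nchar(seqs)
short <- seqs[lens < 100]    # keep only nodes shorter than 100

lens
# NODE_1 NODE_2 NODE_3 
#     12    120      4 
names(short)
# [1] "NODE_1" "NODE_3"
```

The cumsum over the header flags is what ties each sequence line to its node, so no explicit loop or per-node variable is needed.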
You could install the seqinr package, which has lots of methods for analysing sequence data.
install.packages("seqinr")
Next, read in your fasta file:
library(seqinr)  # provides read.fasta and getLength
seqs <- read.fasta("myfile.fa")
And then, extract sequences from the list with length < 100:
seqs.small <- seqs[sapply(seqs, function(x) getLength(x) < 100)]
Maybe assign() would be helpful?
assign('NODE_1', 'GTTGG...')