Customized Replacing in R Studio [duplicate] - r

This question already has answers here:
Converting unit abbreviations to numbers
(4 answers)
Closed 1 year ago.
I have a homework assignment to analyze data from the Bloomberg Billionaires Index in RStudio, but I am having a problem with the decimal points. The values come in three forms:
185B (No periods)
18.5B (one digit after the period)
1.85B (two digits after the period)
I want to remove the decimal points and replace the billion symbol (B) with nine zeros, but doing that naively would make all three values identical.
Is there a way to add:
Nine zeros for the first form (no decimal point)
Eight zeros for the second form (one digit after the decimal point)
Seven zeros for the third form (two digits after the decimal point)
Thank you!!

x <- c('185B', '18.5B', '1.85B')
as.numeric(sub('B', '', x, fixed = TRUE)) * 10^9
If using packages is allowed, you can use readr::parse_number to extract the number directly.
readr::parse_number(x) * 10^9

We can use
library(stringr)
as.numeric(str_remove(x, 'B')) * 10^9
#[1] 1.85e+11 1.85e+10 1.85e+09
data
x <- c('185B', '18.5B', '1.85B')
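Both answers work because the number is parsed with its decimal point intact and then multiplied, so there is no need to count zeros per form. As a minimal sketch of the same idea for more than one suffix: the multipliers table and the parse_abbrev() helper below are hypothetical names, and the K, M and T suffixes are an assumption, since only B appears in the question.

```r
# Hypothetical lookup table; only "B" actually appears in the question.
multipliers <- c(K = 1e3, M = 1e6, B = 1e9, T = 1e12)

# Hypothetical helper: split off the last character as the suffix,
# parse the rest as a number, and scale by the matching multiplier.
parse_abbrev <- function(x) {
  suffix <- substr(x, nchar(x), nchar(x))
  number <- as.numeric(substr(x, 1, nchar(x) - 1))
  unname(number * multipliers[suffix])  # unknown suffixes give NA
}

x <- c("185B", "18.5B", "1.85B")
values <- parse_abbrev(x)

# Print without scientific notation to check the magnitudes by eye.
format(values, scientific = FALSE, big.mark = ",", trim = TRUE)
```

An unrecognised suffix yields NA rather than an error, which makes unexpected values easy to spot in a column.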

Related

Count occurrences of value in a set of variables in R (per column) [duplicate]

This question already has answers here:
Counting the number of elements with the values of x in a vector
(20 answers)
Closed 1 year ago.
I have this data and I want to figure out a way to know how many ones and how many zeros are in each column (ie Arts and Crafts). I have been trying different things but it hasn't been working. Does anyone have any suggestions?
You can use the table() function in R, which counts how often each distinct value occurs. I have also used unlist() here to convert a list to a vector.
df1 <- read.csv("Your_CSV_file_name_here.csv")
table(unlist(df1$ArtsAndCrafts))
If you want to count the zeros and ones row-wise instead, you can refer to this question on Stack Overflow.
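To get the counts for every column at once, table() can be applied per column with sapply(). This is only a sketch: the small data frame below is made-up stand-in data, and the Sports column name is an assumption (the original data was not included in the question).

```r
# Made-up stand-in data; the real data frame would come from read.csv().
df1 <- data.frame(
  ArtsAndCrafts = c(1, 0, 1, 1),
  Sports        = c(0, 0, 1, 0)
)

# factor(..., levels = c(0, 1)) keeps both rows in the result even if a
# column happens to contain only zeros or only ones.
counts <- sapply(df1, function(col) table(factor(col, levels = c(0, 1))))
counts

# If only the number of ones is needed, colSums() on a logical matrix works too.
ones <- colSums(df1 == 1)
```

The result is a matrix with one row per value ("0" and "1") and one column per variable.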

Finding specific decimal point digit [duplicate]

This question already has an answer here:
Extract digit from numeric in r
(1 answer)
Closed 1 year ago.
For example, if I had the number 7.12935239484 and wanted just the 10th decimal place digit (in this example the answer would be 8), how would I go about displaying that using R?
Multiply by 1e10, round down to a whole number with floor(), and then take the result mod 10 to retrieve the digit.
floor(7.12935239484* 1e10) %%10
The easiest way is probably by string manipulation.
Use format() with enough digits to make sure that you include the digits you want.
I have written the digit position as 10+2 to emphasize that you are skipping over the first two digits (7.) and taking the 10th digit after the decimal point.
x <- 7.12935239484
substr(format(x,digits=20), start = 10+2, stop = 10+2)
It might be more principled (and robust) to use numerical manipulation:
floor((x*1e10) %% 10)
This shifts the decimal point 10 places and then calculates the remainder modulo 10 (the parentheses around x*1e10 are needed to get the right order of operations). Unlike the string-based solution, this still works if there are more digits to the left of the decimal point.
Extract digit from numeric in r is almost a duplicate ...
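The numerical approach generalises to any decimal position. The nth_decimal() helper below is a hypothetical name used for illustration; it is a sketch that assumes the requested digit is within double precision (roughly 15 to 16 significant digits in total), beyond which the result is not meaningful.

```r
# Hypothetical helper: return the n-th digit after the decimal point.
# abs() makes it behave the same for negative numbers.
nth_decimal <- function(x, n) {
  floor(abs(x) * 10^n) %% 10
}

x <- 7.12935239484
nth_decimal(x, 10)       # 10th decimal place
nth_decimal(123.456, 2)  # still works with more digits before the point
```

Because doubles cannot represent most decimal fractions exactly, a digit that sits right at the edge of the available precision may come out differently from what is printed.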

How to create a sequence of text IDs in r? [duplicate]

This question already has answers here:
How to add leading zeros?
(8 answers)
Closed 6 years ago.
My problem is to create a sequence of IDs in a vector. The vector will contain 001 to 020 then 030 to 100.
I can generate numbers by
x <- c(1:20,30:100)
but this is not in the format I am interested.
x <- c(paste("00", 1:9, sep=""),paste("0", 10:99, sep=""),100)
As suggested by Frank... use sprintf for formatted output. I like the %f formatter for numbers. It is designed to format floating-point numbers, and %f is replaced by the number. You can add a 0 after the % to pad with leading zeros, and you can also define how many characters you want overall (in your case 3) and how many decimal digits (0 after the .). Play a little with it. It is great for formatted output, file names, etc.
sprintf('%03.0f', c(1:20,30:100))
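Since c(1:20, 30:100) is an integer vector, the integer formatter %d can be used in place of %f. A minimal sketch, with formatC() shown as a base R alternative:

```r
x <- c(1:20, 30:100)

# %03d: integer, padded with zeros to a width of 3 characters.
ids <- sprintf("%03d", x)

# The same result with formatC().
ids_alt <- formatC(x, width = 3, flag = "0")

head(ids)
tail(ids)
```

Note that %d requires integer-valued input; if x is stored as a double with fractional values, sprintf() stops with an error, in which case the %03.0f form above is the safer choice.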

Grep a 1 or 2 character number from a variable in R

I have two tables, Grow and Temp. Grow details the growing and nongrowing seasons for each area I'm concerned with (these are static), while Temp details the average temperature each month of a wide range of years (these are columns) for those same areas. I'm trying to identify the average temperature each year for the growing and nongrowing seasons, but I've run into a problem with how I pick the columns - I'm using grep, but can't figure out how to account for two-digit months!
At this point I've isolated each column of years, and have a vector of column names for a single year - something like "tmp_2001_1" "tmp_2001_2" "tmp_2001_3" ... "tmp_2001_11" "tmp_2001_12". I also have, stored in a pair of variables, the start and end of growing season, which are stored as integers representing months: start <- some number between 1 and 12, end <- some number between 1 and 12. But I can't figure out how to get grep to identify the growing season when start or end has 2 digits. This is what I have that works for the case of them both being 1 digit numbers:
grow_months <- grep(paste('tmp_','2001','_','[',start,'-',end,']',sep = ''), vector_of_column_names)
How can I expand this to account for cases where start or end is double digit?
As Gregor suggests,
strings = paste0("tmp_2001_",c(1:20,100))
start = 9
end = 12
ii = sapply(strsplit(strings,'_'), function(x)x[3]%in%start:end)
strings[ii]
or using an easy regex you could try
ii = grep(paste(paste0('_(',start:end,')$'),collapse='|'), strings)
strings[ii]
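The original pattern fails for two-digit months because [start-end] is a character class: it matches a single character in a range, so it cannot express "9 to 12". One way to sidestep regex ranges entirely is to extract the month as a number and compare numerically. This is a sketch assuming every column name ends in _<month>, as in the question; the cols and month names are illustrative.

```r
cols  <- paste0("tmp_2001_", 1:12)
start <- 9
end   <- 12

# Drop everything up to the last underscore, leaving the month number.
month <- as.integer(sub(".*_", "", cols))

# Plain numeric comparison handles one- and two-digit months alike.
grow_months <- which(month >= start & month <= end)
cols[grow_months]
```

Like grep(), which() returns the positions of the matching columns, so grow_months can be used to index the Temp table directly.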

Counting number of sequences in a vector [duplicate]

This question already has answers here:
How can I count runs in a sequence?
(2 answers)
Closed 8 years ago.
I have a binary vector and I want to count how many sequences of 1's I've got. So that if I have a vector like:
bin <- c(1,1,0,1,1,1,1,0,0,0,1,0,1,1,0,0,1,1,1)
I would get 5. I haven't found any existing functions that could do this, anyone got any good tips on how one could write one? I don't know how to build the "counter" when the sequences all have different lengths.
The run-length encoding function (rle) is built for this. It computes the lengths of runs of equal values in a vector and, helpfully, returns those lengths together with the values. So use rle(bin).
Compare the $values output to your desired value (1) with == and sum the result (because you get a TRUE or 1L when the run of values is of 1's):
sum( rle(bin)$values == 1 )
[1] 5
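To see what is being summed, it can help to look at the rle() result itself. The sketch below also pulls out the length of each run of 1's, which addresses the concern about runs having different lengths; the runs and one_lengths names are illustrative.

```r
bin <- c(1,1,0,1,1,1,1,0,0,0,1,0,1,1,0,0,1,1,1)

runs <- rle(bin)
runs$values   # the value of each run, in order
runs$lengths  # how long each run is

# Lengths of the runs of 1's only.
one_lengths <- runs$lengths[runs$values == 1]
one_lengths          # 2 4 1 2 3
length(one_lengths)  # 5, the same count as sum(rle(bin)$values == 1)
```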
