How can I format my variable Number like this:
1
10
100
1 000
10 000
100 000
1 000 000
10 000 000
100 000 000
1 000 000 000
My plan is to reformat Number as a string and then check whether str(Number) is in element.text.
element.text output is:
xxxxxxx Sälj xxxxxxx
B xxxxxxx B zzz
zzzzzz zzz 1 500 0,767 1 150,50 SEK 19:14
Time=datetime.now().strftime('%H:%M')[:]
Time=19:14
Action=Sälj
Number = int(1500)
sring = f'{Number:,}'            # '1,500'
sring = sring.replace(',', ' ')  # '1 500'
Number = sring
result_elements = driver2.find_elements_by_xpath("""//*[@data-deal_id]""")
for element in result_elements:
    # every value has to be tested against the text separately
    if Time in element.text and Action in element.text and Number in element.text:
        print('order was executed')
        print(element.text)
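A minimal sketch of the formatting and the membership check, kept free of Selenium so it can be run on its own. The helper names `format_thousands` and `order_executed` are hypothetical, and the sample text is the `element.text` output quoted above. It assumes the page separates digit groups with an ordinary space; if the site uses a non-breaking space instead, pass that as `sep`.

```python
def format_thousands(value: int, sep: str = " ") -> str:
    """Format an integer with `sep` between groups of three digits."""
    # f"{value:,}" yields e.g. '1,500'; swap the comma for the separator.
    return f"{value:,}".replace(",", sep)


def order_executed(text: str, time_str: str, action: str, number: int) -> bool:
    """Return True only if the time, action and formatted number all occur in text."""
    needles = (time_str, action, format_thousands(number))
    return all(needle in text for needle in needles)


# Sample text copied from the element.text output in the question.
sample_text = (
    "xxxxxxx Sälj xxxxxxx\n"
    "B xxxxxxx B zzz\n"
    "zzzzzz zzz 1 500 0,767 1 150,50 SEK 19:14"
)

print(format_thousands(1500))                              # 1 500
print(order_executed(sample_text, "19:14", "Sälj", 1500))  # True
```

Each value is tested separately because `a, b, c in text` does not test all three; `all(...)` over the individual checks does.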
I would like to extract the number located beside a specific string in each cell. My data looks like this.
item stock
PRE 24GUSSETX4SX15G 200
PLS 12KLRX10SX15G 200
ADU 24SBX200ML 200
NIS 18BNDX40SX11G 200
REF 500GX12BTL 200
I want to extract the numbers located beside the strings 'GUSSET', 'KLR', 'SB', 'BND' and 'BTL', and then multiply each number by the stock. For example:
item stock pcs total
PRE 24GUSSETX4SX15G 200 24 4800
PLS 12KLRX10SX15G 200 12 2400
ADU 24SBX200ML 200 24 4800
NIS 18BNDX40SX11G 200 18 3600
REF 500GX12BTL 200 12 2400
Does anyone know how to extract the numbers? Thanks very much in advance.
One way, using base R, is to use sub to extract the number that precedes one of those groups and multiply it by stock to get total.
df$pcs <- as.numeric(sub(".*?(\\d+)(GUSSET|KLR|SB|BND|BTL).*", "\\1", df$item))
df$total <- df$stock * df$pcs
df
# item stock pcs total
#PRE 24GUSSETX4SX15G 200 24 4800
#PLS 12KLRX10SX15G 200 12 2400
#ADU 24SBX200ML 200 24 4800
#NIS 18BNDX40SX11G 200 18 3600
#REF 500GX12BTL 200 12 2400
Or everything in one pipe:
library(dplyr)
df %>%
  mutate(pcs = as.numeric(sub(".*?(\\d+)(GUSSET|KLR|SB|BND|BTL).*", "\\1", item)),
         total = stock * pcs)
We can do this with the tidyverse, using str_extract and a lookahead:
library(tidyverse)
df %>%
  mutate(pcs = as.numeric(str_extract(item, "(\\d+)(?=(GUSSET|KLR|SB|BND|BTL))")),
         total = pcs * stock)
# item stock pcs total
#1 PRE 24GUSSETX4SX15G 200 24 4800
#2 PLS 12KLRX10SX15G 200 12 2400
#3 ADU 24SBX200ML 200 24 4800
#4 NIS 18BNDX40SX11G 200 18 3600
#5 REF 500GX12BTL 200 12 2400
data
df <- structure(list(item = c("PRE 24GUSSETX4SX15G", "PLS 12KLRX10SX15G",
"ADU 24SBX200ML", "NIS 18BNDX40SX11G", "REF 500GX12BTL"), stock = c(200L,
200L, 200L, 200L, 200L)), class = "data.frame", row.names = c(NA,
-5L))
How to remove whitespaces between letters NOT numbers
For example:
Input
I ES P 010 000 000 000 000 000 001 001 000 000 IESP 000 000
Output
IESP 010 000 000 000 000 000 001 001 000 000 IESP 000 000
I tried something like this:
gsub("(?<=\\b\\w)\\s(?=\\w\\b)", "", x,perl=T)
But I wasn't able to arrive at the output I was hoping for.
Use gsub to match a space between two letters, capture both letters, and replace the match with just the two captured letters.
Input <- "I ES P 010 000 000 000 000 000 001 001 000 000 IESP 000 000"
gsub("([A-Z]) ([A-Z])", "\\1\\2", Input)
[1] "IESP 010 000 000 000 000 000 001 001 000 000 IESP 000 000"
Edit after @Wiktor Stribiżew's comment (replaced [A-z] with [a-zA-Z]):
For lower and upper case use [a-zA-Z]
Input <- "I ES P 010 000 000 000 000 000 001 001 000 000 IESP 000 000 aaa ZZZ"
gsub("([a-zA-Z]) ([a-zA-Z])", "\\1\\2", Input)
[1] "IESP 010 000 000 000 000 000 001 001 000 000 IESP 000 000 aaaZZZ"
You need to use
Input <- "I ES P E ES P 010 000 000 000 000 000 001 001 000 000 IESP 000 000"
gsub("(?<=[A-Z])\\s+(?=[A-Z])", "", Input, perl=TRUE, ignore.case = TRUE)
## gsub("(*UCP)(?<=\\p{L})\\s+(?=\\p{L})", "", Input, perl=TRUE) ## for Unicode
See the R demo online and a regex demo.
NOTE: ignore.case = TRUE makes the pattern case-insensitive; remove this argument if that is not what you want.
Details
(?<=[A-Z]) (or (?<=\\p{L})) - a letter must appear immediately to the left of the current location (without being added to the match)
\\s+ - one or more whitespace characters
(?=[A-Z]) (or (?=\\p{L})) - a letter must appear immediately to the right of the current location (without being added to the match).
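For readers who want to try the same lookaround idea outside R, here is a rough Python rendition using the standard `re` module. It is a sketch rather than a translation of the R call: the function name `squeeze_letter_gaps` is hypothetical, and the character class `[A-Za-z]` stands in for `ignore.case = TRUE`.

```python
import re


def squeeze_letter_gaps(text: str) -> str:
    """Remove whitespace that sits directly between two ASCII letters."""
    # The lookbehind and lookahead only assert that a letter is present;
    # they consume nothing, so the same letter can border two gaps in a row.
    return re.sub(r"(?<=[A-Za-z])\s+(?=[A-Za-z])", "", text)


sample = "I ES P 010 000 000 000 000 000 001 001 000 000 IESP 000 000"
print(squeeze_letter_gaps(sample))
# IESP 010 000 000 000 000 000 001 001 000 000 IESP 000 000

# For comparison: a capture-group pattern consumes the letters it matches,
# so a letter that borders two gaps can only take part in one replacement.
print(re.sub(r"([A-Za-z]) ([A-Za-z])", r"\1\2", "I ES P E ES P"))  # IESP EESP
print(squeeze_letter_gaps("I ES P E ES P"))                        # IESPEESP
```

Spaces next to digits are left alone because neither lookaround accepts a digit.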
I'm looking for a way to produce descriptive statistics by group number in R. There is another answer on here I found, which uses dplyr, but I'm having too many problems with it and would like to see what alternatives others might recommend.
I'm looking to obtain descriptive statistics on revenue grouped by group_id. Let's say I have a data frame called company:
group_id company revenue
1 Company A 200
1 Company B 150
1 Company C 300
2 Company D 600
2 Company E 800
2 Company F 1000
3 Company G 50
3 Company H 80
3 Company H 60
and I'd like to produce a new data frame called new_company:
group_id company revenue average min max SD
1 Company A 200 217 150 300 62
1 Company B 150 217 150 300 62
1 Company C 300 217 150 300 62
2 Company D 600 800 600 1000 163
2 Company E 800 800 600 1000 163
2 Company F 1000 800 600 1000 163
3 Company G 50 63 50 80 12
3 Company H 80 63 50 80 12
3 Company H 60 63 50 80 12
Again, I'm looking for alternatives to dplyr. Thank you
Using the sample data frame
dd<-read.csv(text="group_id,company,revenue
1,Company A,200
1,Company B,150
1,Company C,300
2,Company D,600
2,Company E,800
2,Company F,1000
3,Company G,50
3,Company H,80
3,Company H,60", header=T)
You could do something fancy like use ave() to create all the values per row for your different functions and then just combine that with the original data.frame.
ext <- with(dd, Map(function(x) ave(revenue, group_id, FUN=x),
list(avg=mean, min=min, max=max, SD=sd)))
cbind(dd, ext)
# group_id company revenue avg min max SD
# 1 1 Company A 200 216.66667 150 300 76.37626
# 2 1 Company B 150 216.66667 150 300 76.37626
# 3 1 Company C 300 216.66667 150 300 76.37626
# 4 2 Company D 600 800.00000 600 1000 200.00000
# 5 2 Company E 800 800.00000 600 1000 200.00000
# 6 2 Company F 1000 800.00000 600 1000 200.00000
# 7 3 Company G 50 63.33333 50 80 15.27525
# 8 3 Company H 80 63.33333 50 80 15.27525
# 9 3 Company H 60 63.33333 50 80 15.27525
but really a simple dplyr command would be easier.
library(dplyr)
dd %>% group_by(group_id) %>%
  mutate(
    avg = mean(revenue),
    min = min(revenue),
    max = max(revenue),
    SD = sd(revenue))
Another function I like to use is describeBy from the psych package. It takes the variable to describe and the grouping variable:
library(psych)
description <- describeBy(df$variable_to_be_described, df$group_variable)
EMPLTOT_N FIRMTOT average min
12289593 4511051 5 1
26841282 1074459 55 10
15867437 81243 300 100
6060684 8761 750 500
52366969 8910 1000 1000
137003 47573 5 1
226987 10372 55 10
81011 507 300 100
23379 52 750 500
13698 42 1000 1000
67014 20397 5 1
My data look like the data above. I want to create a new column emp, using the mutate function, such that:
emp= average*FIRMTOT if EMPLTOT_N/FIRMTOT<min
and emp=EMPLTOT_N if EMPLTOT_N/FIRMTOT>min
In your sample data EMPLTOT_N / FIRMTOT is never less than min, but this should work:
df <- read.table(text = "EMPLTOT_N FIRMTOT average min
12289593 4511051 5 1
26841282 1074459 55 10
15867437 81243 300 100
6060684 8761 750 500
52366969 8910 1000 1000
137003 47573 5 1
226987 10372 55 10
81011 507 300 100
23379 52 750 500
13698 42 1000 1000
67014 20397 5 1", header = TRUE)
library('dplyr')
mutate(df, emp = ifelse(EMPLTOT_N / FIRMTOT < min, average * FIRMTOT, EMPLTOT_N))
In the above, if EMPLTOT_N / FIRMTOT == min, emp will be given the value of EMPLTOT_N, since you didn't specify what you want to happen in this case.
Adding as integers instead of list elements in R
I am getting:
> total = 0
> for (qty in a[5]){
+ total = total + as.numeric(unlist(qty))
+ print(total)
+ }
[1] 400 400 400 400 400 400 400 400 400 400
What I really want is:
> total = 0
> for (qty in a[5]){
+ total = total + as.numeric(unlist(qty))
+ print(total)
+ }
[1] 400 800 1200 1600 2000 2400 2800 3200 3600 4000
To refine this with a more specific scenario:
price buy_sell qty
100 B 100
100 B 200
90 S 300
100 S 400
I want to make a fourth column:
price buy_sell qty net
100 B 100 10000
100 B 200 30000
90 S 300 3000
100 S 400 -37000
Note that if a is a list, you want to use double brackets. Otherwise you get back a list of length one, whose first element holds the values you are looking for.
Try:
total <- cumsum(a[[5]])
a <- list()
a[[5]] <- rep(400, 10)
cumsum(a[[5]])
# [1] 400 800 1200 1600 2000 2400 2800 3200 3600 4000
Compare:
a[5]
a[[5]]
a[5][[1]]
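The running total that cumsum produces can also be sketched in Python with `itertools.accumulate`, shown here for both the repeated 400s and the net column from the refined scenario. The sign rule (buys add price * qty, sells subtract it) is an assumption inferred from the expected net values in the question, and the variable names are illustrative only.

```python
from itertools import accumulate

# Running total of ten 400s, mirroring cumsum(rep(400, 10)).
print(list(accumulate([400] * 10)))
# [400, 800, 1200, 1600, 2000, 2400, 2800, 3200, 3600, 4000]

# Rows copied from the question: (price, buy_sell, qty).
rows = [(100, "B", 100), (100, "B", 200), (90, "S", 300), (100, "S", 400)]

# Assumption: a buy ("B") adds price * qty and a sell ("S") subtracts it.
signed = [price * qty if side == "B" else -price * qty
          for price, side, qty in rows]

net = list(accumulate(signed))
print(net)  # [10000, 30000, 3000, -37000]
```

The key step is the same as in the R answer: accumulate over the vector of values itself, not over a one-element container that wraps it.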