If Else in R, if product ID is X then change Product Name - r

I am trying to figure out what is working and why the other way is not working for me.
At the moment I have a list of shops I use and I need to change the naming every time; so I have decided to go by the product_id which never changes, but my code is not working.
product_id <- vector()
This one is not working:
product_name[product_id == '40600000003'] <- 'my cool store'
but this one does work:
product_name[product_name == 'my#cool#Store'] <- 'my cool store'
I am not sure what I am doing wrong. I also tried:
if (product_id == '40600000003') {
  product_name = 'my cool shop'
}
I have a list of 15 shops whose names I need to change, because they arrive in the wrong format from the API connection.

Try 40600000003 instead of '40600000003'. R is more than likely reading your vector elements as numbers if they don't contain any non-digit characters.
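For the 15-shop case, one way to avoid writing 15 separate assignments is a lookup table keyed by product ID. The sketch below is only an illustration: the data frame `shops`, the IDs, and the names are made up, and it assumes the IDs are stored as character strings. Note that `if (product_id == ...)` tests a single condition, so it is not suited to changing individual elements of a vector, whereas indexing with `match()` works element by element.

```r
# Hypothetical data: IDs kept as character strings.
shops <- data.frame(
  product_id   = c("40600000003", "40600000004", "40600000005"),
  product_name = c("my#cool#Store", "other#Shop", "third#shop"),
  stringsAsFactors = FALSE
)

# Named vector used as a lookup table: names are IDs, values are the clean names.
clean_names <- c(
  "40600000003" = "my cool store",
  "40600000004" = "other shop"
)

# match() gives the position of each ID in the lookup table, or NA if it is absent.
idx <- match(shops$product_id, names(clean_names))

# Overwrite only the rows whose ID was found; leave the others untouched.
found <- !is.na(idx)
shops$product_name[found] <- unname(clean_names[idx[found]])

print(shops)
```

Adding a shop then only means adding one entry to `clean_names`.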

Related

In R: Search all emails by subject line, pull comma-separate values from body, then save values in a dataframe

Each day, I get an email with the quantities of fruit sold on a particular day. The structure of the email is as below:
Date of report:,04-JAN-2022
Time report produced:,5-JAN-2022 02:04
Apples,6
Pears,1
Lemons,4
Oranges,2
Grapes,7
Grapefruit,2
I'm trying to build some code in R that will search through my emails, find all emails with a particular subject, iterate through each email to find the variables I'm looking for, take the values and place them in a dataframe with the "Date of report" put in a date column.
With the assistance of people in the community, I was able to achieve the desired result in Python. However as my project has developed, I need to now achieve the same result in R if at all possible.
Unfortunately, I'm quite new to R and therefore if anyone has any advice on how to take this forward I would greatly appreciate it.
For those interested, my Python code is below:
import pandas as pd
import win32com.client

#PREP THE STUFF
Fruit_1 = "Apples"
Fruit_2 = "Pears"
searchf = [
    Fruit_1,
    Fruit_2
]
#DEF THE STUFF
def get_report_vals(report, searches):
    dct = {}
    for line in report:
        term, *value = line
        if term.casefold().startswith('date'):
            dct['date'] = pd.to_datetime(value[0])
        elif term in searches:
            dct[term] = float(value[0])
    if len(dct.keys()) != len(searches):
        dct.update({x: None for x in searches if x not in dct})
    return dct
#DO THE STUFF
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6)
messages = inbox.Items
messages.Sort("[ReceivedTime]", True)
results = []
for message in messages:
    if message.subject == 'FRUIT QUANTITIES':
        if Fruit_1 in message.body and Fruit_2 in message.body:
            data = [line.strip().split(",") for line in message.body.split('\n')]
            results.append(get_report_vals(data, searchf))
        else:
            pass
fruit_vals = pd.DataFrame(results)
fruit_vals.columns = map(str.upper, fruit_vals.columns)
I'm probably going about this the wrong way, but I'm trying to use the steps I took in Python to achieve the same result in R. So for example I create some variables to hold the fruit sales I'm searching for, then I create a vector to store the searchables, and then when I create an equivalent 'get_vals' function, I create an empty vector.
library(RDCOMClient)
Fruit_1 <- "Apples"
Fruit_2 <- "Pears"
##Create vector to store searchables
searchf <- c(Fruit_1, Fruit_2)
## create object for outlook
OutApp <- COMCreate("Outlook.Application")
outlookNameSpace = OutApp$GetNameSpace("MAPI")
search <- OutApp$AdvancedSearch("Inbox", "urn:schemas:httpmail:subject = 'FRUIT QUANTITIES'")
inbox <- outlookNameSpace$Folders(6)$Folders("Inbox")
vec <- c()
for (x in emails)
{
  subject <- emails(i)$Subject(1)
  if (grepl(search, subject)[1])
  {
    text <- emails(i)$Body()
    print(text)
    break
  }
}
read.table could be a good start for get_report_vals.
The code below outputs the result as a list; exception handling still needs to be implemented:
report <- "
Date of report:,04-JAN-2022
Apples,6
Pears,1
Lemons,4
Oranges,2
Grapes,7
Grapefruit,2
"
get_report_vals <- function(report, searches) {
  data <- read.table(text = report, sep = ",")
  colnames(data) <- c('key', 'value')
  # find date
  date <- data[grepl("date", data$key, ignore.case = TRUE), "value"]
  # transform dataframe to list
  lst <- split(data$value, data$key)
  # output result as list
  c(list(date = date), lst[searches])
}
get_report_vals(report,c('Lemons','Oranges'))
$date
[1] "04-JAN-2022"
$Lemons
[1] "4"
$Oranges
[1] "2"
The results of several reports can then be combined row by row using rbind (note that with list inputs rbind returns a matrix, not a data.frame):
rbind(get_report_vals(report,c('Lemons','Oranges')),get_report_vals(report,c('Lemons','Oranges')))
date Lemons Oranges
[1,] "04-JAN-2022" "4" "2"
[2,] "04-JAN-2022" "4" "2"
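If an actual data.frame with one row per report is wanted, one option is to have the parsing function return a one-row data.frame and bind the rows with `do.call(rbind, ...)`. This is a minimal sketch rather than a drop-in replacement: `parse_report` and the two sample reports are hypothetical, the date is left as text, and a fruit that is missing from a report becomes `NA`.

```r
# Parse one report body ("key,value" lines) into a one-row data.frame.
parse_report <- function(report, searches) {
  data <- read.table(text = report, sep = ",",
                     col.names = c("key", "value"),
                     colClasses = "character", strip.white = TRUE)
  # Named lookup: key -> value
  vals <- setNames(data$value, data$key)
  # First line whose key starts with "date" (case-insensitive)
  date <- data$value[grepl("^date", data$key, ignore.case = TRUE)][1]
  out <- data.frame(date = date, stringsAsFactors = FALSE)
  # One numeric column per searched fruit; absent fruits become NA
  for (s in searches) {
    out[[s]] <- as.numeric(vals[s])
  }
  out
}

# Two made-up report bodies
report_1 <- "Date of report:,04-JAN-2022\nApples,6\nPears,1\nLemons,4\n"
report_2 <- "Date of report:,05-JAN-2022\nApples,3\nLemons,9\n"

rows <- lapply(list(report_1, report_2), parse_report,
               searches = c("Apples", "Pears"))
fruit_vals <- do.call(rbind, rows)
print(fruit_vals)
```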
The code now works as intended. The function ended up quite different from the ones recommended:
library(stringr)   # str_extract
library(janitor)   # row_to_names
library(dplyr)     # %>%, mutate, bind_rows
get_vals <- function(email) {
  body <- email$body()
  date <- str_extract(body, "\\d{2}-[:alpha:]{3}-\\d{4}") %>%
    as.character()
  data <- read.table(text = body, sep = ",", skip = 9, strip.white = T) %>%
    row_to_names(1) %>%
    mutate("Date" = date)
  return(data)
}
In addition, I've written this to bind the rows together:
info <- sapply(results, get_vals, simplify = F) %>%
  bind_rows()
Maybe this is not the answer you were expecting, but I must state it here to help other readers avoid such mistakes in the future.
Unfortunately your Python code is not well-written. For example, I've noticed the following code where you iterate over all items in a folder and check the Subject and message bodies for keywords:
for message in messages:
    if message.subject == 'FRUIT QUANTITIES':
        if Fruit_1 in message.body and Fruit_2 in message.body:
You need to use the Find/FindNext or Restrict methods of the Items class instead. So, you don't need to iterate over all items in a folder. Instead, you get only items that correspond to your conditions. Read more about these methods in the following articles:
How To: Use Find and FindNext methods to retrieve Outlook mail items from a folder (C#, VB.NET)
How To: Use Restrict method to retrieve Outlook mail items from a folder
You may combine all your search criteria into a single query. So, you just need to iterate over found items and extract the data.
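Since the question asks for R, here is a rough sketch of how the Restrict approach might look through RDCOMClient. It has not been run against a live mailbox, so treat it as an assumption-laden outline: `subject_filter` is a hypothetical helper, the subject is assumed to contain no single quotes, and the COM part only runs on Windows with RDCOMClient (and Outlook) available.

```r
# Build a filter string for Items$Restrict().
# Assumes the subject contains no single quotes.
subject_filter <- function(subject) {
  sprintf("[Subject] = '%s'", subject)
}

print(subject_filter("FRUIT QUANTITIES"))

# Only attempt the COM calls where RDCOMClient can be used.
if (.Platform$OS.type == "windows" &&
    requireNamespace("RDCOMClient", quietly = TRUE)) {
  app   <- RDCOMClient::COMCreate("Outlook.Application")
  ns    <- app$GetNamespace("MAPI")
  inbox <- ns$GetDefaultFolder(6)   # 6 = olFolderInbox, as in the Python code
  # Let Outlook do the filtering instead of looping over every message
  hits  <- inbox$Items()$Restrict(subject_filter("FRUIT QUANTITIES"))
  n     <- hits$Count()
  # Collect the body text of each matching message
  bodies <- vapply(seq_len(n), function(i) hits$Item(i)$Body(), character(1))
}
```

Each element of `bodies` could then be handed to a parsing function such as get_report_vals.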
Also you may find the AdvancedSearch method helpful. The key benefits of using the AdvancedSearch method in Outlook are:
The search is performed in another thread. You don’t need to run another thread manually since the AdvancedSearch method runs it automatically in the background.
Possibility to search for any item types: mail, appointment, calendar, notes etc. in any location, i.e. beyond the scope of a certain folder. The Restrict and Find/FindNext methods can be applied to a particular Items collection (see the Items property of the Folder class in Outlook).
Full support for DASL queries (custom properties can be used for searching too). You can read more about this in the Filtering article in MSDN. To improve the search performance, Instant Search keywords can be used if Instant Search is enabled for the store (see the IsInstantSearchEnabled property of the Store class).
You can stop the search process at any moment using the Stop method of the Search class.
See Advanced search in Outlook programmatically: C#, VB.NET for more information.

How do I access a calculated field in Rails?

In my first attempt to develop something in Ruby on Rails :) ... I have a list of names stored in fields "first_name" and "last_name". In my Person model, I have defined something like this:
def sort_name
  last_name + ',' + first_name
end
Now I want to show all persons in a list, sorted by sort_name, but (in my controller) something like
@persons = Person.order(:sort_name)
doesn't work (Unknown column 'sort_name' in 'order clause'). How do I reference the calculated field sort_name in my controller?
I am sure this is an "oh my god I am so stupid" moment, but I'm happy for any advice!
If the model Person has the fields name, first_lastname and second_lastname, you can do the following:
Person.order(:name, :first_lastname, :second_lastname)
By default the ordering is ascending. You can also specify ascending or descending order for each field:
Person.order(name: :asc, first_lastname: :desc, second_lastname: :asc)
Additionally, if you want to add a column with the complete name, you can use select. With PostgreSQL the code would be:
people = Person.order(
  name: :asc, first_lastname: :desc, second_lastname: :asc
).select(
  "*, concat(name,' ', first_lastname, ' ',second_lastname) as sort_name"
)
people[0].sort_name
# the sort_name can be for example "Adán Saucedo Salas"

Vectorized Operation in R causing problems with custom function

I'm writing out some functions for Inventory management. I've recently wanted to add a "photo url column" to my spreadsheet by using an API I've used successfully while initially building my inventory. My Spreadsheet header looks like the following:
SKU | NAME | OTHER STUFF
I have a getProductInfo function that returns a list of product info from an API I'm calling.
getProductInfo <- function(barcode) {
  # Input: UPC
  # Output: list of product info
  info <- CallAPI(barcode)
  # Process API return, remove garbage
  return(info)
}
I made a new function that takes my inventory csv as input, and attempts to add a new column with product photo url.
get_photo_url_from_product_info_output <- function(in_list) {
  # Input: getProductInfo output. Returns the photo URL, or "" if
  # it doesn't exist
  if (in_list$DisplayStockPhotos == TRUE) {
    return(in_list$StockPhotoURL)
  } else {
    return("")
  }
}
add_Photo_URL <- function(in_csv) {
  # Input: CSV data frame; appends a photo URL column
  # Requires SKU (UPC); assumes no photo URL column yet
  out_csv <- mutate(in_csv, photo =
    get_photo_url_from_product_info_output(
      getProductInfo(SKU)
    )
  )
  return(out_csv)
}
#Call it
new <- add_Photo_URL(old)
My thinking was that R would simply take the SKU from each row and put it through the double function call "as is", and that the vectorized dplyr function mutate would take care of the rest. Unfortunately I was running into all sorts of problems I couldn't understand. Eventually I figured out that the API call was crashing because the SKU field was all messed up as it was being passed in. I put in a breakpoint and found out that it wasn't just passing in one SKU, but instead an entire list (I think?) of SKUs: every row all at once. Something like this:
#Variable 'barcode' inside getProductInfo function contains:
[1] 7.869368e+11 1.438175e+10 1.256983e+10 2.454357e+10 3.139814e+10 1.256983e+10 1.313260e+10 4.339643e+10 2.454328e+10
[10] 1.313243e+10 6.839046e+11 2.454367e+10 2.454363e+10 2.454367e+10 2.454348e+10 8.418870e+11 2.519211e+10 2.454375e+10
[19] 2.454381e+10 2.454381e+10 2.454383e+10 2.454384e+10 7.869368e+11 2.454370e+10 2.454390e+10 1.913290e+11 2.454397e+10
[28] 2.454399e+10 2.519202e+10 2.519205e+10 7.742121e+11 8.839291e+11 8.539116e+10 2.519211e+10 2.519211e+10 2.519211e+10
Obviously my initial getProductInfo function can't handle that, so it'll crash.
How should I modify my code, whether it be in the input or API call to avoid this vectorized operation issue?
Well, it's not totally elegant but it works.
I figured out I need to use lapply, which is usually not my strong suit. Initially I tried to nest the calls like so:
lapply(SKU, get_photo_url_from_product_info_output(getProductInfo()))
But that didn't work. So I came up with the bright idea of making another function:
get_photo_url_from_sku <- function(barcode) {
  return(get_photo_url_from_product_info_output(getProductInfo(barcode)))
}
Call that in the lapply:
out_csv<- mutate(in_csv, photocolumn = lapply(SKU, get_photo_url_from_sku))
And it works great. My speed is only limited by my API calls.
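One detail worth knowing: lapply() returns a list, so the new column ends up as a list column. If a plain character column is preferred, vapply() is an alternative. The sketch below uses base R only; the `getProductInfo` shown here is a made-up stand-in for the real API call (it contacts nothing), and the SKUs and URLs are invented.

```r
# Stand-in for the real API call (hypothetical): handles exactly ONE barcode.
getProductInfo <- function(barcode) {
  stopifnot(length(barcode) == 1)
  has_photo <- barcode != "222"
  list(
    DisplayStockPhotos = has_photo,
    StockPhotoURL = if (has_photo) paste0("https://example.com/", barcode, ".jpg") else NULL
  )
}

get_photo_url_from_product_info_output <- function(in_list) {
  if (isTRUE(in_list$DisplayStockPhotos)) in_list$StockPhotoURL else ""
}

get_photo_url_from_sku <- function(barcode) {
  get_photo_url_from_product_info_output(getProductInfo(barcode))
}

inventory <- data.frame(SKU = c("111", "222", "333"), stringsAsFactors = FALSE)

# vapply() calls the function once per SKU and checks that each call
# returns a single string, so the result is a character vector.
inventory$photo <- vapply(inventory$SKU, get_photo_url_from_sku,
                          character(1), USE.NAMES = FALSE)
print(inventory)
```

The same `vapply()` call can be placed inside `mutate()` in place of `lapply()`.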

Dynamic filter with LINQ (EntityFramework)

The problem is quite simple. We have a store with many products.
For example each item has fields (and many others):
Height
Weight
Length
Name
Color
Price
We implement some page where we can filter items by range for numbers (like price, length) and string (color).
So, the problem is here:
- we need to allow people to filter items by any of criteria above (color+price, color+length+weight).
The basic approach with predefined SELECT+WHERE is too hard to maintain.
Is there any other option?
Thank you.
Jonathan is right. I think using if cases would be easiest and would look something like this.
var res = _dbContextorSource.table.Where(x => x.ID > 0); //something to get all of them
//or (x => x.ID >= 0 && x.ID <= 50) if you're showing 50 on a page
if (FirstFilterIsUsed) {
    res = res.Where(x => x.FirstField == FirstFilter);
}
if (SecondFilterIsUsed) {
    res = res.Where(x => x.SecondField == SecondFilter);
}
//etc
I think you could implement a cleaner solution that loops through each filter. This is very much pseudocode, but I've used solutions like this.
var filters = GetUserFilters();
foreach (Filter filter in filters) {
    res = res.Where(x => object.Equals(
        x.GetType().GetProperty(filter.MatchingName).GetValue(x),
        filter.FilterValue));
}
var result = res.ToList();
Common Solution
The most common solution is writing a big if case. It takes some time to support all your fields, but it will work great in the end.
However, if you want to do it dynamically, 2 third-party libraries can help you with this
LINQ Dynamic
https://www.nuget.org/packages/System.Linq.Dynamic.Core/
The syntax required is a little bit different from C# but works great. It's the most popular library for this kind of thing.
C# Eval Expression
Disclaimer: I'm the owner of the project C# Eval Expression
The library is not free, but you can do pretty much any dynamic LINQ using the same syntax as C#.
So you will be able to build a string to evaluate and the library will do the rest for you.
Here are some examples using EF Classic:
https://dotnetfiddle.net/Otm0Aa
https://dotnetfiddle.net/mwTqW7

Avoiding JSON error displaying Japanese strings within Plotly (R) / Running a function on one variable at a time

I'm very new to R and beginner level at programming in general, and trying to figure out how to get hovertext in plotly to display a Japanese string from my dataframe. After venturing through character encoding hell, I've got things mostly worked out but am getting stuck on a single point: Getting the Japanese string to display in the final plot.
plot_ly(df, x = ~cost, y = ~grossSales, type = "scatter", mode = "markers",
        hoverinfo = "text",
        text = ~paste0("Product name: ", productName,
                       "<br>Gross: ", grossSales, "<br> Cost: ", cost)
)
The problem I encounter is that using 'productName' returns the Japanese string from the dataframe, which causes the plot to fail to render. DOM Inspector's console shows JSON encountering issues with the string (even though it's just encoded in UTF-8).
Using toJSON(productName), I am able to render the plot; however, this renders the hover textbox with the full contents of the productName column (e.g., ["","Product1","Product2","Product3"...]). I only want the name of that specific product, just as 'grossSales' and 'cost' only return the data specific to that product at each point on the plot.
Is there a way I can execute toJSON() only on each specific instance of 'productName'? (i.e., output should be "Product1" with JSON friendly string format) Alternatively, is there a way I can have plotly read the list output and select only the correct productName?
Stepping away from the problem to continue studying other things, I found a partial solution in using a for-loop:
productNames <- NULL
for (i in 1:nrow(df))
{
  productNames <- c(productNames, toJSON(df[i, "productName"]))
}
df$jsonProductNames <- productNames
Using the jsonProductNames variable within plotly, the graph renders and displays only the name for each product! The sole issue remaining is that it is displayed with the JSON [""] formatting around each product's name.
Update:
I've finally got this working fully how I want it. I imagine there are more elegant solutions, and I'd still be interested to learn how to achieve what I originally was looking at if possible (run a function on a variable within R for each time it is encountered in a loop), but here is how I have it working:
colToJSON <- function(df, colStr)
{
  JSONCol <- NULL
  for (i in 1:nrow(df))
  {
    JSONCol <- c(JSONCol, toJSON(df[i, colStr]))
  }
  JSONCol <- gsub("\\[\"", "", JSONCol)
  JSONCol <- gsub("\"\\]", "", JSONCol)
  return(JSONCol)
}
df$jsonProductNames <- colToJSON(df, "productName")
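On the original question of running a function on one value at a time, `vapply()` is one way to do it without an explicit loop. The sketch below assumes `toJSON` comes from the jsonlite package (the post does not say which package is used) and that jsonlite is installed; `to_json_one` and the sample data are hypothetical. With `auto_unbox = TRUE`, jsonlite does not wrap a length-one vector in `[ ]`, so only the surrounding quotes remain to be stripped.

```r
# Made-up data standing in for the real data frame
df <- data.frame(
  productName = c("Product1", "Product2"),
  stringsAsFactors = FALSE
)

if (requireNamespace("jsonlite", quietly = TRUE)) {
  # Convert ONE string to JSON text
  to_json_one <- function(x) {
    json <- as.character(jsonlite::toJSON(x, auto_unbox = TRUE))
    # Drop the surrounding double quotes, keeping any escapes inside
    sub('^"(.*)"$', "\\1", json)
  }
  # Apply the function to each product name separately
  df$jsonProductNames <- vapply(df$productName, to_json_one,
                                character(1), USE.NAMES = FALSE)
  print(df)
}
```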

Resources