Autofill based on list and value of a cell - formula

I'm making a spreadsheet to help me with my personal accounting. I'm trying to create a formula in LibreOffice Calc that will search in a given cell for a number of different text strings and if found return a text string.
For example, the formula should search for "burger" or "McDonalds" in $C6 and then return "Food" to $E6. It should not be case sensitive, and it needs to match partial strings as well, as in the case of Burger King. I need it to be able to search for other keywords and return those values too, like "AutoZone" returning "Auto" and "NewEgg" returning "Electronics".
I've had a tough time finding any kind of solution to this, and the closest I could get was with a MATCH formula, but once I nested it in an IF it would not work. I've also tried nested IF with OR; no joy on either.
Examples:
=IF(OR(D10="*hulu*",D10="*netflix*",D10="*movie*",D10="*theature*",D10="*stadium*",D10="*google*music*")=1,"Entertainment",IF(OR(D10="*taco*",D10="*burger*",D10="*mcdonald*",D10="*dq*",D10="*tokyo*",D10="*wendy*",D10="*cafe*",D10="*wing*",D10="*tropical*",D10="*kfc*",D10="*olive*",D10="*caesar*",D10="*costa*vida*",D10="*Carl*",D10="*in*n*out*",D10="*golden*corral*",D10="*nija*",D10="*arby*",D10="*Domino*",D10="*Subway*",D10="*Iggy*",D10="*Pizza*Hut*",D10="*Rumbi*",D10="*Custard*",D10="*Jimmy*")=1,"Food",IF(OR(D10="*autozone*",D10="*Napa*",D10="*OREILLY*")=1,"AUTO","-")))
I can create a different table and make a lookup reference, so another way to put this is: I need something that does the opposite of what VLOOKUP and HLOOKUP do, and returns the header value for any data matching in the given columns.
Something like:
=IF(NOT(ISNA(MATCH(A1,B3:B99))),B2,IF(NOT(ISNA(MATCH(A1,C3:C99))),c2,0))
If A1 was the test and B2 and C2 were the headers and it was searching below those.

As per my comments, try this:
=IF(SUM(LEN(G150)-LEN(SUBSTITUTE(LOWER(G150),{"hulu","netflix","movie","theater"," stadium"},"")))>0,"Entertainment",IF(SUM(LEN(G150)-LEN(SUBSTITUTE(LOWER(G150),{"burger","taco","vida","cafe","wing","dairy","mcdonald","wendy","kfc","pizza","carl","domino","caesar","olive","jimmy","custard","subway","arby"},"")))>0,"Food",IF(SUM(LEN(G150)-LEN(SUBSTITUTE(LOWER(G150),{"autozone","napa","oreilly"},"")))>0,"AUTO","-")))
It is an Array formula and must be confirmed with Ctrl-Shift-Enter.
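For anyone who wants to see the logic of that formula outside the spreadsheet, here is a minimal Python sketch of the same idea: lowercase the text, test each keyword as a substring, and return the first category that has a hit. The `CATEGORIES` table and the `categorize` name are made up for illustration, and the keyword lists are only a subset of the ones in the formula above.

```python
# Hypothetical keyword table; a subset of the keywords from the formula above.
CATEGORIES = {
    "Entertainment": ["hulu", "netflix", "movie", "theater", "stadium"],
    "Food": ["burger", "taco", "mcdonald", "wendy", "kfc", "pizza"],
    "AUTO": ["autozone", "napa", "oreilly"],
}


def categorize(description, categories=CATEGORIES, default="-"):
    """Return the first category whose keyword occurs in the description."""
    text = description.lower()  # same role as LOWER(G150): case-insensitive
    for category, keywords in categories.items():
        # Substring test, so "burger" also matches "Burger King #123".
        if any(keyword in text for keyword in keywords):
            return category
    return default  # same role as the final "-" in the nested IF


if __name__ == "__main__":
    for sample in ["Burger King #123", "NETFLIX.COM", "AutoZone 55", "Rent"]:
        print(sample, "->", categorize(sample))
```

Note that the keywords have to be lowercase here too, for the same reason they do in the formula: the text is lowercased before the comparison.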

You can do this in various ways using INDEX/MATCH/VLOOKUP formulae. Just a couple of caveats: I am using Excel and have never used LibreOffice, so I hope this works; and you will need a mapping table that maps McDonalds to Food, Google Music to Entertainment and so on (for all the possible cases).
Let's assume your mapping table in your screenshot is A6 to E9.
The formula in E10 =vlookup(C10,$C$6:$E$9,3,0)
Explanation: it looks up C10 (Burger King) in the table $C$6:$E$9 and the result is the 3rd column in that table (E is the 3rd column counting from C, where C10 was looked up). The 0 will give you an exact match. Entering 1 there gives an approximate match instead, which expects the lookup column to be sorted and is not the same as a substring match.
Note: if your mapping table is in, say, columns G and H (Service name in G and Type of Service in H), AND you are unsure how many entries it will have, a mod to the formula is =vlookup(C10,$G:$H,2,0) OR =vlookup(C10,$G:$H,2,1) for an approximate match. Here, 3 is replaced by 2 because H is the 2nd column counting from G, where C10 will be looked up.
EDIT: Doing VLOOKUP with INDEX and MATCH functions for an approximate match of text - this could be the solution you are looking at in your last comment(?)
Two things need to be done: (a) set up the reference table entries, and (b) apply the INDEX/MATCH function.
Part a - in your reference table, you will have to put each lookup value between two *s, the way you mention in the example in your question (*movie*, *wendy*, etc.). That's really the trick that enables us to look up by cell reference. Corresponding return values like Entertainment/Food/etc. need to be their own full words. Let's assume you have this table prepared in G6:H26 (G: lookup value, H: return value).
Part b - In your cell F6 (as per your screenshot), you can try this formula: =INDEX($H$6:$H$26,MATCH(C6,$G$6:$G$26,0))
That really just is the replacement formula for VLOOKUP using INDEX/MATCH.
As your values stored in column G are in *s, the cell C6 in the MATCH formula will do a partial read.


LibreOffice Calc - How to reuse multi step formula?

I'm making a balance sheet, Sheet1 is for the ins and outs, and most values are added manually or simple formulas, and Sheet2 is where I created a formula, in the hopes of being able to reuse it.
I'm not an accountant, so I don't know how to make the calculations easier. I'm a programmer, so I understand that the solution I'm imagining may well be impossible with the way LibreOffice Calc's formulas work.
So, to explain a bit.
On Sheet1, each column is a month, and the value is a tax that will appear one time each month, dependent on another value.
So, the base value is on row 17, and on row 18 I would like the result to be set, for every month of course.
On Sheet2, I have the function, it contains 5 steps, with the values being reused a lot (hence, simplifying everything into one line would be hell).
This is the complex formula in question, D1 is the input, C6 is the output.
The formula below is the one used on C2, and repeated down to C5.
I would like to keep the constants as a table, since that would make it easier to update in the future in case anything changes.
I have been searching for a possible solution but found none, and I believe that's likely because I'm looking for a solution like a programmer (use a sheet as a function) when I should be looking for some other way, but I don't know how Calc works.
In regards to the calculation, I don't know the specific name, but the idea is: from 0 to A1, I have to take B1% of A1-0, then B2% of A2-A1, and so on.
Of course the formula's complexity comes from treating lower values. For example, if D1 was 2K, then I would have to take 7.5% of R$ 96.02, and everything beyond that is 0, since there is nothing remaining for the later steps to calculate.
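To make the calculation itself concrete, here is a rough Python sketch of that kind of tiered (bracket-by-bracket) computation. The `progressive_amount` name and the table layout are my own, not taken from the sheet. The 1903.98 limit and the 7.5% rate are inferred from the 2K example above (2000 - 1903.98 = 96.02); every other number in the table is a placeholder, so substitute the real constants.

```python
def progressive_amount(value, brackets):
    """Sum rate * (part of value inside each bracket).

    brackets is a list of (upper_limit, rate) pairs sorted by upper_limit;
    each rate applies only to the slice between the previous limit and its own.
    """
    total = 0.0
    lower = 0.0
    for upper, rate in brackets:
        if value <= lower:
            break  # nothing left for this bracket or any later one
        portion = min(value, upper) - lower
        total += portion * rate
        lower = upper
    return total


# Placeholder table: only the first limit and the 7.5% rate follow the example.
BRACKETS = [
    (1903.98, 0.0),
    (2826.65, 0.075),      # upper limit is a placeholder
    (float("inf"), 0.15),  # rate is a placeholder
]

if __name__ == "__main__":
    # For 2000, only 96.02 falls in the 7.5% bracket.
    print(round(progressive_amount(2000, BRACKETS), 4))
```

Keeping the constants in one list mirrors the wish to keep them in a table on Sheet2: a change to the limits or rates only touches that table.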
Most of the descriptions I found on MULTIPLE.OPERATIONS were confusing, but I found one that made it much easier to understand.
The answer was to use this formula on Sheet1:
=MULTIPLE.OPERATIONS('Sheet2'.$C$6, 'Sheet2'.$D$1, C17)
I can just copy paste it to the side and the calculation will be executed properly.
To explain the arguments:
1 - where the result will appear
2 - the location of the main/first formula variable
3 - the location of dynamic variable you want to insert in that formula (So this is from Sheet1)
More arguments could be used if more variables were needed, but I just needed one.
This is the place with the best explanation I found for the function.
https://wiki.documentfoundation.org/Documentation/Calc_Functions/MULTIPLE.OPERATIONS

Is there a way to extract a substring from a cell in OpenOffice Calc?

I have tens of thousands of rows of unstructured data in csv format. I need to extract certain product attributes from a long string of text. Given a set of acceptable attributes, if there is a match, I need it to fill in the cell with the match.
Example data:
"[ROOT];Earrings;Brands;Brands>JeweleryExchange;Earrings>Gender;Earrings>Gemstone;Earrings>Metal;Earrings>Occasion;Earrings>Style;Earrings>Gender>Women's;Earrings>Gemstone>Zircon;Earrings>Metal>White Gold;Earrings>Occasion>Just to say: I Love You;Earrings>Style>Drop/Dangle;Earrings>Style>Fashion;Not Visible;Gifts;Gifts>Price>$500 - $1000;Gifts>Shop>Earrings;Gifts>Occasion;Gifts>Occasion>Christmas;Gifts>Occasion>Just to say: I Love You;Gifts>For>Her"
Look up table of values:
Zircon, Diamond, Pearl, Ruby
Output:
Zircon
I tried using the VLOOKUP() function, but it needs to match an entire cell and works better for translating acronyms. Haven't really found a built in function that accomplishes what I need. The data is totally unstructured, and changes from row to row with no consistency even within variations of the same product. Does anyone have an idea how to do this?? Or how to write an OpenOffice Calc function to accomplish this? Also open to other better methods of doing this if anyone has any experience or ideas in how to approach this...
OK, so I figured out how to do this on my own. I created many different columns, each with a keyword I was looking to extract as its header.
Then I used this formula to extract the keywords into the correct row beneath the column header: =IF(ISERROR(SEARCH(CF$1,$D769)),"",CF$1). The SEARCH function returns the position of the search string as a number, or an error if the string is not found. I use ISERROR to detect the error condition, and the IF leaves the cell blank on an error or otherwise takes the value of the header. I had over 100 columns of specific information to extract, plus one final column where I join all the previous cells in the row together for the final list. Worked like a charm. I recommend this approach to anyone who has to do a similar task.
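Since the question was also open to other methods: the same header-per-keyword idea can be sketched in a few lines of Python, assuming the rows have been read from the csv into strings. The `extract_attributes` name is made up for this example, and the comparison is case-insensitive to mimic SEARCH.

```python
def extract_attributes(text, keywords):
    """Return the keywords that occur anywhere in text, ignoring case."""
    lowered = text.lower()
    # One test per keyword, like one IF(ISERROR(SEARCH(...))) column each.
    return [kw for kw in keywords if kw.lower() in lowered]


if __name__ == "__main__":
    row = "Earrings>Gemstone>Zircon;Earrings>Metal>White Gold;Gifts>For>Her"
    gemstones = ["Zircon", "Diamond", "Pearl", "Ruby"]
    # Joining the matches plays the role of the final "join" column.
    print(", ".join(extract_attributes(row, gemstones)))
```

Like the spreadsheet version, this is a plain substring test, so a keyword such as "Pearl" would also match inside a longer word.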

How to calculate average annual salary in libreoffice calc

I have a salary data table covering a 10-year period. Every column has a properly set data type (date for "B", number for "C" and "E").
I'm trying to write a formula to calculate the average salary for every year. In column "E" I've manually entered all possible years, and column "F" should hold the yearly average for the year in "E".
So, my best try is this formula: =AVERAGEIF(YEAR(B2:B133);"="&E2;C2:C133)
It tries to calculate an average from column C where the year of the date in column B equals the year in column E.
But all I get is an error Err:504. Figured out, that problem is in YEAR(interval) part, but can't get what exactly...
Can someone point that out?
Thank you!
There are actually many possibilities to solve this.
@JvdV's answer;
using an array formula with @JvdV's solution;
using an array formula with a combination of AVERAGE() and IF();
using the SUMPRODUCT() function;
and surely many other solutions that I don't know about!
Please beware: I use , instead of ; as formula separator, according to my locale; adapt to your needs.
A side note on "array formulas"
This kind of formula must be entered by pressing the Ctrl + Shift + Enter key combination, not just Enter or Tab or a mouse click elsewhere on the sheet.
The resulting formula is shown between curly brackets {}, which are not typed by the user but are added automatically by the software to indicate that this is an array formula.
More on array formulas can be found, for example, in the LibreOffice help system.
Usually you cannot drag and drop array formulas; you have to copy-paste them instead.
Array formula with @JvdV's solution
The solution of JvdV could be slightly modified like this, and then inserted as an array formula:
=AVERAGEIFS(C$2:C$133,YEAR($B$2:$B$133),"="&E2)
When you insert this formula with the Ctrl + Shift + Enter key combination, the software puts the formula into brackets, so that you see it like this: {=AVERAGEIFS(C$2:C$133,YEAR($B$2:$B$133),"="&E2)}
You cannot simply drag the formula down, but you can copy-paste it.
Array formula with a combination of AVERAGE() and IF():
For your example, put this formula in cell F2 (for the year 2010):
=AVERAGE(IF(YEAR($B$2:$B$133)=E2,$C$2:$C$133))
When you insert this formula with the Ctrl + Shift + Enter key combination, the software puts the formula into brackets, so that you see it like this {=AVERAGE(IF(YEAR($B$2:$B$133)=E2,$C$2:$C$133))}
You cannot simply drag the formula down, but you can copy-paste it.
SUMPRODUCT() formula:
My favorite...
Plenty of resources on the web to explain this formula.
In your situation, this would give:
=SUMPRODUCT($C$2:$C$133,--(YEAR($B$2:$B$133)=E2))/SUMPRODUCT(--(YEAR($B$2:$B$133)=E2))
This one you can drag down to your needs.
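If the SUMPRODUCT trick looks opaque, this is roughly what it computes, written as a small Python sketch. The function name and the sample data are invented for illustration. The `mask` list plays the role of `--(YEAR(...)=E2)`: 1 where the year matches, 0 elsewhere.

```python
from datetime import date


def average_for_year(dates, salaries, year):
    """Sum of matching salaries divided by the number of matching rows."""
    mask = [1 if d.year == year else 0 for d in dates]
    count = sum(mask)  # second SUMPRODUCT: how many rows match
    if count == 0:
        return None  # the spreadsheet formula would show a division error
    # first SUMPRODUCT: salaries multiplied by the 0/1 mask, then summed
    return sum(s * m for s, m in zip(salaries, mask)) / count


if __name__ == "__main__":
    dates = [date(2010, 1, 31), date(2010, 2, 28), date(2011, 1, 31)]
    salaries = [1000, 1200, 1500]
    print(average_for_year(dates, salaries, 2010))
```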
Unfortunately AVERAGEIF() expects a range reference instead of a calculated array. Therefore it will error out. That's the theory at least for Excel, and I expect this to be the same for LibreOffice Calc.
One way around it is using the AVERAGEIFS() function and check against first and last days of the year, for example:
=AVERAGEIFS(C$2:C$133;B$2:B$133;">="&DATE(E2;1;1);B$2:B$133;"<="&DATE(E2;12;31))
Drag the formula down.
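The date-bounds idea avoids computing YEAR() on the range at all: it only compares the raw dates against 1 January and 31 December. A minimal Python sketch of that comparison follows, with an invented function name and invented sample data.

```python
from datetime import date


def average_between_year_bounds(dates, salaries, year):
    """Average the salaries whose date lies within the given calendar year."""
    first, last = date(year, 1, 1), date(year, 12, 31)
    # Two criteria on the same column, like the two B$2:B$133 conditions.
    selected = [s for d, s in zip(dates, salaries) if first <= d <= last]
    return sum(selected) / len(selected) if selected else None


if __name__ == "__main__":
    dates = [date(2010, 1, 31), date(2010, 12, 31), date(2011, 1, 1)]
    salaries = [1000, 1400, 1500]
    print(average_between_year_bounds(dates, salaries, 2010))
```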

How to Add Column (script) transform that queries another column for content

I’m looking for a simple expression that puts a ‘1’ in column E if ‘SomeContent’ is contained in column D. I’m doing this in Azure ML Workbench through their Add Column (script) function. Here are some examples they give.
row.ColumnA + row.ColumnB is the same as row["ColumnA"] + row["ColumnB"]
1 if row.ColumnA < 4 else 2
datetime.datetime.now()
float(row.ColumnA) / float(row.ColumnB - 1)
'Bad' if pd.isnull(row.ColumnA) else 'Good'
Any ideas on a 1 line script I could use for this? Thanks
Without really knowing what you want to look for in column 'D', I still think you can find all the information you need in the examples they give.
The script is being wrapped by a function that collects the value you calculate/provide and puts it in the new column. This assignment happens for each row individually. The value could be a static value, an arbitrary calculation, or it could be dependent on the values in the other columns for the specific row.
In the "Hint" section, you can see two different ways of obtaining the values from the other columns:
The current row is referenced using 'row' and then a column qualifier, for example row.colname or row['colname'].
In your case, you obtain the value for column 'D' either by row.D or row['D']
After that, all you need to do is come up with the specific logic for checking whether 'SomeContent' is contained in column 'D' for that specific row. In your case, the '1 line script' would look something like this:
1 if [logic checking that 'SomeContent' is contained in row.D] else 0
If you need help with the logic, you need to provide more specific examples.
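As a starting point, and assuming "contained" means a plain case-sensitive substring test on the text in column D, the expression could be `1 if 'SomeContent' in str(row.D) else 0`. The sketch below wraps it in a hypothetical `flag` function and fakes `row` with `SimpleNamespace`, purely so it can be tried outside the Workbench. Only the one-line expression would go into the Add Column (script) box.

```python
from types import SimpleNamespace


def flag(row):
    # str() guards against non-string cells such as None or numbers.
    return 1 if 'SomeContent' in str(row.D) else 0


if __name__ == "__main__":
    # SimpleNamespace stands in for the row object the Workbench provides.
    print(flag(SimpleNamespace(D="prefix SomeContent suffix")))
    print(flag(SimpleNamespace(D="something else")))
```

For a case-insensitive test, lowercasing both sides (for example `'somecontent' in str(row.D).lower()`) would be one option.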
You can read more in the Azure Machine Learning Documentation:
Sample of custom column transforms (Python)
Data Preparations Python extensions
Hope this helps

Extract formula from Excel Data Table (What-If Analysis)

I am faced with rewriting an Excel project in R. I see a table in which a cell {= TABLE (F2, C2)} is shown. I understand how to create a Table like this (What-If Analysis, Data Table...).
As I have to understand this to rewrite in R, how can I find the original formula which stands behind that cell?
EXAMPLE: I have created a Data Table as shown here and the sheet looks like this:
In my case, I don't know how the sheet was created, and I want to know the initial formula. Now this is shown as {=TABLE(,C4)}.
(In the example I know the answer, it is in the cell (D10), but where is reference for this cell in Data Table?)
I'm using Excel 2007 but have no reason to believe things differ in other versions.
@Stanislav was right to reject my comment suggestion that TABLE was a name; it is an Excel function. But it is a very strange function :-}
There isn't any help on the TABLE function in the local help, it isn't listed in "List of worksheet functions (alphabetical)".
You can't manually enter or edit the TABLE function; error "That function is not valid".
Copy/Pasting cells containing the TABLE function pastes their values, not their formulae, even when you specify Paste Special > Formulas
You can't insert rows/columns immediately above/left of cells containing the TABLE function; error "Cannot change part of a data table".
Pace @pnuts, using Formulas > Formula Auditing on cells containing the TABLE function shows no precedents, and no cells show them as dependents. However, in a VBA sheet auditing tool which I use, the Range.DirectDependents property finds the "formula range" to be dependent on the "margin" cells containing the formulas, but not on those containing the values (see below for an explanation of those terms).
I haven't been able to find anything I regard as decent documentation of TABLE(). I have found lots of illustrations of how to produce and use that function, but nothing clearly specifying the arguments and result. The best I've found is https://support.office.com/en-us/article/Calculate-multiple-results-by-using-a-data-table-e95e2487-6ca6-4413-ad12-77542a5ea50b. I'd be pleased if anyone can point me to better documentation.
I deduce the behaviour to be as described here:
TABLE(Rowinp,Colinp) is an array formula in a contiguous array of cells. I'll refer to that contiguous array as the "formula range" of the data table.
The cells immediately above/left of the formula range are also part of the data table, even though they do not contain a TABLE() function and can be edited; I'll refer to those cells as the "margins" of the data table.
Rowinp and Colinp must be blank or references to single cells.
Rowinp and Colinp must be different (or error "Input cell reference is not valid"), they must not both be blank.
The values in the formula range are calculated by taking formula(s) from the margin(s) and substituting references to Rowinp and/or Colinp with values from the margin(s).
There are three mutually exclusive possibilities, corresponding to which of Rowinp and Colinp is blank.
TABLE(Rowinp, ) Colinp blank. The formula is that in the left margin of the same row with instances of Rowinp replaced by values from the upper margin of the same column.
TABLE( , Colinp) Rowinp blank. The formula is that in the top margin of the same column with instances of Colinp replaced by values from the left margin of the same row.
TABLE(Rowinp, Colinp) Neither blank. The formula is that in the cell at the intersection of the left and top margins with instances of Rowinp replaced by values from the upper margin of the same column and instances of Colinp replaced by values from the left margin of the same row.
I think that should let you work out what the effective formula is in each cell of the formula range.
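Since the goal is a rewrite outside Excel, here is how I would sketch the two-input case, TABLE(Rowinp, Colinp), as ordinary code, under the behaviour deduced above (which, again, may be wrong). It is written in Python rather than R only to keep it short. The `data_table` name and the sample formula are invented; `formula` stands for whatever is in the corner cell, expressed as a function of the two input cells.

```python
def data_table(formula, top_margin, left_margin):
    """Rebuild the formula range of a two-input data table.

    formula(rowinp, colinp) is the corner-cell formula; top_margin values
    are substituted for Rowinp and left_margin values for Colinp.
    """
    return [
        [formula(row_value, col_value) for row_value in top_margin]
        for col_value in left_margin
    ]


if __name__ == "__main__":
    # Hypothetical corner formula: rowinp * colinp.
    grid = data_table(lambda rowinp, colinp: rowinp * colinp,
                      top_margin=[1, 2, 3], left_margin=[10, 20])
    for line in grid:
        print(line)
```

The one-input cases are the same idea with a single substituted argument, using the formula from the left or top margin instead of the corner cell.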
But I wouldn't be surprised to learn that any of the above is wrong :-0
I welcome pointers to anything more authoritative.
I think in your example F2 and C2 are effectively only the addresses of the parameters for a function (TABLE), and those cells may be located anywhere, with the associated formula in the table's top left cell.
So I suggest go to C2, FORMULAS > Formula Auditing and click Trace Dependents, repeat for F2 and see where the arrows converge.
