Align Text To Row Identified by Number and to ID Matching that Embedded in String - formula

I need to align text in an ALERT STRING column with the row identified by number in an ID ROW column.
Additionally, I need to also align the same ALERT STRING text with the same ID ROW number AND with the ID matching that embedded in a string in the TEXT WITH ID column. (This double-check will sometimes be necessary with the real-world data.)
So far, I've only figured out how to align the ALERT STRING with the ID matching that embedded in the TEXT WITH ID column:
=LOOKUP(2,1/SEARCH(A2,$F$2:$F$11),$G$2:$G$11)
I appreciate any help folks can offer. You can find an editable copy of the workbook here:
https://1drv.ms/x/s!ArQ7Kw6ayNMY2zktTW3pDCbMmJZ_
UPDATE: Nayan provided a solution to the first part of this question (please see answer below). I'm still trying to work out a formula for the column D part of this question, in which the row reference shown in column E is combined with a match of the ID shown in column A with its corresponding value in one of the text strings in column F.
The best I've been able to come up with so far is a formula with a high failure rate:
=INDEX($G$2:$G$11,MATCH(ROW(D2),$E$2:$E$11,MATCH("*"&A2&"*",$F$2:$F$11,0)))
Any help with this part of the question will be greatly appreciated.

ROW([reference])
Returns the row number of a reference
E.g., ROW(B2) returns 2. If no reference is provided, ROW() returns the row number of the cell in which it is called.
VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup])
Looks for a value in the leftmost column of a table, and then returns a value in the same row from a column you specify (col_index_num)
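By default (range_lookup is TRUE or omitted), the leftmost column of the table must be sorted in ascending order. With range_lookup set to FALSE, as in the formula below, VLOOKUP requires an exact match and the table does not need to be sorted.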
Try this:
=VLOOKUP(ROW(B2),$E$2:$G$11,3,FALSE)
INDEX(array, row_num, [column_num])
INDEX(reference, row_num, [column_num], [area_num])
Returns a value or reference of the cell at the intersection of a particular row and column, in a given range.
In this case, you have to get row_num with the MATCH function.
MATCH(lookup_value, lookup_array, [match_type])
Returns a relative position of an item in an array that matches a specified value in a specified order.
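match_type: 1 (largest value less than or equal to lookup_value; lookup_array must be sorted ascending), 0 (exact match), -1 (smallest value greater than or equal to lookup_value; lookup_array must be sorted descending)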
Try this:
=INDEX($G$2:$G$11,MATCH(ROW(B2),$E$2:$E$11,0))
Identify data with multiple criteria using MATCH() (in Excel versions without dynamic arrays, confirm this as an array formula with Ctrl+Shift+Enter):
=INDEX($G$2:$G$11,MATCH(1, (ROW(D2) = $E$2:$E$11) * (ISNUMBER(SEARCH(A2, $F$2:$F$11))),0))
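For readers who find the boolean multiplication hard to follow, here is a minimal Python sketch of the same logic as the multiple-criteria formula above: walk the rows and return the first alert where both conditions hold. The function name and the sample data are hypothetical, not taken from the workbook. Plain substring matching stands in for SEARCH, so it is case-insensitive but does not handle SEARCH's wildcards.

```python
def lookup_alert(target_row, target_id, id_rows, texts, alerts):
    """Return the first alert whose ID ROW equals target_row and whose
    TEXT WITH ID contains target_id (case-insensitive, like SEARCH)."""
    for id_row, text, alert in zip(id_rows, texts, alerts):
        if id_row == target_row and str(target_id).lower() in text.lower():
            return alert
    return None  # the Excel formula would show #N/A here


# Hypothetical sample data standing in for columns E, F and G.
id_rows = [4, 2, 3]
texts = ["order AB-17 delayed", "order ZX-99 shipped", "order AB-17 returned"]
alerts = ["ALERT: delay", "ALERT: shipped", "ALERT: return"]

print(lookup_alert(3, "ab-17", id_rows, texts, alerts))  # ALERT: return
```

Each comparison in the `if` corresponds to one parenthesized factor in the formula, and `and` plays the role of the multiplication.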
References:
https://exceljet.net/excel-functions/excel-vlookup-function
https://exceljet.net/excel-functions/excel-index-function
https://exceljet.net/formula/index-and-match-with-multiple-criteria

This is the formula I was looking for in column D:
=INDEX($G$2:$G$11,MATCH(ROW(D2)&"*"&A2&"*",INDEX($E$2:$E$11&$F$2:$F$11,),0))
You can see it working here.
Nayan provided a great deal of help with answering this question, so I will mark his answer as the accepted solution.
Syeda Fahima Nazreen provided the example I referenced to figure out the formula shown above.
Reference:
Nested Excel Formula with Two INDEX Functions and a MATCH Function with Multiple Criteria

Related

Using factors when integer reference column already exists

Say I have a dataframe as such, obtained by reading a csv file:
DF1<-data.frame(ID = c(1,2,3,4),
Names = c('John','Bob','Hannah','Mark'))
I want to factorize the names in the second column, as they take a finite number of values. However, an "ID" column already exists, composed of integers that uniquely identify each name in the second column.
Is
DF1$Names<-factor(DF1$Names)
the correct approach, hence "disregarding" the ID column for the matter, or am I missing something?
EDIT: Sorry everyone, I'll try to be more specific: when factorizing a character column, for which every element is ALREADY uniquely identified in another column by integers ("ID"), should I consider the "ID" in any way during the factorization? My doubt comes from the fact that, as I read in the documentation, factorizing something means giving it an integer reference, which in my case already exists.
Thanks

find common rows between two dataframes based on two columns using bash

I found this very difficult to solve in bash. I have two files, and I want to find the rows they have in common based on two columns.
f1.csv:
col1,col2,col3,col4
Dalir,Cpne1,down,2174
Fendrr,Aco2,up,280
Cpne1,Tox1,down,8900
f2.csv:
col1,col2,col3,col4,col5,col6
Linc,Rmo,ch2,ch2,p,l
Tox1,Cpne1,ch1,ch2,l,p
So basically the code should look only at the first two columns of the files and see if the pairs are the same (the order within a pair is not important). You can see that in the first file there is Cpne1,Tox1 in the third row, and in the second file there is Tox1,Cpne1 in the second row, so this should be printed in the output from the second file.
Desired output:
Tox1,Cpne1
Unfortunately, I have not been able to develop a bash command for this - it would be great if you could help me with this. Thanks
Just adding the explanation to oguz' fine answer in the comments above:
BEGIN{FS=OFS=","} sets , as the field separator for both input and output.
NR==FNR{pair[$1,$2];next} applies while the overall record number equals the current file's record number (in other words, only for the first file): it adds an element to the array pair, indexed by the first and second fields, then skips to the next record.
($1,$2) in pair||($2,$1) in pair{print $1,$2} applies to the second file: it checks whether fields one and two, in either order, are present as an index in the array pair, and prints them if they are.
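The same idea can be sketched in Python for comparison. This is an illustrative equivalent, not the answer's own code: the function name `common_pairs` is hypothetical, and the sketch deliberately skips the header row of each file, which the awk fragments as described would compare like any other row.

```python
import csv
import io


def common_pairs(first, second):
    """Yield (col1, col2) from `second` when the same pair, in either
    order, appears in the first two columns of `first`."""
    first_rows = csv.reader(first)
    next(first_rows, None)  # skip the header row
    # Sorting each pair makes (a, b) and (b, a) compare equal.
    seen = {tuple(sorted(row[:2])) for row in first_rows if len(row) >= 2}

    second_rows = csv.reader(second)
    next(second_rows, None)  # skip the header row
    for row in second_rows:
        if len(row) >= 2 and tuple(sorted(row[:2])) in seen:
            yield row[0], row[1]


# The sample files from the question, held in memory for the demo.
F1 = "col1,col2,col3,col4\nDalir,Cpne1,down,2174\nFendrr,Aco2,up,280\nCpne1,Tox1,down,8900\n"
F2 = "col1,col2,col3,col4,col5,col6\nLinc,Rmo,ch2,ch2,p,l\nTox1,Cpne1,ch1,ch2,l,p\n"

for a, b in common_pairs(io.StringIO(F1), io.StringIO(F2)):
    print(f"{a},{b}")  # Tox1,Cpne1
```

With real files, pass `open("f1.csv", newline="")` and `open("f2.csv", newline="")` instead of the `io.StringIO` objects.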

Is there a way to extract a substring from a cell in OpenOffice Calc?

I have tens of thousands of rows of unstructured data in csv format. I need to extract certain product attributes from a long string of text. Given a set of acceptable attributes, if there is a match, I need it to fill in the cell with the match.
Example data:
"[ROOT];Earrings;Brands;Brands>JeweleryExchange;Earrings>Gender;Earrings>Gemstone;Earrings>Metal;Earrings>Occasion;Earrings>Style;Earrings>Gender>Women's;Earrings>Gemstone>Zircon;Earrings>Metal>White Gold;Earrings>Occasion>Just to say: I Love You;Earrings>Style>Drop/Dangle;Earrings>Style>Fashion;Not Visible;Gifts;Gifts>Price>$500 - $1000;Gifts>Shop>Earrings;Gifts>Occasion;Gifts>Occasion>Christmas;Gifts>Occasion>Just to say: I Love You;Gifts>For>Her"
Look up table of values:
Zircon, Diamond, Pearl, Ruby
Output:
Zircon
I tried using the VLOOKUP() function, but it needs to match an entire cell and works better for translating acronyms. Haven't really found a built in function that accomplishes what I need. The data is totally unstructured, and changes from row to row with no consistency even within variations of the same product. Does anyone have an idea how to do this?? Or how to write an OpenOffice Calc function to accomplish this? Also open to other better methods of doing this if anyone has any experience or ideas in how to approach this...
ok so I figured out how to do this on my own... I created many different columns, each with a keyword I was looking to extract as a header.
Then I used this formula to extract the keywords into the correct row beneath the column header:
=IF(ISERROR(SEARCH(CF$1,$D769)),"",CF$1)
The SEARCH function returns the numeric position of the search string, or an error if the string is not found. I use the ISERROR function to detect the error condition, and the IF statement so that an error leaves the cell blank; otherwise the cell takes the value of the header.
I had over 100 columns of specific information to extract, plus one final column where I join all the previous cells in the row together for the final list. Worked like a charm. I recommend this approach to anyone who has to do a similar task.
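If the data can be processed outside the spreadsheet, the same check is short to sketch in Python. This is an illustrative alternative rather than the poster's method: the function name `extract_attributes` is hypothetical, the sample row is a shortened version of the example data, and the substring test is case-insensitive to mimic SEARCH.

```python
def extract_attributes(text, keywords):
    """Return the keywords found in text, matching case-insensitively
    the way SEARCH does, in the order the keywords are listed."""
    lowered = text.lower()
    return [kw for kw in keywords if kw.lower() in lowered]


# Shortened version of the example row, and the lookup table of values.
row = "Earrings>Gemstone>Zircon;Earrings>Metal>White Gold;Gifts>For>Her"
gemstones = ["Zircon", "Diamond", "Pearl", "Ruby"]

print(", ".join(extract_attributes(row, gemstones)))  # Zircon
```

Returning a list replaces the 100+ helper columns and the final joining column: every match for a row is collected in one pass.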

How to Add Column (script) transform that queries another column for content

I’m looking for a simple expression that puts a ‘1’ in column E if ‘SomeContent’ is contained in column D. I’m doing this in Azure ML Workbench through their Add Column (script) function. Here are some examples they give:
row.ColumnA + row.ColumnB is the same as row["ColumnA"] + row["ColumnB"]
1 if row.ColumnA < 4 else 2
datetime.datetime.now()
float(row.ColumnA) / float(row.ColumnB - 1)
'Bad' if pd.isnull(row.ColumnA) else 'Good'
Any ideas on a 1 line script I could use for this? Thanks
Without really knowing what you want to look for in column 'D', I still think you can find all the information you need in the examples they give.
The script is being wrapped by a function that collects the value you calculate/provide and puts it in the new column. This assignment happens for each row individually. The value could be a static value, an arbitrary calculation, or it could be dependent on the values in the other columns for the specific row.
In the "Hint" section, you can see two different ways of obtaining the values from the other columns:
The current row is referenced using 'row' and then a column qualifier, for example row.colname or row['colname'].
In your case, you obtain the value for column 'D' either by row.D or row['D']
After that, all you need to do is come up with the specific logic for checking whether 'SomeContent' is contained in column 'D' for that specific row. In your case, the '1 line script' would look something like this:
1 if [logic ensuring 'SomeContent' is contained in row.D] else 0
If you need help with the logic, you need to provide more specific examples.
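As a minimal sketch, assuming column 'D' holds text and a plain substring match is what is wanted, the placeholder logic could be Python's `in` operator. The `flag` wrapper and the `SimpleNamespace` rows below are hypothetical stand-ins so the expression can be run outside Workbench; only the expression after `return` would go into the Add Column (script) box.

```python
from types import SimpleNamespace


def flag(row):
    # Assumption: str() guards against non-text values such as None,
    # which would otherwise raise a TypeError with the `in` operator.
    return 1 if 'SomeContent' in str(row.D) else 0


# Stand-in rows for illustration; in Workbench, `row` is supplied for you.
print(flag(SimpleNamespace(D='has SomeContent inside')))  # 1
print(flag(SimpleNamespace(D='nothing here')))            # 0
print(flag(SimpleNamespace(D=None)))                      # 0
```

The match is case-sensitive; lowercasing both sides would relax that if needed.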
You can read more in the Azure Machine Learning Documentation:
Sample of custom column transforms (Python)
Data Preparations Python extensions
Hope this helps

BIRT, How to get Dataset Row Count using Javascript

How can I get the dataset row count from a JavaScript function in BIRT? I tried searching for this on BIRT Exchange, but the only solution offered there is to create a new dataset that counts the values of the required dataset. This won't suit my needs.
Is there any way to obtain it using dataset events?
An easy way would be to count dataset items in a report variable.
Declare a new variable (named items in the scripts below) in your report outline.
Reset it in beforeOpen script of the dataset (in case this dataset is invoked multiple times during report execution):
vars["items"]=0;
Increment the variable in onFetch script of the dataset:
vars["items"]++;
Use your variable in any expression. For example, add a dynamic text element in the report's body, such as:
"Items count="+vars["items"]
Important 1: This approach works if and only if the dataset is bound to at least one report element (a table, chart, data element, etc.). For example, it won't work if the dataset is only invoked to fill a list of selection choices of a report parameter.
Important 2: In the body of the report, this variable can only be used after the first report element using the relevant dataset, otherwise it won't be initialized
Dominique has a great answer; it is not clear to me from your question whether this simpler solution might also meet your needs.
In your data set use a computed column with a value of '1', then sum the values.
You can write JS that only adds the value if specific criteria are met.
Or you can use an aggregation on your report to sum the values, which would be after any filters or groups are placed.
There is a simpler way if you use the row number in your table's footer:
In the 'Dynamic Text' element you can select:
Available Column Bindings > Table > RowNum
Add 1 as the index starts with 0.
You could also make a variable and then add to that for each row created in a table.
For example, in the Table-Detail script, set an onCreate event to check for a value in each row and if there is one to increase the row count. The following onCreate script would check if the row is empty. If the row is not empty, the script increases the counter and goes to the next row.
var checker = this.getRowData().getExpressionCount();
if( checker > 0 ) vars["Counter"]++;
Then you could add dynamic text after the table with the following expression:
"Row count="+vars["Counter"]

Resources