How to add row in jupyter notebook - jupyter-notebook

code:
df_t["My address"]=["My address",7,"My city","My State","My Country","My Shop",12,"My Continent"]
ValueError: Length of values (8) does not match length of index (7)

Your code tries to add a new column, not a row. If you want to actually add a row to your DataFrame you have several options, one of many being:
df_t.loc[len(df_t)] = ["My address",7,"My city","My State","My Country","My Shop",12,"My Continent"]
However, please notice that growing a DataFrane by adding new rows isn't a best practice (see this very explanatory reference).
Next time please add your sample dataframe directly in the question and not in the comments (you can also edit your question)

Related

Align Text To Row Identified by Number and to ID Matching that Embedded in String

I need to align text in an ALERT STRING column with the row identified by number in an ID ROW column.
Additionally, I need to also align the same ALERT STRING text with the same ID ROW number AND with the ID matching that embedded in a string in the TEXT WITH ID column. (This double-check will sometimes be necessary with the real-world data.)
So far, I've only figured out how to align the ALERT STRING with the ID matching that embedded in the TEXT WITH ID column:
=LOOKUP(2,1/SEARCH(A2,$F$2:$F$11),$G$2:$G$11)
I appreciate any help folks can offer. You can find an editable copy of the workbook here:
https://1drv.ms/x/s!ArQ7Kw6ayNMY2zktTW3pDCbMmJZ_
UPDATE: Nayan provided a solution to the first part of this question (please see answer below). I'm still trying to work out a formula for the column D part of this question, in which the row reference shown in column E is combined with a match of the ID shown in column A with its corresponding value in one of the text strings in column F.
The best I've been able to come up with so far is a formula with a high failure rate:
=INDEX($G$2:$G$11,MATCH(ROW(D2),$E$2:$E$11,MATCH("*"&A2&"*",$F$2:$F$11,0)))
Any help with this part of the question will be greatly appreciated.
ROW([reference])
Returns the row number of a reference
E.g.: Row(B2) returns 2. If nothing provided like ROW() will also
return row number based on position of cell where it is called.
VLOOKUP(loolup_value, table_array, col_index_num, [range_lookup])
Looks for a value in the leftmost column of a table, and then returns a value in the same row from a column you specify (col_index_num)
By default - the table must be sorted in an ascending order.
Try this:
=VLOOKUP(ROW(B2),$E$2:$G$11,3,FALSE)
INDEX(array, row_num, [column_num]) INDEX(reference, row_num,
[column_num], [area_num])
Returns a value or reference of the cell at the intersection of a particular row and column, in a given range.
In this case, you have to get row_num with MATCH function.
MATCH(lookup_value, lookup_array, [match_type])
Returns a relative position of an item in an array that matches a specified value in a specified order.
match_type: 1 (Less than), 0 (Exact match), -1 (Greater than)
Try this:
=INDEX($G$2:$G$11,MATCH(ROW(B2),$E$2:$E$11,0))
Identity Data with Multiple Criteria Condition using MATCH()
=INDEX($G$2:$G$11,MATCH(1, (ROW(D2) = $E$2:$E$11) * (ISNUMBER(SEARCH(A2, $F$2:$F$11))),0))
References:
https://exceljet.net/excel-functions/excel-vlookup-function
https://exceljet.net/excel-functions/excel-index-function
https://exceljet.net/formula/index-and-match-with-multiple-criteria
This is the formula I was looking for in column D:
=INDEX($G$2:$G$11,MATCH(ROW(D2)&"*"&A2&"*",INDEX($E$2:$E$11&$F$2:$F$11,),0))
You can see it working here.
Nayan provided a great deal of help with answering this question, so I will mark his answer as the accepted solution.
Syeda Fahima Nazreen provided the example I referenced to figure out the formula shown above.
Reference:
Nested Excel Formula with Two INDEX Functions and a MATCH Function with Multiple Criteria

find common rows between two dataframes based on two columns using bash

I found this very difficult to solve in bash - I have two files that I want to find the common rows between them based on two columns.
f1.csv:
col1,col2,col3,col4
Dalir,Cpne1,down,2174
Fendrr,Aco2,up,280
Cpne1,Tox1,down,8900
f2.csv
col1,col2,col3,col4,col5,col6
Linc,Rmo,ch2,ch2,p,l
Tox1,Cpne1,ch1,ch2,l,p
so basically the code should look only at the first two columns of the dfs and see if pairs are the same (the order of the pairs is not important). So you can see that in the first df there is
Cpne1,Tox1 in the third row and in the second df there is Tox1,Cpne1 in the second row - so this should be printed in the output from the second file.
Desired output:
Tox1,Cpne1
Unfortunately, I have not been able to develop a bash command for this - it would be great if you could help me with this. Thanks
Just adding the explanation to oguz' fine answer in the comments above:
BEGIN{FS=OFS=","} defines , to be the separator for both input and output.
NR==FNR{pair[$1,$2];next} while the record number of the entire input matches the current file's record number (in other words, for the first file) add an element with the first and second field as index to the array pair.
($1,$2) in pair||($2,$1) in pair{print $1,$2} operating on the second file, check if field one and two in any order are present as index in the array pair, and print them if they are.

Ignore first row in R

How do I do in R, to ignore the first row of a data set and a second turn the column names?
I am currently reading a file that sometimes has garbage on the first or second line, and I am looking for a way to resolve this.

How to Add Column (script) transform that queries another column for content

I’m looking for a simple expression that puts a ‘1’ in column E if ‘SomeContent’ is contained in column D. I’m doing this in Azure ML Workbench through their Add Column (script) function. Here’s some examples they give.
row.ColumnA + row.ColumnB is the same as row["ColumnA"] + row["ColumnB"]
1 if row.ColumnA < 4 else 2
datetime.datetime.now()
float(row.ColumnA) / float(row.ColumnB - 1)
'Bad' if pd.isnull(row.ColumnA) else 'Good'
Any ideas on a 1 line script I could use for this? Thanks
Without really knowing what you want to look for in column 'D', I still think you can find all the information you need in the examples they give.
The script is being wrapped by a function that collects the value you calculate/provide and puts it in the new column. This assignment happens for each row individually. The value could be a static value, an arbitrary calculation, or it could be dependent on the values in the other columns for the specific row.
In the "Hint" section, you can see two different ways of obtaining the values from the other rows:
The current row is referenced using 'row' and then a column qualifier, for example row.colname or row['colname'].
In your case, you obtain the value for column 'D' either by row.D or row['D']
After that, all you need to do is come up with the specific logic for ensuring if 'SomeContent' is contained in column 'D' for that specific row. In your case, the '1 line script' would look something like this:
1 if [logic ensuring 'SomeContent' is contained in row.D] else 0
If you need help with the logic, you need to provide more specific examples.
You can read more in the Azure Machine Learning Documentation:
Sample of custom column transforms (Python)
Data Preparations Python extensions
Hope this helps

command to remove row from a data frame [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
How to delete a row in R
I can't figure out how to simply remove row (n) from a dataframe in R.
R's documentation and intro manual are so horribly written, they are virtually zero help on this very simple problem.
Also, every explanation i've found here/ on google is for removing rows that contain strings, or duplicates, etc, which have been excessively advanced for my problem and lead me to introduce more bugs and get nowhere. I just want to remove a row.
Thanks in advance for your help.
fyi the list is in the variable eld, which has 5 columns and 33 rows. I would like to remove row 14. I initialized eld with the following command
eld <- read.table("election2012.txt")
so my desired result is
eldNew <- eld(minus row 14)
eldNew <- eld[-14,]
See ?"[" for a start ...
For ‘[’-indexing only: ‘i’, ‘j’, ‘...’ can be logical
vectors, indicating elements/slices to select. Such vectors
are recycled if necessary to match the corresponding extent.
‘i’, ‘j’, ‘...’ can also be negative integers, indicating
elements/slices to leave out of the selection.
(emphasis added)
edit: looking around I notice
How to delete the first row of a dataframe in R? , which has the answer ... seems like the title should have popped to your attention if you were looking for answers on SO?
edit 2: I also found How do I delete rows in a data frame? , searching SO for delete row data frame ...
Also http://rwiki.sciviews.org/doku.php?id=tips:data-frames:remove_rows_data_frame

Resources