In Robot Framework, I'm trying to get values from excel sheet using excel library. I've observed below issue for the "Get Row values" keyword
Open Excel E:/Robot/excel/Test.xls
#{Datas}= Get Row Values Sheet1 0 includeEmptyCells=False
Log to console ${Datas}
If the datas in the column are from A1 to Z1, the values are updated in the ${Datas} properly, I'm using these values ${Datas} again to convert it into List and get the exact value from the sheet.
But now when it exceeds Z1, i.e A1,B1,...,Z1,AA1,AB1,AC1.. ${Datas} is not generated as same in the sheet. Its stored as something as below
[('A1', u'Valli'), ('AA1', u'sdffgdg'), ('AB1', u'k'), ('AC1', u'h'),('AD1', u'jk'), ('AE1', u'j'), ('AF1', u";\\'"), ('AG1', u'jk'), ('AH1', u'j'), ('AI1', u'Kasthurikala')........]
I want the values as same as in the excel rather than alphabetical order [('A1','Value1'),('B1','value1'), ('Z1','ValueZ'),('AA1','ValueAA1')
Is there any solution?
You can sort by the length of the second element and breaking ties with the first elemnt:
a = [('B1','value1'), ('Z1','ValueZ'),('AA1','ValueAA1'),('A1','Value1')]
a.sort(key=lambda x: (len(x[1]),x[0]))
print(a)
[('A1', 'Value1'), ('B1', 'value1'), ('Z1', 'ValueZ'), ('AA1', 'ValueAA1')]
Related
I have got this .txt file outputed by a microscope to process.
#read the .txt file generated by microscope, skipping the first 9 lines of garbage information
df <- read.csv("Objects_Population - AllCells.txt", sep="\t", skip = 9,header=TRUE, fill = T)
Then I started looking at the structure of the dataframe, everything seems fine except I now found an extra column in the end of the data frame named "x.1" and all rows of it are NA values. I don't see this column when I open the .txt file in excel. I suspect the problem has something to do with the column names generated by microscope, they contain quite some special characters
Below is the dataframe read by Excel(only showing the last 2 columns since I have 132 columns, and their names are disgustingly long):
AllCells - Cell Contact Area with Neighbors [%] AllCells - Nucleus Nearest Neighbor Distance [µm]
0 4.82083
21.9512 0
15.7895 0
29.4118 0.584611
0 4.21569
0 1.99599
0 3.50767
...
This has happened to me before but I never took it too serious as I was always interested in a subset of my data frame. Now I'm looking at all columns then this starts to bothering me.
Is there any way I can read them correctly without R attaching that additional "X.1" column in the end? Preferably not manually delete or subset out the last column...
Cheers,
ML
If all other column names are correct, you have probably a trailing \t in the text file. R tries to include it and gives it the generic column name X.1.
You could try and read the file first as 'plain text' and remove the trailing \t and only then use read.csv:
file_connection <- file("Objects_Population - AllCells.txt")
content <- readLines(file_connection )
close(file_connection)
Now we try to get rid of these trailing \t (this might need some testing to fit your needs)
sanitized <- gsub("\\t$", "", content)
And then we read this sanitized string as if it was a file (using the argument text)
df <- read.csv(text=paste0(sanitized, collapse="\n"), sep="\t", skip = 9,header=TRUE, fill = T)
Had that problem too. Fixed it by saving the file as "CSV (MS-DOS (*csv)" instead of what I originally had as "CSV (Comma delimited)(*csv)".
This is almost certainly because you've got an extra empty column in your spreadsheet.
In Excel, open your sheet and press Ctrl-End. If you end up in an empty cell outside the range of your data, there's the problem. Select the column (Ctrl-Space), right-click, and choose Delete.
I also encountered similar problem. I found that three extra columns were created (X, X.1, X.2), after I loaded dataset from excel sheet to R studio.
Steps Followed by me:
a) I went to the excel sheet and selected those three extra columns after last column with actual values in excel sheet. Selected extra 3 columns by keeping cursor on top of columns and then right click the mouse and select delete.
b) Again loaded that excel sheet in R. I did not find those 3 columns.
this is the excel:
enter image description here
I want to get the first column's cell count, I use excellibrary, but the Get Row Count method doesn't suite my case, it get the max row in excel,any solution?
my code as below(i use ride):
Open Excel | D:\\try.xls
${a} | Get Column Count | Sheet1
the result is 4, but i want to get 3
The excellibrary uses internally a python lib for working with Excel files - xlrd; the keyword Get Column Count is actually returning the property ncols of a Sheet object in it, the documentation of which says:
Nominal number of columns in sheet. It is one more than the maximum column index found, ignoring trailing empty cells.
So it will always return one more than the actual used - just subtract the extra one.
I have csv file contains iphone device roadmap like version number, name of model, release of model , price etc. I have done following:
I have imported data set in Rstudio in variable name iphonedetail by following command. iphonedetail <-read.csv("iphodedata.csv")
Than i hv changed the attribute "name of model" to character by using following: iphonedetail$nameofmodel <- as.character(iphonedetail$nameofmodel)
Now i need to access 1st 5 name of model and store them in vector .
I tried this to achieve : iphonesubset <- data.frame(iphonedetail$nameofmodel)
Then on console i typed iphonesubset, but gave 0 col and row.
Could someone help in above 2 steps correct or not ? and also suggest how to fix 3rd step?
if you want to extract the first five (non unique):
iphonedf1to5 <- df[1:5,]
That means that you get the first 5 rows and all columns. Then if you want to get the unique first five elements it should be like:
iphonedf1to5 <- unique(df[1:5,])
Edit:
df means your data frame of the read csv, iphonedetail in your case.
I have a folder with tons of txt files from where I have to extract especific data. The problem is that the format of the file has changed once and the position of the data I need to extract has also changed. So I need to deal with files in different format.
To try to make it more clear, in column 4 I have the name of the variable and in 5 I have the value, but sometimes this is in a different row. Is there a way to find the name of the variable (in which row) and then extract its value?
Thanks in advance
EDITING
In some files I will have the data like this:
Column 1-------Column 2.
Device ID------A.
Voltage------- 500.
Current--------28
But in some point in life, there was a change in the software to add another variable and the new file iis like this:
Column 1-------Column 2.
Device ID------A.
Voltage------- 500.
Error------------5.
Current--------28
So I need to deal with these 2 types of data, extracting the same variables which are in different rows.
If these files can't be read with read.table use readLines and then find those lines that start with the keyword you need.
For example:
Sample file 1 (with the dashes included and extra line breaks):
Column 1-------Column 2.
Device ID------A.
Voltage------- 500.
Error------------5.
Current--------28
Sample file2 (with a comma as separator):
Column 1,Column 2.
Device ID,A.
Current,555
Voltage, 500.
Error,5.
For both cases do:
text = readLines(con = file("your filename here"))
curr = text[grepl("^Current", text, ignore.case = T)]
Which returns:
for file 1:
[1] "Current--------28"
for file 2:
[1] "Current,555"
Then use gsub to remove anything that is not a number.
In PhpExcel library when i am assigning values to IW4 the assigned value not generatted there
Steps:
We are using The code to generate the Value to cell in PHPExcel
**$objPHPExcel->getActiveSheet()->setCellValue('A1', 'cell value here');**
When i am using it to generate value to IW4 cell the value not getting generatted
**$objPHPExcel->getActiveSheet()->setCellValue('IW4', 'cell value here');**
Please Help me to find the solution
BIFF format Excel files only allow 256 columns (up to IV), OfficeOpenXML allows more.
If you set a value in a column beyond the limit, PHPExcel only knows it's invalid at the point where you save (when it knows whether you're saving as an Excel5 or Excel2007 file), Rather than trigger an exception at that point (which would be much more frustrating if it was a long running script), it silently discards the invalid columns or rows.
This is similar behaviour to Excel itself, if you open an xlsx file in an earlier version of Excel that doesn't support as many rows and columns.