pygsheets worksheet size without empty values - pygsheets

How to get row/col number of the last (bottom-right) non-empty cell in worksheet?
The worksheet's rows and cols attributes also count empty cells.

There is no direct function to achieve this in pygsheets. But you can figure this out after getting all the values using get_all_values() and by excluding the empty values.
cells = wks.get_all_values(include_empty_rows=False, include_tailing_empty=False, returnas='cells')
bottom_right = cells[-1][-1]
# get the row and column as bottom_right.row, bottom_right.col
NB: please use the staging version of pygsheets to run this
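If you would rather not depend on the staging-only flags, the same idea can be sketched in plain Python on the values matrix. This is only a sketch: it assumes the values arrive as a list of rows of strings with empty cells as '' (what get_all_values() returns in matrix form, as far as I know), and last_nonempty_cell and sample are hypothetical names, not part of pygsheets.

```python
def last_nonempty_cell(matrix):
    """Return the 1-based (row, col) of the bottom-right corner of the
    smallest rectangle starting at A1 that holds every non-empty value.
    Returns (0, 0) when the matrix has no non-empty value."""
    last_row = 0
    last_col = 0
    for r, row in enumerate(matrix, start=1):
        for c, value in enumerate(row, start=1):
            if value != "":
                # Track the furthest row and column seen so far.
                last_row = max(last_row, r)
                last_col = max(last_col, c)
    return last_row, last_col


# Hypothetical sample standing in for the result of wks.get_all_values()
sample = [
    ["a", "", "c", ""],
    ["", "b", "", ""],
    ["", "", "", ""],
]
print(last_nonempty_cell(sample))  # (2, 3)
```

Note that this takes the furthest non-empty row and the furthest non-empty column independently, so the cell it points at may itself be empty (as in the sample, where row 2, column 3 is blank).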

Related

Amend an R data.frame cell based on whether another cell within the row is contained in another object

I have a data.frame - UnknownSamples - with a column $Identity, and another column $ID.
I want to change the values of $Identity to "Parent" if the $ID in a particular cell is in a reference data.frame (but it can be a list) - ParentSamples.
This is my attempt:
lapply(UnknownSamples, function(x) if_else(UnknownSamples$ID[x] %in% ParentSamples$ID, UnknownSamples$Identity[x] <- "Parent", UnknownSamples$Identity[x] <- "Unknown" ))
UnknownSamples has multiple entries of most ID values, but ParentSamples only has one instance of each value. I do not know why this command is throwing an error, however, since there shouldn't be a reason why a cell in ParentSamples can't be referenced twice. The error:
Error: Assigned data must be compatible with existing data. x Existing data has 1271 rows. x Assigned data has 2542 rows. ℹ Only vectors of size 1 are recycled.
I am probably returning the incorrect thing in the function but I am not sure how to address this issue.
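lapply() over a data.frame iterates over its columns, not its rows, so x is a whole column rather than a row index. The solution is to have if_else() return the value you want and then assign the result to the column in one vectorized step:
UnknownSamples$Identity <- if_else(UnknownSamples$ID %in% ParentSamples$ID, "Parent", "Unknown")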

Trouble with setting conditionals to parse through data.frame

Seems like we have some extra artifacts that appear as the dataset changes firms. Write a piece of code that checks where the tickers change and deletes all artifacts from those points.
for (y in 1:nrow(longitudinal)){
if (longitudinal[y,2] != longitudinal[y-1,2])
{longitudinal[y,] = NA }}
Hey guys, I am trying to remove values from a column in a dataset according to a change in column 2, the name value. Unfortunately I am getting the error:
Error in Ops.data.frame(longitudinal[y, 2], longitudinal[y - 1, 2]) :
‘!=’ only defined for equally-sized data frames
I cannot think of a different way to compare the elements in the name column in order to set the condition for the NA's to correspond to the change in the name. Would appreciate any help thinking through this.
The for loop has to start one row down, i.e.
y in 2:nrow(longitudinal)
because with y = 1 the condition compares row 1 against row y - 1 = 0, which does not exist.

Dynamically assign variable names for vectors in R?

I'm new to R and I am trying to create variables referencing vectors within a for loop, where the index of the loop will be appended to the variable name. However, the following code below, where I'm trying to insert the new vectors into the appropriate place in the larger data frame, is not working and I've tried many variations of get(), as.vector(), eval() etc. in the data frame construction function.
I want num_incorrect.8 and num_incorrect.9 to be vectors with a value of 0 and then be inserted into mytable.
cols_to_update <- c(8,9)
for (i in cols_to_update)
{
#column name of insertion point
insertion_point <- paste("num_correct",".",i,sep="")
#create the num_incorrect col -- as a vector of 0s
assign(paste("num_incorrect",".",i,sep=""), c(0))
#index of insertion point
thespot <- which(names(mytable)==insertion_point)
#insert the num_incorrect vector and rebuild mytable
mytable <- data.frame(mytable[1:thespot], as.vector(paste("num_incorrect",".",i,sep="")), mytable[(thespot+1):ncol(mytable)])
#update values
mytable[paste("num_incorrect",".",i,sep="")] <- mytable[paste("num_tries",".",i,sep="")] - mytable[paste("num_correct",".",i,sep="")]
}
When I look at how the column insertion went, it looks like this:
[626] "num_correct.8"
[627] "as.vector.paste..num_incorrect........i..sep........2"
...
[734] "num_correct.9"
[735] "as.vector.paste..num_incorrect........i..sep........3"
Basically, it looks like it's taking my commands as literal text. The last line of code works as expected and creates new columns at the end of the data frame (since the line before it didn't insert the column into the proper place):
[1224] "num_incorrect.8"
[1225] "num_incorrect.9"
I am kind of out of ideas, so if someone could please give me an explanation of what's wrong and why, and how to fix it, I would appreciate it. Thanks!
The mistake is in the line that rebuilds mytable. as.vector(paste(...)) produces a character string, not the variable with that name, so data.frame() adds a column of text and names it after the expression itself.
You just need to add a numeric vector and then update its name. You can remove the assign() call, since the variable it creates is never used by the data frame.
Replace the line that rebuilds mytable with the code below and it should work.
#insert the vector at the desired location
mytable <- data.frame(mytable[1:thespot], newCol = vector(mode='numeric',length = nrow(mytable)), mytable[(thespot+1):ncol(mytable)])
#update the name of new location
names(mytable)[thespot + 1] = paste("num_incorrect",".",i,sep="")

Boxplot in Octave

I am trying to create a boxplot, using boxplot(data) for this sample data
1,0.3074855004
1,0.5342907151
1,0.1243014226
1,0.8373050862
1,0.2964970712
2,0.2753391378
2,0.0662903741
2,0.7435585174
2,0.141665858
2,0.8710871406
3,0.683215396
3,0.9968826184
3,0.8009274979
3,0.6164554236
3,0.9880523647
4,0.6854059871
4,0.4828904583
4,0.6001796951
4,0.3790802876
4,0.5728325425
I expect to get a graph with four columns but the output currently only shows two columns. Here is the output
I have tried following the documentation here
http://octave.sourceforge.net/statistics/function/boxplot.html
but I'm still having trouble getting desired results.
Please help me with the correct syntax for getting a proper boxplot in Octave.
Thanks,
Your expectations are wrong. Why would boxplot() assume that the first column is the group number? The documentation for boxplot() says:
DATA is a matrix with one column for each data set, or data is a cell vector with one cell for each data set.
Your data is not any of the above.
Also, why are you even wasting memory by setting it up like that? Why do you have a column just to store the group number? Since each group seems to have the same number of values, you can reshape your second column into a matrix with one column per group:
octave> reshape (data(:,2), 5, 4)
ans =
0.307486 0.275339 0.683215 0.685406
0.534291 0.066290 0.996883 0.482890
0.124301 0.743559 0.800927 0.600180
0.837305 0.141666 0.616455 0.379080
0.296497 0.871087 0.988052 0.572833
Or, if each group has a different number of values, use a cell array:
octave> accumarray (data(:,1), data(:,2), [], @(x) {x})
ans =
{
[1,1] =
0.30749
0.53429
0.12430
0.83731
0.29650
[2,1] =
0.275339
0.066290
0.743559
0.141666
0.871087
[3,1] =
0.68322
0.99688
0.80093
0.61646
0.98805
[4,1] =
0.68541
0.48289
0.60018
0.37908
0.57283
}
Once your data is in a sensible format, boxplot() will work as you expected.

Plotting uneven row sizes in R

I have data in tab delimited rows of uneven length and I want to make a histogram for each row:
1    23    352    4    12    94    0    2
434    13    29
5    93    93    34
(...more rows)
This is what I currently have (no fanciness included):
data = read.delim("file.txt", header = F, sep = "\t")
for (j in 1:nrow(data)) { #loop over each row
hist(data[j,])
}
But when I try to make the histogram, I think it tries to include the NA's in the row of the data frame, since R gives me the error message: "Error in hist.default(data[2, ]) : 'x' must be numeric".
When I try to use:
read.scan("file.txt", sep="\t")
I'm left with something I don't know how to separate by rows. Do I have a better option than splitting the file into one row per file and then reading in each row separately? (I am running into the same problem with uneven column size...)
The error results from the fact that grabbing a row from a data.frame yields an object of class data.frame (and hist() wants class numeric). Just convert it to numeric:
hist(as.numeric(data[j,]))
