I trained a BERT-based encoder-decoder model (EncoderDecoderModel) named ed_model with HuggingFace's transformers library.
I used a BertTokenizer named input_tokenizer.
I tokenized the input with:
txt = "Some wonderful sentence to encode"
inputs = input_tokenizer(txt, return_tensors="pt").to(device)
print(inputs)
The output clearly shows that input_ids is in the returned dict:
{'input_ids': tensor([[ 101, 5660, 7975, 2127, 2053, 2936, 5061, 102]], device='cuda:0'), 'token_type_ids': tensor([[0, 0, 0, 0, 0, 0, 0, 0]], device='cuda:0'), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1]], device='cuda:0')}
But when I try to predict, I get this error:
ed_model.forward(**inputs)
ValueError: You have to specify either input_ids or inputs_embeds
Any ideas?
Well, apparently this is a known issue; see, for example, this issue for T5.
The likely cause is that arguments are routed by name inside the model: an encoder-decoder architecture has two kinds of input ids, one for the encoder (input_ids) and one for the decoder (decoder_input_ids).
The solution is to specify the decoder's input ids explicitly:
ed_model.forward(decoder_input_ids=inputs['input_ids'],**inputs)
I wish it was documented somewhere, but now you know :-)
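To see why the error mentions input_ids even though input_ids was passed, here is a toy stand-in in plain Python. It is not the transformers source: ToyEncoder, ToyDecoder and ToyEncoderDecoder are hypothetical names, and the way the wrapper hands decoder_input_ids to the decoder is an assumption made for illustration.

```python
class ToyEncoder:
    def forward(self, input_ids=None):
        if input_ids is None:
            raise ValueError("encoder: input_ids missing")
        # Fake "hidden states": just double every token id.
        return [tok * 2 for tok in input_ids]


class ToyDecoder:
    def forward(self, input_ids=None, inputs_embeds=None, encoder_hidden_states=None):
        # The decoder has its own input_ids parameter; this check is what fires.
        if input_ids is None and inputs_embeds is None:
            raise ValueError("You have to specify either input_ids or inputs_embeds")
        return list(zip(input_ids, encoder_hidden_states))


class ToyEncoderDecoder:
    """Hypothetical wrapper: input_ids go to the encoder, decoder_input_ids to the decoder."""

    def __init__(self):
        self.encoder = ToyEncoder()
        self.decoder = ToyDecoder()

    def forward(self, input_ids=None, decoder_input_ids=None, **kwargs):
        hidden = self.encoder.forward(input_ids=input_ids)
        # The decoder only ever sees decoder_input_ids, never the encoder's input_ids.
        return self.decoder.forward(input_ids=decoder_input_ids,
                                    encoder_hidden_states=hidden)


def try_forward(model, **kwargs):
    """Return the model output, or the error message if a ValueError is raised."""
    try:
        return model.forward(**kwargs)
    except ValueError as err:
        return str(err)


inputs = {"input_ids": [101, 5660, 102], "attention_mask": [1, 1, 1]}
model = ToyEncoderDecoder()

without_decoder_ids = try_forward(model, **inputs)
with_decoder_ids = try_forward(model, decoder_input_ids=inputs["input_ids"], **inputs)

print(without_decoder_ids)
print(with_decoder_ids)
```

In this toy the first call fails inside the decoder, not the encoder, which is why the message looks misleading from the outside.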
I am new to JavaFX and ControlsFX.
I am trying to create a very basic SpreadsheetView using the ControlsFX library. The following function creates and populates the SpreadsheetView:
private fun spreadSheetFunc() : SpreadsheetView {
    val rowCount = 15
    val columnCount = 10
    val grid = GridBase(rowCount, columnCount)
    val rows = FXCollections.observableArrayList<ObservableList<SpreadsheetCell>>()

    var list = FXCollections.observableArrayList<SpreadsheetCell>()
    list.add(SpreadsheetCellType.STRING.createCell(0, 0, 1, 1, "row0-col0"))
    list.add(SpreadsheetCellType.STRING.createCell(0, 1, 2, 1, "row0-col1"))
    list.add(SpreadsheetCellType.STRING.createCell(0, 2, 1, 1, "row0-col2"))
    rows.add(list)

    list = FXCollections.observableArrayList()
    list.add(SpreadsheetCellType.STRING.createCell(1, 0, 1, 1, "row1-col0"))
    //commenting row1-col1 as row0-col1 has a rowspan of 2
    //list.add(SpreadsheetCellType.STRING.createCell(1, 1, 1, 1, "row1-col1"))
    list.add(SpreadsheetCellType.STRING.createCell(1, 2, 1, 1, "row1-col2"))
    rows.add(list)

    list = FXCollections.observableArrayList()
    list.add(SpreadsheetCellType.STRING.createCell(2, 0, 1, 1, "row2-col0"))
    list.add(SpreadsheetCellType.STRING.createCell(2, 1, 1, 1, "row2-col1"))
    list.add(SpreadsheetCellType.STRING.createCell(2, 2, 1, 1, "row2-col2"))
    rows.add(list)

    list = FXCollections.observableArrayList()
    list.add(SpreadsheetCellType.STRING.createCell(3, 0, 1, 1, "row3-col0"))
    list.add(SpreadsheetCellType.STRING.createCell(3, 1, 1, 1, "row3-col1"))
    list.add(SpreadsheetCellType.STRING.createCell(3, 2, 1, 1, "row3-col2"))
    rows.add(list)

    grid.setRows(rows)
    return SpreadsheetView(grid)
}
On running it, I get the following error:
java.lang.IndexOutOfBoundsException: Index: 2, Size: 2 at
java.util.ArrayList.rangeCheck(ArrayList.java:653)
I know it's happening because I am not adding a cell for rowIndex=1, colIndex=1 (see the commented-out line), but that is what I want.
The row0-col1 cell has a rowspan of 2, which should mean that there is no problem even if row1-col1 is absent.
Why doesn't ControlsFX automatically take care of this?
If I uncomment that line, I get the following output:
Edit 1:
Also, I found another issue: when a colspan/rowspan occupies the whole column/row in the SpreadsheetView and you press an arrow key to navigate between cells, you get an error:
This happens when you press the right arrow key (even though there isn't a cell on the right).
Let me apologize: how spans must be created in the SpreadsheetView is not well documented. I will update the documentation.
If you want a span, you have to put the same cell in every position the span covers.
So either you build the cell yourself and add that same instance at every position; in your case, you would add the same cell at row 0, column 1 and at row 1, column 1.
Or you can keep your code and simply call the spanRow method on the Grid, which automatically takes your cell and places it accordingly.
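The rule can be pictured with a small language-neutral model, written here in Python rather than Kotlin. This is only a sketch of the data layout, not ControlsFX code: Cell, build_rows and span_row are hypothetical names, and span_row merely imitates the behaviour described above, placing one cell object at every position it covers.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    row: int
    col: int
    row_span: int
    text: str


def build_rows(n_rows, n_cols):
    """Every row holds exactly n_cols cells; no position is left empty."""
    return [[Cell(r, c, 1, f"row{r}-col{c}") for c in range(n_cols)]
            for r in range(n_rows)]


def span_row(rows, count, row, col):
    """Hypothetical helper: let the cell at (row, col) cover `count` rows."""
    cell = rows[row][col]
    cell.row_span = count
    for r in range(row, row + count):
        # The same object is stored at each covered position.
        rows[r][col] = cell


rows = build_rows(4, 3)
span_row(rows, 2, 0, 1)

print(rows[1][1].text)            # text of the spanning cell
print(rows[1][1] is rows[0][1])   # both positions hold the same object
```

Each row keeps its full length, which is the property the original code loses when the row1-col1 cell is simply left out.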
Regarding the second issue, please submit it to our issue tracker so we can fix it: https://bitbucket.org/controlsfx/controlsfx/issues?status=new&status=open
If you have other issues with the SpreadsheetView, consider posting in our Google group, where we get notifications: http://groups.controlsfx.org
I have a text file containing lines like this:
[datetime.datetime(2013, 1, 4, 9, 35, 0, 4996), datetime.datetime(2013, 1, 4, 9, 40, 0, 4998),datetime.datetime(2013, 1, 4, 9, 45, 0, 5000)]
How can I load the data and convert it to a list like this?
[2013-01-04 09:35:00.004996,
2013-01-04 09:40:00.004998,
2013-01-04 09:45:00.005000]
for line in dataFile.readlines():
    print(type(line))
I get
<class 'str'>
How can I do this?
Thank you in advance.
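You will always get strings from a text file, but you can convert the strings to datetime objects:
from datetime import datetime
fmt = '%Y-%m-%d %H:%M:%S.%f'
# datetime.strptime returns a datetime object and keeps the microseconds
# (time.strptime would return a struct_time without them)
dt = datetime.strptime('2013-01-04 09:35:00.004996', fmt)
print(dt)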
Or maybe I misunderstood and your file really contains a string that looks like a list (please clarify); then you could try:
from ast import literal_eval
import datetime
import re
strg = '[datetime.datetime(2013, 1, 4, 9, 35, 0, 4996), datetime.datetime(2013, 1, 4, 9, 40, 0, 4998),datetime.datetime(2013, 1, 4, 9, 45, 0, 5000)]'
dates = []
# find every "datetime.datetime(...)" call in the string
match = re.findall(r'datetime\.datetime\([0-9 ,]+\)', strg)
for date_str in match:
    # strip the constructor name so only the argument tuple remains
    args = literal_eval(date_str.replace('datetime.datetime', ''))
    dates.append(datetime.datetime(*args))
print(dates)
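Your text file was not dumped properly for reading as a JSON file.
That's OK; you can solve your problem as follows:
import datetime
output = []
# I am assuming that you have already defined dataFile
for line in dataFile.readlines():
    # eval executes arbitrary code, so only use it on files you trust
    output.append(eval(line))
print(output)
A better approach is to serialize the data with json.dumps(object) before writing it to the text file; then it is easy to get your object back with json.load(). Note that json cannot serialize datetime objects directly, so convert them to strings (for example with isoformat()) first.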
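A minimal sketch of such a round trip, assuming the datetimes are stored as ISO-format strings (json cannot serialize datetime objects directly). The variable names are illustrative, and the sample values are taken from the question.

```python
import json
from datetime import datetime

dates = [
    datetime(2013, 1, 4, 9, 35, 0, 4996),
    datetime(2013, 1, 4, 9, 40, 0, 4998),
    datetime(2013, 1, 4, 9, 45, 0, 5000),
]

# Writing: turn each datetime into a string, then dump the list as JSON.
encoded = json.dumps([d.isoformat(sep=" ") for d in dates])

# Reading: parse the JSON, then turn each string back into a datetime.
decoded = [datetime.fromisoformat(s) for s in json.loads(encoded)]

print(encoded)
print(decoded == dates)
```

datetime.fromisoformat requires Python 3.7 or later; on older versions, datetime.strptime with an explicit format string does the same job.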
import datetime
import re
dateList = []
for line in dataFile.readlines():
    match = re.findall(r'datetime\.datetime\([0-9 ,]+\)', line)
    for date_str in match:
        # I can get this
        print(eval(date_str))
        # Translate
        dates = date_str.replace('datetime.datetime', '')
        dateList.append(dates)
# get this
print(dateList)
Thanks, all.
I created the following function to determine the lag of two variables.
However, this function takes only two parameters, and I would like to run it over my whole dataset:
datSel <- structure(list(stat.resProp.Dwell.4 = c(0.000887705, 0.007954085,
-0.025859667, 0.024097552, 0.114052787, 0.023329207, 0.042143181,
-0.092587287, -0.004050228, -0.001624696, 0.020121403, -0.100502922,
0.057354185, 0.025463388, 0.037409854, 0.001561281, -0.028482938,
-0.004827041, 0.014411779, -0.029034298, 0.021053409, -0.067963182,
0.032070259, -0.038091783, 0.039751534, 0.027802281, -0.027802281,
-0.013355791, 0.009201236, -0.073403679, 0.021277398, -0.033901552,
0.012624153, -0.065733979, 0.032017801, -0.072042665, 0.041936911,
0.002861232, 0.017933468, -0.01698154, 0.006638242, -0.08375153,
-0.007220248, 0.0255507, 0.019980685, 0.013752673, 0.026000502,
-0.021134312, -0.019608471, 0.0166916, -0.021654389, 0.066402455,
0.024828862, -0.083302632, 0.042518482, -0.052439198, 0.037186281,
-0.056311172, -0.012270093), stat.lohn = c(0, -0.007558004, -0.015289567,
0, 0, -0.009609384, -0.019500305, 0, 0, -0.012458015, -0.025391532,
-0.000983501, 0, -0.00165265, -0.003313516, 0.000204576, 0, -0.004898564,
-0.009869709, 0, 0, -0.010574012, -0.021489482, 0, 0, -0.011534651,
-0.023476287, 0, 0, -0.00814845, -0.016498838, 0, 0, -0.0099856,
-0.020275409, -0.002818337, 0, -0.007212389, -0.014582736, 0,
0, -0.004121565, -0.008294445, 0, 0, -0.010766386, -0.021886884,
0, 0, -0.010179741, -0.02067574, 0, 0, -0.011797067, -0.024020039,
-0.002017983, -0.007343864, -0.007398196, -0.014962644), stat.resProp.Dwell.1 = c(0.012777325,
-0.002991775, -0.057819571, -0.00796817, -0.019386714, 0, 0.009740337,
0.005638356, -0.035148694, 0, 0.027084134, -0.160377856, 0.101169235,
-0.043007944, 0.043007944, -0.002580647, -0.015625318, 0.023347364,
0.007662873, -0.09607383, -0.024575906, 0.056733018, -0.000904568,
-0.058703392, 0.011450507, 0.007561473, 0.037879817, -0.032246,
0.042169401, -0.001796946, -0.024580209, -0.148788737, 0.082097362,
-0.000985707, -0.00098668, 0.003940892, -0.049380309, 0.005151995,
0.027371197, -0.025317808, 0.019299736, -0.047382704, -0.010604553,
0.082827084, -0.04516573, 0.003075348, 0.007139245, 0.022111454,
-0.004982571, -0.038701368, 0.018519048, -0.049096021, 0.061254226,
-0.020346582, 0.023363175, -0.00402415, -0.014213437, 0.023245109,
0.027587957), stat.carReg = c(0.022775414, 0.008073857, 0.002624717,
0.169431097, -0.144595366, 0.066716837, -0.086971929, 0.037928208,
0.071752161, -0.046824102, 0.106085873, 0.049965928, -0.057984255,
-0.091650262, 0.090732857, -0.082282389, 0.053376121, -0.044203971,
-0.022855425, 0.025856271, 0.000136493, 0.05579193, -0.293966656,
0.013645739, 0.059732986, 0.187020956, -0.145234848, 0.11041385,
-0.126539687, -0.000949877, 0.031473389, 0.020267816, -0.02180532,
-0.07175183, 0.147500145, -0.040559138, 0.008394819, 0.049045337,
-0.043050615, 0.094358754, -0.058408438, -0.005018402, -0.061717889,
0.100150837, -0.071100417, -0.084393865, 0.002854733, 0.002141389,
-0.026538398, 0.013480513, -0.046002189, -0.030495611, 0.052899746,
0.012842017, 0.064086498, 0.020757573, -0.043441298, -0.009563043,
0.048033848)), .Names = c("stat.resProp.Dwell.4", "stat.lohn",
"stat.resProp.Dwell.1", "stat.carReg"), row.names = c(NA, -59L
), class = "data.frame")
The function and my function call are:
select.lags<-function(x,y,max.lag=8) {
  y<-as.numeric(y)
  y.lag<-embed(y,max.lag+1)[,-1,drop=FALSE]
  x.lag<-embed(x,max.lag+1)[,-1,drop=FALSE]
  t<-tail(seq_along(y),nrow(y.lag))
  ms=lapply(1:max.lag,function(i) lm(y[t]~y.lag[,1:i]+x.lag[,1:i]))
  pvals<-mapply(function(i) anova(ms[[i]],ms[[i-1]])[2,"Pr(>F)"],max.lag:2)
  ind<-which(pvals<0.05)[1]
  ftest<-ifelse(is.na(ind),1,max.lag-ind+1)
  aic<-as.numeric(lapply(ms,AIC))
  bic<-as.numeric(lapply(ms,BIC))
  structure(list(ic=cbind(aic=aic,bic=bic),pvals=pvals,
                 selection=list(aic=which.min(aic),bic=which.min(bic),ftest=ftest)))
}
for (i in length(datSel) ) {
  for (y in length(datSel) ) {
    d1<-ts(datSel[i])
    d2<-ts(datSel[y])
    lag <- select.lags(d1,d2,5)
  }
}
As output of lag I get:
> lag
$ic
aic bic
[1,] -115.3623 -109.56679
[2,] -114.3370 -106.60972
[3,] -116.2026 -106.54350
[4,] -114.7030 -103.11210
[5,] -112.7153 -99.19253
[6,] -110.8018 -95.34721
[7,] -110.0812 -92.69477
[8,] -110.1427 -90.82446
$pvals
[1] 0.1952302 0.3017934 0.7858944 0.9176337 0.5040079 0.0604511 0.3406657
$selection
$selection$aic
[1] 3
$selection$bic
[1] 1
$selection$ftest
[1] 1
As you can see, I get only 8 results back, although my data.frame has 20 variables.
Any suggestions as to what I am doing wrong?
I appreciate your replies!
If you want to store, e.g., the result of the AIC criterion:
# one row and one column per variable in datSel
lag.store.aic = matrix(NA, length(datSel), length(datSel))
for (i in 1:length(datSel) ) {
  for (y in 1:length(datSel) ) {
    d1<-ts(datSel[,i])
    d2<-ts(datSel[,y])
    lag <- select.lags(d1,d2,5)
    lag.store.aic[i,y] = lag$selection$aic
  }
}
You get 8 values in $ic because max.lag is 8; it has nothing to do with the number of variables.
Please also note that I added commas when indexing by variable for clarity, and that you have to loop over 1:length(datSel); otherwise the loop body runs only once, for the last variable.