How can I build a custom context-based question answering model (SQuAD) using DeepPavlov - bert-language-model

I have the following questions:
1. What is the dataset format (and how should I split it into train, test and validation data)?
2. Where should I place the dataset?
3. How do I change the path for the dataset reader?
4. How do I save the model in my own directory?
5. How do I use the trained model?
Edit
my_config['dataset_reader']['data_path'] = '/home/ec2-user/SageMaker/squad/data/'
my_config['metadata']['variables']['MODELS_PATH'] = '/home/ec2-user/SageMaker/squad/model/'
I used these lines to change the dataset path and the model path in the configuration file. My model is saved in this location, but training does not use my dataset. Instead, it downloads its own dataset into that folder and uses that.

1. An example of the dataset format is here: https://github.com/deepmipt/DeepPavlov/blob/f5117cd9ad1e64f6c2d970ecaa42fc09ccb23144/deeppavlov/dataset_readers/squad_dataset_reader.py#L46
Your dataset should have the same format.
2-3. The dataset should be placed in the folder given here: https://github.com/deepmipt/DeepPavlov/blob/f5117cd9ad1e64f6c2d970ecaa42fc09ccb23144/deeppavlov/configs/squad/squad_torch_bert.json#L4
(you can change the folder name).
4. The model is saved in the directory given here: https://github.com/deepmipt/DeepPavlov/blob/f5117cd9ad1e64f6c2d970ecaa42fc09ccb23144/deeppavlov/configs/squad/squad_torch_bert.json#L166
(you can write your own directory here).
5. The trained model can be used with the command: python3 -m deeppavlov interact <your_config_name>
A more detailed tutorial on how to launch models is here: https://github.com/deepmipt/DeepPavlov
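To make the dataset format concrete, here is a minimal sketch that writes a tiny custom dataset in the SQuAD v1.1 JSON layout. It rests on two assumptions that should be checked against the linked reader source: that the reader looks for files named train-v1.1.json and dev-v1.1.json inside data_path, and that it downloads the original SQuAD archive when either file is missing (which would explain the behaviour described in the question). The helper names make_squad_record and write_squad_split, the sample text, and the temporary output folder are made up for illustration.

```python
import json
import tempfile
from pathlib import Path


def make_squad_record(title, context, qa_pairs):
    """Build one SQuAD v1.1 article from (question, answer_text) pairs."""
    qas = []
    for i, (question, answer) in enumerate(qa_pairs):
        # SQuAD stores the character offset of the answer inside the context.
        start = context.find(answer)
        if start < 0:
            raise ValueError(f"answer {answer!r} not found in context")
        qas.append({
            "id": f"{title}-{i}",
            "question": question,
            "answers": [{"text": answer, "answer_start": start}],
        })
    return {"title": title, "paragraphs": [{"context": context, "qas": qas}]}


def write_squad_split(records, path):
    """Write a list of articles as one SQuAD-style JSON file."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf8") as fp:
        json.dump({"version": "1.1", "data": records}, fp, ensure_ascii=False)


context = "DeepPavlov is an open-source library for building dialogue systems."
train = [make_squad_record("doc1", context,
                           [("What is DeepPavlov?", "an open-source library")])]
valid = [make_squad_record("doc2", context,
                           [("What is DeepPavlov used for?",
                             "building dialogue systems")])]

# Assumed file names; a temporary folder stands in for your own data_path.
data_dir = Path(tempfile.mkdtemp()) / "data"
write_squad_split(train, data_dir / "train-v1.1.json")
write_squad_split(valid, data_dir / "dev-v1.1.json")
```

With files like these in place, data_path in the config would point at the folder that contains them. Under the same assumption, the reader provides only a train and a validation split, so a separate test split would have to be held out and evaluated separately.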

Related

Is there a library that can generate CSV files given a data dictionary and data model in some format?

Is there a library in any language that can generate a .csv file for each entity of a data model, with values that comply with a data dictionary?
For example:
The data dictionary is specified in a CSV file with these column names - field,regex,description
The data model is specified in another CSV file with these column names - entity,field
faker comes very close; however, it needs some programming to work with a data model. A wrapper around faker might work well, I suppose.
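The setup described above could be wired together roughly as follows. This is only a sketch under stated assumptions: the two CSV inputs are passed as text, the sample fields and entities are invented, and the value generator is a pluggable callable. The stand-in placeholder_value does not produce regex-compliant data; a regex-driven generator from a third-party package (for example rstr's xeger or exrex's getone, to be verified against their documentation) could be passed in its place.

```python
import csv
import io
from collections import defaultdict


def load_dictionary(dict_csv):
    """Map field name -> regex from CSV text with columns field,regex,description."""
    return {row["field"]: row["regex"]
            for row in csv.DictReader(io.StringIO(dict_csv))}


def load_model(model_csv):
    """Map entity name -> ordered list of fields from CSV text with columns entity,field."""
    entities = defaultdict(list)
    for row in csv.DictReader(io.StringIO(model_csv)):
        entities[row["entity"]].append(row["field"])
    return dict(entities)


def generate_tables(dictionary, model, make_value, rows=3):
    """Return entity name -> CSV text with a header row and `rows` generated rows."""
    tables = {}
    for entity, fields in model.items():
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(fields)
        for _ in range(rows):
            writer.writerow([make_value(dictionary[f]) for f in fields])
        tables[entity] = buf.getvalue()
    return tables


def placeholder_value(regex):
    """Hypothetical stand-in: echoes the pattern instead of generating a match."""
    return f"<{regex}>"


# Invented sample inputs in the two formats described above.
dict_csv = "\n".join([
    "field,regex,description",
    r"customer_id,\d{5},Five-digit customer key",
    r"name,[A-Z][a-z]+,Capitalised name",
    r"order_id,\d{8},Eight-digit order key",
])
model_csv = "\n".join([
    "entity,field",
    "customer,customer_id",
    "customer,name",
    "order,order_id",
    "order,customer_id",
])

tables = generate_tables(load_dictionary(dict_csv), load_model(model_csv),
                         placeholder_value)
```

Each value in tables is the text of one entity's CSV and can be written to a file named after the entity. Keeping the generator as a parameter means the same loop works whether values come from a regex library or from faker providers.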

How to Save Convolutional Neural Network (CNN) Model Output in R?

I would like to save the CNN model output as an RDS or RDA file in R, and then load the resulting file in another application.
I have tried the saveRDS option, but it creates an empty file in the workspace.
model1<- saveRDS("model", file = "model.rda")

Rscript in Hive

I have a linear regression model created in R, and I have the model file stored. How do I run it on a Hive table and score inside the Hive CLI?
This document has some useful information on this. My question is how to code the scorer.R wrapper file. What should the content of the file be? Thanks.
http://www.slideshare.net/huguk/hug-data-science

How to export a dataset to SPSS?

I want to export a dataset in the MASS package to SPSS for further investigation. I'm looking for the EuStockMarkets data set in the package.
As described in http://www.statmethods.net/input/exportingdata.html, I did:
library(foreign)
write.foreign(EuStockMarkets, "c:/mydata.txt", "c:/mydata.sps", package="SPSS")
I got a text file, but the sps file is not a valid SPSS file. I'm really looking for a way to export the dataset to something that SPSS can open.
As Thomas has mentioned in the comments, write.foreign doesn't generate native SPSS datafiles (.sav). What it does generate is the data in a comma delimited format (the .txt file) and a basic syntax file for reading that data into SPSS (the .sps file). The EuStockMarkets data object class is multivariate time series (mts) so when it's exported the metadata is lost and the resulting .sps file, lacking variable names, throws an error when you try to run it in SPSS. To get around this you can export it as a data frame instead:
write.foreign(as.data.frame(EuStockMarkets), "c:/mydata.txt", "c:/mydata.sps", package="SPSS")
Now you just need to open mydata.sps as a syntax file (NOT as a datafile) in SPSS and run it to read in the datafile.
Rather than exporting it, use the STATS GET R extension command. It will take a specified data frame from an R workspace/dataset and convert it into a Statistics dataset. You need the R Essentials for Statistics and the extension command, which are available via the SPSS Community site (www.ibm.com/developerworks/spssdevcentral)
I'm not trying to answer a question that has been answered. I just think there is something else to complement for other users looking for this.
On your SPSS window, you just need to find the first line of code and edit it. It should be something like this:
"file-name.txt"
You need to find the folder path where you're keeping your file:
"C:\Users\DELL\Google Drive\Folder-With-Your-File"
Then you just need to add this path to your file's name:
"C:\Users\DELL\Google Drive\Folder-With-Your-File\file-name.txt"
Otherwise SPSS will not recognize the .txt file.
Sorry if I'm repeating some information here, I just wanted to make it easier to understand.
I suppose that EuStockMarkets is a (labelled) data frame.
This should work and even keep the variable and value labels:
require(sjlabelled)
write_spss(EuStockMarkets, "mydata.sav")
Or you try rio:
rio::export(EuStockMarkets, "mydata.sav")

Saving and loading a model in R

When working with caret, how can I save a model after training, and load it later (e.g. in a different session) for prediction?
A better solution nowadays is to use saveRDS to save and readRDS to read:
saveRDS(model, "model.rds")
my_model <- readRDS("model.rds")
This lets you choose a new name for the object (you don't need to remember the name you used when you saved it).
The correct syntax would be to use:
save(model, file="model.Rdata")
Thereafter, it can be loaded using the load() command.
The following code assumes that your model's variable name is 'model':
save(model, file = "model.RData")
This will save your model as "model.RData" in the current working directory. You can find out what the working directory is by issuing the following:
getwd()
To load it back in, ensure that your model is saved in your working directory and issue:
load("model.RData")
