How to convert .txt files to BDIC format? - dictionary

I was recently looking for a spell checker add-on for Anki and found one named (Legacy) Spelling Police™ (it has a GitHub page). The problem is that it has to be fed with the user's own dictionaries, and those must be in BDIC format. After searching the internet for a long time, I could only find dictionaries in .txt format.
So, now I need a way to convert this .txt format to .bdic format.

Related

Importing dates from excel (some formatted as dates some as numbers)

I am working with an uploaded document, originally from Google Docs, downloaded as an xlsx file. The data was entered by hand and formatted as DD-MM-YY, but it has been imported inconsistently (see example below). I've tried a few different things (kicking myself for not saving the code), and the best I managed was simply removing the incorrectly formatted dates.
Any suggestions for fixing this in Excel or (preferably) in R? This is longitudinal data, so it would be frustrating to have to go back into every Excel sheet to update it. Thanks!
data <- read_excel("DescriptiveStats.xlsx")
ex:
22/04/13
43168.0
43168.0
is a correct date value
22/04/13
is not a valid date; it is a text string. To convert it into a date you will need to change it into 04/13/2022.
There are a few options. One is to change the locale so that 22/04/13 becomes valid. See more here: locale differences in google sheets (documentation missing pages)
The second option is to use regex to convert it. See examples:
https://stackoverflow.com/a/72410817/5632629
https://stackoverflow.com/a/73722854/5632629
However, it is very likely that 43168 is also not the correct date. If your date before import was 01/02/2022, then after import it could be 44563, which is actually 02/01/2022 (day and month swapped), so be careful. You can check it with:
=TO_DATE(43168)
and date can be checked with:
=ISDATE("22/04/13")
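For a column that mixes both kinds of value, the normalisation can also be done in code. Below is a minimal sketch in Python (used here only as an illustration, not R or Sheets). The function name parse_mixed_date is hypothetical, and the default text format "%d/%m/%y" is an assumption taken from the DD-MM-YY order stated in the question; pass a different format if the strings are ordered differently. Serial numbers are interpreted with Excel's 1900 date system. The sketch does not detect a day/month swap like the one described above; it only brings the two representations to a common type.

```python
import math
from datetime import date, datetime, timedelta

# Excel's 1900 date system counts days from 1899-12-30 once its fictitious
# 1900-02-29 is accounted for (holds for serial numbers of 61 and above).
EXCEL_EPOCH = date(1899, 12, 30)


def parse_mixed_date(value, text_format="%d/%m/%y"):
    """Return a date for an Excel serial number or a text date, else None.

    text_format is an assumption about how the text dates were typed.
    """
    text = str(value).strip()
    try:
        serial = float(text)
    except ValueError:
        serial = None
    if serial is not None:
        if not math.isfinite(serial):
            return None
        # Drop any time-of-day fraction and count whole days from the epoch.
        return EXCEL_EPOCH + timedelta(days=int(serial))
    try:
        return datetime.strptime(text, text_format).date()
    except ValueError:
        # Neither a number nor a date in the expected format.
        return None
```

Values that come back as None are the ones that still need a manual look.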

Why does Excel make un-expanded dates unreadable to other software?

I recently ran into an issue in R where it wasn't reading the date values from my csv file. I had reviewed and revised my code many times before realizing the source of the issue was the file itself. After some experimentation, I realized that the dates would only be read if I expanded the date column in Excel and then resaved the file. This doesn't seem logical to me: the data is stored in the spreadsheet, so while I expect Excel to make it unreadable to the human eye until the column is expanded, I did not expect it to be unreadable to another computer program, in this case R. I feel that I got lucky discovering this and would like to understand why it works that way.
It helps to mention that I noticed a very similar issue when pasting un-expanded dates from Excel to Google Sheets; the result is cells filled with "######" instead of the actual date values.
Because Excel is the common party here I'm assuming this is an Excel issue.
What is happening with Excel to make the un-expanded date values unreadable to other software? Is this something that Excel is aware of?
[Screenshots: Un-expanded dates; R-Misread; Google Sheets Paste]
After expanding the column in the csv file and saving it, both the R issue and the Google Sheets issue went away.

How to extract a database from a text file (word or libreoffice) with styles and content

I am asking this question after searching for an answer on stackoverflow and on the web, without success.
I'm sorry if there is already an answer somewhere.
Global objective
I aim to create my questionnaires in LibreOffice (I need to print them; it's not for an online survey), and then to use them in an R Shiny app I've created to register the collected answers and export the data.
I want to create the fields in R (questions, answers...) automatically from the styles of my questionnaires in .odt, .docx or other formats.
I need the questionnaires to be well formatted and nice-looking.
Here is the problem:
I have written a questionnaire in a LibreOffice .odt file (or, if necessary, in Microsoft Word).
I use styles for the different text blocks: one style for the "questions", one for the "answers", one for the parts of the questionnaire, one for the "instructions"...
I want to get a database (in .csv format) with one column containing the styles and one column containing the text content.
Solutions?
I tried to open the xml files inside the .odt or .docx archives, but converting them to a simpler, readable format seems quite difficult.
Is it possible to export a toc from LibreOffice or Word to a spreadsheet format?
Can R read such files (.odt, .docx or .xml)?
Thank you very much for your ideas, and more generally for your feedback on my project.
I'm sorry for my english
I would recommend using .Rmd (for rmarkdown) or .Rnw (for knitr) files as the source for your questionnaires, rather than starting with .odt or .docx. You can produce output in various formats, including .docx, .pdf, .html (only .pdf for .Rnw) to display the questionnaire to the subjects, but you can also develop functions to manage the data, or even interactive displays to collect and record the data.
I'm not familiar with R packages that do all of this for you, but I expect they already exist. Maybe someone else will give an answer with more details.
You might explore using the .fodt format in LibreOffice Writer. That format is an "unzipped" version of the Writer xml format, so it could be read directly by xml utilities (and probably by R, with appropriate libraries). I note that for another answer you seemed to want to avoid markdown or knitr composition, and .fodt would provide a "text" format completely compatible with LibreOffice as a front end.
(Note that the other parts of LibreOffice have "flat" versions too, so you could, in theory, process text versions of spreadsheets, graphics, and presentation files in your R utility.)
A few web searches indicate that some relevant libraries and utilities for R exist, which may get you closer to what you need for your project.
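To make the .fodt suggestion concrete, here is a minimal sketch of pulling (style, text) pairs out of a flat ODT document and writing them as CSV. It is written in Python with only the standard library, as an illustration rather than an R solution. The function names styled_paragraphs and to_csv are hypothetical. The sketch assumes the paragraphs of interest are text:p or text:h elements inside office:body, and that an automatic style (such as P1) should be reported under its parent style name. Other ODF details, such as nested frames or encoded characters in style names, are not handled.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Standard OpenDocument namespace URIs.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "style": "urn:oasis:names:tc:opendocument:xmlns:style:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}
TEXT = "{%s}" % NS["text"]
STYLE = "{%s}" % NS["style"]


def styled_paragraphs(fodt_xml):
    """Return a list of (style name, text) pairs from flat ODT XML."""
    root = ET.fromstring(fodt_xml)

    # Map automatic styles (created by direct formatting) to their parents.
    parents = {}
    for style in root.iterfind("office:automatic-styles/style:style", NS):
        parent = style.get(STYLE + "parent-style-name")
        if parent:
            parents[style.get(STYLE + "name")] = parent

    body = root.find("office:body", NS)
    if body is None:
        return []

    rows = []
    for element in body.iter():
        if element.tag in (TEXT + "p", TEXT + "h"):
            name = element.get(TEXT + "style-name", "")
            content = "".join(element.itertext()).strip()
            if content:
                rows.append((parents.get(name, name), content))
    return rows


def to_csv(rows):
    """Serialise the pairs as CSV text with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(["style", "text"])
    writer.writerows(rows)
    return buffer.getvalue()
```

For a real file, the XML text could be read with open(path, "rb").read() and passed to styled_paragraphs; the resulting CSV can then be loaded in R with read.csv.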

Importing to R an Excel file saved as web-page

I would like to open an Excel file saved as webpage using R and I keep getting error messages.
The desired steps are:
1) Upload the file into RStudio
2) Change the format into a data frame / tibble
3) Save the file as an xls
The message I get when I open the file in Excel is that the file format (excel webpage format) and extension format (xls) differ. I have tried the steps in this answer, but to no avail. I would be grateful for any help!
I don't expect anybody will be able to give you a definitive answer without a link to the actual file. The complication is that many services will write files as .xls or .xlsx without them being valid Excel format. This is done because Excel is so common and some non-technical people feel more confident working with Excel files than a csv file. Now, the files will have been stored in a format that Excel can deal with (hence your warning message), but R's libraries are more strict and don't see the actual file type they were expecting, so they fail.
That said, the below steps worked for me when I last encountered this problem. A service was outputting .xls files which were actually just HTML tables saved with an .xls file extension.
1) Download the file to work with it locally. You can script this of course, e.g. with download.file(), but this step helps eliminate other errors involved in working directly with a webpage or connection.
2) Load the full file with readHTMLTable() from the XML package
library(XML)
# filename: path to the downloaded file (an HTML table saved with an .xls extension)
dTemp <- readHTMLTable(filename, stringsAsFactors = FALSE)
This will return a list of dataframes. Your result set will quite likely be the second element or later (see ?readHTMLTable for an example with explanation). You will probably need to experiment here and explore the list structure as it may have nested lists.
3) Extract the relevant list element, e.g.
df <- dTemp[[2]]  # [[ ]] extracts the data frame itself; [ ] would return a one-element list
You also mention writing out the final data frame as an xls file which suggests you want the old-style format. I would suggest the package WriteXLS for this purpose.
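Before choosing a reader, it can help to check what such a file really contains. The sketch below, in Python purely for illustration, looks at the first bytes of a file: legacy .xls files are OLE2 compound documents, .xlsx files are ZIP archives, and an HTML table saved with an .xls extension starts with markup. The function name sniff_spreadsheet_format and its return labels are hypothetical, and the check is a heuristic, not a full validation.

```python
# Signatures of the two genuine Excel container formats.
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # legacy .xls
ZIP_MAGIC = b"PK\x03\x04"                          # .xlsx is a ZIP archive


def sniff_spreadsheet_format(first_bytes):
    """Guess the real format of a file from its leading bytes."""
    if first_bytes.startswith(OLE2_MAGIC):
        return "xls"
    if first_bytes.startswith(ZIP_MAGIC):
        return "zip (possibly xlsx)"
    # Skip a UTF-8 byte order mark and leading whitespace before the markup.
    head = first_bytes.lstrip(b"\xef\xbb\xbf \t\r\n")
    if head.startswith(b"<"):
        return "html or xml"
    return "unknown"


def sniff_file(path):
    """Read the start of a file and classify it."""
    with open(path, "rb") as handle:
        return sniff_spreadsheet_format(handle.read(64))
```

If the result is "html or xml", an HTML table reader such as the one described above is the appropriate tool rather than an Excel reader.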
I seriously doubt the Excel file is 'saved as a web page'. I'm pretty sure the file just sits on a server and all you have to do is go fetch it. Some kinds of files (in particular Excel and h5) are binary rather than text files. This needs an added setting to tell R that it is a binary file and should be handled appropriately.
myurl <- "http://127.0.0.1/imaginary/file.xlsx"
download.file(url=myurl, destfile="localcopy.xlsx", mode="wb")
or use the downloader package and try something like this:
library(downloader)
myurl <- "http://127.0.0.1/imaginary/file.xlsx"
download(myurl, destfile="localcopy.xlsx", mode="wb")

Flex: Write data to file to open in Excel

I have a Flex application with a couple of DataGrids with data. I'd like to save the data to a file so that the user can keep working with them in Excel, OpenOffice or Numbers.
I'm currently writing a csv file straight off, which opens well in OpenOffice or Numbers, but not in Excel. The problem is with the Swedish characters ÅÄÖ, which turn up as other characters when opening in Excel. Converting (in Notepad++) the csv-file to ANSI encoding makes the ÅÄÖ show up correctly in Excel.
Is there any way to write ANSI-encoded files straight from Flex?
Any other options for writing a file that can be opened in Excel and OpenOffice?
(I've looked at the as3xls library, but according to the comments those files cannot be opened in OpenOffice)
Using the writeMultiByte function from the ByteArray class allows you to specify a character set. See:
http://www.adobe.com/livedocs/flash/9.0/ActionScriptLangRefV3/flash/utils/ByteArray.html#writeMultiByte%28%29
There is also the option of the as3xls package at http://code.google.com/p/as3xls/. I like this as it produces a straight Excel file that can also be opened easily in OpenOffice.
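To illustrate why the characters change and what writing with an explicit character set achieves, here is a small sketch in Python (not ActionScript). It assumes that "ANSI" in the question means Windows-1252 (cp1252), which is an assumption about the machine in question, and csv_bytes is a hypothetical helper. The same text is encoded once as cp1252 and once as UTF-8; decoding the UTF-8 bytes as cp1252 reproduces the kind of garbling described for Excel.

```python
import csv
import io


def csv_bytes(rows, encoding="cp1252"):
    """Serialise rows as CSV and encode them with the given character set."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue().encode(encoding)


# One row holding the three Swedish letters from the question (A-ring,
# A-umlaut, O-umlaut), written as escapes to keep this source ASCII-only.
ROW = [["\u00c5", "\u00c4", "\u00d6"]]

ansi_bytes = csv_bytes(ROW, "cp1252")  # one byte per letter
utf8_bytes = csv_bytes(ROW, "utf-8")   # two bytes per letter

# What a reader sees when it assumes cp1252 but the file is UTF-8.
garbled = utf8_bytes.decode("cp1252")
```

Writing the bytes from the cp1252 branch to disk corresponds to what the Notepad++ conversion in the question produced.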
