Bad character in format - r

I hope someone in this great community can help. For several weeks I have been running an R script that produces a txt file as output, which is then imported into Teradata after a daily drop/create of a table. I never had any issue until today, when I received the error: "Error executing query for record 1: 2621 Bad character in format or data...".
I frantically googled everything googleable, but nothing answered my problem. I even tried replacing the content of the txt file with an old file that had previously uploaded just fine, and today it produced this same horrendous error. It is only a small table of 6 columns with these characteristics:
Top_wagered_Games varchar(111)
,support DEC(9,6)
,confidence DEC(9,6)
,lift DEC(9,2)
,cnt int
, "date" date
The table generally holds only a few rows (no more than 15). What went wrong? Why is this happening? Could anyone help?
Teradata provider version: ODBC 15.10.01.01
Thanks!
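One way to narrow down a 2621 error is to scan the exported txt file for characters Teradata cannot cast into the DEC and DATE columns. A minimal diagnostic sketch in R, with the file name and delimiter as placeholder assumptions for the actual export:

path <- "daily_export.txt"  # placeholder for the actual export file
raw <- readLines(path, warn = FALSE)
# Flag lines containing anything outside printable ASCII (plus tab),
# a common cause of "Bad character in format or data"
bad <- grepl("[^\\x20-\\x7E\\t]", raw, perl = TRUE)
which(bad)  # line numbers to inspect
raw[bad]    # the offending lines themselves
# Also confirm the numeric fields really parse as numbers
# (adjust sep and header to match the actual export)
d <- read.table(path, sep = "\t", header = FALSE, stringsAsFactors = FALSE)
suppressWarnings(sum(is.na(as.numeric(d$V2))))  # V2 = support; NAs here mean bad values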

Related

R won't recognize updated CSV

This is a super basic question that I'm hoping someone can help me with (I'm super new to R, so my troubleshooting is remedial at best).
I noticed some spelling errors in my data, so I went back to the CSV file, made the changes, saved, closed out, and re-read the data using read.csv(). Everything showed up and worked as normal until I tried to run a simple count of the entries in three of the columns. I've done this numerous times with the exact same code, the exact same CSV file, and the exact same working directory (spelling errors aside), but for whatever reason I got the following error message:
Error in file(file, "rt") : cannot open the connection In addition: Warning message: In file(file, "rt") : cannot open file 'FFS_TargetingEventsAllCSV2.csv': No such file or directory
I restarted everything, reset the working directory, double-checked the spelling, and used getwd(), but encountered the same problem.
So I decided to use an older backup version of the same dataset. I was able to read it in and run my counts as normal. However, I noticed similar spelling errors, so I went back to the CSV, made the necessary changes, and re-read it. Everything looked normal until I ran the counts again and saw the exact same spelling errors, unchanged.
So I decided to start fresh and re-save the file (Save As) under a new name, as a new CSV in the same working directory... same. exact. issue.
Every time I open the CSV file on my desktop, it shows me the most up-to-date version, but I can't figure out why R isn't recognizing any changes. I even made new, subtle spelling changes in a different column to see if that would make a difference, but no.
To clarify, the data in the three columns of interest are just 2-3 letter codes (e.g., FSP), sometimes with a hyphen (e.g., HTBR-DA). I'm not trying to run any stats or tests; I just want summary counts. I also updated R/RStudio last Thursday, so I have the most recent versions of the software as well.
Any advice on this would be much appreciated.
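A few quick checks in R usually pin this kind of thing down: confirm where R is actually looking, whether the file exists there, and when it was last modified. The file name below is taken from the error message above; the column name is a placeholder:

getwd()                                            # the folder R resolves relative paths against
file.exists("FFS_TargetingEventsAllCSV2.csv")      # FALSE means wrong folder or wrong name
file.info("FFS_TargetingEventsAllCSV2.csv")$mtime  # last-modified time: did your save land here?
dat <- read.csv("FFS_TargetingEventsAllCSV2.csv", stringsAsFactors = FALSE)
table(dat$code)  # 'code' is a placeholder for one of the three columns of interest

If the mtime predates your edits, Excel is saving the file somewhere other than the directory R is reading from, which would explain both the stale data and the later "No such file or directory" error.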

Why does Excel make un-expanded dates unreadable to other software?

I recently ran into an issue in R where it wasn't reading the date values from my csv file. I reviewed and revised my code many times before realizing the source of the issue was the file itself. After some experimentation, I realized the dates would only read correctly if I expanded the date column in Excel and then resaved the file. This doesn't seem logical to me: the data is stored in the spreadsheet, so while I expect Excel to hide it from the human eye until the column is expanded, I did not expect it to be unreadable to another computer program, in this case R. I feel I got lucky discovering this and would like to understand why it works this way.
It helps to mention that I noticed a very similar issue when pasting un-expanded dates from Excel to Google Sheets; the result is cells filled with "######" instead of the actual date values.
Because Excel is the common party here, I'm assuming this is an Excel issue.
What is happening with Excel to make the un-expanded date values unreadable to other software? Is this something that Excel is aware of?
[Screenshots from the original post: the un-expanded dates in Excel, the misread values in R, and the Google Sheets paste result.]
After expanding the column in the csv file and saving it, both the R issue and the Google Sheets issue went away.
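If a too-narrow column really did leak "######" into the saved csv, it is easy to detect after import; a minimal sketch, with the file name, column name, and date format as placeholders:

dat <- read.csv("dates.csv", stringsAsFactors = FALSE)
# Cells that are just runs of '#' came from a too-narrow Excel column
sapply(dat, function(col) sum(grepl("^#+$", as.character(col))))
# Once the file is fixed, parse the dates explicitly rather than trusting defaults
dat$date <- as.Date(dat$date, format = "%m/%d/%Y")  # adjust format to match your file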

fread not reading all records and no warning message

I'm trying to load some data using fread. While loading, it shows the correct number of records, but when it's finished, the number of records read is noticeably lower.
Surprisingly, it doesn't show any warnings. Can someone please advise? (See the attached screenshot.)
Thanks
One common reason is unclean data with unbalanced (unterminated) quotation marks.
E.g., if you have data like this:
number_column,text_column
1,text data 1
2,"text with single quote here
3,text data 3
EVERYTHING after that lone quote will be included in text_column for the record on line 2. This is actually the correct way to interpret the file; it's just that your CSV/TSV file is broken.
The easiest workaround is to pass quote="" as a parameter, but the real solution is to go through your TSV/CSV file and fix the issues manually, since the parser cannot know what you intended once the file is broken.
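To illustrate, assuming the broken sample above is saved as broken.csv (exact behavior under the default quoting varies by data.table version, but older versions silently merged the trailing rows into record 2):

library(data.table)
fread("broken.csv")              # default quoting: may return fewer rows than the file has
fread("broken.csv", quote = "")  # treat quotes as ordinary characters: one record per line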

Bulkload option exporting data from R to Teradata using RODBC

I have done a lot of research on how to upload a huge .txt data set through R to a Teradata DB. I tried RODBC's sqlSave(), but it did not work. I also followed some other similar questions posted, such as:
Write from R to Teradata in 3.0 OR Export data frame to SQL server using RODBC package OR How to quickly export data from R to SQL Server.
However, since Teradata is structured somewhat differently than MS SQL Server, most of the suggested options are not applicable to my situation.
I know there is a teradataR package available, but it has not been updated in 2-3 years.
So here are the 2 main problems I am facing:
1. How to bulk load (all records at once) data in .txt format into Teradata using R, if there is any way. (So far I have only done this with SAS, but I need to explore it in R.)
2. The data is large (500+ MB), so I cannot load it all through R; I am sure there is a way around this by pulling the data directly from the server.
Here is what I tried, based on one of the posts, but it was for MS SQL Server:
toSQL <- data.frame(...)  # this doesn't work for me because the data is too big
write.table(toSQL, "C:\\export\\filename.txt", quote = FALSE, sep = ",",
            row.names = FALSE, col.names = FALSE, append = FALSE)
sqlQuery(channel,
         "BULK INSERT Yada.dbo.yada
          FROM '\\\\<server-that-SQL-server-can-see>\\export\\filename.txt'
          WITH
          (
            FIELDTERMINATOR = ',',
            ROWTERMINATOR = '\\n'
          )")
*Note: there is an insert/import option in Teradata, but that is the same as writing millions of rows of INSERT statements.
Sorry that I do not have sample code at this point, since the package I found wasn't the right one to use.
Has anyone had similar issues/problems?
Thank you so much for your help in advance!
I am not sure if you have figured this out already, but I second Andrew's solution: if you have the Teradata utilities installed on your computer, you can easily run the FastLoad utility from the shell.
So I would:
export by data frame to a txt file (Comma separated)
create my FastLoad script and reference the exported txt file from within that script (a sketch of the script follows the code below)
run the shell command referencing my fastload script.
setwd("pathforyourfile")
write.table(mtcars, "mtcars.txt", sep = ",", row.names = FALSE, quote = FALSE, na = "NA", col.names = FALSE)
shell("fastload < mtcars_fastload.txt")
I hope this sorts out your issue. Let me know if you need help, especially with the FastLoad script. More than happy to help.

Generating Excel file with XLConnect - Removed Feature: Format from /xl/styles.xml part (Styles)

I am using XLConnect in R for daily report generation. I have a program that runs automatically at a specific time to append the most recent day's data to an Excel file (Excel 2007). The program performs this task fine, but sometimes when I open the Excel file it says: "Excel found unreadable content. Do you want to recover the contents of this workbook?"
The tricky part is that I can't reproduce the issue to pin down the exact root cause; it arises randomly, and when I run the program again it works fine. Can somebody help me identify the root cause?
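One defensive pattern worth trying, on the assumption that the corruption comes from the scheduled run being interrupted mid-save (or colliding with a copy of the file held open elsewhere): append to a temporary copy and only swap it in once the save has completed. A sketch using XLConnect, with the file name, sheet name, and data frame as placeholders:

library(XLConnect)

report <- "daily_report.xlsx"  # placeholder for the real report file
tmp <- paste0(report, ".tmp.xlsx")

file.copy(report, tmp, overwrite = TRUE)          # work on a copy, never the live file
wb <- loadWorkbook(tmp)
appendWorksheet(wb, todays_rows, sheet = "data")  # todays_rows: the day's data frame
saveWorkbook(wb)
file.rename(tmp, report)  # swap in only after a successful save

This way an interrupted run leaves a stale-but-valid report behind instead of a half-written workbook.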
