I am preparing a simulation in CAPL for more than 200 PGNs, which are listed in an *.xls file.
1) Is there any way to convert a string to the pgn datatype?
2) Any other method to convert the 200+ PGNs, with their priorities, to CAPL would be really helpful.
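One practical route is to generate the CAPL source from the spreadsheet with a small script instead of typing 200+ definitions by hand. Below is a minimal Python sketch of that idea, not a ready-made solution: the column names (PGN, Priority, Name), the file names, and the CAPL template string are assumptions you would have to adapt to your spreadsheet layout and to the PGN/message declaration syntax your CANoe version expects.

# Sketch: generate CAPL declarations from an .xls list of PGNs.
# Column names "PGN", "Priority", "Name" are assumptions -- adjust to your sheet.
import pandas as pd  # reading .xls also needs the xlrd package installed

df = pd.read_excel("pgn_list.xls")  # hypothetical file name

# Placeholder CAPL template -- replace with the declaration form your
# CANoe/J1939 environment actually uses.
template = ('// PGN 0x{pgn:05X}, priority {prio}\n'
            'message 0x{canid:08X}x msg_{name};\n')

with open("generated_pgns.can", "w") as out:
    for _, row in df.iterrows():
        raw = row["PGN"]
        pgn = int(raw, 16) if isinstance(raw, str) else int(raw)
        prio = int(row["Priority"])
        # 29-bit J1939 identifier: priority (3 bits) | PGN (18 bits) | source address
        canid = (prio << 26) | (pgn << 8) | 0xFE  # 0xFE = placeholder source address
        out.write(template.format(pgn=pgn, prio=prio, canid=canid, name=row["Name"]))

The generated .can file can then be included in the simulation node; the point is only that the repetitive part is scripted, not the exact CAPL syntax shown in the template.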
I am using the DBI package in R to connect to Teradata this way:
library(teradatasql)
query <- "
SELECT sku, description
FROM sku_table
WHERE sku = '12345'
"
dbconn <- DBI::dbConnect(
teradatasql::TeradataDriver(),
host = teradataHostName, database = teradataDBName,
user = teradataUserName, password = teradataPassword
)
dbFetch(dbSendQuery(dbconn, query), -1)
It returns a result as follows:
SKU DESCRIPTION
12345 18V MAXâ×¢ Collated Drywall Screwgun
Notice the bad characters â×¢ above. This is supposed to be the superscript TM (trademark) symbol.
When I use SQL Assistant to run the query and export the results manually to a CSV file, it works fine: the DESCRIPTION column has the correct encoding.
Any idea what is going on and how I can fix this problem? Obviously, I don't want a manual step of exporting to CSV and re-reading the results back into an R data frame in memory.
The Teradata SQL Driver for R (teradatasql package) only supports the UTF8 session character set, and does not support using the ASCII session character set with a client-side character set for encoding and decoding.
If you have stored non-LATIN characters in a CHARACTER SET LATIN column in the database, and are using a client-side character set to encode and decode those characters for the "good" case, that will not work with the teradatasql package.
On the other hand, if you used the UTF8 or UTF16 session character set to store Unicode characters into a CHARACTER SET UNICODE column in the database, then you will be able to retrieve those characters successfully using the teradatasql package.
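If you are not sure how the column was defined, you can check its character set from the data dictionary over the same connection. A minimal sketch, reusing the dbconn from the question (the database name literal is a placeholder):

# Check whether the description column is CHARACTER SET LATIN (CharType 1)
# or UNICODE (CharType 2) in the Teradata data dictionary.
charset_check <- DBI::dbGetQuery(dbconn, "
  SELECT ColumnName, CharType
  FROM DBC.ColumnsV
  WHERE DatabaseName = 'my_database'   -- placeholder: use teradataDBName
    AND TableName    = 'sku_table'
    AND ColumnName   = 'description'
")
print(charset_check)
# CharType 2 (UNICODE) can round-trip the trademark symbol through the driver's
# UTF8 session; CharType 1 (LATIN) cannot hold it natively.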
I am accessing a data frame generated from a CSV file I read in with read.csv().
The end goal is this: Look at rows where aColumn is the value of value[i], and get the value of anotherColumn from these rows (stored in tmpC). Later this will be reduced to a single string of AVALUE or DIFFERENTVALUE.
As you can see, the tmpC variable gives me an array (well, a character vector) of strings that I could then reduce down to a single string with head(). Cool. On Windows, not so much: I get a factor.
What is happening?
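For what it's worth, whether read.csv() gives character columns or factors depends on the stringsAsFactors default, which changed from TRUE to FALSE in R 4.0, so two machines running different R versions can disagree. A minimal sketch of the subsetting described above with the option set explicitly; the file name and the value vector are placeholders, and aColumn/anotherColumn are the column names from the question:

# Force character columns regardless of which R version's default applies.
df <- read.csv("myfile.csv", stringsAsFactors = FALSE)  # placeholder file name

value <- c("someKey1", "someKey2")  # placeholder lookup values

for (i in seq_along(value)) {
  # rows where aColumn matches value[i]; take anotherColumn from those rows
  tmpC <- df$anotherColumn[df$aColumn == value[i]]
  # tmpC is now a character vector, so reducing it to one string behaves the
  # same on every platform, e.g.:
  single <- head(tmpC, 1)
}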
I have multiple 30 GB / 1-billion-record files which I need to load into Netezza. I am connecting using pyodbc and running the following commands:
create temp table tbl1(id bigint, dt varchar(12), ctype varchar(20), name varchar(100)) distribute on (id)
insert into tbl1
select * from external 'C:\projects\tmp.CSV'
using (RemoteSource 'ODBC' Delimiter '|' SkipRows 1 MaxErrors 10 QuotedValue DOUBLE)
Here's a snippet from the nzlog file
Found bad records
bad #: input row #(byte offset to last char examined) [field #, declaration] diagnostic,
"text consumed"[last char examined]
----------------------------------------------------------------------------------------
1: 2(0) [1, INT8] contents of field, ""[0x00<NUL>]
2: 3(0) [1, INT8] contents of field, ""[0x00<NUL>]
and the nzbad file has "NUL" between every character.
I created a new file with the first 2 million rows, then ran iconv on it:
iconv -f UCS-2LE -t UTF-8 tmp.CSV > tmp_utf.CSV
The new file loads perfectly with no errors using the same commands. Is there any way for me to load the files without the iconv transformation? It is taking a really long time to run iconv.
UCS-2LE is not supported by Netezza. I hope for your sake that UTF-8 is enough for the data you have (no ancient languages or the like?).
You need to focus on doing the conversion faster by:
1) searching the internet for a more CPU-efficient implementation of iconv
2) converting multiple files in parallel (your number of CPU cores minus one is probably the most worth using); a sketch of this follows below. You may need to split the original files before you do it. The Netezza loader prefers relatively large files, though, so you may want to put them back together while loading for extra speed in that step :)
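As an illustration of the second point, here is a minimal Python sketch (Python is assumed to be on hand since the load itself uses pyodbc) that runs iconv over several files in parallel; the glob pattern and output naming are placeholders:

# Sketch: convert several UCS-2LE files to UTF-8 in parallel by running
# iconv in worker processes. The file pattern and suffix are placeholders.
import glob
import os
import subprocess
from concurrent.futures import ProcessPoolExecutor

def convert(path):
    out_path = path.rsplit(".", 1)[0] + "_utf8.CSV"
    with open(out_path, "wb") as out:
        subprocess.run(["iconv", "-f", "UCS-2LE", "-t", "UTF-8", path],
                       stdout=out, check=True)
    return out_path

if __name__ == "__main__":
    files = glob.glob(r"C:\projects\*.CSV")        # placeholder pattern
    workers = max(1, (os.cpu_count() or 2) - 1)    # cores minus one, as suggested
    with ProcessPoolExecutor(max_workers=workers) as pool:
        for done in pool.map(convert, files):
            print("converted:", done)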
I need to read an XML CLOB column from an Oracle table. I tried a simple read like the one below:
xmlbefore <- dbGetQuery(conn, "select ID, XML_TXT from XML_table")
But I can only read in about 225,000 characters. When I compare with the sample XML file, it only reads in maybe 2/3 or 3/4 of the entire field.
I assume R has a limitation of maybe 225,000 characters, and SAS has even less, only about 1,000 characters.
How can I read in the entire field with all its characters (I think it is about 250,000-270,000)?
SAS data set variables have a 32k character limit, and macro variables 64k. Lua variables in SAS, however, have no limit (other than memory), so you can read your entire XML file into a single variable in one go.
PROC LUA is available from SAS 9.4M3 (check &sysvlong for details). If you have an earlier version of SAS, you can still process your XML by parsing it a single character at a time (RECFM=N).
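A minimal PROC LUA sketch of reading a whole XML file into one Lua variable; the file path is a placeholder, and getting the CLOB out of Oracle into that file is assumed to have happened already:

/* Read an entire XML file into a single Lua variable (no 32k limit). */
proc lua;
submit;
  local f = io.open("C:\\data\\xml_txt.xml", "r")  -- placeholder path
  local xml = f:read("*a")                         -- slurp the whole file at once
  f:close()
  print("characters read: " .. #xml)
  -- from here you can search/parse the string or hand pieces back to SAS
endsubmit;
run;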
I have a comma-delimited data set whose first two rows look like this:
1/13/2010 21:09,3.3
11/30/2010 7:33,7.2
....
To read the data into SAS, I have written the data step below:
data myDataSet;
infile 'sampleData.csv' dlm=',';
input timestamp :mmddyy16. value;
run;
Now that the data is a SAS data set, I try to view it by doing:
data viewData;
set myDataSet;
format timestamp date9.;
run;
proc print data=viewData;
run;
I observed that the timestamp column output only contains the date and not the time. I want the timestamp to be read and displayed in a format like "dd-mm-yyyy HH:MM:SS". How do I ensure that in reading the file the informat is correctly specified and no component of the timestamp is lost?
MMDDYYw. is a date informat, not a datetime informat: it only reads days and larger units and ignores the time. As a result it has a practical maximum length of 10 (although it allows up to 32).
You can use MDYAMPM. to read those two values in, among other informats. That might be the ultimately correct informat, or it might not, depending on the totality of your data; several NLS-specific informats might also work. See the SAS informat documentation for more details.
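A corrected sketch of the data step from the question using MDYAMPM.; the optional picture format at the top is only needed if you want the exact dd-mm-yyyy hh:mm:ss layout rather than the standard DATETIME display:

/* Optional picture format for the requested dd-mm-yyyy hh:mm:ss layout. */
proc format;
    picture dmydt (default=19)
        low-high = '%0d-%0m-%0Y %0H:%0M:%0S' (datatype=datetime);
run;

data myDataSet;
    infile 'sampleData.csv' dlm=',';
    input timestamp :mdyampm16. value;
    format timestamp datetime19.;   /* e.g. 13JAN2010:21:09:00 */
    /* format timestamp dmydt.; */  /* or use the picture format above */
run;

proc print data=myDataSet;
run;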