Read a CSV file that has an indefinite number of columns each time and create a table based on the column names in the CSV file - plsql

I have a requirement to load a CSV file into the database using Oracle APEX or PL/SQL code. The problem is that the files will not always come with the same number of columns or the same column names.
I need to create the table and upload the data dynamically, based on the file name and the data I'm uploading.
For every file, I need to create a new table dynamically and insert the data present in the CSV file.
For example:
File 1:
col1 col2 col3 col4 (NOTE: If I upload File 1, the table should be created dynamically based on the file name, and it should contain the same column names and data as the CSV file, with the column headers taken from the file.)
File 2:
col1 col2 col3 col4 col5
File 3:
col4 col2 col1 col3
Depending on the columns and the file name, I need to create a table for every file upload.
Can we load files like this or not?
If yes, please help me with this.
Regards,
Sachin.

((Where's the PL/SQL code in this solution!!??! Bear with me... the answer is buried in here somewhere... I introduce some considerations and assumptions you will need to think about before going into the task. In the end, you'll find that Oracle APEX actually has a built-in solution that satisfies exactly what you've specified... with some caveats.))
If you are working within the Oracle APEX platform, you will have some advantages. APEX version 4.2 and higher has a page element called "Data Loading". The disadvantage, however, is that the definition of the upload target is fixed, not dynamic: you need to know how your table is structured prior to loading the data.
One approach to overcome this is to build a generic, two-column table as your target, which will serve for all uploads. Column one will be your file name and column two a single CLOB, which will contain the entire data file's contents, including the header row. The "Data Loading" element will give the user the opportunity to verify and select this mapping convention in a couple of clicks.
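A minimal sketch of that generic staging table (the table and column names here are my own placeholders, not anything APEX mandates):

    CREATE TABLE csv_staging (
      file_name     VARCHAR2(255),  -- name of the uploaded file
      file_contents CLOB            -- entire CSV, header row included
    );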
At this point, it's mostly PL/SQL backend work doing the heavy lifting to parse and transform the uploaded data. As for the dynamic table creation, the Oracle package DBMS_SQL allows the execution of DDL commands, which could be the route to creating custom tables.
Alex Poole's comment is important as well: you will need to make some blanket assumption about the data types, or have a provision to give more clues about what kind of data each column contains. Relying on a sample of existing data values is not safe... what if all the values in your upload are null? I recommend perhaps a second header row in the data input with a clue about the type of data for each column, right below the intended header names, maybe: AAAAA for a five-character column, # for a numeric, MM/DD/YYYY for a date with a specific format mask.
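To make the dynamic-DDL idea concrete, here is a minimal sketch, assuming the csv_staging table sketched above and a blanket VARCHAR2 type for every column. EXECUTE IMMEDIATE is used for brevity; DBMS_SQL.PARSE/EXECUTE would work just as well. The file and table names are hard-coded placeholders:

    DECLARE
      l_header VARCHAR2(4000);
      l_ddl    VARCHAR2(4000);
      l_col    VARCHAR2(128);
    BEGIN
      -- Read the header row: everything before the first line break.
      -- (Assumes LF line endings and a header under 4000 characters;
      -- strip CHR(13) first for CRLF files.)
      SELECT SUBSTR(file_contents, 1, INSTR(file_contents, CHR(10)) - 1)
        INTO l_header
        FROM csv_staging
       WHERE file_name = 'File1.csv';

      -- Derive the table name from the file name (hard-coded here).
      l_ddl := 'CREATE TABLE "FILE1" (';

      -- Naive split on commas; does not handle quoted/embedded commas.
      FOR i IN 1 .. REGEXP_COUNT(l_header, ',') + 1 LOOP
        l_col := TRIM(REGEXP_SUBSTR(l_header, '[^,]+', 1, i));
        l_ddl := l_ddl
              || CASE WHEN i > 1 THEN ', ' END
              || DBMS_ASSERT.ENQUOTE_NAME(UPPER(l_col)) -- guards the identifier
              || ' VARCHAR2(4000)';  -- blanket type assumption, per the above
      END LOOP;
      l_ddl := l_ddl || ')';

      EXECUTE IMMEDIATE l_ddl;
    END;
    /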
The easier route:
You will need to give your end user access to a developer-role account on a workspace of your APEX server. It is not as scary as it sounds. With careful instruction and some simple precautions, I have been able to make this work with even the most non-technical of users. The reason for this is that there is a more powerful upload tool found under the following menu item:
SQL Workshop --> Utilities --> Data Workshop
There is a choice under "Data Load" --> "Spreadsheet Data"
The data load tool will automatically do the following:
Accept a CSV formatted file through a browse function on your client machine
Upload the file and parse the first record for the column layout (names)
Allow the user to create a new table from the uploaded file, or to map to an existing one.
For new tables, each column data type can be declared and also a specific numeric/date mask if additional conversion from the uploaded data is necessary.
Delimiter type, optional enclosures (like double quotes), decimal conventions and currency types can also be declared prior to parsing the uploaded file.
Once the user has identified all these mappings and settings, the table is created with the uploaded data. Any errors in record upload are reported immediately afterwards with detailed feedback on the failed records.
A security consideration to note:
You probably do not want to give end users access to your APEX server's backend... but you CAN create a new workspace just for your end users, with a new database schema for receiving their uploads, maybe with some careful resource controls. Developer is the minimum role needed, but even if the end users see the other development tools, they won't have access to anything important from an isolated workspace.
I implemented the isolated-workspace approach on a 4.0/4.1 release APEX platform a few years back, and it worked nicely. Our end user had control over the staging and quality checking of her data inputs (Excel spreadsheet/CSV exports collected from a combination of sources). It may have been even better to cut her out of the picture entirely and automate the export-review-upload process between our database and her other sources. In this case, though, the volume of data involved was not great (hundreds to thousands of records) and manual review and editing of the exported data prior to pushing it into the database was very important... so the human element still mattered. It is something you'll want to think about now.

Related

How to move data between datasets in different regions?

I'm using BigQuery integrated with Firebase and all the datasets are in the same project. My analytics dataset is in us-east4, but for some reason my firebase_imported_segments dataset's region is just marked as US.
I'd like to move data from the analytics dataset into a table in firebase_imported_segments.
At first, I tried a simple INSERT query, but I get the error "firebase_imported_segments was not found in location us-east4".
So then I tried building a SELECT statement and exporting the rows using "Save Results > BigQuery Table", but that gives a similar error that the destination dataset is not found. Oddly enough, if I create a table in firebase_imported_segments and try to save the results using that table name, I get a "Table already exists" error. So it's not that it can't find the firebase_imported_segments dataset; it just won't create a new table in that dataset.
How can I get around this? I saw some BigQuery documentation saying that moving data between regions is possible, but I didn't find a simple walkthrough of how it's accomplished. I'm also confused about why Firebase would put some data in one specific region (us-east4) and other data in a multi-region (US) if they aren't compatible.
You can move datasets using "Copy" in the BigQuery UI and then delete the old dataset. See the Copy dataset documentation.
Option 1: Use the Copy button.
Go to the BigQuery page in the Cloud console.
In the Explorer panel, expand your project and select a dataset.
Expand the More Actions option (triple dot button) and click Open.
Click Copy. In the Copy dataset dialog that appears, do the following:
a. In the Dataset field, either create a new dataset or select an existing dataset ID from the list. Dataset names within a project must be unique. The project and dataset can be in different regions, but not all regions are supported for cross-region dataset copying.
b. In the Location field, the location of the source dataset is displayed.
c. Optional: To overwrite both the data and schema of the destination tables with the source tables, select the Overwrite destination tables checkbox.
d. To copy the dataset, click Copy.
To avoid additional storage costs, consider deleting the old dataset.
Option 2: Use the BigQuery Data Transfer Service.
Enable the BigQuery Data Transfer Service.
Create a transfer for your data source.
I tested this and can confirm that it works. I created a dataset in us-east4 named analytics_us_regional with a table named east_4_table, and copied it to a dataset located in the US multi-region.
(Screenshots: copying the us-east4 dataset to the US dataset; when the copy is initiated, a data transfer job is created; the data is then copied to US.)
With regard to the Firebase data located in us-east4: when the Firebase export to BigQuery is enabled for the first time, the user defines the location of the tables. It's possible that the us-east4 region was selected initially.
I don't know if it will work in your case, but I had a dataset in europe-west1 that I wanted to copy to the EU region. I did it these two ways and both worked:
First way:
1. Click on the dataset you want to copy, then click on "COPY".
2. In the copy menu, under the dataset destination, click on "CREATE NEW DATA SET" and select the destination region you want that dataset to be in. Click on CREATE DATA SET.
3. In the "Copy data set" menu, click on COPY.
4. You will get an error "Cannot create a transfer in REGION_EUROPE_WEST_1 when destination dataset is located in JURISDICTION_EU", but a dataset with no tables will be created in your destination region.
5. Now if you try to copy the source dataset by clicking on COPY and selecting the dataset created in step 4, it will work.
Second way (best way):
1. Open a new query sheet, click on MORE -> Query settings -> Advanced options, uncheck "Automatic location selection" and select the destination region or multi-region you want (in my case, EU).
2. On this query sheet, run "CREATE SCHEMA your_new_dataset_name". This will create the dataset "your_new_dataset_name" in the destination region selected in step 1.
3. Click on the dataset you want to copy and click on "COPY".
4. In the copy menu, under the dataset destination, select the dataset created in step 2, and click on COPY.
Both ways utilize the BigQuery Data Transfer Service under the hood, but you don't need to access the service directly.
In fact, both ways do exactly the same thing: they create an empty destination dataset in the region you want to copy to. Once you have that, the Copy function will work correctly.
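For reference, that empty destination dataset can also be created in a single DDL statement that sets the location explicitly, instead of changing the query settings first (the project and dataset names below are placeholders):

    -- Create the empty destination dataset directly in the target
    -- region/multi-region; the Copy function can then target it.
    CREATE SCHEMA `my_project.your_new_dataset_name`
      OPTIONS (location = 'EU');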

PeopleCode to load from CSV file and split 1 field into multiple columns

I am not familiar with Application Engine or PeopleCode but inherited this project when someone left. It seems simple, but I'm not sure how to approach it.
I have to load a CSV file that has 5 fields. The last field has multiple values separated by commas, and it is qualified with quotes.
file example:
ID , YEAR, VALUE1 , VALUE2, CODE
87778, 2022, processed, none , 100,40
93332, 2022, processed, none , 60
76633, 2022, error , none , 55,35,9
I have created a File Layout definition and set the qualifier, and I can load the file into a staging table, but now I want to split the last column (CODE) into individual codes.
I have created 2 PeopleTools Record definitions with a parent/child relationship:
a parent Record definition with ID, YEAR, VALUE1, VALUE2, and
a child Record definition with ID, YEAR, CODE.
I have found that I can use the PeopleCode Split function to break the CODE column out into an array, with each value in its own element. I'm not sure of the best way to structure the program, though.
Is the staging table necessary?
Or can I use the split function as I read the CSV file in and update the parent/child tables?
Or do I need to keep the staging table, read out the fields for the parent record and move them to the permanent table, and then do the same for the child after using the split function and looping through the array?
Just looking for some guidance so my first AE project is not a mess.
IMO, there are always multiple ways to achieve the same thing (especially in AE); we choose one based on our requirements and efficiency.
Regarding the staging table: in your case, you can skip the staging table unless you expect to load a huge data set every time or want to do parallel processing. In other words, keep a staging table if loading takes a lot of time and you don't want to risk it failing due to other errors.
You can even achieve this whole thing in one PeopleCode action, without a staging table.
Or:
Load the data into the staging table and commit.
Loop through the data from the staging table in AE (holding the current row in the state record).
Do the transformation as required in a PeopleCode action (or in set-based SQL; see the sketch after this list).
Insert the data into the necessary tables.
Update a status field in the staging table (add one for this purpose); it may come in handy for analysis of any issues in production.
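If the database under your PeopleSoft instance is Oracle, the split step can be done in a single set-based SQL action rather than looping over a PeopleCode array. This is only a rough sketch: the record names PS_STG_TBL and PS_CHILD_TBL are hypothetical, and it assumes at most 20 codes per row and no quote characters left in CODE after the File Layout load:

    -- One child row per comma-separated value in CODE.
    -- REGEXP_SUBSTR extracts the n-th value; the CONNECT BY
    -- subquery generates the candidate values of n (1..20).
    INSERT INTO PS_CHILD_TBL (ID, YEAR, CODE)
    SELECT s.ID,
           s.YEAR,
           TRIM(REGEXP_SUBSTR(s.CODE, '[^,]+', 1, n.rn))
      FROM PS_STG_TBL s
      JOIN (SELECT LEVEL AS rn FROM dual CONNECT BY LEVEL <= 20) n
        ON n.rn <= REGEXP_COUNT(s.CODE, ',') + 1;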

How to get all rows from a Page List and convert them to CSV using pxConvertResultsToCSV

I have a repeat grid layout whose source is a report definition. The grid displays twenty rows per page, so if there are thirty-three rows, the data spans multiple pages.
I have been given a task to export all of the grid's data to CSV. I found the pxConvertResultsToCSV activity. It requires passing a page list with the properties to convert. I use pgRepPgSubSectionMySectionListB.pxResults to do this. But I have realized that the pxResults property contains only the first twenty elements of pgRepPgSubSectionMySectionListB, and I must export all the rows to CSV. How can I achieve this? Thank you.
First, run your report by calling the pxRetrieveReportData activity of class Rule-Obj-Report-Definition in your activity.
Syntax: call Rule-Obj-Report-Definition.pxRetrieveReportData
It will ask for parameters:
pyReportName: your report definition name
pyReportClass: the class of the report definition
pyPageName: any page name, for example ReportListExport. This page must be defined in Pages & Classes with the class Code-Pega-List.
After successful execution of this step, you will get ReportListExport.pxResults on the clipboard.
Now use this pxResults for the export.
There is one more activity to export your report to Excel: call the pzViewExportToExcel activity after running your report, and set ReportListExport.pyReportDefinition as the step page of that step.
This is the preferred one.
This question is a bit old now so I'm sure the OP has probably solved the problem and moved on at this point. But for future viewers there is an easier way to solve this.
Pega includes a gadget called the "Record Editor" which can be used to display a report definition as an editable data table. It shows the provided report definition in a simple table as normal, but users can also edit the rows, delete them, and add new ones. It also includes import and export actions at the top, so users can drop the entire result set shown in the table to CSV and then re-import changes back in after editing. You can find more information on this gadget and how to use it in this community article.
If you simply want to provide an option at the top of a table sourced from a report definition that allows users to export the results as CSV without using the record editor gadget there is an API for that as well. The activity "pxDownloadDataRecordsAsCSV" in class "PegaAccel-Task-DataTableEditor" does this. It accepts the class and name of a report definition as parameters, runs that report and serves up the contents as a CSV file.
The second part here isn't too different from AJ's solution; it's just an already existing parameterized activity you can use instead of writing one yourself.

What is the best practice to input a bunch of data in asp.net?

What is the best practice for uploading a bunch of data (multiple rows) at one time? I don't want to upload any files to the server.
Is it a good idea to have a text area for inputting data with a predefined structure (format), and to create a small parser that reads and analyzes that input and inserts it into the database?
Edit:
I have the data set in an Excel file. I want to store it in the database, but I don't want to upload the file to the server.
Data sample :
id fid sid name
--------------------------------------------------------------
1- 3a3458 2a2125 3a4541 John Smith
2- 313547 3a4541 212145 Albert koku
.....................
...............
.........
100- ...
Since the data already has a predefined structure, to avoid manual errors and parsing I would build an interface containing a <table> that the user fills with the corresponding data and then submits to the server.
But if you don't want to go to the trouble of guiding the user, you could of course use a <textarea> in which the user enters the data in some known format (CSV, JSON, XML, ...) and then do the parsing on the server.

Fill a custom Excel template from a SQL Server proc with multiple result sets?

A stored proc (SQL Server 2008) returns multiple result sets.
An Excel (.xls) file with custom formatting - not a generic workbook, not a spreadsheet built on the fly - has particular cells on particular worksheets where I need to correctly "paste" each appropriate result set from the stored proc. The worksheets designated for holding data need to receive the data, and then other worksheets in the workbook will display the data with a high degree of formatting and with charts.
For example:
result set 1 needs to be pasted in a worksheet named 'data01', beginning at cell B2;
result set 2 needs to be pasted in 'data01', beginning at cell K2;
result set 3 needs to be pasted in a worksheet named 'data02', beginning at cell B2...
What are some approaches for tackling this problem in a .NET environment? I have not found examples or lessons which duplicate this scenario.
Update:
Essentially I'm wondering if it's possible to do what SpreadsheetGear does with an Excel template, without having to pay thousands for a third-party tool.
http://www.spreadsheetgear.com/support/samples/excel.aspx
Importing and Exporting Data by Using the SQL Server Import and Export Wizard
Exporting SQL Server Data to Excel (SQL Server Video)
On Zach Hunter's blog (http://zachhunter.net/), he has a number of posts on how to make use of NPOI (http://npoi.codeplex.com/).
In particular, these two articles were invaluable:
"Use NPOI to populate an Excel template"
http://www.zachhunter.net/2010/05/npoi-excel-template/
"Improved NPOI ExportToExcel Function"
http://www.zachhunter.net/2010/06/improved-npoi-exportdatatabletoexcel-function/
