I have a bunch of CSVs that get updated locally on my computer every few days. I want to refresh them in SQLiteStudio, but I can't figure out where to actually do that. Is there an option for this? The only way I've been able to refresh is to fully delete the table and then re-import it under the same name (so the query still works). All of the CSVs and SQLiteStudio are local on my computer; I'm not running anything remote.
The CSV file is not linked in any way with SQLiteStudio. Once you import data into a table, it lives in the table, not in the CSV file. If you want to refresh the contents of a table with data from a CSV file, you need to do exactly what you already do, that is, re-import.
A useful tool to make this repetitive task less clumsy is the import() SQL function built into SQLiteStudio. You can delete the old data and re-import the new data in a single execution:
delete from your_table;
select import('path/to/file.csv', 'CSV', 'your_table', 'UTF-8');
Of course you need to adjust the parameters to your case. There is also a 5th (optional) parameter specifying import options, just like in the Import Dialog (an example of this is shown after the list below). Quoting from the User Manual (https://github.com/pawelsalawa/sqlitestudio/wiki/User_Manual#built-in-sql-functions):
charsets() - returns the list of charsets supported by SQLiteStudio (to be used, for example, in arguments to the import() function).
import_formats() - returns the list of import formats supported by SQLiteStudio (depends on which import plugins are loaded).
import_options(format) - returns the list of currently used import settings for a certain format (the format must be one of the formats returned by import_formats()). Each setting is on a separate line, in the form setting_name=setting_value.
import(file, format, table, charset, options) - executes the import, using file for input and format for choosing the import plugin (must be one of the values returned by import_formats()). The import is done into table. If the table does not exist, it will be created. The charset is optional and must be one of the values returned by charsets() (for example 'UTF-8'); it defaults to UTF-8. The options argument is optional and has to be in the same format as returned by import_options() (one option per line, each line being option_name=value), although it's okay to provide only a subset of options - then the rest of the settings keep their current values.
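For example, you can first inspect which option names the CSV plugin currently uses and then hand a subset of them to import() as the 5th parameter. This is a minimal sketch; 'some_option=value' is just a placeholder, not a real option name - use the names that import_options('CSV') actually prints:

-- see which formats, charsets and CSV options are available
select import_formats();
select charsets();
select import_options('CSV');

-- refresh the table with an explicit charset and an options string
-- ('some_option=value' is a placeholder - copy real names from import_options('CSV'))
delete from your_table;
select import('path/to/file.csv', 'CSV', 'your_table', 'UTF-8', 'some_option=value');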
I'm working on a scenario where I have to compare a data record coming from a file with the data from a table, as a validation check before loading the data file into the staging table. I have come up with a couple of possible solutions that involve changing something within the load mapping, but my team suggested I make the change somewhere that is easy to notice, since this is a non-standard approach.
Is there any approach we can handle within the Workflow Manager, using any of the workflow tasks or session properties?
Create a mapping that reads the file, joins the data with the table, does the required validation, writes nothing out (use a Filter with a FALSE condition), and sets a workflow variable to 0/1 to indicate whether the load should start.
Next, run the loading session if the validation passed.
This can be improved a bit if you want to store the validation errors in some audit table. Then you don't need a variable - the condition can refer to the $PMTargetName#numAffectedRows built-in variable. If it's more than zero - meaning there were some errors - don't start the load.
Create a workflow with a Command task where you write a script that pulls the data from the table using a JDBC connection, compares it with the data present in the file, and then flags whether to load or not.
Based on this command's output, you either go ahead with the staging workflow or you don't.
Use awk for the comparison; it gives you the flexibility to compare date parts within a column.
For your reference: http://www.cs.unibo.it/~renzo/doc/awk/nawkA4.pdf
I have a positional input flat file schema of the following kind:
<Employees>
<Employee>
<Data>
In the mapping, I need to extract the strings on a positional basis to pass on to the target schema.
I have the following conditions:
If Data has 500 records, there should be 5 files of 100 records at the output location.
If Data has 522 records, there should be 6 files (5*100, 1*22 records) at the output location.
I have tried a few suggestions from the internet, like:
Setting "Allow Message Breakup At Infix Root" to "Yes" and setting Max Occurs to "100", as described in "How to Debatch (Split) a Flat File using Flat File Schema?". This doesn't seem to be working.
I'm also working on the custom receive pipeline component suggested in "Split Flat Files into smaller files (on row count) using Custom Pipeline", but I'm quite new to this, so it's taking some time.
Please let me know if there is any simpler way of doing this, without implementing the custom pipeline component.
I'm currently following the approach of dividing the input flat file into multiple smaller files per the condition, writing them to the receive location, and then processing the files with the native flat file disassembler. Please correct me if there is a better approach.
You have two options:
1. Import the flat file into a SQL table using SSIS.
2. Parse the input file as one Message, then map it with a Composite Operation to insert the records into a SQL table. You could also use an Insert Updategram.
After either 1 or 2, call a Stored Procedure to retrieve the Count and Order of messages you need.
A simple way to do this for a flat file structure, without writing custom C# code, is to just use a database table: insert the whole file as records into the table, and then have a Receive Location that polls for records in the batch size you want (a rough SQL sketch of this is below).
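A minimal sketch of that idea, assuming SQL Server and hypothetical table/procedure names (the call to the procedure would go into the Receive Location's adapter polling configuration):

-- Staging table holding every record from the incoming flat file
create table dbo.EmployeeStaging (
    RowID      int identity(1,1) primary key,
    RecordData nvarchar(max) not null,
    Processed  bit not null default 0
);
go

-- Polling procedure: returns at most 100 unprocessed rows per call and
-- marks them as processed, so the next poll picks up the next batch.
create procedure dbo.GetNextEmployeeBatch
as
begin
    set nocount on;
    declare @batch table (RowID int);

    update top (100) dbo.EmployeeStaging
       set Processed = 1
    output inserted.RowID into @batch
     where Processed = 0;

    select s.RowID, s.RecordData
      from dbo.EmployeeStaging s
      join @batch b on b.RowID = s.RowID
       for xml path('Record'), root('EmployeeBatch');
end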
Another approach is the Scatter-Gather pattern: in this case you do set Max Occurs to 1, which debatches the file into individual records, and you then have an Orchestration that re-assembles them into the batch size you want. You will have to read up on Correlation Sets to do this.
When running hive -e "query" > mysheet.xls, I am able to export the output to a new Excel file.
Could you help me with exporting the output of another Hive query into the already-created Excel file, as a different sheet (not overwriting the existing file/data)?
Is this possible with a Hive query? Please help.
The issue here is that you are using the stdout redirect >, which always creates (or overwrites) the file. If you use the appending redirect >> instead, the output will be appended to your current file (it will not create a new sheet in Excel).
That said, your query is probably producing a plain CSV file that Excel can open, not a real Excel workbook.
If you are satisfied with your results, I recommend generating multiple CSV files via a script and then merging them into one big Excel file, either with another script or by using Excel directly to merge the multiple CSV files.
There are multiple ways to merge them with Excel - one possible way is described here.
I've imported a lot of data using the neo4j-import tool, only for it to be useless, since the default data type for every column is string. I'm not able to perform any aggregations on the data. I am able to change the data types afterwards with update commands, but that is a lot of overhead.
Is it possible to specify the data types during the import itself using the neo4j-import tool?
You should be able to use toInt(), toFloat(), and toString(). If you need booleans, you can do a comparison to get a boolean output.
Ah, this is assuming that you have access to Cypher in the import.
If you're using the 2.2.0 (and probably 3.x) import tool, you should be able to define the type in the header, like so: propertyName:int
See the property types part of the 2.2.0 import tool docs, and the CSV header format sections.
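For example, a node file header along these lines (the column names here are purely illustrative) tells the import tool to treat age and score as numbers rather than strings:

personId:ID,name,age:int,score:float,:LABEL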
I need a place to store images. My first thought was to use the database, but many seem to recommend using the filesystem. That seems to fit my case, but how do I implement it?
The filenames need to be unique - how do I do that? Should I use a GUID?
How do I retrieve the files? Should I link directly to the file using its filename, or make an .aspx page, pass either the filename or the primary key as a query string, and then read the file?
What about client-side caching - is that enabled when using a page like image.aspx?id=123?
How do I delete the files when the associated record is deleted?
I guess there are many other things that I still haven't thought about.
Links, samples and guidelines are very welcome!
Uploading Files in ASP.NET 2.0
You seem up in the air about how to do this, so I'll give you a few ideas to get you going.
If you want to use the file system, make sure you know the limit on how many files are permitted per directory, so you can set up a system that creates subdirectories: http://ask-leo.com/is_there_a_limit_to_what_a_single_folder_or_directory_can_hold.html
I would make a table something like this:
FileUpload
    UploadID          int     - identity/auto-generated PK, used as an FK in other tables
    InternalFileName  string
    DisplayFileName   string
    Comment           string
    MimeType          string
    FileSizeBytes     int
    FileHash          string  - MD5 hash of the file
    UploadUserID      int     - FK to UserID
    UploadDate        date
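A minimal SQL sketch of that table, assuming SQL Server (the column names come from the list above; the exact types and lengths are assumptions - adjust them to your needs):

create table FileUpload (
    UploadID         int identity(1,1) primary key, -- used as an FK in other tables
    InternalFileName varchar(260)  not null,
    DisplayFileName  varchar(260)  not null,
    Comment          varchar(1000) null,
    MimeType         varchar(100)  null,
    FileSizeBytes    int           null,
    FileHash         char(32)      null,            -- MD5 hash of the file
    UploadUserID     int           not null,        -- FK to UserID
    UploadDate       datetime      not null
);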
You would include the UploadID in the table that "owns" this upload, like the Order associated with it, etc. You could add an InternalDirectory column, but I prefer to calculate it from a constant root value plus some key-specific value. For example, the complete directory and file name would be:
Constant_Root+'\'+OrderID+'\'+InternalFileName
You could make the InternalFileName the UploadID plus the file extension from the original file. That way it would be unique, but you'll have to insert the row as the first part of the process and update it once you know the file name and the identity/auto-generated row value. Alternatively, you could make the InternalFileName something like YYYYMMDDhhmmssmmm plus the file extension from the original file, which may be unique enough depending on how you use subdirectories.
I like to store an MD5 hash of the file, which makes detecting duplicates easy.
MimeType and FileSizeBytes can be read from the file system, but if you store them in the database, some maintenance becomes easier, since you can query for large files or for files of certain types (see the example queries below).
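For example, a couple of maintenance queries against the FileUpload table sketched above (the size threshold is arbitrary):

-- find duplicate uploads by their MD5 hash
select FileHash, count(*) as Copies
from FileUpload
group by FileHash
having count(*) > 1;

-- find files larger than 10 MB
select UploadID, DisplayFileName, FileSizeBytes
from FileUpload
where FileSizeBytes > 10 * 1024 * 1024
order by FileSizeBytes desc;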