I can't figure out how to use data-table with CLSQL. The CL-CSV repository README mentions that this is possible, but I would really like an example.
For example, if I have a data-table generated from a CSV in *db*, how do I use CLSQL to query its contents? I cannot even get CLSQL to use or connect to *db* as a database.
I'm trying to find a way to extract all task dependencies. The idea is to find all SQL tasks (BigQuery) and all the tables each one depends on. I guess there is some sort of metadata DB for this; another option I can think of is reading the "Render" (rendered template) code and extracting the different entities straight from there, but I can't find any data source which holds that data.
I'm trying to find a relevant data source or access method which holds that info. Any idea how and where I can find it?
Thanks
We have a large dataset and several preprocessing scripts.
These scripts alter data in place.
When I try to register them with dvc run, it complains about cyclic dependencies (the input is the same as the output).
I would assume this is a very common use case.
What is the best practice here?
I tried to Google around, but I did not see any solution to this (besides creating another folder for the output).
Usually, we split input and output into separate files rather than modifying everything in place, not only for the separation-of-concerns principle but also to make the pipeline fit tools like DVC.
I hope you can try this approach instead.
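A minimal sketch of what that can look like with a recent DVC; the stage, script, and file names here are hypothetical:

# Hypothetical stage: reads data/raw.csv and writes data/clean.csv
# instead of rewriting data/raw.csv in place, so the dependency (-d)
# and the output (-o) are distinct files and DVC sees no cycle.
dvc run -n preprocess \
    -d preprocess.py -d data/raw.csv \
    -o data/clean.csv \
    python preprocess.py data/raw.csv data/clean.csv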
I will be creating a structure more or less of the form:
type FileState struct {
    LastModified int64
    Hash         string
    Path         string
}
I want to write these values to a file and read them in on subsequent calls. My initial plan is to read them into a map and lookup values (Hash and LastModified) using the key (Path). Is there a slick way of doing this in Go?
If not, what file format can you recommend? I have read about and experimented with some key/value file stores in previous projects, but not using Go. Right now my requirements are probably fairly simple, so a big database server system would be overkill. I just want something I can write to and read from quickly, easily, and portably (Windows, Mac, Linux). Because I have to deploy on multiple platforms, I am trying to keep my non-Go dependencies to a minimum.
I've considered XML, CSV, JSON. I've briefly looked at the gob package in Go and noticed a BSON package on the Go package dashboard, but I'm not sure if those apply.
My primary goal here is to get up and running quickly, which means the least amount of code I need to write along with ease of deployment.
As long as your entire data set fits in memory, you shouldn't have a problem. Using an in-memory map and writing snapshots to disk regularly (e.g. by using the gob package) is a good idea. The "Practical Go Programming" talk by Andrew Gerrand uses this technique.
If you need to access those files from different programs, using a popular encoding like JSON or CSV is probably a good idea. If you only have to access those files from within Go, I would use the excellent gob package, which has a lot of nice features.
Once your data gets bigger, it is no longer a good idea to write the whole database to disk on every change. Also, your data might not fit into RAM anymore. In that case, you might want to take a look at the leveldb key-value database package by Nigel Tao, another Go developer. It's currently under active development (and not yet usable), but it will also offer some advanced features like transactions and automatic compression. Also, the read/write throughput should be quite good because of the leveldb design.
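A minimal sketch of that snapshot approach with gob, reusing the FileState struct from the question (the file layout and function names are my own):

package filestate

import (
    "encoding/gob"
    "os"
)

// FileState is the struct from the question.
type FileState struct {
    LastModified int64
    Hash         string
    Path         string
}

// saveStates writes the whole map to disk as one gob snapshot.
func saveStates(name string, states map[string]FileState) error {
    f, err := os.Create(name)
    if err != nil {
        return err
    }
    defer f.Close()
    return gob.NewEncoder(f).Encode(states)
}

// loadStates reads a snapshot back; a missing file yields an empty map.
func loadStates(name string) (map[string]FileState, error) {
    states := make(map[string]FileState)
    f, err := os.Open(name)
    if os.IsNotExist(err) {
        return states, nil
    }
    if err != nil {
        return nil, err
    }
    defer f.Close()
    err = gob.NewDecoder(f).Decode(&states)
    return states, err
}

Lookups are then plain map accesses keyed by Path, and you can call saveStates after each batch of changes.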
There's an ordered key-value persistence library for Go that I wrote, called gkvlite:
https://github.com/steveyen/gkvlite
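Basic usage looks roughly like this (a sketch from memory of the README, with error handling omitted; check the repository for the authoritative API):

f, _ := os.Create("/tmp/test.gkvlite")
s, _ := gkvlite.NewStore(f)
c := s.SetCollection("files", nil)
c.Set([]byte("/some/path"), []byte("hash+mtime"))
s.Flush()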
JSON is very simple, but it makes bigger files because of the repeated field names. XML has no advantage here. You should go with CSV, which is really simple too. Your program will be less than a page of code.
But it depends, in fact, on your modifications. If you make a lot of modifications and must have them stored synchronously on disk, you may need something a little more complex than a single file. If your map is mainly read-only, or if you can afford to dump it to file only rarely (not every second), a single CSV file alongside an in-memory map will keep things simple and efficient.
BTW, use Go's encoding/csv package to do this.
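A minimal sketch of that, again reusing the FileState struct from the question (the path,hash,lastmodified column order is my own choice):

package filestate

import (
    "encoding/csv"
    "os"
    "strconv"
)

// writeCSV dumps the map as one path,hash,lastmodified row per file.
// FileState is the struct from the question.
func writeCSV(name string, states map[string]FileState) error {
    f, err := os.Create(name)
    if err != nil {
        return err
    }
    defer f.Close()
    w := csv.NewWriter(f)
    for _, st := range states {
        rec := []string{st.Path, st.Hash, strconv.FormatInt(st.LastModified, 10)}
        if err := w.Write(rec); err != nil {
            return err
        }
    }
    w.Flush()
    return w.Error()
}

// readCSV rebuilds the map, keyed by path, from such a file.
func readCSV(name string) (map[string]FileState, error) {
    f, err := os.Open(name)
    if err != nil {
        return nil, err
    }
    defer f.Close()
    records, err := csv.NewReader(f).ReadAll()
    if err != nil {
        return nil, err
    }
    states := make(map[string]FileState, len(records))
    for _, rec := range records {
        mod, err := strconv.ParseInt(rec[2], 10, 64)
        if err != nil {
            return nil, err
        }
        states[rec[0]] = FileState{Path: rec[0], Hash: rec[1], LastModified: mod}
    }
    return states, nil
}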
I've got a bunch of audio files (let's say ogg or mp3), with metadata.
I wish to read their metadata into R so to create a data.frame with:
file name
file location
file artist
file album
etc
Do you know of any way to do that?
You take an existing mp3 or ogg client, look at what library it uses, and then write a binding for said library to R, using the existing client as a guide for that side -- and something like Rcpp as a guide on the other side to show you how to connect C/C++ libraries to R.
No magic bullet.
A cheaper and less reliable way is to use a cmdline tool that does what you want and write little helper functions that use system() to run that tool over the file, re-reading the output in R. Not pretty, not reliable, but possibly less challenging.
Possible, yes; easy, no.
You "could" use a combination of readChar and/or readBin on the file and parse out the contents. This would be highly dependent, though, on parsing the frame tags from the raw bytes of the ID3v2 tag (and mind you it would change if it was a version 1 tag). If would certainly be a lot of work to implement a straight R solution. Take this Python code for example, it's very clean straight python code but a lot of branching and parsing.
You can use exiftool with the system command available in R. Optionally, you can create regexps to handle the fields you need... If I were you, I'd stick with Dirk's advice (as usual) =)!
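To illustrate the shell-out idea in code (sketched in Go to match the code language used elsewhere in this document; in R the equivalent starting point is system("exiftool -j song.mp3", intern = TRUE), and the file name here is hypothetical):

package main

import (
    "encoding/json"
    "fmt"
    "os/exec"
)

func main() {
    // Run exiftool once per file; -j makes it emit JSON.
    out, err := exec.Command("exiftool", "-j", "song.mp3").Output()
    if err != nil {
        panic(err)
    }
    // exiftool -j prints a JSON array with one object per file;
    // pick out the fields you would put in the data.frame.
    var tags []map[string]interface{}
    if err := json.Unmarshal(out, &tags); err != nil {
        panic(err)
    }
    fmt.Println(tags[0]["Artist"], tags[0]["Album"])
}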
Out here in 2021, I wanted to do this so I did the following...
Create a new playlist while in 'Songs' view.
Select all songs and drag them to the new playlist. Highlight that playlist.
File > Library > Export Playlist. The default for me was to save as .txt; if not, choose that.
Open Excel to save as CSV, or use read.delim() in R, since the .txt file is tab-separated.
Import into R.
What is the SQLite query to detect if the FTS3 extension module is installed? Or is it possible to get a list of installed extensions with an SQLite3 query? It has to work with pysqlite2.
I know that I can get the list of tables using SELECT * FROM sqlite_master, I'd like to get something similar for the list of extensions. I also know that CREATE VIRTUAL TABLE v USING FTS3 (t TEXT) succeeds iff FTS3 is installed, but I'd like to get a query without side effects (not even creating a temporary table).
As a workaround I have opened the ":memory:" database, and issued the CREATE VIRTUAL TABLE command above.
There is no way to do it in SQLite at the moment; it forgets what it has loaded and cannot report it even if it wanted to (I checked the source of the code that does the loading, and the critical information describing what is loaded is simply not stored). It is known (see the Wish List at the bottom of that page) that it would be good to retain this information, but it does not appear to be retained as yet.
As a consequence, the only thing you can do is your current workaround – trying it and seeing if it works. Sorry I can't offer anything else.
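For illustration, here is that probe pattern sketched in Go with the mattn/go-sqlite3 driver (my choice of driver; the question needs pysqlite2, where the very same CREATE VIRTUAL TABLE probe against a ":memory:" database applies):

package main

import (
    "database/sql"
    "fmt"

    _ "github.com/mattn/go-sqlite3"
)

// hasFTS3 probes a throwaway in-memory database, so nothing is
// created in any real database file.
func hasFTS3() (bool, error) {
    db, err := sql.Open("sqlite3", ":memory:")
    if err != nil {
        return false, err
    }
    defer db.Close()
    // The statement succeeds only if FTS3 was compiled in.
    _, err = db.Exec("CREATE VIRTUAL TABLE v USING fts3(t TEXT)")
    return err == nil, nil
}

func main() {
    ok, err := hasFTS3()
    if err != nil {
        panic(err)
    }
    fmt.Println("FTS3 available:", ok)
}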