Importing and querying a mongodb bson file from external harddrive - r

I am new to mongodb. I have a bson file(collect.bson) and it is on my external hard drive, very large about 200 GB and I want to run a query and i want to do that from my terminal. Do I need to create a database first in order to do that? Considering the fact that it is such a large file I don't how much space it will consume. I installed mongobd from my terminal and I was curious how I can proceed to extract the attributes and columns into a csv/R? Please suggest.
Thanks

Related

Load multiple files through SQL Loader

I have a requirement to load multiple files received from two source systems into 1 table using SQL Loader.
To make this possible, I want to understand the following --
Pros and Cons of integrating multiple files like this ? - Need this to compare the merging option at source or via SQL Loader
Any other way of interfacing the data from .CSV files except SQL Loader in Oracle for multiple files ? -- I don't think so but still need expert's confirmation
What are the things I need to mindful about? -
Ex- file format and the header sequence should be same for all the files.
Thanks in Advance.

SQLite Multiple Attached rather than single large database

I'm going to end-up with a rather large database CubeSQLite in the cloud and cloned on the local machine. In my current databases I already have 185 tables and growing. I store them in 6 SQLite databases and begin by attaching them together Using the ATTACH DATABASE command. There are views that point to information in other databases and, as a result, Navicat won't open the SQLite tables individually. It finds them to be corrupted, although they are not and are working fine.
My actual question is this:
Considering the potential size of the files, is it better/faster/slower to do it this way or to put them all into one really large SQLite DB?

Should we store data in database?

i'm asp.net beginner and currently working in "upload download file" project with asp.net and vb.net as code behind language (like skydrive's web).
what i'm want ask is about upload file in server, must we store path file, size, accessed or created date into database? as we know we can use directory listing in system.io.
Thanks for your help.
You definetly want to store the path of the file. You want a way to find the file ;) Maybe later you will have multiple servers, replication or other fancy things.
For the rest, it depends a bit on the type of website. If it's going to get high traffic then store it in the database, this will limit the number of IO call (very slow). Also, it'll be a lot easier to handle sorting and queries. (sort by date, pull only the read onyl files, ...).
Database will also help if you want to show history or statistique.
You can save file in some directory and can save path of that file in database. You can also store size and created date of that file in DB. But storing a file in DB is a bit difficult. Rather than save file in Directory and save path of that file in DB
you could store the file information in a database to built some extra features like "avoid storing duplicate files", because you are having a faster search in the database! if you search the filesystem always a recursive function call get started

copying sqlite3 db while being read

I have a script that was reading data from a sqlite3 database and while this script was running I made a copy of the database cp mydatabase mydatabase.bak. Will this affect either the script that was reading from the db or the copy of the db? I had a look at the sqlite documentation here [0] but I didn't put a lock on the db as per the instructions.
[0] http://www.sqlite.org/backup.html
Copying the file should be analogous to another application reading the database, so it shouldn't be a problem. Multiple applications can safely read the database file at the same time (per the SQLite FAQ).
As another point, consider that you can read from a database even if the database and its directory both lack write permissions. Since in that scenario there's no way for the reading application to be modifying the database file or creating a temp file that needs to be incorporated into it, there's no way for any of a number of simultaneously reading applications to affect what any of the others see.

need help in choosing the right tool

I have a client who has set-up a testing environment in some AI language. It basically runs some predefined test cases and stores the results in as log files (comma separated txt files). My job is to identify and suggest a reporting system and I have these options in mind. either
1. Importing the logs into MSSQL and use the reporting(SSRS) it uses
2. or us import the logs to MySQL and use PHP to develop custom reporting.
I am thinking that going with option2 is better. The reason for this is, the logs are inconsistent and contain unexpected wild characters that normally DB's don't accept. So, I can write some scripts in php before loading them to the database.
Can anyone please suggest if this is your problem what will you suggest to do?
It depends how fancy you need to be. If the data is in CSV files, you could even go so simple as to load it into Excel (or their favorite spreadsheet tool), and use spreadsheet macros to analyze it.

Resources