I have an MS Access database in which the Saved Imports list on the External Data tab contains import jobs that pull data from various locations into certain tables. I cannot tell which tables these jobs actually import into, because the names given to the imports are unclear and unrelated. Is there any way to find out which table each import actually loads its data into?
The items that appear when you click "Saved Imports" on the "External Data" tab are stored as ImportExportSpecification objects in the CurrentProject.ImportExportSpecifications collection. Each object has a .Name property and an .XML property (among others). The details of the import operation are in the XML data, for example
<?xml version="1.0"?>
<ImportExportSpecification Path="C:\Users\Public\zzz.csv" xmlns="urn:www.microsoft.com/office/access/imexspec">
    <ImportText TextFormat="Delimited" FirstRowHasNames="false" FieldDelimiter="," TextDelimiter="" CodePage="437" Destination="MyNewTable">
        <DateFormat DateOrder="YMD" DateDelimiter="-" TimeDelimiter=":" FourYearDates="true" DatesLeadingZeros="false"/>
        <NumberFormat DecimalSymbol="."/>
        <Columns PrimaryKey="id">
            <Column Name="Col1" FieldName="id" Indexed="YESDUPLICATES" SkipColumn="false" DataType="Long" Width="2"/>
            <Column Name="Col2" FieldName="textfield" Indexed="NO" SkipColumn="false" DataType="Text" Width="4"/>
        </Columns>
    </ImportText>
</ImportExportSpecification>
The Path= attribute of the <ImportExportSpecification> element indicates the location of the file to be imported.
The Destination= attribute of the <ImportText> element specifies the name of the table into which the data will be imported.
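If it helps, here is a minimal VBA sketch (an illustration, not a definitive implementation; it uses late-bound MSXML so no extra references are needed, and it assumes the specification XML carries a Destination attribute as in the example above) that loops through the saved imports and prints each one's name together with its destination table:
Public Sub ListSavedImportDestinations()
    ' List every saved import/export specification and the table it targets.
    Dim spec As ImportExportSpecification
    Dim xmlDoc As Object
    Dim destNode As Object
    For Each spec In CurrentProject.ImportExportSpecifications
        Set xmlDoc = CreateObject("MSXML2.DOMDocument.6.0")
        xmlDoc.async = False
        xmlDoc.LoadXML spec.XML
        ' The Destination attribute sits on the element describing the operation
        ' (ImportText in the example above), so look for it anywhere in the XML.
        Set destNode = xmlDoc.SelectSingleNode("//@Destination")
        If destNode Is Nothing Then
            Debug.Print spec.Name & " -> (no Destination attribute found)"
        Else
            Debug.Print spec.Name & " -> " & destNode.Text
        End If
    Next spec
End Sub
The output appears in the Immediate window (Ctrl+G), one "name -> table" line per saved import.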
I am trying to import two JSON files into the same location in my database, but once I import the second file it overwrites the first one:
(screenshot: the database after the first import)
(screenshot: the database after the second import)
As the screenshots showed, the second import replaced the data from the first file. How can I import multiple JSON files into the same level of the database?
Assuming that you import the data using the console, the data in the JSON file replaces whatever currently exists at the location where you run the import. There is no way to change this behavior.
What you can do is import the data to a different location in the console. So if you open the recipes node to import the first JSON, and open the searches node for the second JSON, the two imports won't overwrite each other.
If you want to import them into the root of the database, you'll have to merge the two JSON files yourself and then import them in one go.
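For illustration, a merged file could look something like this (the keys and values here are purely hypothetical, assuming the two original files held a recipes object and a searches object):
{
  "recipes": {
    "recipe1": { "title": "Pancakes" }
  },
  "searches": {
    "search1": { "term": "pancakes" }
  }
}
Importing that single file at the root then creates both top-level nodes in one go.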
Reading the docs (http://exist-db.org/exist/apps/doc/indexing.xml), I'm finding it difficult to understand how, and whether, I can improve the performance of a 'read' query (with two parameters: a string and an integer).
Does eXist-db have a default structural index? Can I speed up a two-parameter query with a range index?
More details about my XML db (note that these are two different databases simply merged under the same root):
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<db>
    <docs>
        <doc>
            <header>
                <year>2001</year>
                <number>1</number>
                <type>O</type>
            </header>
            <metas>
                <meta>
                    <number>26001</number>
                    <details>
                        <detail>
                            <description>legge</description>
                            <number>19</number>
                            <date>14/01/1994</date>
                        </detail>
                        <detail>
                            <description>decreto legge</description>
                            <number>453</number>
                            <date>15/11/1993</date>
                        </detail>
                    </details>
                </meta>
            </metas>
        </doc>
        <doc>
            <header>
                <year>2001</year>
                <number>2</number>
                <type>O</type>
            </header>
            <metas>
                <meta>
                    <number>26002</number>
                    <details>
                        <detail>
                            <description>decreto legislativo</description>
                            <number>29</number>
                            <date>03/02/1993</date>
                        </detail>
                    </details>
                </meta>
                <meta>
                    <number>26016</number>
                    <details>
                        <detail>
                            <description>decreto legislativo</description>
                            <number>29</number>
                            <date>03/02/1993</date>
                        </detail>
                    </details>
                </meta>
            </metas>
        </doc>
    </docs>
    <full_text_docs>
        <doc>
            <header>
                <year>2001</year>
                <number>1</number>
                <type>O</type>
                <president>ferrari</president>
            </header>
            <text>lorem ipsum ...
            </text>
        </doc>
        <doc>
            <header>
                <year>2001</year>
                <number>2</number>
                <type>O</type>
                <president>ferrari</president>
            </header>
            <text>lorem ipsum......
            </text>
        </doc>
    </full_text_docs>
</db>
This is my XQuery:
xquery version "3.0";
let $doc := doc("/db//index_test/test_general.xml")//db/docs/doc
let $fulltxt := doc("/db//index_test/test_general.xml")//db/full_text_docs/doc
return <root> {
for $a in $doc[metas/meta/details/detail[date="03/02/1993" and number = "29"]]/header
return $fulltxt[header/year/text()=$a/year/text() and
header/number/text()=$a/number/text() and
header/type/text()=$a/type/text()
]
} </root>
Basically, I look up the detail/number and detail/date that match the input in the first db and use the results to query the second db. The results are all the matching <doc> elements from <full_text_docs>.
I would like to know whether I can create indexes on the number and date fields to improve performance. Note that this is the ONLY query I need to optimize (the only one I run on this db); obviously the number and date values change. :)
SOLUTION:
For a clear explanation, read joewiz's answer. My problem was getting the .xconf file recognized correctly. It has to be placed in /db/yourcollectiondir. If you're using eXide, when you create the file you should select the XML type with the "eXist-db collection configuration" template. When you try to save the file you will see the prompt "Apply configuration?"; click OK. Only then run this XQuery: xmldb:reindex('/db/yourcollectiondir').
Now, if everything is right, when you run an XQuery that uses an index you will see its usage under "Monitoring and profiling".
As that documentation page states, eXist does create a structural index for all XML stored in the database. This is not an index of values, though, so without further indexes, queries based on value (rather than structure) would involve a lookup of values in the DOM. As your data grows larger, looking up values in the DOM gets slower and slower. This is where value-based indexes, such as the range index, save the day. (For a fuller explanation, see the "Indexing" section of Wolfgang Meier's "Tuning the Database" article, which is essential for getting the most performance out of eXist.)
So, yes, you can create indexes for the <number> and <date> fields. I'd recommend the "new range" index, as described on that documentation page. Your collection.xconf file setting up these indexes would look like this:
<collection xmlns="http://exist-db.org/collection-config/1.0"
            xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <index>
        <range>
            <create qname="number" type="xs:integer"/>
            <create qname="date" type="xs:string"/>
        </range>
    </index>
</collection>
You have to store this within the /db/system/config/ collection, in a subcollection corresponding to the location of your data in the database. So if your data is located in /db/apps/myapp/data, you would place this collection.xconf file in /db/system/config/db/apps/myapp/data.
Note that the configuration here would only affect the for clause's queries of date and number values, and not the predicates in the return clause, which depend on the values of the <year> and <type> elements. So, to ensure your query maximizes the use of indexes, you should declare indexes on these as well (see the entries sketched below); xs:integer looks like the appropriate type for <year>, and xs:string for <type>.
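For example, the <range> element shown above could be extended with these two additional entries:
            <create qname="year" type="xs:integer"/>
            <create qname="type" type="xs:string"/>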
Lastly, I would suggest eliminating the /text() steps, which are completely extraneous. For more on the use/abuse of text(), see Evan Lenz's article, "text() is a code smell".
Update (2016-07-17): With the updated code sample above, I have a couple of additional suggestions. First, since the data lives in /db/index_test (the query reads /db/index_test/test_general.xml), the collection.xconf file goes in the corresponding configuration collection, /db/system/config/db/index_test.
Assuming you're using eXide, when you store the collection.xconf file in a collection, eXide will prompt you to have a copy of the file placed in the correct location in /db/system/config. If you're not using eXide, you need to store the collection.xconf file there yourself.
Using the unmodified query, I can confirm that despite the presence of the collection.xconf file, monex (the Monitoring and Profiling app) shows that no indexes are being applied.
Let's make a few modifications to the query to ensure the indexes are properly applied:
xquery version "3.0";
<root> {
for $a in doc("/db/index_test/test_general.xml")//detail[date = "03/02/1993" and number = 29]/ancestor::doc/header
return
doc("/db/index_test/test_general.xml")/db/full_text_docs/doc
[
header/year = $a/year and
header/number = $a/number and
header/type = $a/type
]
} </root>
With these modifications, monex shows that indexes are applied to the comparisons in the for clause.
The insights here are derived from the "Tuning the Database" article. To get full indexing for all comparisons, you will need to define additional indexes and may need to make similar modifications to your query.
One final note: the version of monex behind the profiling results described above is using a feature I added this weekend, called "Tare", which tries to filter out other operations from the query profiling results in order to help the user see just the effects of their own query. This feature is still just a pull request, so if you are running the current release version you won't see identical results.
I just installed DSpace 5.4 and I am trying to move a collection from Greenstone to DSpace.
I successfully exported the collection from Greenstone, but when I try to load it into DSpace via batch import (ZIP) I get the following error:
Notice
Import failed
/dspace/imports/New Folder.zip/New Folder/exported_DSpace/dublin_core.xml (No such file or directory)
Can anyone tell me what have I missed?
We do not have a great deal of information to go on from your question, such as how you did the export from Greenstone. From what I can tell, it seems possible that you did not export the data in the correct format for DSpace.
The structure should follow DSpace's Simple Archive Format:
archive_directory/
    item_000/
        dublin_core.xml -- qualified Dublin Core metadata for metadata fields belonging to the dc schema
        metadata_[prefix].xml -- metadata in another schema, the prefix is the name of the schema as registered with the metadata registry
        contents -- text file containing one line per filename
        file_1.doc -- files to be added as bitstreams to the item
        file_2.pdf
    item_001/
        dublin_core.xml
        contents
        file_1.png
    ...
It seems Greenstone can export a collection in a format suitable for DSpace; the steps are described in the Greenstone documentation on exporting to DSpace.
For more information on what the structure should look like when importing data into DSpace, see the DSpace documentation on importing items (the Simple Archive Format).
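For illustration only, a minimal dublin_core.xml for one item might look like this (the metadata values are made up; substitute your own):
<dublin_core>
    <dcvalue element="title" qualifier="none">Sample item title</dcvalue>
    <dcvalue element="contributor" qualifier="author">Doe, Jane</dcvalue>
    <dcvalue element="date" qualifier="issued">2015-01-01</dcvalue>
</dublin_core>
The accompanying contents file just lists the bitstream filenames, one per line (for example, file_1.pdf), and the whole archive_directory tree is what gets zipped for the batch import.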
I have to parse a CSV flat file containing only line-item data, with no recognisable header record, something like this:
930001,14-02-2013,100.00,1,Line 1,2,10.00,20.00
930001,14-02-2013,100.00,2,Line 2,2,20.00,40.00
930001,14-02-2013,100.00,3,Line 3,1,40.00,40.00
930002,13-02-2013,200.00,1,Line 1,10,10.00,100.00
930002,13-02-2013,200.00,2,Line 2,5,20.00,100.00
930003,14-02-2013,100.00,1,Line 1,3,20.00,60.00
930003,14-02-2013,100.00,2,Line 2,2,20.00,40.00
Where the fields are, in order:
Order No,Order Date,Order Amt,Line No,Line Desc,Line Qty,Unit Price,Line Price
I want to use the BizTalk Flat File receive pipeline to transform this into a hierarchical schema, grouping on the first field, the Order No:
Order_Batch
+ Order
+ OrderLine
Is there a way to perform the grouping as part of the flat-file receive, so that, in the above instance, the first 3 lines (Order No = 930001) come out grouped under a single Order, like this:
<OrderBatch>
    <Order>
        <OrderLine>
            <OrderNo>930001</OrderNo>
            <other_fields />
            <LineNo>1</LineNo>
            <other_fields_etc />
        </OrderLine>
        <OrderLine>
            <OrderNo>930001</OrderNo>
            <other_fields />
            <LineNo>2</LineNo>
            <other_fields_etc />
        </OrderLine>
        <OrderLine>
            <OrderNo>930001</OrderNo>
            <other_fields />
            <LineNo>3</LineNo>
            <other_fields_etc />
        </OrderLine>
    </Order>
    <Order> ... Details of Order 930002 ... </Order>
    <Order> ... Details of Order 930003 ... </Order>
</OrderBatch>
The only option I currently see available to me is to accept the entire file as a set of OrderLine records, un-batched, then perform the batching using the Gather pattern in another Orchestration. I would prefer to Keep It Seriously Simple.
Use a map to translate from flat to hierarchical:
Create a schema for your flat file using the flat file schema wizard
Use a pipeline and the flat file disassembler to get the input message
Create a schema for your desired output xml
Create a map to transform the flat file message to the desired output message
I believe that by using custom XSL in the map you can do the grouping; a sketch follows.
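Here is a minimal sketch of such a grouping stylesheet, using the Muenchian method in XSLT 1.0 (the version BizTalk maps execute). The element names (OrderLines/OrderLine, OrderNo, and so on) are assumptions based on the flattened schema described in the question; adjust them to your actual flat-file and destination schemas:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- index every flattened OrderLine record by its OrderNo -->
    <xsl:key name="linesByOrder" match="OrderLine" use="OrderNo"/>

    <xsl:template match="/OrderLines">
        <OrderBatch>
            <!-- visit only the first record of each OrderNo group -->
            <xsl:for-each select="OrderLine[generate-id() = generate-id(key('linesByOrder', OrderNo)[1])]">
                <Order>
                    <!-- emit every line belonging to this order -->
                    <xsl:for-each select="key('linesByOrder', OrderNo)">
                        <OrderLine>
                            <xsl:copy-of select="*"/>
                        </OrderLine>
                    </xsl:for-each>
                </Order>
            </xsl:for-each>
        </OrderBatch>
    </xsl:template>
</xsl:stylesheet>
In a BizTalk map you could plug this in via the map's Custom XSL Path grid property (or reproduce the same logic with functoids), keeping the receive pipeline itself a plain flat-file disassemble.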
I have created a Configuration Section Designer project to represent nodes of a custom section that my web application needs to read and save. I am able to successfully create instances of the configuration elements and collections; however, when I save the configuration using the referenced System.Configuration.Configuration object and call Save, the elements get merged into their parents as attributes. An example of the issue is outlined below.
After calling Save on the referenced Configuration object, the output is as follows:
<savedReports xmlns="SavedReportSchema.xsd">
    <resultsSets dataViewId="1" id="4203bb88-b0c4-4d57-8708-18e48f0a1d2d">
        <selects keyId="1" sortOrder="1" />
    </resultsSets>
</savedReports>
As defined in my Configuration Section Designer project (and confirmed by the resulting XSD), the output should match the following:
<savedReports xmlns="SavedReportSchema.xsd">
    <resultsSets>
        <savedReport id="1">
            <selects>
                <select keyId="1" sortOrder="1"/>
            </selects>
        </savedReport>
    </resultsSets>
</savedReports>
Any ideas? The element collection types are set to BasicMapAlternate; when I set them to AddRemoveClearMapAlternate instead, they are no longer merged, but the child elements are named "add" rather than "select" or "savedReport", which throws the validation off.
It turns out AddRemoveClearMapAlternate was the option I needed to correct the problem described in the question.