I am trying to use the Guava type HashMultimap in one of my entity objects:
@Persistent
private HashMultimap<String, String> testMap;
The JDO metadata is:
<field name="testMap" persistence-modifier="persistent" serialized="false">
<join />
</field>
However, when I run SchemaTool with DataNucleus, the field is created as a MEDIUMBLOB. My question is whether I can force this to be created as a join table.
What I am after is basically a map data type that can support duplicate keys.
I am using MySQL as the datastore.
Thanks
Andy from DataNucleus
HashMultimap is not a supported type out of the box, so it falls back to being persisted serialized (hence the MEDIUMBLOB).
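Since JDO does support Set fields of persistable classes (stored in their own table with an FK back to the owner), one workaround is to model the multimap as a set of key/value pair objects, so duplicate keys are naturally allowed. This is only a sketch with illustrative names, not tested against DataNucleus:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch: each key/value pair is its own element, so duplicate keys are
// allowed. In the entity, the Set field would be marked @Persistent and
// DataNucleus would map the pairs to their own table. Names are illustrative.
class Pair {
    final String key;
    final String value;

    Pair(String key, String value) {
        this.key = key;
        this.value = value;
    }
}

class EntityWithMultimap {
    private final Set<Pair> testMap = new HashSet<>();

    void put(String key, String value) {
        testMap.add(new Pair(key, value));
    }

    List<String> get(String key) {
        List<String> result = new ArrayList<>();
        for (Pair p : testMap) {
            if (p.key.equals(key)) {
                result.add(p.value);
            }
        }
        return result;
    }
}
```

In memory this behaves like a multimap; whether the resulting schema suits your needs would still have to be verified against DataNucleus.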
I have a variable called StartDateTime, data type DateTime, expression @[System::StartTime]. In an Execute SQL Task, I call a stored procedure with an input parameter of type DateTime and map the variable @[User::StartDateTime] to the parameter data type DBTIMESTAMP.
I get the error
invalid time format
and do not understand why.
ALTER PROCEDURE [dbo].[spSSISInsControl]
@SourceTableName VARCHAR(100),
@PackageName VARCHAR(100),
@DateProcessed DATETIME,
@RowsTotal INT,
@RowsLoaded INT,
@RowsNoMatch INT,
@RowsDeleted INT = 0,
@SourceFileName VARCHAR(100) = NULL
In your question, you specifically mention the data type DBTIMESTAMP; that's not what you want. Instead, specify DATE. I know, it sounds like it's just the date part, but this is the hell of SSIS's three different type systems.
Setup
Reducing your problem down to the fewest moving parts, I created the following stored procedure.
CREATE PROCEDURE
dbo.so_36208937
(
@StartDateTime datetime = '2013-01-01'
)
AS
BEGIN
SELECT #StartDateTime AS StartDateTime;
END
GO
My package is very basic
I have the execute sql tasks configured like this
Since I have the same proc running twice, I put the statement into a variable called QueryProcOleOdbc with a value of EXECUTE dbo.so_36208937 ?;. As we are using an OLE DB connection manager, we use the ? to specify where a parameter goes.
I run the Execute SQL Task twice, once with a static date value and once with a date value based on @[System::StartTime]. Both work fine.
Biml
Biml, the Business Intelligence Markup Language, is a way of describing SSIS packages using XML. The free add-on BIDS Helper converts Biml to SSIS packages.
Fix the third line to have the connection string point to a valid location.
This Biml describes an SSIS package that has two variables: StartDateTime and StaticStartDateTime. Both are of data type DateTime. The former has an expression set to @[System::StartTime]; the latter is static.
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<OleDbConnection Name="CM_OLE" ConnectionString="Data Source=localhost\dev2014;Initial Catalog=tempdb;Provider=SQLNCLI11.0;Integrated Security=SSPI;"/>
</Connections>
<Packages>
<Package Name="so_36208937" ConstraintMode="Linear">
<Variables>
<Variable DataType="DateTime" Name="StartDateTime" EvaluateAsExpression="true">#[System::StartTime]</Variable>
<Variable DataType="DateTime" Name="StaticStartDateTime">2016-03-24 13:14:15.678</Variable>
<Variable DataType="String" Name="QueryProcOleOdbc">EXECUTE dbo.so_36208937 ?;</Variable>
<Variable DataType="DateTime" Name="result">2015-01-01</Variable>
</Variables>
<Tasks>
<ExecuteSQL
ConnectionName="CM_OLE"
Name="SQL Use static variable">
<VariableInput VariableName="User.QueryProcOleOdbc" />
<Parameters>
<Parameter DataType="DateTime" VariableName="User.StaticStartDateTime" Name="0" />
</Parameters>
</ExecuteSQL>
<ExecuteSQL
ConnectionName="CM_OLE"
Name="SQL Use dynamic value">
<VariableInput VariableName="User.QueryProcOleOdbc" />
<Parameters>
<Parameter DataType="DateTime" VariableName="User.StartDateTime" Name="0" />
</Parameters>
</ExecuteSQL>
</Tasks>
</Package>
</Packages>
</Biml>
The data type datetime is not the same as timestamp. I think that is your problem. Here's a link to more info: link
I have set the DeletedFlag column as NOT NULL in the database. I have also changed the Conceptual Model and Storage Model accordingly. Both have the column definition as
<Property Name="DeletedFlag" Type="Boolean" Nullable="false" DefaultValue="false"/>
Still, EF is trying to insert NULL into this field, hence throwing the exception:
Cannot insert the value NULL into column 'DeletedFlag', table
'xxxxx.dbo.TblUsers'; column does not allow nulls. INSERT fails.
PS: I can't update my model from the database because it will mess up many things.
I am new to Flyway. Flyway is good and friendly.
I want to create a schema and use that schema name for my tables in the V1__Initial_structure.sql file, but I don't know where to assign a value to the placeholder. I have configured Flyway programmatically. My SQL file contains:
create schema ${schemaName};
create table ${schemaName}.brand(brand_code identity,
brand_name varchar(50) unique not null, active char(1) default 'Y');
Please help.
This is the method you are looking for: http://flywaydb.org/documentation/javadoc/com/googlecode/flyway/core/Flyway.html#setPlaceholders(java.util.Map)
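A minimal sketch of how that might look with programmatic configuration; the datasource settings and schema name here are assumptions, not from the question:

```java
import java.util.HashMap;
import java.util.Map;

import com.googlecode.flyway.core.Flyway;

// Sketch: supply a value for the ${schemaName} placeholder before migrating.
public class MigrationRunner {
    public static void main(String[] args) {
        Flyway flyway = new Flyway();
        // Replace with your real connection settings (assumed values here).
        flyway.setDataSource("jdbc:h2:mem:test", "sa", "");

        Map<String, String> placeholders = new HashMap<String, String>();
        placeholders.put("schemaName", "MYAPP"); // substituted for ${schemaName}
        flyway.setPlaceholders(placeholders);

        flyway.migrate();
    }
}
```

Flyway replaces every `${schemaName}` occurrence in the migration scripts with the map value before executing them.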
I'm experimenting with moving a JDBC webapp to JDO DataNucleus 2.1.1.
Assume I have some classes that look something like this:
public class Position {
private Integer id;
private String title;
}
public class Employee {
private Integer id;
private String name;
private Position position;
}
The contents of the Position SQL table really don't change very often. Using JDBC, I read the entire table into memory (with the ability to refresh periodically or on demand). Then, when I read an Employee into memory, I simply retrieve the position ID from the Employee table and use it to obtain the in-memory Position instance.
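That lookup pattern can be sketched roughly as follows (class and method names here are illustrative, not from the actual webapp):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the JDBC-era pattern: load every Position once,
// then resolve an Employee's position FK against the in-memory map.
class CachedPosition {
    final Integer id;
    final String title;

    CachedPosition(Integer id, String title) {
        this.id = id;
        this.title = title;
    }
}

class PositionCache {
    private final Map<Integer, CachedPosition> byId = new HashMap<>();

    // In the real DAO this would be filled from "SELECT ID, TITLE FROM POSITION"
    // and refreshed periodically or on demand.
    void put(CachedPosition p) {
        byId.put(p.id, p);
    }

    CachedPosition get(Integer positionId) {
        return byId.get(positionId);
    }
}
```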
However, using DataNucleus, if I iterate over all Positions:
Extent<Position> extent = pm.getExtent(Position.class, true);
Iterator<Position> iter = extent.iterator();
while (iter.hasNext()) {
    Position position = iter.next();
    System.out.println(position.toString());
}
And then later, with a different PersistenceManager, iterate over all Employees, obtaining their Position:
Extent<Employee> extent = pm.getExtent(Employee.class, true);
Iterator<Employee> iter = extent.iterator();
while (iter.hasNext()) {
    Employee employee = iter.next();
    System.out.println(employee.getPosition());
}
Then DataNucleus appears to produce SQL joining the two tables when I obtain an Employee's Position:
SELECT A0.POSITION_ID,B0.ID,B0.TITLE FROM MYSCHEMA.EMPLOYEE A0 LEFT OUTER JOIN MYSCHEMA."POSITION" B0 ON A0.POSITION_ID = B0.ID WHERE A0.ID = <1>
My understanding is that DataNucleus will use a cached Position instance, when available. (Is that correct?) However, I'm concerned that the joins will degrade performance. I'm not yet far enough along to run benchmarks. Are my fears misplaced? Should I continue, and benchmark? Is there a way to have DataNucleus avoid the join?
<jdo>
<package name="com.example.staff">
<class name="Position" identity-type="application" schema="MYSCHEMA" table="Position">
<inheritance strategy="new-table"/>
<field name="id" primary-key="true">
<column name="ID" jdbc-type="integer"/>
</field>
<field name="title">
<column name="TITLE" jdbc-type="varchar"/>
</field>
</class>
</package>
</jdo>
<jdo>
<package name="com.example.staff">
<class name="Employee" identity-type="application" schema="MYSCHEMA" table="EMPLOYEE">
<inheritance strategy="new-table"/>
<field name="id" primary-key="true">
<column name="ID" jdbc-type="integer"/>
</field>
<field name="name">
<column name="NAME" jdbc-type="varchar"/>
</field>
<field name="position" table="Position">
<column name="POSITION_ID" jdbc-type="int" />
<join column="ID" />
</field>
</class>
</package>
</jdo>
I guess what I'm hoping to be able to do is tell DataNucleus to go ahead and read the POSITION_ID int as part of the default fetch group, and see if the corresponding Position is already cached. If so, then set that field. If not, then do the join later, if called upon. Better yet, go ahead and stash that int ID somewhere, and use it if getPosition() is later called. That would avoid the join in all cases.
I would think that knowing the class and the primary key value would be enough to avoid the naive case, but I don't yet know enough about DataNucleus.
With the helpful feedback I've received, my .jdo is now cleaned up. However, after adding the POSITION_ID field to the default fetch group, I'm still getting a join.
SELECT 'com.example.staff.Employee' AS NUCLEUS_TYPE,A0.ID,A0."NAME",A0.POSITION_ID,B0.ID,B0.TITLE FROM MYSCHEMA.EMPLOYEE A0 LEFT OUTER JOIN MYSCHEMA."POSITION" B0 ON A0.POSITION_ID = B0.ID
I understand why it is doing that; the naive method will always work. I was just hoping it was capable of more. DataNucleus might not read all columns from the result set, instead returning the cached Position, but it is still calling upon the datastore to access a second table, with all that entails, including possible disk seeks and reads. The fact that it will throw that work away is little consolation.
What I was hoping to do was tell DataNucleus that all Positions will be cached, trust me on that. And if for some reason you find one that isn't, blame me for the cache miss. I understand that you'll have to (transparently) perform a separate select on the Position table. (Even better, pin any Positions you do have to go fetch due to a cache miss. That way there won't be a cache miss on the object again.)
That is what I'm doing now using JDBC, by way of a DAO. One of the reasons for investigating a persistence layer was to ditch these DAOs. It is difficult to imagine moving to a persistence layer that can't move beyond naive fetches resulting in expensive joins.
As soon as Employee has not only a Position but also a Department and other fields, an Employee fetch causes a half dozen tables to be accessed, even though all of those objects are already pinned in the cache and are addressable given their class and primary key. In fact, I can implement this myself by changing Employee.position to an Integer, creating an IntIdentity, and passing it to PersistenceManager.getObjectById().
What I think I'm hearing is that DataNucleus is not capable of this optimization. Is that right? It's fine, just not what I expected.
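For reference, the manual fallback described above would look roughly like this sketch, using the standard javax.jdo.identity.IntIdentity and PersistenceManager.getObjectById(Object, boolean); whether it actually skips SQL depends on the instance being present in a cache:

```java
import javax.jdo.PersistenceManager;
import javax.jdo.identity.IntIdentity;

// Sketch: resolve a Position from a bare int FK via the identity API,
// avoiding any join on the Employee fetch. With validate=false, a cached
// (e.g. pinned) instance can be returned without hitting the datastore.
class PositionResolver {
    Position resolve(PersistenceManager pm, int positionId) {
        Object oid = new IntIdentity(Position.class, positionId);
        return (Position) pm.getObjectById(oid, false);
    }
}
```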
By default, a join will not be done when the Employee entity is fetched from the datastore, it will only be done when Employee.position is actually read (this is called lazy loading).
Additionally, this second fetch can be avoided by using the level 2 cache. First check that the level 2 cache is actually enabled (in DataNucleus 1.1 it is disabled by default; in 2.0 it is enabled by default). You should then "pin" the Position class so that its instances are cached indefinitely.
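Pinning can be done through the standard JDO DataStoreCache API; a sketch, assuming application startup code with access to the PersistenceManagerFactory:

```java
import javax.jdo.PersistenceManagerFactory;

// Sketch: pin all Position instances so the level 2 cache never evicts them.
// pinAll(boolean subclasses, Class pcClass) is declared on
// javax.jdo.datastore.DataStoreCache.
class CacheSetup {
    static void pinPositions(PersistenceManagerFactory pmf) {
        pmf.getDataStoreCache().pinAll(true, Position.class);
    }
}
```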
The level 2 cache can cause issues if other applications use the same database, however, so I would recommend only enabling it for classes such as Position that are rarely changed. For other classes, set the "cacheable" attribute to false (the default is true).
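In the metadata, that attribute sits on the class element; for example (using Employee here purely as an illustration):

```xml
<class name="Employee" cacheable="false">
    ...
</class>
```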
EDITED TO ADD:
The <join> tag in your metadata is not suitable for this situation. In fact, you don't need to specify the relationship explicitly at all; DataNucleus will figure it out from the types. But you are right when you say that you need POSITION_ID to be read in the default fetch group. This can all be achieved with the following change to your metadata:
<field name="position" default-fetch-group="true">
<column name="POSITION_ID" jdbc-type="int" />
</field>
EDITED TO ADD:
Just to clarify: after making the metadata change described above, I ran the test code which you provided (backed by a MySQL database) and I saw only these two queries:
SELECT 'com.example.staff.Position' AS NUCLEUS_TYPE,`THIS`.`ID`,`THIS`.`TITLE` FROM `POSITION` `THIS` FOR UPDATE
SELECT 'com.example.staff.Employee' AS NUCLEUS_TYPE,`THIS`.`ID`,`THIS`.`NAME`,`THIS`.`POSITION_ID` FROM `EMPLOYEE` `THIS` FOR UPDATE
If I run only the second part of the code (the Employee extent), then I see only the second query, without any access to the POSITION table at all. Why? Because DataNucleus initially provides "hollow" Position objects and the default implementation of Position.toString() inherited from Object doesn't access any internal fields. If I override the toString() method to return the position's title, and then run the second part of your sample code, then the calls to the database are:
SELECT 'com.example.staff.Employee' AS NUCLEUS_TYPE,`THIS`.`ID`,`THIS`.`NAME`,`THIS`.`POSITION_ID` FROM `EMPLOYEE` `THIS` FOR UPDATE
SELECT `A0`.`TITLE` FROM `POSITION` `A0` WHERE `A0`.`ID` = <2> FOR UPDATE
SELECT `A0`.`TITLE` FROM `POSITION` `A0` WHERE `A0`.`ID` = <1> FOR UPDATE
(and so on, one fetch per Position entity). As you can see, there are no joins being performed, and so I'm surprised to hear that your experience is different.
Regarding your description of how you hope caching should work, that is how the level 2 cache ought to work when a class is pinned. In fact, I wouldn't even bother trying to pre-load Position objects into the cache at application start-up. Just let DN cache them cumulatively.
It's true that you may have to accept some compromises if you adopt JDO...you'll have to relinquish the absolute control that you get with hand-rolled JDBC-based DAOs. But in this case at least you should be able to achieve what you want. It really is one of the archetypal use cases for the level 2 cache.
Adding on to Todd's reply, to clarify a few things.
A <join> tag on a 1-1 relation means nothing. Well, it could be interpreted as saying "create a join table to store this relationship", but DataNucleus doesn't support such a concept, since best practice is to use an FK in either the owner or the related table. So remove the <join>.
A "table" attribute on a 1-1 relation suggests that it is stored in a secondary table, yet you don't want that either, so remove it too.
1. You retrieve Position objects, so it issues something like
SELECT 'org.datanucleus.test.Position' AS NUCLEUS_TYPE,A0.ID,A0.TITLE FROM "POSITION" A0
2. You retrieve Employee objects, so it issues something like
SELECT 'org.datanucleus.test.Employee' AS NUCLEUS_TYPE,A0.ID,A0."NAME" FROM EMPLOYEE A0
3. Note that it doesn't retrieve the FK for the position in step 2, since that field is not in the default fetch group (it is lazily loaded).
4. You access the position field of an Employee object, so it needs to retrieve the FK (since it doesn't yet know which Position object relates to this Employee), and it issues
SELECT A0.POSITION_ID,B0.ID,B0.TITLE FROM EMPLOYEE A0 LEFT OUTER JOIN "POSITION" B0 ON A0.POSITION_ID = B0.ID WHERE A0.ID = ?
5. At this point it doesn't need to retrieve the Position object itself, since it is already present (in the cache), so that object is returned.
All of this is expected behaviour, IMHO. You could put the position field of Employee into its default fetch group; the FK would then be retrieved in step 2 instead of step 4, hence removing one SQL call.