How do I enforce "single source of truth" in pure FP? - functional-programming

In an OO system, I could have a "Data" object and share a reference to it with several other objects. If I update the data object, every object holding a reference sees the updated data.
In a pure FP system, we have an issue: for any data structure we can ask, "Should it store such-and-such data, or should we expect that data to be passed in?"
E.g. I have to decide whether it's OK to have something like:
students : Dict StudentId Student
toString : Dict StudentId Student -> StudentId -> Maybe String
or is it better to have
studentIds : Set StudentId
toString : (StudentId -> Maybe Student) -> StudentId -> Maybe String
The former runs the risk of duplicating the relationship or data, or having differing truths.
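A minimal sketch of the second option, written in Python only so that it is runnable; the signatures in the question are Elm-style. StudentId = int, the name field and render_all are hypothetical names added for illustration. The idea: the dictionary lives in exactly one place, other structures hold ids only, and the lookup is passed in at the call site.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Optional

StudentId = int  # hypothetical: the question leaves StudentId abstract


@dataclass(frozen=True)
class Student:
    name: str  # hypothetical field, for illustration only


def to_string(lookup: Callable[[StudentId], Optional[Student]],
              student_id: StudentId) -> Optional[str]:
    """Render one student; the caller supplies the lookup, so no copy is kept here."""
    student = lookup(student_id)
    return None if student is None else f"{student_id}: {student.name}"


def render_all(students: Dict[StudentId, Student],
               student_ids: FrozenSet[StudentId]) -> List[Optional[str]]:
    """Resolve every id against the one dictionary passed in at the call site."""
    return [to_string(students.get, sid) for sid in sorted(student_ids)]


students = {1: Student("Ada"), 2: Student("Alan")}  # the single source of truth
student_ids = frozenset({1, 2})                      # ids only, no Student copies

# An "update" builds a new dictionary; the old value is left untouched.
updated = {**students, 1: Student("Ada L.")}
```

Because student_ids holds no Student data, calling render_all(updated, student_ids) picks up the new name without anything else having to be kept in sync.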

Related

How to store one to many relationship data in dynamodb

As per my data model, I need to store many to many relationship data items in dynamodb
Example :
I have a field called studentId and every studentId will have several subjects assigned to him.
Requirement :
For a given studentId, I need to store all the subjects, and I need to be able to get all the subjects assigned to a given student.
Similarly, for a given subjectId, I need to know the studentIds to whom that subject has been assigned.
I am planning to store this in DynamoDB as follows:
Table1 : StudentToSubjects :
Hash Key : studentId,
RangeKey: subjectId
so that if I query using only the hash key, it would give me all the rows having that hash key, across all the different range keys.
Secondary Key as
secondary HashKey: subjectId
Secondary RangeKey: studentId
I wanted to know whether this makes sense and is the right thing to do, or whether there are better ways to solve this problem.
Your design looks OK, but you need to think it through before finalizing it. Say you have implemented this design: after 10 years, when you query the table for a particular subject (using the secondary index, a GSI), you will get all the students of the past 10 years, which you might not need.
I would probably go with following
Student Master:
Hash Key: studentId
subjectIds (Number-set or String-set)
Subject Master:
Hash Key: subjectId
Range Key: Year
studentIds (Number-set or String-set)
The advantage of this is that you will issue fewer queries: for a particular subject or student you will consume only 1 read (if the item size is less than 4 KB).
Again, this is just scratching the surface; think of all the queries before finalizing the schema.
Edit: You don't need to repeat the studentId; it will remain unique.
It would look something like this:
studentId -> s1
subjectIds -> sub1,sub2,subN (this is a set)
studentId -> s2
subjectIds -> sub3,sub4
You can refer to the following documentation on data types: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/DataModel.html#DataModel.DataTypes
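To make the two access patterns from the question concrete, here is an in-memory model in Python. It makes no DynamoDB calls; the class and method names are hypothetical. It mimics a table keyed by (studentId, subjectId) plus an inverted index keyed by (subjectId, studentId), which is the role the secondary index plays in the question's design.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set


class StudentSubjects:
    """In-memory stand-in for the composite-key table and its inverted index."""

    def __init__(self) -> None:
        # Base table: hash key studentId -> range keys subjectId.
        self._by_student: DefaultDict[str, Set[str]] = defaultdict(set)
        # Inverted index: hash key subjectId -> range keys studentId.
        self._by_subject: DefaultDict[str, Set[str]] = defaultdict(set)

    def assign(self, student_id: str, subject_id: str) -> None:
        # One logical write updates both views, as a table plus index would.
        self._by_student[student_id].add(subject_id)
        self._by_subject[subject_id].add(student_id)

    def subjects_of(self, student_id: str) -> List[str]:
        # Equivalent of querying the base table by hash key only.
        return sorted(self._by_student.get(student_id, ()))

    def students_of(self, subject_id: str) -> List[str]:
        # Equivalent of querying the inverted index by its hash key only.
        return sorted(self._by_subject.get(subject_id, ()))


table = StudentSubjects()
table.assign("s1", "sub1")
table.assign("s1", "sub2")
table.assign("s2", "sub1")
```

Here table.subjects_of("s1") answers the per-student query and table.students_of("sub1") answers the per-subject query; neither needs a scan.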

How can I more efficiently deal with DBNull.value in vb.net / SQL Server

I have a class type object, such as:
Private Class Car
Public Property Make As String
Public Property Model As String
Public Property Used As Boolean?
Public Property Mileage As Integer?
Public Property DateSold As Date?
End Class
I'm parsing a query string that's posted to my page that may or may not have key/value pairs for the above properties. For example, it may just be:
?Make=Toyota
I have a table in SQL Server based on the Public Properties you see above, and I'm allowing NULLS on all columns. (My real data has more fields, so this is for explanation purposes)
CREATE PROCEDURE InsertCar
@Make varchar(25),
@Model varchar(25),
etc.
I'm not setting default values in the parameters in the stored procedure above. I want to know of a cleaner way to do this from code.
For example:
The data is posted to my page... I use httputility to parse it, and I can build a new object, such as:
Dim record As New Car
record.Make = Data("make")
record.Model = Data("model")
record.Used = Data("used")
record.Mileage = Data("mileage")
record.DateSold = Data("datesold")
Afterwards, I build some parameters for my stored procedure, but here's where I think I can improve:
Dim pars As New List(Of SqlParameter)
pars.Add(New SqlParameter("@Make", If(record.Make = Nothing, DBNull.Value, record.Make)))
and so on...
So you can see my ternary operators for building the parameters. Is there a cleaner way, perhaps using default values or when the class is being put together to make this cleaner?
This is more of a precision and best practice question I think. Should I be assigning DBNull.Value to the public properties? Or is how I'm doing it sufficient? Is there a cleaner way to map nullable properties?
My values from the query string are parsed into a NameValueCollection, so is there an easier way to just iterate over the ones that exist, and if not exist in the class, automatically build the parameter with DBNull.Value?
I hope my question makes sense. With a lot of data it just seems ugly with all the if/then DBNull.value, etc. I feel like I could be saving myself a step somewhere along the way. The class object is helpful for organization and is utilized in other parts of the code, or else I'd probably just build the parameters and be done with it.
You can set default values in the stored procedure parameters.
CREATE PROCEDURE InsertCar
@Make varchar(25) = null,
@Model varchar(25) = null
...
Then just pass your object's values; if a value is null/missing, the stored proc will default the missing parameter to null.
pars.Add(New SqlParameter("@Make", record.Make))
The SQL table also needs to accept nulls.
Or use the VB.NET If() operator with 2 arguments (CObj gives the two operands a common type, Object):
pars.Add(New SqlParameter("@Make", If(CObj(record.Make), DBNull.Value)))
See also: Is there a VB.NET equivalent for C#'s '??' operator?

How can I use ORM-like queries on a Map?

I have created a slice of structs that has 3 properties
type Person struct {
    age    int
    gender string
    name   string
}
How can I pull the item from the slice which matches my criteria?
For example I would like to do
var persons []Person = mySliceOfPersons
person := getFrom(persons).Where(age ==10).Where(gender == "male")
The purpose here is to keep the data in memory and not be restricted by IO (I'm expecting thousands of calls per second). I am new to Go and I am not sure where to find a package that does this. The data comes from JSON and not a database, so I don't think I can use the sql package.
This solution IS a database, but you can embed it into your application rather than relying on an outside db: https://github.com/HouzuoGuo/tiedot
Another possibility is an approach like this one, which uses the sql package against local flat files and could potentially be adapted to run against a map: https://github.com/dinedal/textql
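If no database is wanted at all, chained filtering over the in-memory slice is small enough to write by hand. A minimal sketch of the idea, in Python for illustration; Query, where, all and first are hypothetical names, not an existing package. The same shape ports to Go with a func(Person) bool predicate.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class Person:
    age: int
    gender: str
    name: str


class Query:
    """Hypothetical helper: chains predicate filters over an in-memory list."""

    def __init__(self, items: Iterable[Person]) -> None:
        self._items: List[Person] = list(items)

    def where(self, predicate: Callable[[Person], bool]) -> "Query":
        # Each call returns a new Query, so filters can be chained.
        return Query(p for p in self._items if predicate(p))

    def all(self) -> List[Person]:
        return list(self._items)

    def first(self) -> Optional[Person]:
        return self._items[0] if self._items else None


persons = [
    Person(10, "male", "Tom"),
    Person(10, "female", "Ann"),
    Person(12, "male", "Bob"),
]
match = (
    Query(persons)
    .where(lambda p: p.age == 10)
    .where(lambda p: p.gender == "male")
    .first()
)
```

Each where is a linear scan. If the same fields are queried thousands of times per second, building a dictionary keyed by those fields once is likely to be faster than filtering on every call.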

How can I query system information and metadata?

In a data warehouse built on Teradata, how can I find out how many databases exist in the whole data warehouse, how many data marts exist in the warehouse, which databases have the most tables, and which databases are most frequently used? This is certainly a programming question, because I am asking how to query the data warehouse to get the desired information.
I would like to get a look and feel for the data warehouse. Similar information or suggestions would certainly help: what should I keep an eye on? What is the "heart" of the data warehouse? What is the first thing you need to look at when you start to work with a completely new data warehouse?
Go to the Teradata Documentation web site and find the "Data Dictionary" book for the version of Teradata you are using. There are numerous dictionary views available.
The one in particular that includes all databases in the environment is called "dbc.databases", so run this:
select *
from dbc.databases
where DBKind = 'D'
The other value for DBKind is 'U', which would include users on the system.
Information about tables is in dbc.tables and other views. I'm not aware of any Teradata concept of "data mart" so I can't help you there.
Answering a question like "most frequently used" would require using one of the query log tables (DBQL). However, you should ask your system DBA if these views are available to you.
-- how many databases exist
SEL COUNT(*)
FROM dbc.databases
WHERE dbkind = 'D'
-- which databases have the most tables?
SEL databasename, COUNT(*)
FROM dbc.tables
WHERE tablekind = 'T'
GROUP BY 1
ORDER BY 2 DESC
TABLEKIND definitions
A: aggregate UDF
B: COMBINED AGGREGATE AND ORDERED ANALYTICAL FUNCTION
E: EXTERNAL STORED PROCEDURE
F: SCALAR UDF
G: TRIGGER
H: INSTANCE OR CONSTRUCTOR METHOD
I: JOIN INDEX
J: JOURNAL
M: MACRO
N: HASH INDEX
P: STORED PROCEDURE
Q: QUEUE TABLE
R: TABLE FUNCTION
S: ORDERED ANALYTICAL FUNCTION
T: TABLE
U: USER-DEFINED DATA TYPE
V: VIEW
X: AUTHORIZATION
-- which databases are most frequently used?
SEL DatabaseName, AccessCount, LastAccessTimeStamp
FROM dbc.databases
ORDER BY AccessCount DESC
Also be sure to check out the dbc.columns table for information on what columns are in each table, their datatypes, etc.

L2Entities, stored procedure and mapping

I finally checked out the L2E framework and ran into problems almost instantly.
Yeah, I know... I should have read some books first.
Situation:
entity with props -> id and name.
entity is mapped to table, which has id and name columns.
sproc, which returns ONLY id column.
Problem:
ObjectResult<MyProp> result = _container.MyStoredProcedure(uberParameter);
Calling this will cause an error
[guilty method goes here] threw exception:
System.Data.EntityCommandExecutionException: The data reader is incompatible with the specified 'DataBase.MyPropTableObject'. A member of the type, 'name', does not have a corresponding column in the data reader with the same name..
Problem #2:
I can't "just return" that field, because that column has the XML data type and the sproc uses fancy select statements, which causes:
Msg 421, Level 16, State 1, Line 1
The xml data type cannot be selected as DISTINCT because it is not comparable.
Question:
Is it possible to exclusively turn off mapping for this entity prop only for this one sproc?
Problem 1 is due to the proc not returning the columns needed to populate the entity. You don't really need the proc if you have mapped the table; just select the field you want from it using LINQ:
var result = MyEntities.EntityIMapped.First(r => r.id == uberParameter).Name;
Would give you the value from the Name column of the table for the given id. You don't need to use a stored proc for this.
Problem 2 sounds like it is in the proc, I would think that distinct on an xml data column would give a lot of results, but I'm only guessing as I don't know your solution.
This is not a direct answer for your question but, hopefully it will point you in the right direction.
