Use BizTalk for converting XML to JSON format - biztalk

We are working on a project that converts/transforms XML files from one format to another. The file and output file is not only different from "elements name" prospective but there are also calculations that involve huge number of DB tables for mapping elements and lookup values. In addition, the element names are different from both sides and there are too much conditional logic operate inside.
We have a C# project that does the whole logic for us but it takes us 2-3 minutes for a single file to be converted that is why we want to use a ready-made tool instead.
My Question Is: Does BizTalk support conversion of XML to JSON and vice versa by including business logic, Lookup values (tbls), different mappings of elements, and etc? Can I also run it as a service so that it handles the process in a loop base for converting thousands of files every day?

Yes. BizTalk can do this. In particular, BizTalk 2013R2 has some enhanced support for JSON, and 2016 (coming out later this year) should see further improvements. BizTalk is pretty much made for this.
However, I'd caution you against doing this purely for speed. It's entirely possible that a BizTalk integration for this will take as long as or longer than your C# project (depending on what methods/patterns you used in the C# project). It's also possible it could go a lot faster. It really depends on a lot of factors (size of the file, connectivity to the database, complexity of rules/transformations).
What BizTalk will bring to bear is an easier mapping/transformation interface, a built in rules engine, adapters and pipelines for connecting to your data sources/destinations, and baked in reliability/throttling/resource allocating/multithreading.
One other thing to add - if you envision having many integration needs such as this, then BizTalk can provide a solid foundation for building an integration platform/ESB.

Related

Storing and downloading Data in iOS Applications

I am a bit new to iOS Development and I was wondering if someone could point me in the right direction regarding an application I am working on.
I am currently working on an application that will be displaying product lists and categories. The list is updated on a weekly basis (one every week).
I am now trying to decide two things:
1- What's the best method of storing this data, I am looking for a way that will allow me to replace the data in the application once every week.
2- Is it going to be beneficial to use CoreData? Note that I Only have Product Category, Product and Product Information entities.
Appreciate your support.
I would use Core Data. Because I know Core Data and am used to work with it. But this is clearly very much like using a chainsaw to cut a slice of bread.
As I understand, you're not familiar with Core Data. Maybe it's not the right tool for the job considering the learning curve.
In your case I would simply use JSON files as provided by the server.
That said, if your looking in Core Data anyway, any store will do, either atomic, XML or SQLite. The first two will load the whole data set in memory and queries will be done in memory as well. SQLite provides the benefits usually associated with databases, with a slightly increased complexity. A chainsaw.
I would use Core Data. If you haven't worked with Core Data before, learn it. It's a great framework.

Need help with a large ASP.NET application

We have an ASP classic ERP (very large application) that we want to rewrite using ASP.NET.
I am looking for a way to organize the application so we are going to be able to separate every program / webpage (over 400) from each other. Every program needs to be independent because many developers will work on the project at the same time.
Visual Studio seems to make a DLL for every assembly so I was wondering if it’s a good idea to make a huge solution with one project per DLL.
Ex. :
Customers.aspx + Customers.aspx.vb (compiled) for presentation
Customers.DLL for the object entity
CustomersManager.DLL for business logic
CustomersData.DLL for data access
This way, we would be able to deploy every program separately without altering the others. We would also have over a thousand DLL to manage…
Does it seem to be a good solution for a large scale application?
Anyone has a better idea?
Thanks
Source control was invented exactly for the purpose of having multiple develops concurrently work on large solutions. I do appreciate the value of having components that can be deployed independently, but perhaps the value is lost as the number of independent components that require maintenance approaches the hundreds and thousands?
Separating the application into separate presentation/business logic/DAL DLLs does make sense on a per-module basis, but not usually on a per page basis.
Consider the different functional areas of your application that are likely to share code and start there (one set of projects for each).
That seems like a huge unmanageable nightmare to me.
I've been a part of several large .Net projects and the way that works the best is, like JeffN825, use some sort of source control, along with classes that support your model (database) directly.
Folders under the project root can help you split things up logically "/Customers", "/Orders", etc.
If you want to make separate projects for your classes, that is also done quite a bit. Have a separate project containing all of your database objects. Create another project for Business Logic. Actually create several Business Logic projects if you feel you need it "CustomerBO", "OrderBO", etc.
But managing over 1000 dlls and their associated web pages...that's going to be a nightmare.
I think the question was asking whether having a seperate DLL for every layer of every page was a good architecture, which it is not, if for no other reason than Visual Studio will more likely than not crawl to halt as it tries to load 100's of seperate projects (I shudder to think of what that would do, and how impossible it would be to maintain all those DLL's). Now a more reasonable solution would be to have a DLL for each layer and seperate each page's code into a different file and use a source control system. This would allow developers to share code even with the worst of breed source control systems. If your source control system has decent Branching/Merging support (i.e. not SourceSafe) like TFS, SVN, Git you don't really even after worry about people working on the same file simulatanously, and then you can organize your code by function not page. I'm hazarding to guess from the question, that there is probably an astronomical amount of duplicated code that could be simplified and made easier to maintain by breaking a rigid connection to the web code and reusing code. It can be amazing to see how much less code there will be. You can do the same on UI side with judicious use of User Controls to encapsulate shared functionality. Plus moving from ASP to .NET, you pickup things like SiteMap controls that should reduce the code footprint as well.

What is the best way to store site configuration data?

I have a question about storing site configuration data.
We have a platform for web applications. The idea is that different clients can have their data hosted and displayed on their own site which sits on top of this platform. Each site has a configuration which determines which panels relevant to the client appear on which pages.
The system was originally designed to keep all the configuration data for each site in a database. When the site is loaded all the configuration data is loaded into a SiteConfiguration object, and the clients panels are generated based on the content of this object. This works, but I find it very difficult to work with to apply change requests or add new sites because there is so much data to sift through and it's difficult maintain a mental model of the site and its configuration.
Recently I've been tasked with developing a subset of some of the sites to be generated as PDF documents for printing. I decided to take a different approach to how I would define the configuration in that instead of storing configuration data in the database, I wrote XML files to contain the data. I find it much easier to work with because instead of reading meaningless rows of data which are related to other meaningless rows of data, I have meaningful documents with semantic, readable information with the relationships defined by visually understandable element nesting.
So now with these 2 approaches to storing site configuration data, I'd like to get the opinions of people more experienced in dealing with this issue on dealing with these two approaches. What is the best way of storing site configuration data? Is there a better way than the two ways I outlined here?
note: StackOverflow is telling me the question appears to be subjective and is likely to be closed. I'm not trying to be subjective. I'd like to know how best to approach this issue next time and if people with industry experience on this could provide some input.
if the information is needed for per client specific configuration it is probably best done in a database with an admin tool written for it so that non technical people can also manage it. Also it's easier that way when you need versioning/history on it. XML isn't always the best on that part. Also XML is harder to maintain in the end (for non technical people).
Do you read out the XML every time from disk (performance hit) or do you keep it cached in memory? Either solution you choose, caching makes a big difference in the end for performance.
Grz, Kris.
You're using ASP.NET so what's wrong with web.config for your basic settings (if it's per project deploy), then as you've said, custom XML or database configuration settings for anything more complicated (or if you have multiple users/clients with the same project deploy)?
I'd only use custom XML documents for something like a "site layout document" where things won't change that often and you're going to have lots of semi-meaningless data (e.g. 23553123). And layout should be handled by css as much as possible anyway.
For our team XML is a good choice (app.config or web.config or custom configuration file, it depends), but sometimes it is better to design configuration API to make configurations in code. For example modern IoC containers has in-code configuration APIs with fluent interfaces. This approach can give benefits if you need to configure many similar to each other entities or want to achive good human readability. But this doesn't works if non-programmers need to make configurations.

Excel Upload to database table

I'm looking for the best solution to allow our users to upload XLS spreadsheet so that they can be used to populate tables in our data warehouse (DW).
Our users are heavy Business Object (BO) users, and BO lets you export to XLS. When they have data in a spreadsheet that needs to be loaded to the DW, they need a process to upload the data in the XLS to the DW's db. As a result, we end up with many of these "interfaces" when I think that what we really need is a programmatic automated feed. Using Excel as a data source for inter-system feeds, in my gut, just seems like a bad idea to me.
Question #1: I'd like to see if you agree and why or why not.
OK, there is no swimming against that tide, so I now take as a given that XLS uploads are here to stay for us. Now I need to find the best solution. First, I'll explain what we do now and then what I don't like about it:
Via web pages, we provide empty XLS files (no rows) with a defined set of columns. Each file is intended to be used to update a different target dest table. In each spreadsheet is an "upload" button. Pushing the Upload button results in the macro in the spreadsheet serializing the contents of the file to CSV and FTPing the data to server folder. Periodically, a scheduler fires off an Informatica ETL job that uses the CSV file as input and loads the data into a custom XLS-specific staging table and then, if the records pass edits, into the appropriate target table. Any errors encountered are logged to an error table. For each XLS file uploaded, the data ends up in a separate staging and error table that is specific for the file.
Some of the things I don't like include about our process are:
1) The macro code in the XLS is too exposed, includes passwords for example, can be tampered with and there are issues ensuring that the users are using the latest XLS templates.
2) Business Rule edits are placed in the ETL program, where they should probably be, but because we would like to catch the errors ASAP, i.e, in the spreadsheet, edits are also added to the macro code. This results in duplication of business edits. I want these rules in one place and centrally controlled. IMHO, I think putting any macro code in the XLS introduces a maintenance issue, even calls to stored procedures (some of which we have) or calls to web services (we haven't yet tried to call .NET Web Services from XLS macros.)
3) Every XLS file upload template has its own process with distinct set of staging and error tables and a custom screen for reporting errors encountered. It seems like we need a more generalized re-usable solution.
Besides often getting data exported to XLS from BO, the users like also Excel because it is easier to edit a large number of records and less clunkier than editing individual records via a web interface.
This is the general direction that I am thinking:
First, I want the users to have the ease of editing of Excel with editing, but without including embedded macros in the spreadsheet. I experimented with Farpoint's Grid with Excel compatibility...
http://www.fpoint.com/netproducts/spreadweb/tour/excel.aspx
...and I found that it was quite easy to allow a user the ability to open up an XLS file that resides on their PC and have it open up in a browser and be able to easily access the data read from server-side .NET web code. Excel isn't running locally in their browser, but the functionality of Excel is reproduced, presumably through a lot of client sided scripting that I expect would be a real pain to duplicate myself. You can even cut and paste from a local spreadsheet into the web's spreadsheet. This sounds great, by biggest problem is cost. Our company is near death and won't allow us to purchase any new software.
Next, I want to identify the common components across all spreadsheet upload processing and come up with generic processing code. For example, I imagine a table which defines each of our spreadsheets and the format of each including the column names and data type definitions, perhaps in terms of their destination columns instead of hard coding. Based on this table template definition, I can generate XLS templates for download from this table definition. I can also perform simple generic edits to ensure that the data entered matches the table definition. And one common web page can be used to present the data and allow report data type mismatch errors and allow for the user to correct them. I would also define a common table for storing the data in a "staging" table, using a table with two columns, submission #, row num, name and value, perhaps. No more "custom everything" is the goal.
Next I need to decide where to put the business rules. My dept's mgt firmly believes that all loading of data should be done by Informatica ETL batch processes and therefore the rules/edits belong "in Informatica". I have zero experience with Informatica tools, I am more of a .NET guy. I am therefore unsure as to how these rules are implemented but I suspect that they are not reusable in the sense that they can be used by a .NET web page to validate a particular record against. You see, in some cases, when the user is not performing a bulk upload, they do have the ability to edit a specific record and I would like the same edits that were applied by the ETL bulk insert process to be applied to an individual update attempt to a single record via a web page. If the solution to write a single web service or stored procedure that can be called from either the web page doing an update of a single record or called thousands of times for each record in a bulk upload? The latter sounds inefficient.
Your thoughts on anything above would be very much welcomed.
From a cost perspective, the efforts you'll need to go through to re-create spreadsheet functionality on the web will exceed the cost of Farpoint or other controls. Even if you made $20 an hour, do you think you could complete a working product in under 2 weeks? I think you have the facts on your side when you discussed maintenance issues if you allow ETL functionality to exist in Excel - you have twice the amount of work to maintain the transformation rules. I think you need to convince management that in order to create a maintainable, robust solution you need some flexible utilities.
Farpoint is a good choice. There is also SpreadsheetGear that is a .Net engine that interprets Excel macros and can run on a web server. It has a Win32 control that allows you to create a WinForms solution with very Excel interface functionality. Last time I checked there was no web control for the product. It does an excellent job of providing Excel capability for processing large amounts of data.
Good luck. I think you will find a good solution since you seem to have a good grasp of the pro's and con's of all the different potential solutions.

How do you use Excel server-side?

A client wants to "Web-enable" a spreadsheet calculation -- the user to specify the values of certain cells, then show them the resulting values in other cells.
(They do NOT want to show the user a "spreadsheet-like" interface. This is not a UI question.)
They have a huge spreadsheet with lots of calculations over many, many sheets. But, in the end, only two things matter -- (1) you put numbers in a couple cells on one sheet, and (2) you get corresponding numbers off a couple cells in another sheet. The rest of it is a black box.
I want to present a UI to the user to enter the numbers they want, then I'd like to programatically open the Excel file, set the numbers, tell it to re-calc, and read the result out.
Is this possible/advisable? Is there a commercial component that makes this easier? Are their pitfalls I'm not considering?
(I know I can use Office Automation to do this, but I know it's not recommended to do that server-side, since it tries to run in the context of a user, etc.)
A lot of people are saying I need to recreate the formulas in code. However, this would be staggeringly complex.
It is possible, but not advisable (and officially unsupported).
You can interact with Excel through COM or the .NET Primary Interop Assemblies, but this is meant to be a client-side process.
On the server side, no display or desktop is available and any unexpected dialog boxes (for example) will make your web app hang – your app will behave flaky.
Also, attaching an Excel process to each request isn't exactly a low-resource approach.
Working out the black box and re-implementing it in a proper programming language is clearly the better (as in "more reliable and faster") option.
Related reading: KB257757: Considerations for server-side Automation of Office
You definitely don't want to be using interop on the server side, it's bad enough using it as a kludge on the client side.
I can see two options:
Figure out the spreadsheet logic. This may benefit you in the long term by making the business logic a known quantity, and in the short term you may find that there are actually bugs in the spreadsheet (I have encountered tons of monster spreadsheets used for years that turn out to have simple bugs in them - everyone just assumed the answers must be right)
Evaluate SpreadSheetGear.NET, which is basically a replacement for interop that does it all without Excel (it replicates a huge chunk of Excel's non-visual logic and IO in .NET)
Although this is certainly possible using ASP.NET, it's very inadvisable. It's un-scalable and prone to concurrency errors.
Your best bet is to analyze the spreadsheet calculations and duplicate them. Now, granted, your business is not going to like the time it takes to do this, but it will (presumably) give them a more usable system.
Alternatively, you can simply serve up the spreadsheet to users from your website, in which case you do almost nothing.
Edit: If your stakeholders really insist on using Excel server-side, I suggest you take a good hard look at Excel Services as #John Saunders suggests. It may not get you everything you want, but it'll get you quite a bit, and should solve some of the issues you'll end up with trying to do it server-side with ASP.NET.
That's not to say that it's a panacea; your mileage will certainly vary. And Sharepoint isn't exactly cheap to buy or maintain. In fact, short-term costs could easily be dwarfed by long-term costs if you go the Sharepoint route--but it might the best option to fit a requirement.
I still suggest you push back in favor of coding all of your logic in a separate .NET module. That way you can use it both server-side and client-side. Excel can easily pass calculations to a COM object, and you can very easily publish your .NET library as COM objects. In the end, you'd have a much more maintainable and usable architecture.
Neglecting the discussion whether it makes sense to manipulate an excel sheet on the server-side, one way to perform this would probably look like adopting the
Microsoft.Office.Interop.Excel.dll
Using this library, you can tell Excel to open a Spreadsheet, change and read the contents from .NET. I have used the library in a WinForm application, and I guess that it can also be used from ASP.NET.
Still, consider the concurrency problems already mentioned... However, if the sheet is accessed unfrequently, why not...
The simplest way to do this might be to:
Upload the Excel workbook to Google Docs -- this is very clean, in my experience
Use the Google Spreadsheets Data API to update the data and return the numbers.
Here's a link to get you started on this, if you want to go that direction:
http://code.google.com/apis/spreadsheets/overview.html
Let me be more adamant than others have been: do not use Excel server-side. It is intended to be used as a desktop application, meaning it is not intended to be used from random different threads, possibly multiple threads at a time. You're better off writing your own spreadsheet than trying to use Excel (or any other Office desktop product) form a server.
This is one of the reasons that Excel Services exists. A quick search on MSDN turned up this link: http://blogs.msdn.com/excel/archive/category/11361.aspx. That's a category list, so contains a list of blog posts on the subject. See also Microsoft.Office.Excel.Server.WebServices Namespace.
It sounds like you're talking that the user has the spreadsheet open on their local system, and you want a web site to manipulate that local spreadsheet?
If that's the case, you can't really do that. Even Office automation won't help, unless you want to require them to upload the sheet to the server and download a new altered version.
What you can do is create a web service to do the calculations and add some vba or vsto code to the Excel sheet to talk to that service.

Resources