I have multiple XML files that I need to parse. The problem is that I only need some data from the last couple of lines.
I currently use XmlTextReader and reader.ReadToFollowing("DATANEEDED"), but it is still too slow. Does anyone know whether I can 'tail' an XML file and read from there (taking into account that the tail would not be a valid XML file), or of any other way to retrieve the last few nodes without parsing the entire XML file?
I am using .NET 2.0, so no built-in LINQ :(
Thanks
XmlDocument is a better choice. Within it, use XPath queries. I guess XmlDocument takes care of performance automatically.
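As a rough illustration of the load-then-query approach suggested above, here is a minimal sketch in Python rather than .NET, using the standard library's ElementTree (which supports only a limited XPath subset). The function name last_texts and its parameters are made up for this example. Note that this still parses the whole document into memory, so it is a convenience rather than a way to skip the earlier part of the file.

```python
import xml.etree.ElementTree as ET


def last_texts(xml_text, tag, count):
    """Return the text of the last `count` elements named `tag`.

    Hypothetical helper: the whole document is parsed first, then queried.
    """
    root = ET.fromstring(xml_text)
    # ".//tag" matches descendants at any depth, in document order.
    matches = root.findall(f".//{tag}")
    return [element.text for element in matches[-count:]]
```

Calling last_texts(xml_text, "DATANEEDED", 2) would return the text of the final two DATANEEDED elements.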
Related
Easy question. I need to read a CSV file in .NET, and for that I'm using the Lumenworks CSV library.
The problem is that it seems this solution reads the entire CSV content into memory. I was wondering if there's another option that would let me run through the CSV content one element at a time, and therefore, consume less memory.
Something like XmlDocument vs. XmlReader.
Thanks
You can use the StreamReader class to read the file line by line with the StreamReader.ReadLine method, and do operations such as searching or matching on each line. The documentation for that method includes a sample that shows how. This takes very little time.
After each operation, store the position or line number you reached; in the next operation, use the Stream.Seek method to start reading from the stored position.
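The read-then-resume idea above can be sketched as follows. This is Python rather than .NET, and the helper name read_lines_from and its parameters are assumptions made for illustration. It reads from a binary file object so that the byte offset returned by tell() can be passed back to seek() later, and it assumes UTF-8 text.

```python
def read_lines_from(f, start_offset=0, max_lines=None):
    """Read lines from binary file object `f`, starting at `start_offset`.

    Returns (lines, next_offset); pass next_offset to a later call to
    continue where this call stopped. Hypothetical helper.
    """
    f.seek(start_offset)
    lines = []
    while max_lines is None or len(lines) < max_lines:
        raw = f.readline()
        if not raw:  # end of file
            break
        lines.append(raw.decode("utf-8").rstrip("\r\n"))
    # tell() gives the byte position just past the last line consumed.
    return lines, f.tell()
```

Only one line is held in memory at a time while reading, and the stored offset means the next run does not have to rescan the beginning of the file.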
I want to use XML file as a data source for my application.
What approach should I take? Are there any examples?
Thanks
This tutorial, LINQ to XML C# Tutorial, shows how to use LINQ to XML in C# to read data from an external XML file and to add data to it.
Another tutorial regarding LINQ to XML which explains LINQ a bit more can be found here
Also, why are you opting for an XML file as data storage? I hope some of this helps, though!
P.S. These tutorials are not mine, so credit goes to their authors.
XML is very inefficient at the operations you mention. Manipulating XML files without reading the whole file into memory, changing it, and writing it back out again is very likely not worth the effort.
The better bet would be to use a real database, perhaps SQLite if you need something simple and file-based. Then you can write a simple routine to dump this data to XML whenever you require.
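A minimal sketch of the "keep the data in SQLite, dump to XML on demand" idea, in Python with the standard sqlite3 and ElementTree modules. The function name dump_table_to_xml and the tag names rows and row are illustrative assumptions, and the sketch assumes the table name is trusted and that the column names are valid XML element names.

```python
import sqlite3
import xml.etree.ElementTree as ET


def dump_table_to_xml(conn, table, root_tag="rows", row_tag="row"):
    """Serialize every row of `table` as XML and return it as a string.

    Hypothetical helper: `table` is interpolated into the SQL, so it must
    come from trusted code, not from user input.
    """
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [description[0] for description in cursor.description]
    root = ET.Element(root_tag)
    for record in cursor:
        row = ET.SubElement(root, row_tag)
        for name, value in zip(columns, record):
            # One child element per column; NULL becomes empty text.
            ET.SubElement(row, name).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")
```

All reads and updates go through SQL, and the XML is produced only when something actually needs it.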
I need to parse a large trace file (up to 200-300 MB) in a Flex application. I started using JSON instead of XML hoping to avoid these problems, but it did not help much. When the file is bigger than 50 MB, the JSON decoder can't handle it (I am using as3corelib).
I have done some research and found some options:
Try to split the file: I would really like to avoid this; I don't want to change the current format of the trace files and, in addition, it would be very uncomfortable to handle.
Use a database: I was thinking of writing the trace into a SQLite database and then reading from there, but that would force me to modify the program that creates the trace file.
From your experience, what do you think of these options? Are there better options?
The program that writes the trace file is in C++.
Using AMF will give you much smaller data sizes for transfer because it is a binary, not text format. That is the best option. But, you'll need some middleware to translate the C++ program's output into AMF data.
Check out James Ward's census application for more information about benchmarks when sharing data:
http://www.jamesward.com/census/
http://www.jamesward.com/2009/06/17/blazing-fast-data-transfer-in-flex/
Maybe you could parse the file in chunks, without splitting the file itself. That would require some work on the as3corelib JSON parser, but it should be doable, I think.
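To illustrate what chunked parsing could look like, here is a sketch in Python (not ActionScript, and not a modification of as3corelib). It assumes the trace is a sequence of top-level JSON objects or arrays written one after another, which may not match the real trace format. The name iter_json_values is made up for this example; json.JSONDecoder.raw_decode is the standard-library call that decodes one value from the front of a string and reports where it ended.

```python
import json


def iter_json_values(stream, chunk_size=65536):
    """Yield top-level JSON objects/arrays from a text stream, chunk by chunk.

    Assumes concatenated top-level objects or arrays (hypothetical format).
    """
    decoder = json.JSONDecoder()
    buffer = ""
    eof = False
    while not eof:
        chunk = stream.read(chunk_size)
        eof = chunk == ""
        buffer += chunk
        while True:
            buffer = buffer.lstrip()  # raw_decode does not skip whitespace
            if not buffer:
                break
            try:
                value, end = decoder.raw_decode(buffer)
            except json.JSONDecodeError:
                if eof:
                    raise  # truncated or invalid data at end of input
                break  # value is incomplete; read another chunk
            yield value
            buffer = buffer[end:]
```

Memory use is bounded by the largest single record plus one chunk, rather than by the whole file. Retrying the decode after every chunk is wasteful for very large records, so a real implementation would want a proper incremental parser.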
I found this library which is a lot faster than the official one: https://github.com/mherkender/actionjson
I am using it now and it works perfectly. It also has an asynchronous decoder and encoder.
Hey. I am quite new to web development and programming. I am trying to create an RSS feed which gets its info from a separate XML file.
I know the basics of XML and RSS, but I don't know how to make the feed update. Let's say I update the XML; how would the RSS update automatically? Can someone maybe put me on the right track? Thanks in advance.
In which programming language do you want to accomplish this? One way would be to run a program that does some XML parsing and writing, e.g. PHP with SimpleXML and running the script as a cronjob.
[Edit:]
You could use LINQ to XML for that in ASP.NET. It is easy to use; just look at tutorials like "Using LINQ to XML" or "Introduction to LINQ - Simple XML Parsing".
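Whatever language is used, the idea is to regenerate the feed from the source XML, either on a schedule or on each request. Here is a minimal sketch in Python. The source layout (a root element containing entry elements with title, link and description children) is purely an assumption, as is the function name build_rss. RSS 2.0 itself requires the channel to have a title, link and description.

```python
import xml.etree.ElementTree as ET


def build_rss(source_xml, title, link, description):
    """Build an RSS 2.0 document (as a string) from a source XML string.

    Assumed source layout: <entries><entry><title/><link/><description/>
    </entry>...</entries>. Adjust the tag names to the real file.
    """
    source = ET.fromstring(source_xml)
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for entry in source.findall("entry"):
        item = ET.SubElement(channel, "item")
        for tag in ("title", "link", "description"):
            # Copy each field across; missing fields become empty text.
            ET.SubElement(item, tag).text = entry.findtext(tag, default="")
    return ET.tostring(rss, encoding="unicode")
```

Running this from a scheduled job and writing the result to the feed's file would make the feed follow the source XML without manual edits.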
I'm trying to read through the documentation on Berkeley DB XML, and I think I could really use a developer's blog post or synopsis describing a problem for which the XML layer atop Berkeley DB turned out to be exactly the right prescription.
Maybe I'm not getting it, but it seems like they're both in-process DBs, and ultimately you will parse your XML into objects or data, so why not start by storing your data parsed, rather than as XML?
Ultimately I want my data stored in some reasonable format.
If that data started as XML and I want to retrieve it using XQuery, then without the XML layer I have to write a lot of code to implement the XQuery part myself and, perhaps even worse, I have to know my XML well enough to design a reasonable storage system for it.
Conversely, so long as the performance of the system allows, I can forget about that part of the back end, worry only about the levels from the XML document up (i.e. towards the user), and leave the rest as a black box. It gives me the B-DB storage goodness, but I get to use it from a document-centric perspective.