Parse lazy-loaded result table (JSON) - ASP.NET

I am trying to parse this page: http://agent.bronni.ru/Result.aspx?id=c7a6a33a-174e-426d-b127-828ee612c36e&account=27178&page=1&pageSize=50&mr=true
but I can't get the result table because, as far as I can see in Fiddler, it is lazy-loaded by a request that returns JSON.
My code is:
HtmlWeb hw = new HtmlWeb();
HtmlDocument doc = hw.Load("http://agent.bronni.ru/Result.aspx?id=c7a6a33a-174e-426d-b127-828ee612c36e&account=27178&page=1&pageSize=50&mr=true");
// Get all tables in the document
HtmlNodeCollection tables = doc.DocumentNode.SelectNodes("//table");
// Iterate all rows in the first table
HtmlNodeCollection rows = tables[0].SelectNodes(".//tr");
var data = rows.Skip(1).ToList().Take(10).ToList().Select(x => new TableRow()
{
Price = x.SelectNodes(".//td").ToList()[4].InnerText,
Operator = x.SelectNodes(".//td").ToList()[15].InnerText,
DepartureDate = x.SelectNodes(".//td").ToList()[6].InnerText,
DestinationRegion = x.SelectNodes(".//td").ToList()[7].InnerText
}).ToList();
UPDATE
Second site:
Code:
WebClient wc = new WebClient();
wc.Headers.Add("Referer", "http://sletat.ru/");//MUST BE THIS HEADER
string result = wc.DownloadString("http://module.sletat.ru/Main.svc/GetTours?cityFromId=832&countryId=35&cities=&meals=&stars=&hotels=&s_adults=1&s_kids=0&s_kids_ages=&s_nightsMin=6&s_nightsMax=16&s_priceMin=0&s_priceMax=&currencyAlias=RUB&s_departFrom=25%2F06%2F2012&s_departTo=31%2F07%2F2012&visibleOperators=&s_hotelIsNotInStop=true&s_hasTickets=true&s_ticketsIncluded=true&debug=0&filter=0&f_to_id=&requestId=19198631&pageSize=20&pageNumber=1&updateResult=1&includeDescriptions=1&includeOilTaxesAndVisa=1&userId=&jskey=1&callback=_jqjsp&_1340633427022=");
result = result.Substring(result.IndexOf("{"), result.LastIndexOf("}") - result.IndexOf("{") + 1);
JavaScriptSerializer js = new JavaScriptSerializer();
dynamic json = js.DeserializeObject(result);
var prices = json["GetToursResult"]["Data"]["aaData"] as object[];
// var operators = ((object[])json["result"]["prices"]).Cast<Dictionary<string, object>>();
var temp = prices.ToList().Take(20).Select(x => new TableRow
{
Operator = (x as object[])[40].ToString(),
//Price = x["operatorPrice"].ToString(),
//DepartureDate = x["checkinDate"].ToString(),
//DestinationRegion = ((Dictionary<string, object>)x["country"])["englishName"].ToString()
}).ToList();
string str = "";
foreach (var tableRow in temp)
{
str += tableRow.Operator + "<br />";
}
Response.Write(str);
With this approach everything works, but the problem is that this link only works for roughly 30 minutes, after which I need to substitute a new link. Is there any way to fix this? (Only the second site has this problem.)
Thanks again,

The data is really coming from here:
http://beta.remote.bronni.ru/LazyLoading.ashx/getResult?jsonp=jQuery17207647891761735082_1340131755603&id=c7a6a33a-174e-426d-b127-828ee612c36e&page=3&pageSize=50&_=1340131756631
With the exception that the page=# and pageSize=# can be adjusted dynamically.
So instead of parsing HTML, you could just get the JSON data from the URL and parse it. For example:
WebClient wc = new WebClient();
string result =wc.DownloadString("http://beta.remote.bronni.ru/LazyLoading.ashx/getResult?jsonp=jQuery17207647891761735082_1340131755603&id=c7a6a33a-174e-426d-b127-828ee612c36e&page=1&pageSize=1000&_=1340131756631");
result = result.Substring(result.IndexOf("{"),result.LastIndexOf("}")-result.IndexOf("{")+1);
JavaScriptSerializer js = new JavaScriptSerializer();
dynamic json = js.DeserializeObject(result);
var prices = ((object[])json["result"]["prices"]).Cast<Dictionary<string,object>>();
var data = from p in prices
select new
{
OperatorID = p["operatorID"],
Price = p["operatorPrice"],
Country = ((Dictionary<string,object>)p["country"])["englishName"],
CheckinDate = p["checkinDate"]
};
Console.WriteLine(data);
In LINQPad, this produces something like:
OperatorID Price Country CheckinDate
0 1,27 Greece 2012-06-28
0 55,90 Greece 2012-06-28
0 67,34 Greece 2012-06-28
And many more rows, depending on how much you ask for...
Note: the reason for the result = result.Substring(result.IndexOf("{"),result.LastIndexOf("}")-result.IndexOf("{")+1); line is that the JSONP result is wrapped in a callback call, so it begins with this prefix:
jQuery17207647891761735082_1340131755603({"
and ends with }). The wrapper makes JavaScriptSerializer choke when it tries to parse the response, hence the need to remove it.
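The wrapper removal described above can also be sketched in Python. This is a minimal illustration, not the answer's C# code: the function name strip_jsonp and the sample response are made up for the example, and the pattern assumes the callback name consists only of word characters, "$" and ".".

```python
import json
import re

# Callback name, opening parenthesis, payload, closing parenthesis,
# optional trailing semicolon. The payload is captured greedily so that
# parentheses inside the JSON itself do not end the match early.
_JSONP = re.compile(r"^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$", re.DOTALL)


def strip_jsonp(text):
    """Return the parsed JSON payload of a JSONP response."""
    match = _JSONP.match(text)
    if match is None:
        raise ValueError("not a JSONP response")
    return json.loads(match.group(1))


# Hypothetical response, shaped like the one described above.
sample = (
    'jQuery17207647891761735082_1340131755603('
    '{"result": {"prices": [{"operatorID": 0, "operatorPrice": "1,27"}]}})'
)
payload = strip_jsonp(sample)
print(payload["result"]["prices"][0]["operatorPrice"])
```

Matching the whole wrapper instead of searching for the first "{" also covers callbacks that return a JSON array rather than an object.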
Update:
Interestingly, the ASHX handler that returns the data seems to require a Referer header in the request; otherwise, the response will not include the operator information. The Referer cannot be an arbitrary value: the handler seems to look for http://agent.bronni.ru in particular.
Basically, all you need to do is the following:
WebClient wc = new WebClient();
wc.Headers.Add("Referer","http://agent.bronni.ru");//MUST BE THIS HEADER
string result =wc.DownloadString("http://beta.remote.bronni.ru/LazyLoading.ashx/getResult?jsonp=jQuery17207647891761735082_1340131755603&id=c7a6a33a-174e-426d-b127-828ee612c36e&page=1&pageSize=1000&_=1340131756631");
result = result.Substring(result.IndexOf("{"),result.LastIndexOf("}")-result.IndexOf("{")+1);
JavaScriptSerializer js = new JavaScriptSerializer();
dynamic json = js.DeserializeObject(result);
var prices = ((object[])json["result"]["prices"]).Cast<Dictionary<string,object>>();
var data = from p in prices
select new
{
OperatorID = p["operatorID"],
Price = p["operatorPrice"],
Country = ((Dictionary<string,object>)p["country"])["englishName"],
Hotel = ((Dictionary<string,object>)p["hotel"])["englishName"],
Operator = ((Dictionary<string,object>)p["operator"])["englishName"],//OPERATOR
CheckinDate = p["checkinDate"]
};
OperatorID Price Country Hotel Operator CheckinDate
19681 1,27 Greece Julia Hotel Mouzenidis Travel 2012-06-28
19681 1,27 Greece Forest Park Mouzenidis Travel 2012-06-28
19681 1,27 Greece Kassandra Mare (п-ов Кассандра) Mouzenidis Travel 2012-06-28
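For comparison, here is a rough Python sketch of the same Referer requirement. The helper names build_request and fetch, the callback name "cb" and the UTF-8 decoding are assumptions made for this example; only the endpoint, the id and the Referer value come from the code above. The request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://beta.remote.bronni.ru/LazyLoading.ashx/getResult"
REFERER = "http://agent.bronni.ru"  # the value the handler appears to check


def build_request(search_id, page=1, page_size=50, callback="cb"):
    """Build the request with the Referer header set (nothing is sent yet)."""
    query = urlencode(
        {"jsonp": callback, "id": search_id, "page": page, "pageSize": page_size}
    )
    return Request(BASE_URL + "?" + query, headers={"Referer": REFERER})


def fetch(search_id):
    """Send the request. Decoding as UTF-8 is an assumption about the service."""
    with urlopen(build_request(search_id)) as response:
        return response.read().decode("utf-8")


request = build_request("c7a6a33a-174e-426d-b127-828ee612c36e")
print(request.full_url)
print(request.get_header("Referer"))
```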
UPDATE 2:
I decided to compare the performance of the out-of-the-box JavaScriptSerializer with the JSON.NET serializer. In all my tests with different record counts (50, 1000, 3000), JSON.NET was at least twice as fast as JavaScriptSerializer, and in some cases even 10 times faster on smaller record sets.
If you decide to use the JSON.NET library, here's the code that will get you the same results as the code above:
WebClient wc = new WebClient();
wc.Headers.Add("Referer","http://agent.bronni.ru");
string result =wc.DownloadString("http://beta.remote.bronni.ru/LazyLoading.ashx/getResult?jsonp=jQuery17207647891761735082_1340131755603&id=c7a6a33a-174e-426d-b127-828ee612c36e&page=1&pageSize=50&_=1340131756631");
result = result.Substring(result.IndexOf("{"),result.LastIndexOf("}")-result.IndexOf("{")+1);
JObject o = JObject.Parse(result);
var data = from x in o["result"]["prices"]
select new
{
OperatorID = x["operatorID"],
Price = x["operatorPrice"],
Country = x["country"]["englishName"],
Hotel = x["hotel"]["englishName"],
Operator = x["operator"]["englishName"],
CheckinDate = x["checkinDate"]
};
Console.WriteLine(data);

Related

Best way to import bulk data into ArangoDB

I'm currently working on an ArangoDB POC. I find that the time taken for document creation is very high in ArangoDB with PyArango. It takes about 5 minutes to insert 300 documents. I've pasted the rough code below, please let me know if there are better ways to speed this up :
with open('abc.csv') as fp:
    for line in fp:
        dataList = line.split(",")
        aaa = dbObj['aaa'].createDocument()
        bbb = dbObj['bbb'].createDocument()
        ccc = dbObj['ccc'].createEdge()
        bbb['bbb'] = dataList[1]
        aaa['aaa'] = dataList[0]
        aaa._key = dataList[0]
        aaa.save()
        bbb.save()
        ccc.links(aaa,bbb)
        ccc['related_to'] = "gfdgf"
        ccc['weight'] = 0
        ccc.save()
The different collections are created by the below code :
dbObj.createCollection(className='aaa', waitForSync=False)
Regarding your problem with batch mode in the ArangoDB Java driver: if you know the key attributes of the vertices, you can build the document handle as "collectionName" + "/" + "documentKey".
Example:
arangoDriver.startBatchMode();
for(String line : lines)
{
String[] data = line.split(",");
BaseDocument device = new BaseDocument();
BaseDocument phyAddress = new BaseDocument();
BaseDocument conn = new BaseDocument();
String keyDevice = data[0];
String handleDevice = "DeviceId/" + keyDevice;
device.setDocumentKey(keyDevice);
device.addAttribute("device_id",data[0]);
String keyPhyAddress = data[1];
String handlePhyAddress = "PhysicalLocation/" + keyPhyAddress;
phyAddress.setDocumentKey(keyPhyAddress);
phyAddress.addAttribute("address",data[1]);
final DocumentEntity<BaseDocument> from = arangoDriver.graphCreateVertex("testGraph", "DeviceId", device, null);
final DocumentEntity<BaseDocument> to = arangoDriver.graphCreateVertex("testGraph", "PhysicalLocation", phyAddress, null);
arangoDriver.graphCreateEdge("testGraph", "DeviceId_PhysicalLocation", null, handleDevice, handlePhyAddress, null, null);
}
arangoDriver.executeBatch();
I would build all of the data to be inserted into a JSON-formatted string and use createDocumentRaw to create everything at once with a single save.
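The "build everything first, send once" idea can be sketched in plain Python. This is only an illustration under assumptions: the helper names build_batches and to_json_lines are invented, the second CSV column is reused as the _key of the bbb documents (the question lets the server generate that key), and the actual upload call is deliberately left out because it depends on the driver or HTTP endpoint you use.

```python
import csv
import io
import json


def build_batches(csv_text):
    """Turn CSV rows into vertex and edge documents without touching the database."""
    aaa_docs, bbb_docs, edges = {}, {}, []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        a_key, b_key = row[0].strip(), row[1].strip()
        # Dictionaries keyed by _key drop duplicate vertices before upload.
        aaa_docs[a_key] = {"_key": a_key, "aaa": a_key}
        bbb_docs[b_key] = {"_key": b_key, "bbb": b_key}  # assumption: value doubles as key
        # Document handles are "collectionName/documentKey".
        edges.append(
            {
                "_from": "aaa/" + a_key,
                "_to": "bbb/" + b_key,
                "related_to": "gfdgf",
                "weight": 0,
            }
        )
    return list(aaa_docs.values()), list(bbb_docs.values()), edges


def to_json_lines(docs):
    """Serialize one document per line, a common shape for bulk uploads."""
    return "\n".join(json.dumps(doc) for doc in docs)


aaa, bbb, ccc = build_batches("d1,loc1\nd2,loc1\n")
print(to_json_lines(ccc))
```

With three payloads (two vertex collections, one edge collection) the number of round trips no longer grows with the number of CSV lines.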

After encoding as UTF-16, the string is broken when I use it in iTextSharp

First I read some information from a text file, and then this information is added to the PDF files' metadata. In the "Producer" field, an error occurred with Turkish characters such as ğ and ş. I solved that problem by using UTF-16 like this:
write.Info.Put(new PdfName("Producer"), new PdfString("Ankara Üniversitesi Hukuk Fakültesi Dergisi (AÜHFD), C.59, S.2, y.2010, s.309-334.", "UTF-16"));
Here is the screenshot:
Then I go through all the PDF files in a foreach loop, read the metadata, and insert it into an SQLite database file. The problem occurs right here: when I read the UTF-16 encoded string (the Producer data) from the PDF file and write it to the database, strange characters appear, like this:
I don't understand why this error occurs.
EDIT: Here is all of my code. The following code gets the metadata from the text file and inserts it into the PDF files' metadata section:
var articles = Directory.GetFiles(FILE_PATH, "*.pdf");
foreach (var article in articles)
{
var file_name = Path.GetFileName(article);
var read = new PdfReader(article);
var size = read.GetPageSizeWithRotation(1);
var doc = new Document(size);
var write = PdfWriter.GetInstance(doc, new FileStream(TEMP_PATH + file_name, FileMode.Create, FileAccess.Write));
// Article file names like, 1.pdf, 2.pdf, 3.pdf....
// article_meta_data.txt file content like this:
//1#Article 1 Tag Number#Article 1 first - last page number#Article 1 Title#Article 1 Author#Article 1 Subject#Article 1 Keywords
//2#Article 2 Tag Number#Article 2 first - last page number#Article 2 Title#Article 2 Author#Article 2 Subject#Article 2 Keywords
//3#Article 3 Tag Number#Article 3 first - last page number#Article 3 Title#Article 3 Author#Article 3 Subject#Article 3 Keywords
var pdf_file_name = Convert.ToInt32(Path.GetFileNameWithoutExtension(article)) - 1;
var line = File.ReadAllLines(FILE_PATH + @"article_meta_data.txt");
var info = line[pdf_file_name].Split('#');
var producer = Kunye(info); // It returns like: Ankara Üniversitesi Hukuk Fakültesi Dergisi (AÜHFD), C.59, S.2, y.2010, s.309-334.
var keywords = string.IsNullOrEmpty(info[6]) ? "" : info[6];
doc.AddTitle(info[3]);
doc.AddSubject(info[5]);
doc.AddCreator("UzPDF");
doc.AddAuthor(info[4]);
write.Info.Put(new PdfName("Producer"), new PdfString(producer, "UTF-16"));
doc.AddKeywords(keywords);
doc.Open();
var cb = write.DirectContent;
for (var page_number = 1; page_number <= read.NumberOfPages; page_number++)
{
doc.NewPage();
var page = write.GetImportedPage(read, page_number);
cb.AddTemplate(page, 0, 0);
}
doc.Close();
read.Close();
File.Delete(article);
File.Move(TEMP_PATH + file_name, FILE_PATH + file_name);
}
And the following code reads the data from the files and inserts it into the SQLite database file. For the database operations, I am using Devart dotConnect for SQLite.
var files = Directory.GetFiles(FILE_PATH, "*.pdf");
var connection = new Linq2SQLiteDataContext();
TruncateTable(connection);
var i = 1;
foreach (var file in files)
{
var read = new PdfReader(file);
var title = read.Info["Title"].Trim();
var author = read.Info["Author"].Trim();
var producer = read.Info["Producer"].Trim();
var file_name = Path.GetFileName(file)?.Trim();
var subject = read.Info["Subject"].Trim();
var keywords = read.Info["Keywords"].Trim();
var art = new article
{
id = i,
title = (title.Length > 255) ? title.Substring(0, 255) : title,
author = (author.Length > 100) ? author.Substring(0, 100) : author,
producer = (producer.Length > 255) ? producer.Substring(0, 255) : producer,
filename = file_name != null && (file_name.Length > 50) ? file_name.Substring(0, 50) : file_name,
subject = (subject.Length > 50) ? subject.Substring(0, 50) : subject,
keywords = (keywords.Length > 500) ? keywords.Substring(0, 500) : keywords,
createdate = File.GetCreationTime(file),
update = File.GetLastWriteTime(file)
};
connection.articles.InsertOnSubmit(art);
i++;
}
connection.SubmitChanges();
Instead of:
new PdfString(producer, "UTF-16")
Use:
new PdfString(producer, PdfString.TEXT_UNICODE)
UTF-16 is one specific way to store Unicode values, but you don't need to worry about that: iText will take care of everything for you.
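As background, a PDF text string that is not in PDFDocEncoding is stored as big-endian UTF-16 with a leading byte order mark. The Python sketch below only illustrates why a reader can return garbage when that marker is missing; it is not a description of what iTextSharp does internally. The function decode_pdf_text is hypothetical, and Latin-1 stands in for PDFDocEncoding as a simplification.

```python
import codecs

producer = "Ankara Üniversitesi Hukuk Fakültesi Dergisi"

# Big-endian UTF-16 preceded by the byte order mark FE FF.
pdf_bytes = codecs.BOM_UTF16_BE + producer.encode("utf-16-be")


def decode_pdf_text(raw):
    """Decode a text string the way a simple reader might."""
    if raw.startswith(codecs.BOM_UTF16_BE):
        return raw[2:].decode("utf-16-be")
    # No marker: treat the bytes as a single-byte encoding.
    # Latin-1 is used here as a stand-in for PDFDocEncoding.
    return raw.decode("latin-1")


round_trip = decode_pdf_text(pdf_bytes)

# Little-endian bytes without a marker are not recognised as Unicode,
# so every character comes back followed by a stray NUL character.
garbled = decode_pdf_text(producer.encode("utf-16-le"))

print(round_trip == producer)
print(repr(garbled[:6]))
```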

StoreRequestParameters: problem getting the values

On the web service side, I use the following to get the grid filter parameters:
StoreRequestParameters parameters = new StoreRequestParameters(this.Context);
string condition = parameters.GridFilters.ToString();
// I am sending this to the method "List<Ks> Get(....)"
Inside that other method, I try to get the selected grid filter values like this:
public List<Ks> Get(int start, int limit, string sort, string terssiralama, string condition, out int totalrow)
{
FilterConditions fc = new FilterConditions(condition);
foreach (FilterCondition cnd in fc.Conditions)
{
Comparison comparison = cnd.Comparison;
string fi = cnd.Field;
FilterType type = cnd.Type;
switch (cnd.Type)
{
case FilterType.Date:
switch (comparison)
{
case Comparison.Eq:
field1 = cnd.Field;
cmp1 = "=";
value1 = cnd.Value<string>();
...........
..........
}
but I fail to get the values with this line:
FilterConditions fc = new FilterConditions(condition);
I couldn't pass the string values.
Should I serialize or deserialize it first?
Instead of using this:
StoreRequestParameters parameters = new StoreRequestParameters(this.Context);
string condition = parameters.GridFilters.ToString();
I use this:
string obj = this.Context.Request["filter"];
and pass it to:
FilterConditions fc = new FilterConditions(obj);
All filter conditions can then be reached through the fc variable.

Getting date as string - need to convert

Programming in Flex 4.5
I'm getting a date as a String.
I don't know what date or hour I'm getting.
I want to convert the string to date and take only the hours & minutes.
For example:
Getting - "2012-02-07T13:35:46+02:00"
I want to see: 13:35.
Suggestions or any other solutions?
After some digging, here is the solution:
var myDate:Date;
myDate = DateFormatter.parseDateString(myDateString);
var dateResult:String = myDate.getHours() + ":" + myDate.getMinutes();
Thanks anyway! :-)!
You can use date.getHours() and date.getMinutes(). Try the following (note that the "YYYY-MM-DD" mask describes only the date part, so verify that the time component survives parsing):
var d:Date = DateField.stringToDate("your_date_string","YYYY-MM-DD");
trace("hours: ", d.getHours()); // returns 13
trace("minutes: ", d.getMinutes()); // returns 35
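private function init():void
{
    var isoStr:String = "2012-02-07T13:35:46+02:00";
    var d:Date = isoToDate(isoStr);
    trace(d.hours);
}
// Rewrites the ISO string into a form that Date.parse() accepts.
private function isoToDate(value:String):Date
{
    var dateStr:String = value;
    dateStr = dateStr.replace(/\-/g, "/"); // 2012-02-07 -> 2012/02/07
    dateStr = dateStr.replace("T", " "); // separate date and time with a space
    // The "+02:00" offset is hard-coded here; other offsets need their own handling.
    dateStr = dateStr.replace("+02:00", " GMT-0000");
    return new Date(Date.parse(dateStr));
}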
I see you've already got the answer, but for future users, here it is.
var myDateString:String="2012-02-07T13:35:46+02:00"
//This is of the format <yyyy-mm-dd>T<hh:mm:ss><UTC-OFFSET AS hh:mm>
//You could write your own function to parse it, or use Flex's DateFormatter class
var myDate:Date=DateFormatter.parseDateString(myDateString);
//Now, myDate has the date as a Flex Date type.
//You can use the various date functions. In this case,
trace(myDate.getHours()); //Traces the hh value
trace(myDate.getMinutes()); //Traces the mm value
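For readers outside Flex, the same "parse, then keep only hours and minutes" step can be sketched in Python. The function name hours_minutes is made up for the example, and it assumes Python 3.7 or later, where datetime.fromisoformat accepts an offset written as +02:00. It keeps the wall-clock time exactly as written in the string and zero-pads both fields, which plain concatenation of getHours() and getMinutes() would not do for values below 10.

```python
from datetime import datetime


def hours_minutes(iso_string):
    """Return the HH:MM part of an ISO 8601 timestamp, as written in the string."""
    parsed = datetime.fromisoformat(iso_string)
    return parsed.strftime("%H:%M")


print(hours_minutes("2012-02-07T13:35:46+02:00"))  # 13:35
print(hours_minutes("2012-02-07T09:05:00+02:00"))  # 09:05
```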

ActionScript 3.0: How to extract two values from a string?

Hi,
I have a URL
var str:String = "conn=rtmp://server.com/service/&fileId=myfile.flv"
or
var str:String = "fileId=myfile.flv&conn=rtmp://server.com/service/"
The string may come in either order, but I need to get the values of "conn" and "fileId" from it.
How can I write a function for that?
I'm guessing that you're having trouble with the second '=' in the string. Fortunately, ActionScript's String.split method supports splitting on strings, so the following code should work:
var str:String = "conn=rtmp://server.com/service/&fileId=myfile.flv";
var conn:String = (str + "&").split("conn=")[1].split("&")[0];
and
var str:String = "fileId=myfile.flv&conn=rtmp://server.com/service/";
var fileId:String = (str + "&").split("fileId=")[1].split("&")[0];
Note: I'm appending a & to the string in case the string doesn't contain any URL parameters beyond the one we're looking for.
var str:String = "fileId=myfile.flv&conn=rtmp://server.com/service/"
var fa:Array = str.split("&");
for(var i:uint=0;i<fa.length;i++)
fa[i] = fa[i].split('=');
This is what the "fa" variable looks like in the end:
fa =
[
["fileId","myfile.flv"],
["conn","rtmp://server.com/service/"]
]
var url:String = "fileId=myfile.flv&conn=rtmp://server.com/service/";
var strArray:Array = url.split(/[=&]/); // split on both '=' and '&'
trace(strArray[0]) //Just to test
This returns an array with the names ("fileId", "conn") at the even indexes (0, 2) and the corresponding values at the odd indexes (1, 3).
Or was it something else you needed?
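The approaches above share one idea: split on "&" first, then split each pair at its first "=". A minimal Python sketch of that idea follows; the function name parse_params is hypothetical, and no URL decoding is attempted.

```python
def parse_params(query):
    """Split 'a=1&b=2' into a dictionary, independent of parameter order."""
    params = {}
    for pair in query.split("&"):
        if not pair:
            continue  # tolerate a trailing or doubled '&'
        # partition() cuts at the first '=' only, so values may contain '='.
        name, _, value = pair.partition("=")
        params[name] = value
    return params


first = parse_params("conn=rtmp://server.com/service/&fileId=myfile.flv")
second = parse_params("fileId=myfile.flv&conn=rtmp://server.com/service/")
print(first["conn"], first["fileId"])
print(first == second)
```

Looking values up by name means the caller does not need to know which parameter came first.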
