I retrieve the HTML from a cross-domain web page using ASP.NET (VB):
Dim objWebClient As New WebClient()
objWebClient.UseDefaultCredentials = True
objWebClient.Headers.Add(HttpRequestHeader.UserAgent, "XPlorer")
'STEP 2: Call the DownloadData method
Const strURL As String = "http://www.example.com"
Dim aRequestedHTML() As Byte
aRequestedHTML = objWebClient.DownloadData(strURL)
'STEP 3: Convert the Byte array into a String
Dim objUTF8 As New UTF8Encoding()
Dim strRequestedHTML As String
strRequestedHTML = objUTF8.GetString(aRequestedHTML)
Additionally, I want to show just a portion of it in a Literal control. As an example, I want to show just the table with the class "result".
How do I process this further with XML and XQuery in VB.NET? How do I declare strRequestedHTML as XML, and how do I run XQuery against it?
Thanks in advance.
If you're talking about a web page (HTML), it would be better to parse it as HTML rather than XML.
Html Agility Pack is a good open source HTML parser for .NET.
You could also use Html Agility Pack to download the web page as well.
Something like:
Dim htmlWeb As HtmlAgilityPack.HtmlWeb = New HtmlWeb()
Dim htmlDocument As HtmlAgilityPack.HtmlDocument = htmlWeb.Load("http://www.google.com")
Dim htmlNode As HtmlAgilityPack.HtmlNode = htmlDocument.DocumentNode.SelectSingleNode("//table[@class='result']")
Response.Write(htmlNode.OuterHtml)
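Since the question already has the page in strRequestedHTML, the same lookup can be run on that string instead of downloading the page again. The following is a minimal sketch, assuming the Html Agility Pack package is referenced; the module name, the function name ExtractResultTable and the Literal control ID litResult are made up for illustration. SelectSingleNode returns Nothing when no node matches, so the sketch checks for that before reading OuterHtml.

```vb
Imports System
Imports HtmlAgilityPack

Module ResultTableSketch
    ' Returns the outer HTML of the first <table class="result"> in the
    ' given markup, or an empty string when no such table exists.
    Function ExtractResultTable(html As String) As String
        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        ' The XPath matches only when the class attribute is exactly "result".
        Dim table As HtmlNode = doc.DocumentNode.SelectSingleNode("//table[@class='result']")
        If table Is Nothing Then
            Return String.Empty
        End If

        Return table.OuterHtml
    End Function
End Module
```

In the page's code-behind this could be used as litResult.Text = ExtractResultTable(strRequestedHTML), where litResult is a hypothetical Literal control.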
Related
I am trying to create a mail merge process with a Word document (images, text, HTML). I am finding help online for desktop applications, but I am trying to do it from an ASP.NET web page (.aspx).
Here is the code I tried:
Dim wordDoc As WordprocessingDocument = WordprocessingDocument.Open(document, True)
Using (wordDoc)
Dim docText As String = Nothing
Dim sr As StreamReader = New StreamReader(wordDoc.MainDocumentPart.GetStream)
Using (sr)
docText = sr.ReadToEnd
End Using
Dim regexText As Regex = New Regex("Hello world!")
docText = regexText.Replace(docText, "Hi Everyone!")
Dim sw As StreamWriter = New StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create))
Using (sw)
sw.Write(docText)
End Using
End Using
But it will not even compile. I get this error in Visual Studio 2015:
Severity Code Description Project File Line Suppression State
Error BC30002 Type 'WordprocessingDocument' is not defined.
Any ideas?
thanks
I have an XML string in a database column like this:
<trueFalseQuestion id="585" status="correct" maxPoints="10"
maxAttempts="1"
awardedPoints="10"
usedAttempts="1"
xmlns="http://www.ispringsolutions.com/ispring/quizbuilder/quizresults">
<direction>You have NO control over how you treat customers.</direction>
<answers correctAnswerIndex="1" userAnswerIndex="1">
<answer>True</answer>
<answer>False</answer>
</answers>
</trueFalseQuestion>
But I need to do XML operations on this string, such as selecting its name, attribute values, inner text, etc. How can I do this starting from the string?
EDIT
I'm sharing the code snippet I tried, which is not working:
Dim myXML As String
Dim gDt As New DataTable
gDt.Columns.Add("id")
gDt.Columns.Add("questionid")
gDt.Columns.Add("serial")
gDt.Columns.Add("direction")
Dim dr As DataRow
myXML ='<my above shared XML>'
Dim xmlDoc As New XmlDocument
xmlDoc.LoadXml(myXML)
dr = gDt.NewRow
dr("serial") = 1
dr("id") = xmlDoc.Attributes("id").Value
dr("direction") = xmlDoc("direction").InnerText
gDt.Rows.Add(dr)
But that's not working the way I want.
There are many ways to parse XML in .NET, such as using one of the serialization classes or the XmlReader class, but the two most popular options would be to parse it with either XElement or XmlDocument. For instance:
Dim input As String = "<trueFalseQuestion id=""585"" status=""correct"" maxPoints=""10"" maxAttempts=""1"" awardedPoints=""10"" usedAttempts=""1"" xmlns=""http://www.ispringsolutions.com/ispring/quizbuilder/quizresults""><direction>You have NO control over how you treat customers.</direction><answers correctAnswerIndex=""1"" userAnswerIndex=""1""><answer>True</answer><answer>False</answer></answers></trueFalseQuestion>"
Dim element As XElement = XElement.Parse(input)
Dim id As String = element.@id
Or:
Dim input As String = "<trueFalseQuestion id=""585"" status=""correct"" maxPoints=""10"" maxAttempts=""1"" awardedPoints=""10"" usedAttempts=""1"" xmlns=""http://www.ispringsolutions.com/ispring/quizbuilder/quizresults""><direction>You have NO control over how you treat customers.</direction><answers correctAnswerIndex=""1"" userAnswerIndex=""1""><answer>True</answer><answer>False</answer></answers></trueFalseQuestion>"
Dim doc As New XmlDocument()
doc.LoadXml(input)
Dim nsmgr As New XmlNamespaceManager(doc.NameTable)
nsmgr.AddNamespace("q", "http://www.ispringsolutions.com/ispring/quizbuilder/quizresults")
Dim id As String = doc.SelectSingleNode("/q:trueFalseQuestion/@id", nsmgr).InnerText
Based on your updated question, it looks like the trouble you were having is that you weren't properly specifying the namespace. If you use XElement, it's much more forgiving (i.e. loose), but when you use XPath to select nodes in an XmlDocument, you need to specify every namespace, even when it's the default namespace on the top-level element of the document.
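To round out the XElement option for the other values the question asks about (element name and inner text), here is a minimal sketch. The module and function names are made up for illustration. One caveat worth stating: the id attribute has no namespace, so it can be read by its plain name, but the child elements inherit the default namespace, so with the Element and Descendants methods they have to be qualified with an XNamespace.

```vb
Imports System
Imports System.Linq
Imports System.Xml.Linq

Module QuizXmlSketch
    ' Namespace URI copied from the xmlns attribute in the question's XML.
    Private ReadOnly Ns As XNamespace = "http://www.ispringsolutions.com/ispring/quizbuilder/quizresults"

    ' Returns the text of the <direction> child element.
    Function GetDirection(question As XElement) As String
        Return question.Element(Ns + "direction").Value
    End Function

    ' Returns the text of every <answer> element, in document order.
    Function GetAnswers(question As XElement) As String()
        Return question.Descendants(Ns + "answer").Select(Function(a) a.Value).ToArray()
    End Function

    Sub Main()
        Dim input As String = "<trueFalseQuestion id=""585"" status=""correct"" xmlns=""http://www.ispringsolutions.com/ispring/quizbuilder/quizresults""><direction>You have NO control over how you treat customers.</direction><answers correctAnswerIndex=""1"" userAnswerIndex=""1""><answer>True</answer><answer>False</answer></answers></trueFalseQuestion>"
        Dim question As XElement = XElement.Parse(input)

        Console.WriteLine(question.Name.LocalName)            ' trueFalseQuestion
        Console.WriteLine(question.Attribute("id").Value)     ' 585
        Console.WriteLine(GetDirection(question))             ' You have NO control over how you treat customers.
        Console.WriteLine(String.Join(", ", GetAnswers(question))) ' True, False
    End Sub
End Module
```

Each value can then be assigned to the DataRow columns from the question, for example dr("direction") = GetDirection(question).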
I'm trying to scrape some schedules off of a website. The information is displayed in a GridView with paging.
The URL is:
http://www.landmarkworldwide.com/when-and-where/register/search-results.aspx?prgid=0&pgID=270&crid=0&ctid=&sdt=0
My issue is when I want to scrape pages other than #1 in the GridView.
The best post I found so far was This One, but it doesn't work and that topic is not complete. I tried to use Fiddler and Chrome to get the post data and use it, but I can't get it to work for me. Can you guys see what's missing?
Here's the code I am using. It's in VB, but you can answer in C# and I'll translate :-) (sorry)
Protected Sub Page_Load(sender As Object, e As System.EventArgs) Handles Me.Load
Dim lcUrl As String = "http://www.landmarkworldwide.com/when-and-where/register/search-results.aspx?prgid=0&pgID=270&crid=0&ctid=&sdt=0"
' first, request the login form to get the viewstate value
Dim webRequest__1 As HttpWebRequest = TryCast(WebRequest.Create(lcUrl), HttpWebRequest)
Dim responseReader As New StreamReader(webRequest__1.GetResponse().GetResponseStream())
Dim responseData As String = responseReader.ReadToEnd()
responseReader.Close()
' extract the viewstate value and build out POST data
Dim viewState As String = ExtractViewState(responseData)
Dim loHttp As HttpWebRequest = DirectCast(WebRequest.Create(lcUrl), HttpWebRequest)
' *** Send any POST data
Dim lcPostData As String = [String].Format("__VIEWSTATE={0}&__EVENTTARGET={1}&__EVENTARGUMENT={2}", viewState, HttpUtility.UrlEncode("contentwrapper_0$maincontent_0$maincontentfullwidth_0$ucSearchResults$gvPrograms"), HttpUtility.UrlEncode("Page$3"))
loHttp.Method = "POST"
Dim lbPostBuffer As Byte() = System.Text.Encoding.GetEncoding(1252).GetBytes(lcPostData)
loHttp.ContentLength = lbPostBuffer.Length
Dim loPostData As Stream = loHttp.GetRequestStream()
loPostData.Write(lbPostBuffer, 0, lbPostBuffer.Length)
loPostData.Close()
Dim loWebResponse As HttpWebResponse = DirectCast(loHttp.GetResponse(), HttpWebResponse)
Dim enc As Encoding = System.Text.Encoding.GetEncoding(1252)
Dim loResponseStream As New StreamReader(loWebResponse.GetResponseStream(), enc)
Dim lcHtml As String = loResponseStream.ReadToEnd()
loWebResponse.Close()
loResponseStream.Close()
Response.Write(lcHtml)
End Sub
Private Function ExtractViewState(s As String) As String
Dim viewStateNameDelimiter As String = "__VIEWSTATE"
Dim valueDelimiter As String = "value="""
Dim viewStateNamePosition As Integer = s.IndexOf(viewStateNameDelimiter)
Dim viewStateValuePosition As Integer = s.IndexOf(valueDelimiter, viewStateNamePosition)
Dim viewStateStartPosition As Integer = viewStateValuePosition + valueDelimiter.Length
Dim viewStateEndPosition As Integer = s.IndexOf("""", viewStateStartPosition)
Return HttpUtility.UrlEncodeUnicode(s.Substring(viewStateStartPosition, viewStateEndPosition - viewStateStartPosition))
End Function
To make it work you need to send all of the page's input fields, not only the viewstate. Other critical data, such as __EVENTVALIDATION, is not handled in your code. So:
First, scrape page #1. Load it and use the Html Agility Pack to convert it into a usable structure.
Then extract from that structure the input data that you need to post. The answer to "HTML Agility Pack get all input fields" has a code snippet showing how to do that:
foreach (HtmlNode input in doc.DocumentNode.SelectNodes("//input"))
{
// use this to create the post string
// input.Attributes["value"];
}
Once you have all the post data needed for a valid post, move on to the next step. Here is an example: How to pass POST parameters to ASP.Net web request?
You can also read: How to use HTML Agility pack
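The "create the post string" step above can be sketched in VB as follows. This is only an illustration under assumptions: the module and function names are made up, the field values in Main are placeholders rather than real page data, and WebUtility.UrlEncode (available from .NET Framework 4.5) is used for the form encoding. The idea is to collect every input name/value pair from page #1, set __EVENTTARGET and __EVENTARGUMENT to the paging values from the question, and join everything into one URL-encoded body.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Net
Imports System.Text

Module PostBodySketch
    ' Joins name/value pairs into an application/x-www-form-urlencoded body.
    Function BuildPostBody(fields As IEnumerable(Of KeyValuePair(Of String, String))) As String
        Dim sb As New StringBuilder()
        For Each field In fields
            If sb.Length > 0 Then
                sb.Append("&")
            End If
            sb.Append(WebUtility.UrlEncode(field.Key))
            sb.Append("=")
            sb.Append(WebUtility.UrlEncode(field.Value))
        Next
        Return sb.ToString()
    End Function

    Sub Main()
        ' Placeholder values; in practice these come from the <input> elements of page #1.
        Dim fields As New List(Of KeyValuePair(Of String, String)) From {
            New KeyValuePair(Of String, String)("__VIEWSTATE", "placeholder-viewstate"),
            New KeyValuePair(Of String, String)("__EVENTVALIDATION", "placeholder-validation"),
            New KeyValuePair(Of String, String)("__EVENTTARGET", "contentwrapper_0$maincontent_0$maincontentfullwidth_0$ucSearchResults$gvPrograms"),
            New KeyValuePair(Of String, String)("__EVENTARGUMENT", "Page$3")
        }
        Console.WriteLine(BuildPostBody(fields))
    End Sub
End Module
```

When the body is written to the request stream as in the question's code, the request's ContentType would normally also be set to "application/x-www-form-urlencoded"; the question's code does not set it.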
XML/ASP.NET VB newbie here, having fun but can't find the needle in the haystack.
I just want to dump some XML to the screen! Loads of sites tell me how to iterate the nodes or XPath my way in directly. I just want the whole lot on screen.
Dim doc As New XmlDocument
doc.Load("remote.xml")
Dim writer as XmlTextWriter = new XmlTextWriter("debug.xml",nothing)
writer.Formatting = Formatting.Indented
doc.Save(writer)
This does a sterling job of getting it to a file, but I want it on the screen. Something like doc.print(writer)...
Please help.
Try it with the InnerXml of your doc. Make sure to HtmlEncode it so that it shows up. Put a Literal control on your .aspx page with id='ltXml' and then do something like this:
Dim doc As New XmlDocument()
doc.Load(Server.MapPath("~/remote.xml"))
ltXml.Text = Server.HtmlEncode(doc.InnerXml)
Edited per comment by OP.
Have the function in your class return the Xml string.
Private Class [MyClass]
Public Shared Function getXml() As String
Dim doc As New XmlDocument()
doc.Load("somefile.xml")
Return HttpContext.Current.Server.HtmlEncode(doc.InnerXml)
End Function
End Class
Then in your aspx code behind of your webpage call the class function:
ltXml.Text = [MyClass].getXml()
I suggest using the more modern XDocument class (LINQ to XML) instead of the older XmlDocument.
XDocument.ToString already returns a nicely formatted version of the XML, so all you need to do is:
Dim doc As XDocument = XDocument.Load("remote.xml")
Dim formatted As String = doc.ToString()
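To get that formatted string onto a web page, it still has to be HTML-encoded, as the other answer points out, and the line breaks need to be preserved. A minimal sketch combining the two, assuming the XML is already available as a string; the module and function names are made up, and wrapping the output in a pre element is just one way to keep the indentation visible:

```vb
Imports System
Imports System.Net
Imports System.Xml.Linq

Module XmlDisplaySketch
    ' Parses the XML, re-serializes it with indentation, HTML-encodes the
    ' result and wraps it in <pre> so the browser keeps the line breaks.
    Function FormatXmlForHtml(xml As String) As String
        Dim formatted As String = XDocument.Parse(xml).ToString()
        Return "<pre>" & WebUtility.HtmlEncode(formatted) & "</pre>"
    End Function
End Module
```

The result could then be assigned to a Literal control, for example ltXml.Text = FormatXmlForHtml(someXmlString), where someXmlString is a hypothetical variable holding the XML.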
I'm looking for options to fill a Word document from either Visual Basic or Visual C#. I'm currently using merge fields and the code below to fill specific fields in a Word document, but now I've run into a situation where I need tabular data pushed to MS Word. Is there any way to take data from a GridView (the number of rows is dynamic) and import it into a Word document table using a merge field or something of that sort? I have to maintain the format of my template doc, and would like to be able to control the layout of the page.
Dim templateDoc As String = Server.MapPath("\Userfiles\docs\" & location)
Dim mergePath As String = Server.MapPath("\App_Data\Temp\")
Dim mergeFileName As String = location.Replace("/", "_") & ".docx"
Dim mergeDoc As String = mergePath & "\" & mergeFileName
File.Copy(templateDoc, mergeDoc, True)
Using pkg As Package = Package.Open(mergeDoc, FileMode.Open, FileAccess.ReadWrite)
Dim uri As Uri = New Uri("/word/document.xml", UriKind.Relative)
Dim part As PackagePart = pkg.GetPart(uri)
Dim xmlMainXMLDoc As XmlDocument = New XmlDocument()
xmlMainXMLDoc.Load(part.GetStream(FileMode.Open, FileAccess.Read))
Dim innerXml As String = xmlMainXMLDoc.InnerXml _
.Replace("«Corporate Legal Name»", businessName) _
.Replace("«Address 1»", mailingAddress1) _
.Replace("«Address 2»", mailingAddress2) _
.Replace("«City»", city)
xmlMainXMLDoc.InnerXml = innerXml
Using partWriter As New StreamWriter(part.GetStream(FileMode.Open, FileAccess.Write))
xmlMainXMLDoc.Save(partWriter)
End Using
pkg.Close()
End Using
You can write out in HTML and save it with a .doc extension and Word will handle it gracefully.
Just like my answer for your question regarding Excel, Office writer will work for you here too!