How to open varbinary word doc as HTML - asp.net

I have a problem that I have not been able to solve in months. I store Word document resumes as varbinary(max). I can retrieve the resumes based on a full-text search without any problem, but they are returned as Word documents by an .ashx handler using the code below. I really need to implement hit highlighting on the site so that users can see whether a returned resume is a good fit. I don't think this can be done from an .ashx file, so I think I need either to open the resume as HTML in an .aspx page and use JavaScript to do the hit highlighting, or to extract the text-only content of the Word document and wrap the hits in HTML tags before display. I can't find anything that addresses this problem. I am really hoping that someone can point me in the right direction.
Thanks in advance for any advice.
Imports System.io
Imports System.Web
Imports System.Data
Imports System.Data.SqlClient
 
Public Class ReadResume : Implements IHttpHandler
Const conString As String = "Data Source=tcp:sql2k804.discountasp.net;Initial Catalog=SQL2008R2_284060_resumedata;User ID=SQL2008R2_284060_resumedata_user;Password=mypwd2314;"
Public Sub ProcessRequest(ByVal context As HttpContext) Implements IHttpHandler.ProcessRequest
Dim con As SqlConnection = New SqlConnection(conString)
Dim cmd As SqlCommand = New SqlCommand("Select ResumeDoc, DocTypeExtension From ResumeTable WHERE CandidateId=@CandidateId", con)
Dim CId As String = System.Web.HttpContext.Current.Request.QueryString("Para")
cmd.Parameters.AddWithValue("@CandidateId", CId)
Using con
con.Open()
Dim myReader As SqlDataReader = cmd.ExecuteReader
If myReader.Read() Then
context.Response.Clear()
context.Response.ClearContent()
context.Response.ClearHeaders()
Dim file() As Byte = CType(myReader("ResumeDoc"), Byte())
Dim doc_type As String = CType(myReader("DocTypeExtension"), String)
context.Response.ContentEncoding = System.Text.Encoding.UTF8
context.Response.ContentType = "Application/msword"
context.Response.AddHeader("content-disposition", "Candidate Resume")
context.Response.BinaryWrite(file)
End If
End Using
End Sub
Public ReadOnly Property IsReusable() As Boolean Implements IHttpHandler.IsReusable
Get
Return False
End Get
End Property
End Class

You can use the Microsoft Office COM components to work with Word documents. For example, here is one way to convert Word to HTML: http://rongchaua.net/blog/c-convert-word-to-html/
UPDATE:
There are other solutions.
If you have only .docx (not .doc) documents then you can use this simple code to extract plain text from docx documents: http://www.codeproject.com/KB/office/ExtractTextFromDOCXs.aspx This is the same code: http://conceptdev.blogspot.com/2007/03/open-docx-using-c-to-extract-text-for.html
There are some commercial libraries for reading/writing Word documents:
http://www.aspose.com/categories/.net-components/aspose.words-for-.net/default.aspx
http://www.cellbi.com/Products.aspx
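Once the plain text has been extracted by any of the approaches above, the hit highlighting itself could be done on the server before the text is written to the page. The following is only a minimal sketch: the module name HitHighlighter and the function Highlight are hypothetical, and it assumes the extracted text and the search terms are already available as strings. It encodes the text first and then wraps each term in a mark element.

```vbnet
Imports System.Collections.Generic
Imports System.Net
Imports System.Text.RegularExpressions

Public Module HitHighlighter
    ' Hypothetical helper: returns HTML in which every occurrence of a search term
    ' is wrapped in <mark> tags. Matching is case-insensitive.
    Public Function Highlight(ByVal plainText As String, ByVal terms As IEnumerable(Of String)) As String
        ' Encode first so that characters such as < and & in the resume are shown literally.
        Dim encoded As String = WebUtility.HtmlEncode(plainText)

        ' Encode and regex-escape each term so it matches the encoded text exactly.
        Dim parts As New List(Of String)
        For Each t As String In terms
            If Not String.IsNullOrWhiteSpace(t) Then
                parts.Add(Regex.Escape(WebUtility.HtmlEncode(t.Trim())))
            End If
        Next
        If parts.Count = 0 Then Return encoded

        ' One alternation pattern covers all terms in a single pass.
        Dim pattern As String = "(" & String.Join("|", parts.ToArray()) & ")"
        Return Regex.Replace(encoded, pattern, "<mark>$1</mark>", RegexOptions.IgnoreCase)
    End Function
End Module
```

A page could then assign the result to a Literal control, for example litResume.Text = HitHighlighter.Highlight(resumeText, New String() {"SQL", "ASP.NET"}), where litResume and resumeText are assumed names. One known limitation of this sketch is that a very short term such as "amp" can also match inside an HTML entity like &amp;amp;, so a production version would need stricter matching.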

Related

Download PDF using Response on ASPX Page only working in Page_Load

I've seen several questions relating to downloading a PDF from a Web browser using Response, but none seem to fit the mysterious issue I'm having.
I am working on a project that requires the user to be able to click a button (btnPDF) to instantly download a PDF of a Telerik report with a specific "ID" string to the Downloads folder. This process was originally located in an ASPX Page on an IIS separate from where the button is located. When btnPDF was clicked, I used Response.Redirect to download the PDF through that page. The code to download the PDF looked like this:
Response.Clear()
Response.ContentType = result.MimeType 'this is always "application/pdf"
Response.Cache.SetCacheability(HttpCacheability.Private)
Response.Expires = -1
Response.Buffer = True
Response.AddHeader("Content-Disposition", String.Format("{0};FileName={1}", "attachment", fileName))
Response.BinaryWrite(result.DocumentBytes)
Response.End()
Note that result.DocumentBytes is a byte array containing correct bytes for the PDF.
This code worked fine. Now, instead of having the process on a separate Page in a separate project, I need to merge the process onto the same page where btnPDF is located, so that when you click btnPDF, a subroutine is called that performs the same task. I thought this would be very easy, pretty much a copy and paste. With the same code added in a new subroutine, this is what my click event handler "ButtonPDF_Click" now looks like:
Protected Sub ButtonPDF_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles btnPDF.Click
DownloadReportPDF(Me.RadGrid1.SelectedValue.ToString())
Dim strMessage As String = "alert('Printed PDF Sheet.');"
ScriptManager.RegisterStartupScript(Me, Me.GetType, "MyScript", strMessage, True)
End Sub
Protected Sub DownloadReportPDF(ByVal releaseMasterId As String)
'Service call to generate report source
Dim service As New TelerikReportLibrary.ReportServices.PPSReportService
Dim source As Telerik.Reporting.TypeReportSource = service.GetReportSource(releaseMasterId)
'Render PDF and download
Dim reportProcessor As New ReportProcessor()
Dim result As RenderingResult = reportProcessor.RenderReport("PDF", source, Nothing)
Dim fileName As String = result.DocumentName + "_" + releaseMasterId + "." + result.Extension
Response.Clear()
Response.ContentType = result.MimeType 'this is always "application/pdf"
Response.Cache.SetCacheability(HttpCacheability.Private)
Response.Expires = -1
Response.Buffer = True
Response.AddHeader("Content-Disposition", String.Format("{0};FileName={1}", "attachment", fileName))
Response.BinaryWrite(result.DocumentBytes)
Response.End()
End Sub
But the PDF no longer downloads. An accurate byte array is still created, but the Response portion does not result in the PDF being downloaded from the browser. I've found that putting a call to DownloadReportPDF in the Page_Load handler on the same Page successfully generates and downloads a PDF as it did before.
I can't see any reason why this isn't working, but I'm new to ASP, and I'm not great in VB. I've tried using Response.OutputStream, Response.WriteFile, and making use of a MemoryStream, among several other things that I've lost track of. I'm hoping there's something simple, maybe some sort of property of the Page or btnPDF I could be missing. Here is the markup for btnPDF, just in case:
<asp:linkButton ID="btnPDF" CssClass="btn btn-default" runat="server" Width="115px">
<i class="fa fa-file-text" title="Edit"></i> PDF
</asp:linkButton>
What could be causing such a problem? Where should I look at this point?
Let me know if more information is needed.
Thanks,
Shane
EDIT:
I experimented with setting a session variable on btnPDF_Click, and handling the PDF download on postback. Again, a valid byte array was generated, but the HttpResponse did not cause the PDF to download from the browser.
EDIT:
Building on the last edit, this tells me that calling DownloadReportPDF from Page_Load works only when IsPostBack is false. I just tested this thought, and it holds true. In the above code, if I check IsPostBack at the moment I'm trying to download the PDF, it is true. Investigating further.
Alright, I finally found a solution I'm satisfied with (though I still don't understand why I can't download the PDF using Response while IsPostBack is true).
Inspired by this thread, I put the previously posted code in an HttpHandler called PDFDownloadHandler, then used Response.Redirect in the btnPDF_Click event handler to utilize PDFDownloadHandler. This article helped me a lot on that process, as it is something I have not done before.
In case anyone else runs into this problem, here is the new PDFDownloadHandler:
Imports Microsoft.VisualBasic
Imports System.Web
Imports Telerik.Reporting
Imports Telerik.Reporting.Processing
Public Class PDFDownloadHandler
Implements IHttpHandler
Public Sub ProcessRequest(ByVal context As _
System.Web.HttpContext) Implements _
System.Web.IHttpHandler.ProcessRequest
Dim request As HttpRequest = context.Request
Dim response As HttpResponse = context.Response
Dim path As String = request.Path
If path.Contains("pps.pdfdownload") Then
Dim releaseMasterId As String = request.QueryString("ID")
If releaseMasterId IsNot Nothing Then
'Service call to generate report source
Dim service As New TelerikReportLibrary.ReportServices.PPSReportService
Dim source As Telerik.Reporting.TypeReportSource = service.GetReportSource(releaseMasterId)
'Render PDF and save
Dim reportProcessor As New ReportProcessor()
Dim result As RenderingResult = reportProcessor.RenderReport("PDF", source, Nothing)
Dim fileName As String = result.DocumentName + "_" + releaseMasterId + "." + result.Extension
response.Clear()
response.ContentType = result.MimeType
response.Cache.SetCacheability(HttpCacheability.Private)
response.Expires = -1
response.Buffer = True
response.AddHeader("Content-Disposition", String.Format("{0};FileName={1}", "attachment", fileName))
response.BinaryWrite(result.DocumentBytes)
End If
End If
response.End()
End Sub
Public ReadOnly Property IsReusable() As Boolean _
Implements System.Web.IHttpHandler.IsReusable
Get
Return False
End Get
End Property
End Class
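The click handler that redirects to this handler is not shown in the post. A minimal sketch of what it might look like follows, assuming the handler is mapped in web.config to a path containing "pps.pdfdownload" (the string the handler checks for) and that the ID is still taken from RadGrid1 as in the original click handler. It belongs in the code-behind of the page that contains btnPDF.

```vbnet
Protected Sub ButtonPDF_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles btnPDF.Click
    ' Assumption: a row is selected in RadGrid1, as in the original handler.
    Dim releaseMasterId As String = Me.RadGrid1.SelectedValue.ToString()

    ' "pps.pdfdownload" is assumed to be the path registered for PDFDownloadHandler.
    ' The handler reads the value back from the "ID" query string parameter.
    Response.Redirect("~/pps.pdfdownload?ID=" & Server.UrlEncode(releaseMasterId))
End Sub
```

Because the handler answers with a Content-Disposition of attachment, the browser should stay on the current page and only show the download prompt.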
Any further insight on why the original technique did not work is greatly appreciated.

simply dump write xml to screen

XML/ASP.net VB newbie here having fun can't find needle in haystack.
I just want to dump some XML to the screen! Loads of sites tell me how to iterate the nodes, xpath my way in directly. I just want the whole lot to screen.
Dim doc As New XmlDocument
doc.Load("remote.xml")
Dim writer as XmlTextWriter = new XmlTextWriter("debug.xml",nothing)
writer.Formatting = Formatting.Indented
doc.Save(writer)
Does a sterling job of getting it to a file, but I want it on the screen. doc.print(writer).....
Please help.
Try it with the InnerXml of your doc. Make sure to HtmlEncode it so that it shows up. Put a Literal control on your aspx page with id='ltXml' and then use something like this:
Dim doc As New XmlDocument()
doc.Load(Server.MapPath("~/remote.xml"))
ltXml.Text = Server.HtmlEncode(doc.InnerXml)
Edited per comment by OP.
Have the function in your class return the Xml string.
Public Class [MyClass]
Public Shared Function getXml() As String
Dim doc As New XmlDocument()
doc.Load("somefile.xml")
Return HttpContext.Current.Server.HtmlEncode(doc.InnerXml)
End Function
End Class
Then in your aspx code behind of your webpage call the class function:
ltXml.Text = [MyClass].getXml()
I suggest using the newer XDocument class (LINQ to XML) instead of the older XmlDocument.
XDocument.ToString already returns a nicely formatted version of the XML, so all you need to do is:
Dim doc As XDocument = XDocument.Load("remote.xml")
Dim formatted As String = doc.ToString()
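To actually show that formatted string in the browser, it still has to be HTML-encoded, as the earlier answer points out. The following is a small sketch along those lines; the module name XmlDump and the function ToDisplayHtml are hypothetical.

```vbnet
Imports System.Net
Imports System.Xml.Linq

Public Module XmlDump
    ' Hypothetical helper: returns the indented XML, HTML-encoded and wrapped in <pre>
    ' so that the browser keeps the line breaks and indentation.
    Public Function ToDisplayHtml(ByVal doc As XDocument) As String
        Return "<pre>" & WebUtility.HtmlEncode(doc.ToString()) & "</pre>"
    End Function
End Module
```

In the code-behind this could be used as ltXml.Text = XmlDump.ToDisplayHtml(XDocument.Load(Server.MapPath("~/remote.xml"))), reusing the ltXml Literal from the earlier answer. Note that XDocument.ToString does not include the XML declaration line.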

Large File Upload Using HttpHandler or HttpModule?

I have a webform application. It needs to be able to upload large files (100MB). I intended to use an httpHandler and an httpModule to split the file into chunks.
I also had a look at http://forums.asp.net/t/55127.aspx
But it is a very old post and I've seen some example on the internet using httpHandler.
e.g. http://silverlightfileupld.codeplex.com/
I'm not sure whether httpModule is still better than httpHandler, since an httpModule applies to the requests of the whole application and I just want it to apply to a specific page.
Can anybody clearly explain the shortcomings of httpHandler for large file uploads (if it has any)?
If you know a good example without flash/silverlight , could you post the link here? thx
Edit: Would Like to see some Source Code example.
Why not try plupload, which has a lot of features and many fallbacks? Here is how it is done.
This is the http handler code:
Imports System
Imports System.IO
Imports System.Web
Public Class upload : Implements IHttpHandler
Public Sub ProcessRequest(ByVal context As HttpContext) Implements IHttpHandler.ProcessRequest
Dim chunk As Integer = If(context.Request("chunk") IsNot Nothing, Integer.Parse(context.Request("chunk")), 0)
' Keep only the file name part so a posted name cannot point outside the upload folder
Dim fileName As String = If(context.Request("name") IsNot Nothing, Path.GetFileName(context.Request("name")), String.Empty)
Dim fileUpload As HttpPostedFile = context.Request.Files(0)
Dim uploadPath = context.Server.MapPath("~/uploads")
' The first chunk creates the file; later chunks are appended to it
Using fs = New FileStream(Path.Combine(uploadPath, fileName), If(chunk = 0, FileMode.Create, FileMode.Append))
' CopyTo (.NET Framework 4 and later) reads until the whole chunk is written; a single Read call may return fewer bytes than requested
fileUpload.InputStream.CopyTo(fs)
End Using
context.Response.ContentType = "text/plain"
context.Response.Write("Success")
End Sub
Public ReadOnly Property IsReusable() As Boolean Implements IHttpHandler.IsReusable
Get
Return False
End Get
End Property
End Class

Sending CSV created on the fly back to client for download

I'm converting a bunch of FOXPRO / FOXWEB apps to ASP.NET.
The underlying DB is still foxpro (for the moment).
I am passing a table to some VB.NET code that I want to have converted to a CSV file and sent back to the client for download. And it works! Sort of ... It works sometimes, but at other times, instead of asking me if I want to download the CSV file, it just spews the file to the browser window.
On the asp side, I am passing the response object, the table and the csv file name.
<%
Dim xls_fn As String = "test01.csv"
'OLEDB call to fill up 'tbl' ... this works.
sendTableAsCSVtoClient(response, tbl, xls_fn)
%>
In the file clsCommon.vb, I have the following code:
Option Explicit On
'Option Strict On
Imports System
Imports System.Web
Imports System.Web.UI
Imports System.Web.UI.Page
Imports System.IO
Imports Microsoft.VisualBasic
Imports System.Diagnostics
Imports System.Data
Imports System.Data.OleDb
Public Class clsCommon
Inherits Page
Public Shared Function enq(ByVal str As String) As String
Dim dq As String
dq = """"
Return dq & str & dq
End Function
' some other functions and subs defined in here ... blah blah blah
' ...
Public Shared Function sendTableAsCSVtoClient(ByVal resp As HttpResponse, ByVal sqlTable As DataTable, ByVal xls_fn As String) As Boolean
Dim r As DataRow
Dim c As DataColumn
Dim sep As String = ","
Dim FileExtension As String
Dim lcFileNameONLY As String
Dim i As Integer
Dim dq As String = """"
FileExtension = UCase(Path.GetExtension(xls_fn))
lcFileNameONLY = UCase(Path.GetFileNameWithoutExtension(xls_fn))
resp.Clear()
resp.ClearContent()
resp.ClearHeaders()
resp.ContentType = "application/vnd.ms-excel"
resp.AddHeader("Content-Disposition", "inline; filename=" & lcFileNameONLY & ".csv")
For Each c In sqlTable.Columns
resp.Write(UCase(c.ColumnName) & sep)
Next
resp.Write(vbCrLf)
For Each r In sqlTable.Rows
For i = 0 To sqlTable.Columns.Count - 1
resp.Write(enq(r(i)) & sep)
Next
resp.Write(vbCrLf)
Next
resp.End()
Return True
End Function
End Class
What's causing this?
How do I get around it?
I'm guessing it doesn't really matter that the source of the data is a table.
Note that the file is created on the fly and never exists on the file system of the server.
tx,
tff
Instead of using a Content-Disposition header that is Inline, use Attachment - this will always prompt for a download.
Change the following line from:
resp.AddHeader("Content-Disposition", "inline; filename=" & lcFileNameONLY & ".csv")
To
resp.AddHeader("Content-Disposition", "attachment; filename=" & lcFileNameONLY & ".csv")
See this and this for examples.
The inline type means that the browser is free to render it inline (within the browser), if it knows how to.
And see this SO question, asking why inline sometimes prompts for downloads (the exact opposite of your question...).
The issue is your Content-disposition header. It should be "attachment" instead of "inline".
You may also want to set the content type to "text/csv" instead of "application/vnd.ms-excel". This is more accurate, and it should work better for users who prefer to open CSV files with something other than Excel. However, for an in-house app perhaps vnd.ms-excel might work better?
I agree with Oded and chmullig that you should change the Content-Disposition, but I also recommend using the buffer and finishing the response with a flush:
resp.Clear()
resp.Buffer = true
'build csv
resp.Flush()
resp.Close()
I believe a call to .End() throws a ThreadAbortException to stop execution, which can cause issues depending on how you handle your exceptions. See here for more info.
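Separately from the header issue, the enq helper in the question wraps each value in double quotes without escaping quotes that occur inside the value, which can break the columns when the file is opened. The following is a minimal sketch of a replacement, not part of any answer above; the names CsvText and QuoteField are hypothetical. It doubles embedded quotes and treats missing values as empty fields.

```vbnet
Imports System

Public Module CsvText
    ' Hypothetical replacement for enq: quotes a field and doubles embedded quotes.
    Public Function QuoteField(ByVal value As Object) As String
        ' Treat Nothing and DBNull as an empty quoted field ("").
        If value Is Nothing OrElse value Is DBNull.Value Then Return """"""

        Dim text As String = Convert.ToString(value)
        ' A quote inside the value is written as two quotes, the usual CSV convention.
        Return """" & text.Replace("""", """""") & """"
    End Function
End Module
```

In the loop from the question, resp.Write(enq(r(i)) & sep) would then become resp.Write(CsvText.QuoteField(r(i)) & sep).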

Calling a Class in ASP.NET

I know my ASP.NET, but I have to admit I am dumb with classes and not sure how they work exactly. I have not worked with them yet, but I want to. What I do know is that a class is a place where I can keep code for re-use, correct? How will my class look with my code?
This is the code I use on about 3 forms, but I want to save it in one spot and just call it, for example when I click on btnSubmit.
Dim strConnection As String = ConfigurationManager.ConnectionStrings("ConnectionString").ConnectionString
Dim con As SqlConnection = New SqlConnection(strConnection)
Dim cmd As SqlCommand = New SqlCommand
Dim objDs As DataSet = New DataSet
Dim dAdapter As SqlDataAdapter = New SqlDataAdapter
cmd.Connection = con
cmd.CommandType = CommandType.Text
cmd.CommandText = "SELECT distinct FIELD FROM TABLE order by FIELD"
dAdapter.SelectCommand = cmd
con.Open()
dAdapter.Fill(objDs)
con.Close()
If (objDs.Tables(0).Rows.Count > 0) Then
lstDropdown.DataSource = objDs.Tables(0)
lstDropdown.DataTextField = "FIELD"
lstDropdown.DataValueField = "FIELD"
lstDropdown.DataBind()
lstDropdown.Items.Insert(0, "Please Select")
lstDropdown2.Items.Insert(0, "Please Select")
Else
lblMessage.Text = "* Our Database seem to be down!"
End If
What must I put here to execute my code above?
Protected Sub btnSubmit_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles btnSubmit.Click
?????????????????????????????????
End Sub
Etienne
A class (in VB.NET) is defined like so:
Public Class Person
private _firstName as string
private _lastName as string
'''Constructor with no params
public Sub New()
_firstName = ""
_lastName = ""
End Sub
'Constructor with params
Public Sub New(FirstName as String, LastName as String)
_firstName = FirstName
_lastName = LastName
End Sub
Public Property FirstName As String
Get
return _firstName
End Get
Set(value as String)
_firstName = value
End Set
End Property
Public Property LastName As String
Get
return _lastName
End Get
Set(value as String)
_lastName = value
End Set
End Property
Public Function HitHomeRun() As Boolean
'Do some stuff here
Return True 'placeholder result
End Function
End Class
You can then instantiate the class and call its members.
Dim p as New Person()
p.FirstName = "Mike"
p.LastName = "Schmidt"
dim IsHomeRunHit As Boolean = p.HitHomeRun()
Learn more about creating and consuming classes in VB.Net.
This is a very big topic and can be defined in many different ways. But typically what you are venturing into is an N-Tier architecture.
Data Access Layer
Business Logic
UI Logic
The class could be built the way your question suggests, but in the long run that approach becomes a maintenance burden, is hard to modify, and is prone to bugs. Putting any type of data access code in your UI layer is bad practice.
This is where separate layers of classes (separation of concerns) pay off: they let you reuse code and modify it easily for future expansions, features, and so on. This is getting into software architecture, which is too broad a topic to cover in one post.
But if you are really interested, here are some links to point you in the right direction.
N-Tier Architecture from Wikipedia
Data Access Layer
Business Logic Layer
Martin Fowler is an expert in Architecture
There is software that eases the pain of the DAL.
1. Linq-To-SQL: the ability to query your data via .NET objects (compiled queries)
2. Entity Framework: Microsoft's later, more general object-relational mapper, sometimes loosely described as version 2 of Linq-To-SQL
And this effectively could replace all of your SQL code.
If you want to reuse the code, you should put it in a separate project. That way you can add that project to different solutions (or just reference the compiled dll).
In your web project you add a reference to the project (or to the dll if you have compiled it before and don't want to add the project to the solution).
In your new project you add a class file, for example named UIHelper. In the class skeleton that is created for you, you add a method. As the class is in a separate project, it doesn't know about the controls in the page, so you have to send references to those in the method call:
Public Shared Sub PopulateDropdowns(lstDropdown As DropDownList, lstDropdown2 As DropDownList)
' ... here goes your code
End Sub
In your page you call it with references to the dropdown lists that you have in the page:
UIHelper.PopulateDropdowns(lstDropdown, lstDropdown2)
This will get you started. There is a lot more to learn about using classes...
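To make the idea above concrete, here is one way the helper could look with the code from the question moved into it. This is only a sketch: it is written as a Function returning Boolean rather than the Sub shown above, so that the page can decide what to display when no rows come back, and it assumes the project references System.Configuration and System.Web. TABLE and FIELD are the placeholders from the question.

```vbnet
Imports System.Configuration
Imports System.Data
Imports System.Data.SqlClient
Imports System.Web.UI.WebControls

Public Class UIHelper
    ' Fills the first list from the query in the question and adds the
    ' "Please Select" entry to both lists. Returns False when no rows come back.
    Public Shared Function PopulateDropdowns(ByVal lstDropdown As DropDownList, ByVal lstDropdown2 As DropDownList) As Boolean
        Dim strConnection As String = ConfigurationManager.ConnectionStrings("ConnectionString").ConnectionString
        Dim table As New DataTable()

        ' Using blocks dispose the connection and adapter even if Fill throws.
        Using con As New SqlConnection(strConnection)
            Using dAdapter As New SqlDataAdapter("SELECT distinct FIELD FROM TABLE order by FIELD", con)
                ' Fill opens the connection if it is closed and closes it again afterwards.
                dAdapter.Fill(table)
            End Using
        End Using

        If table.Rows.Count = 0 Then Return False

        lstDropdown.DataSource = table
        lstDropdown.DataTextField = "FIELD"
        lstDropdown.DataValueField = "FIELD"
        lstDropdown.DataBind()
        lstDropdown.Items.Insert(0, "Please Select")
        lstDropdown2.Items.Insert(0, "Please Select")
        Return True
    End Function
End Class
```

Each page's btnSubmit_Click could then call UIHelper.PopulateDropdowns(lstDropdown, lstDropdown2) and set lblMessage.Text to the error message from the question when the call returns False.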
I sometimes create a "Common" class and put public Shared methods in it that I want to call from different places.
Something along these lines:
Public Class Common
Public Shared Sub MyMethod
'Do things.
End Sub
End Class
I'd then call it using:
Common.MyMethod
Obviously, you can write a sub/function definition that takes the parameters you require.
Sorry if my VB.NET code is a bit off. I usually use C#.
I think you should look into using the Visual Studio designer tools to do your data access and data binding. Search for typed datasets.

Resources