Save images and text from a PDF in a database in ASP.NET

Is it possible to save a PDF's images and text in a database (as binary) in ASP.NET?
In my application, users need to upload a PDF file that contains several questions, and the application must extract the questions, together with their images, into the database. (The questions are separated from each other by a special marker.) Is this possible?
If so, please guide me.
Thank you.

I hope this helps you do what you are looking for:
http://www.dotnetcurry.com/ShowArticle.aspx?ID=129
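The splitting step described in the question can be sketched separately from the PDF extraction itself. This is a minimal sketch in Python (the logic ports directly to C#), assuming the text has already been extracted from the PDF and that questions are separated by a marker. The marker `###` and the function name `split_questions` are hypothetical, since the question does not say what the real marker is.

```python
def split_questions(text, marker="###"):
    """Split already-extracted PDF text into individual questions.

    `marker` is a hypothetical separator; replace it with the
    special sign actually used in the uploaded documents.
    """
    # Split on the marker, trim whitespace, and drop empty fragments
    # (for example the one before a leading marker).
    parts = (part.strip() for part in text.split(marker))
    return [part for part in parts if part]
```

Each returned string can then be stored as its own row. Images would need to be extracted and associated with their question by a PDF library, which this sketch does not cover.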

Form Recognizer Tool OCR issues

I have been exploring Azure Form Recognizer for one of my projects, where we want to perform OCR on handwritten text.
The problem is that when we give scanned images to the tool, it sometimes doesn't recognize the text written on them at all, even when it is clearly written. I tried multiple types of images, applying enhancement to them and using both black-and-white and color copies, but it doesn't work.
Sometimes it recognizes the values of two fields as one, which leads to incorrect data: one field is completely blank and the other contains the first field's value along with its own.
When there is NO VALUE in a tagged field in the testing data, it tries to read the value from some other place that is not even close to that field, or sometimes from an untagged area.
Could you please help with these queries?
Thanks in advance.
Can you please also share sample forms? Please make sure the data is anonymized and contains no real data.
Please contact customer service to debug this issue.
Thanks,
Neta - MSFT

How to enter mathematics equations in asp.net and save equation to SQL Server for my online test

I want to take math equations from a user interface element such as a textbox and save them to SQL Server.
Is there any way to do this? Please advise. One approach I found is creating images, but that is not feasible for my project, which needs a lot of equations, and neither the users nor I have much experience creating images.
Please help with something like binding a virtual keyboard to the textbox, or any other possible approach.
NMaheshGoud
I'd take a look at MathML if I were you. It would require a custom control to allow the user to enter the data. Unfortunately the .NET editor appears to no longer be available, but you could resort to a Flash-based one such as fmath Editor.
Why can't you save it in an nvarchar field?
Assuming you are allowing text input, why not save it in that base form?
I've published a JavaScript library that you can use to create a virtual math keyboard. It is intended for use together with any LaTeX typesetting library (for example MathJax or KaTeX).
GitHub repository: https://github.com/MathKeyboardEngine/MathKeyboardEngine
Live examples: https://mathkeyboardengine.github.io
I see that you want to store the user's input in a database. Call getViewModeLatex to get the user's input as a LaTeX string (for example \frac{x}{1-x}) that you can store.
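Storing the resulting LaTeX string is then ordinary text storage. Here is a minimal sketch in Python, with an in-memory SQLite database standing in for SQL Server (where the column would be an nvarchar field, as suggested above). The table name `equations`, the column name `latex`, and both helper functions are hypothetical.

```python
import sqlite3


def save_equation(conn, latex):
    """Insert a LaTeX string and return the new row id."""
    # A parameterized query keeps backslashes and braces intact
    # and avoids SQL injection.
    cur = conn.execute("INSERT INTO equations (latex) VALUES (?)", (latex,))
    conn.commit()
    return cur.lastrowid


def load_equation(conn, equation_id):
    """Return the stored LaTeX string, or None if the id is unknown."""
    row = conn.execute(
        "SELECT latex FROM equations WHERE id = ?", (equation_id,)
    ).fetchone()
    return row[0] if row else None


# SQLite in memory is an assumption made so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE equations (id INTEGER PRIMARY KEY, latex TEXT NOT NULL)"
)
eq_id = save_equation(conn, r"\frac{x}{1-x}")
```

When the test page is displayed, the stored string can be handed back to the typesetting library unchanged.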

RDLC file generating PDF and send via email

I'm generating PDF files from an RDLC report programmatically, without a viewer (ASP.NET 2.0, C#).
I would like to find a way to send the PDF directly via email without downloading the file first. Thanks for any help.
I would suggest these two approaches; see which one fits best for you:
http://weblogs.asp.net/rajbk/archive/2006/03/02/How-to-render-client-report-definition-files-_28002E00_rdlc_2900_-directly-to-the-Response-stream-without-preview.aspx
http://www.codeproject.com/KB/reporting-services/PDFUsingSQLRepServices.aspx
I think Jonathon has posted exactly what you need in this answer to my (similar) question a few days ago:
Distributing RDLC output as an email attachment
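The general idea, attaching the rendered PDF from memory instead of writing it to disk first, can be sketched as follows. This is a language-neutral illustration in Python rather than the ASP.NET code from the linked answers; it assumes the report has already been rendered to a byte array, and the function name, subject line, and default file name are hypothetical.

```python
from email.message import EmailMessage


def build_report_email(pdf_bytes, sender, recipient, filename="report.pdf"):
    """Build an email whose attachment comes straight from memory."""
    msg = EmailMessage()
    msg["Subject"] = "Report"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("The report is attached.")
    # The attachment is created from the in-memory bytes;
    # no temporary file is written.
    msg.add_attachment(
        pdf_bytes, maintype="application", subtype="pdf", filename=filename
    )
    return msg
```

The resulting message can then be passed to an SMTP client. The point of the design is that the rendered bytes never touch the file system.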

To read pdf in asp.net

I want to read a PDF file and store its details in my database.
But I have not been able to read the PDF file and store it in a SQL database in ASP.NET using C#. So please give me a solution if anyone knows one.
Many thanks in advance.
If you need to effectively parse the contents of the PDF file, you may want to use a PDF management library such as iTextSharp. Otherwise, you could just store the raw file contents using a binary field in your table.
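The second option, storing the raw file contents in a binary field, might look like the sketch below. It is written in Python with an in-memory SQLite database standing in for SQL Server (where the column type would be something like varbinary(max)). The table name `documents`, its columns, and the helper functions are hypothetical.

```python
import sqlite3


def store_pdf(conn, name, pdf_bytes):
    """Store the raw bytes of an uploaded PDF and return the row id."""
    cur = conn.execute(
        "INSERT INTO documents (name, content) VALUES (?, ?)",
        (name, pdf_bytes),
    )
    conn.commit()
    return cur.lastrowid


def fetch_pdf(conn, doc_id):
    """Return the stored bytes, or None if the id is unknown."""
    row = conn.execute(
        "SELECT content FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()
    return row[0] if row else None


# In-memory SQLite is an assumption made so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, content BLOB NOT NULL)"
)
doc_id = store_pdf(conn, "upload.pdf", b"%PDF-1.4 example bytes")
```

This keeps the file byte-for-byte intact, but the contents are not searchable unless the text is extracted and stored separately.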
Please see these articles; they may help you.
http://www.codeproject.com/KB/string/pdf2text.aspx
http://www.codeproject.com/showcase/TallComponents.asp
Alternatively, if you can purchase the Adobe toolkit for this, working with PDFs will be very easy.

In ASP.NET what is the best way to convert a PDF file to HTML?

What my users will do is select a PDF document on their machine, upload it to my website, where I will convert into an HTML document for display on the website. The document will be stored in a database after conversion.
What's the best way to convert a PDF to HTML?
I have been handed a requirement where a user would create a "news" story as a PDF and then upload it to the server, where it will be converted to HTML and displayed on the website.
Any document creation software that can save documents as PDF can save them as HTML. I'm assuming the issue is that your users will be creating rich documents (lots of embedded images), which results in multiple files, and your requirements stem from a desire to make uploading these documents as simple as possible to the user.
There are numerous conversion packages that can probably do this for you, however when you're talking about rich content, you are talking about text plus images. Those images have to be stored somewhere and served somehow, and whatever conversion method you use will require you to examine all image sources to make sure they point to valid locations on your server.
I would like to suggest an alternate way of doing this that you can take to your team: Implement one of the many blog APIs for publishing content. There are free and commercial software packages that use these APIs to publish content directly to a website, such as Windows Live Writer and Microsoft Word. Your users can simply create their content and upload it directly to your website without having to publish it as PDF first then upload it. So the process becomes much smoother for your users, and you get the posts in a form that doesn't require you spend thousands of dollars on developing or buying conversion code.
The two most common APIs are the MetaWeblog API and the Movable Type API. Both are very simple and easy to implement. I think this way would be a MUCH better alternative than what you're thinking about doing.
I don't think converting a PDF to an HTML string is necessarily the best idea, especially if you want to export it back as a PDF. PDF files often contain binary elements such as images, so you may be better off converting the file to ASCII via an encoding such as Base64. That way you will have an ASCII string that you can save into a text field in the DB and later decode back to the original file. Could you expand more on the main requirement?
My recommendation would be to not do it that way IF POSSIBLE (but we all know what managers are like) so...
I would recommend that you stay away from converting the PDF to/from HTML (because unless you can find a commercial solution it will be nigh on impossible) and instead do as has already been mentioned and store it as an encoded Base64 string, or BLOB or some other binary format in the database, and then display it to the user with some sort of PDF view plugin for the browser.
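The Base64 round trip mentioned in the answers above is small enough to sketch. This is a minimal Python illustration; the function names are hypothetical, and the same encode and decode steps are available in .NET.

```python
import base64


def encode_pdf(pdf_bytes):
    """Encode raw PDF bytes as an ASCII-only Base64 string."""
    return base64.b64encode(pdf_bytes).decode("ascii")


def decode_pdf(encoded):
    """Recover the original PDF bytes from the Base64 string."""
    return base64.b64decode(encoded)
```

Note that Base64 output is about one third larger than the input, so a true binary column is the more compact choice when the database offers one.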
All it took was a simple google search for "PDF to HTML": http://www.gnostice.com/pdf2manyOverview_x.asp. I'm sure there are others.
So while it's 'possible', you may want to explain to your manager that this isn't the best content management solution.
Why not use iTextSharp to read the PDF content? Then you could save both the binary PDF and the text content to the database. You could then let users search the content and download the PDF.
You should look into DynamicPDF. They have a converter (currently Beta) out for serving exactly this purpose. We have used their products with great success (especially for dumping Reporting Services reports directly to PDF).
Ref: http://www.dynamicpdf.com/
