Problem using Interop Range.Find when range contains table - asp.net

I'm trying to write a word add-in (with C#) that searches a document for all occurrences of certain pieces of text and makes some changes to the sections of text it finds.
I've created a loop that uses Range.Find to get all of the ranges in the document that contain a piece of text and the use the range objects it returns to do the manipulation later. A problem comes up when there is a table in the document, though.
In my first attempt at this, I just kept creating a new range, from the end of my last found occurrence to the end of the document and then searching again in that new range until it returns no found values. When I did this with a document containing a table, it just got stuck inside the table and created an infinite loop.
Then, I found this article: http://www.codeproject.com/KB/office/wordaddinpart1.aspx, and when using the Find function the article describes, it successfully continues on through a table, but unfortunately doesn't successfully grab all of the values within that table, which I need it to do.
Does anyone have any advice about getting around this problem? I've seen a couple people talk about having this problem, but no solutions.

I would suggest using the OpenXml SDK for this. The Office interop is a relic. Here's an article that explains how to use the OpenXml SDK to search a Word document:
http://msdn.microsoft.com/en-us/library/bb508261.aspx
Here's an SO question that discusses how to replace an image in a Word document using the OpenXml SDK:
Replace image in word doc using OpenXML

Related

Add Footnotes with a Word Extension

I want to create an extension that runs on Mac and Windows. The extension should insert footnotes at the position of the cursor. As far as I know, Office.js doesn't have a suitable interface for this. I found this in the .NET API Browser (https://learn.microsoft.com/de-de/dotnet/api/microsoft.office.interop.word.footnote?view=word-pia):
public Microsoft.Office.Interop.Word.Footnote Add (Microsoft.Office.Interop.Word.Range Range, ref object Reference, ref object Text);
But the Net Framework cannot be used on Mac. Do you have another solution to my problem?
Office.js can be used to insert footnotes into a Word document. Currently, there is no Footnotes API that would make this easy, but we can still use the Ooxml APIs to insert footnotes into the document.
Here is an example gist that uses Ooxml to replace the current selection in the document with a sample text and sample footnote. I tried my best to separate out the relevant parts of the Ooxml in case you want to modify the code to incorporate multiple texts and footnotes:
Insert footnotes using Ooxml
You can use ScriptLab to import the above gist and play around with the sample.
An Office Add-in solution using Office.js can be used on Mac, along with on Windows, and on Web.

MS Word template with loops, tables and charts

For our SaaS (LAMP) product reporting we are currently using JasperReports. We find it too cumbersome to develop reports with and the output in Word unworkable. Moreover, a couple of customers request to be able to develop simple reports themselves (to be used as mail merge). We would therefore like to develop templates right in Word. The idea is to have an application/webservice that would receive the Word template and JSON data from the LAMP application and return the filled-in report. The report has to support:
Loops inside content (repeating a document section several times while filling in array data)
Filling in tables (populating rows from array)
Filling in chart data in pre-created charts (from array)
This is the functionality we are using in JasperReports right now. Are there existing solutions to this? I've found quite a lot that can substitute simple variables, but no info about the the above three points. Will it be a lot of effort to write one from scratch? I would prefer a Windows OpenXML-based solution rather than a Linux PHPOffice-based one as I presume the former would handle the text split by spell-checker and language tags (though I'm not sure).
Windward and Docmosis are both commercial products that support the features you've listed and they are intended to be added to your application to provide reporting capabilities. Neither is are not OpenXML based. They can use Word documents as templates and perform the data merge into different output formats. Please note I work for Docmosis.
Aspose Words is another tool and it can populate a template but most of the power is through code rather than controls/directives in the template. Given your OpenXML thoughts, perhaps this is more what you are looking for.
More tools are recommended here in StackExchange.
I hope that helps.
ReportBox is a Web based reporting solution that can be used by any software application to generate documents and reports in Microsoft Word/ Excel/ PowerPoint/ HTML(DocX/Xlsx/PPTx/HTML) using OpenXML.
The process starts by building a Microsoft Word/ Excel/ PowerPoint/ HTML document as a template and uploading to ReportBox portal. Your application either sends data to ReportBox or ReportBox can pull data from your application database, which is then merged with the template to produce the finished report. Please note that I work for GreenThoughts.

After generating Word Document, the word did not die and keep using my memories

I have a little problem on my projects.
My project is a mvc, do the following jobs.
Using chrome to input data on forms, which then post to my controller
to generate word document using dynamic datas input by replacing the word
document templates.
It finished this part perfectly by replacing the bookmarks,however, the
only thing that border me is that after generating the word document,even
i did not open the word that i generated,there is a word running in the background, and it keep starting a new one after generating new document.
then after multiple generation, the computer becomes slow and i need to go to task manager to kill it one by one in order to release my memories again.
What are the problems? and how do i solve it?
Forgive my stupidity
Just simply add :
winApp.Quit()
and everything is done.
You could try this:
Dim pProcess() As Process = System.Diagnostics.Process.GetProcessesByName("Word")
For Each p As Process In pProcess
p.Kill()
Next

count words in word file that was uploaded

is there a way so i can count the words in a word file (all versions) in classic asp or asp.net?
what i need is to know how many words and if possible to make an array of word length and how many from each so words of 1,2,3 letters will get less attention from the code later.
i was thinking of using FSO or something like that but that won't work for docx
i can upload the file with aspupload or any other object if needed. if there is an object that can be bought that will upload and count words i don't have a problem purchasing it
thanks in advance
You have several options -
If you can have office installed on the server and don't require this to be an fast solution, you can try Word Interop. See Word count using Microsoft.Office.Interop.Word. A similar option is to have OpenOffice installed and work with that, never did that myself.
You can use the IFilter interface (http://msdn.microsoft.com/en-us/library/ms691105(v=vs.85).aspx). Microsoft already implemented logic to take Word files and give you access to the inner text, so all you'll have to do is count the words. Look at the first answer here Are IFilters necessary to index full text documents using Lucene.NET and the link it provides or How to extract text from MS office documents in C#. You can also look at http://blogs.msdn.com/b/jasonz/archive/2009/08/31/sample-parsing-content-in-c-using-ifilter.aspx
You can use 3rd party tools, I know there are some out there, but I'm not really familiar with any of them. For example see http://www.aspose.com/.net/word-component.aspx
If you don't really need support for ALL word versions, then there are various ways to work with Word 2007+ files - for example - the official openXML or the open source docx
Option (2) seems like the way to go to me.

How to feed Word 2010 (.docx) documents/templates with data from MySQL database?

What would be the best approach to replace placeholders in a .docx document (Word 2010) with data coming from a MySQL database?
Can I just open the file using a server side language and do a string replace per each placeholder?
Is there any existing tool/library available?
Thanks
Disclosure: I work for Invantive.
Using Invantive Composition (http://www.invantive.com/products/invantive-composition) you can fill Word documents (letters, legal pleadings, insurancy policies) with data from a database (IBM DB2, Oracle, MySQL, Teradata and SQL Server) and then fully change the contents at will manually. It is intended for real Microsoft Word end-users (both the guys that make the template and the ones that use it) that access the databases through a central webservice and models with queries. Invantive Composition allows nested repeating groups of data and lay-out. Integrates into Microsoft Word using click once.
In the past, I personally have also been using JasperReports (http://community.jaspersoft.com/project/jasperreports-library) to generate letters using the RTF output target of JasperReports. It is free and works fine as long as you do not want to edit the output more than a few words and have Java/SQL development skills. Just as Invantive Composition it works fine for large numbers of different reports.
As long as you can control the environment completely, you can also consider using RTF as intermediate language (not for end-users, only real developers). Save document as RTF, replace parts of the text you need to be replacable, write a webservice that accepts the parameter and dumps back the resulting RTF. Takes some time to generate more complex tables (tables are obviously something invented by the human race after the RTF specification was written :-) This approach only works with very limited number of templates and when you have sufficient developer time available to get it up and running and stabilized.
As an independent reviewer, I have also seen cases where XML templates were used, but the results were not as good as with JasperReports.
**Disclosure: I lead the docx4j project **
There are heaps of existing tools/libraries available!
Yes, you can just do a string replace, but that is a brittle approach, since Word may have split the string across runs.
You can use MERGEFIELDs, or content control data binding.
docx4j supports all three approaches, but content control data binding is the most powerful.
ContentControlsMergeXML
MERGEFIELDs
VariableReplace
One thing to consider especially is "repeats". If you want say a row of a table in Word, for each matching row in your MySQL table, then you need a way to make this happen.
docx4j does this with a "repeat" content control around the table row; whichever solution you choose, I'd make sure up front that it can handle repeats.
If you want to use PHP the most complete available solution is PHPDocX.
You may check in the tutorial how to substitute placeholder variables by data coming from any data source (like a MySQL DB).
In particular, you may populate table rows with an indefinite number of entries and you may delete whole blocks of the Word document depending on the data fed to the application or build dynamical Word charts.
You may check the available DEMO for a simple but quite illustrative example (its inner workings are explained in the tutorial section).
You can use open Open XML SDK and replace your placeholders like this.
Disclosure: I lead the docxgenjs project
I think you shouldn't have to code everything by yourself, that's why I created a Mustache-like templating engine for docx
Demo:
http://javascript-ninja.fr/docxgenjs/examples/demo.html
Repo
https://github.com/edi9999/docxgenjs
It is JS-based and works client and server side.
Yes, you can use server side language to do it.
Check on apache POI.
http://poi.apache.org
Hello I read the above esp the comments and Ivantive looks impressive - but the solution I needed was much simpler. Use Selection.Range.InsertDatabase in Word to fetch records from an access database or excel spreadsheet or even just another word document. With the access solution you can choose the layout of the records to fetch and have it fetch just particular recordds based on a field (eg ID). Google the words above and it'll take you to MS guidance and an example VB script. Worked well in just a few mins. Now looking for VB script that asks the person what ID they want from the dbase and we're done.
it uses docx templates that have merge fields with java objects (the objects have the information you load from mysql or any other source). The xdoc report is an project for java language, the home page of the project is https://code.google.com/p/xdocreport/.
*Disclosure: I create the templ4docx project *
Hello
You can use templ4docx java library, which is on maven central repository, so you can just add it to your maven dependencies:
<dependency>
<groupId>pl.jsolve</groupId>
<artifactId>templ4docx</artifactId>
<version>2.0.0</version>
</dependency>
Example usage:
Docx docx = new Docx("E:\\template.docx");
Variables variables = new Variables();
variables.addTextVariable(new TextVariable("${firstName}", "John"));
variables.addTextVariable(new TextVariable("${lastName}", "Sky"));
docx.fillTemplate(variables);
docx.save("E:\\filledTemplate.docx");
More details you can find here: http://jsolve.github.io/java/templ4docx/

Resources