Copy from Word and keep formatting in Redactor II - redactor

How can I copy and paste tables from an MS Word document into the new Redactor II and keep (a bit of) the formatting?
In my case it's about hundreds of technical tables that I need to copy into the CMS (Craft CMS). When I copy and paste them, the layout is lost completely.
Dave

In the end I used a Word-to-HTML tool such as https://word2cleanhtml.com/

Related

Count words in an uploaded Word file

Is there a way to count the words in a Word file (all versions) in classic ASP or ASP.NET?
I need to know how many words there are and, if possible, build an array of word lengths with the number of words of each length, so that words of 1, 2, or 3 letters get less attention from the code later.
I was thinking of using FSO or something like that, but that won't work for docx.
I can upload the file with AspUpload or any other object if needed. If there is a component for sale that uploads and counts words, I don't have a problem purchasing it.
Thanks in advance.
You have several options:
1. If you can have Office installed on the server and don't need this to be a fast solution, you can try Word Interop. See "Word count using Microsoft.Office.Interop.Word". A similar option is to have OpenOffice installed and work with that; I have never done that myself.
2. You can use the IFilter interface (http://msdn.microsoft.com/en-us/library/ms691105(v=vs.85).aspx). Microsoft has already implemented the logic to take Word files and give you access to the inner text, so all you have to do is count the words. Look at the first answer to "Are IFilters necessary to index full text documents using Lucene.NET" and the link it provides, or at "How to extract text from MS office documents in C#". You can also look at http://blogs.msdn.com/b/jasonz/archive/2009/08/31/sample-parsing-content-in-c-using-ifilter.aspx
3. You can use third-party tools. I know there are some out there, but I'm not really familiar with any of them. For an example, see http://www.aspose.com/.net/word-component.aspx
4. If you don't really need support for ALL Word versions, there are various ways to work with Word 2007+ files, for example the official Open XML SDK or the open source DocX library.
Option (2) seems like the way to go to me.
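For the Word 2007+ (.docx) route mentioned above, here is a minimal sketch of the counting step. It is written in Python purely for illustration (the question targets ASP/ASP.NET). It assumes the upload is a valid .docx, which is a zip archive whose body text lives in word/document.xml. The function names are made up for this example. Headers, footnotes, text boxes and legacy .doc files are not handled.

```python
import re
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# WordprocessingML namespace used by word/document.xml in .docx files.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_paragraphs(source):
    """Yield the plain text of each body paragraph in a .docx file.

    `source` is a path or a file-like object, e.g. an uploaded file.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    for paragraph in root.iter(W_NS + "p"):
        # A paragraph's text is spread over one or more <w:t> elements.
        yield "".join(t.text or "" for t in paragraph.iter(W_NS + "t"))


def word_length_histogram(source):
    """Return (total_words, Counter mapping word length -> word count)."""
    lengths = Counter()
    for text in docx_paragraphs(source):
        for word in re.findall(r"\w+", text):
            lengths[len(word)] += 1
    return sum(lengths.values()), lengths
```

Calling word_length_histogram("upload.docx") (a hypothetical file name) returns the total and a per-length count, so short words of 1 to 3 letters can be filtered or down-weighted afterwards.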

Bulk upload of Microsoft Word files to WordPress pages

I have been asked to upload 200 Microsoft Word documents — many of them containing lengthy, complex math problems or scientific notation — into a WordPress setting. Each Word file would become a separate WordPress post.
I would clearly prefer not to cut and paste each file one by one into a post and then save it. Does anyone know of a way to automate the process while ensuring the accuracy of the translation, or at least minimizing the number of issues we might find when converting from Word to WordPress? Or am I dreaming the impossible dream?
Thanks for any input you can offer.
Sounds like an interesting problem. I have an idea that might be worth exploring. There are a number of free or shareware tools that can convert Word docs to HTML.
If you can manage to convert them into decently clean markup with one of those tools, I would recommend using the HTML Import 2 WordPress plugin. It can take a batch of HTML files and create Posts / Pages out of them.
It's a two step process, but I bet it'll work. (And certainly be faster than copy/paste 200 times).
Hope that helps, have fun!
I found a solution that works for me. It is a bit manual, but it still saves a lot of time.
Here are the steps:
1. Connect your blog to MS Word 2007/2013.
2. Make sure remote writing is enabled in WP.
3. Copy all posts into one Word document, or use merge to make one single DOC.
4. Set the default posting category in WP and save it.
5. From MS Word, copy each post and publish them one by one.
Tips:
Make a shortcut key for publishing.
Use Ctrl+C to copy the text before publishing.
Make a shortcut for publishing to WP.

How to feed Word 2010 (.docx) documents/templates with data from MySQL database?

What would be the best approach to replace placeholders in a .docx document (Word 2010) with data coming from a MySQL database?
Can I just open the file using a server-side language and do a string replace for each placeholder?
Is there any existing tool/library available?
Thanks
Disclosure: I work for Invantive.
Using Invantive Composition (http://www.invantive.com/products/invantive-composition) you can fill Word documents (letters, legal pleadings, insurance policies) with data from a database (IBM DB2, Oracle, MySQL, Teradata and SQL Server) and then freely change the contents by hand. It is intended for real Microsoft Word end users (both the people who make the template and the ones who use it), who access the databases through a central web service and models with queries. Invantive Composition allows nested repeating groups of data and layout. It integrates into Microsoft Word using ClickOnce.
In the past, I personally have also used JasperReports (http://community.jaspersoft.com/project/jasperreports-library) to generate letters using its RTF output target. It is free and works fine as long as you do not want to edit more than a few words of the output and you have Java/SQL development skills. Like Invantive Composition, it works fine for large numbers of different reports.
As long as you can control the environment completely, you can also consider using RTF as an intermediate language (not for end users, only for real developers). Save the document as RTF, mark the parts of the text that need to be replaceable, and write a web service that accepts the parameters and returns the resulting RTF. It takes some time to generate more complex tables (tables are obviously something invented by the human race after the RTF specification was written :-)). This approach only works with a very limited number of templates and when you have enough developer time available to get it up and running and stabilized.
As an independent reviewer, I have also seen cases where XML templates were used, but the results were not as good as with JasperReports.
Disclosure: I lead the docx4j project.
There are heaps of existing tools/libraries available!
Yes, you can just do a string replace, but that is a brittle approach, since Word may have split the string across runs.
You can use MERGEFIELDs, or content control data binding.
docx4j supports all three approaches, but content control data binding is the most powerful.
ContentControlsMergeXML
MERGEFIELDs
VariableReplace
One thing to consider especially is "repeats". If you want say a row of a table in Word, for each matching row in your MySQL table, then you need a way to make this happen.
docx4j does this with a "repeat" content control around the table row; whichever solution you choose, I'd make sure up front that it can handle repeats.
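To illustrate why a plain string replace is brittle, here is a small sketch in Python (this is not docx4j code). The paragraph XML is a made-up example of how Word might store the placeholder ${firstName} across three runs, and replace_in_paragraph is a hypothetical helper. It joins the text of all runs, substitutes, and writes the result back into the first text element, at the cost of losing any formatting differences between the merged runs.

```python
import xml.etree.ElementTree as ET

W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

# Hypothetical paragraph in which Word has split "${firstName}" over three runs.
PARAGRAPH_XML = (
    '<w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    "<w:r><w:t>Dear ${</w:t></w:r>"
    "<w:r><w:t>firstName</w:t></w:r>"
    "<w:r><w:t>},</w:t></w:r>"
    "</w:p>"
)


def replace_in_paragraph(paragraph, values):
    """Replace placeholders in one <w:p>, even when they span several runs."""
    texts = list(paragraph.iter(W_NS + "t"))
    if not texts:
        return
    # Join the run texts so that a split placeholder becomes contiguous.
    joined = "".join(t.text or "" for t in texts)
    for placeholder, value in values.items():
        joined = joined.replace(placeholder, value)
    # Keep the result in the first <w:t> and empty the remaining ones.
    texts[0].text = joined
    for t in texts[1:]:
        t.text = ""


# A naive replace on the raw XML finds nothing: the placeholder is not contiguous.
assert "${firstName}" not in PARAGRAPH_XML

paragraph = ET.fromstring(PARAGRAPH_XML)
replace_in_paragraph(paragraph, {"${firstName}": "John"})
print("".join(t.text for t in paragraph.iter(W_NS + "t")))  # prints: Dear John,
```

MERGEFIELDs and content controls avoid this problem because the field or control is a structural element of the document rather than literal text that Word is free to split.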
If you want to use PHP the most complete available solution is PHPDocX.
You may check in the tutorial how to substitute placeholder variables by data coming from any data source (like a MySQL DB).
In particular, you may populate table rows with an arbitrary number of entries, delete whole blocks of the Word document depending on the data fed to the application, or build dynamic Word charts.
You may check the available DEMO for a simple but quite illustrative example (its inner workings are explained in the tutorial section).
You can use the Open XML SDK and replace your placeholders like this.
Disclosure: I lead the docxgenjs project
I think you shouldn't have to code everything yourself, which is why I created a Mustache-like templating engine for docx.
Demo:
http://javascript-ninja.fr/docxgenjs/examples/demo.html
Repo:
https://github.com/edi9999/docxgenjs
It is JS-based and works client and server side.
Yes, you can use a server-side language to do it.
Check out Apache POI.
http://poi.apache.org
Hello. I read the above, especially the comments, and Invantive looks impressive, but the solution I needed was much simpler. Use Selection.Range.InsertDatabase in Word to fetch records from an Access database, an Excel spreadsheet, or even just another Word document. With the Access solution you can choose the layout of the records to fetch and have it fetch only particular records based on a field (e.g. ID). Google the words above and they will take you to the MS guidance and an example VB script. It worked well in just a few minutes. Now I am looking for a VB script that asks the person which ID they want from the database, and we're done.
XDocReport uses docx templates whose merge fields are filled from Java objects (the objects hold the information you load from MySQL or any other source). XDocReport is a Java project; its home page is https://code.google.com/p/xdocreport/.
Disclosure: I created the templ4docx project.
Hello
You can use the templ4docx Java library, which is available in the Maven Central repository, so you can just add it to your Maven dependencies:
<dependency>
    <groupId>pl.jsolve</groupId>
    <artifactId>templ4docx</artifactId>
    <version>2.0.0</version>
</dependency>
Example usage:
Docx docx = new Docx("E:\\template.docx");
Variables variables = new Variables();
variables.addTextVariable(new TextVariable("${firstName}", "John"));
variables.addTextVariable(new TextVariable("${lastName}", "Sky"));
docx.fillTemplate(variables);
docx.save("E:\\filledTemplate.docx");
You can find more details here: http://jsolve.github.io/java/templ4docx/

Use an Excel template and update it programmatically

I found this tool, but I wonder whether it is still the right way nowadays with .NET 4.0, or whether there are any straightforward out-of-the-box alternatives.
I just need to add columns and update Excel content programmatically. There are many ways to do that, but I need to keep the original document as a template. The link above explains exactly what the requirements are and why they created the "ExcelPackage" library.
A quick look at the link you provided seems like it will in fact keep the original template intact and just return a populated version of that template. This is a pretty common way to create and populate Excel documents using Open XML since it helps to minimize the amount of code you have to write. If you did not specify the layout, styles, formats, etc in a template you would be forced to define those when coding and that could lead to some bloated code. Overall, a project like this or using the Open XML SDK 2.0 to create the documents is the way to go.

Is there a way to convert DocX, OpenXml, or RTF to TextFlow in AS3?

Basically we want to be able to open a docx file in AS3 or Flex 4 and convert it to a text flow while preserving formatting, embedded images, tables, columns, etc. I know it's theoretically possible, as the new Text Layout Framework is powerful enough to pull it off, but I haven't been able to find any case where someone has achieved anything along these lines except for Adobe's Buzzword web app, which does just this. Ideally the solution would be for RTF documents, as conversions to RTF from anything are pretty familiar.
Buzzword was built before the Text Layout Framework existed, so I do not think it uses it. I was also under the impression, with no facts to back it up, that Buzzword did a server-side conversion of the document, not a client-side conversion.
I don't know of any AS3 projects that do this and would recommend taking a look at server side ways to access the data inside the word document. The Apache POI project is one option: http://poi.apache.org/ .
From there you'd have to create your own conversion from doc to something AS3 can handle.

Resources