(view help text of Word2Txt.cs as plain text)
Word2Txt,  Version 1.05
Extract plain text from a Word document and send it to the screen

Usage:   Word2Txt  "wordfile"  [ encoding | /D ]

or:      Word2Txt  /E

Where:   wordfile   is the path of the Word document to be read
                    (no wildcards allowed)
         encoding   force use of alternative encoding for plain text,
                    e.g. UTF-8 to preserve accented characters or
                    IBM437 to convert Unicode quotes to ASCII
         /D         use the encoding specified in the document file
                    (for .DOCX and .ODT only, if Word isn't available)
         /E         list all available encodings

Notes:   If a "regular" (MSI based) Microsoft Word (2007 or later)
         installation is detected, this program will use Word to read
         the text from the Word file, which may be in ANY file format
         recognized by Word.
         If Word was already active when this program is started, any
         other opened document(s) will be left alone, and only the
         document opened by this program will be closed.
         If Word is not available, or if it encounters unreadable
         content (i.e. the file is corrupted), the text can still be
         extracted, but only from .DOC, .DOCX, .ODT, .RTF and .WPD files.
         If the specified encoding does not match any available encoding
         name, the program will try again, ignoring dashes; if that does
         not provide a match, the program will try matching the specified
         encoding with the available encodings' codepages.
         This program requires .NET 4.5.
         Return code ("errorlevel") 0 means Word encountered no errors
         and some text was extracted from the file; 1 means Word is not
         available or the file was corrupted; 2 means either command
         line errors or the program failed to extract any text.

Written by Rob van der Woude
https://www.robvanderwoude.com
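
The three-step encoding lookup described in the notes (exact name, then name ignoring dashes, then codepage) can be sketched roughly as follows. This is a Python illustration only, not the program's actual C# source: the `AVAILABLE` table is a small hypothetical subset of the encodings that `/E` would list, the function name `resolve_encoding` is made up for this example, and case-insensitive comparison is an assumption.

```python
# Hypothetical subset of encodings (name -> codepage); the real program
# gets its list from .NET, and /E shows the full set.
AVAILABLE = {
    "utf-8": 65001,
    "utf-16": 1200,
    "IBM437": 437,
    "windows-1252": 1252,
    "us-ascii": 20127,
}


def resolve_encoding(requested, available=AVAILABLE):
    """Return the name of the matching encoding, or None if nothing matches."""
    wanted = requested.strip().lower()

    # Step 1: match on the encoding name (case-insensitivity is assumed).
    for name in available:
        if name.lower() == wanted:
            return name

    # Step 2: try again, ignoring dashes on both sides.
    wanted_nodash = wanted.replace("-", "")
    for name in available:
        if name.lower().replace("-", "") == wanted_nodash:
            return name

    # Step 3: compare the requested value with each encoding's codepage.
    for name, codepage in available.items():
        if str(codepage) == wanted:
            return name

    return None


if __name__ == "__main__":
    for arg in ("UTF-8", "utf8", "437", "nonsense"):
        print(arg, "->", resolve_encoding(arg))
```

With this table, `utf8` falls through to the dash-insensitive step, and `437` only matches in the codepage step.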
page last uploaded: 2022-10-05