RTF file normalization

Normalizing document files makes it possible to use tools that know nothing about a document's structure. Popular examples are tools that compare files line by line (e.g. diff) and the revision control systems built on them (e.g. RCS, SCCS).

Document formats such as HTML and RTF leave the placement of line breaks in the file representation open. Document processors for these formats sometimes do not normalize their output, so the same content can lead to different source files, which is a hard case for tools like diff. Some document processors do not even produce stable files: saving the same document version several times may write a different file each time.

rtfn

rtfn is a program that normalizes RTF (Rich Text Format) files by setting their line breaks in a uniform way.

Line breaks are set so that RTF keywords (\par, \footnote, etc.) stand on lines of their own, which isolates formatting changes. Text lines are wrapped at columns 72 to 75, which makes diff output more readable. Control symbols (\~, \_, \-, etc.) stay embedded in the text and are never split by a wrap. For more details, see the source code.
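
The keyword isolation described above can be illustrated with a minimal sketch. It is not taken from rtfn.c: the function name isolate_control_words is hypothetical, the sketch assumes that a keyword is a backslash followed by letters with an optional numeric parameter and an optional delimiter space, and it leaves out the column wrapping and any handling of binary (\bin) data.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch, not taken from rtfn.c.
 * Returns a newly allocated copy of the RTF text in which every keyword
 * (backslash + letters, e.g. \par) sits on a line of its own, while
 * control symbols (backslash + non-letter, e.g. \~) stay inside the text.
 * Unescaped CR/LF characters of the input are dropped. No column wrapping
 * is done and \bin data is not handled.
 * The caller frees the result; NULL is returned if allocation fails. */
char *isolate_control_words(const char *in)
{
    size_t len = strlen(in);
    /* Each keyword gains at most two newlines, so 2 * len is enough. */
    char *out = malloc(2 * len + 2);
    size_t i = 0, n = 0;

    if (out == NULL)
        return NULL;

    while (in[i] != '\0') {
        if (in[i] == '\\' && isalpha((unsigned char)in[i + 1])) {
            if (n > 0 && out[n - 1] != '\n')
                out[n++] = '\n';        /* start the keyword on a fresh line */
            out[n++] = in[i++];         /* the backslash */
            while (isalpha((unsigned char)in[i]))
                out[n++] = in[i++];     /* keyword name */
            if (in[i] == '-' && isdigit((unsigned char)in[i + 1]))
                out[n++] = in[i++];     /* sign of a numeric parameter */
            while (isdigit((unsigned char)in[i]))
                out[n++] = in[i++];     /* numeric parameter */
            if (in[i] == ' ')
                out[n++] = in[i++];     /* delimiter space belongs to the keyword */
            out[n++] = '\n';            /* end the keyword line */
        } else if (in[i] == '\\' && in[i + 1] != '\0') {
            out[n++] = in[i++];         /* control symbol: keep both */
            out[n++] = in[i++];         /* characters together in the text */
        } else if (in[i] == '\n' || in[i] == '\r') {
            i++;                        /* unescaped line breaks carry no meaning */
        } else {
            out[n++] = in[i++];         /* ordinary text or a group brace */
        }
    }
    out[n] = '\0';
    return out;
}
```

With this sketch, the input {\rtf1 Hello\par World\~x} comes out as five lines: {, \rtf1, Hello, \par, and World\~x}. The keyword lines keep their trailing delimiter space so that the meaning of the RTF does not change.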

Usage: rtfn [files]
Each output file is named like its input file, with the last character of the file name replaced by 'n' (or by 'm' if the name already ends in 'n').
The usage message can be displayed by: rtfn -?
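
The output naming rule can be sketched as follows. This is an illustration and not the code from rtfn.c: the helper name normalized_name, its signature, and the case-sensitive comparison with 'n' are assumptions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from rtfn.c: derive the output file name
 * by replacing the last character with 'n', or with 'm' if the name
 * already ends in 'n' (compared case-sensitively, which is an assumption).
 * Returns 0 on success, -1 if the name is empty or does not fit in out. */
int normalized_name(const char *in, char *out, size_t out_size)
{
    size_t len = strlen(in);

    if (len == 0 || len + 1 > out_size)
        return -1;

    memcpy(out, in, len + 1);   /* copy the name including its terminator */
    out[len - 1] = (in[len - 1] == 'n') ? 'm' : 'n';
    return 0;
}
```

Under these assumptions, report.rtf maps to report.rtn, and report.rtn maps to report.rtm, so a normalized file never overwrites its own input.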

Note: The program is free software; you can redistribute it and/or modify it. It is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

source: rtfn.c
binaries: rtfn-i386-win32.exe rtfn-i386-win16.exe

Keywords: RTF normalization, RTF document normalization, rich text format normalization, RTF versioning, RTF document versioning, RTF versions, RTF line breaks, RTF newlines, normalize, source code


lr / Mon Mar 9 1998