How to create uncleaned RTF without Trados
Thread poster: John Fossey
John Fossey  Identity Verified
Canada
Local time: 02:22
Member (2008)
French to English
Jan 1, 2009

I recently completed a project for a Trados agency. I don't have Trados, and don't intend to get it. I was given the RTF source files and a large TMX to maintain consistency with the client's previous documents. Usually I work in Wordfast (unlicensed), but since this TMX far exceeded the unlicensed version's 500-TU limit, I did the translation in OmegaT, which worked without a hitch. Now the agency has asked for the "uncleaned" RTF file for proofreading in Trados, but OmegaT only produces cleaned target files. So I have the original RTF, the cleaned RTF, and a variety of TMXs. Does anyone know how I can produce an uncleaned RTF? There used to be a program, RTFStyler by MaxPrograms (Swordfish), that claimed to be able to do this, but it seems to have been discontinued. Many thanks for any help!


 
Anna Villegas
Mexico
Local time: 01:22
English to Spanish
This will help Jan 1, 2009

http://en.wikipedia.org/wiki/OmegaT

Luck,
Tadzio.


 
Rodolfo Raya  Identity Verified
Local time: 04:22
English to Spanish
RTFStyler moved to Swordfish Jan 1, 2009

John Fossey wrote:
There used to be a program RTFStyler by MaxPrograms (Swordfish) that claimed to be able to do this, but it seems to have been discontinued. Many thanks for any help!


The functionality of RTFStyler has been incorporated in Swordfish.

Convert your source RTF to XLIFF using Swordfish selecting "Tagged RTF" as source format. Use your TMX file to recover as much as possible from your translations and then convert the translated XLIFF to RTF again. Swordfish will add the missing Trados markup and you will have an "uncleaned" RTF that your client can review in Trados.

Regards,
Rodolfo


 
Samuel Murray  Identity Verified
Netherlands
Local time: 08:22
Member (2006)
English to Afrikaans
I'll try... Jan 2, 2009

John Fossey wrote:
Usually I work in Wordfast (unlicensed) but since this large TMX was too large for the max. 500 TU limitation...


Great, so at least you know what an uncleaned file looks like.

...I did the translation in OmegaT, which worked without a hitch. Now the agency has asked for the "uncleaned" RTF file for proofreading in Trados, but OmegaT only produces cleaned target files.


I think I was able to figure out a way to help. I don't have time to test it, though.

==

Very brief instructions to create uncleaned files after you've translated a file using OmegaT.

Open the project_save.tmx file in a text editor and do a regex find/replace to change this:

<tu>
<tuv lang="EN-US">
<seg>Your text here.</seg>
</tuv>
<tuv lang="AF">
<seg>Jou teks hier.</seg>
</tuv>
</tu>

into this:

<tu>
<tuv lang="EN-US">
<seg>Your text here.</seg>
</tuv>
<tuv lang="AF">
<seg>{0&gt;Your text here.&lt;}0{&gt;Jou teks hier.&lt;0}</seg>
</tuv>
</tu>

So basically, the target text should contain:

1. This: {0&gt;
2. The source text
3. This: &lt;}0{&gt;
4. The target text, and
5. This: &lt;0}

If you're using MS Word as your text editor, the following find/replace (with wildcards enabled) should do the trick:

Find what:
(\<tuv lang=\"EN-US\"\>)(*)(\<seg\>)(*)(\<\/seg)(*)(\<seg\>)(*)(\<\/seg)
Replace with:
\1\2\3\4\5\6\7{0&gt;\4&lt;}0{&gt;\8&lt;0}\9

When using MS Word, remember to enable "Confirm conversion at open" and to use "File -> Open" to open the file, and open it as an Encoded text file, with the encoding Unicode (UTF-8).

Please let me know the syntaxes for other tools (eg jEdit etc) so that we can post the hack on the OmegaT mailing list.

Then reload the project and create target documents. The resulting ODT file, converted to DOC, would possibly be accepted by Trados and Wordfast as an uncleaned file, even though it does not contain the correct styles.
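
For anyone who would rather run a script than a Word find/replace, the same substitution can be sketched in Python. This is only a minimal sketch: the function name uncleanify and the source_lang parameter are made up for illustration, and it assumes every TU lists the source tuv first and exactly one target tuv after it, as in the example above.

```python
import re


def uncleanify(tmx_text: str, source_lang: str = "EN-US") -> str:
    """Wrap each target segment in uncleaned-file segment delimiters.

    Assumes each <tu> holds the source <tuv> first, then one target <tuv>.
    """
    pattern = re.compile(
        r'(<tuv lang="' + re.escape(source_lang) + r'">\s*<seg>)'  # source <seg> opening
        r"(.*?)"                                                    # source text
        r"(</seg>.*?<seg>)"                                         # everything up to target <seg>
        r"(.*?)"                                                    # target text
        r"(</seg>)",                                                # target </seg>
        re.DOTALL,
    )

    def wrap(match: re.Match) -> str:
        source, target = match.group(2), match.group(4)
        # &gt; and &lt; are the XML-escaped forms of > and < inside a TMX <seg>.
        marked = "{0&gt;" + source + "&lt;}0{&gt;" + target + "&lt;0}"
        return match.group(1) + source + match.group(3) + marked + match.group(5)

    return pattern.sub(wrap, tmx_text)
```

To use it, read project_save.tmx as UTF-8 text, pass the contents to uncleanify, and write the result back (after making a backup copy of the original file).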

Ideally, the "uncleaned" tags should be made purple, and not just any purple, but in the style tw4winMark. Doing that is beyond the scope of this post, but basically, you should create a style called tw4winMark and make it purple, and then use the "More" option in find/replace to add a style to the replace box. I'm not sure if a third-party tool can be used to do that. If you have a file translated in WF, you can try to copy one of the purple thingies into your document and then hopefully the tw4winMark style would come with it, so that you can use it in find/replace.


 
Samuel Murray  Identity Verified
Netherlands
Local time: 08:22
Member (2006)
English to Afrikaans
Allow me to fix the smilies... Jan 2, 2009

Samuel Murray wrote:
Find what:
(\<tuv lang=\"EN-US\"\>)(*)(\<seg\>)(*)(\<\/seg)(*)(\<seg\>)(*)(\<\/seg)
Replace with:
\1\2\3\4\5\6\7{0&gt;\4&lt;}0{&gt;\8&lt;0}\9


Perhaps in ten years' time, the ProZ.com forum software will work. Until then, here are the two strings again, exactly as they should be typed:

Find what:
(\<tuv lang=\"EN-US\"\>)(*)(\<seg\>)(*)(\<\/seg)(*)(\<seg\>)(*)(\<\/seg)
Replace with:
\1\2\3\4\5\6\7{0&gt;\4&lt;}0{&gt;\8&lt;0}\9


 
Vito Smolej
Germany
Local time: 08:22
Member (2004)
English to Slovenian
SITE LOCALIZER
If you send me the material Jan 2, 2009

Many thanks for any help!

... I'll patch it up with Trados. Use it to mop up NOPs on my machine instead of Seti (g).

Regards

Vito


 
John Fossey  Identity Verified
Canada
Local time: 02:22
Member (2008)
French to English
TOPIC STARTER
Thanks for all the suggestions. Jan 2, 2009

What I did, which seems to have worked (crossing fingers), is that I found that I had (fortunately) saved TMs of sections of the job, which were less than 500 TUs each - it was the client's master TM that was way too big. So I was able to import the source file into Wordfast, point it to the TMs, and run TranslateUntilNoMatch, which produced an uncleaned RTF. So far the client has accepted it.

But I would like to try the OmegaT hack suggested, because it seems to me it would be beneficial to be able to produce Trados-acceptable work with open source software. Maybe a simple macro could be recorded to do it.

Many thanks for all the suggestions, which gave me a much better understanding of the whole situation.


 
FarkasAndras  Identity Verified
Local time: 08:22
English to Hungarian
500 TUs Jan 2, 2009

If you run into similar issues in the future, just chop up your TMs.
TMX is just a text file, made up of a header followed by the TUs in sequence.
Keep the header, delete any surplus TUs to get the file under the 500-TU limit, add the closing tags (</body> and </tmx>) at the end, and away you go.
Make as many small TMs out of a big one as you need.
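
If doing this by hand gets tedious, the chopping can be sketched in a few lines of Python. This is just an illustration under assumptions: split_tmx and max_tus are names invented here, and the file is assumed to contain a single <body> element with plain <tu> ... </tu> entries inside it.

```python
import re


def split_tmx(tmx_text: str, max_tus: int = 500) -> list[str]:
    """Split one TMX document into several, each holding at most max_tus TUs."""
    body_start = tmx_text.index("<body>") + len("<body>")
    body_end = tmx_text.rindex("</body>")
    header = tmx_text[:body_start]   # everything up to and including <body>
    footer = tmx_text[body_end:]     # </body>, </tmx> and any trailing whitespace
    # <tu\b matches <tu> and <tu ...> but not <tuv ...>.
    tus = re.findall(r"<tu\b.*?</tu>", tmx_text[body_start:body_end], re.DOTALL)
    return [
        header + "\n" + "\n".join(tus[i:i + max_tus]) + "\n" + footer
        for i in range(0, len(tus), max_tus)
    ]
```

Each string in the returned list can then be saved as its own .tmx file (UTF-8) and loaded separately.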


 
Samuel Murray  Identity Verified
Netherlands
Local time: 08:22
Member (2006)
English to Afrikaans
Editing is disabled after 24 hours... Jan 3, 2009

...so I can't fix things unless I repost.

Find what:
(\<tuv lang=\"EN-US\"\>)(*)(\<seg\>)(*)(\<\/seg)(*)(\<seg\>)(*)(\<\/seg)
Replace with:
\1\2\3\4\5\6\7{0&gt;\4&lt;}0{&gt;\8&lt;0}\9


 
Samuel Murray  Identity Verified
Netherlands
Local time: 08:22
Member (2006)
English to Afrikaans
Try this script I wrote Jan 4, 2009

John Fossey wrote:
Now the agency has asked for the "uncleaned" RTF file for proofreading in Trados, but OmegaT only produces cleaned target files.


I've written my solution from a few posts previously into an AutoIt script (with accompanying EXE file if you don't have AutoIt installed). Download "UncleanifyTMX" here: http://leuce.com/tempfile/omtautoit/. Let me know if it works for you.


 
Anthony Baldwin  Identity Verified
United States
Local time: 02:22
Portuguese to English
anaphraseus Jul 1, 2009

I primarily use OmegaT for my work, too, but when clients require uncleaned RTF or DOC files, I now use Anaphraseus ( http://anaphraseus.sourceforge.net ).
Anaphraseus works similarly to older versions of Wordfast®, as I understand it (I've never used WF), but as an extension to OpenOffice ( http://www.openoffice.org ), not MSOffice®.
Sometimes I still translate files in OmegaT (large projects, various reference TM files, etc.) and then simply use Anaphraseus to "convert" the target files to uncleaned ones by importing the project's TMX file into Anaphraseus; other times I simply use Anaphraseus to translate the file.
Here's a manual for the latest release of Anaphraseus, with screenshots, etc.:
http://www.linguasos.org/bsoft/AnaphraseusManual_1.23b.html

bonne chance

[Edited at 2009-07-01 13:31 GMT]


 
esperantisto  Identity Verified
Local time: 10:22
Member (2006)
English to Russian
SITE LOCALIZER
Anaphraseus works exactly as Wordfast Classic does Jul 1, 2009

The reference to older versions relates to its functionality, not to the workflow. Files produced by Anaphraseus can be cleaned up with Wordfast without any problem. Beware that complex formatting may be lost, however.

Or you can use Wordfast without a license: it should produce an uncleaned document from an existing TM (an unlicensed copy won't update a TM with new TUs, but all other features will work).

[Edited at 2009-07-01 14:59 GMT]


 

