How to create uncleaned RTF without Trados
Thread poster: John Fossey
John Fossey | Canada | Local time: 02:22 | Member (2008) | French to English + ...
I recently completed a project for a Trados agency. I don't have Trados, and don't intend to get it. I was given the RTF source files and a large TMX to maintain uniformity with the client's previous documents. Usually I work in Wordfast (unlicensed) but since this large TMX was too large for the max. 500 TU limitation, I did the translation in OmegaT, which worked without a hitch. Now the agency has asked for the "uncleaned" RTF file for proofreading in Trados, but OmegaT only produces cleaned target files. So I have the original RTF, cleaned RTF, and a variety of TMXs. Does anyone know how I can produce an uncleaned RTF? There used to be a program RTFStyler by MaxPrograms (Swordfish) that claimed to be able to do this, but it seems to have been discontinued. Many thanks for any help!
RTFStyler moved to Swordfish | Jan 1, 2009
John Fossey wrote:
There used to be a program RTFStyler by MaxPrograms (Swordfish) that claimed to be able to do this, but it seems to have been discontinued. Many thanks for any help!
The functionality of RTFStyler has been incorporated in Swordfish.
Convert your source RTF to XLIFF using Swordfish selecting "Tagged RTF" as source format. Use your TMX file to recover as much as possible from your translations and then convert the translated XLIFF to RTF again. Swordfish will add the missing Trados markup and you will have an "uncleaned" RTF that your client can review in Trados.
Regards,
Rodolfo
Samuel Murray | Netherlands | Local time: 08:22 | Member (2006) | English to Afrikaans + ...
John Fossey wrote:
Usually I work in Wordfast (unlicensed) but since this large TMX was too large for the max. 500 TU limitation...
Great, so at least you know what an uncleaned file looks like.
...I did the translation in OmegaT, which worked without a hitch. Now the agency has asked for the "uncleaned" RTF file for proofreading in Trados, but OmegaT only produces cleaned target files.
I think I was able to figure out a way to help. I don't have time to test it, though.
==
Very brief instructions to create uncleaned files after you've translated a file using OmegaT. Because ProZ.com's forum software mangles code, I've added spaces where there should not be any. Hopefully it will be clear which spaces need to be removed.
Open the project_save.tmx file in a text editor and do a regex find/replace to change this:
<tu>
<tuv lang="EN-US">
<seg>Your text here.</seg>
</tuv>
<tuv lang="AF">
<seg>Jou teks hier</seg>
</tuv>
</tu>
into this:
<tu>
<tuv lang="EN-US">
<seg>Your text here.</seg>
</tuv>
<tuv lang="AF">
<seg> { 0 & g t ; Your text here. & l t ; } 0 { & g t ; Jou teks hier. & l t ; 0 } </seg>
</tuv>
</tu>
So basically, the target text should contain:
1. This: { 0 & g t ;
2. The source text
3. This: & l t ; } 0 { & g t ;
4. The target text, and
5. This: & l t ; 0 }
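The five-part structure above can be sketched in Python, with the forum-added spaces removed. This is only an illustration: the function names are hypothetical, and the middle "0" is copied as-is from the example in this post.

```python
from xml.sax.saxutils import escape


def make_uncleaned_segment(source: str, target: str) -> str:
    """Wrap a source/target pair in the delimiters listed above.

    This is the plain-text form, as it would appear in the uncleaned document.
    """
    return "{0>" + source + "<}0{>" + target + "<0}"


def make_uncleaned_seg_for_tmx(source: str, target: str) -> str:
    """Same structure, XML-escaped so it can sit inside a TMX <seg> element.

    escape() turns '>' into '&gt;', '<' into '&lt;' and '&' into '&amp;'.
    """
    return escape(make_uncleaned_segment(source, target))
```

For the example pair, the first function gives `{0>Your text here.<}0{>Jou teks hier.<0}`, and the second gives the same string with `&gt;` and `&lt;` in place of the angle brackets, which is what goes into the target `<seg>`.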
If you're using MS Word as your text editor, the following regex find/replace (with wildcards enabled) should do the trick:
Find what:
(\<tuv lang=\"EN-US\"\>)(*)(\<seg\>)(*)(\<\/seg)(*)(\<seg\>)(*)(\<\/seg)
Replace with:
\1\2\3\4\5\6\7 { 0 & g t ; \4 & l t ; } 0 { & g t ; \8 & l t ; 0 } \9
When using MS Word, remember to enable "Confirm conversion at open" and to use "File -> Open" to open the file, and open it as an Encoded text file, with the encoding Unicode (UTF-8).
Please let me know the syntaxes for other tools (e.g. jEdit) so that we can post the hack on the OmegaT mailing list.
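For anyone without MS Word, the same find/replace could probably be done with a short Python script. This is an untested-in-production sketch, not an official OmegaT method: the function name `uncleanify` is hypothetical, and it assumes the TMX layout shown above (source `<tuv lang="EN-US">` first, then the target `<tuv>`, one `<seg>` each).

```python
import re

# Delimiters in their XML-escaped form, without the forum-added spaces.
OPEN, MID, CLOSE = "{0&gt;", "&lt;}0{&gt;", "&lt;0}"

# Groups: 1 = source tuv + <seg>, 2 = source text, 3 = everything up to
# the target <seg>, 4 = target text, 5 = closing </seg>.
TU_PATTERN = re.compile(
    r'(<tuv lang="EN-US">\s*<seg>)(.*?)(</seg>.*?<seg>)(.*?)(</seg>)',
    re.DOTALL,
)


def uncleanify(tmx_text: str) -> str:
    """Rewrite every target segment as OPEN + source + MID + target + CLOSE."""

    def repl(match: re.Match) -> str:
        source, target = match.group(2), match.group(4)
        return (
            match.group(1) + source + match.group(3)
            + OPEN + source + MID + target + CLOSE
            + match.group(5)
        )

    return TU_PATTERN.sub(repl, tmx_text)


# Possible usage (work on a copy of project_save.tmx, not the original):
# with open("project_save.tmx", encoding="utf-8") as f:
#     text = f.read()
# with open("project_save_uncleaned.tmx", "w", encoding="utf-8") as f:
#     f.write(uncleanify(text))
```

Two caveats: running it twice on the same file would wrap the segments twice, and a TU that has a source `<seg>` but no target `<seg>` would confuse the pattern, so check the result before reloading the project.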
Then reload the project and create target documents. The resulting ODT file, converted to DOC, would possibly be accepted by Trados and Wordfast as an uncleaned file, even though it does not contain the correct styles.
Ideally, the "uncleaned" tags should be made purple, and not just any purple, but in the style tw4winMark. Doing that is beyond the scope of this post, but basically, you should create a style called tw4winMark and make it purple, and then use the "More" option in find/replace to add a style to the replace box. I'm not sure if a third-party tool can be used to do that. If you have a file translated in WF, you can try to copy one of the purple thingies into your document and then hopefully the tw4winMark style would come with it, so that you can use it in find/replace.
Samuel Murray | Netherlands | Local time: 08:22 | Member (2006) | English to Afrikaans + ...
Allow me to fix the smilies... | Jan 2, 2009
Samuel Murray wrote:
Find what:
(\<tuv lang=\"EN-US\"\>)(*)(\<seg\>)(*)(\<\/seg)(*)(\<seg\>)(*)(\<\/seg)
Replace with:
\1\2\3\4\5\6\7 { 0 & g t ; \4 & l t ; } 0 { & g t ; \8 & l t ; 0 } \9
Perhaps in ten years' time, the ProZ.com forum software will work. Until then, let me add more spaces:
Find what:
( \ & l t ; t u v l a n g = \ " E N - U S \ " \ & g t ; ) ( * ) ( \ & l t ; s e g \ & g t ; ) ( * ) ( \ & l t ; \ / s e g ) ( * ) ( \ & l t ; s e g \ & g t ; ) ( * ) ( \ & l t ; \ / s e g )
Replace with:
\ 1 \ 2 \ 3 \ 4 \ 5 \ 6 \ 7 { 0 & g t ; \ 4 & l t ; } 0 { & g t ; \ 8 & l t ; 0 } \ 9
Vito Smolej | Germany | Local time: 08:22 | Member (2004) | English to Slovenian + ... | SITE LOCALIZER
If you send me the material | Jan 2, 2009
Many thanks for any help!
... I'll patch it up with Trados. Use it to mop up NOPs on my machine instead of Seti (g).
Regards
Vito
John Fossey | Canada | Local time: 02:22 | Member (2008) | French to English + ... | TOPIC STARTER
Thanks for all the suggestions. | Jan 2, 2009
What I did, which seems to have worked (crossing fingers), is that I found that I had (fortunately) saved TMs of sections of the job, which were less than 500 TUs each - it was the client's master TM that was way too big. So I was able to import the source file into Wordfast, point it to the TMs, and run TranslateUntilNoMatch, which produced an uncleaned RTF. So far the client has accepted it.
But I would like to try the OmegaT hack suggested, because it seems to me it would be beneficial to be able to produce Trados-acceptable work with open source software. Maybe a simple macro could be recorded to do it.
Many thanks for all the suggestions, which gave me a much better understanding of the whole situation.
If you run into similar issues in the future, just chop up your TMs.
TMX is just a text file, made up of a header and the TUs in sequence after it.
You can keep the header, delete any surplus TUs so that the file fits under 500, add the closing tags at the end, and away you go.
Make as many small TMs out of a big one as you need.
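If doing this by hand is too tedious, the chopping described above could be scripted. The following Python sketch is an illustration only: `split_tmx` is a hypothetical name, the 500 default comes from the limit mentioned in this thread, and it assumes a well-formed TMX in which everything up to the opening `<body>` tag is the header and the file ends with `</body></tmx>`.

```python
import re
from typing import List


def split_tmx(tmx_text: str, max_tus: int = 500) -> List[str]:
    """Split one TMX document into several, each with at most max_tus TUs."""
    body_open = re.search(r"<body[^>]*>", tmx_text)
    if body_open is None:
        raise ValueError("No <body> element found; is this a TMX file?")

    # Everything up to and including <body> is reused as the head of each part.
    head = tmx_text[: body_open.end()]
    # \b keeps <tu ...> from matching <tuv ...>; </tu> never matches </tuv>.
    tus = re.findall(r"<tu\b.*?</tu>", tmx_text, re.DOTALL)
    tail = "\n</body>\n</tmx>\n"

    parts = []
    for start in range(0, len(tus), max_tus):
        chunk = tus[start : start + max_tus]
        parts.append(head + "\n" + "\n".join(chunk) + tail)
    return parts


# Possible usage: write each part to its own file.
# with open("big.tmx", encoding="utf-8") as f:
#     for i, part in enumerate(split_tmx(f.read()), start=1):
#         with open(f"small_{i}.tmx", "w", encoding="utf-8") as out:
#             out.write(part)
```

Plain text matching is used here rather than an XML parser because it mirrors the "TMX is just a text file" approach; it leaves the TUs byte-for-byte as they were.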
Samuel Murray | Netherlands | Local time: 08:22 | Member (2006) | English to Afrikaans + ...
Editing is disabled after 24 hours... | Jan 3, 2009
...so I can't fix things unless I repost.
Find what:
(\< t u v l a n g = \ " E N - U S \ " \ > ) ( * ) ( \< s e g \> ) ( * ) ( \< \ / s e g ) ( * ) ( \< s e g \> ) ( * ) ( \< \ / s e g )
Replace with:
\ 1 \ 2 \ 3 \ 4 \ 5 \ 6 \ 7 { 0 & g t ; \ 4 & l t ; } 0 { & g t ; \ 8 & l t ; 0 } \ 9
Samuel Murray | Netherlands | Local time: 08:22 | Member (2006) | English to Afrikaans + ...
Try this script I wrote | Jan 4, 2009
John Fossey wrote:
Now the agency has asked for the "uncleaned" RTF file for proofreading in Trados, but OmegaT only produces cleaned target files.
I've written my solution from a few posts previously into an AutoIt script (with accompanying EXE file if you don't have AutoIt installed). Download "UncleanifyTMX" here: http://leuce.com/tempfile/omtautoit/. Let me know if it works for you.
Anthony Baldwin | United States | Local time: 02:22 | Portuguese to English + ...
I primarily use OmegaT for my work, too, but, when clients require uncleaned rtf or doc files, I now use Anaphraseus ( http://anaphraseus.sourceforge.net ).
Anaphraseus works similarly to older versions of Wordfast®, as I understand it (I've never used WF), but as an extension to OpenOffice ( http://www.openoffice.org ), not MSOffice®.
Sometimes I still translate files in OmegaT (large projects, various reference TM files, etc.) and then simply use Anaphraseus to "convert" the target files to uncleaned ones by importing the project's TMX file into Anaphraseus. Other times I simply translate the file in Anaphraseus.
Here's a manual for use of the latest release of Anaphraseus:
http://www.linguasos.org/bsoft/AnaphraseusManual_1.23b.html
with screenshots, etc.
bonne chance
[Edited at 2009-07-01 13:31 GMT]
esperantisto | Local time: 10:22 | Member (2006) | English to Russian + ... | SITE LOCALIZER
Anaphraseus works exactly as Wordfast Classic does | Jul 1, 2009
The reference to older versions relates to its functionality, not to the workflow. Files produced by Anaphraseus can be cleaned up with Wordfast without any problem. Beware of the possibility of losing complex formatting, however.
Or you can use Wordfast without a license — it should produce an uncleaned document with an existing TM (an unlicensed copy won’t update a TM with new TUs, but all other features will be operable).
[Edited at 2009-07-01 14:59 GMT]