Cross-compatible terminology database format
Thread poster: TranslateMedia Translation Company
Hi Everyone,
Our agency wants to store and provide terminology databases in a format that can be imported into any CAT tool - so that people can work in the tool of their choice.
It is proving difficult to work out what the most universal format is - I thought it would be CSV, but one of our translation team suggested that TXT might be better. We are hoping that TXT or CSV will work, but understand that we may be wrong!
Does anyone know if there is a universal format that can be imported into all terminology tools in all CAT tools?
Thanks!
Matt
Laurent KRAULAND (X) | France | Local time: 14:44 | French to German + ... | Does export to TBX exist in all CAT tools? | Jul 23, 2009
Hi Laurent,
Thanks for your message, really helpful.
When I look in our tool here, MemoQ (v.3.5.22), I cannot see an option to export to TBX format; I only have the option to export to CSV or Multiterm XML format. MemoQ also appears to allow import of TermBases only in CSV or TMX format. Just wondering what the most common tools will import/export?
Thanks for your help!
Matt
Laurent KRAULAND (X) | France | Local time: 14:44 | French to German + ... | Most commonly imported/exported format | Jul 23, 2009
Matt Train wrote:
just wondering what the most common tools will import/export?
Thanks for your help!
Matt
Hi again, Matt, glad I could help in some way. AFAIK the most commonly imported/exported format is TMX (Translation Memory eXchange), but it is not a terminology database format. I hope that other colleagues can add their input to mine.
Laurent K.
CSV and TXT are basically the same thing | Jul 23, 2009
Matt Train wrote:
Hi Everyone,
Our agency wants to store and provide terminology databases in a format that can be imported into any CAT tool - so that people can work in the tool of their choice.
It is proving difficult to work out what the most universal format is - I thought it would be CSV, but one of our translation team suggested that TXT might be better. We are hoping that TXT or CSV will work, but understand that we may be wrong!
Does anyone know if there is a universal format that can be imported into all terminology tools in all CAT tools?
Thanks!
Matt
A CSV is a comma separated txt file. Now, I have no idea how CSV could possibly work as terminology data often contains commas of its own, which would screw it all up horribly. I'm sure there is a solution for that issue, but why bother when you can use tab separated? Comma separated and tab separated TXTs are almost the same thing but tab separated is a bit more user friendly I think. For starters, you can copy-paste between a tab separated txt and a spreadsheet with zero adjustment or trickery, they morph into each other by default.
So, the only really good solution that I can see is tab separated txt and/or Excel tables. (Txt wins in compatibility but xls is more familiar and more easily manageable to most users.)
Samuel Murray | Netherlands | Local time: 14:44 | Member (2006) | English to Afrikaans + ... | Any flat file is best | Jul 23, 2009
Matt Train wrote:
It is proving difficult to work out what the most universal format is - I thought it would be CSV, but one of our translation team suggested that TXT might be better.
Some CAT tools simply don't have the facility to import a simple format. There is no format that every tool can import. But your best bet is probably something tab delimited. If the translation team member meant "Trados TXT", then he's got it wrong -- the only tool that can read Trados TXT is, well, Trados. But if he meant a tab delimited file with a TXT file extension, then it's spot on.
TBX was designed (by some guy and his mates, over a cup of coffee perhaps) as a universal format, but so far very few tools can read and/or write it.
TMX may work but the problem with TMX is that there are only two fields, whereas with a tab delimited TXT file or a CSV file you can have as many fields as you can dream of.
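To make the multi-field point concrete, here is a minimal Python sketch of a tab delimited glossary with four columns. The column names, the sample entries, and the helper names `to_tab_delimited` and `from_tab_delimited` are all made up for illustration and do not come from any CAT tool; check which column layout your target tool expects before relying on it.

```python
# Hypothetical glossary: the column names and entries are illustrative only.
ROWS = [
    ["en", "fr", "definition", "note"],
    ["translation memory", "mémoire de traduction",
     "Database of source and target segments", "abbreviated TM"],
    ["term base", "base terminologique",
     "Glossary with one entry per concept, often multilingual", ""],
]


def to_tab_delimited(rows):
    """Join fields with tabs, one entry per line.

    Plain tab delimited text has no quoting, so a field that itself
    contains a tab or a line break cannot be represented and is rejected.
    """
    lines = []
    for row in rows:
        for field in row:
            if any(ch in field for ch in "\t\r\n"):
                raise ValueError(f"field contains a tab or line break: {field!r}")
        lines.append("\t".join(row))
    return "\n".join(lines) + "\n"


def from_tab_delimited(text):
    """Split the text back into rows, skipping empty lines."""
    return [line.split("\t") for line in text.split("\n") if line]


text = to_tab_delimited(ROWS)
# Commas inside a field need no special treatment in this format.
assert from_tab_delimited(text) == ROWS
```

When saving `text` to disk, UTF-8 is a reasonable default encoding, though some tools may expect something else.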
CSV is not a good choice because different tools generate different dialects of CSV that are not all mutually intelligible.
[Edited at 2009-07-23 12:20 GMT]
Samuel Murray | Netherlands | Local time: 14:44 | Member (2006) | English to Afrikaans + ... | How to handle commas in CSV | Jul 23, 2009
FarkasAndras wrote:
A CSV is a comma separated txt file. Now, I have no idea how CSV could possibly work as terminology data often contains commas of its own, which would screw it all up horribly.
Ah, but CSV is not simply a comma separated file -- it is a comma separated file with some extras to make it comma compatible. If a field contains a comma, simply put quotes on either side of the field. If a field contains a quote, simply double it. With CSV, your fields can also contain tabs and even line breaks. Some CSV programs do not accept a CSV file if there are superfluous quotes, however, and some other programs generate quotes whether they are strictly necessary or not, so you have a recipe for disaster.
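The quoting rules above can be sketched with Python's standard `csv` module. The sample fields are invented for illustration, and the two writer settings are just one way to show "quotes only where needed" versus "quotes everywhere"; how a given CAT tool reads either variant is something you would have to test.

```python
import csv
import io

# One record whose fields contain a comma, a quote, and a line break.
row = ["bank, river", 'the "left" bank', "line one\nline two", "plain"]

# Variant 1: quote only the fields that need it (the module's default).
buffer_minimal = io.StringIO()
csv.writer(buffer_minimal, lineterminator="\n").writerow(row)
minimal = buffer_minimal.getvalue()

# Variant 2: quote every field, whether strictly necessary or not.
buffer_all = io.StringIO()
csv.writer(buffer_all, quoting=csv.QUOTE_ALL, lineterminator="\n").writerow(row)
quoted_all = buffer_all.getvalue()

# A reader that follows the same rules recovers the original fields
# from both variants, including the embedded line break.
parsed_minimal = next(csv.reader(io.StringIO(minimal, newline="")))
parsed_all = next(csv.reader(io.StringIO(quoted_all, newline="")))

print(minimal)
print(quoted_all)
```

In the first variant only `plain` is left unquoted; in the second it is written as `"plain"`. A tool that is strict about one style may choke on the other, which is the interoperability problem described above.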
That said, I don't think the CSV format is sufficiently simple that people should attempt to edit it by hand. Tab delimited is simpler and more human editable.
For starters, you can copy-paste between a tab separated txt and a spreadsheet with zero adjustment or trickery...
Agreed. You can do slightly more with Microsoft Office than with OpenOffice.org, but basically a tab delimited file shows the most promise.
[Edited at 2009-07-23 12:29 GMT]
Thanks Samuel and Andras.
Interesting that a standard simple format does not exist in the practical world (although TBX may solve that in future hopefully!).
Now that we know there is no one-size-fits-all solution in this case, we can act accordingly.
Thanks for your input!