Scanned PDF Files
Thread poster: Trevor Chichester
Trevor Chichester (Identity Verified)
United States
Local time: 21:13
Member (2012)
German to English
May 17, 2012

Good Afternoon All!

So... I was wondering, what percentage of your work each year is scanned PDFs?

Strangely, more and more of my translations have been from dead PDFs. Right now, I'm working on 13K worth of dead PDFs, and to be honest, this file format is QUITE the headache to deal with.

How do you guys combat this? Do you retype the PDF? Or do you use an OCR converter?

I personally have a great OCR converter, but that doesn't mean I don't have to wade through the entire file looking for errors before putting it into Trados.

How do you guys deal with these files?



Cheers,

Trev


 
Paulo Eduardo - Pro Knowledge (Identity Verified)
Brazil
Local time: 22:13
Portuguese to English
have fun! May 17, 2012

www.freepdfconvert.com/

www.pdfonline.com/

www.freepdfconvert.com/pdf_converter_desktop.asp


 
Giles Watson (Identity Verified)
Italy
Local time: 03:13
Italian to English
In memoriam
Money talks May 17, 2012

Trevor Chichester wrote:

How do you guys deal with these files?



By quoting a hefty (at least 30%) premium for working with them.

In practice, though, I don't do any. The client either comes up with a viable file format or goes elsewhere. I know plenty of translators who are quite happy to deal with scanned images but I'm not one of them.


 
Nikita Kobrin (Identity Verified)
Lithuania
Local time: 04:13
Member (2010)
English to Russian
* May 17, 2012

Trevor Chichester wrote:
How do you guys deal with these files?


1) I ask the client to convert the PDF file into an editable format (MS Word) and send it to me for translation. I accept only converted files that are 100% identical to the PDF files from which they were converted.

2) If the client cannot produce a 100% identical conversion himself, I ask my DTP operator to do the conversion. To compensate him for his work, I charge the client extra. It's not cheap: in difficult cases the cost of conversion may be equal to the cost of translation.

Nikita Kobrin

[Edited at 2012-05-17 20:26 GMT]


 
Anton Konashenok (Identity Verified)
Czech Republic
Local time: 03:13
French to English
Just OCR it, but do it properly May 17, 2012

Nikita, your DTP operator seems to be overcharging you by a huge factor. In my own experience, OCRing a scanned text of decent quality (maybe even a good fax) has never taken me more than 10% of the time needed for translation, and I consider it good customer relations to offer it free of charge if a steady client sends me an occasional scanned document.
There is, however, an important point to remember: never run your OCR in fully automatic mode, and never allow it to format the paragraphs for you. I use FineReader, define the recognition areas by hand (selecting text or table as appropriate) and save the results as plain text. For very clear originals, I may decide to save as formatted text instead, but I delete all paragraph styles created by FineReader before doing any further work. This way, I keep only character-level formatting (font size and bold/italic/underline). Recreating the necessary paragraph format by hand takes a small fraction of the time needed to straighten out the automatically generated formatting.
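
For the plain-text route described above, part of the paragraph clean-up can be scripted. The following is only a minimal Python sketch, not something FineReader provides. It assumes the OCR output keeps the line breaks of the scan, that blank lines separate paragraphs, and that a trailing hyphen followed by a lowercase letter marks a word split across two lines. The function name `reflow_ocr_text` is made up for illustration.

```python
import re


def reflow_ocr_text(raw: str) -> str:
    """Join hard-wrapped OCR lines into one line per paragraph.

    Assumptions (not guaranteed for every scan): blank lines separate
    paragraphs, and a trailing hyphen followed by a lowercase letter
    means a word was split across two lines.
    """
    paragraphs = re.split(r"\n\s*\n", raw.strip())
    cleaned = []
    for para in paragraphs:
        # Drop empty lines and surrounding whitespace inside the paragraph.
        lines = [line.strip() for line in para.splitlines() if line.strip()]
        text = ""
        for line in lines:
            if text.endswith("-") and line[:1].islower():
                # Word split across a line break: drop the hyphen and join.
                text = text[:-1] + line
            elif text:
                # Ordinary line break inside a paragraph: join with a space.
                text += " " + line
            else:
                text = line
        cleaned.append(text)
    return "\n\n".join(cleaned)


if __name__ == "__main__":
    sample = "The docu-\nment was\nscanned.\n\nSecond para-\ngraph here."
    print(reflow_ocr_text(sample))
```

The output still needs a human read-through: a script like this cannot tell a genuine hyphenated compound from a line-end split, and it does nothing about misrecognized characters.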


 
Nadiia and Vatslav Yehurnovy
Ukraine
Local time: 04:13
Member (2008)
English to Russian
Pricing is often NOT meant to do OCRing May 18, 2012

We also have a friend who sometimes helps with OCRing and deep DTP wizardry, but we completely agree with Nikita about charging extra per hour. And then the originals in Word or other editable, not pre-OCRed formats really start to appear like magic.

Well, sometimes miracles do not happen, and so the client pays per hour for re-creating the document versions from a scanned all-tables PDF with several consecutive changes of numbers in the cells.

Anton, how about a scanned 15-page document with numerous barely legible handwritten memos with arrows etc., full of tables and block diagrams?

We just gave a quote for OCRing, drawing and typing, and received back a great Word file with everything intact, in just 3 hours.


 
Rolf Keller
Germany
Local time: 03:13
English to German
Online services vs. confidentiality May 18, 2012



Using such online services might compromise confidentiality.


 

