Newbie to scanning > Am I on the right track?

600dpi
Posts: 1
Joined: Fri Feb 24, 2012 9:28 pm

Post by 600dpi »

Hi,

I'm new to scanning sheet music and over the last few weeks have evolved the following workflow. I'd be grateful if someone "in the know" could comment on this approach or offer some pearls of wisdom! I'm using a PC and an Epson V330 flatbed (I'm only doing limited numbers of pages, not large tomes). I've downloaded IrfanView and the "Homer" package intended for DIY book scanning (which includes Scan Tailor, does OCR and makes PDFs).

I scan in black and white at 600 DPI and save as TIF files with no compression (they're all about 3500 to 4500 KB).
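As a rough sanity check on those file sizes, the uncompressed size of a 1-bit scan follows from the page dimensions and the resolution alone. The sketch below assumes a US Letter scan area (8.5 x 11 in), which is not stated in this post, and ignores the small TIFF header; the function name is made up for illustration.

```python
def uncompressed_bilevel_bytes(width_in: float, height_in: float, dpi: int) -> int:
    """Approximate pixel-data size of an uncompressed 1-bit-per-pixel scan."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # Each row is packed 8 pixels per byte and padded up to a whole byte.
    row_bytes = (width_px + 7) // 8
    return row_bytes * height_px


# Assumed scan area: US Letter, 8.5 x 11 inches, at 600 DPI.
size = uncompressed_bilevel_bytes(8.5, 11, 600)
print(f"{size / 1024:.0f} KB")
```

For that assumed page size the result is roughly 4100 KB, which sits inside the 3500 to 4500 KB range quoted above; a smaller or cropped scan area gives a proportionally smaller file.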

I scan the right-hand pages first, then the left-hand pages. The files are numbered sequentially.

I review the images quickly at this stage and redo any that are glaringly misaligned.

Since the left-hand pages (pass two) are upside down, I've written a script that calls IrfanView's command-line function. It applies TIF/Fax4 compression to both right- and left-hand pages, and in addition it vflips and hflips the left-hand pages. The converted files are saved into a subfolder and renumbered 0 to n.tif, with the file names padded to the same length. The images in this subfolder are now all in the correct order and orientation. The file size at this stage is around 70-150 KB.
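The renumbering and flipping step described above can be sketched roughly as follows. This is not the actual script: the function names, the executable path and the file names are placeholders, and it assumes each pass was scanned front to back with a right-hand page coming first. The IrfanView switches used (`/hflip`, `/vflip`, `/tifc=4`, `/convert=`) are the ones documented, to the best of my knowledge, in IrfanView's `i_options.txt`; check them against the installed version before relying on them.

```python
from itertools import zip_longest

# Placeholder path (an assumption); adjust to wherever IrfanView is installed.
IRFANVIEW = r"C:\Program Files\IrfanView\i_view64.exe"


def plan_pages(right_pages, left_pages):
    """Interleave the two passes and assign zero-padded output names.

    Returns (source, output_name, rotate_180) tuples in reading order.
    Assumes both passes run front to back and a right-hand page comes first.
    """
    ordered = []
    for right, left in zip_longest(right_pages, left_pages):
        if right is not None:
            ordered.append((right, False))
        if left is not None:
            ordered.append((left, True))  # pass-two pages are upside down
    # Pad every name to the width of the largest index.
    width = len(str(max(len(ordered) - 1, 0)))
    return [(src, f"{n:0{width}d}.tif", flip)
            for n, (src, flip) in enumerate(ordered)]


def irfanview_args(source, dest, rotate_180):
    """Build one IrfanView command line as an argument list (not executed)."""
    args = [IRFANVIEW, source]
    if rotate_180:
        args += ["/hflip", "/vflip"]  # both flips together = 180 degree turn
    args += ["/tifc=4", f"/convert={dest}"]  # 4 selects Fax4 compression
    return args


plan = plan_pages(["r1.tif", "r2.tif"], ["l1.tif", "l2.tif"])
for src, name, flip in plan:
    print(irfanview_args(src, f"ordered\\{name}", flip))
```

The sketch only prints the command lines; passing each list to `subprocess.run` would execute them. If the second pass was scanned back to front, the left-hand list would need reversing before interleaving.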

Scan Tailor is then used to apply, in sequence, orientation, deskew, content identification and margins, and finally to output the final version of the images (TIFs again) to a further subfolder.

I drag and drop this directory of third-generation images onto the Homer desktop icon and, using option 4, perform OCR on the pages and produce the final (searchable) PDF file.

So that's it so far. I arrived at the initial TIF/no compression (and large files) because I wanted to automate the rotation of the second set of images (left-hand pages) with IrfanView, so that the same compression algorithm is applied to left and right pages.

A couple of questions:

As TIF is "lossless", can I delete the initial uncompressed files? I intend to keep the secondary, renumbered, reoriented and compressed TIFs.

Is B&W, 600 DPI, TIF/No compression a sensible (practicable) approach?

I'd welcome your thoughts and comments...
cypressdome
active poster
Posts: 568
Joined: Fri Aug 27, 2010 1:10 am
Location: the piney woods of Florida

Re: Newbie to scanning > Am I on the right track?

Post by cypressdome »

Hi 600dpi (nice name!),

It appears to be a well-thought-out process for scanning, and I'd imagine the results will be very good. I'll just add a couple of points.

I'm assuming you know about setting the black and white threshold, as that can really be the difference between a good and a bad scan. Possibly your scanner's software does this automatically, but if you are scanning in black and white it should be adjustable within your scanner's TWAIN interface.

I would think the only reason to have IrfanView compress the images to Fax4 at the rotation-correcting stage would be if these are the images you will ultimately archive, because in each of the next steps the images will be re-compressed anyway (Scan Tailor using LZW and, as I understand it, Homer using JBIG2 when it packs them into a PDF file). You are correct that the Fax4-compressed black and white TIF files will be identical in quality to the uncompressed ones, so there's no point wasting disk space.

Are you planning on scanning music that contains much text? If not, I don't think that OCR and creating a searchable PDF would be worthwhile.
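To illustrate the threshold point: converting grey levels to pure black and white is a simple cut-off, and where the cut-off sits decides whether faint detail survives. The pixel values and the two thresholds below are invented for illustration, and the function name is hypothetical; real scanner software applies this step internally.

```python
def to_bilevel(gray_row, threshold):
    """Map 8-bit grey values (0 = black, 255 = white) to 1 = ink, 0 = paper."""
    return [1 if value < threshold else 0 for value in gray_row]


# Invented sample row: dark note head, faint staff line, clean paper, light smudge.
row = [30, 150, 250, 200]

print(to_bilevel(row, 128))  # lower threshold: the faint staff line is lost
print(to_bilevel(row, 180))  # higher threshold: staff line kept, smudge still dropped
```

Setting the cut-off too high would start turning smudges and paper texture into ink as well, which is why it is worth checking a test page before scanning a whole score.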

Looking forward to your submissions!
Cypressdome
thebeachbum
Posts: 1
Joined: Tue Jun 05, 2012 11:29 pm

Re: Newbie to scanning > Am I on the right track?

Post by thebeachbum »

Hey, like the name! I was wondering if you resolved your issue here. I use the same scanner and am having issues with the copies. :D