Convert a bunch of PDF documents to JPEG

Today I found a bunch of PDF documents on my disk that I wanted converted to JPEG. Debian replaced ImageMagick with GraphicsMagick some time ago, which is supposedly a bit faster and leaner than ImageMagick. So first you need to install graphicsmagick, or rewrite the script to use /usr/bin/convert instead.
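
If you would rather not hard-code one tool, a small helper can pick whichever is installed. This is only a sketch: pick_converter is a name made up for this example, and it assumes that "convert" on your PATH is ImageMagick's.

```shell
#!/bin/bash

# Hypothetical helper (not part of the script below): print the convert
# command that is available, preferring GraphicsMagick.
# On Debian, GraphicsMagick comes from the "graphicsmagick" package.
pick_converter() {
    if command -v gm >/dev/null 2>&1; then
        echo "gm convert"
    elif command -v convert >/dev/null 2>&1; then
        echo "convert"
    else
        # Neither tool found: print nothing and signal failure
        return 1
    fi
}
```

You could then run something like converter="$(pick_converter)" || exit 1 at the top of the script and call $converter instead of gm convert.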

The script takes every PDF in your current working directory, creates a subdirectory named after it, and then extracts each page of the PDF into its own JPEG image in that subdirectory. It also writes a ComicInfo.xml with the series, number, and title taken from the file name.

#!/bin/bash

# Needs graphicsmagick
[ ! -x /usr/bin/gm ] && exit 1

for file in "$PWD"/*.pdf; do
        # Skip the unexpanded pattern when no PDF matches
        [ -e "$file" ] || continue

        # File name without the directory and the .pdf extension
        name="$(basename "$file" .pdf)"

        sudo mkdir -p "$PWD/$name"
        sudo chown -R nobody:users "$PWD/$name"
        sudo gm convert "$file" \
           JPEG:"$PWD/$name/$name%02d.jpg"

        # Fields 1-2 are the series, 3 the number, the rest the title
        number="$( echo "$name" | cut -d. -f3 )"
        title="$( echo "$name" | cut -d. -f4- | sed 's,\., ,g' )"
        series="$( echo "$name" | cut -d. -f1-2 | sed 's,\., ,g' )"

        # Create the ComicInfo.xml file
        cat << EOF | sudo tee "$PWD/$name/ComicInfo.xml" >/dev/null
<?xml version="1.0"?>
<ComicInfo xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Series>$series</Series>
  <Number>$number</Number>
  <Title>$title</Title>
</ComicInfo>
EOF

done
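
The metadata part is easier to follow in isolation. Here is a minimal sketch of the same split done with bash built-ins instead of cut and sed. The function name parse_name and the sample file name are made up for illustration, and treating everything after the number as the title is my assumption about the naming scheme.

```shell
#!/bin/bash

# Hypothetical helper: split "Series.Name.Number.Title.pdf" into the
# three ComicInfo fields. Sets the globals series, number and title.
parse_name() {
    local name="${1##*/}"     # strip any leading directory
    name="${name%.pdf}"       # strip the extension
    local -a parts
    IFS=. read -r -a parts <<< "$name"   # split on dots
    series="${parts[0]} ${parts[1]}"     # first two fields
    number="${parts[2]}"                 # third field
    title="${parts[*]:3}"                # remaining fields, space-joined
}

# Example with a made-up file name:
#   parse_name "Some.Series.07.The.Title.pdf"
#   echo "$series / $number / $title"
```

Splitting into an array avoids starting three pipelines per file, but it is bash-only, which is fine here since the script already uses #!/bin/bash.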

What I had to google was how to zero-pad the output number. According to the gm man page, you just put %02d (or %03d, depending on how many pages your largest PDF has) in the desired output file name.
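
The pattern follows printf conventions, so you can try the padding in the shell first. The sketch below also derives the width from a page count. pad_width and page_pattern are names I made up for this example, and you would have to find out the page count yourself.

```shell
#!/bin/bash

# Hypothetical helper: number of digits needed for max_pages, at least 2.
pad_width() {
    local max_pages="$1"
    local width="${#max_pages}"   # digits in the page count
    if (( width < 2 )); then
        width=2
    fi
    echo "$width"
}

# Hypothetical helper: build an output pattern such as "issue%03d.jpg"
# from a prefix and a page count.
page_pattern() {
    printf '%s%%0%dd.jpg\n' "$1" "$(pad_width "$2")"
}

# What the padding itself does:
#   printf 'page%02d.jpg\n' 7    # prints page07.jpg
```

For a 120-page PDF, page_pattern issue 120 yields issue%03d.jpg, which could then go after the JPEG: prefix in the gm convert call.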