Read Scientific Papers on Your Kindle
Reading scientific papers on your Kindle (or other eBook reader) usually sucks. The text is usually only available as PDF or PS files and formatted for printing on A4 or US Letter paper. A two-column layout is also very common, which further complicates things. In this post I show you a simple way to get these papers onto your eBook reader for comfortable reading.
Step 1: Preprocessing with BRISS
First we will preprocess the file a bit to make the next step easier and more
successful. Using the cool little tool BRISS, we will crop out unnecessary
parts and leave only the main text area. The idea is to get rid of line
numbers, notes in the margin (e.g. the arXiv line in our test document), etc.
BRISS is a graphical tool. You can use the menu to load the PDF or
just start it from the terminal:
briss Text\ Understanding\ from\ Scratch.pdf
You will be prompted to enter the range of pages that will
be analyzed to find the main text body. Usually it’s fine to just leave it at
BRISS now tries to find the main text area.
Tweak the boxes until they cover only the relevant text, then crop the PDF via
Action > Crop PDF. We now have a PDF document with all the
potentially misleading fluff cut out and can move on to the next step.
Step 2: Optimizing with k2pdfopt
./k2pdfopt -ppgs -dev kpw -mode 2col Text\ Understanding\ from\ Scratch_cropped.pdf
And that’s it, now you have a Kindle optimized PDF!
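If you have a whole folder of cropped papers, the same command can be wrapped in a small loop. This is only a rough sketch: it assumes the cropped files keep the _cropped.pdf suffix seen above, and the K2PDFOPT variable and the convert_cropped function are names made up for this example. k2pdfopt may still show its interactive menu for each file.

```shell
#!/usr/bin/env bash
# Sketch: run k2pdfopt with the flags from this post on every cropped PDF.
# K2PDFOPT is a hypothetical override for the location of the binary.
K2PDFOPT="${K2PDFOPT:-./k2pdfopt}"

# convert_cropped is a made-up helper name; the directory defaults to ".".
convert_cropped() {
    local dir="${1:-.}" pdf
    for pdf in "$dir"/*_cropped.pdf; do
        # Skip the unexpanded pattern when no file matches.
        [ -e "$pdf" ] || continue
        # Quoting "$pdf" handles file names with spaces, so no backslashes needed.
        "$K2PDFOPT" -ppgs -dev kpw -mode 2col "$pdf"
    done
}
```

Source the file and call, for example, convert_cropped ~/papers to process everything in that directory.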
Warning: the default modes include the -n flag, which enables native PDF
output. This is the preferable mode since it leads to smaller, better files
because it uses native PDF instructions instead of rendering the pages to
bitmaps. However, the Kindle (at least the 1st gen Paperwhite) may crash when
opening files generated with this option, because it runs out of memory.
This forced me to factory reset my device a couple of times early on.
Solution: either disable native output, leading to bigger, uglier files, or
install Ghostscript (if you haven’t already) and include the -ppgs option.
This will post-process the file using Ghostscript and fix the issue.
Then head over to GitHub and open an issue please!
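Since the -ppgs option in the command above depends on Ghostscript, it can help to check for it before converting. The sketch below assumes the Ghostscript binary is installed under the name gs (the usual name on Linux and macOS, but an assumption here), and ppgs_args is a made-up helper name.

```shell
#!/usr/bin/env bash
# Sketch: only hand out -ppgs when a Ghostscript binary is on the PATH.
# Assumption: Ghostscript is installed as "gs".
ppgs_args() {
    if command -v gs >/dev/null 2>&1; then
        printf '%s\n' '-ppgs'
    else
        printf '%s\n' 'Ghostscript (gs) not found; install it before using -ppgs' >&2
        return 1
    fi
}
```

One way to use it is ppgs_args && ./k2pdfopt -ppgs -dev kpw -mode 2col yourfile.pdf, so the conversion only starts when Ghostscript is available.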