Irreversible conversion to/from pdf

macias
Posts: 28
Joined: 2008-12-10T13:44:19-07:00

Irreversible conversion to/from pdf

Post by macias »

Opensuse 11.1, IM 6.4.3.6

This problem is really old, and I already discussed it on the ML, but without much interest -- maybe because I hadn't found a good use case to test the problem. Now I have one, because other people were bitten by this bug.

From my POV (as a user) I should be able to convert an image to pdf, then from pdf back to an image, and get a duplicate. At worst, I should be able to do so with some additional arguments.

IM currently fails to provide such a reversible conversion for some really simple files, even if I add arguments such as size or density.
image -> pdf: works fine, I get a 1:1 copy
pdf -> image: I get "white paper" with the actual image taking up 25% of the space, in the lower left corner

File to test it:
http://launchpadlibrarian.net/16005528/ ... output.zip

convert testpic.jpg test.pdf
convert test.pdf bad.png

Try it yourself. Maybe I am wrong, but if any pdf reader can read the pdf file and get the proper image and "paper" size, then this information should be used during conversion too.

If I can be of any help, please let me know.

edit: not mine, but another person's report of this bug at Ubuntu Launchpad:
https://bugs.launchpad.net/ubuntu/+sour ... bug/248329
(about opposite direction of conversion actually)
magick
Site Admin
Posts: 11064
Joined: 2003-05-31T11:32:55-07:00

Re: Irreversible conversion to/from pdf

Post by magick »

ImageMagick rasterizes the PDF into image pixels, so it's not reversible. Instead you want a program that supports PDF in its original vector format.
macias
Posts: 28
Joined: 2008-12-10T13:44:19-07:00

Re: Irreversible conversion to/from pdf

Post by macias »

a) Do you mean that when I convert an image to pdf I don't get the exact image, but some average of it? This is off-topic, though; I am just curious.

b) Please run the conversion as I described it -- the loss is not a matter of one pixel; the entire image is scaled down while the size of the "paper" is maintained. The difference is hard to miss, believe me.

Original
Image

After converting to pdf and back from pdf

Image

edit: The middle step (pdf) is just fine. I would like to carry the pdf quality through to the final step; as you can see, around 94% of the pixels are simply lost.
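
The 94% figure can be checked with a little arithmetic. This is only a sketch: it assumes the source image is tagged 300 dpi and the PDF is rasterized at 72 dpi (the values reported later in this thread), and the function name is made up for illustration.

```python
def fraction_of_area_kept(source_dpi: float, raster_dpi: float) -> float:
    """Fraction of the canvas area still covered by the content when each
    side of the content is scaled by raster_dpi / source_dpi."""
    linear_scale = raster_dpi / source_dpi
    return linear_scale ** 2


# Assumed values: 300 dpi source, rasterized at 72 dpi.
kept = fraction_of_area_kept(source_dpi=300, raster_dpi=72)
lost = 1 - kept
print(f"kept {kept:.1%}, lost {lost:.1%}")  # kept 5.8%, lost 94.2%
```

Under those assumptions each side shrinks to 24% of its original length, so only about 6% of the area remains covered by the image.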
anthony
Posts: 8883
Joined: 2004-05-31T19:27:03-07:00
Location: Brisbane, Australia

Re: Irreversible conversion to/from pdf

Post by anthony »

See...
A word about Vector Image formats
http://www.imagemagick.org/Usage/formats/#vector

From what I could see NONE of the image was lost. It was just
rasterised into a set of pixels at a smaller resolution.

PDF files generally do not contain pixels at all; they contain lines and shapes (vectors). IM deals in pixels, arrays of pixels (rasters), not lines or shapes.
Anthony Thyssen -- Webmaster for ImageMagick Example Pages
https://imagemagick.org/Usage/
macias
Posts: 28
Joined: 2008-12-10T13:44:19-07:00

Re: Irreversible conversion to/from pdf

Post by macias »

> From what I could see NONE of the image was lost. It was just
> rasterised into a set of pixels at a smaller resolution.

Please, zoom it in -- you will see the difference.

It is impossible to shrink a 1000x1000 image into a 10x10 area and get all the data back. Besides, for some images IM does the correct thing -- it converts the image while preserving both the image size and the "paper" size (for jpg they are the same, but I as a human can tell the difference) -- and for some it does not.

My point is that scaling down the image is a bug, because in that case:
a) IM is unpredictable (it is hard to know in advance what IM will do)
b) there is quality loss

IM was not told to scale down the image, and I would like to get "the same" (visually) content.

* the same -- I don't mind a few lost pixels; that is not the issue here
macias
Posts: 28
Joined: 2008-12-10T13:44:19-07:00

Re: Irreversible conversion to/from pdf

Post by macias »

Original picture:
Image

The pdf obtained from it looks the same, so the pdf contains all the data required for conversion from it.

After conversion from pdf:
Image

And since some don't believe in data loss, I manually zoomed the "after conversion" picture:
Image

Are you sure you can read the small text in the last picture (click on the image to show it in full size)?
anthony
Posts: 8883
Joined: 2004-05-31T19:27:03-07:00
Location: Brisbane, Australia

Re: Irreversible conversion to/from pdf

Post by anthony »

You will always get some data loss at a different resolution. I never said it was perfect or that the results were the same, only that the image is correct for the resolution it was rasterized at. Everything is present, including the small text (even if it is impossible to actually read!)

In fact rasterising is itself a loss of information about the individual objects in the image. You can no longer easily separate them or move them around!

The point is you need to specify the resolution you want, both for input and for output.

Printers and monitors know what the final resolution will be, but IM, as a batch program without any physical hardware, does not. You need to specify it.
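
As a rough illustration of why the resolution has to be given: PDF page sizes are expressed in points (1/72 inch), so the pixel dimensions of a rasterized page depend entirely on the density chosen. A minimal sketch, assuming a US Letter page of 612x792 points (a hypothetical example, not a file from this thread) and a helper name invented here:

```python
POINTS_PER_INCH = 72  # PDF default user-space unit: 1 point = 1/72 inch


def raster_size(width_pt: float, height_pt: float, density_dpi: float) -> tuple:
    """Pixel dimensions of a page of the given size (in points) when it is
    rasterized at density_dpi pixels per inch."""
    width_px = round(width_pt * density_dpi / POINTS_PER_INCH)
    height_px = round(height_pt * density_dpi / POINTS_PER_INCH)
    return (width_px, height_px)


# Hypothetical US Letter page (612x792 pt) at three densities.
for density in (72, 150, 300):
    print(density, raster_size(612, 792, density))
```

At 72 dpi, ImageMagick's default density, one point maps to one pixel; any other density scales the whole page proportionally.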

PDF files do not have an 'ideal' resolution unless they contain raster images internally. Finding the resolution for one-to-one pixel mapping for internal raster images is a major problem with no simple solution because the wrapping format is vector without an 'ideal' resolution.

If you know of a simple solution to extracting raster images from PDFs and PostScript perfectly, I would love to hear about it!

However this is not the subject you specified as having a problem with!
macias
Posts: 28
Joined: 2008-12-10T13:44:19-07:00

Re: Irreversible conversion to/from pdf

Post by macias »

anthony wrote: The point is you need to specify the resolution you want, both for input and for output.

I cannot believe there is such miscommunication :-), maybe my English is that bad. So ok, in steps:

Example:
a) I have a 200x200 pixel image
b) the image itself is a blue rectangle, 200x200 pixels
c) I create a pdf file from it
d) the pdf looks just like that -- a 200x200 pixel image ("paper") with a 200x200 rectangle in it
e) I create an image from the pdf (convert)
f) as the result I get a 200x200 pixel image with a 40x40 pixel blue rectangle

ad b) the image info says it is 300 dpi
ad f) the image info says it is 72 dpi


Now compare this to these steps:
a) create the IM wizard logo image -- a 640x480 image
b) the image contains the wizard, ~640x480 pixels
c) create a pdf file from it
d) the pdf looks like the original image
e) create an image from the pdf
f) you will get a 640x480 image with a ~640x480 wizard

The cause might be that the wizard is 72 dpi, while the image in the first example is 300 dpi. But:

My points are:
1) the size of the output image is obtained 100% correctly
2) the content of the pdf should be retrieved while preserving its ratio in relation to the image size
3) the problem is not the size of the image, but the relative sizes -- in the first case the image is 4-5 times bigger than its content
4) if you run convert with an additional density argument, the density will of course change, but the lack of proportion still occurs -- as expected (at least I expected it)

Please run those tests yourself and try to get, as the final result, an NxN pixel image with an NxN pixel blue rectangle (*).
This will save us some communication (English) problems.
anthony wrote: Finding the resolution for one-to-one pixel mapping for internal raster images is a major problem with no simple solution because the wrapping format is vector without an 'ideal' resolution.

First -- I don't care about one-to-one pixel mapping. Second -- (for this thread) it is quite simple. The part that reads the image size is already done, and done correctly. So the size of all the elements should be such that the proportions are maintained. If a dot in vector graphics takes up 25% of the image, it should take up 25% of the image in bitmap form.

I am talking about proportions (just to emphasize that it is not about the size of a single entity).

Cheers,

(*) in my experience, no matter what I do, ImageMagick produces an NxN image with LxL content where N=~4*L in the first case. N and L should be equal, just as in the wizard (72 dpi) case.
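
The two cases are consistent with a simple model. It is offered only as an assumption about the observed behaviour, not as a description of ImageMagick's code: the output canvas keeps the source's pixel dimensions, while the content is scaled by the rasterization density divided by the source's dpi. The function name is hypothetical.

```python
def predicted_content_size(width_px: int, height_px: int,
                           source_dpi: float, raster_dpi: float = 72) -> tuple:
    """Content size after the round trip under the assumed model: the content
    is scaled by raster_dpi / source_dpi, the canvas is left unchanged."""
    return (round(width_px * raster_dpi / source_dpi),
            round(height_px * raster_dpi / source_dpi))


# First case: 200x200 image tagged 300 dpi.
print(predicted_content_size(200, 200, source_dpi=300))  # (48, 48)

# Second case: 640x480 wizard image tagged 72 dpi.
print(predicted_content_size(640, 480, source_dpi=72))   # (640, 480)
```

Under this model the canvas-to-content ratio in the first case is 300/72, roughly 4.2, which is in line with the N=~4*L observation, and the 72 dpi image comes through unchanged.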
macias
Posts: 28
Joined: 2008-12-10T13:44:19-07:00

Re: Irreversible conversion to/from pdf

Post by macias »

I think the simplest test case is this:

(1) source pdf file -- entirely black page
(2) aim -- convert it to image (png) to get black image (as in pdf file)

ad.2) black, and only black

Bug: IM produces a white image with a black rectangle, whether or not density is specified
macias
Posts: 28
Joined: 2008-12-10T13:44:19-07:00

Re: Irreversible conversion to/from pdf

Post by macias »

So, a straightforward test; you can do it at home too.

Source: 640x480, 300 dpi, png file. All black.

Image

pdf
convert black.png black.pdf
Image

png (back)
convert black.pdf black-default.png
Image

png (back) forced 300 dpi
convert -density 300 black.pdf black-300.png
Image

I hope this is a good test example, and that it is now crystal clear where the problem is.
magick
Site Admin
Posts: 11064
Joined: 2003-05-31T11:32:55-07:00

Re: Irreversible conversion to/from pdf

Post by magick »

You may need to upgrade your version of ImageMagick. We issued your commands and all images that were generated were black as expected.
macias
Posts: 28
Joined: 2008-12-10T13:44:19-07:00

Re: Irreversible conversion to/from pdf

Post by macias »

Thank you for the reply -- I upgraded to IM 6.4.9-1 (I don't see any newer version on the servers), and still get the same results (white "border").

Any hints as to what could be wrong? Thank you in advance.

P.S. Did you download the images, or recreate them on your own? If the latter, the first image has to be 300 dpi.
magick
Site Admin
Posts: 11064
Joined: 2003-05-31T11:32:55-07:00

Re: Irreversible conversion to/from pdf

Post by magick »

Ok, we can reproduce the problem now. We will have a fix in the Subversion trunk within a day or two. In the meantime, add -density 72 to your command line.
macias
Posts: 28
Joined: 2008-12-10T13:44:19-07:00

Re: Irreversible conversion to/from pdf

Post by macias »

I upgraded to the 6.4.9-2 release, and the problem is still there. After converting from the pdf file, I get a white border.
macias
Posts: 28
Joined: 2008-12-10T13:44:19-07:00

Re: Irreversible conversion to/from pdf

Post by macias »

IM 6.4.9-7, exactly the same result as before (btw. -density 72 does not change a thing; it only makes the entire file/image smaller, but the white border is still there).

The same report for GS:
http://bugs.ghostscript.com/show_bug.cgi?id=690284
and I confirm GS works (though of course only in one direction, pdf->png).