I'm trying to use appimagetool (https://appimage.org/) to create a single-binary executable of the OCR program tesseract (https://github.com/tesseract-ocr). I have built tesseract on Ubuntu 19.10, and I want the executable to run on Ubuntu 14.04.

NOTE: I do not have control over the old version of Ubuntu, and I need features that are only in recent versions of tesseract. I have already tried an existing AppImage of tesseract, and it fails in a similar way to what's detailed below.

Somewhat following this tutorial: https://appiomatic.com/blog/creating-appimage-binary-manually-for-linux-from-your-app/ I created a tesseract.AppDir with the requisite layout:

tesseract.AppDir/AppRun
tesseract.AppDir/.DirIcon
tesseract.AppDir/tesseract.desktop
tesseract.AppDir/tesseract.png
tesseract.AppDir/usr
tesseract.AppDir/usr/bin
tesseract.AppDir/usr/bin/tesseract
tesseract.AppDir/usr/lib
tesseract.AppDir/usr/lib/libtesseract.so.5
tesseract.AppDir/usr/lib/libtesseract.so.5.0.0
tesseract.AppDir/usr/share
tesseract.AppDir/usr/share/tessdata
tesseract.AppDir/usr/share/tessdata/eng.traineddata
tesseract.AppDir/usr/share/tessdata/tessconfigs

And created the AppImage:

[Ubuntu 19.10]$ ~/Downloads/appimagetool-x86_64.AppImage tesseract.AppDir
appimagetool, continuous build (commit effcebc), build 2084 built on 2019-05-01 21:02:41 UTC
Using architecture x86_64
/home/kingsley/Software/Tesseract/tesseract/tesseract.AppDir should be packaged as Tesseract-OCR-x86_64.AppImage
Generating squashfs...
Parallel mksquashfs: Using 6 processors
Creating 4.0 filesystem on Tesseract-OCR-x86_64.AppImage, block size 131072.
[=======================================================================================================================|] 1921/1921 100%
Exportable Squashfs 4.0 filesystem, gzip compressed, data block size 131072
    compressed data, compressed metadata, compressed fragments, compressed xattrs
    duplicates are removed
Filesystem size 73511.40 Kbytes (71.79 Mbytes)
    30.95% of uncompressed filesystem size (237490.75 Kbytes)
Inode table size 5971 bytes (5.83 Kbytes)
    57.29% of uncompressed inode table size (10423 bytes)
Directory table size 1019 bytes (1.00 Kbytes)
    56.90% of uncompressed directory table size (1791 bytes)
Number of duplicate files found 0
Number of inodes 92
Number of files 78
Number of fragments 5
Number of symbolic links  3
Number of device nodes 0
Number of fifo nodes 0
Number of socket nodes 0
Number of directories 11
Number of ids (unique uids + gids) 1
Number of uids 1
    root (0)
Number of gids 1
    root (0)
Embedding ELF...
Marking the AppImage as executable...
Embedding MD5 digest
Success

However, after copying it to the older system, it would not run, reporting that libpng16.so.16 was missing.

[Ubuntu14]$ ./Tesseract-OCR-x86_64.AppImage 
tesseract: error while loading shared libraries: libpng16.so.16: cannot open shared object file: No such file or directory

Further research led me to believe that I had to manually copy in all the dependencies.

So using ldd on the tesseract executable:

[Ubuntu 19.10]$ ldd LOCAL_INSTALL/bin/tesseract 
    linux-vdso.so.1 (0x00007fffd7937000)
    libtesseract.so.5 => not found
    liblept.so.5 => /usr/lib/x86_64-linux-gnu/liblept.so.5 (0x00007f44c03d3000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f44c03b0000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f44c01c2000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f44c01a8000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f44bffb7000)
    libpng16.so.16 => /usr/lib/x86_64-linux-gnu/libpng16.so.16 (0x00007f44bff7d000)
    libjpeg.so.8 => /usr/lib/x86_64-linux-gnu/libjpeg.so.8 (0x00007f44bfef8000)
    libgif.so.7 => /usr/lib/x86_64-linux-gnu/libgif.so.7 (0x00007f44bfeed000)
    libtiff.so.5 => /usr/lib/x86_64-linux-gnu/libtiff.so.5 (0x00007f44bfe6c000)
    libwebp.so.6 => /usr/lib/x86_64-linux-gnu/libwebp.so.6 (0x00007f44bfc03000)
    libopenjp2.so.7 => /usr/lib/x86_64-linux-gnu/libopenjp2.so.7 (0x00007f44bfbad000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f44bfa5c000)
    libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f44bfa40000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f44c0706000)
    libzstd.so.1 => /usr/lib/x86_64-linux-gnu/libzstd.so.1 (0x00007f44bf999000)
    liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f44bf972000)
    libjbig.so.0 => /usr/lib/x86_64-linux-gnu/libjbig.so.0 (0x00007f44bf764000)

I then copied all those shared libraries into tesseract.AppDir/usr/lib/ and rebuilt the AppImage.
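The copy step can be scripted rather than done by hand. This is a minimal sketch, not a definitive recipe: the helper name resolved_libs is hypothetical, and it only parses the ldd output format shown above. Note that it lists libc.so.6 as well, so it reproduces exactly what was copied here.

```shell
#!/bin/sh
# List the resolved library paths from `ldd` output read on stdin.
# Lines without a "=> /path" part are skipped: linux-vdso, the
# ld-linux loader line, and any "not found" entries.
resolved_libs() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }'
}

# Hypothetical usage with the paths from the question
# (cp -L copies the file a symlink points to, not the link itself):
#   ldd LOCAL_INSTALL/bin/tesseract | resolved_libs |
#       xargs -I{} cp -L {} tesseract.AppDir/usr/lib/
```

Libraries reported as "not found" (libtesseract.so.5 above) still have to be copied from wherever they were built.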

Testing on Ubuntu 14 still failed:

[Ubuntu14]$ ./Tesseract-OCR-x86_64.AppImage 
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)

EDIT: I retried making the AppImage, adding the missing .so files one by one. The segmentation fault only appeared once I copied in libc.so.6. However, if I leave this library out, running the executable fails with:

tesseract: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.22' not found (required by /tmp/.mount_Tesser6wDkZB/lib/liblept.so.5)

It seems that liblept.so.5 is the problem.
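One way to check which bundled library asks for a newer glibc is to look at the symbol versions it references. The sketch below assumes objdump (from binutils) and GNU sort are available on the build machine; the helper name max_glibc is hypothetical.

```shell
#!/bin/sh
# Print the highest GLIBC_x.y symbol version mentioned on stdin.
# GLIBC_PRIVATE is ignored because the pattern requires a digit
# after the underscore. sort -V orders 2.2.5 < 2.4 < 2.22.
max_glibc() {
    grep -o 'GLIBC_[0-9][0-9.]*' | sort -u -V | tail -n 1
}

# Hypothetical usage against each bundled library:
#   objdump -T tesseract.AppDir/usr/lib/liblept.so.5 | max_glibc
```

The result can then be compared with the glibc version on the target, which `ldd --version` prints.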

Now I'm pretty much out of ideas.

  • Is this not a use-case for AppImages?
  • Is there a way to debug what's going wrong?
  • Is there a tool that automatically finds the dependencies?
  • Is Ubuntu 14.04 just too old a target, and should I give up and go back to using gocr?
  • Is there a way to debug what's going wrong ?

    Yes, you can use strace and the LD_DEBUG=libs environment variable to see what's being loaded. For more information about debugging AppImages, see:
    https://appimage-builder.readthedocs.io/en/latest/advanced/troubleshooting.html

  • Is there a tool that automatically finds the dependencies?

    Yes, please check https://github.com/AppImage/awesome-appimage#build-systems

    Which one you should use depends on whether your app can be built on the oldest stable system you want to support. If the answer is yes, you can use linuxdeploy; otherwise you can use appimage-builder. I would recommend reading this entry to decide which tool to use.

  • Is Ubuntu 14.04 just too old a target, and I should give up and go back to using gocr.

    Probably, you can use appimage-builder to build your AppImage on Ubuntu 20.04.
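For the LD_DEBUG=libs suggestion, the output is verbose, so it helps to filter it. A minimal sketch, assuming the loader prints its candidate paths on lines containing "trying file=" (which is what glibc does on the systems I have seen); the helper name tried_files is hypothetical.

```shell
#!/bin/sh
# From LD_DEBUG=libs output on stdin, keep only the paths the
# dynamic loader tried to open, one per line.
tried_files() {
    sed -n 's/.*trying file=//p'
}

# Hypothetical usage (LD_DEBUG output goes to stderr, hence 2>&1):
#   LD_DEBUG=libs ./Tesseract-OCR-x86_64.AppImage 2>&1 | tried_files
```

This shows whether a library is being picked up from the mounted AppImage or from the host system.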

    Many thanks for your time. This answer pointed me in the right direction to solving it myself. – Kingsley Nov 5, 2020 at 3:25

    In case someone is here looking for how to actually solve this issue, this is how I worked it out.

    By adding libraries one by one, I was able to determine that the core of the problem was that liblept.so.5 was compiled against GLIBC 2.22, which the Ubuntu 14 target did not have. I found that this was the only library that had this issue.

    However, simply including libc.so.6 as well caused all those segmentation faults. I don't know why this is, and I would still like to know. So I looked around for alternatives.

    One path I tried was compiling tesseract to link with a static liblept, but this did not immediately work, and I did not have time to investigate it fully. Perhaps this was a good approach. Eventually I settled on compiling a local leptonica library on the target, so that there was a native Ubuntu 14 version of the .so, and the AppImage would just use that.

    Obviously this is not the best solution, because it's not incorporated into the package, but that was enough to get it working for me.

    You cannot use the libc.so.6 binary without the companion ld-linux. There are other closely related libraries that must be bundled too. A close list of those files can be found at: github.com/AppImageCrafters/appimage-builder/blob/master/… – Alexis Nov 5, 2020 at 4:35

    If you bundle libc.so.6 but still load some libraries from the system (which is a totally valid approach), you will have to detect at runtime which lib version is newer, the embedded one or the one in the system, and select it. Here is a post about the topic: azubieta.net/appimage/appimagebuilder/ld_so/glibc/2020/03/09/… – Alexis Nov 5, 2020 at 4:38

    As a final comment, we have already solved this kind of issue in appimage-builder, and it's totally transparent for the user. So if you want/need to embed libc.so.6, then you should use appimage-builder for sure. – Alexis Nov 5, 2020 at 4:39
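To illustrate the first comment, one way to keep the bundled libc.so.6 and its loader together is to start the program through the bundled ld-linux from AppRun. This is only a sketch under stated assumptions: the helper write_apprun is hypothetical, the loader is assumed to have been copied to lib64/ inside the AppDir, and the libraries are assumed to be in usr/lib/. It does not do the runtime version selection that appimage-builder performs.

```shell
#!/bin/sh
# Write a minimal AppRun into the given AppDir (hypothetical helper).
# Assumed layout: lib64/ld-linux-x86-64.so.2, usr/lib/, usr/bin/tesseract.
write_apprun() {
    appdir=$1
    cat > "$appdir/AppRun" <<'EOF'
#!/bin/sh
# Resolve the mounted AppDir from the location of this script.
HERE=$(dirname "$(readlink -f "$0")")
# Start the program through the bundled loader so that the loader
# and libc.so.6 come from the same glibc build.
exec "$HERE/lib64/ld-linux-x86-64.so.2" \
    --library-path "$HERE/usr/lib" \
    "$HERE/usr/bin/tesseract" "$@"
EOF
    chmod +x "$appdir/AppRun"
}

# Hypothetical usage:
#   write_apprun tesseract.AppDir
```

The --library-path option tells the loader where to search for shared libraries, in place of LD_LIBRARY_PATH.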
