Future work

While fixing long-shebangs: there are myriad bugs affecting configuration when the prefix contains single quotes. Even the parts which do work are unlikely to keep working if the path contains a double-quote. Double-check whether this was covered by the old rejected pair of PRs on configure (for cycling make reconfigure) and either resurrect or redo this along with a CI test with a tortuous prefix.
-use-runtime should actually conflict with -custom (this extends a fix in long-shebangs a bit further)
caml_parse_ld_conf appears to have a totally broken read loop - there's no check that it actually read the whole file and EINTR is not handled?! The latter may not matter, but shouldn't we ensure that we've read to EOF?!
Relative locations in ld.conf required starting to look at ignoring OCAMLLIB and CAMLLIB when they're empty - this should pervade the whole distribution (i.e. an empty environment variable should be equivalent to no environment variable because of the impossibility on Windows)
Documentation issues in https://ocaml.org/manual/intfc.html#ss:custom-runtime
- threads.cma requires either -thread or -I +threads
- The example should mention -noautolink - you end up in the strange situation that you've built a runtime system with all the required primitives, but the linking with unix.cma still adds dllunix.so to the resulting bytecode!
Supporting runtime-variant _shared properly in native mode (or at least consistently). In bytecode, ocamlc -custom -runtime-variant _shared -o foo creates foo linked against libcamlrun_shared.so. However, this seems like a terrible hack both in bytecode and native. Why can't we have libcamlrund_shared.so, for example? It sounds as though shared and/or static runtime should be a separate option (both for bytecode and native code) with dllcamlrun.so and dllasmrun.so. dllasmrun.a can be the current _pic variant which IIUC is used to link the runtime statically but when compiling an OCaml .so.
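
To make the caml_parse_ld_conf concern above concrete, here is a minimal sketch of a read loop which retries on EINTR and keeps reading until the buffer is full or EOF is reached. The name read_fully is hypothetical and this is not the runtime's actual code; it only illustrates the two checks the item asks about, assuming a POSIX read.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper (not the runtime's code): read up to `len` bytes from
   `fd` into `buf`, retrying on EINTR and continuing after short reads.
   Returns the number of bytes read (less than `len` only if EOF was reached
   first), or -1 on a genuine error. */
ssize_t read_fully(int fd, char *buf, size_t len)
{
  size_t total = 0;
  assert(buf != NULL);
  while (total < len) {
    ssize_t n = read(fd, buf + total, len - total);
    if (n == 0) break;                /* EOF */
    if (n < 0) {
      if (errno == EINTR) continue;   /* interrupted by a signal: retry */
      return -1;
    }
    total += (size_t)n;
  }
  return (ssize_t)total;
}
```

A caller which knows the file size from stat can then compare the return value against it to confirm that the whole of ld.conf was read.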

Tests

Searching for correct ocamlrun
i. Two different compilers installed in PATH (critically, they must have an incompatible runtime)
ii. Both shebang and executable header bytecode executables compiled with each compiler.
iii. These four executables will run with PATH empty or with PATH swapped. Note that on Windows (which doesn't embed the absolute PATH in the default runtime) only the executable compiled against the first compiler in PATH will succeed and neither will succeed if PATH is cleared.
Finding stub libraries
i. Two different compilers, as before (must have incompatible dllunix)
ii. Build a bytecode executable using Unix
iii. Clear CAML_LD_LIBRARY_PATH - both should work.
iv. Set CAML_LD_LIBRARY_PATH to contain both stublib directories
v. One executable should fail (all platforms) since it will load dllunix.so from the first directory
Moving the runtime
i. Set-up a compiler and put its bin in PATH
ii. Compile a bytecode executable
iii. Move the runtime and update PATH
iv. At present, executable will fail
OCAMLLIB madness
i. Build two compilers of different versions
ii. Put one in PATH and set OCAMLLIB to the lib of the other
iii. Try to build something. This should fail.
iv. New system: setting OCAMLLIB-ID should allow it to work; without setting it, the message should be clear that OCAMLLIB is preventing it from working (i.e. it should detect that a valid library is present at the preconfigured location)
Cloning a compiler
i. Build a compiler
ii. Rename its directory
iii. Try to use it.

22-Nov-2024 Notes
1–3, 5 covered by installation-tests branch. 4 is out of scope (no plans to add OCAMLLIB-ID - the restricted version was to harden the error messages).

Notes

(notes from the original presentation)

Three problems
- Loading stublibs for wrong version (dllunix.so; CAML_LD_LIBRARY_PATH abuse)
- Absolute shebang (move ocamlrun somewhere else, e.g. local switch)
- Standard library not found (moved lib/ocaml somewhere else, OCAMLLIB abuse)
The runtime variant (slide 7) should not be included. The variant should always be drop-in compatible for any given runtime ID. For native code, that might mean that PIC and SHARED are possibly not variants?
stdlib.cma should not be suffixed (Slide 10) - suffixing should be done using the directory it's found in. What is worth considering is whether the .cma and .cmxa format should embed the system ID for improved messages?
Environment variables must always be respected (Slide 10) - but when an error message is about to be displayed (e.g. OCAMLLIB points to clearly wrong stdlib.cma) then ocamlc can do the originally proposed ignoring of it and see what happens (i.e. ocamlc should determine if OCAMLLIB appears to be wrong and tell the user, but not just act as though it weren't there)
camlheader should be folded into a linker option, meaning all systems will have the executable versions in stdlib dir (and will generate shebang inside the driver)
--enable-relative-libdir should be the default with the absolute directory only being a fallback for when the full path to $0 cannot be found or the relative path from $0 cannot be found. There should be a new option not to embed the absolute path at all? Remember that ocamlrun uses the stdlib dir to locate ld.conf, so there is a case for keeping the absolute path in the search too. The big change in ocamlrun's behaviour is that ld.conf is loaded from $OCAMLLIB, $CAMLLIB and the preconfigured location (which may be relatively resolved). This is a separate PR.
Hash function - use 32-bits of md5 (original proposal) or use Murmur3 (in Hashtbl already) with an appropriate key for direct 32-bit?
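
As a rough illustration of the --enable-relative-libdir note above, the following sketch tries the directory relative to the executable first and falls back to the absolute configured directory when the executable path is unknown or the relative directory does not exist. The function name choose_libdir and its parameters are hypothetical, it assumes POSIX paths and stat, and it deliberately ignores the $OCAMLLIB/$CAMLLIB part of the search.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical sketch: choose the Standard Library directory.
   exe_path   - full path to the running executable, or NULL if unknown
   rel_libdir - library directory relative to the executable's directory
   abs_libdir - absolute directory recorded at configure time (fallback)
   Writes the chosen directory to `out`; returns 0, or -1 if it won't fit. */
int choose_libdir(const char *exe_path, const char *rel_libdir,
                  const char *abs_libdir, char *out, size_t out_len)
{
  struct stat st;
  int n;
  assert(rel_libdir != NULL && abs_libdir != NULL && out != NULL);
  if (exe_path != NULL) {
    const char *slash = strrchr(exe_path, '/');
    if (slash != NULL) {
      /* <directory of executable>/<relative libdir> */
      n = snprintf(out, out_len, "%.*s/%s",
                   (int)(slash - exe_path), exe_path, rel_libdir);
      if (n >= 0 && (size_t)n < out_len
          && stat(out, &st) == 0 && S_ISDIR(st.st_mode))
        return 0;
    }
  }
  /* Either $0 could not be resolved or the relative directory is missing. */
  n = snprintf(out, out_len, "%s", abs_libdir);
  return (n >= 0 && (size_t)n < out_len) ? 0 : -1;
}
```

An option not to embed the absolute path at all would correspond to returning an error instead of taking the final fallback.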

Rebase February 2023

Target commit is 1c5e748. Moving each branch in turn, but with relocatable-base-trunk still on d98fd80.

windows-ln-5.0 and compiled-primitives-5.0 need to be rebased on to 5.0 and stack updated with @5.0
Decision on ocamltest branch - does it fold into just a 5.0 version? No - branch contains tweaks for CI testing of the backport branches, not upstreamed work
There's a Basis commit present in the trunk backport - a consequence of the ocamltest branch, which should disappear on the final rebase
unified-target-bindir needs to be rebased onto 5.0 (it's not needed post utils/Makefile merge on trunk). It should be renamed to target-bindir-5.0 at this point.
unified-enable-relative not done yet (the unified PRs need to be done when the base commit moves)
unified-runtime-suffixing not done yet (the unified PRs need to be done when the base commit moves)
ld-warning-temp can be removed as soon as the base commit is shifted (it's only there to reduce diff noise)
runtime-id-temp can be removed as soon as the base commit is shifted (it'll squash on to runtime-id-5.0)

Branches on relocatable-base-trunk:

ocamltest
DONE misc-win-fixes <- PR 0
DONE windows-ln <- PR 1
DONE one-camlheader <- PR 2
DONE target-bindir <- PR 3
unified-target-bindir dropped post-rebase
DONE ld.conf-CRLF <- PR 4
DONE ld.conf-search <- PR 5
DONE ld.conf-relative <- PR 6
DONE compiled-primitives <- PR 7
DONE enable-relative <- PR 8
unified-enable-relative
DONE ld-warning <- PR 9
DONE runtime-id <- PR 10
DONE runtime-suffixing <- PR 11
unified-runtime-suffixing
DONE camlheader-search <- PR 12
unified-camlheader-search

Remaining rebase tasks:

ocamltest - rebase on new commit
windows-ln-5.0 - rebase onto upstream/5.0 and update stack
unified-target-bindir - rename to target-bindir-5.0, rebase onto upstream/5.0 and update stack
compiled-primitives-5.0 - rebase onto upstream/5.0 and update stack
unified-enable-relative - rebase on new commit
ld-warning-temp - drop from stack
runtime-id-temp - drop from stack
unified-runtime-suffixing - rebase on new commit
unified-camlheader-search - rebase on new commit

Notes for future rebases. Working a single branch at a time but keeping the backport-trunk branch on the old commit creates some unnecessary resolutions, but they are all trivial - at each conflict, origin/backport-trunk contains the correct commit, so additional work is only required for files which have no longer changed. It means throughout the process that all the origin branches remain identical. Completely avoid both rebasing and updating branches! When shifting the combined branches to the final commit, back-up the current combined set (which should match origin/backport-trunk) and use that for a diff-of-diff comparison with the patch on the new commit. The last check is important since some of the unified branches add missing code, so the branches may build even though they're technically incorrect. During this switch, it should only be necessary to re-combine to trunk and the latest release - as long as the release branch is identical, the backports should recompile by induction. Depending on the amount of time this takes to upstream, next time leave the old base as an "abandoned" target (i.e. continue to rebase to it). This would remove the need to merge the resolutions, at the cost of an extra target. At the following actual OCaml release, then consolidate the interim bases.

opam packaging notes

These are out-of-date from the 2021 version, but kept for now until the work is done. Note that CoW will be the way to do this - and we'll be identifying if the sources are the same based on the hash written by the ocaml-src package, not using runtime IDs (in other words, the configuration of the switch matching will be taken as the hash key)

Repackaging notes September 2022

There are already previous notes on ideas for ocaml-src. The aim here is to make all the vanilla packages in ocaml-variants and ocaml-base-compiler (apart from the actual forks) depend on ocaml-src and it's in ocaml-src that we'll embed patches.

The idea is that this package will use git trickery (if available) and a Windows+Unix script to generate a .install file and a sha1 sum for the sources in their actual state. The build process for the compilers will therefore begin by cloning those sources.

Our default mode could also include tarring these up? It's tempting to have the copy operation be really quite fast. We could also use caching for this??

Then we introduce ocaml-compiler as the core package - this uses ocaml-options, etc. ocaml-base-compiler and the existing ocaml-variants are then rewritten in terms of this package only. The ocaml package therefore begins to depend on ocaml-system and ocaml-compiler only.

opam packaging notes

It would be very useful if configure could provide a fast way to get the runtime ID for its configuration. In this idea we would hoist the detection of the runtime ID as high as it physically can be and then have an enable switch which bombs the configure script at that point. Alternatively, we put the generation in a separate script which is called by configure (which might be better!) and opam just has to know how to replicate that
Note that RuntimeID might not encapsulate everything wanted - for example, you can create a switch manually overriding CC which is not captured in it. This is fine - and would form part of detecting whether a compiler is a suitable caching candidate. It's a packaging concern; OCaml only has to be compatible with it.
Special case: the compiler must ensure that it doesn't consider itself (i.e. the current switch!) as a cache source, or opam upgrade / opam reinstall can fail!!
The scanning must be done either by parsing the root config file or by parsing the output of opam switch since we should aim to include local switches in the cache.
Must be careful to ensure that caching works correctly during a reinstall which changes something. This should be fine when the checksums are installed. At present, it goes totally wrong, but it could also go wrong if it used an old version of the caching script. Perhaps that implies that the script itself should be in ocaml-src (or something) and be part of the checksum?
The cache managed to find itself during a reinstall!
Post OUD2022 notes:
- Two bug-fixes on Libera. Fairly sure one needs to go straight in a PR and the other is actually on the more-windows-shell branch itself.
- The extraction time of the tarball is a problem, but I wonder if this is something we can optimise on Windows. At present, the tarball is extracted and copied a lot - it might be more prudent for Windows, on detecting a tar which supports path stripping, to eliminate that copying - i.e. we definitely extract the files to the correct place. Or possibly we leave it for now, expecting to use ocaml-tar soon… this was for the demo, after all. The problem with the approach used is the patch files - we really want opam to prepare the tree.

Hard-linking the tree from the cache (opam notes)

The compiler only uses symlinks in bin/
All files without extensions can be hard-linked
Extensions which can be hard-linked:
- 1 man-pages
- a/lib (C libraries)
- o/obj (C objects)
- byte (bytecode executables)
- opt (native executables)
- so/dll (Shared libraries)
- cma/cmi/cmo/cmti/cmt/cmxa/cmx/cmxs (OCaml artefacts)
- h/tbl (C headers)
- ml/mli (OCaml interfaces/installed code)
- hva (ocamldoc Hevea artefacts)
These files must be copied:
- *.conf
- Makefile.config (this may even be edited)

So the conclusion is that we can hard-link everything except ld.conf (not because we want to edit it, but because it could be edited) and Makefile.config (not because the user will edit it, but because the cloning process will, in order to set LIBDIR and so forth to correct values)
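
The classification above can be sketched as a single predicate on the file's basename. The name can_hard_link is hypothetical, and treating unknown extensions as "copy" is an assumption made here for safety rather than something the notes decide; symlinks in bin/ would need to be handled before calling it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Extensions the notes list as safe to hard-link. */
static const char *const linkable_exts[] = {
  "1", "a", "lib", "o", "obj", "byte", "opt", "so", "dll",
  "cma", "cmi", "cmo", "cmti", "cmt", "cmxa", "cmx", "cmxs",
  "h", "tbl", "ml", "mli", "hva", NULL
};

/* Hypothetical helper: returns 1 if the file called `basename` (no directory
   part) may be hard-linked from the cache, 0 if it must be copied.
   Assumption: unknown extensions are conservatively copied. */
int can_hard_link(const char *basename)
{
  const char *dot;
  size_t i;
  assert(basename != NULL);
  if (strcmp(basename, "Makefile.config") == 0) return 0;
  dot = strrchr(basename, '.');
  if (dot == NULL) return 1;            /* no extension at all */
  for (i = 0; linkable_exts[i] != NULL; i++)
    if (strcmp(dot + 1, linkable_exts[i]) == 0) return 1;
  return 0;                             /* covers *.conf, e.g. ld.conf */
}
```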

With this in place, we can do the indistinguishable compiler test

XXX opam will re-package the libasm*.a and libcaml*.a files and re-link the .so files with the correct stdlib.o (the patches should install a helper Makefile for this)

External TODO

TODO

These branches are stacked at the moment.

use-runtime-evil


Fixes the bug @damiendoligez identified in https://github.com/ocaml/ocaml/pull/8622#discussion_r328158394

This was incorporated into an additional fix in ocaml/ocaml#11112 for OCaml 5.0. It's included in the back-port.

Tests: none required

misc-win-fixes


All the work required from this branch has either been upstreamed or moved to other branches

Wrong branch: this should be a consequence of camlheader-search. Commit making the camlheader programs absolute on Windows, as on Unix. This commit should be deleted.
Wrong branch: should go with C header clean-up. Incorrect usage of SearchPath. The argument gets clobbered - needs verifying and fixing separately.
Disable crash dumping on Windows. Determine if definitely wanted and submit upstream. Needs a configure-based fix for mingw-w64 (which defines the symbol in its headers, but uses a runtime which doesn't have it)
Improving the error message when force-linking in disable-shared mode
Including CAML_LD_LIBRARY_PATH in the output of ocamlrun -config (general bug fix)
Remove stdlib/hashbang (unused file)
Correct use of Camlheader error (parameters the wrong way around)

Miscellaneous fixes which need to be in PRs. These need dispatching to separate PRs.

windows-ln


Windows has supported symbolic links since Windows Vista, however creating them requires elevation ("sudo"). Since Windows 10 1703, that need for elevation is removed by enabling Developer Mode, which is a common configuration choice for, um, developers.

The distribution uses symlinks on Unix to alias ocamlc to ocamlc.byte/ocamlc.opt, as appropriate. On Windows, this has always used cp, incurring a considerable cost for the duplicated executables in the bin directory.

This PR adds a test to configure to see if the ln command is able to create native Windows symlinks. Cygwin's ln is always able to create symlinks, as Cygwin emulates them if the native support is disabled. Cygwin has a mode nativestrict which causes ln to fail if Windows native symbolic links can't be created. However, that assumes we're controlling Cygwin's ln - the test here is slightly stronger (checking the output of cmd's dir command) in order to be paranoid that the symbolic links are only created if they will be readable outside the shell executing the script.

Set the MSYS variable as well as CYGWIN
Rather than unconditionally overriding, ensure that we simply set CYGWIN in the call (NB - work with Seb using AC_CONFIG_LINKS may supersede this aspect - CYGWIN/MSYS may be correctly set in the build regardless)

test-in-prefix

Relocatable OCaml - test harness

This is the first PR in the "Relocatable OCaml" series of changes. The primary motivation of this project is to allow compiler installations to be duplicated, both in opam and in Dune's package management feature (dune pkg). From the compiler's perspective, this boils down to being able to use both the compiler and the runtime after renaming the prefix in which the distribution has been installed.

The changes to achieve this shine light into various dark corners of our linking and execution strategies (some of which have already been tackled in #12751), especially in bytecode.

The goal of this PR is to add a test harness for Relocatable OCaml, which subsequent PRs then amend (and in general simplify). There are two key differences between this test harness and the main ocamltest-based testsuite:

it is central to this test that it runs on the compiler having been installed to the prefix it was configured with; and
the test focuses more on the operation and execution of the programs than of the actual programs themselves, requiring more control over the exact commands executed to build the tests than ocamltest offers

A consequence of the "in-prefix" part is that this is not a test that should be run by default and the fact it needs to operate outside the build tree has led to an additional harness, rather than additional features in ocamltest.

The harness itself has revealed various bugs not related to Relocatable OCaml at all (cf. ocaml/flexdll#146, #13496, #13520, #13638, #13692 and #13693, in addition to a fault in the partial linker alluded to in #13692 which will be fixed in a subsequent PR)

The tests performed are covered in testsuite/in_prefix/README.md in this PR.

In terms of review, the first two commits alter the compiler:

The tests exercise considerably the handling of Sys.argv.(0) w.r.t. Sys.executable_name and also bytecode launching. For this to work, it is necessary for the harness to determine if caml_executable_name returns NULL. I've done this by tweaking the startup code ever-so-slightly in bytecode to ensure that caml_executable_name is always called (it is always called in native startup) and then exposed that fact in a new primitive caml_has_proc_self_exe. There are other ways this could be done - I like the fact that this approach has actually tested that caml_executable_name works (versus adding more configure-logic instead of the #ifdef-soup in runtime/unix.c's implementation)
The compilation tests are vastly simpler if they can use Ccomp.call_linker (I mean really, really, simpler - I tried!). However, in order to do that, the test harness needs to be able to control slightly more precisely the value of Config.standard_library as interpreted by Ccomp.call_linker. Having tried various approaches, the least invasive to compiler-libs seems to be to generalise Compmisc.init_path via a new Compmisc.reinit_path. Again, there are other ways of doing this, but this one I think is the simplest that doesn't involve duplicating code from utils/ccomp.ml directly in the test harness.

Those first two commits clearly change the compiler, and I expect them to be reviewed as such. The next two alter the testsuite only so, while testsuite/tools/test_in_prefix.ml may be, um, a little long, it is also simply a test, like the other 1600 or so ml files in the testsuite! In an earlier incarnation, it was necessary to compile this test harness using the installed compiler, which is why it started out as a single OCaml script. I'm not averse to having to break it up into smaller files, but it wasn't instantly clear to me that it would bring much clarity, and it's not like there's anything reusable - it is in essence a very long script which happens to be written in OCaml.

As far as possible, the harness is told on its command line what to expect from the installation (shared library support; bytecode-only, etc.). That's permitted to guide the selection of tests. Beyond that, everything is executed - i.e. if a test is known to fail on a given platform or architecture then it is run and that failure is noted. The harness therefore "fails" when these issues are fixed.

The final commit plumbs the tests into CI - it (should!) be passing both on GitHub Actions here and has also passed precheck#1009. It's also passing an even wider test matrix including Cygwin and multiple different shebang/executable/static/minimal tests in dra27/ocaml#158.
…

test-in-prefix needs drawing into the chain (as installation-tests was)
installation-tests2 contains various other fixes which need to be opened subsequently (and checked as to which ones are critical)
tools/test_install.ml needs to be tidied up a bit
Makefile alterations - this should instead go into testsuite/Makefile.installed or some such, and then the guard isn't needed
The "Partial revert #12108" commit is upstream in #13520
The WIP commit should be converted to build threads.cmxs using the pthread-compatible version. The suggestion to change ocamlnat is nonsense - the test should simply use the .cmxs files, and we should install threads.cmxs. Samples in systhreads-in-full branch.

unified-header


XXX PR not started

There's some interim work on the Ubuntu VM on Libera merging the C files for the two headers. This should be seen as pre-requisite and done separately. We regressed the compiled header a while ago on Windows (in 4.06, with the Unicode change).

-nostdlib seems to be our friend along with -Wl,-eheaderentry
headerentry is possibly unnecessary => what happens if we call it WinMain?!
stripping the mingw-w64 binary helped - does that reduce the MSVC one even further?

The major fault (work on Thor checking this?) is that the header entry was changed before in order to reduce the file size, and it does considerably. The fault on mingw goes back ages - there is a linker option for gcc (-Wl,-entry IIRC) which does the same. Since 4.06 we're also using a stdlib function for no good reason.

Notes October 2024
Branch work on relocatable-base-trunk@033900bf

msvc64 vanilla: runtime-launch-info is 12310 bytes
mingw-w64 vanilla: runtime-launch-info is 17943 bytes

The strategy for this:

Compare the filesize on msvc and mingw-w64 between 4.05 and 4.06
- Need to get /O1 /GS- /link /nodefaultlib
- That gets it to 5k
mingw-w64 has always been wrong, by the look of it - either the linker option wasn't there when the original mingw32 port was done, or not enough investigation was done into why the Windows header is this way
Get the 4.06 header back to a similar size - i.e. eliminate the Standard Library import
Update that to 4.14 (to test all four ports)
Investigate unifying the two C files again

Future todos (Nov 2024):

runtime/startup_byt.c should attempt to open itself first - at present -custom executable analyses argv[0] first, which is a mild security concern. It's possible this can just be done where -custom executable always tries this first (although was this done anyway in the rust interop improvement??)

Notes from 4.05-4.14 fixing

OCaml 4.05 msvc64:

cl -nologo -D_CRT_SECURE_NO_DEPRECATE -O2 -Gy- -MD -DCAML_NAME_SPACE -c -I../byterun \
          -DRUNTIME_NAME='"ocamlrun"' headernt.c
../boot/ocamlrun ../flexdll/flexlink.exe -x64 -merge-manifest -stack 33554432 -exe -o tmpheader.exe headernt.obj

result: 15872 bytes

OCaml 4.05 mingw64:

x86_64-w64-mingw32-gcc -O -mms-bitfields -DCAML_NAME_SPACE -D__USE_MINGW_ANSI_STDIO=0 -Wall -Wno-unused -fno-tree-vrp -c -I../byterun \
          -DRUNTIME_NAME='"ocamlrun"' headernt.c
../boot/ocamlrun ../flexdll/flexlink.exe -chain mingw64 -stack 33554432 -exe -o tmpheader.exe headernt.o

result: 277610 bytes

OCaml 4.06 msvc64:

cl -c -nologo -O2 -Gy- -MD -D_CRT_SECURE_NO_DEPRECATE -DCAML_NAME_SPACE -DUNICODE -D_UNICODE -DWINDOWS_UNICODE=1 -I../byterun \
          -DRUNTIME_NAME='"ocamlrun"' -Foheadernt.obj headernt.c
../boot/ocamlrun ../flexdll/flexlink.exe -x64 -merge-manifest -stack 33554432 -exe -link "/ENTRY:wmainCRTStartup" -o tmpheader.exe headernt.obj

result: 20992 bytes!

OCaml 4.06 mingw64:

x86_64-w64-mingw32-gcc -c -O -mms-bitfields -Wall -Wno-unused -fno-tree-vrp -DCAML_NAME_SPACE -D__USE_MINGW_ANSI_STDIO=0 -DUNICODE -D_UNICODE -DWINDOWS_UNICODE=1 -I../byterun \
          -DRUNTIME_NAME='"ocamlrun"' -o headernt.o headernt.c
../boot/ocamlrun ../flexdll/flexlink.exe -chain mingw64 -stack 33554432 -exe -link "-municode" -o tmpheader.exe headernt.o

result: 278581 bytes

OCaml 4.14 msvc64:

cl -c -nologo -O2 -Gy- -MD  -WX -d2VolatileMetadata-  -D_CRT_SECURE_NO_DEPRECATE -I ../flexdll -DCAML_NAME_SPACE -DUNICODE -D_UNICODE -DWINDOWS_UNICODE=1 -I../runtime -DRUNTIME_NAME='"ocamlrun"'  \
  -Foheadernt.obj headernt.c
cl -nologo -O2 -Gy- -MD  -WX -d2VolatileMetadata-  -Fetmpheader.exe headernt.obj  /link /subsystem:console /ENTRY:wmainCRTStartup  && (test ! -f tmpheader.exe.manifest || mt -nologo -outputresource:tmpheader.exe -manifest tmpheader.exe.manifest && rm -f tmpheader.exe.manifest)

Result: 12800 bytes

OCaml 4.14 mingw64:

x86_64-w64-mingw32-gcc -c -O2 -fno-strict-aliasing -fwrapv -mms-bitfields -Wno-unused -Wall -Wdeclaration-after-statement -Werror -fexcess-precision=standard -fno-tree-vrp  -I ../flexdll -DCAML_NAME_SPACE -D__USE_MINGW_ANSI_STDIO=0 -DUNICODE -D_UNICODE -DWINDOWS_UNICODE=1 -I../runtime -DRUNTIME_NAME='"ocamlrun"' -o headernt.o headernt.c && x86_64-w64-mingw32-gcc -O2 -fno-strict-aliasing -fwrapv -mms-bitfields -Wno-unused -Wall -Wdeclaration-after-statement -Werror -fexcess-precision=standard -fno-tree-vrp  -municode  -o tmpheader.exe headernt.o

Result: 134339 bytes

For mingw-w64, remove -municode and add -nostdlib -Wl,-eheaderentry before the link command and then -lkernel32 -lmsvcrt afterwards

134339 -> 13239!

Looks like we only need wcslen for both

All the mucking around on the msvc64 side is not getting below 7680 - indeed, -O2 is already inlining wcslen

Stripping the mingw-w64 header gets it to 6144!

exe-executing


XXX PR not started!

On Unix, it's reasonable to assume that argv[0] will point to something we can open. On Windows, this isn't true: it's acceptable for the command line not to include the .exe, if it was not used when the process was created (i.e. a program can see the difference between being invoked as program vs program.exe).

This has always been the case, but it's been less visible since PATH resolution will give the fully resolved filename. For example, assuming OCaml's bin directory is in PATH, ocamlc.byte will be resolved to a path ending ocamlc.byte.exe. However, if one runs C:\ocamlmgw64\bin\ocamlc.byte then the .exe is not appended, and ocamlrun will claim that no bytecode file was specified, since it can't find C:\ocamlmgw64\bin\ocamlc.byte.

Adding .exe is both a nasty smell in ocamlrun and also brittle, given that other extensions are available.

The Windows executable launcher already determines its full location using GetModuleFileNameW in order to read the RNTM section. This PR tweaks the launcher to use this path as argv[0] (following the rules in CommandLineToArgvW to escape it). On Windows, ocamlrun:

already opens argv[0], so we use GetFinalPathNameByHandleW to canonicalise it
also already determines its own executable path via GetModuleFileNameW, so we open it and canonicalise it with GetFinalPathNameByHandleW.
if these two resolved paths are not equal, then we discard argv[0] when launching the bytecode image

The Windows header is totally broken w.r.t. the .exe extension and always has been. We need a way to convey to ocamlrun the executable path we've already determined.

empty-env


On Windows, environment variables are deleted by setting them to be empty. The main environment block does not differentiate between an empty environment variable and an unset environment variable.

For portability, it is therefore better to ensure that an empty environment variable is treated as un-set on Unix. This is also consistent with most released versions of opam at the moment, as when reverting environment changes, opam leaves empty environment variables, rather than unset ones (i.e. a variable which was unset before calling opam env may be empty after a round-trip through opam env followed by opam env --revert)

This is a minor breaking change in that, for example, an empty OCAMLLIB before resulted in a broken compiler. Similarly, an empty CAML_LD_LIBRARY_PATH always added the current directory to the search path.
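
A minimal sketch of the intended behaviour, assuming a wrapper around the standard getenv (the name getenv_nonempty is hypothetical and not an existing runtime function): an empty value is reported exactly as if the variable were unset.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wrapper: behaves like getenv, except that a variable set to
   the empty string is treated as unset. */
const char *getenv_nonempty(const char *name)
{
  const char *value;
  assert(name != NULL);
  value = getenv(name);
  return (value != NULL && value[0] != '\0') ? value : NULL;
}
```

Routing lookups of OCAMLLIB, CAMLLIB and CAML_LD_LIBRARY_PATH through one such function would give the same result on Unix as on Windows, where the distinction cannot be expressed.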

Check the claim about empty $OCAMLLIB!
Check the claim about CAML_LD_LIBRARY_PATH

backslashes


XXX PR not started

This PR is about ensuring that backslashes passed to configure make it through to the system as backslashes. The aim with this is that ./configure --prefix 'C:\OCaml' should result in an installed OCaml with no forward slashes. This is different from the original ocaml/ocaml#658 as this is about preserving backslashes rather than forcing them.

runtime-launch-info


Use some headings instead :)

This PR is a prerequisite for the "Relocatable Compiler" project, but the changes here are independent of it. It addresses five currently known bugs (see below) in #! ("shebang") handling in the bytecode compiler and its implementation considerably simplifies stdlib/Makefile, in advance of @shindere merging that into the root Makefile.

Background: Bytecode Executables

When not using one of its C-based linking modes (-custom, -output-complete-exe, etc.), ocamlc creates bytecode executables by prepending a launch header to the bytecode image. This header's sole responsibility is to locate the actual OCaml runtime and transfer execution to it. There are three ways in which this can be done:

1. A #! (or "shebang") header is used with the full path to the runtime, e.g. #!/usr/local/bin/ocamlrun. This is the default on Unix systems (except Cygwin, at least before this PR).
2. A small shell script is used to exec the runtime. Presently this is only used with -use-runtime when the path given is too long or contains a space, for example:

#!/bin/sh
exec '/home/David Allsopp/OCaml/bin/ocamlrun' "$0" "$@"

3. A small C program (in stdlib/header{,nt}.c), which is able either to execute a runtime at an absolute location (e.g. /usr/local/bin/ocamlrun) or to search $PATH for a runtime (e.g. to search for ocamlrun in $PATH).

At present, the choice between shebang scripts (mechanisms 1 and 2) and executable (mechanism 3) is made at configure time ($(SHEBANGSCRIPTS) and $(LONG_SHEBANGS)), and the result is written by the build system to the file camlheader which is kept in the Standard Library directory. This file is either the compiled executable header or it is the full path to where ocamlrun will be installed.

The runtime variants (-runtime-variant d and -runtime-variant i) are supported by building multiple versions of this file, so there are in fact 3 of them: camlheader, camlheaderd and camlheaderi.

Finally, in order to support the -use-runtime option, a different file camlheader_ur is created. ocamlc copies this file and immediately starts the bytecode TOC recorder. It then writes the name of the runtime followed by a newline and then marks the RNTM section.

Now, when shebang headers are supported, camlheader_ur is exactly the string #!. This means that ocamlc's procedure writes a shebang header, though pointlessly records it in the RNTM section. When mechanism 3 (the small C program) is in use, camlheader_ur is exactly the same as camlheader. The C program, in addition to knowing how to search $PATH, is also able to read the RNTM section of the bytecode image. It reads this data, cunningly converts the newline character which ocamlc wrote into a nul character, in order to make the RNTM payload a valid C string, and then proceeds to execute that runtime.

For -use-runtime only, ocamlc performs some validation on the runtime path to check if it's valid to use in a shebang line. If it's not, then it elects to write a mechanism 2 header (using /bin/sh).

That gets us to OCaml 4.02.1. In OCaml 4.02.2, in order to assist the iOS and Android cross-compilation projects, an additional set of headers was added: target_camlheader, target_camlheaderd and target_camlheaderi. These are the same as their unprefixed counterparts, except that the directory written to them is $(TARGET_BINDIR) (which can be overridden when calling make) instead of $(BINDIR). There is no target_camlheader_ur because there are no paths embedded in the file (so it never differs).

The bugs

As announced above, the following five bugs are present in this mechanism in OCaml. In decreasing order of severity, they are:

1. Not all valid paths are valid to use in a shebang line. If OCaml is configure'd with a bindir which is not valid in a shebang line, at present the bytecode tools installed correctly use mechanism 2. However, the target_camlheader* files are incorrectly generated, and bytecode executables produced by the installed ocamlc will have invalid shebang lines. (This is a very real bug, originally identified by the Sandmark project; see also https://github.com/ocaml/ocaml/pull/2309#issuecomment-503582198 and #12709.)
2. On Windows, passing an argument to -use-runtime which is longer than 125 characters causes ocamlc to generate a corrupt executable since it uses a #!/bin/sh header.
3. The check for whether a path is valid for a shebang line is duplicated between configure.ac and bytecomp/bytelink.ml; these have subsequently diverged and are (still) both incorrect. In particular, configure.ac only checks the length (and, even then, with a conservative check which rejects one otherwise valid possible header) and, while bytecomp/bytelink.ml checks for space characters, both places fail to check for tab and newline characters, which are also not permitted in a shebang line. (This issue has been separately reported in #10724, and is also one reason that Windows CI actually includes a stronger test of strange characters in --prefix than the Linux/macOS one.)
4. Both stdlib/Makefile (generating the headers) and bytecomp/bytelink.ml (processing -use-runtime) assume sh resides in /bin/sh, which is not guaranteed by POSIX (and, indeed, is not the case on some, admittedly obscure, systems).
5. When camlheader_ur is just the string #!, bytecomp/bytelink.ml still (unnecessarily) records the RNTM section. This in itself isn't a bug, but when writing a #!/bin/sh script version, the RNTM section incorrectly contains the entire /bin/sh script, rather than just the name of the runtime.
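The check described in bug 3 can be sketched in one place as follows. This is a minimal illustration, not the code in bytecomp/bytelink.ml or configure.ac: the function name valid_in_shebang and the default limit of 128 bytes are assumptions (the real limit depends on the kernel, see #10724). String.exists needs OCaml 4.13 or later.

```ocaml
(* Hypothetical helper, for illustration only. *)

(* Assumed limit for the whole line: "#!", the path and the newline.
   The real value is kernel-dependent. *)
let max_shebang_line = 128

(* A path may follow "#!" directly only if it contains no space, tab or
   newline and the complete line fits within the limit. *)
let valid_in_shebang ?(limit = max_shebang_line) path =
  let is_separator = function ' ' | '\t' | '\n' -> true | _ -> false in
  path <> ""
  && not (String.exists is_separator path)
  && String.length "#!" + String.length path + 1 <= limit
```

With this sketch, /usr/local/bin/ocamlrun is accepted, while '/home/David Allsopp/OCaml/bin/ocamlrun' is rejected because of the space.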

Current implementation

The current implementation of all this goes to some lengths to ensure that it is enough for bytecomp/bytelink.ml to copy the header blindly and never actually have to inspect its content. Regardless, the header is a relatively subtle piece of configuration state. The compiler and most of the tools will be compiled with boot/ocamlc which is built with a generic Config module. With the current setup of the build, therefore, it is not possible for boot/ocamlc to use values from Config (although Config.bindir exists, boot/ocamlc will see the value it was built with during the bootstrap, not the value used during configuration and, for this reason, there is no Config.shebangscripts to mirror the $(SHEBANGSCRIPTS) variable in the build system). Although ocamlc doesn't at present actually analyse the content of the header it's copying, the decisions it takes as to which file to read (based on -use-runtime and -runtime-variant) mean that the header is effectively acting as a series of "ghost" command-line arguments! While this is sort of neat, it's causing a few problems:

In -use-runtime when camlheader_ur is #!, ocamlc ends up writing the full path of the runtime twice (even allowing for bug 5)
In -use-runtime mode, ocamlc only needs to mark the RNTM section for the executable header, but it's unnecessarily marking it even when camlheader_ur is just #! (which ocamlc has in fact read!)
Even more nefariously, even when the RNTM section is needed, the string ends with the wrong terminator in order to keep the format valid for the shebang case (i.e. the RNTM section is written unnecessarily in one case and, in order to ensure that the string is correct in that unnecessary case, the string has to be mangled in the necessary case)
For largely historical reasons, the Windows ports prefer to search %PATH% for the runtime, and never use an absolute path. This is very subtly encoded in stdlib/Makefile.
Storing the "state" of the shebangs in camlheader et al means that the code for validating shebang lines is at present implemented in m4sh (in configure.ac), in OCaml (in bytecomp/bytelink.ml) and should be being implemented in GNU make (in stdlib/Makefile).

Proposed implementation in this PR

Presently, the processing of the header in ocamlc is simple, because it boils down to copying the correct file. I think it's possible to fix these 5 bugs while maintaining that. However, the code (in GNU make and m4sh) won't be terribly tasteful and the various checks will still be duplicated in several places. It is not possible to do Relocatable's switch cloning this way, where camlheader instead of being #!/usr/local/bin/ocamlrun wants to be something akin to #!../../bin/ocamlrun with the ../ interpreted relative to the header itself.

So, at last, to the details.

The principle here is to allow ocamlc to do all of the work, being given only the information which it can't know in advance via the "header". Since the header is now really a data-file, it's called runtime-launch-info. It contains the following three pieces of information:

Whether shebang headers may be used, and if they can be used, how to specify sh in a shebang
The full directory containing ocamlrun and its variants (this is usually $prefix/bin)
The content of the executable header (compiled from stdlib/header.c, or stdlib/headernt.c on Windows)

ocamlc is responsible for:

Constructing the full path to the runtime (based on -runtime-variant and the bindir read from runtime-launch-info)
Finding sh, if runtime-launch-info doesn't contain an absolute path to it
Writing either a valid shebang header or ultimately falling back to an executable header if this is not possible
As a result, the RNTM section is used only with the executable header (and is now null-terminated). Furthermore, when using executable headers, it is required that the RNTM section is present in the image (this stops the executable header from ever containing an absolute path).
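The division of labour above can be sketched roughly as below. All of the names (header, fits_shebang, choose_header) are hypothetical, the 125-character limit is an assumed conservative value, and Filename.quote only produces sh-style single quoting on Unix; this is not the code in bytecomp/bytelink.ml.

```ocaml
(* Hypothetical types and names, for illustration only. *)
type header =
  | Shebang of string       (* "#!<runtime>\n" *)
  | Shell_script of string  (* sh script which execs the runtime *)
  | Executable              (* compiled launcher; runtime goes in RNTM *)

(* Simplified validity test with an assumed length limit. *)
let fits_shebang path =
  String.length path <= 125
  && not (String.exists (function ' ' | '\t' | '\n' -> true | _ -> false) path)

(* [shebangs_allowed], [sh] and [bindir] stand for the data read from
   runtime-launch-info; [variant] is "", "d" or "i". *)
let choose_header ~shebangs_allowed ~sh ~bindir ~variant =
  let runtime = Filename.concat bindir ("ocamlrun" ^ variant) in
  if not shebangs_allowed then Executable
  else if fits_shebang runtime then Shebang ("#!" ^ runtime ^ "\n")
  else
    match sh with
    | Some sh when fits_shebang sh ->
        (* Filename.quote uses POSIX sh quoting on Unix only. *)
        Shell_script
          ("#!" ^ sh ^ "\nexec " ^ Filename.quote runtime ^ " \"$0\" \"$@\"\n")
    | _ -> Executable
```

The final fallback to Executable is the reason the executable launcher is always available in this scheme, even on Unix.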

It makes sense while overhauling all this to implement (finally) the --with-target-bindir option, which was added in the switch to autoconf in 4.08 but never plumbed in. Previously, cross-compilation systems will have specified TARGET_BINDIR to make instead. Additionally, a new switch --with-target-sh has been added to complete the cross-compilation picture, at least in terms of the shell scripting. This allows every aspect of stdlib/target_runtime-launch-info to be controlled in the build. In particular, it allows an improvement to Cygwin's compilation (see below), also providing an immediate upstream use-case for this change.

While the implementation necessarily adds quite a lot of code to bytecomp/bytelink.ml, it removes a relatively complex bit of m4sh from configure.ac and an exceedingly complex mess from stdlib/Makefile. In passing, there are a couple of related issues which can be trivially fixed:

There is a highly unlikely corner-case where sh may not be found. This is "solved" by always compiling the executable launcher, even on Unix. A minor side-effect of this is to reduce bit-rot in this file, which had started to happen (see first commit).
The implementation of --with-target-sh allows the use of shebangs on Cygwin to be improved.

The approach I've adopted in this PR is to allow ocamlc to look at the data it reads from camlheader and act accordingly. There is more than one way to do this! I think it is possible to achieve this using a mix of -use-runtime in the build system and carefully ensuring that the Config module's values are only ever used by a compiler which has been actually installed. Likewise, we have previously discussed being able to dynamically load the complete Config module into the boot compiler (see #9291). I think the -use-runtime approach is likely to be a bit too brittle and although I have a possible approach for loading Config at runtime, it's not a trivial change, and we are also fixing bugs here and now.

This branch used to be one-camlheader
Plumb the revised branch into the next rebase
- Note the cmpbyt part below: definitely worth checking that bootstrapping is still working correctly (it should be, though - doesn't the process end with a bootstrap?) (the change for 5.0+ was in #11149 where the need for cmpbyt was eliminated; it's kept because it has better error reporting)
Rename camlheader to ocamlheader (or ocamllauncher - launcher is feeling good). Current verdict is runtime-launch-info
This PR should preserve the Windows behaviour of not embedding the path to the runtime. This should be easy to do, as boot/ocamlc does have access to Sys.win32. That change gets reverted when unified with camlheader-search.
The detection of spaces in the shebang needs to take \t and \n into account.
Separately, bytecomp/symtable.ml's function for getting the primitives from the runtime should echo the command when -verbose is used (generalise the mechanism already present in utils/ccomp.ml) Not sure why this is a TODO against this branch?
At the moment, the branch updates tools/stripdebug.ml but not tools/cmpbyt.ml to ignore the RNTM section - shouldn't it do both? For 5.0+ this is unnecessary, because cmpbyt is optional, but for the back-ports this might matter?
Cross-reference ocaml/ocaml#10724 - maximum length of shebang depends on kernel version. Not sure if this is worth fixing. - enhance the comment in bytelink.ml?
Cygwin is using shebangs, which should be separated (or at least noted in Changes)

ld.conf-CRLF

At present, Windows can correctly read either a CRLF or an LF-formatted ld.conf, however Unix cannot. This PR adds the appropriate tweaks to Dll and dynlink.o to skip \r characters when parsing ld.conf. Note that the compiler itself already handles this correctly for source files.

One slightly whacky thought: this technically breaks Unix if you actually had directories with a \r character at the end. Given the lack of an escape hatch, not sure whether to go with this, reject it completely, or possibly post-process the list and do the ocamltest-style "remove exactly the last \r only". However, I think accidentally loading the wrong file is more likely to be a problem than this!
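For comparison, the "remove exactly the last \r only" alternative could look like this in OCaml. It is a sketch of the parsing rule only (the real parsers are in Dll and in C in the runtime), and the names strip_cr and ld_conf_entries are made up for the example.

```ocaml
(* Drop exactly one trailing '\r', so "dir\r" becomes "dir" but "dir\r\r"
   keeps one '\r'. *)
let strip_cr line =
  let n = String.length line in
  if n > 0 && line.[n - 1] = '\r' then String.sub line 0 (n - 1) else line

(* Split the contents of an ld.conf file into entries, accepting both LF and
   CRLF line endings. Blank lines in the middle of the file are kept. *)
let ld_conf_entries contents =
  let lines = String.split_on_char '\n' contents in
  (* A final newline leaves one empty trailing element; it is not an entry. *)
  let lines =
    match List.rev lines with
    | "" :: rest -> List.rev rest
    | _ -> lines
  in
  List.map strip_cr lines
```

Under this rule a CRLF file and an LF file with the same lines give the same list of entries.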

ld.conf-search

The "harmonious" feature of the relocatable compiler is that the configuration of one compiler should not interfere with the configuration of another. The bytecode runtime has to read ld.conf from the Standard Library location on startup. This location at present is taken from $OCAMLLIB or $CAMLLIB if either of these is set. If neither is set, the location the compiler was configured with is used. However, if OCAMLLIB has been set for one runtime, then another runtime will always load the "wrong" ld.conf.

This PR primarily alters the runtime so that ld.conf is loaded from all the possible locations in order.

This is a "breaking" change inasmuch as programs which would have been expected to fail before might instead work. Programs which worked are unaffected, because they must have been loading libraries based on the first ld.conf which was found.
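As a rough sketch of "all the possible locations in order", the list of files to read could be computed like this. The runtime does this in C; the function name, the OCAMLLIB, CAMLLIB, default ordering and the removal of duplicates are assumptions made for the example, and empty variables are treated as unset as proposed elsewhere in these notes.

```ocaml
(* Hypothetical helper: list the ld.conf files to read, in order.
   [getenv] has the type of Sys.getenv_opt so it can be faked in tests. *)
let ld_conf_candidates ~getenv ~default_libdir =
  (* An unset or empty variable contributes nothing. *)
  let from_env name =
    match getenv name with
    | Some dir when dir <> "" -> [dir]
    | _ -> []
  in
  let dirs = from_env "OCAMLLIB" @ from_env "CAMLLIB" @ [default_libdir] in
  (* Keep the first occurrence of each directory (assumed behaviour). *)
  let dedup =
    List.fold_left
      (fun acc dir -> if List.mem dir acc then acc else dir :: acc) [] dirs
    |> List.rev
  in
  List.map (fun dir -> Filename.concat dir "ld.conf") dedup
```

In a real program this would be called as ld_conf_candidates ~getenv:Sys.getenv_opt ~default_libdir:"/usr/local/lib/ocaml".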

The runtime no longer uses caml_get_stdlib_location. I've consequently removed it (and therefore it's no longer displayed in ocamlrun -config).

This PR includes a simplification of the memory management for caml_shared_libs_path - the effect can be seen by looking at the PR commit-by-commit, but it eliminates the need to allocate and return an array of pointers from caml_parse_ld_conf. In this PR, that's a simplification - in the subsequent PR introducing relative syntax to ld.conf, it's mandatory (since the strings passed to caml_shared_libs_path are then computed rather than simply read).

Should we consider automatically adding the directory containing the executable to the search path?

ld.conf-relative

This is the first of the patches allowing the compiler to be relocated/cloned.

At present, the lines in ld.conf are expected to be absolute paths, but this isn't actually checked. Determining if a path is absolute is mildly complicated (Windows…), however explicit relative paths can be portably identified with ease, since these are paths beginning ./ or ../ (or just . and ..).

These entries in ld.conf are now interpreted relative to the directory containing ld.conf. The default ld.conf file can be written:

./stublibs
.

which can clearly be copied or moved.

Implicit paths (as defined in Filename.is_implicit) retain the old, somewhat bizarre, interpretation. CAML_LD_LIBRARY_PATH retains the same interpretation as before.
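The rule for explicit relative entries can be sketched as below. This only covers the Unix spelling with forward slashes (a Windows version would presumably also accept backslashes), and the names is_explicit_relative and resolve_entry are invented for the example.

```ocaml
(* True for ".", ".." and anything beginning "./" or "../". *)
let is_explicit_relative entry =
  let starts_with prefix =
    String.length entry >= String.length prefix
    && String.sub entry 0 (String.length prefix) = prefix
  in
  entry = "." || entry = ".." || starts_with "./" || starts_with "../"

(* Explicit relative entries are taken relative to the directory containing
   ld.conf; every other entry is returned unchanged. *)
let resolve_entry ~ld_conf_dir entry =
  if is_explicit_relative entry then Filename.concat ld_conf_dir entry
  else entry
```

With ld.conf in /usr/local/lib/ocaml, the entry ./stublibs resolves to /usr/local/lib/ocaml/./stublibs, whereas an implicit entry such as stublibs or a name like .hidden is left alone.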

CAML_LD_LIBRARY_PATH being blank (rather than unset) seems to have an interpretation. Has that changed with this PR? (it shouldn't change)
Possibly for enable-relative, but the generation of stublibs assumed that STUBLIBS in Makefile.config hadn't been overridden. Not sure if this is unnecessary pedantry (so whether the original just writing ./stublibs would do) or whether it should go further and do a relative computation.

Extends the format of ld.conf to recognise implicit paths as being relative to the location of ld.conf (NB in this instance, implicit includes . and ..). This is breaking, since previously such paths would have been relative to the build directory. Reasons for picking this scheme:

It continues to allow directories not rooted under the stdlib to be referenced
We can refer to the standard library as . which is nicer than a blank line. The default file becomes . and ./stublibs
Explicit relative is much easier to determine than absolute
Not using + for two reasons:
- We use + elsewhere to refer to the effective standard library location whereas here it means the location of the ld.conf file being read
- It is confusing if you actually have a directory beginning with + (which is not ridiculous)

compiled-primitives

When compiling a custom bytecode runtime, a C file is generated containing the combined primitives table of the runtime (camlprim.c). Presently, this is passed directly to the C compiler and is written in a way which avoids using any of the C headers. This has resulted in increasing amounts of code duplication - the typedef for intnat (already in caml/config.h) has to be inferred and the linker command line in Bytelink has to handle -fdebug-prefix-map, duplicating logic already in Ccomp to handle this. It gets worse with the relocatable compiler patches.

This PR alters ocamlc's link process so that the primitives file is explicitly compiled using Ccomp.compile_file. That eliminates the duplicated debug prefix map code and also allows caml/mlvalues.h to be used to get all the required definitions.

For the build system, this means when building with -custom that we must ensure that the runtime directory has been included with -I so that the headers are available.

Is the testsuite passing in --disable-shared mode?

enable-relative

This PR adds --enable-relative to configure which, when given, specifies that both the bytecode runtime and the compilers should locate the Standard Library using a path given relative to the directory containing the tools themselves. For a default installation on Unix, this changes the default location of the Standard Library from /usr/local/lib/ocaml to ../lib/ocaml.
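The relative value in that example is just the path from the binary directory to the library directory. A small sketch of the calculation, assuming both inputs are absolute, already normalised and use / as the separator (the function name relative_to is hypothetical, and this is not how configure computes it):

```ocaml
(* Compute a path to [target] relative to the directory [from]. *)
let relative_to ~from ~target =
  let split p = List.filter (fun c -> c <> "") (String.split_on_char '/' p) in
  (* Remove the leading components the two paths have in common. *)
  let rec drop_common a b =
    match a, b with
    | x :: a', y :: b' when x = y -> drop_common a' b'
    | _ -> (a, b)
  in
  let up, down = drop_common (split from) (split target) in
  (* One ".." for each remaining component of [from], then down to [target]. *)
  match List.map (fun _ -> "..") up @ down with
  | [] -> "."
  | parts -> String.concat "/" parts
```

For the default Unix installation, relative_to ~from:"/usr/local/bin" ~target:"/usr/local/lib/ocaml" gives ../lib/ocaml.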

At first glance, implementing this seems straightforward - the relative value for LIBDIR gets injected into runtime/dynlink.c via OCAML_STDLIB_DIR and into utils/config.ml via %%LIBDIR%% and a relative calculation can then be added in both places. This seemingly straightforward approach fails in two ways:

Critically, bytecode executables compiled with -custom or -output-obj would search for ld.conf relative to the compiled program, which is clearly wrong.
Breakingly, users of the Config module require updating to recognise a relative value being returned by Config.standard_library

The second problem could clearly be solved by using a new value, and continuing to leave Config.standard_library as it was before (i.e. deprecating Config.standard_library internally and using, say, Config.effective_standard_library). However, this breaks the reproducibility of the build, and doesn't solve the first problem.

What is needed, therefore, is one value for the Standard Library location used by ocamlrun and by the compiler drivers, which can be relative, and an absolute value which is used by programs created by the compilers.

The solution proposed here is to introduce caml_standard_library_default (a relative path with --enable-relative, or LIBDIR otherwise). This symbol is not included in either libcamlrun or libasmrun but, like prims.o, is added to ocamlrun. That works correctly for ocamlc when outputting executables using the launcher stub (i.e. which are invoked using ocamlrun). ocamlopt, and ocamlc when linking actual executables or objects, then calculate the effective value of caml_standard_library_default and put this in the startup object. This deals with both problems, except that it means that ocamlc.opt and ocamlopt.opt now always have an absolute path for caml_standard_library_default (since only ocamlrun had the relative path). To deal with this, a new compiler option -set-global-string name=string is added to both drivers. This parameter is only valid when ocamlc is linking C code (i.e. it's not valid for bytecode which is sent to ocamlrun) and causes the global name to be added, set to string. The compiler's build system then uses this flag when linking to specify the relative path to the Standard Library. If caml_standard_library_default is not set using -set-global-string, then the compiler automatically sets it to the absolute path it's computed. Now, all the compiler distribution tools compute the Standard Library relatively, but everything produced by those tools uses an absolute path, computed when those tools start.

Note that while libcamlrun and libasmrun gain an additional undefined symbol which has to be provided when linking an executable, these libraries are already expected to be used with an object emitted using -output-obj, which already defines these symbols.

Additionally:

ld.conf at present is always loaded by the runtime, which is unnecessary for -custom executables. The runtime is tweaked to read ld.conf only if it will need to use the search path to load shared libraries.
The -set-global-string option adds the need for OCaml to be able to manipulate UTF-16 strings (so that ocamlopt can emit the correct assembly listing on Windows) and also to be able to encode an OCaml UTF-8 string as a C string literal, both of which are done using C primitives. The use of C for producing the C string literals allows the use of the Windows API functions for converting between UTF-8 and UTF-16, rather than having to add a decoder to the Standard Library (the code is also already present in runtime/sak.c).
Config.standard_library remains the absolute path to the Standard Library. Config.standard_library_default is the actual value computed by configure (which may therefore be a relative path). Config.standard_library_effective is always an absolute path but, unlike Config.standard_library, it does not read $OCAMLLIB or $CAMLLIB. Finally, Config.standard_library_relative is true when the compiler was built with --enable-relative (it is effectively Filename.is_relative Config.standard_library_default)

The first implementation acquired caml_realpath and caml_dirname which provided the building blocks to implement caml_locate_standard_library in a cross-platform way. The problem is that this is all required in C, and implementing the Windows versions of both of those functions properly is non-trivial. However, for Windows, GetFullPathName combined with GetFinalPathNameByHandle gives the required result (both a good filename to display, and GetFullPathName will set a pointer to the basename of the result), so it's actually better to implement caml_locate_standard_library separately on each platform.

Tests:

TODO

ld-warning

Three of the backends are missing .type and .size directives for caml_system.frametable, which causes linking warnings when using libasmrun_shared.so.

runtime-id

This PR introduces the concept of a RuntimeID to describe a given version and configuration of OCaml and forms the basis of filename mangling used to allow both multiple versions and configurations to co-exist harmoniously.

The RuntimeID itself is documented in runtime/RuntimeID.md. Since bytecode and native code have different configuration options, a value is calculated for each in configure and exposed in Config.bytecode_runtime_id and Config.native_runtime_id. The choice of a 5-bit encoding means that only lowercase letters are needed, so no two RuntimeID values end up relying on a case-sensitive file system. It is intentional that while the RuntimeID is always written in lowercase, it may be searched case-insensitively (especially on Windows).

This is still WIP

Remove the int31 and shared-libraries bits; the Zinc ID is just the release number and release bit. July 2023: not put working here! Was this to reduce the complexity, or for a more fundamental reason? Mar 2024: think it's that in both cases the runtime system is able to cope with these being wrong/different??

Tests:

TODO

runtime-suffixing

This PR allows OCaml runtimes and associated shared libraries to co-exist harmoniously on the same system without having to hide from each other. At present, the following interactions can all fail as a program seeks a runtime:

The wrong ld.conf may be read, if OCAMLLIB or CAMLLIB is set to another Standard Library
The wrong libcamlrun_shared.so or libasmrun_shared.so could be loaded; it's necessary to ensure that only the correct one appears in LD_PATH.
The wrong stub libraries may be loaded (for example, dllunix.so) if CAML_LD_LIBRARY_PATH includes a directory for another runtime

The first issue is partially dealt with by ensuring that ld.conf is loaded from both $OCAMLLIB/ld.conf and the configured default. This change turns the first problem into an instance of the third.

The second two problems are addressed by this PR. The RuntimeID is used to mangle filenames so that shared libraries do not conflict between different configurations and runtimes because they have different names.

The name mangling is also applied to ocamlrun. Historically, the Windows launcher searches for ocamlrun in PATH (since Windows OCaml was distributed as a precompiled binary), and therefore suffered this same problem. However, ocamlrun uses a slightly different RuntimeID, based solely on the release number of OCaml. Bytecode is compiled to be portable, so the intent is that the bytecode "declares" (both in its magic number, and in its "Zinc" RuntimeID) that it runs on a specific version. That specific instance of ocamlrun will use its bytecode RuntimeID to load shared libraries, which must of course exactly match.

For ocamlrun, libcamlrun_shared and libasmrun_shared, everything is handled by the build system. For bytecode C stub libraries, a little more work is required. It is intended that RuntimeID values are not "exposed" to the user - i.e. that a programmer should never need to care about their existence. ocamlmklib is therefore augmented with a -suffixed option, which indicates to ocamlmklib that the name given in -oc should be automatically suffixed when creating the shared stubs library. Note that this is only done for the shared library (.so/.dll). The static library (.a/.lib) is left undecorated: the name mangling is used to solve runtime problems, not compile-time problems. ocamlc then gains -dllib-suffixed, which similarly indicates to ocamlc that it should suffix the supplied library name. These two parameters together mean that the only change required in a user's build system to take advantage of the suffixing is to add -suffixed to the ocamlmklib invocation.

The existing bytecode implementation embeds relatively portable names for shared libraries into the bytecode image, in that the shared object extension (.so vs .dll) is stripped. In order to keep the bytecode images as portable as possible, -dllib-suffixed causes the un-suffixed name to be written either to the DLLS section of the bytecode image, or in the .cma header. An indicator byte in DLLS tells ocamlrun whether to apply suffixing to the name when the bytecode executable itself is started. This means, for example, that an application using Str compiled on Unix and an application compiled using Str on Windows produce the same bytecode image.
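To illustrate why the image stays portable, here is a sketch of how a runtime might turn one DLLS entry into a filename at startup. The record type, the "-" separator for stub libraries and the RuntimeID value "abcd" used in the usage note are all placeholders invented for the example; the real encoding is whatever runtime/RuntimeID.md and the branch define.

```ocaml
(* One DLLS entry as assumed here: a portable base name (no extension, no
   suffix) plus the indicator saying whether suffixing applies. *)
type dll_entry = { name : string; suffixed : bool }

(* The suffix and extension are supplied by the runtime which is starting the
   program, never by the bytecode image itself. *)
let dll_filename ~runtime_id ~ext entry =
  if entry.suffixed then entry.name ^ "-" ^ runtime_id ^ ext
  else entry.name ^ ext
```

The same entry { name = "dllcamlstr"; suffixed = true } would then become dllcamlstr-abcd.so under a Unix runtime and dllcamlstr-abcd.dll under a Windows one, without the image changing.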

In passing, ocamlopt now supports compilation against libasmrun_shared. This can never have worked since the library had the wrong extension (I think that either libasmrun_shared.so was only ever compiled as the dual of libcamlrun_shared.so or it was used with -output-obj and so linking the final executable was done separately). ocamlopt -runtime-variant _shared will now produce a native executable requiring libasmrun_shared-<suffix>.so.

Similarly in passing, the DLLS and DLPT sections were always written even if they were empty. 16 bytes are now saved by omitting them entirely when there's nothing to put in them.

tools/objinfo needs updating for the new DLLS format
Clarification in runtime/Mangling.md of the Zinc Runtime ID
Is it worth having something which will display the mangled name to allow for installation?
Add -no-suffixed - and set out deprecation plan for switching the meaning. Or, obeying one's own rules, do we change this immediately? Do we already have a good story for -no- options in OCaml or is that just gcc?

camlheader-search

Bytecode executables are normally a bytecode image with a small launcher prepended. On Unix, this is a shebang, usually directly to the interpreter. On Windows, this is a small executable.

Historically, Windows quietly didn't include the full path to the runtime, allowing the header to search PATH for the runtime.

This PR formalises this behaviour, extending both the shebang launcher and the executable to be capable of performing three searches for the runtime:

1. At a preconfigured location (i.e. the present behaviour)
2. In the same directory as the executable
3. In PATH

At present, Unix does 1 only and Windows does 3 only. A new option --enable-runtime-search is added to configure. This option accepts three values:

no (equivalent to --disable-runtime-search) maintains the current behaviour, with the difference that Windows will also use a preconfigured location only
yes (equivalent to --enable-runtime-search) prefers the current behaviour, but if the runtime is not found at the preconfigured location, then the same directory as the executable is tried, followed by a search of PATH
always first looks in the same directory as the executable for the runtime, and then searches PATH if necessary

The behaviour is designed to allow the compiler distribution to be cloned, and --enable-runtime-search=always removes the last traces of a hard-coded path from the compiler distribution when combined with the other PRs in this series.
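The three settings boil down to an ordered list of places to look. A sketch of that mapping, with hypothetical type and function names (the real launchers are a shell script and a C program, not OCaml); List.find_map needs OCaml 4.10 or later:

```ocaml
(* Hypothetical names, for illustration only. *)
type search_mode = Disabled | Enabled | Always
type location = Preconfigured | Executable_dir | Path

(* The order in which each mode tries the three kinds of location. *)
let search_order = function
  | Disabled -> [Preconfigured]
  | Enabled -> [Preconfigured; Executable_dir; Path]
  | Always -> [Executable_dir; Path]

(* [candidates] maps a location to the file names to probe there, and
   [exists] tests one of them; the first hit wins. *)
let find_runtime ~exists ~candidates mode =
  List.find_map
    (fun loc -> List.find_opt exists (candidates loc))
    (search_order mode)
```

Passing exists and candidates as functions keeps the sketch independent of the file system, so the ordering can be checked without an installed runtime.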

The runtime search mode is encoded in camlheader. The existing format is extended to convey to ocamlc the required runtime search mode.

With --disable-runtime-search, the shebang header is as before, except that the directory must end with a directory separator. A default camlheader would be:

#!/usr/local/bin/
/usr/bin/sh

The executable header has a similar first line, but with the #! changed to !! (e.g. !!/usr/local/bin/) followed on the next line by the binary data for the header.

With --enable-runtime-search[=yes], the shebang header is replaced with an entire shell script which begins with a shebang for sh (usually #!/usr/bin/sh). ocamlc identifies this case by the lack of a trailing directory separator. In this case, the entire script is copied to the executable, except that the line exactly matching r= is replaced with r='<runtime-name>'. The executable header is encoded as for --disable-runtime-search, but with !# instead of !!

Finally, with --enable-runtime-search=always, the shebang header is processed as for --enable-runtime-search=yes (the file is generated differently). The executable header simply has !! on the first line, followed by the executable itself.

ocamlc is therefore able to determine from camlheader exactly what to write both in terms of header and for the RNTM section. The same executable header is used regardless of the mode - the format of RNTM is tweaked, using null characters (which are illegal in filenames on all systems):

If RNTM ends with a null, then RNTM is the preconfigured location and is the only runtime which should be tried
If RNTM begins with a null, then the rest of the RNTM data is the name of the runtime to search for and is not null-terminated (used for --enable-runtime-search=always)
Otherwise, RNTM will contain one null in the middle of the string
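A sketch of decoding RNTM under these three rules (Python for illustration; the launcher itself is C). The mode labels and the function name are hypothetical. For the third case the notes don't spell out what sits either side of the null, so treating it as preconfigured location then runtime name is an assumption.

```python
def parse_rntm(data: bytes):
    """Decode an RNTM section into (mode, preconfigured, search_name)."""
    if data.endswith(b"\0"):
        # Trailing null: preconfigured location, the only runtime to try
        path = data[:-1]
        if not path or b"\0" in path:
            raise ValueError("malformed RNTM section")
        return ("absolute", path, None)
    if data.startswith(b"\0"):
        # Leading null: the rest is the runtime name, not null-terminated
        name = data[1:]
        if b"\0" in name:
            raise ValueError("malformed RNTM section")
        return ("always", None, name)
    # Otherwise exactly one null in the middle (interpretation assumed)
    path, sep, name = data.partition(b"\0")
    if not sep or b"\0" in name:
        raise ValueError("malformed RNTM section")
    return ("search", path, name)
```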

--enable-runtime-search controls stdlib/camlheader, and thus all the bytecode tools which will be built and installed. There is also --enable-runtime-search-target, which controls stdlib/target_camlheader, and thus everything which will be produced by ocamlc after installation.

These settings all have active use-cases:

OS package managers will wish to keep the default --disable-runtime-search --disable-runtime-search-target behaviour (the runtime being required to be in /usr/bin/ocamlrun).
opam will use --enable-runtime-search=always --enable-runtime-search-target=yes, allowing the compiler to be cloned, but producing executables for the user which assume that switches won't move, but which are resilient to that move.

Implement --enable-runtime-search-target
Implement the ^r= approach to substitution
Remove the additional runtimes interpretation (for RNTM it's much easier - we null terminate RNTM regardless and then search for the first null)
- Oct 2024 Revised thinking (in the current branches) is that this is not a good idea. The benefit when keeping this is that the bytecode images become portable - i.e. the same bytecode sources produce the same header, and can be portably run on another system (combined with camlheader-ape, it's then actually a portable executable between systems).
- This should be extended to include the compressed marshalling flag (i.e. the presence or not of the compressed marshalling primitive)
Ensure tools/ocamlsize is working with the various shebang headers (it needs to be able to parse both the path and the name lines)
The boot compiler should be using both relative standard library (with location .) and relative header search (so no need to specify runtime)
When this is revised, consider the combination of enable-relative. enable-relative should allow the header to be specified relatively but written absolutely. This is --enable-runtime-search=always --enable-runtime-search-target=no - i.e. the compiler is relocatable, but it writes shebang headers which are based on the inferred absolute location of the compiler. That absolutely requires the header to be computed in ocamlc. It also suggests that camlheader-search should probably be earlier in the patch-set - it makes sense that enable-relative alters this branch, rather than the other way around.
Notes from 8-Oct-2024 whiteboarding this:

	`_ ↓`	`_ ↓`
`int31-only`	`0 0`	`1 1`
`static`^[1]	`0 1`	`0 1`
	`↑`	`↑`
	shared-only	shared only
	(1 system)	(1 system)

Absolute (only one runtime possible)
…
Search (priority + rank others)
…
Always (stable ordering - shared+int63, static+int63/shared+int31, static+int31)

Boot compiler: not shared + int31-only which results in a fixed zinc ID.

Tests:

TODO

camlheader-ape

This is an interest branch for the ape header version. Notes so far:

File is in cosmo/camlheader.ape on ubuntu.thor
Extract the DOS stub from an existing EXE
Change the header from MZ\0\0\0 to MZ='\n
Append '\nread -r r<<"EOF"
This stub should still execute in DOSBox (displaying the traditional "This program cannot be run in DOS mode" message)
Compile tmpheader.exe using cl with /stub after /link to inject this revised stub
Assemble camlheader with !! on the first line followed by this stub
Append \nEOF\n to camlheader followed by the --enable-runtime-search=always script
Alter ocamlc to still scan for the ^r=$ line when compiling.
Alter the header minimally to be shebang compatible
That header can now be dropped in both Unix and Windows - demonstrate with the str.cma that the file produced is the same

disable-absolute

XXX PR not yet started!

The intention here is to allow the compiler to continue towards reproducibility by adding --enable-relative=strict. In this mode, the default injected for caml_standard_library_default is "" - i.e. if a program uses Config, then it must link with -set-global-string to set the correct value (which may or may not be relative, depending on the use-case).

Revisiting some of the stuff in enable-relative which embeds caml_standard_library_default. Issues:

Tests:

TODO

runtime-realpath

This branch is functionally complete, and includes two fixes to the Windows version (\\?\UNC is now correctly translated; there's an off-by-one error in a buffer calculation, and the use of GetFullPathName is more reliable than using Filename.dirname, although it's not clear that that's anything further than just fixing the bugs in Filename.dirname on Windows)

However, the resulting code is very complicated.

Conclusion: enable-relative will use the syscalls directly. The bug fixes in Unix.realpath should be transferred to trunk (in OCaml). The error handling stuff is worth doing a PR for, but not terribly urgently. That's possibly more worth looking at as part of the wider error handling stuff.

This PR moves the C parts of realpath from unix/win32unix to give caml_realpath. The move simplifies the implementation of Unix.realpath (this could have been done anyway, but it's now more obvious).

There's also a change to move win32_maperr from win32unix into the runtime - this makes a small change to the errno used for ERROR_CURRENT_DIRECTORY which needs investigating, but this feels like a better change - there was a smaller stub function already in win32.c. It's not a necessary part of the change, though - it'd be possible to pick errno values in caml_realpath for the error cases instead.

target-bindir

This branch is no longer required. The fix here is superseded by the fixes in runtime-launch-info.

This option has been present since 4.08 in configure but it wasn't propagated to the build system (which allows TARGET_BINDIR to be specified manually when building cross compilers).

--with-target-bindir has been broken since 4.08.0 - there's a notional remnant in that TARGET_BINDIR can be passed directly to make. Looking on GitHub, no use of -target-bindir or --with-target-bindir is actually correct (they're all pointless - it's the same as BINDIR).

This test should return just the prefix (/usr/local/bin/ocamlrun):

./configure --with-target-bindir=/somewhere
make -j
find . -executable -type f | xargs -L1 sed -ne '1s/^#! *//p' | grep -v '^/\(usr/\)\?bin\(/\(ba\)\?sh\)\?' | sort | uniq

and stdlib/target_camlheader should be #!/somewhere/ocamlrun

camlheader_ur

This branch is no longer required. The fix here is superseded by the fixes in one-camlheader.

use_shebang is presently in utils/config.fixed.ml which is probably correct, but needs confirming. In particular, that means that boot/ocamlc must never rely on it (because it will write the wrong thing on Windows). The key thing is that the bootstrap doesn't use -use-runtime and then attempt to run the resulting binary (which is correct) and that under correct configuration we never emit #! during the build on Windows!

Eliminates camlheader_ur: it's broken in terms of long shebangs. It's either literally #! or camlheader (which will be an executable).

Tests: none required

long-shebangs

This branch is no longer required. The fixes here have been folded into one-camlheader.

In -use-runtime mode with a long shebang, there is a comment in bytecomp/bytelink.ml which doesn't entirely make sense. It appears that the #!/bin/sh shebang is written and then the runtime gets written on a stray line following it and also it gets written in RNTM. Confirm that this is three copies and, if so, do we need the extra line? i.e. does tools/ocamlsize still work?
@shindere will request that the endif in Makefile is tagged with comments (n/a)

Fixes the bug @dra27 identified in https://github.com/ocaml/ocaml/pull/8622#issuecomment-503605224

This PR moves the generation of all shebangs back into ocamlc

It also always builds the executable header for Unix.

Tests: none required

opam-bin patches

opam-bin introduces + notation for explicit relative paths in ld.conf, but this seems unnecessary: the present behaviour (in OCaml) borders on a bug and can only have been done by hand (I think?!) so the alternative is to treat any relative path as relative to where ld.conf was loaded from.
opam-bin also searches OPAM_SWITCH_PREFIX. This is not necessary with the relative searching PR.
the ocamlmklib stuff is to do with compiler_path which is already in hand.
opam-bin seems to add support for #load "+file.ml", although it's not at all clear why this is a necessary fix.
opam-bin includes extending relocation support to things which use compiler-libs. Don't care about that (at least at this stage): if ppx_tools wants to be relocatable, it should be updated to use the compiler-libs relocatably, not the compiler mashed around.
opam-bin does a lot of work to normalise the path - this is unnecessary, it should be done by chdir followed by getcwd if it even matters. It is very tempting to move Daniel's realpath into the standard library for doing this directly. This should be done - we'll fall back to chdir / getcwd otherwise.
The OPAM_SWITCH_PREFIX trick poses a further problem - anything using compiler-libs, but not installed in the switch, immediately fails with another switch:











RUN opam install opam-bin
RUN opam exec -- opam-bin install
RUN opam switch create prime-cache ocaml-base-compiler.4.14.0
RUN opam switch create foo ocaml-base-compiler.4.14.0
RUN opam switch create bar ocaml-base-compiler.4.14.0
RUN echo 'print_endline Config.standard_library' > stdlib_loc.ml
RUN opam exec --switch=bar -- ocamlopt -o stdlib_loc -I +compiler-libs ocamlcommon.cmxa stdlib_loc.ml
RUN echo 'All of these should be the same:'; \
    opam exec --switch=bar ./stdlib_loc; \
    opam exec --switch=foo ./stdlib_loc; \
    ./stdlib_loc

Original document

Relocatable OCaml

Technical background

Locating the interpreter `ocamlrun`

Bytecode executables presently have three mechanisms for invoking the bytecode interpreter:

abs Absolute location specified with a shebang (#!) header or inserting the location immediately below the camlheader program. This is the preferred mechanism on Unix.
env PATH searching by prefixing the bytecode image with a small C program which searches PATH for ocamlrun. This mechanism is primarily used on Windows.
custom Compiling the executable with -custom which builds an entire runtime (with any other C support required) and embeds the bytecode into this executable.

@@DRA Check that this is true
Note that abs can be implemented both as #! and also with the camlheader program.

Each of these has various strengths and weaknesses:

abs guarantees the correct runtime will be found (although see later notes on CAML_LD_LIBRARY_PATH), but is not relocatable (that is to say, the runtime must be in the same place on any system on which the executable is expected to run)
env allows for relocation, but at the risk of executables being unrunnable if the environment is scrubbed (e.g. in a sudo operation) or if the first ocamlrun found is for the wrong version of OCaml.
custom solves all these problems but at the expense of larger executables and, to a lesser extent, the loss of common updating of components (depending on one’s view of “DLL hell”)

Locating the standard library

Both the runtime and the compilers use the location of the standard library.

The runtime uses it to locate ld.conf which, in conjunction with CAML_LD_LIBRARY_PATH, is used to search for dynamically linked C stub libraries.

The compilers use it as the final search directory for object files.

In each case, the value of the standard library location given to configure is embedded and is the default. Without manual tweaking, this location is presently absolute.

The default may be overridden by the OCAMLLIB environment variable. For the compilers, it can also be completely ignored by using the -nostdlib parameter.

@@DRA check interaction with -I - this will be clearer with a formal list of where the search path can presently come from (same for CAML_LD_LIBRARY_PATH)

Locating dynamically loaded C stubs

The runtime forms a search path from directories given in CAML_LD_LIBRARY_PATH and ld.conf for locating .so (or .dll on Windows) files containing C primitives needed by the bytecode image.

Goals

Changes to the compiler and runtime should seek to unify the following goals:

1. Execution of a bytecode executable using a shebang header should never select the wrong runtime.
2. It should be possible to have multiple runtimes in PATH without breaking 1). A corollary of this is that it should be possible to install multiple runtimes.
3. It should be possible to move an installed compiler to a different location without setting environment variables.
4. OCAMLLIB should not cause a compiler to cease working just because the library it points to is for a different version of OCaml.
5. Likewise, CAML_LD_LIBRARY_PATH should not cause one version of ocamlrun to load C primitives intended for a different version of the runtime.

Proposals

Runtime MD5

Every build of OCaml will include a new MD5 magic number formed by the checksum of the concatenation of several configuration parameters:

Version
Bytecode magic number
Other pertinent configuration (TODO Naked pointers, etc.?)

This does not include any paths, so it is intended that the magic numbers would be the same on any system for a given configuration of an identical version of OCaml (indeed, a database of these may be included in OCaml to allow better error reporting, similar to changes proposed for ocamlobjinfo (@@DRA reference?)).

These magic numbers would be displayed in the output of ocamlc -config

RuntimeMD5 is the lowercase first 8 characters of this checksum.
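A sketch of how such an identifier could be derived (Python for illustration). The proposal doesn't fix the order, separator or encoding of the parameters, so plain concatenation of UTF-8 strings is an assumption here, as is the function name; the magic number in the usage note is purely illustrative.

```python
import hashlib


def runtime_md5(version: str, bytecode_magic: str, other=()) -> str:
    """First 8 lowercase hex characters of the MD5 of the concatenated
    configuration parameters (no paths are included)."""
    payload = "".join([version, bytecode_magic, *other])
    # hexdigest() is already lowercase
    return hashlib.md5(payload.encode("utf-8")).hexdigest()[:8]
```

For example, `runtime_md5("4.14.0", "Caml1999X031")` gives an 8-character identifier which is the same on every system building that configuration. Plain concatenation is ambiguous in principle ("ab" + "c" and "a" + "bc" hash identically), so a real implementation would probably want a separator.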

New handling of `OCAMLLIB`

If OCAMLLIB-RuntimeMD5 is defined, then OCAMLLIB is ignored. Note that the special case of defined and empty allows for ignoring OCAMLLIB without actually overriding the configured value.

If stdlib.cma exists in OCAMLLIB but it does not match the cma magic number of the compiler then it is ignored (silently by the runtime and with a non-fatal warning by the compilers - ocamlopt obviously checks stdlib.cmxa)

Note that any system reliably wishing to override the standard library has always been able to add a final -I to any compiler (or ocamlrun invocation), so this ocamlrun mechanism is not really intended for general use, but more for ‘completeness’ in the handling of OCAMLLIB.
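The precedence between the two variables can be sketched as follows (Python for illustration; names are hypothetical). The magic-number check on stdlib.cma is left out, and treating an empty plain OCAMLLIB as unset is an assumption carried over from the empty-environment notes rather than part of this proposal.

```python
def effective_ocamllib(env, runtime_md5, default):
    """Pick the standard library directory from a dict of environment
    variables, the RuntimeMD5 and the configured default."""
    specific = env.get("OCAMLLIB-" + runtime_md5)
    if specific is not None:
        # Defined: OCAMLLIB is ignored. Defined and empty falls back to
        # the configured default without overriding it.
        return specific if specific else default
    # Assumption: an empty OCAMLLIB is equivalent to no OCAMLLIB
    return env.get("OCAMLLIB") or default
```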

New handling of `CAML_LD_LIBRARY_PATH`

If CAML_LD_LIBRARY_PATH-RuntimeMD5 is defined, then it is used before CAML_LD_LIBRARY_PATH.

Note that as for OCAMLLIB, it has always been possible to pass -I to ocamlrun.
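A sketch of the resulting stub search path (Python for illustration; the function name is hypothetical). Placing the ld.conf directories after both environment variables and skipping empty entries are assumptions, and the separator would be ; rather than : on Windows.

```python
def stub_search_dirs(env, runtime_md5, ld_conf_dirs, sep=":"):
    """Directories to search for C stub libraries, in priority order."""
    dirs = []
    # The RuntimeMD5-specific variable is consulted before the generic one
    for var in ("CAML_LD_LIBRARY_PATH-" + runtime_md5, "CAML_LD_LIBRARY_PATH"):
        value = env.get(var, "")
        dirs.extend(d for d in value.split(sep) if d)
    # Assumption: entries from ld.conf come last
    dirs.extend(ld_conf_dirs)
    return dirs
```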

Rename `ocamlrun`

ocamlrun is installed as ocamlrun-RuntimeMD5

For legacy support, ocamlrun would continue to exist as a symlink to the new name.

New `camlheader`

camlheader (and its variants) now take the form:

#!/bin/sh -e
i=_abs-path-to-ocamlrun-RuntimeMD5_
# TODO Perhaps -e is sufficient, and allow exec to display
# any errors?
if ! test -f "$i" || ! test -x "$i" ; then
  i=ocamlrun-_RuntimeMD5_
fi
exec "$i" "$@"
# Possibly could run without set -e and have some message here

header.c should be updated with the same logic (i.e. it should initially attempt the absolute path). Note that the header is already a sh-script for shebangs which would be too long.
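The same fallback logic, sketched in Python for illustration (header.c would implement it in C). The function name is hypothetical; `shutil.which` stands in for the PATH search and returns None when nothing is found.

```python
import os
import shutil


def choose_runtime(abs_path, suffixed_name):
    """Prefer the absolute runtime; otherwise search PATH for the
    suffixed name (e.g. ocamlrun-RuntimeMD5)."""
    if os.path.isfile(abs_path) and os.access(abs_path, os.X_OK):
        return abs_path
    # Absolute location missing or not executable: fall back to PATH
    return shutil.which(suffixed_name)
```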

Suffixed stublibs

ocamlmklib will have a new parameter --suffixed which will cause .so/.dll files to be suffixed with -RuntimeMD5.

This approach will instantly fix the location problems with CAML_LD_LIBRARY_PATH, but it requires opt-in, since it involves build system support. ocamlbuild and dune would aim to support this mode of operation from release time. TODO Not sure that either tool actually invokes ocamlmklib directly, so this would involve patching them to follow the same naming standard, which for ocamlbuild may be too hard (Dune being both opinionated, and also generating .install and META files automatically, should be more able to adapt)

The intention would be to make --suffixed the default with the following schedule:

At release, ocamlmklib displays a non-fatal warning if invoked without --suffixed and a dynamic stub library is generated
Two releases later the warning changes to warn that the behaviour will alter in the next release
The next release (three after the original feature) makes --suffixed the default behaviour

There should not be a --unsuffixed flag - all build systems should convert to this new standard, given that it’s only a build system alteration, and does not materially affect any code.

`--enable-relative-libdir` configuration option

In this mode, the relative location from BINDIR to LIBDIR is embedded as the default location of the standard library. Note that while ocamlc -config should display this as a new value, the existing display of the standard library location should remain an absolute path (i.e. ocamlc -where will return a computed absolute path based on the location of ocamlc). If the binary is unable to determine where it was invoked from, then an error will be reported, as if the standard library could not be found.
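The two halves of this can be sketched as follows (Python with POSIX paths, for illustration only): what configure would embed, and what ocamlc -where would compute from its own location. Both function names are hypothetical, and how the binary discovers its own path is not covered.

```python
import posixpath


def relative_libdir(bindir, libdir):
    """Relative path from BINDIR to LIBDIR, as would be embedded at
    configure time."""
    return posixpath.relpath(libdir, bindir)


def resolve_stdlib(exe_path, embedded):
    """Absolute standard library location computed from the location of
    the running binary and the embedded relative path."""
    return posixpath.normpath(posixpath.join(posixpath.dirname(exe_path), embedded))
```

With BINDIR /usr/local/bin and LIBDIR /usr/local/lib/ocaml the embedded value is ../lib/ocaml, so a copy of the tree at /opt/ocaml resolves its standard library to /opt/ocaml/lib/ocaml.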

Dealt with notes

Everything from here on is archived (i.e. not useful either for documentation or tasks)

Notes while assembling Demo 28-Sep-2021

All dealt with 5-Sep-2022

Stubs bitness potentially exposes a problem: the runtime must attempt to load a .so which matches itself
This requires the ability to transform the DLL name

dllunix-bbbp-machine.so
dllunix-bbbp-machine.so

Runtime ID and Machine ID

Bytecode executables do not specify Machine ID: that's the point
So there is ocamlrun at a given version -> don't care when executing that machine, only that everything loaded needs to match
In native mode, this would only matter for the shared runtime

This requires a slight tweak to the .cma format to add a list of suffixed DLLs -> will have to do something in the DLLS section itself to indicate that they're loaded this way.

This is particularly useful on Windows, where it allows mingw and msvc to coexist in completely separate harmony. It's also potentially useful on other systems which are capable of loading in two different modes (macOS?)

Another important aspect, which might affect ocamlmklib -suffixed is whether the runtime ID used for bytecode stub libraries has 64-bit support or not. I think it's probably fine that this doesn't specify, or that it's always set to 31bit. I think it's that - the C library has hard-coded runtime support, so it's more that the machine must match up. Yes, this is definitely it: ocamlmklib should be clearing the 63-bit ID (or using a specific computed stubs runtime ID)

This is worth noting in runtime ID: a runtime ID always has the same set of bits, but depending on the context, some of them are not permitted to be set. Setting the 63/31 specifier allows bytecode to locate a runtime which can run it, but it obviously cannot be used to find the correct machine for loading stub libraries - that must clearly match the given configuration and machine type.

All dealt with 5-Sep-2022

Clarification for Runtime ID and Machine ID

Machine ID is just the host triplet - e.g. x86_64-pc-windows … it's not the shortest, but it's obvious that x86_64-pc-windows-ocamlrun-bbbp is the msvc64 version of ocamlrun-bbbp

In other instances, the Runtime ID describes a target configuration, but the context in which it's being used will then necessarily zero out some other bits:

For ocamlrun, we're selecting the runtime based on the properties of the bytecode in question:

Version matters (bytecode version!)
63-bit support matters (the code requires 63-bit support in order to be loaded)
reserved headers could optionally matter => actually, I don't think this does matter.
shared library support matters => but is implied by the presence of DLLS. Note for Dynlink that this is a runtime error, so again it could optionally matter
frame-pointers clearly doesn't matter (native)
no-naked-pointers should not matter (unless doing whacky Obj stuff) - optionally matter
spacetime clearly doesn't matter (native)
force-safe-string doesn't matter (at least I don't think it does for bytecode?)
flat-float-array could optionally matter (I think)
Windows unicode only optionally matters

A similar combination applies to native code:

63-bit bit is bytecode only (encoded in the triplet)
shared library support is always set (because it must be!)
frame-pointers obviously matters
no-naked-pointers matters (right? affects code generation?)
spacetime obviously matters
force-safe-string matters (affects code generation)
flat-float-array matters (affects code generation)
Windows unicode likewise optionally matters

There is then an additional context for stub libraries:

Version matters (we should be looking at the same OCaml)
63-bit support does NOT matter - what matters is that the code matches the runtime (i.e. the Machine ID matches)
reserved headers must match the runtime
shared library is implied - as for native!
frame-pointers still doesn't matter (native)
no-naked-pointers should match the runtime?!
spacetime clearly doesn't matter (native)
force-safe-string must match the runtime
flat-float-array must match the runtime
Windows unicode must match the runtime

So I think what's missing for the stub libraries is that we should be modifying the .cma format to specify that stub DLLs are being loaded which need transforming - but we'd still write dllunix. What alters is that we write "dllunix" and then rely on the runtime to add the .so and the appropriate host triplet and runtime ID.

Note that when the runtime is working out which DLLS to load, it would be doing this based on configured runtime ID… i.e. it's not about the basename. That involves embedding the runtime ID in the runtime as well, which I don't think has been done.

Demo 16-Sep-2021

All dealt with 5-Sep-2022

Looks like the i386-static needs some work in one of the earlier branches.

Target here:

Definitely get Actions pipeline passing
- make depend needs running somewhere
- Possible fix to --disable-shared needed somewhere
Get Windows working too
TODO The weak linking patches are proving too unstable on Windows. Let's go for something simpler: introduce libocamlrun*.byte.a and libocamlrun*.opt.a. This name is preferred over libcamlrun, but it's there as a fallback. These libraries will not include stdlib.* - it's expected that whatever is doing the linking will have specified them. We'll still include libcamlrun.a for compatibility - the symbol can still be declared weakly in it, but we don't require the use of it. The searching will have a small subtlety: we'll first search for libcamlrun and then if libocamlrun.byte is found in that directory then we'll use it (i.e. libocamlrun needs to be in the same directory as a libcamlrun). It might be better to pick a different name just to reduce the chances of its being confused by a user who has not RTFM'd the linking with C chapter.
Rebase on to trunk (to pick up sak todo items, etc.)
Rebase on to 4.13
Rebase on to 4.12

At this point the target is to set up 4.11.2, 4.12.1, 4.12.1-relative, 4.13.0 and 4.13.0-relative switches to demonstrate the breakages. 4.11.2 will be the actual release - the 4.12 and 4.13 branches should be simulated by changing VERSION on their branches.

[1] These could do with flipping around - i.e. to be 63bit-required and shared

Future work

Tests

Notes

Rebase February 2023

opam packaging notes

Repackaging notes September 2022

opam packaging notes

Hard-linking the tree from the cache (opam notes)

External TODO

TODO

use-runtime-evil Image Not Showing Possible Reasons The image file may be corruptedThe server hosting the image is unavailableThe image path is incorrectThe image format is not supported Learn More →

misc-win-fixes : Image Not Showing Possible Reasons The image file may be corruptedThe server hosting the image is unavailableThe image path is incorrectThe image format is not supported Learn More →

windows-ln Image Not Showing Possible Reasons The image file may be corruptedThe server hosting the image is unavailableThe image path is incorrectThe image format is not supported Learn More →

test-in-prefix

unified-header Image Not Showing Possible Reasons The image file may be corruptedThe server hosting the image is unavailableThe image path is incorrectThe image format is not supported Learn More →

Notes from 4.05-4.14 fixing

exe-executing Image Not Showing Possible Reasons The image file may be corruptedThe server hosting the image is unavailableThe image path is incorrectThe image format is not supported Learn More →

empty-env Image Not Showing Possible Reasons The image file may be corruptedThe server hosting the image is unavailableThe image path is incorrectThe image format is not supported Learn More →

backslashes Image Not Showing Possible Reasons The image file may be corruptedThe server hosting the image is unavailableThe image path is incorrectThe image format is not supported Learn More →

runtime-launch-info Image Not Showing Possible Reasons The image file may be corruptedThe server hosting the image is unavailableThe image path is incorrectThe image format is not supported Learn More →

Background: Bytecode Executables

The Image Not Showing Possible Reasons The image file may be corruptedThe server hosting the image is unavailableThe image path is incorrectThe image format is not supported Learn More → s

Current implementation

Proposed implementation in this PR

ld.conf-CRLF Image Not Showing Possible Reasons The image file may be corruptedThe server hosting the image is unavailableThe image path is incorrectThe image format is not supported Learn More →

ld.conf-search Image Not Showing Possible Reasons The image file may be corruptedThe server hosting the image is unavailableThe image path is incorrectThe image format is not supported Learn More →

ld.conf-relative Image Not Showing Possible Reasons The image file may be corruptedThe server hosting the image is unavailableThe image path is incorrectThe image format is not supported Learn More →

compiled-primitives Image Not Showing Possible Reasons The image file may be corruptedThe server hosting the image is unavailableThe image path is incorrectThe image format is not supported Learn More →

enable-relative Image Not Showing Possible Reasons The image file may be corruptedThe server hosting the image is unavailableThe image path is incorrectThe image format is not supported Learn More →

ld-warning Image Not Showing Possible Reasons The image file may be corruptedThe server hosting the image is unavailableThe image path is incorrectThe image format is not supported Learn More →

runtime-id Image Not Showing Possible Reasons The image file may be corruptedThe server hosting the image is unavailableThe image path is incorrectThe image format is not supported Learn More →

runtime-suffixing Image Not Showing Possible Reasons The image file may be corruptedThe server hosting the image is unavailableThe image path is incorrectThe image format is not supported Learn More →

camlheader-search Image Not Showing Possible Reasons The image file may be corruptedThe server hosting the image is unavailableThe image path is incorrectThe image format is not supported Learn More →

camlheader-ape Image Not Showing Possible Reasons The image file may be corruptedThe server hosting the image is unavailableThe image path is incorrectThe image format is not supported Learn More →

disable-absolute Image Not Showing Possible Reasons The image file may be corruptedThe server hosting the image is unavailableThe image path is incorrectThe image format is not supported Learn More →

runtime-realpath Image Not Showing Possible Reasons The image file may be corruptedThe server hosting the image is unavailableThe image path is incorrectThe image format is not supported Learn More →

target-bindir Image Not Showing Possible Reasons The image file may be corruptedThe server hosting the image is unavailableThe image path is incorrectThe image format is not supported Learn More →

camlheader_ur Image Not Showing Possible Reasons The image file may be corruptedThe server hosting the image is unavailableThe image path is incorrectThe image format is not supported Learn More →

long-shebangs Image Not Showing Possible Reasons The image file may be corruptedThe server hosting the image is unavailableThe image path is incorrectThe image format is not supported Learn More →

opam-bin patches

Relocatable OCaml

Technical background

Locating the interpreter ocamlrun

Locating the standard library

Locating dynamically loaded C stubs

Goals

Proposals

Runtime MD5

New handling of OCAMLLIB

New handling of CAML_LD_LIBRARY_PATH

Rename ocamlrun

New camlheader

Suffixed stublibs

--enable-relative-libdir configuration option

Dealt with notes

Notes while assembling Demo 28-Sep-2021

Demo 16-Sep-2021

use-runtime-evil

misc-win-fixes

windows-ln

unified-header

exe-executing

empty-env

backslashes

runtime-launch-info

ld.conf-CRLF

ld.conf-search

ld.conf-relative

compiled-primitives

enable-relative

ld-warning

runtime-id

runtime-suffixing

camlheader-search

camlheader-ape
