Yifu Wang authored
make allocators and sanitizers work for processes created with multiprocessing's spawn method in dev mode (#2660)

Summary:
Pull Request resolved: https://github.com/facebook/buck/pull/2660

**The first attempt (D30802446) overlooked the fact that the interpreter wrapper is not an executable (for `execv`) on Mac, and introduced some bugs due to the refactoring. Attempt 2 addresses these issues and isolates the effect of the change to processes created by `multiprocessing`'s spawn method on Linux.**

#### Problem

Currently, the entrypoint for in-place Python binaries (i.e. built with dev mode) executes the following steps to load system native dependencies (e.g. sanitizers and allocators):
- Back up `LD_PRELOAD` set by the caller
- Append system native dependencies to `LD_PRELOAD`
- Inject a prologue in user code which restores `LD_PRELOAD` set by the caller
- `execv` the Python interpreter

The steps work as intended for single-process Python programs. However, when a Python program spawns child processes, the child processes will not load native dependencies, since they simply `execv` the vanilla Python interpreter. A few examples of why this is problematic:
- The ASAN runtime library is a system native dependency. Without loading it, a child process that loads user native dependencies compiled with ASAN will crash during static initialization because it can't find `_asan_init`.
- `jemalloc` is also a system native dependency.

Many, if not most, ML use cases "ban" dev mode because of these problems. This is unfortunate considering the developer efficiency dev mode provides. In addition, a large number of unit tests have to run in a more expensive build mode because of these problems.
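
The original entrypoint steps listed under Problem can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual Buck entrypoint: the `ORIGINAL_LD_PRELOAD` variable name, the function names, and the library paths in the usage note are all hypothetical, and the sketch passes an explicit environment to `os.execve` rather than mutating the process environment before `execv`.

```python
import os

# Hypothetical variable used to carry the caller's LD_PRELOAD across the exec;
# the real entrypoint may use a different mechanism.
BACKUP_VAR = "ORIGINAL_LD_PRELOAD"


def prepare_env(env, native_deps):
    """Steps 1-2: back up the caller's LD_PRELOAD, then append system native deps."""
    new_env = dict(env)
    caller_preload = env.get("LD_PRELOAD", "")
    new_env[BACKUP_VAR] = caller_preload
    entries = [p for p in caller_preload.split(":") if p] + list(native_deps)
    new_env["LD_PRELOAD"] = ":".join(entries)
    return new_env


def restore_env(env):
    """Step 3: what an injected prologue would do once the interpreter is running."""
    restored = dict(env)
    caller_preload = restored.pop(BACKUP_VAR, "")
    if caller_preload:
        restored["LD_PRELOAD"] = caller_preload
    else:
        restored.pop("LD_PRELOAD", None)
    return restored


def launch(interpreter, argv, native_deps):
    """Step 4: replace this process with the Python interpreter (does not return)."""
    env = prepare_env(os.environ, native_deps)
    os.execve(interpreter, [interpreter, *argv], env)
```

For example, `prepare_env({"LD_PRELOAD": "/opt/caller.so"}, ["/usr/lib/libjemalloc.so"])` yields an environment that preloads both libraries, while `restore_env` on that result leaves only `/opt/caller.so`. A child process that later `execv`s the vanilla interpreter inherits the restored environment, so the system native dependencies are no longer preloaded in the child.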
#### Solution

Move the system native dependencies loading logic out of the Python binary entrypoint into an interpreter wrapper, and set the interpreter wrapper as `sys.executable` in the injected prologue:
- The Python binary entrypoint now uses the interpreter wrapper, which has the same command line interface as the Python interpreter, to run the main module.
- `multiprocessing`'s `spawn` method now uses the interpreter wrapper to create child processes, ensuring system native dependencies get loaded correctly.

#### Alternative Considered

One alternative considered is to simply not remove system native dependencies from `LD_PRELOAD`, so that they are present in the spawned processes. However, this can cause some linking issues, which were perhaps the reason `LD_PRELOAD` was restored in the first place.

Reviewed By: fried, Reubend

fbshipit-source-id: 9528c1856bf389ce033a8630cd718466754f3cef
7faa8a54