We've run into a problem with sanitizing part of the NetBSD userland: in a few narrow cases that the sanitizers cannot reason about on their own, we need helper functions to make sanitization possible.

The current problem is callback functions that are defined in programs and executed from the internals of libc. This is the case for sorting functions that take a comparison function, e.g. qsort(3):

    void qsort(void *base, size_t nmemb, size_t size,
        int (*compar)(const void *, const void *));

The same applies to heapsort(3) and mergesort(3), and to their users:

 - fts_open(3)
 - alphasort(3)
 - scandir(3)
 - tdelete(3), twalk(3), tfind(3), tsearch(3)
 - bsearch(3)

Once a callback function is executed from the internals of libc, a sanitized program does not know whether the arguments passed to it are properly initialized.

Two examples:

# modstat
Uninitialized bytes in int __interceptor_strcmp(const char *, const char *) at offset 0 inside [0x731000000000, 1)
==19613==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x21dc89 in modstatcmp /public-dyn/src/sbin/modstat/main.c:246:9
    #1 0x7f7ff692d581 in qsort (/lib/libc.so.12+0x12d581)
    #2 0x21ca82 in main /public-dyn/src/sbin/modstat/main.c:181:2
    #3 0x21b341 in ___start (/sbin//modstat+0x1b341)

SUMMARY: MemorySanitizer: use-of-uninitialized-value /public-dyn/src/sbin/modstat/main.c:246:9 in modstatcmp
Exiting

# ls
==11267==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x222750 in mastercmp /public-dyn/src/bin/ls/ls.c:691:17
    #1 0x7f7ff63bc435 in med3 /public-dyn/src/lib/libc/stdlib/qsort.c:94:9
    #2 0x7f7ff63bbe12 in qsort /public-dyn/src/lib/libc/stdlib/qsort.c:128:8
    #3 0x7f7ff630ce6b in fts_sort /public-dyn/src/lib/libc/gen/fts.c:1028:2
    #4 0x7f7ff630e39b in fts_build /public-dyn/src/lib/libc/gen/fts.c:915:10
    #5 0x7f7ff630e8b9 in __fts_children60 /public-dyn/src/lib/libc/gen/fts.c:616:18
    #6 0x256e2d in __interceptor___fts_children60 /public-dyn/llvm/projects/compiler-rt/lib/msan/../sanitizer_common/sanitizer_common_interceptors.inc:7368:12
    #7 0x2221a7 in traverse /public-dyn/src/bin/ls/ls.c:470:10
    #8 0x22149d in ls_main /public-dyn/src/bin/ls/ls.c:405:3
    #9 0x226ab1 in main /public-dyn/src/bin/ls/main.c:48:9
    #10 0x21b531 in ___start (/bin/ls+0x1b531)

SUMMARY: MemorySanitizer: use-of-uninitialized-value /public-dyn/src/bin/ls/ls.c:691:17 in mastercmp
Exiting

Possible solutions:

 1. Reimplement the libc functions inside the sanitizers.
 2. Copy part of the libc code into the sanitizers' source code and build it along with the sanitizers.
 3. Inject __msan_unpoison()-like functions inside libc, optionally under the MKSANITIZER switch.
 4. Use auxiliary sanitizer functions/macros inside the programs that need them, and enable them only when the program is built with a sanitizer.

I've gone through points 1-4:

 1. Isn't really doable. It means duplicated functionality and a maintenance burden, and the two implementations will go out of sync. While it might be theoretically possible for sorting functions, reimplementing fts_open(3)-like features is too much work. Upstream would likely reject it.
 2. Not applicable for upstream. Someone would need to keep both copies in sync. We would end up with two different implementations of features like fts_open(3): one built through -fsanitize and one standalone.
 3. Adding any symbols to libc comes at a cost. A specially preprocessed libc would be needed to sanitize some programs that otherwise use the plain libc. These symbols would also be injected into performance-critical paths, such as every execution of the callback in a sorting function.
 4. This explicitly restricts the usage of helper functions to the programs that need them, and only when they are built with a sanitizer. No libc replacement is needed.

The rest of the world already does what point 4 describes; it is common practice in third-party software such as libuv, mozjs, rr, iotjs, libcrypto++, julia, openssl, firefox etc.
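To make point 4 concrete, here is a minimal sketch of a comparison callback that marks its arguments as initialized before reading them. The macro name EXAMPLE_MSAN_UNPOISON, the struct entry type and the sort_entries() helper are hypothetical and chosen only for illustration; the one real sanitizer call is __msan_unpoison() from <sanitizer/msan_interface.h>. Which bytes actually need to be marked depends on the program, so treat the sizes below as an assumption.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper macro (name is illustrative only).
 * Under Clang with -fsanitize=memory it calls __msan_unpoison();
 * in every other build it expands to a no-op.
 */
#if defined(__has_feature)
#  if __has_feature(memory_sanitizer)
#    include <sanitizer/msan_interface.h>
#    define EXAMPLE_MSAN_UNPOISON(p, n) __msan_unpoison((p), (n))
#  endif
#endif
#ifndef EXAMPLE_MSAN_UNPOISON
#  define EXAMPLE_MSAN_UNPOISON(p, n) ((void)(p), (void)(n))
#endif

/* Illustrative record type, not taken from modstat(8) or ls(1). */
struct entry {
	char name[16];
};

/*
 * Comparison callback invoked from inside libc's qsort(3).
 * An uninstrumented libc does not maintain MSan's shadow state for
 * the elements it hands to us, so tell MSan they are initialized
 * before touching them.
 */
static int
entry_cmp(const void *a, const void *b)
{
	const struct entry *ea = a;
	const struct entry *eb = b;

	EXAMPLE_MSAN_UNPOISON(ea, sizeof(*ea));
	EXAMPLE_MSAN_UNPOISON(eb, sizeof(*eb));

	return strcmp(ea->name, eb->name);
}

/* Sort an array of entries by name. */
static void
sort_entries(struct entry *v, size_t n)
{
	qsort(v, n, sizeof(*v), entry_cmp);
}
```

In a build without MSan the macro disappears entirely, so the program behaves exactly as before and libc stays untouched.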
I've prepared a <sanitizer.h> header that is intended to abstract the inclusion of sanitizer-specific headers in userland programs and to export macros for those programs. If a program is not built with a sanitizer, each macro evaluates to a dummy line of code.

Proposed patch with the new header and a patched ls(1):

http://netbsd.org/~kamil/patch-00049-ls-msan.txt

With the above diff, ls(1) executes correctly under Memory Sanitizer.

The patch includes support for ASan, MSan and TSan. UBSan does not need a dedicated header. The MSan support is restricted to Clang/LLVM only. The other sanitizers (ESan, DFSan, Scudo, HWASan, LSan) are skipped for now.
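For readers who have not opened the patch, the following is a rough sketch of how such a header can be laid out; it is not the content of the proposed <sanitizer.h>. All EX_* names and the ex_touch() function are hypothetical. It assumes Clang's __has_feature() checks, GCC's __SANITIZE_ADDRESS__ and __SANITIZE_THREAD__ predefined macros, and the compiler-rt interface functions __msan_unpoison(), __asan_unpoison_memory_region() and __tsan_acquire().

```c
#include <stddef.h>

/* Compilers without __has_feature get a fallback that reports "off". */
#ifndef __has_feature
#  define __has_feature(x) 0
#endif

/* MSan exists only in Clang/LLVM, so __has_feature is the only check. */
#if __has_feature(memory_sanitizer)
#  include <sanitizer/msan_interface.h>
#  define EX_MSAN_ENABLED 1
#  define EX_MSAN_UNPOISON(p, n) __msan_unpoison((p), (n))
#else
#  define EX_MSAN_ENABLED 0
#  define EX_MSAN_UNPOISON(p, n) ((void)(p), (void)(n))
#endif

/* ASan: Clang reports it via __has_feature, GCC via __SANITIZE_ADDRESS__. */
#if __has_feature(address_sanitizer) || defined(__SANITIZE_ADDRESS__)
#  include <sanitizer/asan_interface.h>
#  define EX_ASAN_ENABLED 1
#  define EX_ASAN_UNPOISON(p, n) __asan_unpoison_memory_region((p), (n))
#else
#  define EX_ASAN_ENABLED 0
#  define EX_ASAN_UNPOISON(p, n) ((void)(p), (void)(n))
#endif

/* TSan: Clang reports it via __has_feature, GCC via __SANITIZE_THREAD__. */
#if __has_feature(thread_sanitizer) || defined(__SANITIZE_THREAD__)
#  include <sanitizer/tsan_interface.h>
#  define EX_TSAN_ENABLED 1
#  define EX_TSAN_ACQUIRE(p) __tsan_acquire((p))
#else
#  define EX_TSAN_ENABLED 0
#  define EX_TSAN_ACQUIRE(p) ((void)(p))
#endif

/*
 * Run one value through every macro. In a plain build all three
 * expand to dummy expressions and the function just returns 42.
 */
static int
ex_touch(void)
{
	int value = 42;

	EX_MSAN_UNPOISON(&value, sizeof(value));
	EX_ASAN_UNPOISON(&value, sizeof(value));
	EX_TSAN_ACQUIRE(&value);

	return value;
}
```

Keeping the no-op branch next to each real definition means callers never need their own #ifdef blocks, which matches the stated goal that an unsanitized build sees only a dummy line of code.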