Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segmentation faults - Inconsistent errors with lots of async workers and objects #1174

Closed
kelvinhammond opened this issue May 8, 2022 · 44 comments
Assignees
Labels

Comments

@kelvinhammond
Copy link
Contributor

Hello,

I'm working on a pull request for a project using node-addon-api.
When I create a lot of objects via AsyncWorker I get Segmentation faults.

I traced some of them down to create a fresh Napi::Env via info.Env() vs the persistent object's Env() call. Is there a difference here or a reason why I can't do that?

I'm now running into other issues. These problems only occur after over 10,000 objects are created and destroyed.

https://github.com/openzim/node-libzim/pull/72/files#diff-f28f98b9b15ae6a552a5b00f2b80c65a701a42b1e75f81aff149ac16e480b238R15

This occurs when I call makeLargeZim.ts which calls creator.addItem.
addItem creates an AsyncWorker which stores a promise which is resolved once the libzim call completes. This is done because libzim will wait until other threads within its library are finished in the queue which blocks the nodejs / libuv main event thread.

This works fine most of the time with 10, 100, 1000, 10000, but when I try to do ~1,000,000 I get an error. Its not a memory issue, plenty of ram. Any guidance would be helpful. Thank you.

The backtrace from gdb is here

#0  0x0000555555edbe10 in v8::internal::GlobalHandles::Create(v8::internal::Object) ()                                                                                                                                               [0/14875]
#1  0x0000555555d7aee4 in v8::V8::GlobalizeReference(v8::internal::Isolate*, unsigned long*) ()                                                                                                                                               
#2  0x0000555555b2c171 in v8impl::Reference::New(napi_env__*, v8::Local<v8::Value>, unsigned int, bool, void (*)(napi_env__*, void*, void*), void*, void*) ()                                                                                 
#3  0x0000555555b3356c in napi_wrap ()                                                                                                                                                                                                        
#4  0x00007fffe674e91a in Napi::ObjectWrap<Blob>::ObjectWrap (this=0x555557f48400, callbackInfo=...) at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:4151                                                               
#5  0x00007fffe67341fe in Blob::Blob (this=0x555557f48400, info=...) at ../src/blob.h:17                                                                                                                                                      
#6  0x00007fffe67737ca in Napi::ObjectWrap<Blob>::ConstructorCallbackWrapper(napi_env__*, napi_callback_info__*)::{lambda()#2}::operator()() const (__closure=0x7fffffff8550)                                                                 
    at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:4519                                                                                                                                                                
#7  0x00007fffe6789e54 in Napi::details::WrapCallback<Napi::ObjectWrap<Blob>::ConstructorCallbackWrapper(napi_env__*, napi_callback_info__*)::{lambda()#2}>(Napi::ObjectWrap<Blob>::ConstructorCallbackWrapper(napi_env__*, napi_callback_info
__*)::{lambda()#2}) (callback=...) at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:79                                                                                                                                   
#8  0x00007fffe6773917 in Napi::ObjectWrap<Blob>::ConstructorCallbackWrapper (env=0x55555764dfb0, info=0x7fffffff8620) at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:4517                                             
#9  0x0000555555b2b9df in ?? ()                                                                                                                                                                                                               
#10 0x0000555555dd7738 in v8::internal::FunctionCallbackArguments::Call(v8::internal::CallHandlerInfo) ()                                                                                                                                     
#11 0x0000555555dd7c08 in ?? ()                                                                                                                                                                                                               
#12 0x0000555555dd8611 in v8::internal::Builtins::InvokeApiFunction(v8::internal::Isolate*, bool, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*
, v8::internal::Handle<v8::internal::HeapObject>) ()                                                                                                                                                                                          
#13 0x0000555555e9e618 in ?? ()                                                                                                                                                                                                               
#14 0x0000555555e9f004 in v8::internal::Execution::New(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) ()
#15 0x0000555555d99f49 in v8::Function::NewInstanceWithSideEffectType(v8::Local<v8::Context>, int, v8::Local<v8::Value>*, v8::SideEffectType) const ()
#16 0x0000555555b34e63 in napi_new_instance ()                                                                                                                                                                                                
#17 0x00007fffe673011a in Napi::Function::New (this=0x7fffffff8e00, argc=1, args=0x7fffffff8e90) at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:2384
#18 0x00007fffe67300c7 in Napi::Function::New (this=0x7fffffff8e00, args=...) at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:2373                                                      
#19 0x00007fffe673175e in Napi::FunctionReference::New (this=0x5555579e6de8, args=...) at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:3408
#20 0x00007fffe67346fd in Blob::New (env=..., blob=...) at ../src/blob.h:47                                                                                                                                                                   
#21 0x00007fffe673d2b6 in StringProvider::feed (this=0x555557fa2ac0, info=...) at ../src/contentProvider.h:152
#22 0x00007fffe6724153 in operator() (__closure=0x7fffffff9010) at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:170                                                                                                     
#23 0x00007fffe672ae2f in Napi::details::WrapCallback<Napi::details::TemplatedInstanceCallback<StringProvider, &StringProvider::feed>(napi_env, napi_callback_info)::<lambda()> >(struct {...}) (callback=...)
    at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:79                                                                                                                                                                  
#24 0x00007fffe67241e3 in Napi::details::TemplatedInstanceCallback<StringProvider, &StringProvider::feed> (env=0x55555764dfb0, info=0x7fffffff90a0) at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:167
#25 0x0000555555b2b9df in ?? ()
#26 0x0000555555dd7738 in v8::internal::FunctionCallbackArguments::Call(v8::internal::CallHandlerInfo) ()
#27 0x0000555555dd7e32 in ?? ()
#28 0x0000555555dd8552 in v8::internal::Builtins::InvokeApiFunction(v8::internal::Isolate*, bool, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*
, v8::internal::Handle<v8::internal::HeapObject>) ()
#29 0x0000555555e9e618 in ?? ()
#30 0x0000555555e9ee12 in v8::internal::Execution::Call(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) ()
#31 0x0000555555d9a469 in v8::Function::Call(v8::Local<v8::Context>, v8::Local<v8::Value>, int, v8::Local<v8::Value>*) ()
#32 0x0000555555b31209 in napi_call_function ()
#33 0x00007fffe672ffeb in Napi::Function::Call (this=0x7fffffff98d0, recv=0x555557565a38, argc=0, args=0x0) at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:2339
#34 0x00007fffe672feee in Napi::Function::Call (this=0x7fffffff98d0, recv=0x555557565a38, args=...) at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:2306
#35 0x00007fffe673c9a8 in ContentProviderWrapper::feed()::{lambda(Napi::Env, Napi::Function)#1}::operator()(Napi::Env, Napi::Function) const (__closure=0x7fffcc4b83a0, env=...) at ../src/contentProvider.h:74
#36 0x00007fffe67a27de in std::__invoke_impl<void, ContentProviderWrapper::feed()::{lambda(Napi::Env, Napi::Function)#1}&, Napi::Env, Napi::Function>(std::__invoke_other, ContentProviderWrapper::feed()::{lambda(Napi::Env, Napi::Function)#
1}&, Napi::Env&&, Napi::Function&&) (__f=...) at /usr/include/c++/11.2.0/bits/invoke.h:61
#37 0x00007fffe678f69c in std::__invoke_r<void, ContentProviderWrapper::feed()::{lambda(Napi::Env, Napi::Function)#1}&, Napi::Env, Napi::Function>(ContentProviderWrapper::feed()::{lambda(Napi::Env, Napi::Function)#1}&, Napi::Env&&, Napi::
Function&&) (__fn=...) at /usr/include/c++/11.2.0/bits/invoke.h:111
#38 0x00007fffe67783e3 in std::_Function_handler<void (Napi::Env, Napi::Function), ContentProviderWrapper::feed()::{lambda(Napi::Env, Napi::Function)#1}>::_M_invoke(std::_Any_data const&, Napi::Env&&, Napi::Function&&) (__functor=..., 
    __args#0=..., __args#1=...) at /usr/include/c++/11.2.0/bits/std_function.h:291
#39 0x00007fffe674cf89 in std::function<void (Napi::Env, Napi::Function)>::operator()(Napi::Env, Napi::Function) const (this=0x7fffcc4b83a0, __args#0=..., __args#1=...) at /usr/include/c++/11.2.0/bits/std_function.h:560
#40 0x00007fffe6732c30 in Napi::ThreadSafeFunction::CallJS (env=0x55555764dfb0, jsCallback=0x555557565a28, data=0x7fffcc4b83a0) at ./code/node-libzim/node_modules/node-addon-api/napi-inl.h:5780
#41 0x0000555555b4adae in ?? ()
#42 0x00007ffff7f3d92d in ?? () from /usr/lib/libuv.so.1
#43 0x00007ffff7f59d0e in ?? () from /usr/lib/libuv.so.1
#44 0x00007ffff7f43438 in uv_run () from /usr/lib/libuv.so.1
--Type <RET> for more, q to quit, c to continue without paging--
@KevinEady
Copy link
Contributor

Hi @kelvinhammond ,

We discussed this in the 13 May Node API meeting. Is it possible to provide a standalone repository that reproduces this issue? This will help with our investigations.

Thanks, Kevin

@kelvinhammond
Copy link
Contributor Author

@KevinEady The branch here openzim/node-libzim#72 demonstrates the problem, you can npm install and run npm run test-mem-leak

For gdb I run node-gyp rebuild --debug -v && npm run bundle && gdb node and the run -r ts-node/register test/makeLargeZim.ts

I can then access the backtrace.

@kelvinhammond
Copy link
Contributor Author

@KevinEady Let me know if there is anymore info I can provide for you.

@KevinEady
Copy link
Contributor

Hi @kelvinhammond ,

I will take a look at your example repository and get back to you as soon as possible.

Thanks for the quick response!

Kevin

@KevinEady KevinEady self-assigned this May 20, 2022
@KevinEady
Copy link
Contributor

Hi @kelvinhammond ,

Can I ask which Node version you are testing with?

Thanks, Kevin

@kelvinhammond
Copy link
Contributor Author

@KevinEady v16.15.0

@KevinEady
Copy link
Contributor

So I tried your makeLargeZim test with a debug version of node 16.15.0 and I can receive one of three different segfaults:

#
# Fatal error in ../deps/v8/src/handles/global-handles.cc, line 120
# Debug check failed: used_nodes_ > 0 (0 vs. 0).
#
#
#
#FailureMessage Object: 0x7ffedfef0e20
 1: 0x10fd8c3cd node::DumpBacktrace(__sFILE*) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
 2: 0x110038806 node::NodePlatform::GetStackTracePrinter()::$_3::operator()() const [/Users/kevineady/Documents/Projects/node/out/Debug/node]
 3: 0x1100387c9 node::NodePlatform::GetStackTracePrinter()::$_3::__invoke() [/Users/kevineady/Documents/Projects/node/out/Debug/node]
 4: 0x111a42646 V8_Fatal(char const*, int, char const*, ...) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
 5: 0x111a41fe5 v8::base::(anonymous namespace)::DefaultDcheckHandler(char const*, int, char const*) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
 6: 0x11065b6b6 v8::internal::GlobalHandles::NodeSpace<v8::internal::GlobalHandles::Node>::Free(v8::internal::GlobalHandles::Node*) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
 7: 0x10fe0cbad v8::PersistentBase<v8::Value>::Reset() [/Users/kevineady/Documents/Projects/node/out/Debug/node]
 8: 0x10fe2a116 v8impl::Reference::FinalizeCallback(v8::WeakCallbackInfo<v8impl::Reference*> const&) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
 9: 0x1106582a1 unsigned long v8::internal::GlobalHandles::InvokeFirstPassWeakCallbacks<v8::internal::GlobalHandles::Node>(std::__1::vector<std::__1::pair<v8::internal::GlobalHandles::Node*, v8::internal::GlobalHandles::PendingPhantomCallback>, std::__1::allocator<std::__1::pair<v8::internal::GlobalHandles::Node*, v8::internal::GlobalHandles::PendingPhantomCallback> > >*) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
10: 0x110658193 v8::internal::GlobalHandles::InvokeFirstPassWeakCallbacks() [/Users/kevineady/Documents/Projects/node/out/Debug/node]
11: 0x1106eaeef v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
12: 0x1106e83aa v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
13: 0x1106f4bdd v8::internal::Heap::FinalizeIncrementalMarkingIfComplete(v8::internal::GarbageCollectionReason) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
14: 0x11070cdfa v8::internal::IncrementalMarkingJob::Task::RunInternal() [/Users/kevineady/Documents/Projects/node/out/Debug/node]
15: 0x11002f270 node::PerIsolatePlatformData::RunForegroundTask(std::__1::unique_ptr<v8::Task, std::__1::default_delete<v8::Task> >) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
16: 0x11002d775 node::PerIsolatePlatformData::FlushForegroundTasksInternal() [/Users/kevineady/Documents/Projects/node/out/Debug/node]
17: 0x11002d440 node::PerIsolatePlatformData::FlushTasks(uv_async_s*) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
18: 0x1111a1ca8 uv__async_io [/Users/kevineady/Documents/Projects/node/out/Debug/node]
19: 0x1111c2ebf uv__io_poll [/Users/kevineady/Documents/Projects/node/out/Debug/node]
20: 0x1111a234d uv_run [/Users/kevineady/Documents/Projects/node/out/Debug/node]
21: 0x10fd0cab6 node::SpinEventLoop(node::Environment*) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
22: 0x10ff9a67f node::NodeMainInstance::Run(int*, node::Environment*) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
23: 0x10ff99f29 node::NodeMainInstance::Run(node::EnvSerializeInfo const*) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
24: 0x10fe5cd4a node::Start(int, char**) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
25: 0x1115fa15e main [/Users/kevineady/Documents/Projects/node/out/Debug/node]
26: 0x7fff204a6621 start [/usr/lib/system/libdyld.dylib]

or

#
# Fatal error in ../deps/v8/src/handles/global-handles.cc, line 103
# Debug check failed: used_nodes_ < kBlockSize (256 vs. 256).
#
#
#
#FailureMessage Object: 0x7ffee190a760
 1: 0x10e3723cd node::DumpBacktrace(__sFILE*) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
 2: 0x10e61e806 node::NodePlatform::GetStackTracePrinter()::$_3::operator()() const [/Users/kevineady/Documents/Projects/node/out/Debug/node]
 3: 0x10e61e7c9 node::NodePlatform::GetStackTracePrinter()::$_3::__invoke() [/Users/kevineady/Documents/Projects/node/out/Debug/node]
 4: 0x110028646 V8_Fatal(char const*, int, char const*, ...) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
 5: 0x110027fe5 v8::base::(anonymous namespace)::DefaultDcheckHandler(char const*, int, char const*) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
 6: 0x10ec3a2fe v8::internal::GlobalHandles::NodeSpace<v8::internal::GlobalHandles::Node>::Acquire(v8::internal::Object) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
 7: 0x10ec39e30 v8::internal::GlobalHandles::Create(v8::internal::Object) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
 8: 0x10e32c180 v8::PersistentBase<v8::Function>::New(v8::Isolate*, v8::Function*) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
 9: 0x10e32c0df void v8::PersistentBase<v8::Function>::Reset<v8::Function>(v8::Isolate*, v8::Local<v8::Function> const&) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
10: 0x10e459524 v8impl::(anonymous namespace)::ThreadSafeFunction::ThreadSafeFunction(v8::Local<v8::Function>, v8::Local<v8::Object>, v8::Local<v8::String>, unsigned long, void*, unsigned long, node_napi_env__*, void*, void (*)(napi_env__*, void*, void*), void (*)(napi_env__*, napi_value__*, void*, void*)) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
11: 0x10e452fc1 v8impl::(anonymous namespace)::ThreadSafeFunction::ThreadSafeFunction(v8::Local<v8::Function>, v8::Local<v8::Object>, v8::Local<v8::String>, unsigned long, void*, unsigned long, node_napi_env__*, void*, void (*)(napi_env__*, void*, void*), void (*)(napi_env__*, napi_value__*, void*, void*)) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
12: 0x10e452e7c napi_create_threadsafe_function [/Users/kevineady/Documents/Projects/node/out/Debug/node]
13: 0x11d66e51c Napi::ThreadSafeFunction Napi::ThreadSafeFunction::New<char const*, void, Napi::ThreadSafeFunction Napi::ThreadSafeFunction::New<char const*, void>(napi_env__*, Napi::Function const&, Napi::Object const&, char const*, unsigned long, unsigned long, void*)::'lambda'(Napi::Env, void*), void>(napi_env__*, Napi::Function const&, Napi::Object const&, char const*, unsigned long, unsigned long, void*, Napi::ThreadSafeFunction Napi::ThreadSafeFunction::New<char const*, void>(napi_env__*, Napi::Function const&, Napi::Object const&, char const*, unsigned long, unsigned long, void*)::'lambda'(Napi::Env, void*), void*, void (*)(napi_env__*, void*, void*)) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
14: 0x11d66e3bd Napi::ThreadSafeFunction Napi::ThreadSafeFunction::New<char const*, void, Napi::ThreadSafeFunction Napi::ThreadSafeFunction::New<char const*, void>(napi_env__*, Napi::Function const&, Napi::Object const&, char const*, unsigned long, unsigned long, void*)::'lambda'(Napi::Env, void*)>(napi_env__*, Napi::Function const&, Napi::Object const&, char const*, unsigned long, unsigned long, void*, Napi::ThreadSafeFunction Napi::ThreadSafeFunction::New<char const*, void>(napi_env__*, Napi::Function const&, Napi::Object const&, char const*, unsigned long, unsigned long, void*)::'lambda'(Napi::Env, void*)) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
15: 0x11d66e33d Napi::ThreadSafeFunction Napi::ThreadSafeFunction::New<char const*, void>(napi_env__*, Napi::Function const&, Napi::Object const&, char const*, unsigned long, unsigned long, void*) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
16: 0x11d66e2de Napi::ThreadSafeFunction Napi::ThreadSafeFunction::New<char const*>(napi_env__*, Napi::Function const&, Napi::Object const&, char const*, unsigned long, unsigned long) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
17: 0x11d66d2e5 Napi::ThreadSafeFunction Napi::ThreadSafeFunction::New<char const*>(napi_env__*, Napi::Function const&, char const*, unsigned long, unsigned long) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
18: 0x11d66f23c ContentProviderWrapper::ContentProviderWrapper(Napi::Env, Napi::Object const&) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
19: 0x11d66f025 ContentProviderWrapper::ContentProviderWrapper(Napi::Env, Napi::Object const&) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
20: 0x11d66ed57 std::__1::__unique_if<ContentProviderWrapper>::__unique_single std::__1::make_unique<ContentProviderWrapper, Napi::Env&, Napi::Object>(Napi::Env&, Napi::Object&&) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
21: 0x11d674948 ItemWrapper::getContentProvider() const::'lambda'(Napi::Env, Napi::Function)::operator()(Napi::Env, Napi::Function) const [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
22: 0x11d6748be decltype(std::__1::forward<ItemWrapper::getContentProvider() const::'lambda'(Napi::Env, Napi::Function)&>(fp)(std::__1::forward<Napi::Env>(fp0), std::__1::forward<Napi::Function>(fp0))) std::__1::__invoke<ItemWrapper::getContentProvider() const::'lambda'(Napi::Env, Napi::Function)&, Napi::Env, Napi::Function>(ItemWrapper::getContentProvider() const::'lambda'(Napi::Env, Napi::Function)&, Napi::Env&&, Napi::Function&&) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
23: 0x11d674837 void std::__1::__invoke_void_return_wrapper<void>::__call<ItemWrapper::getContentProvider() const::'lambda'(Napi::Env, Napi::Function)&, Napi::Env, Napi::Function>(ItemWrapper::getContentProvider() const::'lambda'(Napi::Env, Napi::Function)&, Napi::Env&&, Napi::Function&&) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
24: 0x11d6747e7 std::__1::__function::__alloc_func<ItemWrapper::getContentProvider() const::'lambda'(Napi::Env, Napi::Function), std::__1::allocator<ItemWrapper::getContentProvider() const::'lambda'(Napi::Env, Napi::Function)>, void (Napi::Env, Napi::Function)>::operator()(Napi::Env&&, Napi::Function&&) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
25: 0x11d673598 std::__1::__function::__func<ItemWrapper::getContentProvider() const::'lambda'(Napi::Env, Napi::Function), std::__1::allocator<ItemWrapper::getContentProvider() const::'lambda'(Napi::Env, Napi::Function)>, void (Napi::Env, Napi::Function)>::operator()(Napi::Env&&, Napi::Function&&) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
26: 0x11d66e8ea std::__1::__function::__value_func<void (Napi::Env, Napi::Function)>::operator()(Napi::Env&&, Napi::Function&&) const [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
27: 0x11d66e7e9 std::__1::function<void (Napi::Env, Napi::Function)>::operator()(Napi::Env, Napi::Function) const [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
28: 0x11d66e727 Napi::ThreadSafeFunction::CallJS(napi_env__*, napi_value__*, void*, void*) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
29: 0x10e45bef7 v8impl::(anonymous namespace)::ThreadSafeFunction::DispatchOne()::'lambda'(napi_env__*)::operator()(napi_env__*) const [/Users/kevineady/Documents/Projects/node/out/Debug/node]
30: 0x10e45bda0 void napi_env__::CallIntoModule<v8impl::(anonymous namespace)::ThreadSafeFunction::DispatchOne()::'lambda'(napi_env__*), void (napi_env__*, v8::Local<v8::Value>)>(v8impl::(anonymous namespace)::ThreadSafeFunction::DispatchOne()::'lambda'(napi_env__*)&&, void (&&)(napi_env__*, v8::Local<v8::Value>)) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
31: 0x10e45bb94 v8impl::(anonymous namespace)::ThreadSafeFunction::DispatchOne() [/Users/kevineady/Documents/Projects/node/out/Debug/node]
32: 0x10e45b883 v8impl::(anonymous namespace)::ThreadSafeFunction::Dispatch() [/Users/kevineady/Documents/Projects/node/out/Debug/node]
33: 0x10e45b6a4 v8impl::(anonymous namespace)::ThreadSafeFunction::AsyncCb(uv_async_s*) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
34: 0x10f787ca8 uv__async_io [/Users/kevineady/Documents/Projects/node/out/Debug/node]
35: 0x10f7a8ebf uv__io_poll [/Users/kevineady/Documents/Projects/node/out/Debug/node]
36: 0x10f78834d uv_run [/Users/kevineady/Documents/Projects/node/out/Debug/node]
37: 0x10e2f2ab6 node::SpinEventLoop(node::Environment*) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
38: 0x10e58067f node::NodeMainInstance::Run(int*, node::Environment*) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
39: 0x10e57ff29 node::NodeMainInstance::Run(node::EnvSerializeInfo const*) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
40: 0x10e442d4a node::Start(int, char**) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
41: 0x10fbe015e main [/Users/kevineady/Documents/Projects/node/out/Debug/node]
42: 0x7fff204a6621 start [/usr/lib/system/libdyld.dylib]

or

#
# Fatal error in ../deps/v8/src/handles/global-handles.cc, line 499
# Debug check failed: FREE == state() (0 vs. 2).
#
#
#
#FailureMessage Object: 0x7ffeefbf56d0
 1: 0x1000873cd node::DumpBacktrace(__sFILE*) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
 2: 0x100333806 node::NodePlatform::GetStackTracePrinter()::$_3::operator()() const [/Users/kevineady/Documents/Projects/node/out/Debug/node]
 3: 0x1003337c9 node::NodePlatform::GetStackTracePrinter()::$_3::__invoke() [/Users/kevineady/Documents/Projects/node/out/Debug/node]
 4: 0x101d3d646 V8_Fatal(char const*, int, char const*, ...) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
 5: 0x101d3cfe5 v8::base::(anonymous namespace)::DefaultDcheckHandler(char const*, int, char const*) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
 6: 0x10094f29a v8::internal::GlobalHandles::NodeSpace<v8::internal::GlobalHandles::Node>::Acquire(v8::internal::Object) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
 7: 0x10094ee30 v8::internal::GlobalHandles::Create(v8::internal::Object) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
 8: 0x100041180 v8::PersistentBase<v8::Function>::New(v8::Isolate*, v8::Function*) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
 9: 0x1000410df void v8::PersistentBase<v8::Function>::Reset<v8::Function>(v8::Isolate*, v8::Local<v8::Function> const&) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
10: 0x10016e524 v8impl::(anonymous namespace)::ThreadSafeFunction::ThreadSafeFunction(v8::Local<v8::Function>, v8::Local<v8::Object>, v8::Local<v8::String>, unsigned long, void*, unsigned long, node_napi_env__*, void*, void (*)(napi_env__*, void*, void*), void (*)(napi_env__*, napi_value__*, void*, void*)) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
11: 0x100167fc1 v8impl::(anonymous namespace)::ThreadSafeFunction::ThreadSafeFunction(v8::Local<v8::Function>, v8::Local<v8::Object>, v8::Local<v8::String>, unsigned long, void*, unsigned long, node_napi_env__*, void*, void (*)(napi_env__*, void*, void*), void (*)(napi_env__*, napi_value__*, void*, void*)) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
12: 0x100167e7c napi_create_threadsafe_function [/Users/kevineady/Documents/Projects/node/out/Debug/node]
13: 0x11319951c Napi::ThreadSafeFunction Napi::ThreadSafeFunction::New<char const*, void, Napi::ThreadSafeFunction Napi::ThreadSafeFunction::New<char const*, void>(napi_env__*, Napi::Function const&, Napi::Object const&, char const*, unsigned long, unsigned long, void*)::'lambda'(Napi::Env, void*), void>(napi_env__*, Napi::Function const&, Napi::Object const&, char const*, unsigned long, unsigned long, void*, Napi::ThreadSafeFunction Napi::ThreadSafeFunction::New<char const*, void>(napi_env__*, Napi::Function const&, Napi::Object const&, char const*, unsigned long, unsigned long, void*)::'lambda'(Napi::Env, void*), void*, void (*)(napi_env__*, void*, void*)) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
14: 0x1131993bd Napi::ThreadSafeFunction Napi::ThreadSafeFunction::New<char const*, void, Napi::ThreadSafeFunction Napi::ThreadSafeFunction::New<char const*, void>(napi_env__*, Napi::Function const&, Napi::Object const&, char const*, unsigned long, unsigned long, void*)::'lambda'(Napi::Env, void*)>(napi_env__*, Napi::Function const&, Napi::Object const&, char const*, unsigned long, unsigned long, void*, Napi::ThreadSafeFunction Napi::ThreadSafeFunction::New<char const*, void>(napi_env__*, Napi::Function const&, Napi::Object const&, char const*, unsigned long, unsigned long, void*)::'lambda'(Napi::Env, void*)) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
15: 0x11319933d Napi::ThreadSafeFunction Napi::ThreadSafeFunction::New<char const*, void>(napi_env__*, Napi::Function const&, Napi::Object const&, char const*, unsigned long, unsigned long, void*) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
16: 0x1131992de Napi::ThreadSafeFunction Napi::ThreadSafeFunction::New<char const*>(napi_env__*, Napi::Function const&, Napi::Object const&, char const*, unsigned long, unsigned long) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
17: 0x1131982e5 Napi::ThreadSafeFunction Napi::ThreadSafeFunction::New<char const*>(napi_env__*, Napi::Function const&, char const*, unsigned long, unsigned long) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
18: 0x11319a23c ContentProviderWrapper::ContentProviderWrapper(Napi::Env, Napi::Object const&) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
19: 0x11319a025 ContentProviderWrapper::ContentProviderWrapper(Napi::Env, Napi::Object const&) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
20: 0x113199d57 std::__1::__unique_if<ContentProviderWrapper>::__unique_single std::__1::make_unique<ContentProviderWrapper, Napi::Env&, Napi::Object>(Napi::Env&, Napi::Object&&) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
21: 0x11319f948 ItemWrapper::getContentProvider() const::'lambda'(Napi::Env, Napi::Function)::operator()(Napi::Env, Napi::Function) const [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
22: 0x11319f8be decltype(std::__1::forward<ItemWrapper::getContentProvider() const::'lambda'(Napi::Env, Napi::Function)&>(fp)(std::__1::forward<Napi::Env>(fp0), std::__1::forward<Napi::Function>(fp0))) std::__1::__invoke<ItemWrapper::getContentProvider() const::'lambda'(Napi::Env, Napi::Function)&, Napi::Env, Napi::Function>(ItemWrapper::getContentProvider() const::'lambda'(Napi::Env, Napi::Function)&, Napi::Env&&, Napi::Function&&) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
23: 0x11319f837 void std::__1::__invoke_void_return_wrapper<void>::__call<ItemWrapper::getContentProvider() const::'lambda'(Napi::Env, Napi::Function)&, Napi::Env, Napi::Function>(ItemWrapper::getContentProvider() const::'lambda'(Napi::Env, Napi::Function)&, Napi::Env&&, Napi::Function&&) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
24: 0x11319f7e7 std::__1::__function::__alloc_func<ItemWrapper::getContentProvider() const::'lambda'(Napi::Env, Napi::Function), std::__1::allocator<ItemWrapper::getContentProvider() const::'lambda'(Napi::Env, Napi::Function)>, void (Napi::Env, Napi::Function)>::operator()(Napi::Env&&, Napi::Function&&) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
25: 0x11319e598 std::__1::__function::__func<ItemWrapper::getContentProvider() const::'lambda'(Napi::Env, Napi::Function), std::__1::allocator<ItemWrapper::getContentProvider() const::'lambda'(Napi::Env, Napi::Function)>, void (Napi::Env, Napi::Function)>::operator()(Napi::Env&&, Napi::Function&&) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
26: 0x1131998ea std::__1::__function::__value_func<void (Napi::Env, Napi::Function)>::operator()(Napi::Env&&, Napi::Function&&) const [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
27: 0x1131997e9 std::__1::function<void (Napi::Env, Napi::Function)>::operator()(Napi::Env, Napi::Function) const [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
28: 0x113199727 Napi::ThreadSafeFunction::CallJS(napi_env__*, napi_value__*, void*, void*) [/Users/kevineady/Documents/Projects/node-libzim/build/Debug/zim_binding.node]
29: 0x100170ef7 v8impl::(anonymous namespace)::ThreadSafeFunction::DispatchOne()::'lambda'(napi_env__*)::operator()(napi_env__*) const [/Users/kevineady/Documents/Projects/node/out/Debug/node]
30: 0x100170da0 void napi_env__::CallIntoModule<v8impl::(anonymous namespace)::ThreadSafeFunction::DispatchOne()::'lambda'(napi_env__*), void (napi_env__*, v8::Local<v8::Value>)>(v8impl::(anonymous namespace)::ThreadSafeFunction::DispatchOne()::'lambda'(napi_env__*)&&, void (&&)(napi_env__*, v8::Local<v8::Value>)) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
31: 0x100170b94 v8impl::(anonymous namespace)::ThreadSafeFunction::DispatchOne() [/Users/kevineady/Documents/Projects/node/out/Debug/node]
32: 0x100170883 v8impl::(anonymous namespace)::ThreadSafeFunction::Dispatch() [/Users/kevineady/Documents/Projects/node/out/Debug/node]
33: 0x1001706a4 v8impl::(anonymous namespace)::ThreadSafeFunction::AsyncCb(uv_async_s*) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
34: 0x10149cca8 uv__async_io [/Users/kevineady/Documents/Projects/node/out/Debug/node]
35: 0x1014bdebf uv__io_poll [/Users/kevineady/Documents/Projects/node/out/Debug/node]
36: 0x10149d34d uv_run [/Users/kevineady/Documents/Projects/node/out/Debug/node]
37: 0x100007ab6 node::SpinEventLoop(node::Environment*) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
38: 0x10029567f node::NodeMainInstance::Run(int*, node::Environment*) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
39: 0x100294f29 node::NodeMainInstance::Run(node::EnvSerializeInfo const*) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
40: 0x100157d4a node::Start(int, char**) [/Users/kevineady/Documents/Projects/node/out/Debug/node]
41: 0x1018f515e main [/Users/kevineady/Documents/Projects/node/out/Debug/node]
42: 0x7fff204a6621 start [/usr/lib/system/libdyld.dylib]

Unfortunately I am not too familiar with any of these V8 internals. @legendecas or @gabrielschulhof are you able to provide any insight? Or some steps I can take to further troubleshoot?

Thanks, Kevin

@kelvinhammond
Copy link
Contributor Author

Is there a more descriptive error available? I'm wondering if I can somehow work around this and its blocking a major release.

@KevinEady Any ideas?

From what I see, this happens sometimes when creating a threadsafe function and other times when doing some random operation :/

@kelvinhammond
Copy link
Contributor Author

Where can I find the debug version of node?

@JckXia
Copy link
Member

JckXia commented Jun 10, 2022

I assume Kevin built it directly from v16.5.0 source. https://github.com/nodejs/node/blob/master/BUILDING.md

@kelvinhammond
Copy link
Contributor Author

I tried working around this by calling setInterval and having an external queue that runs process in the setInterval callback. This did not work because there are errors when calling Napi::Persistent.

#0  0x0000555555edb760 in v8::internal::GlobalHandles::Create(v8::internal::Object) ()
#1  0x0000555555d7daf4 in v8::V8::GlobalizeReference(v8::internal::Isolate*, unsigned long*) ()
#2  0x0000555555b2b7b2 in v8impl::Reference::New(napi_env__*, v8::Local<v8::Value>, unsigned int, bool, void (*)(napi_env__*, void*, void*), void*, void*) ()
#3  0x0000555555b3354c in napi_create_reference ()
#4  0x00007fffe674b645 in Napi::Reference<Napi::Object>::New (value=..., initialRefcount=1) at /code/node-libzim/node_modules/node-addon-api/napi-inl.h:2869
#5  0x00007fffe672f5db in Napi::Persistent (value=...) at /code/node-libzim/node_modules/node-addon-api/napi-inl.h:3029
#6  0x00007fffe673d7ef in ItemWrapper::ItemWrapper (this=0x5555574fc330, env=..., item=..., queue=std::shared_ptr<ThreadQueue> (use count 143, weak count 0) = {...}) at ../src/writerItem.h:104
#7  0x00007fffe67ac9d4 in std::_Construct<ItemWrapper, Napi::Env&, Napi::Object, std::shared_ptr<ThreadQueue>&> (__p=0x5555574fc330) at /usr/include/c++/12.1.0/bits/stl_construct.h:119
#8  0x00007fffe67aa780 in std::allocator_traits<std::allocator<void> >::construct<ItemWrapper, Napi::Env&, Napi::Object, std::shared_ptr<ThreadQueue>&> (__p=0x5555574fc330) at /usr/include/c++/12.1.0/bits/alloc_traits.h:635
#9  0x00007fffe67a648c in std::_Sp_counted_ptr_inplace<ItemWrapper, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<Napi::Env&, Napi::Object, std::shared_ptr<ThreadQueue>&> (this=0x5555574fc320, __a=...) at /usr/include/c++/12.1.0/bits/shared_ptr_base.h:604
#10 0x00007fffe6796655 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<ItemWrapper, std::allocator<void>, Napi::Env&, Napi::Object, std::shared_ptr<ThreadQueue>&> (this=0x7fffffff9938, __p=@0x7fffffff9930: 0x0, __a=...) at /usr/include/c++/12.1.0/bits/shared_ptr_base.h:971
#11 0x00007fffe677d7a0 in std::__shared_ptr<ItemWrapper, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<void>, Napi::Env&, Napi::Object, std::shared_ptr<ThreadQueue>&> (this=0x7fffffff9930, __tag=...) at /usr/include/c++/12.1.0/bits/shared_ptr_base.h:1712
#12 0x00007fffe676a5d7 in std::shared_ptr<ItemWrapper>::shared_ptr<std::allocator<void>, Napi::Env&, Napi::Object, std::shared_ptr<ThreadQueue>&> (this=0x7fffffff9930, __tag=...) at /usr/include/c++/12.1.0/bits/shared_ptr.h:464
#13 0x00007fffe6759139 in std::make_shared<ItemWrapper, Napi::Env&, Napi::Object, std::shared_ptr<ThreadQueue>&> () at /usr/include/c++/12.1.0/bits/shared_ptr.h:1010
#14 0x00007fffe67416a6 in Creator::addItem (this=0x555557971230, info=...) at ../src/creator.h:189
#15 0x00007fffe67238cb in operator() (__closure=0x7fffffff9a90) at /code/node-libzim/node_modules/node-addon-api/napi-inl.h:170
#16 0x00007fffe6729a80 in Napi::details::WrapCallback<Napi::details::TemplatedInstanceCallback<Creator, &Creator::addItem>(napi_env, napi_callback_info)::<lambda()> >(struct {...}) (callback=...) at /code/node-libzim/node_modules/node-addon-api/napi-inl.h:79
#17 0x00007fffe672395b in Napi::details::TemplatedInstanceCallback<Creator, &Creator::addItem> (env=0x55555764cbe0, info=0x7fffffff9b20) at /code/node-libzim/node_modules/node-addon-api/napi-inl.h:167

Digging through some of the nodejs docs and code. It is my understanding that napi_env is just a pointer to a struct representing the environment. Does this value change for a script being run where there are no other workers?

@KevinEady
Copy link
Contributor

We discussed in the 10 June Node API meeting to try to make a standalone reproduction of this problem that would be a little easier to debug / troubleshoot. If we mimic the several-thousand creation of AsyncWorkers in our standalone test and we do not see the crash, we can (hopefully) assume there's something incorrect in your code that's causing this crash.

I will continue working on this standalone reproduction in order to try to get a similar crash or rule out core issues.

@kelvinhammond
Copy link
Contributor Author

For some reason my persistent api seems to not be storing my function correct.

I check and store the function here, it is called in the feed function:

class ContentProviderWrapper : public zim::writer::ContentProvider {
 public:
  explicit ContentProviderWrapper(Napi::Env env, const Napi::Object &provider)
      : MAIN_THREAD_ID{}, provider_{} {
    MAIN_THREAD_ID = std::this_thread::get_id();

    if (!provider.Get("feed").IsFunction()) {
      throw std::runtime_error("ContentProvider.feed must be a function.");
    }

    provider_ = Napi::Persistent(provider);
    feed_ = Napi::Persistent(provider.Get("feed").As<Napi::Function>());
    size_ = parseSize(provider_.Get("size"));

    tsfn_ = Napi::ThreadSafeFunction::New(env, Napi::Function::New<noopCB>(env),
                                          "getContentProvider.feed", 0, 1);
  }

    zim::Blob feed() override {
    if (MAIN_THREAD_ID == std::this_thread::get_id()) {
      // on main thread for some reason, do it here
      auto blobObj = feed_.Call(provider_.Value(), {});
      if (!blobObj.IsObject()) {
        throw std::runtime_error("ContentProvider.feed must return a blob");
      }
      auto blob = Napi::ObjectWrap<Blob>::Unwrap(blobObj.ToObject());
      return *(blob->blob());
    }

    // called from a thread
    std::promise<zim::Blob> promise;
    auto future = promise.get_future();

    auto callback = [&promise, this](Napi::Env env, Napi::Function) {
      auto blobObj = feed_.Call(provider_.Value(), {});
      if (!blobObj.IsObject()) {
        throw std::runtime_error("ContentProvider.feed must return a blob");
      }
      auto blob = Napi::ObjectWrap<Blob>::Unwrap(blobObj.ToObject());
      promise.set_value(*(blob->blob()));
    };

    auto status = tsfn_.BlockingCall(callback);
    if (status != napi_ok) {
      throw std::runtime_error("Error calling ThreadSafeFunction");
    }

    return future.get();
  }

GDB

(gdb) bt
#0  0x00007ffff70d136c in ?? () from /usr/lib/libc.so.6
#1  0x00007ffff7081838 in raise () from /usr/lib/libc.so.6
#2  0x00007ffff706b535 in abort () from /usr/lib/libc.so.6
#3  0x00007ffff73f1833 in __gnu_cxx::__verbose_terminate_handler () at /usr/src/debug/gcc/libstdc++-v3/libsupc++/vterminate.cc:95
#4  0x00007ffff73fdbfc in __cxxabiv1::__terminate (handler=<optimized out>) at /usr/src/debug/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:48
#5  0x00007ffff73fdc69 in std::terminate () at /usr/src/debug/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:58
#6  0x00007ffff73fdecd in __cxxabiv1::__cxa_throw (obj=<optimized out>, tinfo=0x7fffe67f42f8 <typeinfo for Napi::Error>, dest=0x7fffe673a948 <Napi::Error::~Error()>)
    at /usr/src/debug/gcc/libstdc++-v3/libsupc++/eh_throw.cc:98
#7  0x00007fffe67399a3 in Napi::Function::Call (this=0x7fffffffa030, recv=0x55555755d9a0, argc=0, args=0x0) at /home/kurt/code/node-libzim/node_modules/node-addon-api/napi-inl.h:2341
#8  0x00007fffe6739862 in Napi::Function::Call (this=0x7fffffffa030, recv=0x55555755d9a0, args=std::initializer_list of length 0) at /home/kurt/code/node-libzim/node_modules/node-addon-api/napi-inl.h:2306
#9  0x00007fffe673ae46 in Napi::FunctionReference::Call (this=0x555557d29088, recv=0x55555755d9a0, args=std::initializer_list of length 0)
    at /home/kurt/code/node-libzim/node_modules/node-addon-api/napi-inl.h:3297
#10 0x00007fffe67463b7 in ContentProviderWrapper::feed()::{lambda(Napi::Env, Napi::Function)#1}::operator()(Napi::Env, Napi::Function) const (__closure=0x7fffc0053700, env=...) at ../src/contentProvider.h:78
#11 0x00007fffe67ab026 in std::__invoke_impl<void, ContentProviderWrapper::feed()::{lambda(Napi::Env, Napi::Function)#1}&, Napi::Env, Napi::Function>(std::__invoke_other, ContentProviderWrapper::feed()::{lambda(Napi::Env, Napi::Function)#1}&, Napi::Env&&, Napi::Function&&) (__f=...) at /usr/include/c++/12.1.0/bits/invoke.h:61
#12 0x00007fffe6797b2c in std::__invoke_r<void, ContentProviderWrapper::feed()::{lambda(Napi::Env, Napi::Function)#1}&, Napi::Env, Napi::Function>(ContentProviderWrapper::feed()::{lambda(Napi::Env, Napi::Function)#1}&, Napi::Env&&, Napi::Function&&) (__fn=...) at /usr/include/c++/12.1.0/bits/invoke.h:111
#13 0x00007fffe6780645 in std::_Function_handler<void (Napi::Env, Napi::Function), ContentProviderWrapper::feed()::{lambda(Napi::Env, Napi::Function)#1}>::_M_invoke(std::_Any_data const&, Napi::Env&&, Napi::Function&&) (__functor=..., __args#0=..., __args#1=...) at /usr/include/c++/12.1.0/bits/std_function.h:290
#14 0x00007fffe6756143 in std::function<void (Napi::Env, Napi::Function)>::operator()(Napi::Env, Napi::Function) const (this=0x7fffc0053700, __args#0=..., __args#1=...)
    at /usr/include/c++/12.1.0/bits/std_function.h:591
#15 0x00007fffe673c57b in Napi::ThreadSafeFunction::CallJS (env=0x55555761e4b0, jsCallback=0x55555755d998, data=0x7fffc0053700) at /home/kurt/code/node-libzim/node_modules/node-addon-api/napi-inl.h:5780
#16 0x0000555555b49966 in ?? ()
#17 0x00007ffff7f3592d in ?? () from /usr/lib/libuv.so.1
#18 0x00007ffff7f51d0e in ?? () from /usr/lib/libuv.so.1
#19 0x00007ffff7f3b438 in uv_run () from /usr/lib/libuv.so.1
#20 0x0000555555aae516 in node::SpinEventLoop(node::Environment*) ()
#21 0x0000555555bd0155 in node::NodeMainInstance::Run(int*, node::Environment*) ()
#22 0x0000555555bd0720 in node::NodeMainInstance::Run(node::EnvSerializeInfo const*) ()
#23 0x0000555555b460f4 in node::Start(int, char**) ()
#24 0x00007ffff706c290 in ?? () from /usr/lib/libc.so.6
#25 0x00007ffff706c34a in __libc_start_main () from /usr/lib/libc.so.6
#26 0x0000555555aab095 in _start ()
(gdb) frame 10
#10 0x00007fffe67463b7 in ContentProviderWrapper::feed()::{lambda(Napi::Env, Napi::Function)#1}::operator()(Napi::Env, Napi::Function) const (__closure=0x7fffc0053700, env=...) at ../src/contentProvider.h:78
78            auto blobObj = feed_.Call(provider_.Value(), {});
(gdb) feed_.Type()
Undefined command: "feed_.Type".  Try "help".
(gdb) feed_.Value()
Undefined command: "feed_.Value".  Try "help".
(gdb) print feed_
$1 = {<Napi::Reference<Napi::Function>> = {_env = 0x55555761e4b0, _ref = 0x5555582500c0, _suppressDestruct = false}, <No data fields>}
(gdb) print feed_.Value()
$2 = {<Napi::Object> = {<Napi::Value> = {_env = 0x55555761e4b0, _value = 0x55555755d9d0}, <No data fields>}, <No data fields>}
(gdb) print feed_.Value().Type()
$3 = napi_object
(gdb) print feed_.Value().ToString().Utf8Value()
$4 = "[object Object]"
(gdb) print feed_.Value().ToString().Utf8Value()
$5 = "[object Object]"

Maybe this is what causes some of the issues downstream?

@kelvinhammond
Copy link
Contributor Author

@KevinEady Should the above be a new issue? I still can't figure out what I might be doing wrong.

@KevinEady
Copy link
Contributor

Hi @kelvinhammond , I have not been able to replicate the above error in my own libzim environment. Looking at the code, everythings looks correct:

Runtime type-checking

    if (!provider.Get("feed").IsFunction()) {
      throw std::runtime_error("ContentProvider.feed must be a function.");
    }

    provider_ = Napi::Persistent(provider);
    feed_ = Napi::Persistent(provider.Get("feed").As<Napi::Function>());

Function call:

      auto blobObj = feed_.Call(provider_.Value(), {});

This looks correct to me.

I am a little confused: You have some checks in ContentProviderWrapper::feed to check for main thread. Under what circumstances is this method ran in the main thread and non-main thread?

I'm looking at the stack traces I posted, and there seems to be a new TSFN creation called after CallJs, is there some path where a threadsafe's callback will create a new ThreadSafe function...?

@kelvinhammond
Copy link
Contributor Author

kelvinhammond commented Jun 24, 2022

@KevinEady

I have changed some things locally trying to find a work around to this issue, experimenting with things and changing the contentProvider from

class MyContentProvider {
  get contentProvider() { ... }
}

to a function getContentProvider() which returns an object containing size and feed.

For the ThreadSafeFunction, there is a path where a new one will be created during the callback. ItemWrapper::getContentProvider() will be possibly called from a thread which will then create a new ContentProviderWrapper which creates a new ThreadSafeFunction in its constructor.

@KevinEady
Copy link
Contributor

I have been pulling my hair out trying to understand what's going on. In a previous meeting, there was a suggestion that maybe there is some buffer overflow in the stack / heap, but I'm really not sure how to verify this.

I just wanted to see what would happen in a later version of node, and using 18.4.0 I get different stack trace errors, eg:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x60000028a3)
  * frame #0: 0x00000001123257b3 zim_binding.node`StringProvider::feed(this=0x0000600004eb36c1, info=0x00007ff7bfef6258) at contentProvider.h:151:30
    frame #1: 0x0000000112325c14 zim_binding.node`napi_value__* Napi::details::TemplatedInstanceCallback<StringProvider, &(StringProvider::feed(Napi::CallbackInfo const&))>(this=0x00007ff7bfef6328)::'lambda'()::operator()() const at napi-inl.h:170:12
    frame #2: 0x00000001123259d9 zim_binding.node`napi_value__* Napi::details::WrapCallback<napi_value__* Napi::details::TemplatedInstanceCallback<StringProvider, &(StringProvider::feed(Napi::CallbackInfo const&))>(napi_env__*, napi_callback_info__*)::'lambda'()>(callback=(unnamed class) @ 0x00007ff7bfef6328))>(napi_env__*, napi_callback_info__*)::'lambda'()) at napi-inl.h:79:12
    frame #3: 0x000000011232599d zim_binding.node`napi_value__* Napi::details::TemplatedInstanceCallback<StringProvider, &(StringProvider::feed(Napi::CallbackInfo const&))>(env=0x00006000033141e0, info=0x00007ff7bfef6398) at napi-inl.h:167:10
    frame #4: 0x0000000100090070 node`v8impl::(anonymous namespace)::FunctionCallbackWrapper::Invoke(v8::FunctionCallbackInfo<v8::Value> const&) + 128
    frame #5: 0x00000001002c57e8 node`v8::internal::FunctionCallbackArguments::Call(v8::internal::CallHandlerInfo) + 216
    frame #6: 0x00000001002c52bb node`v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, v8::internal::BuiltinArguments) + 555
    frame #7: 0x00000001002c4cfe node`v8::internal::Builtins::InvokeApiFunction(v8::internal::Isolate*, bool, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*, v8::internal::Handle<v8::internal::HeapObject>) + 718
    frame #8: 0x00000001003926eb node`v8::internal::(anonymous namespace)::Invoke(v8::internal::Isolate*, v8::internal::(anonymous namespace)::InvokeParams const&) + 2315
    frame #9: 0x0000000100391dc5 node`v8::internal::Execution::Call(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) + 213
    frame #10: 0x00000001002752cf node`v8::Function::Call(v8::Local<v8::Context>, v8::Local<v8::Value>, int, v8::Local<v8::Value>*) + 511
    frame #11: 0x000000010008afbb node`napi_call_function + 187
    frame #12: 0x00000001122dc4ec zim_binding.node`Napi::Function::Call(this=0x00007ff7bfef6b70, recv=0x0000000106029038, argc=0, args=0x0000000000000000) const at napi-inl.h:2339:24
    frame #13: 0x00000001122dc22e zim_binding.node`Napi::Function::Call(this=0x00007ff7bfef6b70, recv=0x0000000106029038, args=0x00007ff7bfef6b30) const at napi-inl.h:2306:10
    frame #14: 0x0000000112336b5f zim_binding.node`ContentProviderWrapper::feed(this=0x0000600004eb33c8, env=(_env = 0x00006000033141e0), (null)=Function @ 0x00007ff7bfef6b88)::'lambda'(Napi::Env, Napi::Function)::operator()(Napi::Env, Napi::Function) const at contentProvider.h:74:31
    frame #15: 0x0000000112336aa7 zim_binding.node`decltype(__f=0x0000600004eb33c8, __args=0x00007ff7bfef6d18, __args=0x00007ff7bfef6d08)::'lambda'(Napi::Env, Napi::Function)&>(fp)(static_cast<Napi::Env>(fp0), static_cast<Napi::Function>(fp0))) std::__1::__invoke<ContentProviderWrapper::feed()::'lambda'(Napi::Env, Napi::Function)&, Napi::Env, Napi::Function>(ContentProviderWrapper::feed()::'lambda'(Napi::Env, Napi::Function)&, Napi::Env&&, Napi::Function&&) at type_traits:3918:1
    frame #16: 0x0000000112336a37 zim_binding.node`void std::__1::__invoke_void_return_wrapper<void, true>::__call<ContentProviderWrapper::feed(__args=0x0000600004eb33c8, __args=0x00007ff7bfef6d18, __args=0x00007ff7bfef6d08)::'lambda'(Napi::Env, Napi::Function)&, Napi::Env, Napi::Function>(ContentProviderWrapper::feed()::'lambda'(Napi::Env, Napi::Function)&, Napi::Env&&, Napi::Function&&) at invoke.h:61:9
    frame #17: 0x00000001123369e7 zim_binding.node`std::__1::__function::__alloc_func<ContentProviderWrapper::feed()::'lambda'(Napi::Env, Napi::Function), std::__1::allocator<ContentProviderWrapper::feed()::'lambda'(Napi::Env, Napi::Function)>, void (Napi::Env, Napi::Function)>::operator(this=0x0000600004eb33c8, __arg=0x00007ff7bfef6d18, __arg=0x00007ff7bfef6d08)(Napi::Env&&, Napi::Function&&) at function.h:178:16
    frame #18: 0x0000000112335736 zim_binding.node`std::__1::__function::__func<ContentProviderWrapper::feed()::'lambda'(Napi::Env, Napi::Function), std::__1::allocator<ContentProviderWrapper::feed()::'lambda'(Napi::Env, Napi::Function)>, void (Napi::Env, Napi::Function)>::operator(this=0x0000600004eb33c0, __arg=0x00007ff7bfef6d18, __arg=0x00007ff7bfef6d08)(Napi::Env&&, Napi::Function&&) at function.h:352:12
    frame #19: 0x000000011233387f zim_binding.node`std::__1::__function::__value_func<void (Napi::Env, Napi::Function)>::operator(this=0x0000600004eb33c0, __args=0x00007ff7bfef6d18, __args=0x00007ff7bfef6d08)(Napi::Env&&, Napi::Function&&) const at function.h:505:16
    frame #20: 0x0000000112333786 zim_binding.node`std::__1::function<void (Napi::Env, Napi::Function)>::operator(this= Lambda in File contentProvider.h at Line 72, __arg=(_env = 0x00006000033141e0), __arg=Function @ 0x00007ff7bfef6d08)(Napi::Env, Napi::Function) const at function.h:1182:12
    frame #21: 0x00000001123336c4 zim_binding.node`Napi::ThreadSafeFunction::CallJS(env=0x00006000033141e0, jsCallback=0x0000000106029028, (null)=0x0000000000000000, data=0x0000600004eb33c0) at napi-inl.h:5780:5
    frame #22: 0x00000001000a32e8 node`v8impl::(anonymous namespace)::ThreadSafeFunction::AsyncCb(uv_async_s*) + 536
    frame #23: 0x0000000100b8a4cb node`uv__async_io + 379
    frame #24: 0x0000000100b9e2bb node`uv__io_poll + 2075
    frame #25: 0x0000000100b8aa51 node`uv_run + 433
    frame #26: 0x0000000100002f9f node`node::SpinEventLoop(node::Environment*) + 287
    frame #27: 0x00000001001156f1 node`node::NodeMainInstance::Run(int*, node::Environment*) + 97
    frame #28: 0x0000000100115363 node`node::NodeMainInstance::Run() + 115
    frame #29: 0x000000010009e681 node`node::Start(int, char**) + 401
    frame #30: 0x000000010529e51e dyld`start + 462
(lldb) fr s 0
frame #0: 0x00000001123257b3 zim_binding.node`StringProvider::feed(this=0x0000600004eb36c1, info=0x00007ff7bfef6258) at contentProvider.h:151:30
   148 	  Napi::Value feed(const Napi::CallbackInfo &info) {
   149 	    try {
   150 	      // TODO(kelvinhammond): need a way to move this to avoid copying
-> 151 	      auto blob = provider_->feed();
   152 	      return Blob::New(info.Env(), blob);
   153 	    } catch (const std::exception &err) {
   154 	      throw Napi::Error::New(info.Env(), err.what());

Unfortunately I had to wipe my machine and lost my debug build of node 16.15.0 ... I am going to build debug version of 18.14.0 and see if there are any significant changes

@KevinEady
Copy link
Contributor

In the 1 July 2022 Node API meeting, we discussed trying to use the MemorySanitizer and AddressSanitizer feature in Clang to find a possible memory issue. The documentation is here:

Also, node-gyp can also generate Xcode project via node-gyp configure -- -f xcode. XCode has some tools to analyze memory and address checking.

I will try using these above methods to help troubleshoot issues.

@kelvinhammond
Copy link
Contributor Author

@KevinEady Thank you.

@kelvinhammond
Copy link
Contributor Author

@KevinEady Have you had a chance to try them out? If so, what were your results?

@kelvinhammond kelvinhammond changed the title Segmentation faults - Incosistent errors with lots of async workers and objects Segmentation faults - Inconsistent errors with lots of async workers and objects Aug 16, 2022
@mgautierfr
Copy link

I am a little confused: You have some checks in ContentProviderWrapper::feed to check for main thread. Under what circumstances is this method ran in the main thread and non-main thread?

For the ThreadSafeFunction, there is a path where a new one will be created during the callback. ItemWrapper::getContentProvider() will be possibly called from a thread which will then create a new ContentProviderWrapper which creates a new ThreadSafeFunction in its constructor.

This may be fixed on libzim side with openzim/libzim#684
I think you can reduce the reproduction case by removing this check.

@KevinEady
Copy link
Contributor

Hi @kelvinhammond ,

Apologies for the delay. I was on a traveling vacation and should be getting back into the swing of things this weekend.

I see that openzim/node-libzim#72 was merged into master. Have you been able to identify the underlying cause, or just accepting its current implementation?

@zkk123123

This comment was marked as spam.

@kelson42
Copy link

I see that openzim/node-libzim#72 was merged into master. Have you been able to identify the underlying cause, or just accepting its current implementation?

Sorry, forgot to answer this question. We merged but we are exactly at the same stage and have open a dedicated ticket at openzim/node-libzim#80

@kelson42
Copy link

kelson42 commented Nov 6, 2022

Any update here? This problem is a blocker for future versions of Wikipedia scraper. Would a bounty helps to identify the bug?

@kelvinhammond
Copy link
Contributor Author

@KevinEady Sorry I missed your comment. No, I was not able to figure out what was causing this issue. I tried many different methods to work around it too but non of them worked because I need to use Napi::ThreadSafeFunction. The issue may be related to the creation and freeing of many of these objects.

@kelvinhammond
Copy link
Contributor Author

Just tested it against node v18. I'm still getting the below errors.

Thread 1 "node" received signal SIGSEGV, Segmentation fault.                                                           

0x0000555555f3a799 in v8::internal::GlobalHandles::Create(v8::internal::Object) ()

In the main branch for node-libzim (or libzim8-build-fix1 if its not been merged yet). I edit test/makeLargeZim.ts and uncomment the customItem creation code and change the await creator.addItem(stringItem); to be await creator.addItem(customItem);. This reliably reproduces the issue.

Errors in gdb when running it like so:
node-gyp rebuild --debug -v && npm run bundle && gdb -ex run --args node -r ts-node/register test/makeLargeZim.ts

@kelvinhammond
Copy link
Contributor Author

One of the inconsistent errors I keep seeing is that the ThreadSafeNativeFunction's callback function does not appear to store the function for some reason. Not sure if this is a node issue or node-addon-api issue.

Thread 1 "node" hit Breakpoint 1, ContentProviderWrapper::feed()::{lambda(Napi::Env, Napi::Function)#1}::operator()(Napi::Env, Napi::Function) const (__closure=0x7fffc8000bf0, env=..., feedFunc=...) at ../src/contentProvider.h:79
79            auto blobObj = feedFunc.Call(provider_.Value(), {});
(gdb) p feedFunc.Type()
$3 = napi_function
(gdb) p feedFunc.Value().Type()
Attempt to take address of value not located in memory.
(gdb) p feedFunc.Type()
$4 = napi_undefined

I get the following error if that function is called.

terminate called after throwing an instance of 'Napi::Error'
  what():  Illegal invocation

And I can't reliably check the function is valid without getting it's value first and then checking if the function is undefined for some reason. Also the function should not be undefined.

@KevinEady
Copy link
Contributor

Hi @kelvinhammond ,

Is this still an issue? I noticed there has been some work after merging the initial branch. I forked the repository and enabled the test-mem-leak script in GitHub Actions and all six jobs ran successfully: run 1, run 2. When I was looking at this issue before, it would consistently fail on every run, but that does not seem to be the case now.

@kelvinhammond
Copy link
Contributor Author

@KevinEady Yes, the code that triggers it is commented out in the test-mem-leak script. This is still an issue.

@KevinEady
Copy link
Contributor

So the latest update I have... Running with valgrind (action) displays some memory read errors but does complete successfully, whereas the non-valgrind (action) run gives the segfault.

The valgrind warnings are of this fashion:

Invalid read of size 2
   at 0x50A59E0: memmove (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x151850A3: ZSTD_compressStream2 (in /usr/lib/x86_64-linux-gnu/libzstd.so.1.4.8)
   by 0x1518537E: ZSTD_compressStream (in /usr/lib/x86_64-linux-gnu/libzstd.so.1.4.8)
   by 0x150DF524: ZSTD_INFO::stream_run_encode(ZSTD_INFO::stream_t*, CompStep) (compression.cpp:118)
   by 0x151025CB: zim::Compressor<ZSTD_INFO>::feed(char const*, unsigned long, CompStep) (compression.h:243)
   by 0x15101D4F: zim::writer::Cluster::_compress<ZSTD_INFO>()::{lambda(zim::Blob const&)#1}::operator()(zim::Blob const&) const (cluster.cpp:152)
   by 0x151036AA: void std::__invoke_impl<void, zim::writer::Cluster::_compress<ZSTD_INFO>()::{lambda(zim::Blob const&)#1}&, zim::Blob const&>(std::__invoke_other, zim::writer::Cluster::_compress<ZSTD_INFO>()::{lambda(zim::Blob const&)#1}&, zim::Blob const&) (invoke.h:61)
   by 0x151032F0: std::enable_if<std::__and_<std::is_void<void>, std::__is_invocable<zim::writer::Cluster::_compress<ZSTD_INFO>()::{lambda(zim::Blob const&)#1}&, zim::Blob const&> >::value, void>::type std::__invoke_r<void, zim::writer::Cluster::_compress<ZSTD_INFO>()::{lambda(zim::Blob const&)#1}&, zim::Blob const&>(zim::writer::Cluster::_compress<ZSTD_INFO>()::{lambda(zim::Blob const&)#1}&, zim::Blob const&) (invoke.h:154)
   by 0x15102C58: std::_Function_handler<void (zim::Blob const&), zim::writer::Cluster::_compress<ZSTD_INFO>()::{lambda(zim::Blob const&)#1}>::_M_invoke(std::_Any_data const&, zim::Blob const&) (std_function.h:290)
   by 0x151020F2: std::function<void (zim::Blob const&)>::operator()(zim::Blob const&) const (std_function.h:590)
   by 0x15101069: zim::writer::Cluster::write_data(std::function<void (zim::Blob const&)>) const (cluster.cpp:244)
   by 0x151007BD: zim::writer::Cluster::write_content(std::function<void (zim::Blob const&)>) const (cluster.cpp:124)
 Address 0x13db0760 is 0 bytes inside a block of size 31 free'd
   at 0x509EB6F: operator delete(void*, unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x14F16A8A: Blob::~Blob() (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
   by 0x14F1785C: Napi::ObjectWrap<Blob>::FinalizeCallback(napi_env__*, void*, void*) (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
   by 0xB321AA: node::CallbackQueue<void, node::Environment*>::CallbackImpl<node_napi_env__::CallFinalizer<true>(void (*)(napi_env__*, void*, void*), void*, void*)::{lambda(node::Environment*)#2}>::Call(node::Environment*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
   by 0xAF2E3B: node::Environment::RunAndClearNativeImmediates(bool) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
   by 0xAF3445: node::Environment::CheckImmediate(uv_check_s*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
   by 0x165ABB8: uv__run_check (loop-watcher.c:67)
   by 0x16532EF: uv_run (core.c:420)
   by 0xAAFA2C: node::SpinEventLoop(node::Environment*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
   by 0xBB11F3: node::NodeMainInstance::Run() (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
   by 0xB26C43: node::LoadSnapshotDataAndRun(node::SnapshotData const**, node::InitializationResult const*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
   by 0xB2A83E: node::Start(int, char**) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
 Block was alloc'd at
   at 0x509C013: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0xC25D7F: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::reserve(unsigned long) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
   by 0x14F1028B: Napi::String::Utf8Value[abi:cxx11]() const (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
   by 0x14F5371E: Blob::Blob(Napi::CallbackInfo const&) (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
   by 0x14F53B48: Napi::ObjectWrap<Blob>::ConstructorCallbackWrapper(napi_env__*, napi_callback_info__*) (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
   by 0xB10D7C: v8impl::(anonymous namespace)::FunctionCallbackWrapper::Invoke(v8::FunctionCallbackInfo<v8::Value> const&) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
   by 0xDB11AF: v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<true>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, v8::internal::BuiltinArguments) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
   by 0xDB17AE: v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
   by 0x16EF578: Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_BuiltinExit (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
   by 0x1670FEB: Builtins_JSBuiltinsConstructStub (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
   by 0xBADABE2: ???
   by 0x1671ADB: Builtins_JSEntryTrampoline (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)

Not sure if these memory warnings could cause the behavior witnessed with the segfaulting

@kelson42
Copy link

kelson42 commented Feb 3, 2023

@KevinEady Thank you. Any hint for you in this Valgrind report?

@KevinEady
Copy link
Contributor

KevinEady commented Feb 8, 2023

Hi @kelvinhammond , @kelson42 ,

I'm not too sure with the internal workings of the addon or libzim, but looking at the stacktrace it seems like a use-after-free error. The addon's Blob is freed through the garbage collector, running its destructor, decreasing the refcount of the std::shared_ptr<zim::Blob> blob_. It also appears that the shared_ptr's zim::Blob contents are being directly returned here. Since this doesn't make a copy of the shared pointer, is it possible that the zim::Blob ends up freeing before libzim has had a chance to act on it? Just a guess here.

I'm running valgrind again with a higher --num-callers=50 argument to try to get more information in the use-after-free stack. I'm getting some different -- yet similar -- valgrind warnings:

==2642== Invalid read of size 1
==2642==    at 0x50A5A10: memmove (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2642==    by 0x5BD972D: std::basic_streambuf<char, std::char_traits<char> >::xsputn(char const*, long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30)
==2642==    by 0x5BCB345: std::ostream::write(char const*, long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30)
==2642==    by 0x16598754: zim::writer::DefaultIndexData::initialize() const (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x165995B8: zim::writer::DefaultIndexData::hasIndexData() const (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x165BF840: zim::writer::IndexTask::run(zim::writer::CreatorData*) (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x1659C2D8: zim::writer::taskRunner(void*) (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x5B6B2B2: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30)
==2642==    by 0x5D4DB42: start_thread (pthread_create.c:442)
==2642==    by 0x5DDEBB3: clone (clone.S:100)
==2642==  Address 0x15a589a0 is 16 bytes inside a block of size 31 free'd
==2642==    at 0x509EB6F: operator delete(void*, unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2642==    by 0x16316A8A: Blob::~Blob() (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
==2642==    by 0x1631785C: Napi::ObjectWrap<Blob>::FinalizeCallback(napi_env__*, void*, void*) (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
==2642==    by 0xB321AA: node::CallbackQueue<void, node::Environment*>::CallbackImpl<node_napi_env__::CallFinalizer<true>(void (*)(napi_env__*, void*, void*), void*, void*)::{lambda(node::Environment*)#2}>::Call(node::Environment*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xAF2E3B: node::Environment::RunAndClearNativeImmediates(bool) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xAF3445: node::Environment::CheckImmediate(uv_check_s*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0x165ABB8: uv__run_check (loop-watcher.c:67)
==2642==    by 0x16532EF: uv_run (core.c:420)
==2642==    by 0xAAFA2C: node::SpinEventLoop(node::Environment*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xBB11F3: node::NodeMainInstance::Run() (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xB26C43: node::LoadSnapshotDataAndRun(node::SnapshotData const**, node::InitializationResult const*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xB2A83E: node::Start(int, char**) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0x5CE2D8F: (below main) (libc_start_call_main.h:58)
==2642==  Block was alloc'd at
==2642==    at 0x509C013: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2642==    by 0xC25D7F: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::reserve(unsigned long) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0x1631028B: Napi::String::Utf8Value[abi:cxx11]() const (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
==2642==    by 0x1635371E: Blob::Blob(Napi::CallbackInfo const&) (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
==2642==    by 0x16353B48: Napi::ObjectWrap<Blob>::ConstructorCallbackWrapper(napi_env__*, napi_callback_info__*) (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
....
==2642== Invalid read of size 1
==2642==    at 0x50A5A1D: memmove (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2642==    by 0x165DD5CE: ZSTD_compressStream2 (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x165DD96A: ZSTD_compressStream (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x16587B45: ZSTD_INFO::stream_run_encode(ZSTD_INFO::stream_t*, CompStep) (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x1659AD63: std::_Function_handler<void (zim::Blob const&), zim::writer::Cluster::_compress<ZSTD_INFO>()::{lambda(zim::Blob const&)#1}>::_M_invoke(std::_Any_data const&, zim::Blob const&) (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x16599C80: zim::writer::Cluster::write_data(std::function<void (zim::Blob const&)>) const (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x1659A05D: zim::writer::Cluster::write_content(std::function<void (zim::Blob const&)>) const (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x1659B06B: void zim::writer::Cluster::_compress<ZSTD_INFO>() (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x1659A39E: zim::writer::Cluster::close() (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x1659C2D8: zim::writer::taskRunner(void*) (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x5B6B2B2: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30)
==2642==    by 0x5D4DB42: start_thread (pthread_create.c:442)
==2642==    by 0x5DDEBB3: clone (clone.S:100)
==2642==  Address 0x1f0372b2 is 2 bytes inside a block of size 31 free'd
==2642==    at 0x509EB6F: operator delete(void*, unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2642==    by 0x16316A8A: Blob::~Blob() (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
==2642==    by 0x1631785C: Napi::ObjectWrap<Blob>::FinalizeCallback(napi_env__*, void*, void*) (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
==2642==    by 0xB321AA: node::CallbackQueue<void, node::Environment*>::CallbackImpl<node_napi_env__::CallFinalizer<true>(void (*)(napi_env__*, void*, void*), void*, void*)::{lambda(node::Environment*)#2}>::Call(node::Environment*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xAF2E3B: node::Environment::RunAndClearNativeImmediates(bool) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xAF3445: node::Environment::CheckImmediate(uv_check_s*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0x165ABB8: uv__run_check (loop-watcher.c:67)
==2642==    by 0x16532EF: uv_run (core.c:420)
==2642==    by 0xAAFA2C: node::SpinEventLoop(node::Environment*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xBB11F3: node::NodeMainInstance::Run() (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xB26C43: node::LoadSnapshotDataAndRun(node::SnapshotData const**, node::InitializationResult const*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xB2A83E: node::Start(int, char**) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0x5CE2D8F: (below main) (libc_start_call_main.h:58)
==2642==  Block was alloc'd at
==2642==    at 0x509C013: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2642==    by 0xC25D7F: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::reserve(unsigned long) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0x1631028B: Napi::String::Utf8Value[abi:cxx11]() const (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
==2642==    by 0x1635371E: Blob::Blob(Napi::CallbackInfo const&) (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
==2642==    by 0x16353B48: Napi::ObjectWrap<Blob>::ConstructorCallbackWrapper(napi_env__*, napi_callback_info__*) (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
...
==2642== Invalid read of size 1
==2642==    at 0x50A5A1D: memmove (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2642==    by 0x165DD5CE: ZSTD_compressStream2 (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x165DD96A: ZSTD_compressStream (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x16587B45: ZSTD_INFO::stream_run_encode(ZSTD_INFO::stream_t*, CompStep) (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x1659AD63: std::_Function_handler<void (zim::Blob const&), zim::writer::Cluster::_compress<ZSTD_INFO>()::{lambda(zim::Blob const&)#1}>::_M_invoke(std::_Any_data const&, zim::Blob const&) (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x16599C80: zim::writer::Cluster::write_data(std::function<void (zim::Blob const&)>) const (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x1659A05D: zim::writer::Cluster::write_content(std::function<void (zim::Blob const&)>) const (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x1659B06B: void zim::writer::Cluster::_compress<ZSTD_INFO>() (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x1659A39E: zim::writer::Cluster::close() (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x1659C2D8: zim::writer::taskRunner(void*) (in /home/runner/work/node-libzim/node-libzim/build/Release/libzim.so.8)
==2642==    by 0x5B6B2B2: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30)
==2642==    by 0x5D4DB42: start_thread (pthread_create.c:442)
==2642==    by 0x5DDEBB3: clone (clone.S:100)
==2642==  Address 0x1f0372b2 is 2 bytes inside a block of size 31 free'd
==2642==    at 0x509EB6F: operator delete(void*, unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2642==    by 0x16316A8A: Blob::~Blob() (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
==2642==    by 0x1631785C: Napi::ObjectWrap<Blob>::FinalizeCallback(napi_env__*, void*, void*) (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
==2642==    by 0xB321AA: node::CallbackQueue<void, node::Environment*>::CallbackImpl<node_napi_env__::CallFinalizer<true>(void (*)(napi_env__*, void*, void*), void*, void*)::{lambda(node::Environment*)#2}>::Call(node::Environment*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xAF2E3B: node::Environment::RunAndClearNativeImmediates(bool) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xAF3445: node::Environment::CheckImmediate(uv_check_s*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0x165ABB8: uv__run_check (loop-watcher.c:67)
==2642==    by 0x16532EF: uv_run (core.c:420)
==2642==    by 0xAAFA2C: node::SpinEventLoop(node::Environment*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xBB11F3: node::NodeMainInstance::Run() (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xB26C43: node::LoadSnapshotDataAndRun(node::SnapshotData const**, node::InitializationResult const*) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0xB2A83E: node::Start(int, char**) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0x5CE2D8F: (below main) (libc_start_call_main.h:58)
==2642==  Block was alloc'd at
==2642==    at 0x509C013: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2642==    by 0xC25D7F: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::reserve(unsigned long) (in /opt/hostedtoolcache/node/18.13.0/x64/bin/node)
==2642==    by 0x1631028B: Napi::String::Utf8Value[abi:cxx11]() const (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
==2642==    by 0x1635371E: Blob::Blob(Napi::CallbackInfo const&) (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
==2642==    by 0x16353B48: Napi::ObjectWrap<Blob>::ConstructorCallbackWrapper(napi_env__*, napi_callback_info__*) (in /home/runner/work/node-libzim/node-libzim/build/Release/zim_binding.node)
...

So it seems like there is some invalid thread safety regarding the Blob on the Node thread and the task runner on the thread created by libzim.

Does this provide any information / clues?

EDIT: Here's a link to the latest action run: https://github.com/KevinEady/node-libzim/actions/runs/4128364660/jobs/7132716816

@kelvinhammond
Copy link
Contributor Author

@KevinEady Thanks for taking a look at this. I'll take a look into this too and see what I can change and if it helps.

@mgautierfr
Copy link

@kelvinhammond I can help you on that if you want.
You can contact me on https://kiwixoffline.slack.com/

@kelvinhammond
Copy link
Contributor Author

@kelvinhammond I can help you on that if you want.
You can contact me on https://kiwixoffline.slack.com/

I can't actually connect to this. Can someone create me an account?

@kelson42
Copy link

kelson42 commented Feb 12, 2023

@kelvinhammond You can join through https://join.slack.com/t/kiwixoffline/shared_invite/zt-19s7tsi68-xlgHdmDr5c6MJ7uFmJuBkg

@kelson42
Copy link

@KevinEady Thank you very much for your last comment. This is pretty much the guess of the libzim dev @mgautierfr as well.

@kelvinhammond We have made an improvement with openzim/node-libzim#93. How is the status now of this ticket? Does it has evolved?

@kelvinhammond
Copy link
Contributor Author

No, this bug still exists. It usually happens when I create the thread safe native function.

@mhdawson
Copy link
Member

@KevinEady mentioned that there have been further changes merged since the last comment. Can this now be closed?

@kelson42
Copy link

AFAIK after talking to @kelvinhammond, the bug/problem is still there.

@gabrielschulhof
Copy link
Contributor

Since the PR mentioned above was merged, I performed the following steps on f61346fa4 of the main branch:

nvm use 16 # (v16.20.0)
npm install
npm run test-mem-leak
npm i --debug
npm run bundle # This put libzim.so.8 into Release rather than Debug, so
cp build/Release/libzim.so.8 build/Debug
node -r ts-node/register test/makeLargeZim.ts
nvm use 18 # (v18.15.0)
node -r ts-node/register test/makeLargeZim.ts

There was no segfault.

@kelvinhammond
Copy link
Contributor Author

I'll take a look when I'm available to do so.

@github-actions
Copy link
Contributor

github-actions bot commented Sep 1, 2023

This issue is stale because it has been open many days with no activity. It will be closed soon unless the stale label is removed or a comment is made.

@github-actions github-actions bot added the stale label Sep 1, 2023
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Oct 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

8 participants