Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segmentation Fault when using a custom item #80

Open
kelvinhammond opened this issue Aug 16, 2022 · 20 comments
Open

Segmentation Fault when using a custom item #80

kelvinhammond opened this issue Aug 16, 2022 · 20 comments
Assignees
Milestone

Comments

@kelvinhammond
Copy link
Collaborator

In libzim v8 when using a custom item and adding a lot of objects to a zim file it will throw a segmentation fault error after about 10,000 items have been inserted.

The error has been reported to nodejs/node-addon-api below.

It appears to be related to:

  • the persistent object reference being lost or garbage collected before the object and function can be used
  • some other unknown issue involving the global state within nodejs.
  • it may also be related to the creation of ThreadSafeFunction(s) created within a ThreadSafeFunction context but this should be OK and does not crash until after about 10k or more iterations.

To reproduce the error modify tests/makeLargeZim.ts file to use the custom item and run the test, it should fail.

The issue has been reported to nodejs/node-addon-api#1174 .

Related to #72

@kelson42
Copy link
Contributor

kelson42 commented Dec 1, 2022

@kelvinhammond One hour ago, I have updated the build script to use libzim 8.1.0 which has been released yesterday (and fix/update the CI). I have also run (for the first time) $ npm run test-mem-leak twice on my Ubuntu 22.04 + Node.js 18.12.1 and... all works fine. It ran twice up to the end without any error message. Maybe it would be worth it you try again?

@kelvinhammond
Copy link
Collaborator Author

kelvinhammond commented Dec 1, 2022

@kelson42 Nice, will try again. I wonder if they fixed it! Thank you.

I should have a bit more time soon (hopefully) to take a look at this and mwoffliner during the holidays.

@kelson42
Copy link
Contributor

kelson42 commented Dec 5, 2022

@kelvinhammond Thx, would be really helpful to know soon if this critical bug is really fixed or if I just don't run the reproduction case steps properly.

@kelvinhammond
Copy link
Collaborator Author

@kelvinhammond Thx, would be really helpful to know soon if this critical bug is really fixed or if I just don't run the reproduction case steps properly.

@kelson42 Try it with these changes
Edit test/makeLargeZim.ts
Uncomment the const customItem code and have await creator.addItem(stringItem); be replaced by await creator.addItem(customItem); and see if that works for you.

The diff looks like this.

diff --git a/test/makeLargeZim.ts b/test/makeLargeZim.ts
index b5db866..e87979e 100644
--- a/test/makeLargeZim.ts
+++ b/test/makeLargeZim.ts
@@ -43,13 +43,13 @@ console.log(`Making ZIM file with [${numArticles}] articles`);
       data,
     );

-    /*
     const customItem = { // custom item
       path: url,
       mimeType: "text/html",
       title: title,
       hints: {FRONT_ARTICLE: 1},
       getContentProvider() { // custom content provider
+        return stringItem.getContentProvider();
         let sent = false;
         return {
           size: data.length,
@@ -63,9 +63,9 @@ console.log(`Making ZIM file with [${numArticles}] articles`);
         };
       },
     };
-    */

-    await creator.addItem(stringItem);
+    //await creator.addItem(stringItem);
+    await creator.addItem(customItem);
   }

   console.log(`Finalising...`);

@kelvinhammond
Copy link
Collaborator Author

kelvinhammond commented Dec 5, 2022

I tested it with node v19.1.0

I get

terminate called after throwing an instance of 'Napi::Error'

  what():  Illegal invocation

#8  0x00007ffff031a5c3 in Napi::Function::Call (this=0x7fffffffa0c0, recv=0x555555632990, argc=0, args=0x0) at 
/node-libzim/node_modules/node-addon-api/napi-inl.h:2341

#9  0x00007ffff031a482 in Napi::Function::Call (this=0x7fffffffa0c0, recv=0x555555632990, args=std::initializer_list of length 0) at /node-libzim/node_modules/node-addon-api/napi-inl.h:2306

#10 0x00007ffff0326f92 in ContentProviderWrapper::feed()::{lambda(Napi::Env, Napi::Function)#1}::operator()(Napi::Env, Napi::Function) const (__closure=0x7fffc40245f0, env=..., feedFunc=...) at ../src/contentProvider.h:79

Looking into this more.

@kelvinhammond
Copy link
Collaborator Author

@kelson42 It does work, just not with custom items.

@kelson42
Copy link
Contributor

kelson42 commented Jan 1, 2023

@kelvinhammond Sorry for the big delay in answering. I have involved @mgautierfr who is the lead developer of the libzim. Could you please elaborate what are these "custom items". This is not a concept of the libzim and therefore this is a bit puzzling. One important other point is if we really need them (does that impact MWoffliner)?

@kelson42
Copy link
Contributor

kelson42 commented Jan 8, 2023

BTW @uriesk, this is the bug which stops us to use latest libzim in MWoffliner (we are stuck with libzim6 which is18 months old). If you like to be challenged, this might be one for you ;)

@kelvinhammond
Copy link
Collaborator Author

@kelson42 In libzim you can create Item(s) which have each have a ContentProvider, some are provided by default such as StringItem and FileItem but you can define and implement your own. In node-libzim StringItem and FileItem work fine but when you try to define your own it fails because of an issue (node-addon-api#1174) in node or node-addon-api (napi) when creating a large file (sometimes a smaller one too)

@kelson42
Copy link
Contributor

kelson42 commented Jan 8, 2023

@kelvinhammond But we don't need this customItem in MWoffliner? You confirm?

@kelvinhammond
Copy link
Collaborator Author

@kelvinhammond But we don't need this customItem in MWoffliner? You confirm?

I'll test a large file using only StringItem and FileItem to confirm it works but yes, mwoffliner does not use the custom item and content providers

@kelson42
Copy link
Contributor

kelson42 commented Jan 8, 2023

@kelvinhammond This would be a good news. Waiting to your confirmation. If confirmed, we could consider to release with this bug/limitation.

@uriesk
Copy link

uriesk commented Jan 10, 2023

BTW @uriesk, this is the bug which stops us to use latest libzim in MWoffliner (we are stuck with libzim6 which is18 months old). If you like to be challenged, this might be one for you ;)

sorry, it's out of my league.
I am not yet familiar enough with node-addon-api.

@kelson42
Copy link
Contributor

@kelson42 In libzim you can create Item(s) which have each have a ContentProvider, some are provided by default such as StringItem and FileItem but you can define and implement your own. In node-libzim StringItem and FileItem work fine but when you try to define your own it fails because of an issue (node-addon-api#1174) in node or node-addon-api (napi) when creating a large file (sometimes a smaller one too)

@mgautierfr Do you confirm this (not sure this is exactly how things work at the libzim)? Maybe there is a few things to know and check if using a custom provider?

@mgautierfr
Copy link

One important thing to remember is that the ContentProvider returned by getContentProvider must stay valid after libzim has release the shared_ptr<Item> passed in creator::addItem.

libzim will use the shared_ptr<Item> to get a ContentProvider but will release the shared ptr before using the ContentProvider (potentially in another thread). So if ContentProvider is referencing the data stored in the item. We may have a situation where we free the item (and so, its data) and we have a ContentProvider referencing a freed data.
I don't know how item_.Value() behaves. But if it return a reference/pointer to the value we are in this situation.
ContentProvider must prevent the value it will return from being freed. Either by owning the data, copying it, keeping a reference to the item containing the data, ...

I don't know how napi wrap things. But here : https://github.com/openzim/node-libzim/blob/main/test/makeLargeZim.ts#L54-L62, your custom item return a contentProvider which contain a closure as feed method.
So you must be sure that the referenced data (literally data) is not freed when libzim is calling feed method.
As libzim keep a reference to the contentProvider without keeping a reference data maybe NAPI think that nothing reference the data variable and free it.
And depending of how closures work in js, it can be worse as all contentprovider may share the same data.
At least in python, this is the case (see my own old answer in stackoverflow at this subject : https://stackoverflow.com/questions/21449176/issue-with-generating-ttk-checkboxes-in-loops-and-passing-arguments/21459922#21459922)

Maybe you can try this custom item which not create closure (but copy (reference to?) data):

const customItem = { // custom item
  path: url,
  mimeType: "text/html",
  title: title,
  hints: {FRONT_ARTICLE: 1},
  item_data: data, // reference directly the data without closure
  getContentProvider() { // custom content provider
    return {
      size: this.item_data.length,
      data_to_send: this.item_data, // same here, we don't want the feed() method being a closure accessing the `item_data` in customItem which will be deleted.
      sent = false,
      feed() {
        if(!this.sent) {
          this.sent = true;
          return new Blob(this.data_to_send);
        }
        return new Blob();
      }
    };
  },
};

@kelson42
Copy link
Contributor

@kelvinhammond Does that help? Do we have anything preventing the data to be freed?

@kelvinhammond
Copy link
Collaborator Author

@kelson42 Sadly, it did not. I store a reference to the objects and functions in node so it "should" keep a copy of it in memory without garbage collecting it. The error also seems to happen (most of the time) when the object is creating a thread safe function callback.

As for the StringItem, that works, I haven't verified FileItem yet (I need to write a bit of code to test this too in a large tmp folder for each article we store). StringItem and FileItem work because they don't encounter the bug calling to getContentProvider and feed because I use an internally stored StringItem and FileItem smart ptr. I can test FileItem to be sure but I think it's ready to go. Is there a way to test mwoffliner with a large enough (doesn't need to be huge) archive to verify it works?

@kelson42
Copy link
Contributor

kelson42 commented Jan 11, 2023

Is there a way to test mwoffliner with a large enough (doesn't need to be huge) archive to verify it works?

You mean with a big wikipedia and your PR?

@kelson42
Copy link
Contributor

kelson42 commented Mar 17, 2023

Considering that this bug does not impact our current scrapers AFAIK, I propose to release 3.0.0 with this bug, so we can move forward and benefit from all the work already done.

@kelvinhammond @mgautierfr Any thpughts?

@kelson42 kelson42 removed this from the 3.0.0 milestone May 4, 2023
@kelson42 kelson42 added this to the 3.1.0 milestone May 4, 2023
@kelvinhammond
Copy link
Collaborator Author

Considering that this bug does not impact our current scrapers AFAIK, I propose to release 3.0.0 with this bug, so we can move forward and benefit from all the work already done.

@kelvinhammond @mgautierfr Any thpughts?

Sure, release it but include a note in the change notes please.

@kelson42 kelson42 modified the milestones: 3.1.0, 3.2.0 Dec 3, 2023
@kelson42 kelson42 modified the milestones: 3.2.0, 3.3.0 Dec 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants