codeql database create ignore minified files #75

Open
bananabr opened this issue Aug 12, 2021 · 5 comments

@bananabr

Description

When codeql database create is used on a codebase containing files whose names contain multiple dots (.), the extractors ignore those files.

Steps to reproduce

  1. Create a directory called test
  2. Create a file named foo.js inside the newly created directory
  3. Navigate to the directory and run the following command: codeql database create ../foo-db --language=javascript
  4. Verify that foo.js is extracted
  5. Rename the foo.js to foo.min.js
  6. Run the following command: codeql database create ../foo-min-db --language=javascript
  7. Verify that foo.min.js is NOT extracted

Expected behavior

When using codeql to assess production JavaScript code, minified files should be included in the database.

@github-actions github-actions bot added the CLI label Aug 12, 2021
@edoardopirovano
Contributor

Greetings, and thank you for reaching out to us with this issue. Having codeql database create ignore minified files is a deliberate design decision. If we did create a database with the minified files, the results we could obtain on them would be very difficult to interpret, since there wouldn't be meaningful line numbers or variable names. We'd end up outputting a lot of Code Scanning alerts that would be very difficult to fix and would not provide much value to users, which is something we want to avoid.

Ideally, you would run our tool on the non-minified version of your JavaScript code where the alerts that get produced will be much easier to understand and fix. If for some reason you do not have access to the non-minified code, I would suggest you run your JavaScript files through a pretty-printer to get back some meaningful line numbers and then use CodeQL on the pretty-printed files (making sure they are named .js rather than .min.js so that they get picked up).
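The rename step suggested above could be sketched roughly as follows. This is only an illustration: stage_minified is a hypothetical helper (not part of CodeQL), the staging directory layout is an assumption, and the pretty-printing step assumes Prettier as one possible pretty-printer and runs only if a prettier binary happens to be on PATH.

```shell
#!/bin/sh
# Hypothetical helper, not part of CodeQL:
#   stage_minified SRC_DIR DEST_DIR
# Copies every *.min.js file under SRC_DIR into DEST_DIR, keeping the
# directory layout but dropping ".min" so the file is named *.js.
stage_minified() {
    src=${1%/}    # strip one trailing slash, if any
    dest=$2
    # Each path arrives on its own line and is always quoted,
    # so file names containing spaces are handled.
    find "$src" -type f -name '*.min.js' | while IFS= read -r file; do
        rel=${file#"$src"/}               # path relative to SRC_DIR
        out="$dest/${rel%.min.js}.js"     # foo.min.js -> foo.js
        mkdir -p "$(dirname "$out")"
        cp "$file" "$out"
        # Optional and assumed: pretty-print in place when Prettier is installed.
        if command -v prettier >/dev/null 2>&1; then
            prettier --write "$out" >/dev/null 2>&1 || true
        fi
    done
}
```

For example, running stage_minified . ../staged and then running codeql database create from inside ../staged would analyze the renamed copies while leaving the original files untouched.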

@bananabr
Author

That is basically what I've been doing.

Here is a one-liner to rename the files in case anyone needs the same thing.

find "$(pwd)" -iname "*.js" -exec sh -c 'for f; do d=$(dirname "$f"); b=$(basename "$f"); mv "$f" "$d/$(printf %s "$b" | sed "s/\./_/g").js"; done' sh {} +

Thanks,

@edoardopirovano
Contributor

Thanks for sharing that one-liner! In terms of supporting this use case more directly, while we definitely don't want to always index .min.js files, perhaps the JavaScript extractor could be configured to do this via some suitable configuration option (maybe an environment variable). I'll leave it to the @github/codeql-javascript team to decide whether this is something they would like to implement or if this issue should be closed as something we won't do.

@adityasharad
Contributor

adityasharad commented Aug 16, 2021

We are working on ways to pass configuration options to extractors in a more organised fashion. As a current workaround, can you try setting the environment variable LGTM_INDEX_FILTERS="include:*.min.js" in the shell before running codeql database create? This will tell the CodeQL JavaScript autobuilder not to exclude those files by default. (source)

(I second @edoardopirovano's concerns about the quality of alerts you may find in minified code, but this should get the behaviour you're asking for.)
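The environment-variable workaround described above might look like this in a shell session. The variable name and value are taken from the comment as written; the database path reuses the one from the reproduction steps, and the check for the codeql binary is an added guard for illustration only.

```shell
# Workaround from this thread: ask the JavaScript autobuilder
# not to exclude *.min.js files.
export LGTM_INDEX_FILTERS="include:*.min.js"

# Only attempt the build when the CodeQL CLI is actually on PATH.
if command -v codeql >/dev/null 2>&1; then
    codeql database create ../foo-min-db --language=javascript
else
    echo "codeql not found on PATH; skipping database creation" >&2
fi
```

Because the variable is exported, it stays set for later codeql invocations in the same shell; run unset LGTM_INDEX_FILTERS to return to the default behaviour.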

@bananabr
Author

I have successfully found real-world vulnerabilities in public VDPs by inspecting minified files with CodeQL, running them through a prettifier first and renaming the .min files. If CodeQL could somehow perform those steps itself, that would be great for black/gray-box assessments.
