handle overly long distribution names before the database rejects them #482

Open
rjbs opened this issue Apr 28, 2024 · 5 comments
Labels
bug indexer How we index uploads

Comments

@rjbs
Collaborator

rjbs commented Apr 28, 2024

Someone uploaded this file:

CGI-Enurl-1.08-withoutworldwriteables-generated-by-the-silly-pseudosecurity-measure-devised-to-piss-off-nonunixers.tar.gz

I guess they wanted to annoy the PAUSE maintainers, and only 15 years later, it paid off: the database is now complaining that the full path is too long to fit in the 128-character database column.

We should detect this in mldistwatch, in PAUSE::dist->mtime_ok, before sending it to the database.
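
As a rough illustration of the kind of pre-check proposed here (not PAUSE's actual code, which is Perl), the following Python sketch rejects a path before any database write. The 128-character limit is the figure quoted in this thread rather than one read from the schema, and the `J/JE/JENDA/...` path form, `MAX_DIST_LEN`, and `dist_path_fits` are assumptions made for the example.

```python
# Hypothetical sketch: names and the stored path form are assumptions, not PAUSE code.
MAX_DIST_LEN = 128  # column width as quoted in this thread, not read from the schema


def dist_path_fits(dist_path: str, limit: int = MAX_DIST_LEN) -> bool:
    """Return True if the author-relative path fits within the column limit."""
    return len(dist_path) <= limit


# The upload from this issue, in the assumed author-relative form.
long_path = (
    "J/JE/JENDA/CGI-Enurl-1.08-withoutworldwriteables-generated-by-the-silly-"
    "pseudosecurity-measure-devised-to-piss-off-nonunixers.tar.gz"
)
short_path = "J/JE/JENDA/CGI-Enurl-1.08.tar.gz"

for path in (long_path, short_path):
    verdict = "ok" if dist_path_fits(path) else "too long, skip before touching the database"
    print(f"{len(path):4d} chars: {verdict}")
```

Checking the length up front lets the indexer skip or report the upload once, instead of letting the insert fail on every run.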

@rjbs rjbs added bug indexer How we index uploads labels Apr 28, 2024
@rspier
Collaborator

rspier commented May 3, 2024

What do we do with that one bad file?

Ideas:
a) Delete it.
b) Rename it to something shorter. (Say, remove "pseudosecurity".)
c) Make the database column 4 bytes longer to accommodate it.

a) is tempting.
b) is easy.
c) is easy, but probably the wrong solution.

This is important to fix because it's causing mldistwatch to trigger an email on every run.

@andk Any thoughts?

@rjbs
Collaborator Author

rjbs commented May 3, 2024

You didn't ask me, but I think we should delete it. It's quite old and has never been successfully indexed.

@neilb
Collaborator

neilb commented May 3, 2024

Also wasn't asked, but given (a) the author (JENDA) released this in 2009, and (b) hasn't released anything since 2014, I think it's fine to be pragmatic and just delete it.

Furthermore, he released CGI-Enurl-1.08.tar.gz and CGI-Enurl-1.08-withoutworldwriteables-generated-by-the-silly-pseudosecurity-measure-devised-to-piss-off-nonunixers.tar.gz on the same day, and they have exactly the same content. But even the regular 1.08 release wasn't indexed – it's still 1.07 in the index.

@rspier
Collaborator

rspier commented May 3, 2024

As an experiment, I have moved the file from authors/id/J/JE/JENDA to /data/pause/attic. (We should delete it or move it back before closing this issue.)

This appears to have eliminated the

DBD::mysql::db do failed: Data too long for column 'dist' at row 1

log as expected.

@rjbs
Collaborator Author

rjbs commented May 3, 2024

If you end up moving it back, you'll want to restore the mtime, which mldistwatch cares about.
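
One way to avoid losing the mtime when shuffling the file around is to record it before the move and reapply it afterwards. The Python sketch below is only an illustration under that assumption; `move_preserving_mtime` is a hypothetical helper, not part of PAUSE, and the paths in the usage note are placeholders.

```python
import os
import shutil


def move_preserving_mtime(src: str, dst: str) -> None:
    """Move src to dst, then reapply the original access and modification times."""
    st = os.stat(src)  # capture timestamps before the file is touched
    shutil.move(src, dst)
    # Reapply explicitly, so the result does not depend on how the move was done.
    os.utime(dst, ns=(st.st_atime_ns, st.st_mtime_ns))
```

Usage would look like `move_preserving_mtime("/path/to/attic/file.tar.gz", "/path/to/authors/id/file.tar.gz")`. If the original mtime has already been lost, it would have to be set by hand with `os.utime` from a known-good value.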
