Can the station registry accept UTF-8 in a station description? #7
Comments
i just tried an alter on the so far i do not see more messages like this in error.log, but we'll see...
here is the table before and after, and the alter command.
actually, that sorry, but character encodings are one of my many weaknesses
the latin1 is there apparently because i did not specify a character set when i originally created the tables eons ago, so it defaulted to latin1. apparently mysql 8 (running on weewx.com as of this writing, nov 2022) defaults to utf8 (not sure whether that is 3-byte or 4-byte). i'm not crazy about altering everything in the database. description should obviously accept non-latin characters, but should any other fields?
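On the 3-byte versus 4-byte question: MySQL's `utf8mb3` stores at most three bytes per character, which covers only the Basic Multilingual Plane, while `utf8mb4` stores up to four and so also covers emoji and other supplementary characters. As far as I know, MySQL 8.0 uses `utf8mb4` as its server default, but the effective setting on a given server is worth checking. A rough Python sketch for seeing which descriptions would need the 4-byte variant (the sample strings are made up, not taken from the registry):

```python
def max_utf8_bytes_per_char(text: str) -> int:
    """Largest number of bytes any single character of text needs in UTF-8."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)


def needs_four_byte_utf8(text: str) -> bool:
    """True if text contains a code point above U+FFFF (4 bytes in UTF-8)."""
    return any(ord(ch) > 0xFFFF for ch in text)


# Hypothetical descriptions, for illustration only.
samples = ["Backyard station", "Météo Genève", "東京", "Sunny 🌞"]

for desc in samples:
    print(desc, max_utf8_bytes_per_char(desc), needs_four_byte_utf8(desc))
```

Only the last sample contains a character outside the Basic Multilingual Plane, so it is the only one a 3-byte column could not hold.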
Interesting. Internally, I'm sure MySQL stores it as a byte string. However, it appears that the database engine is also enforcing character type safety. That is, if you declare the field as Latin-1, then try to store something that uses non-Latin-1 characters, it will raise an error. When it comes time to print the field out, the MySQL client will use the encoding to figure out which code point to use.

I believe that UTF-8 domain names are valid these days, although most DNS servers don't support them. To future-proof things, I would use utf-8 encoding for The other fields should be fine.

It's an ancient article, but I recommend Joel Spolsky's The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets. It set me straight some 20 years ago!
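The type-safety point can be illustrated outside the database. This is only a sketch in Python, not what MySQL or the registry's Perl code actually does: it checks whether a string can be represented in a given character set at all. Python's `latin-1` codec is ISO 8859-1, whereas MySQL's `latin1` is closer to Windows-1252, so treat the result as an approximation. The sample descriptions are invented.

```python
def fits_charset(text: str, charset: str) -> bool:
    """Return True if every character of text can be encoded in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical station descriptions, for illustration only.
plain = "Backyard station"
accented = "Wetterstation Zürich"   # ü has a Latin-1 code point
cyrillic = "Метеостанция"           # no Latin-1 representation

for desc in (plain, accented, cyrillic):
    print(desc, fits_charset(desc, "latin-1"), fits_charset(desc, "utf-8"))
```

The accented Western European text fits in Latin-1, but the Cyrillic text does not, which is the kind of value a Latin-1 column would reject or mangle. All three encode cleanly as UTF-8.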
In `/var/log/nginx/error.log`, I am seeing a lot of errors of the sort
What they all seem to have in common is the use of UTF-8 in the field `description`. Perl is a mystery to me, so I can't tell if it's rejecting UTF-8 characters.