Re-add 'Run' Queue #6

Open
roddtalebi opened this issue Aug 9, 2018 · 2 comments
Labels
enhancement New feature or request

Comments

@roddtalebi
Member

See original issue #3

copy pasted from there...

Future work:

Better understand Cloud SQL to see how much of a 'delay' we can expect before a second sql-client instance can read a point added by the first sql-client instance. From that, decide whether we can re-add the 'run' queue functionality or whether it has to be dropped completely.

Problem:
The 'main' sql-client adds points to the table and to the 'main' queue. If a worker immediately picks a point off the 'main' queue and adds it to the 'run' queue so that its state in the table can change from state='queue' to state='running', the 'run' sql-client often can't find any point in the table with that hash...likely because it isn't seeing a 'new-enough' view of the table.
We even added 'sleep' statements so that it would check the table again for the point after a quick rest, but that didn't help. Something we don't understand is going on with how these instances communicate with the server and which view of the table they see.
I suppose the easy solution would be to have the same 'main' client also act as the 'run' client, but with thousands of nodes that could become a bottleneck.
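
A minimal sketch of one mechanism that would produce exactly this symptom, assuming (not confirmed for our Cloud SQL setup) that the 'run' client is sitting inside a transaction it never ends. It uses the stdlib `sqlite3` module in WAL mode as a stand-in for Cloud SQL; the `points` table, the hash `abc123`, and the helper names are made up for illustration.

```python
import os
import sqlite3
import tempfile
import time


def connect(path):
    # isolation_level=None: the sqlite3 module leaves BEGIN/COMMIT to us.
    conn = sqlite3.connect(path, isolation_level=None)
    # WAL mode lets a reader keep its snapshot while a writer commits.
    conn.execute("PRAGMA journal_mode=WAL").fetchone()
    return conn


def lookup(conn, point_hash):
    # Return the state stored for this hash, or None if the row is not visible.
    rows = conn.execute(
        "SELECT state FROM points WHERE hash = ?", (point_hash,)
    ).fetchall()
    return rows[0][0] if rows else None


def demo():
    path = os.path.join(tempfile.mkdtemp(), "points.db")
    main_client = connect(path)
    run_client = connect(path)
    main_client.execute("CREATE TABLE points (hash TEXT PRIMARY KEY, state TEXT)")

    # The 'run' client reads once inside a transaction and leaves it open.
    run_client.execute("BEGIN")
    lookup(run_client, "warmup")

    # The 'main' client adds a point; with no open transaction this commits at once.
    main_client.execute("INSERT INTO points VALUES ('abc123', 'queue')")

    stale = lookup(run_client, "abc123")        # still the old snapshot
    time.sleep(0.1)                             # sleeping does not change the snapshot
    after_sleep = lookup(run_client, "abc123")

    run_client.execute("COMMIT")                # end the transaction
    fresh = lookup(run_client, "abc123")        # new read, new view of the table

    main_client.close()
    run_client.close()
    return stale, after_sleep, fresh
```

Here `demo()` returns `(None, None, 'queue')`: checking again after a sleep inside the same transaction never finds the point, but the first read after the transaction ends does. Whether our 'run' client really behaves this way against Cloud SQL is still an open question.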

@roddtalebi
Member Author

https://stackoverflow.com/questions/36258154/sqlalchemy-returns-different-results-of-the-select-command-query-all

You have one SQLAlchemy session per Worker and probably use 2 Workers with uwsgi. SQLAlchemy caches results per session, so session of worker 1 returns the new results, because you have added the records with this worker, but the session of worker 2 is not updated and returns only the old records.

Solution: don't create global sessions, but a new session for each request.

WHYYYYY would they do that??? but okay fine
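
Rough sketch of what "a new session for each request" could look like for the 'run' worker, again with stdlib `sqlite3` standing in for SQLAlchemy and Cloud SQL. `mark_running`, the `points` table, and the retry parameters are hypothetical names, not code from this repo; with SQLAlchemy the analogous move would be creating the Session inside the loop instead of globally.

```python
import os
import sqlite3
import tempfile
import time
from contextlib import closing


def mark_running(db_path, point_hash, attempts=5, delay=0.05):
    """Flip one point from 'queue' to 'running'; return True on success."""
    for _ in range(attempts):
        # A brand-new connection per attempt cannot carry an old view of the table.
        with closing(sqlite3.connect(db_path)) as conn:
            with conn:  # commits on normal exit, rolls back on an exception
                cur = conn.execute(
                    "UPDATE points SET state = 'running' "
                    "WHERE hash = ? AND state = 'queue'",
                    (point_hash,),
                )
                if cur.rowcount == 1:
                    return True
        time.sleep(delay)  # give the 'main' client time to commit, then retry
    return False


def demo():
    db_path = os.path.join(tempfile.mkdtemp(), "points.db")
    # The 'main' client creates the table and queues one point.
    with closing(sqlite3.connect(db_path)) as main_client:
        with main_client:
            main_client.execute(
                "CREATE TABLE points (hash TEXT PRIMARY KEY, state TEXT)"
            )
            main_client.execute("INSERT INTO points VALUES ('abc123', 'queue')")

    first = mark_running(db_path, "abc123")               # queued point gets claimed
    second = mark_running(db_path, "abc123", attempts=1)  # already 'running'
    missing = mark_running(db_path, "nope", attempts=2, delay=0.01)
    return first, second, missing
```

`demo()` returns `(True, False, False)`. Because the `WHERE` clause also checks `state = 'queue'`, a second worker picking up the same hash cannot flip it twice.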

@roddtalebi
Member Author

More knowledge

https://stackoverflow.com/questions/12223335/sqlalchemy-creating-vs-reusing-a-session/12223711

Next part. I think the question is, what's the difference between making a new Session() at various points versus just using one all the way through. The answer, not very much. Session is a container for all the objects you put into it, and then it also keeps track of an open transaction. At the moment you call rollback() or commit(), the transaction is over, and the Session has no connection to the database until it is called upon to emit SQL again. The links it holds to your mapped objects are weak referencing, provided the objects are clean of pending changes, so even in that regard the Session will empty itself out back to a brand new state when your application loses all references to mapped objects. If you leave it with its default "expire_on_commit" setting, then all the objects are expired after a commit. If that Session hangs around for five or twenty minutes, and all kinds of things have changed in the database the next time you use it, it will load all brand new state the next time you access those objects even though they've been sitting in memory for twenty minutes.
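
To make the "expire_on_commit" part concrete for myself, here is a toy model of the idea. `ToySession` is a hypothetical stand-in written for this note, not SQLAlchemy's implementation or API; it only mimics "keep loaded state in memory, throw it away on commit, reload on next access", using stdlib `sqlite3` and a made-up `points` table.

```python
import sqlite3


class ToySession:
    """Hypothetical stand-in for an ORM session; not SQLAlchemy's implementation."""

    def __init__(self, conn, expire_on_commit=True):
        self.conn = conn
        self.expire_on_commit = expire_on_commit
        self._loaded = {}  # hash -> state, as last read from the database

    def state_of(self, point_hash):
        # Only hit the database when nothing is loaded for this hash.
        if point_hash not in self._loaded:
            row = self.conn.execute(
                "SELECT state FROM points WHERE hash = ?", (point_hash,)
            ).fetchone()
            self._loaded[point_hash] = row[0] if row else None
        return self._loaded[point_hash]

    def commit(self):
        self.conn.commit()
        if self.expire_on_commit:
            # "Expire" everything: the next access reloads from the database.
            self._loaded.clear()


def demo(expire_on_commit):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE points (hash TEXT PRIMARY KEY, state TEXT)")
    conn.execute("INSERT INTO points VALUES ('abc123', 'queue')")
    conn.commit()

    session = ToySession(conn, expire_on_commit)
    before = session.state_of("abc123")

    # Simulate the row changing outside the session's loaded state
    # (same connection here, only to keep the sketch short).
    conn.execute("UPDATE points SET state = 'running' WHERE hash = 'abc123'")
    session.commit()

    after = session.state_of("abc123")
    conn.close()
    return before, after
```

With expiry on, `demo(True)` returns `('queue', 'running')`; with it off, `demo(False)` returns `('queue', 'queue')`, i.e. the old state keeps hanging around after the commit.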

In web applications, we usually say, hey why don't you make a brand new Session on each request, rather than using the same one over and over again. This practice ensures that the new request begins "clean". If some objects from the previous request haven't been garbage collected yet, and if maybe you've turned off "expire_on_commit", maybe some state from the previous request is still hanging around, and that state might even be pretty old. If you're careful to leave expire_on_commit turned on and to definitely call commit() or rollback() at request end, then it's fine, but if you start with a brand new Session, then there's not even any question that you're starting clean. So the idea to start each request with a new Session is really just the simplest way to make sure you're starting fresh, and to make the usage of expire_on_commit pretty much optional, as this flag can incur a lot of extra SQL for an operation that calls commit() in the middle of a series of operations. Not sure if this answers your question.
