Examples #281
Comments
Happy for this to be assigned to me 💯
Great! Assigning this to you.
Hey all! I've been looking into this and fixed most of the problems I could find, but I am stuck on two.

The first error is in Stack Trace.

The second problem is when running the function in Stack Trace.

Network variable provided:
As a side note: having worked with quite a few RL libraries over the last few years, I found it comparatively challenging to get started with the current example setup. If you want beginners to get started quickly and easily, you might want to consider (a) a guide on how the classes fit together, and (b) examples that are broken down step by step, rather than a single file where you run examples with args.
Cc: @threewisemonkeys-as for 1 (deep cb) and @mehulrastogi for 2 (genetic rl)
Can you try running
Hey @DarylRodrigo, we actually have docs for GenRL; you can find the link in the README. It isn't very complete yet, and some things, like an explanation of how the classes fit together, are still missing. We'll be tackling that soon.
Here's a rudimentary explanation: every RL problem has an environment, an agent, and some sort of training process. We tried to abstract that process in order to make the code shorter. The following is an explanation for the deep agents (I haven't worked on the bandits, and evolutionary is still a little new).

So first you have an environment. Then you have an Agent. Every Agent inherits from either OnPolicyAgent or OffPolicyAgent, both of which inherit from a BaseAgent class. The algorithms all have a certain set of methods that perform some function; for example, OffPolicyAgents all share a common set of such methods.

The trainers are again similar to the agent classes; they mainly differ in their training loops.
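To make that concrete, here is a minimal sketch of how the environment, agent, and trainer are typically wired together. The import paths, class names, and constructor arguments are assumptions based on README-style usage and may differ across versions (they changed around the restructure this issue refers to), so treat it as illustrative rather than exact:

```python
# Minimal sketch of the environment -> agent -> trainer flow described above.
# Import paths and argument names are assumptions and may differ by version.
from genrl.agents import DQN                 # an OffPolicyAgent
from genrl.environments import VectorEnv     # vectorised gym environment wrapper
from genrl.trainers import OffPolicyTrainer  # off-policy training loop

env = VectorEnv("CartPole-v0")   # wrap the gym environment
agent = DQN("mlp", env)          # "mlp" selects a default network for the agent
trainer = OffPolicyTrainer(agent, env)  # timestep/epoch arguments omitted here
trainer.train()                  # run the training loop
trainer.evaluate()               # roll out the trained agent
```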
I'm unable to reproduce this error, but it seems like it's running into the final condition (i.e., the network is not an object of BasePolicy, BaseValue, or BaseActorCritic, nor the string "mlp" or "cnn").
Based on this network object, it seems like it is an object of one of those base classes, although I reckon the functions being different might not be the issue.
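For illustration, the condition in question is essentially a dispatch on the `network` argument. The following is a hypothetical reconstruction of that check; the base-class names come from the comment above, but the module path and helper name are assumptions, not the actual GenRL code:

```python
# Hypothetical reconstruction of the "final condition" check discussed above.
# Module path and function name are assumptions for illustration only.
from genrl.core import BaseActorCritic, BasePolicy, BaseValue

def validate_network(network):
    """Accept either a string shorthand or a supported network object."""
    if isinstance(network, str) and network in ("mlp", "cnn"):
        return network
    if isinstance(network, (BasePolicy, BaseValue, BaseActorCritic)):
        return network
    # Anything else falls through to the error seen in the stack trace.
    raise ValueError(f"Invalid network type {type(network)} provided")
```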
You're definitely right. I think the issue is that we don't have a clear line between "examples" and "tutorials" as of now; the current examples are clearly more like executables. There's a bunch of "examples" that are missing and could possibly be done:
I think it would be good to post our plans on
Given your experience, it would be really helpful if you could help us in planning and making changes to our library. As of now, we're only at v0.0.2, so we're open to big and bold ideas as well. Needless to say, we're studying other projects at the same time, so that we know what's the best way to create such a library.
Also, you might need a statement for this. This is a known issue at the moment: #268
Thanks for the explanation and for pointing me to the documentation. I had read it, but I still found it not particularly beginner-friendly, especially if you're coming from an "I'm interested in DRL but I've never really done anything other than writing a neural net in PyTorch" perspective. For example, the intro to policy and value functions is explained in one sentence, and there is no reference to on-policy or off-policy, or how they interact with the larger system (just my opinion, though). Then again, I am new to this open-source project and am unsure of the exact priorities/objectives, and as you said, it's still a work in progress :) (By the way, please don't take this in any way as negative feedback or criticism. I really like what you're working on, and it's just my opinion 😁)
This worked for downloading the required dataset 👍. Regarding the
This is a great idea :)
I suggest moving this conversation to Slack, but as I said before, I'm keen to help out. I'm especially interested in seeing whether this library is the right fit for tutorials starting from the Bellman equation, through MDPs, to Q-learning.
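For context, such a tutorial would presumably start from the standard Bellman optimality equation and the tabular Q-learning update derived from it (textbook formulations, not tied to GenRL's API):

```latex
% Bellman optimality equation for the action-value function
Q^*(s, a) = \mathbb{E}\bigl[\, r + \gamma \max_{a'} Q^*(s', a') \mid s, a \,\bigr]

% Tabular Q-learning update, with learning rate \alpha and discount \gamma
Q(s, a) \leftarrow Q(s, a) + \alpha \bigl[\, r + \gamma \max_{a'} Q(s', a') - Q(s, a) \,\bigr]
```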
One of our prime goals is to improve accessibility, so feedback regarding that is always welcome. Thanks! Raising an issue here: #296
Great. But since you pointed out that the Bellman equations are missing, tracking it here: #297
I think we might need a better way to track things on the accessibility end, as there are some gaps that need to be filled. I've created a dedicated label for these issues, but I may create a project for it too, as it needs one. We could follow and dig up things from Sutton & Barto, this, etc.
@Sharad24 Please advise.

Context: I'm trying to debug what's happening in the example.

Currently: in the

Below is a simple solution (reset the env of the agent), but I don't know if this interferes with other bits of code.
Let me know what you think.
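As a hypothetical illustration of the kind of workaround meant here (the attribute names are assumptions about how the agent holds its env, not GenRL's actual API):

```python
# Hypothetical sketch of the "reset the env of the agent" workaround.
# `agent.env` is an assumption made for illustration only.
state = agent.env.reset()  # put the agent's environment back into an initial state
# ...continue the rollout/evaluation loop from `state`...
```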
It's great that you found this :)
This fix could potentially interfere with the rest of the code. At some point, we'd want the agent to be independent of any env dependency. I'd say the easiest fix you could do for this now is just
Stale issue message
Examples are broken and imports need to be fixed after #256
Thanks @DarylRodrigo and @ajaysub110 for pointing this out.