Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Examples #281

Closed
Sharad24 opened this issue Aug 28, 2020 · 17 comments
Closed

Examples #281

Sharad24 opened this issue Aug 28, 2020 · 17 comments
Assignees

Comments

@Sharad24
Copy link
Member

Examples are broken and imports need to be fixed after #256

Thanks, @DarylRodrigo and @ajaysub110 for pointing this out.

@Sharad24 Sharad24 added good first issue Good for newcomers Examples labels Aug 28, 2020
@Het-Shah Het-Shah assigned Het-Shah and unassigned Het-Shah Aug 28, 2020
@DarylRodrigo
Copy link
Contributor

Happy for this to be assigned to me 💯

@ajaysub110
Copy link
Member

Great! Assigning this to you

@DarylRodrigo
Copy link
Contributor

Hey all!

I've been looking into this and fixed most of the problems I could find but I am stuck on two.

The first error is in deep_cb.py: when running this file there is an error on importing data from 'data/Covertype'. I'm not actually sure what this is used for or how this works, so a pointer here would be useful.

Stack Trace

Traceback (most recent call last):
  File "deep_cb.py", line 106, in <module>
    main(args)
  File "deep_cb.py", line 19, in main
    run_single_algos_on_bandit(args)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/genrl/bandit/run.py", line 116, in run_single_algos_on_bandit
    bandit = bandit_class(download=args.download)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/genrl/bandit/bandits/data_bandits/covertype_bandit.py", line 56, in __init__
    self._df = fetch_covertype_data(path)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/genrl/bandit/bandits/data_bandits/covertype_bandit.py", line 105, in fetch_covertype_data
    return fetch_zipped_data_without_header(path)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/genrl/bandit/bandits/data_bandits/utils.py", line 81, in fetch_zipped_data_without_header
    with gzip.open(gz_fpath, "rb") as f_in:
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/gzip.py", line 53, in open
    binary_file = GzipFile(filename, gz_mode, compresslevel)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/gzip.py", line 163, in __init__
    fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'data/Covertype'

The second problem is when running genetic_rl.py: it looks like the

get_env_properties(env: Union[gym.Env, VecEnv], network: Union[str, Any] = "mlp") -> (Tuple[int]):

function in utils.py is receiving the wrong network type. It's expecting a string and is passed the actual pyTorch network model. I think this might have been an error introduced post refactor (not sure). I could rewrite the whole example but I'm not sure what the plans of further development are like and I'd rather not do redundant work.

Stack Trace

Traceback (most recent call last):
  File "genetic_rl.py", line 217, in <module>
    main(args)
  File "genetic_rl.py", line 139, in main
    generic_agent = A2C(network, env, rollout_size=args.rollout_size)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/genrl/agents/deep/a2c/a2c.py", line 59, in __init__
    self._create_model()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/genrl/agents/deep/a2c/a2c.py", line 67, in _create_model
    self.env, self.network
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/genrl/utils/utils.py", line 155, in get_env_properties
    raise TypeError
TypeError
(base) 

Network variable provided:

MlpActorCritic(
  (actor): MlpPolicy(
    (model): Sequential(
      (0): Linear(in_features=4, out_features=1, bias=True)
      (1): ReLU()
      (2): Linear(in_features=1, out_features=1, bias=True)
      (3): ReLU()
      (4): Linear(in_features=1, out_features=2, bias=True)
      (5): Identity()
    )
  )
  (critic): MlpValue(
    (model): Sequential(
      (0): Linear(in_features=4, out_features=1, bias=True)
      (1): ReLU()
      (2): Linear(in_features=1, out_features=1, bias=True)
      (3): ReLU()
      (4): Linear(in_features=1, out_features=1, bias=True)
      (5): Identity()
    )
  )
)

@DarylRodrigo
Copy link
Contributor

DarylRodrigo commented Aug 30, 2020

As a side note - having worked with quite a few RL libraries over the last few years, comparatively, I found it quite challenging on getting started with the current example set up. I think if you want beginners to easily and quickly get started you might want to consider having a) a guide on how the classes fit together b) examples which are broken down step by step, rather than a file where you can run examples with args.

@sampreet-arthi
Copy link
Member

Cc: @threewisemonkeys-as for 1 deep cb and @mehulrastogi for 2 genetic rl

@Sharad24
Copy link
Member Author

Hey all!

I've been looking into this and fixed most of the problems I could find but I am stuck on two.

The first error is in deep_cb.py: when running this file there is an error on importing data from 'data/Covertype'. I'm not actually sure what this is used for or how this works, so a pointer here would be useful.

Stack Trace

Traceback (most recent call last):
  File "deep_cb.py", line 106, in <module>
    main(args)
  File "deep_cb.py", line 19, in main
    run_single_algos_on_bandit(args)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/genrl/bandit/run.py", line 116, in run_single_algos_on_bandit
    bandit = bandit_class(download=args.download)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/genrl/bandit/bandits/data_bandits/covertype_bandit.py", line 56, in __init__
    self._df = fetch_covertype_data(path)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/genrl/bandit/bandits/data_bandits/covertype_bandit.py", line 105, in fetch_covertype_data
    return fetch_zipped_data_without_header(path)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/genrl/bandit/bandits/data_bandits/utils.py", line 81, in fetch_zipped_data_without_header
    with gzip.open(gz_fpath, "rb") as f_in:
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/gzip.py", line 53, in open
    binary_file = GzipFile(filename, gz_mode, compresslevel)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/gzip.py", line 163, in __init__
    fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'data/Covertype'

Can you try running python deep_cb.py --download? That should download the dataset. I think that should resolve it.

@sampreet-arthi
Copy link
Member

As a side note - having worked with quite a few RL libraries over the last few years, comparatively, I found it quite challenging on getting started with the current example set up. I think if you want beginners to easily and quickly get started you might want to consider having a) a guide on how the classes fit together b) examples which are broken down step by step, rather than a file where you can run examples with args.

Hey @DarylRodrigo we actually have docs for GenRL. You can find the link on the README. It isn't very complete yet and there's some stuff like explanation of how the classes fit together that's missing. We'll be tackling that soon.

@sampreet-arthi
Copy link
Member

Here's a rudimentary explanation:

Every RL problem has an environment, an agent and some sort of training process. We tried to abstract the process in order to make code shorter.

The following is an explanation for the deep agents (i haven't worked on bandits and evolutionary is still a little new)

So now you have a VectorEnv which is basically a wrapper for vectorized gym envs.

Then you have an Agent. Every Agent inherits from either OnPolicyAgent or OffPolicyAgent both of which inherit from a BaseAgent class. Now the algorithms all have a certain set of methods that perform some function. Like for example OffPolicyAgents all have get_q_values , get_target_q_values etc. Every deep off policy algo has the same subparts just like every agent (not just deep) must have a method to select their action.

The trainers are again similar to the agent classes. They mainly differ in their training loops.

@Sharad24
Copy link
Member Author

The second problem is when running genetic_rl.py: it looks like the

get_env_properties(env: Union[gym.Env, VecEnv], network: Union[str, Any] = "mlp") -> (Tuple[int]):

function in utils.py is receiving the wrong network type. It's expecting a string and is passed the actual pyTorch network model. I think this might have been an error introduced post refactor (not sure). I could rewrite the whole example but I'm not sure what the plans of further development are like and I'd rather not do redundant work.

Stack Trace

Traceback (most recent call last):
  File "genetic_rl.py", line 217, in <module>
    main(args)
  File "genetic_rl.py", line 139, in main
    generic_agent = A2C(network, env, rollout_size=args.rollout_size)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/genrl/agents/deep/a2c/a2c.py", line 59, in __init__
    self._create_model()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/genrl/agents/deep/a2c/a2c.py", line 67, in _create_model
    self.env, self.network
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/genrl/utils/utils.py", line 155, in get_env_properties
    raise TypeError
TypeError
(base) 

I'm unable to reproduce this error, but it seems like its running into the final condition (i.e., it is not an object of BasePolicy, BaseValue, BaseActorCritic, "mlp" or "cnn").

Network variable provided:

MlpActorCritic(
  (actor): MlpPolicy(
    (model): Sequential(
      (0): Linear(in_features=4, out_features=1, bias=True)
      (1): ReLU()
      (2): Linear(in_features=1, out_features=1, bias=True)
      (3): ReLU()
      (4): Linear(in_features=1, out_features=2, bias=True)
      (5): Identity()
    )
  )
  (critic): MlpValue(
    (model): Sequential(
      (0): Linear(in_features=4, out_features=1, bias=True)
      (1): ReLU()
      (2): Linear(in_features=1, out_features=1, bias=True)
      (3): ReLU()
      (4): Linear(in_features=1, out_features=1, bias=True)
      (5): Identity()
    )
  )
)

Based on this network object it seems like its an object of BaseActorCritic. So I'm unsure why this is running into a TypeError. Could you post the definition of utils in your installed version of the library? Maybe its missing a few statements. For reference this is what I have:

def get_env_properties(
    env: Union[gym.Env, VecEnv], network: Union[str, Any] = "mlp"
) -> (Tuple[int]):
    """
        Finds important properties of environment

        :param env: Environment that the agent is interacting with
        :type env: Gym Environment
        :param network: Type of network architecture, eg. "mlp", "cnn"
        :type network: str
        :returns: (State space dimensions, Action space dimensions,
    discreteness of action space and action limit (highest action value)
        :rtype: int, float, ...; int, float, ...; bool; int, float, ...
    """
    if network == "cnn":
        state_dim = env.framestack
    elif network == "mlp":
        state_dim = env.observation_space.shape[0]
    elif isinstance(network, (BasePolicy, BaseValue)):
        state_dim = network.state_dim
    elif isinstance(network, BaseActorCritic):
        state_dim = network.actor.state_dim
    else:
        raise TypeError

Although I reckon the functions being different might not be the issue.

As a side note - having worked with quite a few RL libraries over the last few years, comparatively, I found it quite challenging on getting started with the current example set up. I think if you want beginners to easily and quickly get started you might want to consider having a) a guide on how the classes fit together b) examples which are broken down step by step, rather than a file where you can run examples with args.

You're definitely right. I think the issue is that we don't have a clear line between "examples" and "tutorials" as of now. The current examples are clearly more like executables. There's a bunch of "examples" that are missing and could be possible be done:

  • Agents (Base classes, OnPolicy, OffPolicy, OffPolicyAC)
  • Trainers
  • Environments
    Etc.

@Sharad24
Copy link
Member Author

As a side note - having worked with quite a few RL libraries over the last few years, comparatively, I found it quite challenging on getting started with the current example set up. I think if you want beginners to easily and quickly get started you might want to consider having a) a guide on how the classes fit together b) examples which are broken down step by step, rather than a file where you can run examples with args.

Hey @DarylRodrigo we actually have docs for GenRL. You can find the link on the README. It isn't very complete yet and there's some stuff like explanation of how the classes fit together that's missing. We'll be tackling that soon.

I think it would be good to post our plans on Projects so everyone can take a look and see what the plans are. Probably even take up issues and projects.

As a side note - having worked with quite a few RL libraries over the last few years, comparatively, I found it quite challenging on getting started with the current example set up.

GIven your experience, it would be really helpful if you could help us in planning and making changes in our library. As of now, we're only at v0.0.2, so we're open to big and bold ideas as well. Needless to say, at the same time we're studying other projects too, so that we know whats the best way to create such a library.

@Sharad24
Copy link
Member Author

Also, you might need a statement agent.env = environment if you get an issue of AssertionError: Cannot call env.step() before calling reset().

This is a known issue at the moment #268

@DarylRodrigo
Copy link
Contributor

@sampreet-arthi

Thanks for the explanation and pointing me to the documentation - I had read them but I still found that they aren't particularly beginner-friendly, especially if you're coming from an "I'm interested in DRL but I've never really done anything other than writing a neural net in PyTorch" perspective. For example, the intro into policy and value function is explained in one sentence and there is no reference to on or off policy or how they interact with the larger system (just my opinion though). Then again, I am new to this open-source project and am unsure of the exact priorities/objectives and as you said, it's still a work in progress :)

(By the way, please don't take this in any way as negative feedback or criticism - I really like what you're working on and it's just my an opinion 😁)

@Sharad24

Can you try running python deep_cb.py --download? That should download the dataset. I think that should resolve it.

This worked in downloading the required data set 👍.

Regarding the get_env_properties issue - after having pulled from the main repo again it seemed to have resolved that particular issue. Thanks for the checking though!


I think it would be good to post our plans on Projects so everyone can take a look and see what the plans are. Probably even take up issues and projects.

This is a great idea :)

Given your experience, it would be really helpful if you could help us in planning and making changes in our library. As of now, we're only at v0.0.2, so we're open to big and bold ideas as well.

I suggest moving this conversation to slack, but as I said before, keen to help out. Especially interested in seeing if this library is the right fit for tutorials starting from the Bellman equation, to MDPs to Q-learning.

@Sharad24
Copy link
Member Author

@sampreet-arthi

Thanks for the explanation and pointing me to the documentation - I had read them but I still found that they aren't particularly beginner-friendly, especially if you're coming from an "I'm interested in DRL but I've never really done anything other than writing a neural net in PyTorch" perspective. For example, the intro into policy and value function is explained in one sentence and there is no reference to on or off policy or how they interact with the larger system (just my opinion though). Then again, I am new to this open-source project and am unsure of the exact priorities/objectives and as you said, it's still a work in progress :)

(By the way, please don't take this in any way as negative feedback or criticism - I really like what you're working on and it's just my an opinion 😁)

One of our prime goals is to improve accessibility so feedback regarding that is always welcome. Thanks! Raising an issue here. #296

I suggest moving this conversation to slack, but as I said before, keen to help out. Especially interested in seeing if this library is the right fit for tutorials starting from the Bellman equation, to MDPs to Q-learning.

Great. But since you pointed out Bellman equations is missing tacking it here #297

@Sharad24
Copy link
Member Author

Sharad24 commented Aug 31, 2020

I think we might need a better way to track things on the accessibility end, as there are some gaps that need to be filled. I've created its own label for issues, but may create a project for it too, as it needs one.

We could follow and dug up things from Sutton+Barto, this, etc

@DarylRodrigo
Copy link
Contributor

@Sharad24 Please advise

Context: I'm trying to debug what's happening in the example genetic_rl.py
Problem: Program crashes saying that the env must be reset before calling step

Currently: genetic_rl starts generate -> train_population which attempts to train the trainer trainer.train()

In the train function in the onPolicy.py class there is a resetting of the environment through state = self.env.reset() (commented out in the code below). To collect the rollouts from the agent it requires the environment to be triggered which directly starts to step over the env. However the env that the agent references isn't the same env that is being reset in the onPolicy object.

Below is a simple solution (reset the env of the agent), but I don't know if this interferes with other bits of code.

def train(self) -> None:
        """Main training method"""
        if self.load_model is not None:
            self.load()

        for epoch in range(self.epochs):
            self.agent.epoch_reward = np.zeros(self.env.n_envs)

            self.agent.rollout.reset()

            # state = self.env.reset()
            state = self.agent.env.reset()

            values, done = self.agent.collect_rollouts(state)

            self.agent.get_traj_loss(values, done)

            self.agent.update_params()

LMKWYT

@Sharad24
Copy link
Member Author

Sharad24 commented Sep 3, 2020

Its great that you found this :)
We have an issue tracking this #268

       # state = self.env.reset()
      state = self.agent.env.reset()

This fix could potentially interfere with rest of the code. At some point, we'd want the agent to be independent of any env dependency.

I'd say the easiest fix you could do for this now is just agent.env = environment object in train_population function

@github-actions
Copy link

github-actions bot commented Nov 3, 2020

Stale issue message

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants