---
license: cdla-permissive-2.0
datasets:
- microsoft/mocapact-data
---
# MoCapAct Model Zoo

Control of simulated humanoid characters is a challenging benchmark for sequential decision-making methods, as it assesses a policy’s ability to drive an inherently unstable, discontinuous, and high-dimensional physical system. Motion capture (MoCap) data can be very helpful in learning sophisticated locomotion policies by teaching a humanoid agent low-level skills (e.g., standing, walking, and running) that can then be composed into high-level behaviors. However, even with MoCap data, controlling simulated humanoids remains very hard, because this data offers only kinematic information. Finding the physical control inputs that realize the demonstrated motions has required methods like reinforcement learning that need large amounts of compute, which has effectively been a barrier to entry for this exciting research direction.

In an effort to broaden participation in humanoid locomotion research and to facilitate the evaluation of new ideas, we are releasing MoCapAct (Motion Capture with Actions): a library of high-quality pre-trained agents that can track over three hours of MoCap data for a simulated humanoid in the `dm_control` physics-based environment, together with rollouts from these experts containing proprioceptive observations and actions. MoCapAct allows researchers to sidestep the computationally intensive task of training low-level control policies from MoCap data and instead use MoCapAct's expert agents and demonstrations for learning advanced locomotion behaviors. It also makes it possible to improve on our low-level policies by using them and their demonstration data as a starting point.

In our work, we use MoCapAct to train a single hierarchical policy capable of tracking the entire MoCap dataset within `dm_control`, and we then re-use the learned low-level component to efficiently learn other high-level tasks. Finally, we use MoCapAct to train an autoregressive GPT model and show that it can perform natural motion completion given a motion prompt. We encourage the reader to visit our [project website](https://microsoft.github.io/MoCapAct/) to see videos of our results and to find links to our paper and code.

## Model Zoo Structure

The file structure of the model zoo is:
```
├── all
│   └── experts
│       ├── experts_1.tar.gz
│       ├── experts_2.tar.gz
│       ...
│       └── experts_8.tar.gz
│
├── sample
│   └── experts.tar.gz
│
├── multiclip_policy.tar.gz
│   ├── full_dataset
│   └── locomotion_dataset
│
├── transfer.tar.gz
│   ├── go_to_target
│   │   ├── general_low_level
│   │   ├── locomotion_low_level
│   │   └── no_low_level
│   │
│   └── velocity_control
│       ├── general_low_level
│       ├── locomotion_low_level
│       └── no_low_level
│
├── gpt.ckpt
│
└── videos
    ├── full_clip_videos.tar.gz
    └── snippet_videos.tar.gz
```
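
All of the archives are standard gzipped tarballs. As a small convenience sketch (the archive and destination paths below are placeholders), they can be unpacked programmatically with Python's `tarfile` module:
```python
import tarfile
from pathlib import Path

# Placeholder paths: unpack the sample experts into a local directory
archive = Path("sample/experts.tar.gz")
dest = Path("experts")
dest.mkdir(exist_ok=True)

# Extract the gzipped tarball (equivalent to `tar -xzf`)
with tarfile.open(archive, "r:gz") as tar:
    tar.extractall(dest)
```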

## Experts Tarball Files
The experts are distributed across the following tarball files:
- `all/experts/experts_*.tar.gz`: Contains all of the clip snippet experts. Due to file size limitations, the experts are split among multiple tarball files.
- `sample/experts.tar.gz`: Contains the clip snippet experts used to run the examples on the [dataset website](https://microsoft.github.io/MoCapAct/).

The structure of each expert's directory is detailed in Appendix A.1 of the paper as well as in the [MoCapAct repository](https://github.com/microsoft/MoCapAct#description).

An expert can be loaded and rolled out in Python as in the following example:
```python
from mocapact import observables
from mocapact.sb3 import utils

# Load the expert policy for the snippet CMU_083_33-0-194
expert_path = "/path/to/experts/CMU_083_33/CMU_083_33-0-194/eval_rsi/model"
expert = utils.load_policy(expert_path, observables.TIME_INDEX_OBSERVABLES)

from mocapact.envs import tracking
from dm_control.locomotion.tasks.reference_pose import types

# Create a tracking environment for the same snippet
dataset = types.ClipCollection(ids=['CMU_083_33'], start_steps=[0], end_steps=[194])
env = tracking.MocapTrackingGymEnv(dataset)

# Roll out the expert for one episode, printing the per-step reward
obs, done = env.reset(), False
while not done:
    action, _ = expert.predict(obs, deterministic=True)
    obs, rew, done, _ = env.step(action)
    print(rew)
```
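
Because the expert and environment follow the standard Gym interface shown above, demonstration data can be collected from an expert with an ordinary rollout loop. The following is a minimal sketch assuming the `expert` and `env` objects from the previous example are in scope; it is only illustrative, since the released MoCapAct dataset already ships pre-generated expert rollouts:
```python
import numpy as np

# Collect (observation, action) pairs from the expert over a few episodes
observations, actions = [], []
for _ in range(5):  # the episode count here is arbitrary
    obs, done = env.reset(), False
    while not done:
        action, _ = expert.predict(obs, deterministic=True)
        observations.append(obs)  # may be a dict of features, depending on the observable set
        actions.append(action)
        obs, _, done, _ = env.step(action)

# The actions stack into a single array for, e.g., behavioral cloning
actions = np.stack(actions)
```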

Alternatively, an expert can be rolled out from the command line:
```bash
python -m mocapact.clip_expert.evaluate \
  --policy_root /path/to/experts/CMU_016_22/CMU_016_22-0-82/eval_rsi/model \
  --act_noise 0 \
  --ghost_offset 1 \
  --always_init_at_clip_start
```

## GPT
The GPT policy is contained in `gpt.ckpt` and can be loaded using PyTorch Lightning:
```python
from mocapact.distillation import model

# Load the autoregressive GPT policy checkpoint onto the CPU
policy = model.GPTPolicy.load_from_checkpoint('/path/to/gpt.ckpt', map_location='cpu')
```
This policy can be used with `mocapact/distillation/motion_completion.py`, as in the following example:
```bash
python -m mocapact.distillation.motion_completion \
  --policy_path /path/to/gpt.ckpt \
  --nodeterministic \
  --ghost_offset 1 \
  --expert_root /path/to/experts/CMU_016_25 \
  --max_steps 500 \
  --always_init_at_clip_start \
  --prompt_length 32 \
  --min_steps 32 \
  --device cuda \
  --clip_snippet CMU_016_25
```

## Multi-Clip Policy
The `multiclip_policy.tar.gz` file contains two policies:
- `full_dataset`: Trained on the entire MoCapAct dataset
- `locomotion_dataset`: Trained on the `locomotion_small` portion of the MoCapAct dataset

Taking `full_dataset` as an example, a multi-clip policy can be loaded using PyTorch Lightning:
```python
from mocapact.distillation import model

# Load the multi-clip policy checkpoint onto the CPU
policy = model.NpmpPolicy.load_from_checkpoint(
    '/path/to/multiclip_policy/full_dataset/model/model.ckpt', map_location='cpu'
)
```
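
For an in-Python rollout, a natural approach mirrors the clip-expert example above. The sketch below is assumption-laden: it presumes the loaded policy exposes an SB3-style `predict` method (as the clip experts do) and threads a latent state between steps via a hypothetical `state` argument; consult `mocapact/distillation/evaluate.py` for the authoritative usage.
```python
from mocapact.envs import tracking
from dm_control.locomotion.tasks.reference_pose import types

# Assumption: the multi-clip policy drives the same tracking environment
# as the clip experts through an SB3-style predict() interface.
dataset = types.ClipCollection(ids=['CMU_016_22'])
env = tracking.MocapTrackingGymEnv(dataset)

obs, done = env.reset(), False
state = None  # hypothetical recurrent/latent state carried across steps
while not done:
    action, state = policy.predict(obs, state, deterministic=True)
    obs, rew, done, _ = env.step(action)
```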

The policy can be used with `mocapact/distillation/evaluate.py`, as in the following example:
```bash
python -m mocapact.distillation.evaluate \
  --policy_path /path/to/multiclip_policy/full_dataset/model/model.ckpt \
  --act_noise 0 \
  --ghost_offset 1 \
  --always_init_at_clip_start \
  --termination_error_threshold 10 \
  --clip_snippets CMU_016_22
```

## Transfer
The `transfer.tar.gz` file contains policies for the downstream tasks (`go_to_target` and `velocity_control`). The contained folders differ in which low-level policy is used:
- `general_low_level`: Low-level policy comes from `multiclip_policy/full_dataset`
- `locomotion_low_level`: Low-level policy comes from `multiclip_policy/locomotion_dataset`
- `no_low_level`: No low-level policy is used

Each policy folder has the following structure:
```
├── best_model.zip
├── low_level_policy.ckpt
└── vecnormalize.pkl
```
The `low_level_policy.ckpt` file (only present in `general_low_level` and `locomotion_low_level`) contains the low-level policy and is loaded with PyTorch Lightning.
The `best_model.zip` file contains the task policy parameters, and the `vecnormalize.pkl` file contains the observation normalizer; these two files are loaded with Stable-Baselines3, as sketched below.
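
The following is a minimal loading sketch under stated assumptions: the task policy is assumed to be an SB3 `PPO` model (the zip format is algorithm-specific, so substitute the algorithm used for your copy), and `make_env` is a hypothetical factory for the corresponding task environment.
```python
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize
from mocapact.distillation import model

root = '/path/to/transfer/go_to_target/general_low_level'  # placeholder path

# Task policy parameters (assuming the model was trained with PPO)
task_policy = PPO.load(f'{root}/best_model.zip')

# Low-level policy, loaded with PyTorch Lightning as in the sections above
low_level_policy = model.NpmpPolicy.load_from_checkpoint(
    f'{root}/low_level_policy.ckpt', map_location='cpu'
)

# Observation normalizer; `make_env` is a hypothetical environment factory
# venv = VecNormalize.load(f'{root}/vecnormalize.pkl', DummyVecEnv([make_env]))
```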

The policy can be used with `mocapact/transfer/evaluate.py`, as in the following example:
```bash
python -m mocapact.transfer.evaluate \
  --model_root /path/to/transfer/go_to_target/general_low_level \
  --task /path/to/mocapact/transfer/config.py:go_to_target
```

## MoCap Videos
There are two tarball files containing videos of the MoCap clips in the dataset:
- `full_clip_videos.tar.gz` contains videos of the full MoCap clips.
- `snippet_videos.tar.gz` contains videos of the snippets that were used to train the experts.

Note that these are playbacks of the MoCap clips themselves, not rollouts of the corresponding experts.