File size: 14,071 Bytes
448ebbd
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
# ClimateGAN
- [ClimateGAN](#climategan)
  - [Setup](#setup)
  - [Coding conventions](#coding-conventions)
  - [updates](#updates)
  - [interfaces](#interfaces)
  - [Logging on comet](#logging-on-comet)
  - [Resources](#resources)
  - [Example](#example)
  - [Release process](#release-process)

## Setup

**`PyTorch >= 1.1.0`** otherwise optimizer.step() and scheduler.step() are in the wrong order ([docs](https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate))

**pytorch==1.6** to use pytorch-xla or automatic mixed precision (`amp` branch).

Configuration files use the **YAML** syntax. If you don't know what `&` and `<<` mean, you'll have a hard time reading the files. Have a look at:

  * https://dev.to/paulasantamaria/introduction-to-yaml-125f
  * https://stackoverflow.com/questions/41063361/what-is-the-double-left-arrow-syntax-in-yaml-called-and-wheres-it-specced/41065222

**pip**

```
$ pip install comet_ml scipy opencv-python torch torchvision omegaconf==1.4.1 hydra-core==0.11.3 scikit-image imageio addict tqdm torch_optimizer
```

## Coding conventions

* Tasks
  * `x` is an input image, in [-1, 1]
  * `s` is a segmentation target with `long` classes
  * `d` is a depth map target in R, may be actually `log(depth)` or `1/depth`
  * `m` is a binary mask with 1s where water is/should be
* Domains
  * `r` is the *real* domain for the masker. Input images are real pictures of urban/suburban/rural areas
  * `s` is the *simulated* domain for the masker. Input images are taken from our Unity world
  * `rf` is the *real flooded* domain for the painter. Training images are pairs `(x, m)` of flooded scenes for which the water should be reconstructed, in the validation data input images are not flooded and we provide a manually labeled mask `m`
  * `kitti` is a special `s` domain to pre-train the masker on [Virtual Kitti 2](https://europe.naverlabs.com/research/computer-vision/proxy-virtual-worlds-vkitti-2/)
    * it alters the `trainer.loaders` dict to select relevant data sources from `trainer.all_loaders` in `trainer.switch_data()`. The rest of the code is identical.
* Flow
  * This describes the call stack for the trainers standard training procedure
  * `train()`
    * `run_epoch()`
      * `update_G()`
        * `zero_grad(G)`
        * `get_G_loss()`
          * `get_masker_loss()`
            * `masker_m_loss()`  -> masking loss
            * `masker_s_loss()`  -> segmentation loss
            * `masker_d_loss()`  -> depth estimation loss
          * `get_painter_loss()` -> painter's loss
        * `g_loss.backward()`
        * `g_opt_step()`
      * `update_D()`
        * `zero_grad(D)`
        * `get_D_loss()`
          * painter's disc losses
          * `masker_m_loss()` -> masking AdvEnt disc loss
          * `masker_s_loss()` -> segmentation AdvEnt disc loss
        * `d_loss.backward()`
        * `d_opt_step()`
      * `update_learning_rates()` -> update learning rates according to schedules defined in `opts.gen.opt` and `opts.dis.opt`
    * `run_validation()`
      * compute val losses
      * `eval_images()` -> compute metrics
      * `log_comet_images()` -> compute and upload inferences
    * `save()`

### Resuming

Set  `train.resume` to `True` in `opts.yaml` and specify where to load the weights:

Use a config's `load_path` namespace. It should have sub-keys `m`, `p` and `pm`:

```yaml
load_paths:
  p: none # Painter weights
  m: none # Masker weights
  pm: none # Painter + Masker weights (single ckpt for both)
```

1. any path which leads to a dir will be loaded as `path / checkpoints / latest_ckpt.pth`
2. if you want to specify a specific checkpoint (not the latest), it MUST be a `.pth` file
3. resuming a `P` **OR** an `M` model, you may only specify 1 of `load_path.p` **OR** `load_path.m`.
   You may also leave **BOTH** at `none`, in which case `output_path / checkpoints / latest_ckpt.pth`
   will be used
4. resuming a P+M model, you may specify (`p` AND `m`) **OR** `pm` **OR** leave all at `none`,
   in which case `output_path / checkpoints / latest_ckpt.pth` will be used to load from
   a single checkpoint

### Generator

* **Encoder**:

  `trainer.G.encoder` Deeplabv2 or v3-based encoder
  * Code borrowed from
    * https://github.com/valeoai/ADVENT/blob/master/advent/model/deeplabv2.py
    * https://github.com/CoinCheung/DeepLab-v3-plus-cityscapes

* **Decoders**:
  * `trainer.G.decoders["s"]` -> *Segmentation* -> DLV3+ architecture (ASPP + Decoder)
  * `trainer.G.decoders["d"]` -> *Depth* -> ResBlocks + (Upsample + Conv)
  * `trainer.G.decoders["m"]` -> *Mask* -> ResBlocks + (Upsample + Conv) -> Binary mask: 1 = water should be there
    * `trainer.G.mask()` predicts a mask and optionally applies `sigmoid` from an `x` input or a `z` input

* **Painter**: `trainer.G.painter` -> [GauGAN SPADE-based](https://github.com/NVlabs/SPADE)
  * input = masked image
* `trainer.G.paint(m, x)` higher level function which takes care of masking
* If `opts.gen.p.paste_original_content` the painter should only create water and not reconstruct outside the mask: the output of `paint()` is `painted * m + x * (1 - m)`

High level methods of interest:

* `trainer.infer_all()` creates a dictionary of events with keys `flood` `wildfire` and `smog`. Can take in a single image or a batch, of numpy arrays or torch tensors, on CPU/GPU/TPU. This method calls, amongst others:
  * `trainer.G.encode()` to compute the shared latent vector `z`
  * `trainer.G.mask(z=z)` to infer the mask
  * `trainer.compute_fire(x, segmentation)` to create a wildfire image from `x` and inferred segmentation
  * `trainer.compute_smog(x, depth)` to create a smog image from `x` and inferred depth
  * `trainer.compute_flood(x, mask)` to create a flood image from `x` and inferred mask using the painter (`trainer.G.paint(m, x)`)
* `Trainer.resume_from_path()` static method to resume a trainer from a path

### Discriminator

## updates

multi-batch:

```
multi_domain_batch = {"rf: batch0, "r": batch1, "s": batch2}
```

## interfaces

### batches
```python
batch = Dict({
    "data": {
        "d": depthmap,,
        "s": segmentation_map,
        "m": binary_mask
        "x": real_flooded_image,
    },
    "paths":{
        same_keys: path_to_file
    }
    "domain": list(rf | r | s),
    "mode": list(train | val)
})
```

### data

#### json files

| name                                           | domain | description                                                                |  author   |
| :--------------------------------------------- | :----: | :------------------------------------------------------------------------- | :-------: |
| **train_r_full.json, val_r_full.json**         |   r    | MiDaS+ Segmentation pseudo-labels .pt (HRNet + Cityscapes)                 | Mélisande |
| **train_s_full.json, val_s_full.json**         |   s    | Simulated data from Unity11k urban + Unity suburban dataset                |    ***    |
| train_s_nofences.json, val_s_nofences.json     |   s    | Simulated data from Unity11k urban + Unity suburban dataset without fences |  Alexia   |
| train_r_full_pl.json, val_r_full_pl.json       |   r    | MegaDepth + Segmentation pseudo-labels .pt (HRNet + Cityscapes)            |  Alexia   |
| train_r_full_midas.json, val_r_full_midas.json |   r    | MiDaS+ Segmentation (HRNet + Cityscapes)                                   | Mélisande |
| train_r_full_old.json, val_r_full_old.json     |   r    | MegaDepth+ Segmentation (HRNet + Cityscapes)                               |    ***    |
| train_r_nopeople.json, val_r_nopeople.json     |   r    | Same training data as above with people removed                            |   Sasha   |
| train_rf_with_sim.json                         |   rf   | Doubled train_rf's size with sim data  (randomly chosen)                   |  Victor   |
| train_rf.json                                  |   rf   | UPDATE (12/12/20): added 50 ims & masks from ADE20K Outdoors               |  Victor   |
| train_allres.json, val_allres.json             |   rf   | includes both lowres and highres from ORCA_water_seg                       |  Tianyu   |
| train_highres_only.json, val_highres_only.json |   rf   | includes only highres from ORCA_water_seg                                  |  Tianyu   |


```yaml
# data file ; one for each r|s
- x: /path/to/image
  m: /path/to/mask
  s: /path/to/segmentation map
- x: /path/to/another image
  d: /path/to/depth map
  m: /path/to/mask
  s: /path/to/segmentation map
- x: ...
```

or

```json
[
    {
        "x": "/Users/victor/Documents/ccai/github/climategan/example_data/gsv_000005.jpg",
        "s": "/Users/victor/Documents/ccai/github/climategan/example_data/gsv_000005.npy",
        "d": "/Users/victor/Documents/ccai/github/climategan/example_data/gsv_000005_depth.jpg"
    },
    {
        "x": "/Users/victor/Documents/ccai/github/climategan/example_data/gsv_000006.jpg",
        "s": "/Users/victor/Documents/ccai/github/climategan/example_data/gsv_000006.npy",
        "d": "/Users/victor/Documents/ccai/github/climategan/example_data/gsv_000006_depth.jpg"
    }
]
```

The json files used are located at `/network/tmp1/ccai/data/climategan/`. In the basenames,  `_s` denotes simulated domain data and `_r` real domain data.
The `base` folder contains json files with paths to images (`"x"`key) and masks (taken as ground truth for the area that should be flooded, `"m"` key).
The `seg` folder contains json files and keys `"x"`, `"m"` and `"s"` (segmentation) for each image.


loaders

```
loaders = Dict({
    train: { r: loader, s: loader},
    val: { r: loader, s: loader}
})
```

### losses

`trainer.losses` is a dictionary mapping to loss functions to optimize for the 3 main parts of the architecture: generator `G`, discriminators `D`:

```python
trainer.losses = {
    "G":{ # generator
        "gan": { # gan loss from the discriminators
            "a": GANLoss, # adaptation decoder
            "t": GANLoss # translation decoder
        },
        "cycle": { # cycle-consistency loss
            "a": l1 | l2,,
            "t": l1 | l2,
        },
        "auto": { # auto-encoding loss a.k.a. reconstruction loss
            "a": l1 | l2,
            "t": l1 | l2
        },
        "tasks": {  # specific losses for each auxillary task
            "d": func, # depth estimation
            "h": func, # height estimation
            "s": cross_entropy_2d, # segmentation
            "w": func, # water generation
        },
        "classifier": l1 | l2 | CE # loss from fooling the classifier
    },
    "D": GANLoss, # discriminator losses from the generator and true data
    "C": l1 | l2 | CE # classifier should predict the right 1-h vector [rf, rn, sf, sn]
}
```

## Logging on comet

Comet.ml will look for api keys in the following order: argument to the `Experiment(api_key=...)` call, `COMET_API_KEY` environment variable, `.comet.config` file in the current working directory, `.comet.config` in the current user's home directory.

If your not managing several comet accounts at the same time, I recommend putting `.comet.config` in your home as such:

```
[comet]
api_key=<api_key>
workspace=vict0rsch
rest_api_key=<rest_api_key>
```

### Tests

Run tests by executing `python test_trainer.py`. You can add `--no_delete` not to delete the comet experiment at exit and inspect uploads.

Write tests as scenarios by adding to the list `test_scenarios` in the file. A scenario is a dict of overrides over the base opts in `shared/trainer/defaults.yaml`. You can create special flags for the scenario by adding keys which start with `__`. For instance, `__doc` is a mandatory key in any scenario describing it succinctly.

## Resources

[Tricks and Tips for Training a GAN](https://chloes-dl.com/2019/11/19/tricks-and-tips-for-training-a-gan/)
[GAN Hacks](https://github.com/soumith/ganhacks)
[Keep Calm and train a GAN. Pitfalls and Tips on training Generative Adversarial Networks](https://medium.com/@utk.is.here/keep-calm-and-train-a-gan-pitfalls-and-tips-on-training-generative-adversarial-networks-edd529764aa9)

## Example

**Inference: computing floods**

```python
from pathlib import Path
from skimage.io import imsave
from tqdm import tqdm

from climategan.trainer import Trainer
from climategan.utils import find_images
from climategan.tutils import tensor_ims_to_np_uint8s
from climategan.transforms import PrepareInference


model_path = "some/path/to/output/folder" # not .ckpt
input_folder = "path/to/a/folder/with/images"
output_path = "path/where/images/will/be/written"

# resume trainer
trainer = Trainer.resume_from_path(model_path, new_exp=None, inference=True)

# find paths for all images in the input folder. There is a recursive option. 
im_paths = sorted(find_images(input_folder), key=lambda x: x.name)

# Load images into tensors 
#   * smaller side resized to 640 - keeping aspect ratio
#   * then longer side is cropped in the center
#   * result is a 1x3x640x640 float tensor in [-1; 1]
xs = PrepareInference()(im_paths)

# send to device
xs = [x.to(trainer.device) for x in xs]

# compute flood
#   * compute mask
#   * binarize mask if bin_value > 0
#   * paint x using this mask
ys = [trainer.compute_flood(x, bin_value=0.5) for x in tqdm(xs)]

# convert 1x3x640x640 float tensors in [-1; 1] into 640x640x3 numpy arrays in [0, 255]
np_ys = [tensor_ims_to_np_uint8s(y) for y in tqdm(ys)]

# write images
for i, n in tqdm(zip(im_paths, np_ys), total=len(im_paths)):
    imsave(Path(output_path) / i.name, n)
```

## Release process

In the `release/` folder
* create a `model/` folder
* create folders `model/masker/` and `model/painter/` 
* add the climategan code in `release/`: `git clone [email protected]:cc-ai/climategan.git`
* move the code to `release/`: `cp climategan/* . && rm -rf climategan`
* update `model/masker/opts/events` with `events:` from `shared/trainer/opts.yaml`
* update `model/masker/opts/val.val_painter` to `"model/painter/checkpoints/latest_ckpt.pth"`
* update `model/masker/opts/load_paths.m` to `"model/masker/checkpoints/latest_ckpt.pth"`