# Modules

The `pbdl` package consists of three modules, each designed for specific use cases in physics-based deep learning:
- `pbdl.loader`: Provides basic dataset access using NumPy arrays.
- `pbdl.torch.loader`: Supports dataset loading for training models in PyTorch.
- `pbdl.torch.phi.loader`: Supports dataset loading for training models in PyTorch with an integrated solver.
## pbdl.loader

This module is suitable for loading datasets when no training is involved and NumPy arrays are sufficient.
A `Dataloader` instance requires at least two arguments:

- dataset name (positional): The name of the dataset to be loaded.
- `time_steps`: The interval between input and target frame. If set to `None`, this interval is maximal (number of frames in the simulation minus one).
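As a minimal sketch of the `time_steps=None` case (reusing the `incompressible-wake-flow-tiny` dataset from the example below):

```python
from pbdl.loader import Dataloader

# With time_steps=None the interval is maximal: the input is the first
# frame of a simulation and the target the last one.
loader = Dataloader("incompressible-wake-flow-tiny", time_steps=None)
```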
Additionally, it accepts the following keyword arguments:

- `sel_sims`: Select specific simulations. By default, all simulations are included.
- `trim_start`/`trim_end`: Discard the initial or final sequence of frames, which may be uninteresting.
- `step_size`: Use every k-th frame (thinning out datasets with many frames). By default, the step size is 1.
- `normalize_data`/`normalize_const`: Choose from the available normalization strategies. By default, normalization is disabled.
- `batch_size`: Define the number of samples in each batch.
- `shuffle`: Determine whether the samples are provided in random order.
- `intermediate_time_steps`: If enabled, not only the initial and target frames are supplied but also all intermediate frames. This is useful for computing accumulated errors over multiple time steps.
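As a rough sketch of how several of these options combine (the dataset name is reused from the example below; the trim and step values are arbitrary):

```python
from pbdl.loader import Dataloader

# Drop the first 5 and last 2 frames of every simulation, keep every
# 2nd remaining frame, and normalize the data (the "std" strategy is
# the one used in the PyTorch example further down).
loader = Dataloader(
    "incompressible-wake-flow-tiny",
    time_steps=4,
    trim_start=5,
    trim_end=2,
    step_size=2,
    normalize_data="std",
    batch_size=8,
    shuffle=True,
)
```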
For a convenient way to use all simulation frames, set the `all_time_steps` flag. Note that this flag also controls related settings such as `time_steps`, `step_size`, and `intermediate_time_steps`.
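A minimal sketch of this flag (whether `time_steps` may simply be omitted alongside it is an assumption here):

```python
from pbdl.loader import Dataloader

# all_time_steps supplies every simulation frame; it also controls
# time_steps, step_size, and intermediate_time_steps.
loader = Dataloader("incompressible-wake-flow-tiny", all_time_steps=True)
```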
The following code provides a minimal example:
```python
from pbdl.loader import Dataloader
import matplotlib.pyplot as plt

loader = Dataloader(
    "incompressible-wake-flow-tiny",
    time_steps=10,  # interval between input and target frame
    sel_sims=[0],  # select first simulation
    batch_size=3,
    shuffle=True,
)

inputs, targets = next(iter(loader))

for i in range(len(inputs)):
    plt.subplot(2, len(inputs), i + 1)
    plt.imshow(inputs[i][0])  # display field at index 0
    plt.axis("off")
    plt.title("input {}".format(i + 1))

for i in range(len(targets)):
    plt.subplot(2, len(targets), len(targets) + i + 1)
    plt.imshow(targets[i][0])  # display field at index 0
    plt.axis("off")
    plt.title("target {}".format(i + 1))

plt.show()
```
## pbdl.torch.loader

This module is suitable for loading datasets for training with PyTorch. Unlike the dataloader in the previous module, the dataloader from `pbdl.torch.loader` returns a pair `(input tensor, target tensor)`, where both elements are PyTorch tensors. Each layer of the tensors represents a physical field or constant.
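As a quick sketch of this layout (the printed shapes are illustrative, not taken from an actual run):

```python
from pbdl.torch.loader import Dataloader

loader = Dataloader("transonic-cylinder-flow-tiny", time_steps=10, batch_size=3)
inputs, targets = next(iter(loader))

# Both elements are PyTorch tensors; dimension 1 indexes the layers,
# i.e. the physical fields and (for the input) the constants.
print(type(inputs), inputs.shape)    # e.g. torch.Size([3, C, H, W])
print(type(targets), targets.shape)
```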
The following code provides a minimal example:
```python
import torch
from pbdl.torch.loader import Dataloader
import examples.tcf.net_small as net_small

loader = Dataloader(
    "transonic-cylinder-flow-tiny",
    time_steps=10,
    sel_sims=[0, 1],
    step_size=3,
    normalize_data="std",
    batch_size=3,
    shuffle=True,
)

net = net_small.NetworkSmall()
criterionL2 = torch.nn.MSELoss()
optimizer = torch.optim.Adam(net.parameters(), lr=0.0001, weight_decay=0.0)

for epoch in range(5):
    for i, (input, target) in enumerate(loader):
        net.zero_grad()
        output = net(input)
        loss = criterionL2(output, target)
        loss.backward()
        optimizer.step()

    print(f"epoch {epoch}, loss {loss.item()}")
```
## pbdl.torch.phi.loader

This module is suitable if you want to integrate a (PhiFlow) solver into the training loop of your PyTorch program. It introduces new features that must be enabled using the following parameters:
- `batch_by_const`: A list of indices representing constants. It ensures that all samples in a batch share the same constant values. This is useful when using a solver function that requires a batch of samples but only one scalar value for each constant.
- `ret_batch_const`: When enabled, the loader also returns the non-normalized constants for the batch. This option is only available if batching by constants is enabled.
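With both parameters enabled, each batch is a triple rather than a pair, as the following sketch (using the `ks-dataset` from the full example below) illustrates:

```python
from pbdl.torch.phi.loader import Dataloader

loader = Dataloader(
    "ks-dataset",
    5,                     # time_steps
    batch_size=16,
    batch_by_const=[0],    # all samples in a batch share constant 0
    ret_batch_const=True,  # additionally return non-normalized constants
)

# ret_batch_const adds the batch constants as a third element
input, target, const = next(iter(loader))
```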
Additionally, the module provides auxiliary functions for converting tensors between PyTorch and PhiFlow:

- `to_phiflow(t)`: Converts network input to solver input by removing constant layers.
- `from_phiflow(t)`: Converts solver output to match the network output format.
- `cat_constants(t, l)`: Concatenates the constant layers from tensor `l` onto tensor `t`. This is useful because the network output does not include the constant layers required for the network input in the next iteration.
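Schematically, one step of a solver-in-the-loop iteration combines these three helpers as follows (a sketch distilled from the full example below; `solver_step` is a hypothetical stand-in for an actual solver call such as `diff_ks.etd1`):

```python
# x: network input tensor (field layers + constant layers)
x_solver = loader.to_phiflow(x)      # strip constant layers for the solver
y_solver = solver_step(x_solver)     # hypothetical solver call
y = loader.from_phiflow(y_solver)    # back to the network output format
x_next = loader.cat_constants(y, x)  # re-attach constants for the next step
```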
The following code provides a minimal example:
```python
import torch
from pbdl.torch.phi.loader import Dataloader
from examples.ks.ks_networks import ConvResNet1D
from examples.ks.ks_solver import DifferentiableKS

# solver parameters
DOMAIN_SIZE_BASE = 8
PREDHORZ = 5

device = "cuda:0" if torch.cuda.is_available() else "cpu"
diff_ks = DifferentiableKS(resolution=48, dt=0.5)

loader = Dataloader(
    "ks-dataset",
    PREDHORZ,
    step_size=20,
    intermediate_time_steps=True,
    batch_size=16,
    batch_by_const=[0],
    ret_batch_const=True,
)

net = ConvResNet1D(16, 3, device=device)
optimizer = torch.optim.Adam(net.parameters(), lr=1e-4)
loss = torch.nn.MSELoss()

for epoch in range(4):
    for i, (input, targets, const) in enumerate(loader):
        input = input.to(device)
        targets = targets.to(device)

        optimizer.zero_grad()
        domain_size = const[0]

        inputs = [input]
        outputs = []

        # unroll the solver-network loop over the prediction horizon
        for _ in range(PREDHORZ):
            output_solver = diff_ks.etd1(
                loader.to_phiflow(inputs[-1]), DOMAIN_SIZE_BASE * domain_size
            )
            correction = diff_ks.dt * net(inputs[-1])
            output_combined = loader.from_phiflow(output_solver) + correction
            outputs.append(output_combined)
            # re-attach constant layers for the next network input
            inputs.append(loader.cat_constants(outputs[-1], inputs[0]))

        outputs = torch.stack(outputs, dim=1)
        loss_value = loss(outputs, targets)
        loss_value.backward()
        optimizer.step()

    print(f"epoch {epoch}, loss {loss_value.item() * 10000.:.3f}")
```