README: fix typos, update code paths

2025-01-15 01:17:05 +00:00
parent b81ea7fa5a
commit 641dd9a5a1

@@ -80,7 +80,7 @@ presently, this requires *specific* versions of rust-nightly to work.
the feature is toggled at runtime, but compiled unconditionally. set up the toolchain according to [rust-toolchain.toml](rust-toolchain.toml):
```
$ rustup default nightly-2022-08-29
$ rustup default nightly-2023-01-21
$ rustup component add rust-src rustc-dev llvm-tools-preview
```
@@ -93,7 +93,7 @@ now you can swap out the `CpuDriver` with a `SpirvDriver` and you're set:
re-run it as before and you should see the same results:
```
$ cargo run --release --example wavefront
$ cargo run --release --bin wavefront
```
see the "Processing Loop" section below to understand what GPU acceleration entails.
@@ -102,10 +102,10 @@ see the "Processing Loop" section below to understand what GPU acceleration enta
the [sr\_latch](crates/applications/sr_latch/src/main.rs) example explores a more interesting feature set.
first, it "measures" a bunch of parameters over different regions of the simulation
(peak inside [`crates/coremem/src/meas.rs`](crates/coremem/src/meas.rs) to see how these each work):
(peek inside [`meas.rs`](crates/coremem/src/meas.rs) to see how these each work):
```rust
// measure a bunch of items of interest throughout the whole simulation duration:
// measure some items of interest throughout the whole simulation duration:
driver.add_measurement(meas::CurrentLoop::new("coupling", coupling_region.clone()));
driver.add_measurement(meas::Current::new("coupling", coupling_region.clone()));
driver.add_measurement(meas::CurrentLoop::new("sense", sense_region.clone()));
@@ -140,7 +140,7 @@ driver.add_serializer_renderer(&*format!("{}frame-", prefix), 36000, None);
run this, after having set up the GPU prerequisites:
```
$ cargo run --release --example sr_latch
$ cargo run --release --bin sr_latch
```
and then investigate the results with
@@ -161,7 +161,7 @@ what we see here is that both ferrites (the two large circles in the above image
we can see the "reset" pulse has polarized both ferrites in the counter-clockwise orientation this time. the E field is less pronounced because we gave the system 22ns instead of 3ns to settle this time.
the graphical viewer is helpful for debugging geometries, but the CSV measurements are useful for viewing numeric system performance. peak inside "out/applications/sr-latch/meas.csv" to see a bunch of measurements over time. you can use a tool like Excel or [visidata](https://www.visidata.org/) to plot the interesting ones.
the graphical viewer is helpful for debugging geometries, but the CSV measurements are useful for viewing numeric system performance. peek inside "out/applications/sr-latch/meas.csv" to see a bunch of measurements over time. you can use a tool like Excel or [visidata](https://www.visidata.org/) to plot the interesting ones.
here's a plot of `M(mem2)` over time from the SR latch simulation. we're measuring, over the torus volume corresponding to the ferrite on the right in the images above, the (average) M component normal to each given cross section of the torus. the notable bumps correspond to these pulses: "set", "reset", "set", "reset", "set+reset applied simultaneously", "set", "set".
@@ -171,14 +171,14 @@ here's a plot of `M(mem2)` over time from the SR latch simulation. we're measuri
## Processing Loop (and how GPU acceleration works)
the processing loop for a simulation is roughly as follows ([`crates/coremem/src/driver.rs:step_until`](crates/coremem/src/driver.rs) drives this loop):
1. evaluate all stimuli at the present moment in time; these produce an "externally applied" E and H field
across the entire volume.
the processing loop for a simulation is roughly as follows ([`driver.rs:step_until`](crates/coremem/src/driver.rs) drives this loop):
1. evaluate all stimuli at the present moment in time;
these produce an "externally applied" E and H field across the entire volume.
2. apply the FDTD update equations to "step" the E field, and then "step" the H field. these equations take the external stimulus from step 1 into account.
3. evaluate all the measurement functions over the current state; write these to disk.
4. serialize the current state to disk so that we can resume from this point later if we choose.
within each step above, the logic is multi-threaded and the rendeveous points lie at the step boundaries.
within each step above, the logic is multi-threaded and the rendezvous points lie at the step boundaries.
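
the four steps above can be sketched in a few lines of single-threaded rust. this is only an illustration of the loop's shape: every name below (`State`, `eval_stimulus`, `step_fields`, `measure_mean_e`, `step_until_sketch`) is hypothetical and not part of coremem's API, the field update is a placeholder rather than the real FDTD equations, and "disk" is replaced by in-memory vectors.

```rust
// hypothetical, simplified stand-ins: none of these names are coremem's real API.

/// a toy 1d simulation state: one E and one H sample per cell.
#[derive(Clone, Debug, PartialEq)]
pub struct State {
    pub e: Vec<f32>,
    pub h: Vec<f32>,
    pub step: u64,
}

impl State {
    pub fn new(cells: usize) -> Self {
        Self { e: vec![0.0; cells], h: vec![0.0; cells], step: 0 }
    }
}

/// step 1: evaluate the stimulus at time `t`, giving an externally applied
/// E value for every cell. assumed stimulus: uniform in space, linear in time.
pub fn eval_stimulus(cells: usize, t: f32) -> Vec<f32> {
    vec![t; cells]
}

/// step 2: "step" E, then "step" H. this is a placeholder update that only
/// mimics the ordering; it is NOT the actual FDTD update equations.
pub fn step_fields(state: &mut State, applied_e: &[f32], dt: f32) {
    for (e, ext) in state.e.iter_mut().zip(applied_e) {
        *e += dt * ext;
    }
    for (h, e) in state.h.iter_mut().zip(&state.e) {
        *h += dt * e;
    }
    state.step += 1;
}

/// step 3: one example measurement, the mean E value over the whole volume.
pub fn measure_mean_e(state: &State) -> f32 {
    state.e.iter().sum::<f32>() / state.e.len() as f32
}

/// runs steps 1-4 until `end_step`. measurements and snapshots are returned
/// in memory here instead of being written to disk.
pub fn step_until_sketch(state: &mut State, end_step: u64, dt: f32) -> (Vec<f32>, Vec<State>) {
    let mut measurements = Vec::new();
    let mut snapshots = Vec::new();
    while state.step < end_step {
        let t = state.step as f32 * dt;
        let applied = eval_stimulus(state.e.len(), t); // step 1
        step_fields(state, &applied, dt); // step 2
        measurements.push(measure_mean_e(state)); // step 3
        snapshots.push(state.clone()); // step 4
    }
    (measurements, snapshots)
}
```

calling `step_until_sketch(&mut State::new(4), 3, 1.0)` runs three iterations and returns one measurement and one snapshot per iteration. each numbered step finishes before the next one starts, which is where the rendezvous points would sit in a multi-threaded version.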
it turns out that the Courant rules force us to evaluate FDTD updates (step 2) on a _far_ smaller time scale than the other steps are sensitive to. so to tune for performance, we apply some optimizations:
- stimuli (step 1) are evaluated only once every N frames. we still *apply* them on each frame individually. the waveform resembles that of a Sample & Hold circuit.
@@ -202,12 +202,12 @@ this library takes effort to separate the following from the core/math-heavy "si
the simulation only interacts with these things through a trait interface, such that they're each swappable.
common stimuli type live in [crates/coremem/src/stim/](crates/coremem/src/stim/).
common measurements live in [crates/coremem/src/meas.rs](crates/coremem/src/meas.rs).
common render targets live in [crates/coremem/src/render.rs](crates/coremem/src/render.rs). these change infrequently enough that [crates/coremem/src/driver.rs](crates/coremem/src/driver.rs) has some specialized helpers for each render backend.
common materials are spread throughout [crates/cross/src/mat/](crates/cross/src/mat/).
different float implementations live in [crates/cross/src/real.rs](crates/cross/src/real.rs).
if you're getting NaNs, you can run the entire simulation on a checked `R64` (CPU-only) or `R32` (any backend) type in order to pinpoint the moment those are introduced.
common stimuli types live in [stim/mod.rs](crates/coremem/src/stim/mod.rs).
common measurements live in [meas.rs](crates/coremem/src/meas.rs).
common render targets live in [render.rs](crates/coremem/src/render.rs). these change infrequently enough that [driver.rs](crates/coremem/src/driver.rs) has some specialized helpers for each render backend.
common materials are spread throughout [mat/mod.rs](crates/cross/src/mat/mod.rs).
different float implementations live in [real.rs](crates/cross/src/real.rs).
if you're getting NaNs, you can run the entire simulation on a checked `R64` type in order to pinpoint the moment those are introduced.
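
to illustrate the idea behind a checked float type, here is a minimal sketch. the `Checked` newtype below is hypothetical and is not the `R64` from [real.rs](crates/cross/src/real.rs); it only assumes that "checked" means "validate the result of every arithmetic operation", so a NaN panics at the operation that produced it rather than surfacing many steps later.

```rust
use std::ops::{Add, Div, Mul, Sub};

/// hypothetical checked float, for illustration only (not the crate's `R64`).
#[derive(Clone, Copy, Debug, PartialEq, PartialOrd)]
pub struct Checked(f64);

impl Checked {
    /// panics if `v` is NaN, so the failure points at the operation that
    /// introduced it.
    pub fn new(v: f64) -> Self {
        assert!(!v.is_nan(), "NaN introduced: {}", v);
        Checked(v)
    }

    pub fn get(self) -> f64 {
        self.0
    }
}

// every arithmetic result is routed back through `Checked::new`.
impl Add for Checked {
    type Output = Self;
    fn add(self, rhs: Self) -> Self {
        Checked::new(self.0 + rhs.0)
    }
}

impl Sub for Checked {
    type Output = Self;
    fn sub(self, rhs: Self) -> Self {
        Checked::new(self.0 - rhs.0)
    }
}

impl Mul for Checked {
    type Output = Self;
    fn mul(self, rhs: Self) -> Self {
        Checked::new(self.0 * rhs.0)
    }
}

impl Div for Checked {
    type Output = Self;
    fn div(self, rhs: Self) -> Self {
        Checked::new(self.0 / rhs.0)
    }
}
```

with this wrapper, `Checked::new(0.0) / Checked::new(0.0)` panics immediately, and the backtrace identifies the offending division. the extra check on every operation costs time, so a type like this suits debugging runs rather than production ones.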
## Materials
@@ -237,17 +237,17 @@ this library ships with the following materials:
- `MHPgram` specifies the `M(H)` function as a parallelogram.
- `MBPgram` specifies the `M(B)` function as a parallelogram.
measurements include ([crates/coremem/src/meas.rs](crates/coremem/src/meas.rs)):
measurements include ([meas.rs](crates/coremem/src/meas.rs)):
- E, B or H field (mean vector over some region)
- energy, power (net over some region)
- current (mean vector over some region)
- mean current magnitude along a closed loop (toroidal loops only)
- mean magnetic polarization magnitude along a closed loop (toroidal loops only)
output targets include ([crates/coremem/src/render.rs](crates/coremem/src/render.rs)):
output targets include ([render.rs](crates/coremem/src/render.rs)):
- `ColorTermRenderer`: renders 2d-slices in real-time to the terminal.
- `Y4MRenderer`: outputs 2d-slices to an uncompressed `y4m` video file.
- `SerializerRenderer`: dumps the full 3d simulation state to disk. parseable after the fact with [crates/post/src/bin/viewer.rs](crates/post/src/bin/viewer.rs).
- `SerializerRenderer`: dumps the full 3d simulation state to disk. parseable after the fact with [viewer.rs](crates/post/src/bin/viewer.rs).
- `CsvRenderer`: dumps the output of all measurements into a `csv` file.
historically there was also a plotly renderer, but that effort was redirected into developing the viewer tool better.
@@ -266,8 +266,8 @@ contrast that to the CPU-only implementation which achieves 24.6M grid cell step
# Support
the author can be reached on Matrix <@colin:uninsane.org>, email <colin@uninsane.org> or Activity Pub <@colin@fed.uninsane.org>. i poured a lot of time into making
this: i'm happy to spend the marginal extra time to help curious people make use of what i've made, so don't hesitate to reach out.
the author can be reached on Matrix <@colin:uninsane.org>, email <mailto:colin@uninsane.org> or ActivityPub <@colin@fed.uninsane.org>.
i'd love for this project to be useful to people besides just myself, so don't hesitate to reach out.
## Additional Resources
@@ -288,4 +288,5 @@ David Bennion and Hewitt Crane documented their approach for transforming Diode-
although i decided not to use PML, i found Steven Johnson's (of FFTW fame) notes to be the best explainer of PML:
- [Steven Johnson: Notes on Perfectly Matched Layers (PMLs)](https://math.mit.edu/~stevenj/18.369/spring07/pml.pdf)
a huge thanks to everyone above for sharing the fruits of their studies. though my work here is of a lesser caliber, i hope that someone, likewise, may someday find it of use.
a huge thanks to everyone above for sharing the fruits of their studies.
this project would not have happened if not for literature like the above from which to draw.