197 Commits

Author SHA1 Message Date
f8ae96ef13 README: clarify how to implement _compound_ materials 2025-01-15 02:04:14 +00:00
641dd9a5a1 README: fix typos, update code paths 2025-01-15 01:51:46 +00:00
b81ea7fa5a flake: rust-overlay: 2024-08-01 -> 2025-01-11 2025-01-15 01:47:30 +00:00
518e0821df update deps: rust-gpu: 86d60422 -> d78c3017 and associated rust-toolchain: nightly-2022-12-18 -> nightly-2023-01-21 2025-01-15 01:47:30 +00:00
96932ddb64 update deps: rust-gpu: dcf37b75 -> 86d60422 and associated rust-toolchain: nightly-2022-09-25 -> nightly-2022-12-18 2025-01-15 01:47:30 +00:00
293ba76b1f docs: disable doctesting for docs which aren't rust code (or for which the code can't be invoked publicly) 2025-01-15 01:47:30 +00:00
dab5e42d1a update deps: rust-gpu: 985007fc -> dcf37b75, associated rust-toolchain: nightly-2022-08-29 -> nightly-2022-09-25
verified working: cargo test; cargo run --release --bin sr_latch; cargo run --release --bin wavefront
2025-01-15 01:47:30 +00:00
dc201549c1 Rust-GPU: pin to 985007fc, until i can figure how to bump it 2025-01-15 01:47:30 +00:00
74f1f3b232 spirv_backend_builder: un-pin compiler internals (so far, they seem to be not required?)
`cargo test` passes, as does `cargo run --release --bin sr_latch`
2025-01-15 01:47:30 +00:00
781aff3935 migrate rust-gpu from EmbarkStudios -> Rust-GPU project/namespace 2025-01-15 01:47:30 +00:00
6aec4aa4fe flake: update: nixpkgs: 23.05 -> 23.11
`cargo test` passes (excluding doc tests, but including wgpu tests!)
2025-01-15 01:47:30 +00:00
f688d49511 flake: rust-overlay: 2023-12-30 -> 2024-06-27
`cargo test` passes (except wgpu tests, doc tests)
2025-01-15 01:47:30 +00:00
970201bfa1 flake: update: nixpkgs: 22.11 -> 23.05
`cargo test` passes (except wgpu tests, doc tests)
2025-01-15 01:47:30 +00:00
daf8e9dba8 flake: update: rust-overlay: 2023-06-30 -> 2023-12-30
`cargo test` passes (except wgpu tests, doc tests)
2025-01-15 01:47:30 +00:00
e9ce42a2d2 flake: update: rust-overlay 2022-12-31 -> 2023-06-30
`cargo test` passes (except wgpu tests, doc tests)
2025-01-15 01:47:30 +00:00
47604bbf36 flake: update rust-overlay 2022-09-25 -> 2022-12-31
`cargo test` passes (except wgpu tests, doc tests)
2025-01-15 01:47:30 +00:00
3fede45bdb nixpkgs: 22.05 -> 22.11
`cargo test` passes (except wgpu tests, doc tests)
2025-01-15 01:47:30 +00:00
61b55b4ad5 flake: update
`cargo test` passes (except wgpu tests, doc tests)
2025-01-15 01:47:30 +00:00
cc42faeef4 flake: lock rust-overlay to 45140fa526b1cb85498f717e355c79a54367cb1d (until i figure how to update it)
`cargo test` passes (except wgpu tests, doc tests)
2025-01-15 01:47:30 +00:00
4cc46ae71a README: make this file-path a link 2022-12-07 10:00:15 +00:00
beb43843ff app: sr_latch: fix doc-comment to have file scope 2022-12-07 09:58:03 +00:00
c06aa7ff36 README: update toolchain docs to match what we actually use 2022-12-07 09:56:36 +00:00
bcefd46105 nit: use consistent syntax in material docs 2022-12-07 09:49:06 +00:00
859a7f8b18 rename FullyGenericMaterial -> GenericMaterial
this naming was an artifact from the separate CPU/GPU material implementations.
2022-12-07 09:46:33 +00:00
6d73150fb6 README: fix up stale paths, material references 2022-12-07 09:41:38 +00:00
ed55cdfe10 app: stacked_cores: 61-xx: complete more experiments; start ones with only symmetric coupling 2022-11-26 03:42:10 +00:00
1c9527bb63 app: stacked cores: 61-xx: complete more runs 2022-11-24 11:13:53 +00:00
06aaf55e30 app: stacked_cores: 61-xx: more experiments 2022-11-22 12:56:01 +00:00
0d62b60423 app: stacked_cores: 61-xx: complete more runs 2022-11-22 00:25:52 +00:00
7bb8199b02 app: stacked_cores: 61-xx: complete more runs
still nothing with >> 1.0x amplification.
though we do see configurations which might *locally* amplify:
- 2:1 inp coupling and 4:1 output coupling
        - output is amplified relative to the middle cores
        - but the middle cores transition less than fully
2022-11-19 11:18:24 +00:00
cca6a7c8cd app: stacked_cores: complete more 61-xx runs 2022-11-18 23:54:34 +00:00
9cb9c4dd66 app: stacked_cores: 61-xx: complete a few first-pass runs over an alternatively-parameterized complementary buffer 2022-11-18 10:20:49 +00:00
da199568ff app: stacked_cores: 53-xx: do another run with greater asymmetry
5:1 does worse than 3:1 here
2022-11-18 02:47:09 +00:00
eccd865cf7 app: stacked_cores: 60-xx: new experiment that tries moving a value along a 4-core loop
it does this in a non-complementary way; and it doesn't get more than
about 0.60 amplification
2022-11-17 23:28:49 +00:00
e13ddbdc1f app: stacked_cores: complete more 59-xx runs 2022-11-17 09:45:19 +00:00
38aa677aad app: stacked_cores: complete some 59-xx runs
they don't look super insightful/promising.
2022-11-17 01:17:42 +00:00
a2a851b26f app: stacked_cores: 58-xx: try merging cores via a complementary buffer
results aren't any better than the earlier complementary buffers
2022-11-16 12:29:27 +00:00
9d4e245388 app: stacked_cores: new 58-xx sim which tries a complementary buffer into a pre-charged output 2022-11-11 22:59:57 +00:00
6e198caaa3 fix "reset" -> "set" typo in SR latch example 2022-11-11 05:28:54 +00:00
b7112fab86 app: stacked_cores: 57xx: do some runs where only one pos core is wired
into the output
2022-11-11 03:35:24 +00:00
ea6799b764 app: stacked_cores: new 57-xx experiment: complementary buffer with doubled inputs 2022-11-10 01:23:02 +00:00
4539cb18fe app: stacked_cores: 56-xx: complete a few more runs 2022-11-09 03:36:04 +00:00
7443599054 app: stacked_cores: new 56-xx sim for complementary logic using multiple input cores
like 53-xx, but with double the input cores, and a fixed 1:1 coupling.
it achieves 0.9x amplification at best.
- which is *better* than the 0.8 amplification we see with 53-xx when
  using 1:1 coupling, but not enough
- what if we try this idea with 3:1 winding? we can do that if we
sandwich each output _between_ its associated input.
2022-11-09 01:20:15 +00:00
df68100f82 app: stacked_cores: define a fork -> join sim
this is like 18xx, but better measured & with better control/coupling
wirings.

so far we don't have anything > 1.0x amp, but closer to 0.75x
2022-11-08 09:23:28 +00:00
be172d4371 remove the 'license' section
no need to state my ideals so in-your-face here. better to just omit any talk
of licensing, if i truly believe it to be irrelevant.
2022-11-07 03:19:53 -08:00
4407a8d3f7 app: stacked_cores: 53-xx: complete some more runs, including one where inputs are uncoupled 2022-11-07 02:46:24 -08:00
16525127a1 app: stacked_cores: 53-xx: complete a run which uses pos-windings != neg-windings 2022-11-05 18:54:10 -07:00
af4b5ffa32 app: stacked_cores: 53-xx: complete a 1:1 coupled buffer
slope is poor, hovering around a constant 0.75 transmission ratio.
2022-11-05 02:59:10 -07:00
1742172e6c app: stacked_cores: 53-xx: add a 5:1 buffer
it seems to under-transfer compared to the 3:1 buffers.
this *might* be an issue of drive current -- unclear.
2022-11-04 06:11:08 -07:00
3ebcc550a0 app: stacked_cores: 53-xx: better constrain the interpolation, and plot slope 2022-11-04 06:10:04 -07:00
373c80793f app: stacked_cores: improve the 52-xx plotting/interpolation
it's a little slow :-(
i'd guess the `score` function is the slowest part. can maybe get scipy
to do a dot-product for us?
2022-11-04 03:22:42 -07:00
df828b6299 app: stacked_cores: create a plot_53xx and refactor the surroundings
note that the interpolation is very BAD. i need to figure out better
sampling.
2022-11-03 21:21:07 -07:00
4023a67912 app: stacked_cores: ingest 53-xx buffer results 2022-11-03 20:26:17 -07:00
aee3796c29 app: stacked_cores: note about preservation 2022-11-03 05:46:58 -07:00
a45c0c4324 app: stacked_cores: new 53-xx run, where we buffer differential signals 2022-11-03 05:42:38 -07:00
47189dcc7e app: stacked_cores: 52-xx: complete more or-gate parameterizations 2022-11-03 04:39:06 -07:00
70e14b4578 app: stacked_cores: 52-xx: sort the runs naturally (natsort) 2022-11-03 01:09:12 -07:00
a754b6e01d app: stacked_cores: 52-xx: capture more runs of existing or gates 2022-11-03 01:02:33 -07:00
286a267f75 app: stacked_cores: 52-xx: add some facilities for plotting 52-xx or gate runs
it's primitive; not the *best* results
2022-11-02 17:21:29 -07:00
ff68e57fa5 app: stacked_cores: 52-xx: collect the measurements into a db 2022-11-02 15:42:49 -07:00
87366cf473 app: stacked_cores: define an "or gate" sim + script for post-processing
extract transfer characteristics with e.g.
```
extract_meas.py ../../../out/applications/stacked_cores/52-or--0.0004rad-5000ctl_cond-20000coupling_cond-2000ps-100ps-3ctl-3coupling-3_1_winding-49999998976e0-drive- 2e-9 4e-9 8e-9
```
2022-11-01 00:11:50 -07:00
9d2fbf8b07 app: stacked_cores: expand the 48-xx run set 2022-10-31 20:49:11 -07:00
267a204e7e app: stacked_cores: complete some more 51-xx runs with variable winding ratios 2022-10-29 01:05:51 -07:00
fc0ce9f083 app: stacked_cores: 51-xx: try some higher-current variants; schedule some 5:1 and 7:1 inverter runs 2022-10-28 05:18:18 -07:00
0c7df48234 app: stacked_cores: 51-xx: complete some experiments using single-clock cascaded cores
i'm able to get 0.8x amplification between the first and the third core.
this is *less* than the amplification i got when cascading only one core
of the first, so not likely a good direction to pursue, though i haven't
yet explored that much of the parameter space.
2022-10-28 02:43:12 -07:00
12f286c3c7 app: stacked_cores: prototype a 3-core/1-cycle inverter (51-xx)
we vary the conductivities, as with 50-xx. the hope is that with a
multi-core approach like this we might get >> 1.0x amplification in the
unloaded setup, which we can place into a loaded circuit and deal with
the ~70% loading penalty.
2022-10-27 18:31:39 -07:00
9c17d3b45d app: stacked_cores: conclude 50-xx runs 2022-10-27 18:17:11 -07:00
57e12cbe32 app: stacked_cores: 50-xx: explore some more runs
got the amplification up to a bit over 0.3...
2022-10-26 08:08:15 -07:00
6af2d1d9e3 app: stacked_cores: complete some more 50-xx runs 2022-10-25 15:29:33 -07:00
cba2db6b10 app: stacked_cores: 50-xx: complete a run with high *control* conductivity, and schedule a few more 2022-10-25 05:37:54 -07:00
f4f672aab6 app: stacked_cores: fix 49-xx (now 50-xx) and run a few parameterizations 2022-10-25 03:58:19 -07:00
3f54b25cf1 app: stacked_cores: define 49-xx: a *typo'd* multi-stage inverter with parameterized conductivities
in fact, M2 is initialized improperly: this actually acts as an
(overpowered) single-clock-cycle inverter.
2022-10-24 21:49:10 -07:00
3e331db374 app: stacked_cores: 48-xx: grab more detailed measurements for recent inverters 2022-10-24 06:53:03 -07:00
87e94d2182 app: stacked_cores: enable 0.001-level precision for current setting 2022-10-24 02:40:58 -07:00
21d41ff3d5 app: stacked_cores: 48-xx: run another 2e10 I parameterization 2022-10-24 02:37:05 -07:00
e526289fe9 app: stacked_cores: 48-xx: test the high-side of current for an already successful run 2022-10-24 00:26:37 -07:00
0e3212e624 app: stacked_cores: 48-xx: try a few more 10ns, 5e4 coupling cond runs 2022-10-23 21:04:43 -07:00
2b8c5d45c2 app: stacked_cores: finish a lower-current variant of the 48-xx 5e2/4e4 conductivity run 2022-10-22 08:05:13 -07:00
e1867ee541 app: stacked_cores: 48-xx: complete a very low control-conductivity run (2e2) 2022-10-22 05:37:34 -07:00
816d7edc38 app: stacked_cores: 48-xx: complete a few more runs with varied conductivity ratios 2022-10-22 01:24:18 -07:00
32c643ef13 app: stacked_cores: 48-xx: complete runs for 5e2/4e4 ctrl/coupling run
high slope (1.70) over a narrow domain
2022-10-21 20:33:51 -07:00
8a8823ffd8 app: stacked_cores: more 48-xx runs where we vary the coupling conductivity separate from the control conductivity 2022-10-21 19:19:55 -07:00
75a88887f0 app: stacked_cores: 48-xx: simulate a few more variants
got one with a 1.4x slope at the start.
that's novel across all inverters i've simulated to-date.
2022-10-21 09:54:20 -07:00
3dbdead1cb app: stacked_cores: 48-xx: complete a few more runs 2022-10-21 05:13:28 -07:00
daf50324d7 app: stacked_cores: complete more 48-xx runs 2022-10-21 01:00:46 -07:00
6f57e17bef app: stacked_cores: 48-xx: add some runs 2022-10-17 06:51:48 -07:00
7c0151220c app: stacked_cores: new 48-xx sim which varies conductivities on a 2-core buffer 2022-10-17 04:32:30 -07:00
ee74163131 app: stacked_cores: complete a few runs of 46-xx where the output is floating
this shows us that most of the load preventing M1 from switching is due
to us holding its *downstream* core steady.

if we could somehow make it so that the downstream core presented a
lower load to M1, then we could hold it steady while writing M0 -> M1.

this is similar to saying "make M0 -> M1 a circuit that amplifies A >> 1
and make M1 -> M2 a 1:1 circuit". then we can hold M2 low and still get
amplification A - 1.

then the question is how do we get A >> 1?
2022-10-17 03:40:03 -07:00
760dd0070f app: stacked_cores: complete a few more 46-xx runs 2022-10-16 23:18:33 -07:00
ff2c79162c app: stacked_cores: 47-xx: cascade two buffers and vary their parameterization 2022-10-16 17:21:10 -07:00
c458b3135b app: stacked_cores: fix flipped 41-xx measurements 2022-10-16 06:02:13 -07:00
e8adf6eaa7 app: stacked_cores: include intermediate core values in the db for multi-core inverters 2022-10-16 05:20:55 -07:00
3498649312 42-xx: try some > 400um inverters 2022-10-16 04:58:00 -07:00
7ecd8fa881 app: stacked_cores: backfill some 40-xx parameterizations 2022-10-16 04:30:49 -07:00
226e4949d0 app: stacked_cores: minimize what we extrapolate from beyond the measured transfer domain 2022-10-16 04:28:44 -07:00
74858ee247 app: stacked_cores: add aliases for poorly formatted f32 strings 2022-10-16 02:29:31 -07:00
3614d00871 app: stacked_cores: sort all the inverters in the db 2022-10-16 02:12:47 -07:00
bc61fd0d0a app: stacked_cores: 46-xx: complete some runs of an inverter cascaded into a buffer
the results aren't great :'(
2022-10-16 02:00:55 -07:00
33b0b76278 app: stacked_cores: plot what happens when one cascades an inverter into a buffer 2022-10-15 23:26:10 -07:00
d03818b58e app: stacked_cores: try varying the number of control loops separately from the coupling loops
doesn't make a huge difference, apparently.
2022-10-15 21:45:37 -07:00
3a21cf7655 app: stacked_cores: try a 3-core inverter where the 3rd core is initialized LOW
theory being that this would place less load on the intermediary core,
allowing it to transition more. but that wasn't actually the case.
2022-10-15 07:45:14 -07:00
8a3914d56d app: stacked_cores: factor out the inverter wiring setup 2022-10-14 20:11:58 -07:00
5a61613381 app: stacked_cores: 43-xx: complete more current variations 2022-10-14 19:25:05 -07:00
997ac5f299 app: stacked_cores: 43-xx: complete some 600um runs 2022-10-14 08:18:38 -07:00
8407c2c8e8 app: stacked_cores: 43-xx: run more current variations 2022-10-13 21:53:30 -07:00
196e6c8790 app: stacked_cores: 43-xx: run a few 5x 3:1 current variations 2022-10-13 19:22:58 -07:00
b07da366f1 app: stacked_cores: 43-xx: ingest results 2022-10-13 17:27:16 -07:00
f4d637fc98 app: stacked_cores: new 43-xx experiment where we cascade two asymmetrically-wound inverters 2022-10-12 07:39:02 -07:00
1cfebb73e0 app: stacked_cores: complete a few more 42-xx runs 2022-10-12 03:42:25 -07:00
0bf7b379d6 app: stacked_cores: explore more 4x 7:1 parameterizations 2022-10-11 23:27:40 -07:00
2f097ab1a8 app: stacked_cores: 42-xx: explore more 9x 3:1 parameterizations 2022-10-11 21:26:17 -07:00
f4b21afe58 app: stacked_cores: 42-xx: explore 6x 5:1 parameterizations 2022-10-11 20:25:25 -07:00
0c079585b0 app: stacked_cores: 42-xx: explore some more > 3:1 runs 2022-10-11 18:46:40 -07:00
c6814796e1 app: stacked_cores: 42-xx: conclude a 3e10 drive variant of the 2x 13:1 inverter 2022-10-11 07:25:46 -07:00
09ea393417 app: stacked_cores: 42-xx: run a 2x 13:1 experiment at 2e10 current 2022-10-11 04:32:37 -07:00
e76fd7f045 app: stacked_cores: 42-xx: re-measure 400um 4x 7:1 at 1e10 coupling 2022-10-11 03:29:31 -07:00
ff203011df app: stacked_cores: 42-xx: explore more runs of the low-current 400um 9x 3:1 parameterization 2022-10-11 02:11:30 -07:00
348042ca00 app: stacked_cores: 42-xx: complete more runs 2022-10-10 22:49:46 -07:00
bab747b97b app: stacked_cores: 42-xx: complete some runs
not all the "inverters" from 41-xx behave as actual, native
inverters when run in the native-inverting configuration.
2022-10-10 16:45:44 -07:00
197c1ca30d app: stacked_cores: complete the first 42-xx inverter run 2022-10-10 06:48:27 -07:00
d8eeecfa4e app: stacked_cores: new grouping: 42-xx: test a native inverter 2022-10-10 05:15:48 -07:00
ff88b18473 Intersection: add a new3 constructor 2022-10-10 04:25:23 -07:00
3e32526099 app: stacked_cores: complete a 400um 9x 3:1 run at 12e9 drive strength 2022-10-10 02:50:50 -07:00
1069f63255 app: stacked_cores: try another 400um 9x 3:1 run with higher current
also completed a bunch more detail for adjacent inverters.
2022-10-09 16:59:24 -07:00
c0e2b1ba6c app: stacked_cores: try a 8e9 drive strength variant of the 400um 3:1 inverter 2022-10-09 06:21:44 -07:00
7150d4c8b3 app: stacked_cores: test some variants of the 400um 6x 5:1 core 2022-10-09 04:56:11 -07:00
8b3b638de1 app: stacked_cores: take more readings for the 400um 5:1 41-xx run 2022-10-09 04:25:05 -07:00
19bf9e2d31 app: stacked_cores: try a 41-xx 400um 4x 7:1 run at 4e10 drive strength 2022-10-09 03:34:31 -07:00
d5f2c75ec7 app: stacked_cores: complete more 41-xx runs of the validated inverters 2022-10-07 14:48:30 -07:00
12d0737c6b app: stacked_cores: 41-xx: finish more runs of the 1200um 3:1 inverter 2022-10-07 03:20:20 -07:00
7b2bb56e7a app: stacked_cores: 41-xx: finish the 1200 um 5:1 inverter 2022-10-06 21:47:28 -07:00
972db0d45f app: stacked_cores: mark (36, 1, um(1200), 4e9) as not a viable inverter 2022-10-06 16:00:22 -07:00
2f9110d858 app: stacked_cores: confirm another inverter: 41-0.0011999999rad-24coupling-5_1_winding-1e10-drive 2022-10-06 03:45:08 -07:00
269a5f979b app: stacked_cores: 41-xx: more db entries for 1:1 coupling
- `41-0.0004rad-18coupling-1_1_winding-1e10-drive`
- `41-0.0004rad-18coupling-1_1_winding-5e9-drive`
2022-10-05 19:31:49 -07:00
2ab3bf39ed app: stacked cores: 41-xx: try non-asymmetric wrapping between cores 2022-10-05 15:37:28 -07:00
159652e1d6 app: stacked_cores: 41-xx: launch a 1200um 5:1 run with higher drive current 2022-10-05 15:30:00 -07:00
aeaed7aba3 app: stacked_cores: 41-xx: record working 1200um inverter; try again with 4e9 I 2022-10-05 15:26:09 -07:00
1d9d3659b8 app: stacked_cores: 41-xx: record new runs
- `41-0.0008rad-24coupling-3_1_winding-3e9-drive`
- `41-0.0008rad-16coupling-5_1_winding-5e9-drive`
- `41-0.0011999999rad-24coupling-5_1_winding-5e9-drive`
2022-10-05 15:16:32 -07:00
0739749982 app: stacked_cores: 41-xx: record some 1200um runs 2022-10-05 00:07:05 -07:00
0de33a33ce app: stacked_cores: 41-xx: try a 1200um 5:1 run 2022-10-04 15:56:37 -07:00
adfa4b1e78 app: stacked_cores: 41-xx: try a 1200um run 2022-10-04 15:54:48 -07:00
6d8e9d050f app: stacked_cores: 41-xx: complete start of 800um 16x 5:1 5e9 I sim, and remove it
too low transfer at logic low.

also, add a tool to analyze inverters without plotting them
2022-10-04 15:52:13 -07:00
c8bf2053ef app: stacked_cores: 41-xx establish a working 800um 3:1 inverter; start on a 800um 5:1 inverter 2022-10-04 15:19:07 -07:00
726e60061f app: stacked cores: tackle more interesting parameterizations sooner 2022-10-04 04:58:30 -07:00
82d264045c app: stacked_cores: 41-xx: try to find more 800um inverters 2022-10-04 04:39:50 -07:00
807ae68523 app: stacked_cores: wrap up 800um 10x 7:1 1e10 I run 2022-10-04 03:20:37 -07:00
ea69807a90 app: stacked_cores: 41-xx: verdict on 600um 18x 3:1 2e10 I 2022-10-04 02:43:42 -07:00
427bb1ec22 app: stacked_cores: 41-xx: mark some completed runs; prototype a 800um 3:1 run 2022-10-04 01:44:53 -07:00
98a7815cd7 enumerated: improve docs 2022-10-04 01:29:04 -07:00
ab01d8eff0 app: stacked_cores: 41-xx: mark which inverters from the last batch were good/bad 2022-10-04 00:22:58 -07:00
b869de6d91 app: stacked-cores: save the 600 um 18x 3:1 runs 2022-10-04 00:09:04 -07:00
0786481b79 app: stacked_cores: 41-xx: define a few more inverters to try 2022-10-03 04:32:59 -07:00
e62d4d066c app: stacked_cores: 40-xx: document more runs 2022-10-03 04:28:30 -07:00
8974282a9d sim: backfill a test to show that conductors properly reflect EM waves 2022-10-03 04:21:43 -07:00
ecfdf5e322 sim: tests: split the ray_propagation test into smaller helpers
these might be useful for other future tests as well.
2022-10-03 03:06:40 -07:00
72d6d017a6 app: stacked cores: 40-xx: complete a few more runs 2022-10-03 02:34:26 -07:00
c9c2f11ec8 app: stacked_cores: 40xx: vary the current on 18x 3:1 600um 2022-10-03 01:45:54 -07:00
b8a7cc54e2 app: stacked_cores: 41-xx: define 2 more sims (600um) 2022-10-02 23:06:38 -07:00
1a156203b7 app: stacked_cores: 40-xx: update db for the 400um 9x 3:1 5e9 I run
it looks like it's not a viable inverter
2022-10-02 22:47:29 -07:00
0d82cf414e app: stacked_cores: 41-xx: gather more samples to demonstrate ineffectiveness of (9, 1, um(400), 2e10) 2022-10-02 17:16:51 -07:00
4800a1b625 app: stacked cores: 40-xx: record more completed runs 2022-10-02 16:57:51 -07:00
c9dd27f741 mb_pgram: better docs 2022-10-02 04:39:22 -07:00
ae3ac2717b app: stacked_cores: 40-xx: define another 400um inverter to test 2022-10-02 04:11:46 -07:00
6c43506a3e app: stacked cores: 44-xx: document newly completed runs; define next parameters 2022-10-02 03:44:57 -07:00
600314d5af delete unnecessary regression tests.
`mb_ferromagnet_50_steps_larger` provides coverage for valid magnetic
devices.
2022-10-02 03:29:17 -07:00
4ffbc0b8af fix broken tests:
- mb_ferromagnet_diff_repro
- mb_ferromagnet_diff_minimal_repro

as the comment hinted:
> these tests probably failed earlier because they were allowing negative mu_r values.
> no (ordinary?) material has a negative permeability.
> most materials (except superconductors) have >= 1.0 relative permeability
> permeability = mu = B/H (or, \Delta B/\Delta H)

in fact, the relative permeability was -0.56.
it's now 1.39
2022-10-02 03:27:02 -07:00
d4a59b8944 app: stacked-cores: 40xx: sort db 2022-10-01 23:50:18 -07:00
9a7591c18e app: stacked-cores: 40xx: ingest new sim runs; start next batch 2022-10-01 23:49:51 -07:00
2ac34f0753 app: stacked_cores: 41-xx: start some new sims based on findings
higher current seems to _decrease_ tx at x=0, generally a good thing.
2022-10-01 16:48:21 -07:00
8484ab7de5 app: stacked cores: complete some 41-xx runs 2022-10-01 16:42:43 -07:00
8c9e02a77f app: stacked_cores: try adding multiple control loops 2022-10-01 04:35:45 -07:00
2353eb531c app: stacked cores: record 1200um results 2022-09-30 23:00:09 -07:00
ef40a8f0f3 app: stacked_cores: ingest a few 1200um results 2022-09-30 06:01:38 -07:00
ea3bc50af2 app: stacked_cores: ingest new results; define next sims 2022-09-30 03:17:14 -07:00
a60bc69403 app: stacked-cores: rearrange/order sims and define some new ones 2022-09-29 18:15:12 -07:00
3bed385cae app: stacked-cores: plot specific cases, like only the viable inverters 2022-09-29 17:15:43 -07:00
5286339413 app: stacked-cores: 40xx db: preserve parameterization in more context
i want to add some filtering functions to the db lookups, and this will
facilitate that
2022-09-29 16:33:44 -07:00
765022639e app: stacked-cores: auto-generate all the names in the 40-xx database 2022-09-29 16:24:47 -07:00
162e9630ad try dumb vertical scaling of inverters
it seems that if we take a non-inverter that has y(0) *close* to 0, and
scale it enough, then we get stable transfer.

this suggests we really just want something with a massive number of
couplings (to keep the coupling ideal) and enough asymmetric windings to
get us > 1.0 tx ratio over some range.
2022-09-29 16:17:48 -07:00
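The scaling intuition in this message can be sketched with a toy transfer curve (the curve shape and the 1.3x scale factor are assumptions for illustration, not measured sim values):

```python
# Toy model of the "vertical scaling" idea above: an unscaled stage whose
# slope is < 1 everywhere decays any input toward a single fixed point, so
# a cascade loses the signal; scaling the output by k keeps a logic-high
# input pinned at the rail across many stages.
def transfer(x: float) -> float:
    """Hypothetical per-stage transfer curve: y(0) close to 0, max ~0.8."""
    return 0.05 + 0.75 * x

def cascade(x: float, k: float, stages: int = 20) -> float:
    for _ in range(stages):
        x = min(1.0, k * transfer(x))  # clamp at the logic-high rail
    return x

# unscaled (k=1.0): a high input decays toward the fixed point ~0.2
# scaled (k=1.3): a high input stays at 1.0 through every stage
```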
83bd15673d app: stacked cores: record more 40-xx runs
notably, these are entirely new:
`40-0.0008rad-18coupling-5_1_winding-5e10-drive-xx`
2022-09-29 15:09:52 -07:00
2eb714ff74 app: stacked_cores: ingest more runs
the larger cores may indeed be doing better (a little early to tell).
the tendency right now is that too much transfer occurs too early,
such that the region of > 1.0 slope maps _outside_ that region, not
allowing for an inverter to work well.
2022-09-28 15:51:05 -07:00
710e113108 app: stacked_cores: extract values from completed 800um sims 2022-09-28 01:10:04 -07:00
e0f9893b0e app: stacked_cores: explore the 800um cores 2022-09-27 18:21:55 -07:00
54f5a162a4 app: stacked_cores: plot more inverters 2022-09-27 18:21:33 -07:00
091d6f76c8 app: stacked_cores: update measurements for in-progress sims 2022-09-27 18:21:11 -07:00
7aa10f78a3 app: stacked_cores: 40xx_db: define more sims 2022-09-27 17:36:17 -07:00
975fbd7832 app: stacked_cores: record more 40-xx runs 2022-09-27 17:30:49 -07:00
9b93a762f1 app: stacked_cores: allow 40xx_db.py to update itself 2022-09-27 17:15:54 -07:00
9ffd94b23e app: stacked_cores: make 40xx_db.py a quine
in future, invoking this will update the measurements.
2022-09-27 17:08:52 -07:00
3d066d64c6 app: stacked_cores: re-express the database logic
this will make it easier to auto-generate entries
2022-09-27 16:57:34 -07:00
4ee3430db4 app: stacked_cores: simplify the 40xx database 2022-09-27 16:51:45 -07:00
d34e9cc6b2 app: stacked-cores: restructure the 40xx db 2022-09-27 16:41:30 -07:00
049f2d1e4f app: stacked_cores: split the inverter plotting into submodules 2022-09-27 15:59:58 -07:00
8fb4a3be1b app: stacked_cores: define a few more runs 2022-09-27 15:54:11 -07:00
47765f08be app: stacked-cores: document latest run 2022-09-27 02:14:57 -07:00
45f2ecd107 app: stacked_cores: populate results from last run 2022-09-27 01:42:54 -07:00
a3d9b28ab6 capture high-level GPU timing diagnostics
this shows that we spend about 2/3 of the GPU roundtrip time on
stepping. the other 1/3 is i guess scheduling/latency/memory transfers.
2022-09-27 00:20:41 -07:00
50 changed files with 16424 additions and 1418 deletions

Cargo.lock (generated): 211 changed lines

@@ -79,7 +79,7 @@ version = "0.2.14"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "d9b39be18770d11421cdb1b9947a45dd3f37e93092cbf377614828a319d5fee8"
 dependencies = [
- "hermit-abi 0.1.19",
+ "hermit-abi",
  "libc",
  "winapi",
 ]
@@ -90,12 +90,6 @@ version = "1.1.0"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "d468802bab17cbc0cc575e9b053f41e72aa36bfa6b7f55e3529ffa43161b97fa"
 
-[[package]]
-name = "bimap"
-version = "0.6.2"
-source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "bc0455254eb5c6964c4545d8bac815e1a1be4f3afe0ae695ea539c12d728d44b"
-
 [[package]]
 name = "bincode"
 version = "1.3.3"
@@ -169,9 +163,9 @@ checksum = "c1ad822118d20d2c234f427000d5acc36eabe1e29a348c89b63dd60b13f28e5d"
 
 [[package]]
 name = "bytemuck"
-version = "1.12.1"
+version = "1.21.0"
 source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "2f5715e491b5a1598fc2bef5a606847b5dc1d48ea625bd3c02c00de8285591da"
+checksum = "ef657dfab802224e671f5818e9a4935f9b1957ed18e58292690cc39e7a4092a3"
 
 [[package]]
 name = "byteorder"
@@ -194,12 +188,6 @@ dependencies = [
  "jobserver",
 ]
 
-[[package]]
-name = "cfg-if"
-version = "0.1.10"
-source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "4785bdd1c96b2a846b2bd7cc02e86b6b3dbf14e7e53446c4f54c92a361040822"
-
 [[package]]
 name = "cfg-if"
 version = "1.0.0"
@@ -249,12 +237,6 @@ version = "0.1.1"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "f3f6d59c71e7dc3af60f0af9db32364d96a16e9310f3f5db2b55ed642162dd35"
 
-[[package]]
-name = "compiler_builtins"
-version = "0.1.79"
-source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "4f873ce2bd3550b0b565f878b3d04ea8253f4259dc3d20223af2e1ba86f5ecca"
-
 [[package]]
 name = "conv"
 version = "0.3.3"
@@ -316,7 +298,7 @@ dependencies = [
  "futures",
  "image",
  "imageproc",
- "indexmap",
+ "indexmap 1.9.1",
  "log",
  "more-asserts",
  "ndarray",
@@ -362,7 +344,7 @@ version = "1.3.2"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "b540bd8bc810d3885c6ea91e2018302f68baba2129ab3e88f32389ee9370880d"
 dependencies = [
- "cfg-if 1.0.0",
+ "cfg-if",
 ]
 
 [[package]]
@@ -407,7 +389,7 @@ version = "0.8.2"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "2801af0d36612ae591caa9568261fddce32ce6e08a7275ea334a06a4ad021a2c"
 dependencies = [
- "cfg-if 1.0.0",
+ "cfg-if",
  "crossbeam-channel",
  "crossbeam-deque",
  "crossbeam-epoch",
@@ -421,7 +403,7 @@ version = "0.5.6"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "c2dd04ddaf88237dc3b8d8f9a3c1004b506b54b3313403944054d23c0870c521"
 dependencies = [
- "cfg-if 1.0.0",
+ "cfg-if",
  "crossbeam-utils",
 ]
 
@@ -431,7 +413,7 @@ version = "0.8.2"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "715e8152b692bba2d374b53d4875445368fdf21a94751410af607a5ac677d1fc"
 dependencies = [
- "cfg-if 1.0.0",
+ "cfg-if",
  "crossbeam-epoch",
  "crossbeam-utils",
 ]
@@ -443,7 +425,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "045ebe27666471bb549370b4b0b3e51b07f56325befa4284db65fc89c02511b1"
 dependencies = [
  "autocfg",
- "cfg-if 1.0.0",
+ "cfg-if",
  "crossbeam-utils",
  "memoffset",
  "once_cell",
@@ -456,7 +438,7 @@ version = "0.3.6"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "1cd42583b04998a5363558e5f9291ee5a5ff6b49944332103f251e7479a82aa7"
 dependencies = [
- "cfg-if 1.0.0",
+ "cfg-if",
  "crossbeam-utils",
 ]
 
@@ -466,7 +448,7 @@ version = "0.8.11"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "51887d4adc7b564537b15adcfb307936f8075dfcd5f00dde9a9f1d29383682bc"
 dependencies = [
- "cfg-if 1.0.0",
+ "cfg-if",
  "once_cell",
 ]
 
@@ -546,28 +528,29 @@ version = "5.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "907076dfda823b0b36d2a1bb5f90c96660a5bbcd7729e10727f07858f22c4edc" checksum = "907076dfda823b0b36d2a1bb5f90c96660a5bbcd7729e10727f07858f22c4edc"
dependencies = [ dependencies = [
"cfg-if 1.0.0", "cfg-if",
"hashbrown 0.12.3", "hashbrown 0.12.3",
"lock_api", "lock_api",
"once_cell", "once_cell",
"parking_lot_core 0.9.3", "parking_lot_core 0.9.3",
] ]
-[[package]]
-name = "dlmalloc"
-version = "0.2.3"
-source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "a6fe28e0bf9357092740362502f5cc7955d8dc125ebda71dec72336c2e15c62e"
-dependencies = [
- "libc",
-]
 [[package]]
 name = "either"
 version = "1.8.0"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "90e5c1c8368803113bf0c9584fc495a58b86dc8a29edbf8fe877d21d9507e797"
+[[package]]
+name = "elsa"
+version = "1.10.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "d98e71ae4df57d214182a2e5cb90230c0192c6ddfcaa05c36453d46a54713e10"
+dependencies = [
+ "indexmap 2.7.0",
+ "stable_deref_trait",
+]
 [[package]]
 name = "env_logger"
 version = "0.9.1"
@@ -581,6 +564,12 @@ dependencies = [
 "termcolor",
 ]
+[[package]]
+name = "equivalent"
+version = "1.0.1"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "5443807d6dff69373d433ab9ef5378ad8df50ca6298caf15de6e52e24aaf54d5"
 [[package]]
 name = "exr"
 version = "1.5.1"
@@ -646,12 +635,6 @@ version = "0.1.1"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "00b0228411908ca8685dba7fc2cdd70ec9990a6e753e89b6ac91a84c40fbaf4b"
-[[package]]
-name = "fortanix-sgx-abi"
-version = "0.5.0"
-source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "57cafc2274c10fab234f176b25903ce17e690fca7597090d50880e047a0389c5"
 [[package]]
 name = "futures"
 version = "0.3.24"
@@ -750,22 +733,13 @@ dependencies = [
 "byteorder",
 ]
-[[package]]
-name = "getopts"
-version = "0.2.21"
-source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "14dbbfd5c71d70241ecf9e6f13737f7b5ce823821063188d7e46c41d371eebd5"
-dependencies = [
- "unicode-width",
-]
 [[package]]
 name = "getrandom"
 version = "0.1.16"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "8fc3cb4d91f53b50155bdcfd23f6a4c39ae1969c2ae85982b135750cccaf5fce"
 dependencies = [
- "cfg-if 1.0.0",
+ "cfg-if",
 "libc",
 "wasi 0.9.0+wasi-snapshot-preview1",
 ]
@@ -776,7 +750,7 @@ version = "0.2.7"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "4eb1a864a501629691edf6c15a593b7a51eebaa1e8468e9ddc623de7c9b58ec6"
 dependencies = [
- "cfg-if 1.0.0",
+ "cfg-if",
 "js-sys",
 "libc",
 "wasi 0.11.0+wasi-snapshot-preview1",
@@ -795,9 +769,9 @@ dependencies = [
 [[package]]
 name = "glam"
-version = "0.21.3"
+version = "0.22.0"
 source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "518faa5064866338b013ff9b2350dc318e14cc4fcd6cb8206d7e7c9886c98815"
+checksum = "12f597d56c1bd55a811a1be189459e8fad2bbc272616375602443bdfb37fa774"
 dependencies = [
 "num-traits",
 ]
@@ -877,6 +851,12 @@ dependencies = [
 "ahash",
 ]
+[[package]]
+name = "hashbrown"
+version = "0.15.2"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "bf151400ff0baff5465007dd2f3e717f3fe502074ca563069ce3a6629d07b289"
 [[package]]
 name = "heck"
 version = "0.3.3"
@@ -895,15 +875,6 @@ dependencies = [
 "libc",
 ]
-[[package]]
-name = "hermit-abi"
-version = "0.2.0"
-source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "1ab7905ea95c6d9af62940f9d7dd9596d54c334ae2c15300c482051292d5637f"
-dependencies = [
- "libc",
-]
 [[package]]
 name = "hexf-parse"
 version = "0.2.1"
@@ -963,6 +934,16 @@ dependencies = [
 "hashbrown 0.12.3",
 ]
+[[package]]
+name = "indexmap"
+version = "2.7.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "62f822373a4fe84d4bb149bf54e584a7f4abec90e072ed49cda0edea5b95471f"
+dependencies = [
+ "equivalent",
+ "hashbrown 0.15.2",
+]
 [[package]]
 name = "inplace_it"
 version = "0.3.5"
@@ -975,7 +956,7 @@ version = "0.1.12"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "7a5bbe824c507c5da5956355e86a746d82e0e1464f65d862cc5e71da70e94b2c"
 dependencies = [
- "cfg-if 1.0.0",
+ "cfg-if",
 "js-sys",
 "wasm-bindgen",
 "web-sys",
@@ -1063,7 +1044,7 @@ version = "0.7.3"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "efbc0f03f9a775e9f6aed295c6a1ba2253c5757a9e03d55c6caa46a681abcddd"
 dependencies = [
- "cfg-if 1.0.0",
+ "cfg-if",
 "winapi",
 ]
@@ -1089,7 +1070,7 @@ version = "0.4.17"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "abb12e687cfb44aa40f41fc3978ef76448f9b6038cad6aef4259d3c095a2382e"
 dependencies = [
- "cfg-if 1.0.0",
+ "cfg-if",
 ]
 [[package]]
@@ -1192,7 +1173,7 @@ dependencies = [
 "bitflags",
 "codespan-reporting",
 "hexf-parse",
- "indexmap",
+ "indexmap 1.9.1",
 "log",
 "num-traits",
 "rustc-hash",
@@ -1328,7 +1309,7 @@ version = "1.13.1"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "19e64526ebdee182341572e50e9ad03965aa510cd94427a4549448f285e957a1"
 dependencies = [
- "hermit-abi 0.1.19",
+ "hermit-abi",
 "libc",
 ]
@@ -1399,7 +1380,7 @@ version = "0.8.5"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "d76e8e1493bcac0d2766c42737f34458f1c8c50c0d23bcb24ea953affb273216"
 dependencies = [
- "cfg-if 1.0.0",
+ "cfg-if",
 "instant",
 "libc",
 "redox_syscall",
@@ -1413,7 +1394,7 @@ version = "0.9.3"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "09a279cbf25cb0757810394fbc1e359949b59e348145c643a939a525692e6929"
 dependencies = [
- "cfg-if 1.0.0",
+ "cfg-if",
 "libc",
 "redox_syscall",
 "smallvec",
@@ -1746,15 +1727,18 @@ checksum = "08d43f7aa6b08d49f382cde6a7982047c3426db949b1424bc4b7ec9ae12c6ce2"
 [[package]]
 name = "rustc_codegen_spirv"
-version = "0.4.0-alpha.15"
+version = "0.5.0"
-source = "git+https://github.com/EmbarkStudios/rust-gpu#985007fc087bb9952106eb3015357bb019e7916a"
+source = "git+https://github.com/Rust-GPU/rust-gpu?rev=d78c301799e9d254aab3156a230c9a59efd94122#d78c301799e9d254aab3156a230c9a59efd94122"
 dependencies = [
 "ar",
- "bimap",
+ "either",
 "hashbrown 0.11.2",
- "indexmap",
+ "indexmap 1.9.1",
- "lazy_static",
 "libc",
 "num-traits",
+ "once_cell",
+ "regex",
 "rspirv",
 "rustc-demangle",
 "rustc_codegen_spirv-types",
@@ -1762,14 +1746,15 @@ dependencies = [
 "serde",
 "serde_json",
 "smallvec",
+ "spirt",
 "spirv-tools",
 "syn",
 ]
 [[package]]
 name = "rustc_codegen_spirv-types"
-version = "0.4.0-alpha.15"
+version = "0.5.0"
-source = "git+https://github.com/EmbarkStudios/rust-gpu#985007fc087bb9952106eb3015357bb019e7916a"
+source = "git+https://github.com/Rust-GPU/rust-gpu?rev=d78c301799e9d254aab3156a230c9a59efd94122#d78c301799e9d254aab3156a230c9a59efd94122"
 dependencies = [
 "rspirv",
 "serde",
@@ -1938,6 +1923,9 @@ name = "smallvec"
 version = "1.9.0"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "2fd0db749597d91ff862fd1d55ea87f7855a744a8425a64695b6fca237d1dad1"
+dependencies = [
+ "serde",
+]
 [[package]]
 name = "spin"
@@ -1948,6 +1936,24 @@ dependencies = [
 "lock_api",
 ]
+[[package]]
+name = "spirt"
+version = "0.1.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "06834ebbbbc6f86448fd5dc7ccbac80e36f52f8d66838683752e19d3cae9a459"
+dependencies = [
+ "arrayvec",
+ "bytemuck",
+ "elsa",
+ "indexmap 1.9.1",
+ "itertools",
+ "lazy_static",
+ "rustc-hash",
+ "serde",
+ "serde_json",
+ "smallvec",
+]
 [[package]]
 name = "spirv"
 version = "0.2.0+1.5.4"
@@ -1960,8 +1966,8 @@ dependencies = [
 [[package]]
 name = "spirv-builder"
-version = "0.4.0-alpha.15"
+version = "0.5.0"
-source = "git+https://github.com/EmbarkStudios/rust-gpu#985007fc087bb9952106eb3015357bb019e7916a"
+source = "git+https://github.com/Rust-GPU/rust-gpu?rev=d78c301799e9d254aab3156a230c9a59efd94122#d78c301799e9d254aab3156a230c9a59efd94122"
 dependencies = [
 "memchr",
 "raw-string",
@@ -1973,8 +1979,8 @@ dependencies = [
 [[package]]
 name = "spirv-std"
-version = "0.4.0-alpha.15"
+version = "0.5.0"
-source = "git+https://github.com/EmbarkStudios/rust-gpu#985007fc087bb9952106eb3015357bb019e7916a"
+source = "git+https://github.com/Rust-GPU/rust-gpu?rev=d78c301799e9d254aab3156a230c9a59efd94122#d78c301799e9d254aab3156a230c9a59efd94122"
 dependencies = [
 "bitflags",
 "glam",
@@ -1985,8 +1991,8 @@ dependencies = [
 [[package]]
 name = "spirv-std-macros"
-version = "0.4.0-alpha.15"
+version = "0.5.0"
-source = "git+https://github.com/EmbarkStudios/rust-gpu#985007fc087bb9952106eb3015357bb019e7916a"
+source = "git+https://github.com/Rust-GPU/rust-gpu?rev=d78c301799e9d254aab3156a230c9a59efd94122#d78c301799e9d254aab3156a230c9a59efd94122"
 dependencies = [
 "proc-macro2",
 "quote",
@@ -1996,23 +2002,23 @@ dependencies = [
 [[package]]
 name = "spirv-std-types"
-version = "0.4.0-alpha.15"
+version = "0.5.0"
-source = "git+https://github.com/EmbarkStudios/rust-gpu#985007fc087bb9952106eb3015357bb019e7916a"
+source = "git+https://github.com/Rust-GPU/rust-gpu?rev=d78c301799e9d254aab3156a230c9a59efd94122#d78c301799e9d254aab3156a230c9a59efd94122"
 [[package]]
 name = "spirv-tools"
-version = "0.8.0"
+version = "0.9.0"
 source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "ca7f0f689581589b0a31000317fa31257cb24d040227708718ebd9fedf5cdd2b"
+checksum = "dc7c8ca2077515286505bd3ccd396e55ac5706e80322e1d6d22a82e1cad4f7c3"
 dependencies = [
 "spirv-tools-sys",
 ]
 [[package]]
 name = "spirv-tools-sys"
-version = "0.6.0"
+version = "0.7.0"
 source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "2980b0b4b2a9b5edfeb1dc8a35e84aac07b9c6dcd2339cce004d9355fb62a59d"
+checksum = "b4b32d9d6469cd6b50dcd6bd841204b5946b4fb7b70a97872717cdc417659c9a"
 dependencies = [
 "cc",
 ]
@@ -2029,17 +2035,7 @@ dependencies = [
 name = "spirv_backend_builder"
 version = "0.1.0"
 dependencies = [
- "cc",
- "cfg-if 0.1.10",
- "compiler_builtins",
- "dlmalloc",
- "fortanix-sgx-abi",
- "getopts",
- "hashbrown 0.12.3",
- "hermit-abi 0.2.0",
- "libc",
 "spirv-builder",
- "unicode-width",
 ]
 [[package]]
@@ -2053,11 +2049,18 @@ dependencies = [
 "coremem",
 ]
+[[package]]
+name = "stable_deref_trait"
+version = "1.2.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "a8f112729512f8e442d81f95a8a7ddf2b7c6b8a1a6f509a95864142b30cab2d3"
 [[package]]
 name = "stacked_cores"
 version = "0.1.0"
 dependencies = [
 "coremem",
+ "log",
 ]
 [[package]]
@@ -2240,7 +2243,7 @@ version = "0.2.83"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "eaf9f5aceeec8be17c128b2e93e031fb8a4d469bb9c4ae2d7dc1888b26887268"
 dependencies = [
- "cfg-if 1.0.0",
+ "cfg-if",
 "wasm-bindgen-macro",
 ]
@@ -2265,7 +2268,7 @@ version = "0.4.33"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "23639446165ca5a5de86ae1d8896b737ae80319560fbaa4c2887b7da6e7ebd7d"
 dependencies = [
- "cfg-if 1.0.0",
+ "cfg-if",
 "js-sys",
 "wasm-bindgen",
 "web-sys",

README.md

@@ -5,7 +5,7 @@ to model the evolution of some 3d (or 2d) grid-volume of space over time. simula
 - some material at each position in the grid
 - a set of stimuli to apply at specific regions in the volume over time
-- a set of "measurements" to evaluate and record as the simulation evolves.
+- a set of "measurements" to evaluate and record as the simulation evolves
 - an optional state file to allow pausing/resumption of long-run simulations

 after this the simulation is advanced in steps up to some user-specified moment in time.
@@ -19,8 +19,20 @@ examples are in the [crates/applications/](crates/applications/) directory.
 here's an excerpt from the [wavefront](crates/applications/wavefront/src/main.rs) example:
 ```rust
-// Create the simulation "driver" which uses the CPU as backend.
-let mut driver: driver::CpuDriver = driver::Driver::new(size, feature_size);
+// use a general-purpose material, capable of representing vacuum, conductors, and magnetic materials.
+type Mat = mat::GenericMaterial<f32>;
+// simulate a volume of 401x401x1 discrete grid cells.
+let (width, height, depth) = (401, 401, 1);
+let size = Index::new(width, height, depth);
+// each cell represents a 1um x 1um x 1um volume.
+let feature_size = 1e-6;
+// create the simulation "driver".
+// the first parameter is the float type to use: f32 for unchecked math, coremem::real::R32
+// to guard against NaN/Inf (useful for debugging).
+// to run this on the gpu instead of the cpu, replace `CpuBackend` with `WgpuBackend`.
+let mut driver = Driver::new(SpirvSim::<f32, Mat, spirv::CpuBackend>::new(size, feature_size));

 // create a conductor on the left side.
 let conductor = Cube::new(
@@ -34,17 +46,18 @@ let center_region = Cube::new(
     Index::new(200, height/4, 0).to_meters(feature_size),
     Index::new(201, height*3/4, 1).to_meters(feature_size),
 );

 // emit a constant E/H delta over this region for 100 femtoseconds
-let stim = Stimulus::new(
-    center_region,
-    UniformStimulus::new(
-        Vec3::new(2e19, 0.0, 0.0), // E field (per second)
-        Vec3::new(0.0, 0.0, 2e19/376.730) // H field (per second)
-    ).gated(0.0, 100e-15),
-);
+let stim = ModulatedVectorField::new(
+    RegionGated::new(center_region, Fields::new_eh(
+        Vec3::new(2e19, 0.0, 0.0),
+        Vec3::new(0.0, 0.0, 2e19/376.730),
+    )),
+    Pulse::new(0.0, 100e-15),
+);
 driver.add_stimulus(stim);

-// finally, run the simulation:
+// finally, run the simulation through t=100ps
 driver.step_until(Seconds(100e-12));
 ```
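as an aside on the excerpt above: the H amplitude is derived from the E amplitude via the impedance of free space (the `376.730` divisor), which is the E/H ratio of a plane wave in vacuum. a standalone sketch (not part of the library) verifying that constant:

```rust
// Illustration only: eta0 = sqrt(mu0 / eps0) ~= 376.730 ohm is the
// free-space wave impedance, the E/H ratio the wavefront stimulus uses.
const MU0: f64 = 1.256_637_062e-6; // vacuum permeability, H/m
const EPS0: f64 = 8.854_187_812e-12; // vacuum permittivity, F/m

pub fn free_space_impedance() -> f64 {
    (MU0 / EPS0).sqrt()
}

fn main() {
    let eta0 = free_space_impedance();
    // matches the 376.730 constant in the excerpt
    assert!((eta0 - 376.730).abs() < 1e-2);
    let e = 2e19_f64; // E amplitude (per second), from the excerpt
    let h = e / eta0; // matching H amplitude
    println!("eta0 = {eta0:.3} ohm, H = {h:.3e}");
}
```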
@@ -62,26 +75,25 @@ which can easily be 30-100x faster:
 ## GPU Acceleration

-we use rust-gpu for gpu acceleration. presently, this requires *specific* versions of rust-nightly to work.
+we use [rust-gpu](https://github.com/EmbarkStudios/rust-gpu/) for gpu acceleration.
+presently, this requires *specific* versions of rust-nightly to work.

 the feature is toggled at runtime, but compiled unconditionally. set up the toolchain according to [rust-toolchain.toml](rust-toolchain.toml):
 ```
-$ rustup default nightly-2022-04-11
+$ rustup default nightly-2023-01-21
 $ rustup component add rust-src rustc-dev llvm-tools-preview
 ```
-(it's possible to work with older nightlies like `nightly-2022-01-13` or `nightly-2021-06-08` if you enable the 2020 feature and downgrade whichever packages rustc complains about.)

 now you can swap out the `CpuDriver` with a `SpirvDriver` and you're set:
 ```diff
-- let mut driver: driver::CpuDriver = driver::Driver::new(size, feature_size);
+- let mut driver = Driver::new(SpirvSim::<f32, Mat, spirv::CpuBackend>::new(size, feature_size));
-+ let mut driver: driver::SpirvDriver = driver::Driver::new_spirv(size, feature_size);
++ let mut driver = Driver::new(SpirvSim::<f32, Mat, spirv::WgpuBackend>::new(size, feature_size));
 ```

 re-run it as before and you should see the same results:
 ```
-$ cargo run --release --example wavefront
+$ cargo run --release --bin wavefront
 ```

 see the "Processing Loop" section below to understand what GPU acceleration entails.
@@ -90,10 +102,10 @@ see the "Processing Loop" section below to understand what GPU acceleration enta
 the [sr\_latch](crates/applications/sr_latch/src/main.rs) example explores a more interesting feature set.
 first, it "measures" a bunch of parameters over different regions of the simulation
-(peak inside [`src/meas.rs`](crates/coremem/src/meas.rs) to see how these each work):
+(peek inside [`meas.rs`](crates/coremem/src/meas.rs) to see how these each work):
 ```rust
-// measure a bunch of items of interest throughout the whole simulation duration:
+// measure some items of interest throughout the whole simulation duration:
 driver.add_measurement(meas::CurrentLoop::new("coupling", coupling_region.clone()));
 driver.add_measurement(meas::Current::new("coupling", coupling_region.clone()));
 driver.add_measurement(meas::CurrentLoop::new("sense", sense_region.clone()));
@@ -121,20 +133,20 @@ allowing you to dig further into the simulation in an _interactive_ way (versus
 renderer used in the `wavefront` example):
 ```rust
-// serialize frames for later viewing with `cargo run -p coremem_post --release --bin viewer`
+// serialize frames for later viewing with `cargo run --release --bin viewer`
 driver.add_serializer_renderer(&*format!("{}frame-", prefix), 36000, None);
 ```

 run this, after having setup the GPU pre-requisites:
 ```
-$ cargo run --release --example sr_latch
+$ cargo run --release --bin sr_latch
 ```

 and then investigate the results with
 ```
-$ cargo run -p coremem_post --bin viewer ./out/applications/sr_latch
+$ cargo run --release --bin viewer ./out/applications/sr_latch
 ```

 ![screencapture of Viewer for SR latch at t=2.8ns. it shows two rings spaced horizontally, with arrows circulating them](readme_images/sr_latch_EzBxy_2800ps.png "SR latch at t=2.8ns")
@@ -145,35 +157,38 @@ the light blue splotches depict the conductors (in the center, the wire coupling
 what we see here is that both ferrites (the two large circles in the above image) have a clockwise polarized B field. this is in the middle of a transition, so the E fields look a bit chaotic. advance to t=46 ns: the "reset" pulse was applied at t=24ns and had 22ns to settle:

-![screencapture of Viewer for SR latch at t=45.7ns. similar to above but with the B field polarized CCW](readme_images/sr_latch_EzBxy_45700ps.png "SR latch at t=45.7ns")
+![screencapture of Viewer for SR latch at t=45.7ns. similar to above but with the B field polarized counter-clockwise](readme_images/sr_latch_EzBxy_45700ps.png "SR latch at t=45.7ns")

-we can see the "reset" pulse has polarized both ferrites in the CCW orientation this time. the E field is less pronounced because we gave the system 22ns instead of 3ns to settle this time.
+we can see the "reset" pulse has polarized both ferrites in the counter-clockwise orientation this time. the E field is less pronounced because we gave the system 22ns instead of 3ns to settle this time.

-the graphical viewer is helpful for debugging geometries, but the CSV measurements are useful for viewing numeric system performance. peak inside "out/applications/sr-latch/meas.csv" to see a bunch of measurements over time. you can use a tool like Excel or [visidata](https://www.visidata.org/) to plot the interesting ones.
+the graphical viewer is helpful for debugging geometries, but the CSV measurements are useful for viewing numeric system performance. peek inside "out/applications/sr-latch/meas.csv" to see a bunch of measurements over time. you can use a tool like Excel or [visidata](https://www.visidata.org/) to plot the interesting ones.

-here's a plot of `M(mem2)` over time from the SR latch simulation. we're measuring, over the torus volume corresponding to the ferrite on the right in the images above, the (average) M component normal to each given cross section of the torus. the notable bumps correspond to these pulses: "set", "reset", "set", "reset", "set+reset applied simultaneously", "set", "set".
+here's a plot of `M(mem2)` over time from the SR latch simulation. we're measuring the (average) M value along the major tangent to the torus corresponding to the ferrite on the right in the images above. the notable bumps correspond to these pulses: "set", "reset", "set", "reset", "set+reset applied simultaneously", "reset", "reset".

 ![plot of M(mem2) over time](readme_images/sr_latch_vd_M2.png "plot of M(mem2) over time")
 ## Processing Loop (and how GPU acceleration works)

-the processing loop for a simulation is roughly as follows ([`src/driver.rs:step_until`](crates/coremem/src/driver.rs) drives this loop):
+the processing loop for a simulation is roughly as follows ([`driver.rs:step_until`](crates/coremem/src/driver.rs) drives this loop):
-1. evaluate all stimuli at the present moment in time; these produce an "externally applied" E and H field
-across the entire volume.
+1. evaluate all stimuli at the present moment in time;
+these produce an "externally applied" E and H field across the entire volume.
 2. apply the FDTD update equations to "step" the E field, and then "step" the H field. these equations take the external stimulus from step 1 into account.
 3. evaluate all the measurement functions over the current state; write these to disk.
 4. serialize the current state to disk so that we can resume from this point later if we choose.

-within each step above, the logic is multi-threaded and the rendeveous points lie at the step boundaries.
+within each step above, the logic is multi-threaded and the rendezvous points lie at the step boundaries.

-it turns out that the Courant rules force us to evaluate FDTD updates (step 2) on a _far_ smaller time scale than the other steps are sensitive to. so to tune for performance, we apply some optimizations here universally:
+it turns out that the Courant rules force us to evaluate FDTD updates (step 2) on a _far_ smaller time scale than the other steps are sensitive to. so to tune for performance, we apply some optimizations:
-- stimuli (step 1) are evaluated only once every N frames (tunable). we still *apply* them on each frame individually. the waveform resembles that of a Sample & Hold circuit.
+- stimuli (step 1) are evaluated only once every N frames. we still *apply* them on each frame individually. the waveform resembles that of a Sample & Hold circuit.
 - measurement functions (step 3) are triggered only once every M frames.
 - the state is serialized (step 4) only once every Z frames.
+`N`, `M`, and `Z` are all tunable by the application.

 as a result, step 2 is actually able to apply the FDTD update functions not just once but up to `min(N, M, Z)` times.

-although steps 1 and 3 vary heavily based on the user configuration of the simulation, step 2 can be defined pretty narrowly in code (no user-callbacks/dynamic function calls/etc). this lets us offload the processing of step 2 to a dedicated GPU. by tuning N/M/Z, step 2 becomes the dominant cost in our simulations an GPU offloading can trivially boost performance by more than an order of magnitude on even a mid-range consumer GPU.
+although steps 1 and 3 vary heavily based on the user configuration of the simulation, step 2 can be defined pretty narrowly in code (no user-callbacks/dynamic function calls/etc). this lets us offload the processing of step 2 to a dedicated GPU. by tuning N/M/Z, step 2 becomes the dominant cost in our simulations and GPU offloading can easily boost performance by more than an order of magnitude on even a mid-range consumer GPU.
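the decimated loop described above can be sketched as follows. this is an illustration with hypothetical names (`eval_stimuli`, `snapshot`, etc.), not the library's actual `step_until` implementation:

```rust
// Sketch of the step loop: FDTD updates run every frame, while stimuli,
// measurements and snapshots are decimated by N, M and Z respectively,
// so the cheap-to-define step 2 dominates the cost (and is GPU-offloadable).
struct Sim {
    frame: u64,
}

impl Sim {
    fn eval_stimuli(&mut self) { /* step 1: sample stimuli; hold until the next sample */ }
    fn step_e_then_h(&mut self) { /* step 2: FDTD update equations */ self.frame += 1; }
    fn measure(&self) { /* step 3: evaluate measurements, write to disk */ }
    fn snapshot(&self) { /* step 4: serialize state so we can resume later */ }
}

pub fn run(sim: &mut Sim, frames: u64, n: u64, m: u64, z: u64) {
    while sim.frame < frames {
        if sim.frame % n == 0 {
            sim.eval_stimuli(); // sample-and-hold waveform
        }
        sim.step_e_then_h(); // applied on every single frame
        if sim.frame % m == 0 {
            sim.measure();
        }
        if sim.frame % z == 0 {
            sim.snapshot();
        }
    }
}

fn main() {
    let mut sim = Sim { frame: 0 };
    run(&mut sim, 100, 10, 20, 50);
    assert_eq!(sim.frame, 100);
}
```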
# Features

@@ -183,76 +198,66 @@ this library takes effort to separate the following from the core/math-heavy "si
 - Measurements
 - Render targets (video, CSV, etc)
 - Materials (conductors, non-linear ferromagnets)
-- Float implementation (for CPU simulations only)
+- Float implementation

 the simulation only interacts with these things through a trait interface, such that they're each swappable.
-common stimuli type live in [src/stim.rs](crates/coremem/src/stim.rs).
+common stimuli types live in [stim/mod.rs](crates/coremem/src/stim/mod.rs).
-common measurements live in [src/meas.rs](crates/coremem/src/meas.rs).
+common measurements live in [meas.rs](crates/coremem/src/meas.rs).
-common render targets live in [src/render.rs](crates/coremem/src/render.rs). these change infrequently enough that [src/driver.rs](crates/coremem/src/driver.rs) has some specialized helpers for each render backend.
+common render targets live in [render.rs](crates/coremem/src/render.rs). these change infrequently enough that [driver.rs](crates/coremem/src/driver.rs) has some specialized helpers for each render backend.
-common materials are spread throughout [src/mat](crates/coremem/src/mat/mod.rs).
+common materials are spread throughout [mat/mod.rs](crates/cross/src/mat/mod.rs).
-different float implementations live in [src/real.rs](crates/coremem/src/real.rs).
+different float implementations live in [real.rs](crates/cross/src/real.rs).
 if you're getting NaNs, you can run the entire simulation on a checked `R64` type in order to pinpoint the moment those are introduced.
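a toy illustration of that checked-float idea (this is not the library's actual `R64` from `real.rs`, just a sketch of the mechanism: fail fast at the exact operation that introduces a non-finite value, instead of letting NaN propagate silently):

```rust
// Illustrative NaN/Inf-guarding wrapper, in the spirit of a checked real type.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Checked(pub f64);

impl Checked {
    pub fn new(v: f64) -> Self {
        // panic the moment a NaN or Inf is introduced, pinpointing the
        // offending operation rather than corrupting the whole simulation.
        assert!(v.is_finite(), "non-finite value introduced: {v}");
        Checked(v)
    }
}

impl std::ops::Div for Checked {
    type Output = Checked;
    fn div(self, rhs: Checked) -> Checked {
        Checked::new(self.0 / rhs.0)
    }
}

fn main() {
    let ok = Checked::new(1.0) / Checked::new(2.0);
    assert_eq!(ok.0, 0.5);
    // `Checked::new(1.0) / Checked::new(0.0)` would panic here rather than yield Inf.
}
```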
## Materials ## Materials
each cell is modeled as having a vector E, H and M field, as well as a Material type defined by the application.
each cell owns an associated material instance.
the `Material` trait has the following methods (both are optional):
```
pub trait Material<R: Real>: Sized {
    fn conductivity(&self) -> Vec3<R>;
    /// returns the new M vector for this material. called during each `step_h`.
    fn move_b_vec(&self, m: Vec3<R>, target_b: Vec3<R>) -> Vec3<R>;
}
```
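as a concrete (hypothetical) example, here's how a simple conductor material might implement such a trait. the snippet is standalone: `Vec3` and the trait itself are simplified stand-ins for the real definitions in crates/cross, and `MyConductor` is an invented type.

```rust
// simplified stand-in for the library's Vec3<R>
#[derive(Clone, Copy, Debug, PartialEq)]
struct Vec3 { x: f64, y: f64, z: f64 }

impl Vec3 {
    fn uniform(v: f64) -> Self { Vec3 { x: v, y: v, z: v } }
}

// stand-in for the Material trait above; both methods have defaults,
// which is why the README calls them optional.
trait Material: Sized {
    /// per-axis conductivity; defaults to a perfect insulator.
    fn conductivity(&self) -> Vec3 { Vec3::uniform(0.0) }
    /// new M vector for this cell; a non-magnetic material keeps M unchanged.
    fn move_b_vec(&self, m: Vec3, _target_b: Vec3) -> Vec3 { m }
}

/// a conductor holds only an immutable conductivity parameter.
struct MyConductor { sigma: f64 }

impl Material for MyConductor {
    fn conductivity(&self) -> Vec3 { Vec3::uniform(self.sigma) }
    // `move_b_vec` keeps its default: conductors don't magnetize.
}

fn main() {
    let copper = MyConductor { sigma: 5.96e7 };
    assert_eq!(copper.conductivity(), Vec3::uniform(5.96e7));
    let m = Vec3::uniform(1.0);
    assert_eq!(copper.move_b_vec(m, Vec3::uniform(0.0)), m);
}
```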
to add a new material:
- for `CpuBackend` simulations: just implement this trait on your own type and instantiate a `SpirvSim` specialized over that material instead of `GenericMaterial`.
- for `WgpuBackend` simulations: do the above and add a spirv entry-point specialized to your material. scroll to the bottom of [crates/spirv_backend/src/lib.rs](crates/spirv_backend/src/lib.rs) and follow the examples.
to use your new material alongside other materials like `IsomorphicConductor`, leverage the compound type wrappers
in [`compound.rs`](./crates/cross/src/mat/compound.rs):
`let my_sim = SpirvSim::<f32, DiscrMat2<IsomorphicConductor<f32>, MyMat>>::new(...)`
if the compound wrappers seem complicated, it's because they have to be in order to be compatible with SPIR-V, which
does not generally allow addresses to refer to more than one type.
hence something like `enum { A(IsomorphicConductor<f32>), B(MyMat) }`
has to instead be represented in memory like `(u8 /*discriminant*/, IsomorphicConductor<f32>, MyMat)`.
in practice, we do some extra tricks to fold the discriminant into the other fields and reduce memory usage.
as can be seen, the `Material` trait is fairly restrictive. its methods are immutable, and it doesn't even have access to the entire cell state (only the cell's M value, during `move_b_vec`). i'd be receptive to a PR or request that exposes more cell state or mutability: this is just an artifact of me tailoring this specifically to the class of materials i intended to use it for.
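to illustrate the layout transformation that compound wrappers perform, here's a standalone sketch. the names `IsomorphicConductor`, `MyMat` and `DiscrMat2` mirror the ones used above, but the definitions here are simplified stand-ins, not the library's.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct IsomorphicConductor { sigma: f32 }
#[derive(Clone, Copy, Debug, PartialEq)]
struct MyMat { mu_r: f32 }

// what you'd naturally write on the CPU: one of two types per cell.
#[allow(dead_code)]
enum Mat { A(IsomorphicConductor), B(MyMat) }

// what the GPU-side representation has to look like: SPIR-V can't point a
// single address at "one of two types", so every field is always present,
// with an explicit discriminant selecting which half is live.
#[allow(dead_code)]
struct DiscrMat2 {
    discr: u8,
    a: IsomorphicConductor,
    b: MyMat,
}

impl DiscrMat2 {
    fn conductivity(&self) -> f32 {
        // dispatch on the discriminant instead of matching on an enum
        match self.discr {
            0 => self.a.sigma,
            _ => 0.0, // MyMat: treated as an insulator in this sketch
        }
    }
}

fn main() {
    let cell = DiscrMat2 {
        discr: 0,
        a: IsomorphicConductor { sigma: 1.0e7 },
        b: MyMat { mu_r: 1.0 },
    };
    assert_eq!(cell.conductivity(), 1.0e7);
}
```

the discriminant-folding mentioned above would go one step further and pack `discr` into unused bits of the other fields; this sketch keeps it as a separate byte for clarity.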
## What's in the Box
this library ships with the following materials:
- conductors (Isomorphic or Anisomorphic).
- linear magnets (defined by their relative permeability, mu\_r). supports CPU only.
- a handful of ferromagnet implementations:
  - `MHPgram` specifies the `M(H)` function as a parallelogram.
  - `MBPgram` specifies the `M(B)` function as a parallelogram.
  - `MHCurve` specifies the `M(H)` function as an arbitrary polygon. requires a new type for each curve for memory reasons (see `Ferroxcube3R1`). supports CPU only.
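for intuition, a parallelogram-shaped `M(H)` curve can be sketched as two slanted edges clamped at ±Ms: M stays wherever it is (hysteresis) until H pushes it against an edge. this is an illustrative toy with invented parameters (`k`, `hc`, `ms`), not the library's actual `MHPgram` parameterization.

```rust
/// given the cell's current magnetization `m` and applied field `h`,
/// return the new magnetization. the loop is bounded by two lines of
/// slope `k` offset by the coercivity ±`hc`, saturating at ±`ms`.
fn step_m(m: f64, h: f64, k: f64, hc: f64, ms: f64) -> f64 {
    let upper = (k * (h + hc)).clamp(-ms, ms); // ascending edge
    let lower = (k * (h - hc)).clamp(-ms, ms); // descending edge
    // inside the parallelogram, M is unchanged: that's the hysteresis
    m.clamp(lower, upper)
}

fn main() {
    let (k, hc, ms) = (2.0, 1.0, 1.0);
    // a strong positive H drives M to positive saturation:
    assert_eq!(step_m(0.0, 10.0, k, hc, ms), 1.0);
    // at H = 0, a saturated M stays put (remanence):
    assert_eq!(step_m(1.0, 0.0, k, hc, ms), 1.0);
    // a strong negative H flips it to negative saturation:
    assert_eq!(step_m(1.0, -10.0, k, hc, ms), -1.0);
}
```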
measurements include ([meas.rs](crates/coremem/src/meas.rs)):
- E, B or H field (mean vector over some region)
- energy, power (net over some region)
- current (mean vector over some region)
- mean current magnitude along a closed loop (toroidal loops only)
- mean magnetic polarization magnitude along a closed loop (toroidal loops only)
output targets include ([render.rs](crates/coremem/src/render.rs)):
- `ColorTermRenderer`: renders 2d-slices in real-time to the terminal.
- `Y4MRenderer`: outputs 2d-slices to an uncompressed `y4m` video file.
- `SerializerRenderer`: dumps the full 3d simulation state to disk. parseable after the fact with [viewer.rs](crates/post/src/bin/viewer.rs).
- `CsvRenderer`: dumps the output of all measurements into a `csv` file.
historically there was also a plotly renderer, but that effort was redirected into developing the viewer tool better.
@@ -266,13 +271,13 @@ in a FDTD simulation, as we shrink the cell size the time step has to shrink too
this is the "default" optimized version. you could introduce a new material to the simulation, and performance would remain constant. as you finalize your simulation, you can specialize it a bit and compile the GPU code to optimize for your specific material. this can squeeze another factor-of-2 gain: view [buffer\_proto5](crates/applications/buffer_proto5/src/main.rs) to see how that's done.
contrast that to the CPU-only implementation which achieves 24.6M grid cell steps per second on my 12-core Ryzen 3900X: that's about a 34x gain.
# Support
the author can be reached on Matrix <@colin:uninsane.org>, email <mailto:colin@uninsane.org> or Activity Pub <@colin@fed.uninsane.org>.
i'd love for this project to be useful to people besides just myself, so don't hesitate to reach out.
## Additional Resources
@@ -293,16 +298,5 @@ David Bennion and Hewitt Crane documented their approach for transforming Diode-
although i decided not to use PML, i found Steven Johnson's (of FFTW fame) notes to be the best explainer of PML:
- [Steven Johnson: Notes on Perfectly Matched Layers (PMLs)](https://math.mit.edu/~stevenj/18.369/spring07/pml.pdf)
a huge thanks to everyone above for sharing the fruits of their studies.
this project would not have happened if not for literature like the above from which to draw.
## License
i'm not a lawyer, and i don't want to be.
by nature of your reading this, my computer has freely shared these bits with yours.
at this point, it's foolish to think i could do anything to restrict your actions with them, and even more foolish to believe that i have any sort of "right" to do so.
however, if you somehow believe IP laws are legitimate, then:
- i claim whatever minimal copyright is necessary for my own use of this code (and future modifications made by me/shared to this repository) to continue unencumbered.
- i license these works to you according to that same condition and the additional condition that your use of these works does not force me into any additional interactions with legal systems which i would not have made were these works not made available to you (e.g. your license to these works is conditional upon your not filing any lawsuits/patent claims/etc against me).
do note that the individual dependencies of this software project include licenses of their own. for your convenience, i've annotated each dependency inside [Cargo.toml](Cargo.toml) with its respective license.


@@ -1,7 +1,7 @@
//! this example creates a "set/reset" latch from a non-linear ferromagnetic device.
//! this is quite a bit like a "core memory" device.
//! the SR latch in this example is wired to a downstream latch, mostly to show that it's
//! possible to transfer the state (with some limitation) from one latch to another.
use coremem::{Driver, mat, meas};
use coremem::geom::{Coord as _, Meters, Torus};
@@ -59,7 +59,7 @@ fn main() {
let coupling_region = Torus::new_xz(Meters::new(0.5*(ferro1_center + ferro2_center), ferro_center_y, half_depth), wire_coupling_major, wire_minor);
let sense_region = Torus::new_xz(Meters::new(ferro2_center + ferro_major, ferro_center_y, half_depth), wire_major, wire_minor);
let mut driver = Driver::new(SpirvSim::<f32, mat::GenericMaterial<f32>, WgpuBackend>::new(
    Meters::new(width, height, depth).to_index(feat_size), feat_size
));


@@ -6,3 +6,4 @@ edition = "2021"
[dependencies]
coremem = { path = "../../coremem" }
log = "0.4"


@@ -0,0 +1,10 @@
#!/usr/bin/env python3
from stacked_cores_40xx_db import *
sims = [(p, c.logically_inverted()) for (p, c) in filter_meas(run="41")]
sims.sort(key=lambda c: c[1].get(1.0))
for (params, curve) in sims[:20]:
viable = curve.is_viable_inverter()
print(f"{params}: {curve.get(1.0):.3}, {curve.get(0.0):.3}, inv?: {viable}")


@@ -0,0 +1,117 @@
#!/usr/bin/env python3
"""
invoke with the path to a meas.csv file for the stacked_core 51-xx or later demos
to extract higher-level info from them.
"""
import os
import sys
import re
from natsort import natsorted
from stacked_cores import load_csv, labeled_rows, last_row_before_t, extract_m
from stacked_cores_39xx import extract_polarity
class MeasRow:
def __init__(self, t_sec: float, m: list):
self.t_sec = t_sec
self.m = m
def __repr__(self) -> str:
m = ", ".join(f"{v:6}" for v in self.m)
return f"MeasRow({self.t_sec}, [{m}])"
@staticmethod
def from_dict(row_data: dict) -> 'MeasRow':
t_sec = row_data["time"]
m = [int(m + 0.5) for m in extract_m(row_data)]
return MeasRow(t_sec, m)
def format_float_tuple(t: tuple) -> str:
formatted_elems = [f"{e:= 05.3f}," for e in t]
return f"({' '.join(formatted_elems)})"
def format_list(l: list) -> str:
if len(l) == 0: return "[]"
if len(l) == 1: return f"{l}"
formatted_elems = [f" {e}," for e in l]
return "\n".join(["["] + formatted_elems + ["]"])
def indented(s: str) -> str:
return s.replace('\n', '\n ')
class ParameterizedMeas:
def __init__(self, meas = None):
self.meas = meas or {}
def add_meas(self, params: tuple, meas_rows: list):
self.meas[tuple(params)] = meas_rows
def all_rows(self) -> list:
        # this is just `sum(self.meas.values(), [])` but python is an idiot
rows = []
for mrows in self.meas.values():
rows.extend(mrows)
return rows
def runs(self) -> list:
return self.meas.values()
def num_runs(self) -> int:
return len(self.meas)
def __repr__(self) -> str:
meas_entries = "\n".join(
f" {format_float_tuple(k)}: {indented(format_list(v))}," for (k, v) in natsorted(self.meas.items())
)
return f"ParameterizedMeas({{\n{meas_entries}\n}})"
def extract_rows(path: str, times: list) -> list:
header, raw_rows = load_csv(path)
rows = labeled_rows(header, raw_rows)
meas_rows = []
for t in times:
row = last_row_before_t(rows, t)
if not row: return None
meas_rows.append(MeasRow.from_dict(row))
# validate the sim has run to completion
if meas_rows[-1].t_sec < 0.95 * t: return None
meas_rows[-1].t_sec = t # make pretty
return meas_rows
def parse_param(s: str) -> float:
""" parse a parameter in the form of 'p050' or 'n0015' or '000' """
if s == "000":
return 0.0
sign = {'n': -1, 'p': 1}[s[0]]
mag = int(s[1:])
max_mag = 10**(len(s[1:]) - 1)
return sign * mag / max_mag
def extract_params(pstr: str) -> list:
""" extract parameters from a string like -n100-000 """
pieces = [p for p in pstr.split("-") if p]
return [parse_param(p) for p in pieces]
def extract_parameterized_meas(stem: str, times: list) -> ParameterizedMeas:
""" given some stem, parse all parameterized measurements associated with that stem """
base_dir, prefix = os.path.split(stem)
built = ParameterizedMeas()
for entry in os.listdir(base_dir):
if entry.startswith(prefix):
meas_rows = extract_rows(os.path.join(base_dir, entry, "meas.csv"), times)
if not meas_rows: continue
params = extract_params(entry[len(prefix):])
built.add_meas(params, meas_rows)
return built
if __name__ == "__main__":
print(extract_parameterized_meas(sys.argv[1], [float(f) for f in sys.argv[2:]]))


@@ -0,0 +1,89 @@
from inverter_characteristics import Piecewise
# stable inverter (ideal)
fwd_fake_step = Piecewise(
[
[ 0.0, 0.0 ],
[ 0.4, 0.0 ],
[ 0.6, 1.0 ],
[ 1.0, 1.0 ],
]
)
# stable inverter (amplifying)
fwd_fake_1_5x = Piecewise(
[
[ 0.0, 0.0 ],
[ 0.65, 1.0 ],
[ 1.0, 1.0 ],
]
)
# stable inverter (amplifying only from 0.3 -> 0.5)
fwd_fake_slope_change_before_0_5 = Piecewise(
[
[ 0.0, 0.2 ],
[ 0.3, 0.3 ],
[ 0.5, 0.6 ],
[ 1.0, 1.0 ],
]
)
# failed inverter (>1.0 slope happens too late)
# flipping x doesn't fix.
# however, shifting x by -0.1 and y by -0.2 and *then* inverting x does.
# - this gives us a concave-up 1-x like curve
fwd_fake_slope_change_after_0_5 = Piecewise(
[
[ 0.0, 0.2 ],
[ 0.3, 0.3 ],
[ 0.6, 0.5 ],
[ 1.0, 1.0 ],
]
)
slope_fake_hill = [
0.8, 0.9, 1.0, 1.1, 1.2, 1.2, 1.1, 1.0, 0.9, 0.8
]
fwd_fake_hill = Piecewise(
[ (0.1*i, 0.1 * sum(slope_fake_hill[0:i])) for i in range(11) ]
)
fwd_fake_asymmetric_hill = Piecewise(
[
(0.0, 0.20),
(0.2, 0.30),
(0.4, 0.45),
(0.6, 0.75),
(0.8, 0.80),
(1.0, 0.85),
]
)
# valid inverter; the [0.6, 1.0] -> 0.8 mapping *fixes* the logic low value to
# 1.0 - 0.8 = 0.2
# and allows anything 0.6 to 1.0 to be recognized as logic high immediately.
# i.e. "bottoming out" is a *good* thing
fwd_fake_asymmetric_flats = Piecewise(
[
(0.0, 0.20),
(0.2, 0.30),
(0.6, 0.80),
(1.0, 0.80),
]
)
fwd_fake_asymmetric_overdrive = Piecewise(
[
(0.0, 0.40),
(0.3, 0.50),
(0.6, 0.85),
(1.0, 0.90),
]
)
fwd_fake_asymmetric_bottom_out = Piecewise(
[
(0.0, 0.00),
(0.8, 0.99),
(1.0, 1.00),
]
)



@@ -0,0 +1,18 @@
#!/usr/bin/env python3
from natsort import natsorted
from stacked_cores_52xx import *
from stacked_cores_52xx_plotters import *
or_gates = read_db(lambda name: name.startswith("52-"))
sweep_a0 = lambda a1, points=101: [(unit_to_m(x/(points-1)), a1, None, None) for x in range(points)]
sweep_a1 = lambda a0, points=101: [(a0, unit_to_m(x/(points-1)), None, None) for x in range(points)]
for name, meas in natsorted(or_gates.items()):
trace = eval_series(meas, sweep_a1(-17000), extract_52xx_tx)
plot(f"{name}", "a1", trace)
plot_slope(f"slope {name}", "a1", trace)


@@ -0,0 +1,32 @@
#!/usr/bin/env python3
from stacked_cores_52xx import *
from stacked_cores_52xx_plotters import *
def extract_53xx_tx(meas_rows: list) -> tuple:
"""
extracts a flat tuple of input/output M mappings from a 53xx run
"""
return (meas_rows[1].m[0], meas_rows[0].m[1], meas_rows[0].m[2], meas_rows[1].m[3])
sweep_buf_inputs = lambda offset=0, points=101: [(None, m, -m + offset, None) for m in sweep_1d(points)]
sweep_m1 = lambda m2, points=101: [(None, m, m2, None) for m in sweep_1d(points)]
buf_gates = read_db(lambda name: name.startswith("53-buf-no_inp_couple-"))
for name, meas in natsorted(buf_gates.items()):
print(name)
# normal M2 = -M1 sweep
trace = eval_series(meas, sweep_buf_inputs(points=41), extract_53xx_tx, y_idx=0)
plot(name, "a0", trace)
plot_slope(f"slope {name}", "a0", trace)
# M2 = 0.25 - M1 shifted sweep
# trace = eval_series(meas, sweep_buf_inputs(8500), extract_53xx_tx, y_idx=0)
# plot(f"In=0.25-Ip {name}", "a0", trace)
# plot_slope(f"slope In=0.25-Ip {name}", "a0", trace)
# M2 fixed at 0.0 while M1 sweeps
# trace = eval_series(meas, sweep_m1(0.0), extract_53xx_tx, y_idx=0)
# plot(f"In=0 {name}", "a0", trace)
# plot_slope(f"slope In=0 {name}", "a0", trace)


@@ -0,0 +1,24 @@
#!/usr/bin/env python3
from natsort import natsorted
from stacked_cores_52xx import *
from stacked_cores_52xx_plotters import *
def extract_54xx_tx(meas_rows: list) -> tuple:
"""
extracts a flat tuple of input/output M mappings from a 54xx run
"""
return (meas_rows[0].m[0], meas_rows[2].m[0])
split_gates = read_db(lambda name: name.startswith("54-"))
sweep_input = lambda points=101: [( unit_to_m(x/(points-1)), None ) for x in range(points)]
for name, meas in natsorted(split_gates.items()):
trace = eval_series(meas, sweep_input(), extract_54xx_tx)
plot(f"{name}", "a1", trace)
plot_slope(f"slope {name}", "a1", trace)


@@ -0,0 +1,24 @@
#!/usr/bin/env python3
from natsort import natsorted
from stacked_cores_52xx import *
from stacked_cores_52xx_plotters import *
def extract_55xx_tx(meas_rows: list) -> tuple:
"""
extracts a flat tuple of input/output M mappings from a 55xx run
"""
return (meas_rows[0].m[1], meas_rows[2].m[1])
split_gates = read_db(lambda name: name.startswith("55-"))
sweep_input = lambda points=101: [( unit_to_m(x/(points-1)), None ) for x in range(points)]
for name, meas in natsorted(split_gates.items()):
trace = eval_series(meas, sweep_input(), extract_55xx_tx)
plot(f"{name}", "a1", trace)
plot_slope(f"slope {name}", "a1", trace)


@@ -0,0 +1,32 @@
#!/usr/bin/env python3
from natsort import natsorted
from stacked_cores_52xx import *
from stacked_cores_52xx_plotters import *
def extract_56xx_tx(meas_rows: list) -> tuple:
"""
extracts a flat tuple of input/output M mappings from a 56xx run
"""
return (
meas_rows[0].m[0], # input
meas_rows[0].m[1], # input
meas_rows[1].m[2], # output
meas_rows[1].m[3], # output
meas_rows[0].m[4], # input
meas_rows[0].m[5], # input
)
buf_gates = read_db(lambda name: name.startswith("56-"))
sweep_buf_inputs = lambda points=101: [(m, m, None, None, -m, -m) for m in sweep_1d(points)]
for name, meas in natsorted(buf_gates.items()):
trace = eval_series(meas, sweep_buf_inputs(), extract_56xx_tx, y_idx=2)
plot(f"{name}", "a1", trace)
plot_slope(f"slope {name}", "a1", trace)


@@ -0,0 +1,37 @@
#!/usr/bin/env python3
from natsort import natsorted
from stacked_cores_52xx import *
from stacked_cores_52xx_plotters import *
def extract_57xx_tx(meas_rows: list) -> tuple:
"""
extracts a flat tuple of input/output M mappings from a 57xx run
"""
return (
meas_rows[0].m[0], # input
meas_rows[1].m[1], # output
meas_rows[0].m[2], # input
meas_rows[0].m[3], # input
meas_rows[1].m[4], # output
meas_rows[0].m[5], # input
)
# buf_gates = read_db(lambda name: name.startswith("57-"))
buf_gates = read_db(lambda name: name.startswith("57-buf-1p-2n-"))
sweep_buf_inputs = lambda points=101: [(m, None, m, -m, None, -m) for m in sweep_1d(points)]
sweep_pos_input = lambda mneg, points=101: [(m, None, m, mneg, None, mneg) for m in sweep_1d(points)]
sweep_2n1p_input = lambda points=101: [(m, None, None, -m, None, -m) for m in sweep_1d(points)]
for name, meas in natsorted(buf_gates.items()):
# trace = eval_series(meas, sweep_buf_inputs(41), extract_57xx_tx, y_idx=1)
# trace = eval_series(meas, sweep_pos_input(0, 41), extract_57xx_tx, y_idx=1)
trace = eval_series(meas, sweep_2n1p_input(41), extract_57xx_tx, y_idx=1)
plot(f"{name}", "a1", trace)
plot_slope(f"slope {name}", "a1", trace)


@@ -0,0 +1,43 @@
#!/usr/bin/env python3
from natsort import natsorted
from stacked_cores_52xx import *
from stacked_cores_52xx_plotters import *
def extract_58xx_tx(meas_rows: list) -> tuple:
"""
extracts a flat tuple of input/output M mappings from a 58xx run
"""
return (
meas_rows[0].m[0], # input (neg)
meas_rows[0].m[2], # I/O (pos)
meas_rows[0].m[3], # I/O (neg)
meas_rows[0].m[5], # input (pos)
meas_rows[1].m[2], # output (pos)
meas_rows[1].m[3], # output (neg)
0.5 * (meas_rows[1].m[2] - meas_rows[1].m[3]) # output (diff)
)
buf_gates = read_db(lambda name: name.startswith("58-"))
sweep_buf_inputs = lambda points=101: [(-m, m, -m, m, None, None, None) for m in sweep_1d(points)]
sweep_mpos = lambda mneg, points=101: [(mneg, m, mneg, m, None, None, None) for m in sweep_1d(points)]
for name, meas in natsorted(buf_gates.items()):
# sweep Mneg = -Mpos
# trace = eval_series(meas, sweep_buf_inputs(41), extract_58xx_tx, y_idx=6)
# plot(f"{name}", "Mpos", trace)
# plot_slope(f"slope {name}", "Mpos", trace)
# sweep M0 with M1 fixed constant (to check for some `max(M+, M-)`-like effect
trace = eval_series(meas, sweep_mpos(-5000, 41), extract_58xx_tx, y_idx=6)
plot(f"{name}", "Mneg=-5000", trace)
plot_slope(f"slope {name}", "Mneg=-5000", trace)
trace = eval_series(meas, sweep_mpos(5000, 41), extract_58xx_tx, y_idx=6)
plot(f"{name}", "Mneg=5000", trace)
plot_slope(f"slope {name}", "Mneg=5000", trace)


@@ -0,0 +1,30 @@
#!/usr/bin/env python3
from natsort import natsorted
from stacked_cores_52xx import *
from stacked_cores_52xx_plotters import *
def extract_60xx_tx(meas_rows: list) -> tuple:
"""
extracts a flat tuple of input/output M mappings from a 60xx run
"""
return (
meas_rows[0].m[0], # input
meas_rows[1].m[2], # output
# meas_rows[1].m[1], # intermediate
# meas_rows[1].m[3], # intermediate
)
buf_gates = read_db(lambda name: name.startswith("60-"))
sweep_inv_input = lambda points=101: [(m, None) for m in sweep_1d(points)]
for name, meas in natsorted(buf_gates.items()):
trace = eval_series(meas, sweep_inv_input(41), extract_60xx_tx, y_idx=1)
plot(f"{name}", "M", trace)
plot_slope(f"slope {name}", "M", trace)


@@ -0,0 +1,307 @@
#!/usr/bin/env python3
from fake_cores_db import *
from stacked_cores_40xx_db import *
_3e10 = "30000001024e0"
_5e10 = "49999998976e0"
class SimParamsCascaded(SimParams):
def __init__(self, p1: SimParams, p2: SimParams):
super().__init__(p1.couplings, p1.wrappings_spec, p1.um, p1.drive_str)
self.p1 = p1
self.p2 = p2
@property
def is_inverter(self) -> bool:
return self.p1.is_inverter ^ self.p2.is_inverter
@property
def human_name(self) -> str:
return f"Cascade: {self.p1.human_name} -> {self.p2.human_name}"
from_params = lambda l: [
(p, get_meas(p)) for p in l if get_meas(p)
]
# plot pre-40xx sims
for (name, curve) in [
# ("fake step", fwd_fake_step.logically_inverted()),
# ("fake 1.5x", fwd_fake_1_5x.logically_inverted()),
# ("fake slope-change", fwd_fake_slope_change_before_0_5.logically_inverted()),
# ("fake slope-change (delayed)", fwd_fake_slope_change_after_0_5.logically_inverted()),
# ("fake slope-change (delayed, shifted)", fwd_fake_slope_change_after_0_5.shifted_x(-0.1).logically_inverted()),
# ("fake slope-change (delayed, shifted, inv-xy)", fwd_fake_slope_change_after_0_5.shifted_x(-0.1).shifted_y(-0.2).logically_inverted_x()),
# ("fake slope-change (delayed, flipped)", fwd_fake_slope_change_after_0_5.logically_inverted_x().logically_inverted()),
# ("fake hill", fwd_fake_hill.logically_inverted()),
# ("fake asymmetric hill", fwd_fake_asymmetric_hill.logically_inverted()),
# ("fake asymmetric flats", fwd_fake_asymmetric_flats.logically_inverted()),
# ("fake asymmetric overdrive", fwd_fake_asymmetric_overdrive.logically_inverted()),
# ("fake asymmetric bottom out", fwd_fake_asymmetric_bottom_out.logically_inverted()),
# ("18", fwd_18.logically_inverted()),
# ("24 5:1 (2e10 I)", fwd_24_5_1_2e10.logically_inverted()),
# ("24 5:1 (5e10 I)", fwd_24_5_1_5e10.logically_inverted()),
# ("24 5:1 (8e10 I)", fwd_24_5_1_8e10.logically_inverted()),
# ("26", fwd_26.logically_inverted()),
# ("38 1:0 (2e10 I)", fwd_38_1_0.logically_inverted()),
# ("38 1:0 (5e10 I)", fwd_38_1_0_5e10.logically_inverted()),
# ("38 2:0 (2e10 I)", fwd_38_2_0.logically_inverted()),
# ("38 2:0 (5e10 I)", fwd_38_2_0_5e10.logically_inverted()),
# ("38 3:0 (2e10 I)", fwd_38_3_0.logically_inverted()),
# ("38 3:0 (5e10 I)", fwd_38_3_0_5e10.logically_inverted()),
# ("38 4:0 (2e10 I)", fwd_38_4_0.logically_inverted()),
# ("38 4:0 (5e10 I)", fwd_38_4_0_5e10.logically_inverted()),
# ("39 2:0 (2e10 I)", inv_39_2_0_2e10),
# ("39 2:0 (5e10 I)", inv_39_2_0_5e10),
# ("39 2:0 (8e10 I)", inv_39_2_0_8e10),
# ("39 2:0 (1e11 I)", inv_39_2_0_1e11),
# ("39 2:0 (15e10 I)", inv_39_2_0_15e10),
]:
curve.plot(title = f"{name} mapping")
curve.logically_inverted().plot_slope(title = f"{name} slope")
curve.plot_equilibrium(title = f"{name} equilibrium")
# curve.plot_integral(title = f"{name} integrated")
of_interest = []
# plot select stims:
# of_interest += filter_meas(rad_um=800, drive=5e10, couplings=12, wrappings=5)
# of_interest += filter_meas(rad_um=800, drive=5e10, couplings=8, wrappings=11)
# of_interest += filter_meas(rad_um=800, drive=5e10, couplings=12, wrappings=7)
# of_interest += filter_meas(rad_um=800, drive=5e10, couplings=12, wrappings=5)
# of_interest += filter_meas(rad_um=800, drive=5e10, couplings=10, wrappings=9)
# of_interest += filter_meas(rad_um=800, drive=1e11, couplings=18, wrappings=5)
# of_interest += filter_meas(rad_um=800, drive=1e11, couplings=12, wrappings=7)
# of_interest += filter_meas(rad_um=800, drive=1e11, couplings=24, wrappings=3)
# of_interest += [(p, c.shifted_y(-0.13)) for (p, c) in filter_meas(rad_um=800, drive=1e11, couplings=12, wrappings=7)]
# of_interest += filter_meas(run="40", rad_um=800, drive=1e11, couplings=8, wrappings=11)
# of_interest += filter_meas(run="40", rad_um=800, drive=1e11, couplings=10, wrappings=9)
# of_interest += filter_meas(run="40", rad_um=1200, drive=1e11, couplings=12, wrappings=9)
# of_interest += filter_meas(run="40", rad_um=1200, drive=1e11, couplings=10, wrappings=11)
# of_interest += filter_meas(run="40", rad_um=1200, drive=2e11, couplings=12, wrappings=9)
# of_interest += filter_meas(run="41", viable_inverter=True)
# of_interest = [
# (p, c) for (p, c) in of_interest if p not in [
# SimParams41(9, 3, 600, "3e9"),
# SimParams41(9, 1, 400, "3e9"),
# SimParams41(10, 3, 800, "25e8"),
# SimParams41(9, 1, 400, "2e9"),
# SimParams41(10, 3, 800, "2e9"),
# SimParams41(10, 3, 800, "3e9"),
# SimParams41(10, 3, 800, "1e9"),
# SimParams41(10, 3, 800, "5e9"),
# SimParams41(10, 3, 800, "5e10"),
# SimParams41(9, 3, 600, "1e10"),
# SimParams41(9, 1, 400, "2e10"),
# ]
# ]
of_interest += from_params(
[
# SimParams41(4, 3, 400, "2e10"),
# SimParams41(4, 3, 400, "4e10"),
# SimParams41(16, 2, 800, "2e10"),
# SimParams41(18, 1, 600, "3e9"),
# SimParams41(18, 1, 600, "5e9"),
# SimParams41(18, 1, 600, "1e10"),
# SimParams41(18, 1, 600, "2e10"),
# SimParams41(4, 3, 400, "1e10"),
# SimParams41(9, 1, 400, "1e10"),
# SimParams41(12, 2, 600, "1e10"),
# SimParams41(12, 2, 600, "5e9"),
# SimParams41(10, 3, 800, "2e10"),
# SimParams41(6, 2, 400, "1e10"),
# SimParams41(9, 3, 600, "2e10"),
# SimParams41(10, 3, 800, "1e10"),
# SimParams41(24, 1, 800, "3e9"),
# SimParams41(24, 1, 800, "5e9"),
# SimParams41(16, 2, 800, "1e10"),
# SimParams41(24, 2, 1200, "5e9"),
# SimParams41(24, 2, 1200, "1e10"),
# SimParams41(36, 1, 1200, "5e9"),
# SimParams41(36, 1, 1200, "4e9"),
# SimParams41(36, 1, 1200, "3e9"),
# # SimParams41(9, 1, 400, "5e9"),
# SimParams41(18, 0, 400, "1e10"),
# SimParams41(18, 0, 400, "5e9"),
# SimParams41(9, 1, 400, "5e9"),
# SimParams41(9, 1, 400, "1e10"),
# SimParams41(9, 1, 400, "2e10"),
]
)
all_viable_inverters = filter_meas(viable_inverter=True)
# inverters with steepest starting slope
inverters_with_steepest_slope0 = from_params(
[
SimParams48(5e2, 4e4, 4000, 200, 9, 1, 400, "1e10"),
# SimParams48(5e2, 2e4, 2000, 100, 9, 1, 400, "1e10"),
SimParams48(5e2, 1e4, 1000, 50, 9, 1, 400, "1e10"),
SimParams48(1e3, 1e4, 2000, 100, 9, 1, 400, "1e10"),
SimParams41(36, 1, 1200, "5e9"),
SimParams41(24, 2, 1200, "1e10"),
SimParams41(16, 2, 800, "1e10"),
SimParams41(12, 2, 600, "1e10"),
]
)
_47xx_all = filter_meas(run="47")
_48xx_all = filter_meas(run="48")
_48xx_study = from_params(
[
# T0: y0=0.62, slope0=1.4 until x=0.20
SimParams48(5e2, 1e4, 1000, 50, 9, 1, 400, "1e10"),
# y0=0.67, slope0=1.0x until x=0.25
# SimParams48(5e2, 1e4, 1000, 50, 9, 1, 400, "2e10"),
# y0=0.55, slope0=1.2 until x=0.30
SimParams48(5e2, 5e3, 1000, 50, 9, 1, 400, "1e10"),
# # y0=0.60, slope0=0.96
# SimParams48(5e2, 5e3, 1000, 50, 9, 1, 400, "2e10"),
# y0=0.63, slope0=1.0 until x=0.15
SimParams48(1e3, 1e4, 2000, 100, 9, 1, 400, "2e10"),
# y0=0.57, slope0=1.25 until x=0.30
SimParams48(1e3, 1e4, 2000, 100, 9, 1, 400, "1e10"),
# y0=0.47, slope0=1.1 until x=0.30
SimParams48(1e3, 1e4, 2000, 100, 9, 1, 400, "5e9"),
# y0=0.52, slope0=1.3 until x=0.20
SimParams48(5e2, 1e4, 1000, 50, 9, 1, 400, "5e9"),
# y0=0.67, slope0=1.6 until x=0.20
SimParams48(5e2, 2e4, 2000, 100, 9, 1, 400, "1e10"),
# y0=0.57, slope0=1.3 until x=0.20
SimParams48(5e2, 2e4, 2000, 100, 9, 1, 400, "5e9"),
# y0=0.70, slope0=1.7 until x=0.15
SimParams48(5e2, 4e4, 4000, 200, 9, 1, 400, "1e10"),
# y0=0.59, slope0=1.4 until x=0.20
SimParams48(5e2, 4e4, 2000, 100, 9, 1, 400, "5e9"),
# y0=0.64, slope0=1.4 until x=0.20
SimParams48(2e2, 1e4, 1000, 50, 9, 1, 400, "1e10"),
# y0=0.71, slope0=1.6 until x=0.15
SimParams48(5e2, 1e5, 10000, 500, 9, 1, 400, "1e10"),
# y0=0.62, slope0=1.3 until x=0.20
SimParams48(5e2, 1e5, 10000, 500, 9, 1, 400, "5e9"),
# y0=0.90, slope0=1.3 to x=0.04
SimParams48(5e2, 2e4, 2000, 100, 6, 2, 400, "2e10"),
]
)
_49xx_study = from_params(
[
# slope ranges are measured from x=0 to x=0.2
# y(0)=0.90, y(1)=0.99, slope0>0.28
SimParams50(5e2, 4e4, 4000, 200, 5, 1, 400, "2e10"),
# y(0)=0.79, y(1)=0.98, slope0=0.36 to 0.27
SimParams50(5e3, 4e4, 4000, 200, 5, 1, 400, "2e10"),
# y(0)=0.65, y(1)=0.95, slope0=0.44 to 0.31
SimParams50(5e3, 4e4, 4000, 200, 5, 1, 400, "1e10"),
# y(0)=0.09, y(1)=0.17, slope0=0.07. "best" comparable 47-xx sim
# SimParams47(5, 1, 400, "1e10"),
# y(0)=0.81, y(1)=0.99, slope0=0.50 to 0.30
SimParams50(1e3, 2e4, 2000, 100, 5, 1, 400, "2e10"),
# y(0)=0.60, y(1)=0.89, slope0=0.40 to 0.29
SimParams50(2e3, 2e4, 2000, 100, 5, 1, 400, "1e10"),
# y(0)=0.73, y(1)=0.97, slope0=0.44 to 0.32
SimParams50(2e3, 2e4, 2000, 100, 5, 1, 400, "2e10"),
# y(0)=0.62, y(1)=0.80, slope0=0.23 to 0.22
SimParams50(5e2, 1e4, 1000, 50, 5, 1, 400, "1e10"),
# y(0)=0.65, y(1)=0.80, slope0>0.18
SimParams50(5e2, 2e4, 2000, 100, 5, 1, 400, "1e10"),
# y(0)=0.63, y(1)=0.89, slope0>0.27
SimParams50(1e3, 2e4, 2000, 100, 5, 1, 400, "1e10"),
# y(0)=0.46, y(1)=0.66, slope0>0.23
SimParams50(2e3, 2e4, 2000, 100, 5, 1, 400, "5e9"),
# y(0)=0.50, y(1)=0.74, slope0>0.25
SimParams50(5e3, 2e4, 2000, 100, 5, 1, 400, "1e10"),
# y(0)=0.61, y(1)=0.86, slope0>0.25
SimParams50(5e3, 2e4, 2000, 100, 5, 1, 400, "2e10"),
# y(0)=0.39, y(1)=0.48, slope0>0.11
# SimParams50(1e3, 2e4, 2000, 100, 5, 1, 400, "5e9"),
# y(0)=0.40, y(1)=0.56, slope0>0.15
# SimParams50(1e4, 2e4, 2000, 100, 5, 1, 400, "1e10"),
# y(0)=0.51, y(1)=0.68, slope0>0.16
# SimParams50(1e4, 2e4, 2000, 100, 5, 1, 400, "2e10"),
# y(0)=0.30, y(1)=0.40, slope0>0.09
# SimParams50(2e4, 2e4, 2000, 100, 5, 1, 400, "1e10"),
]
)
_51xx_study = from_params(
[
SimParams51(5e2, 2e4, 2000, 100, 5, 1, 400, _5e10),
SimParams51(1e3, 2e4, 2000, 100, 5, 1, 400, "2e10"),
SimParams51(2e3, 2e4, 2000, 100, 5, 1, 400, "1e10"),
SimParams51(2e3, 2e4, 2000, 100, 5, 1, 400, "2e10"),
SimParams51(2e3, 2e4, 2000, 100, 5, 1, 400, _3e10),
SimParams51(5e3, 2e4, 2000, 100, 5, 1, 400, _3e10),
SimParams51(5e3, 2e4, 2000, 100, 5, 1, 400, _5e10),
SimParams51(2e3, 2e4, 2000, 100, 3, 2, 400, "2e10"),
SimParams51(2e3, 2e4, 2000, 100, 3, 2, 400, _5e10),
SimParams51(2e3, 2e4, 2000, 100, 2, 3, 400, _5e10),
SimParams51(2e3, 2e4, 2000, 100, 2, 3, 400, 1e11),
]
)
# of_interest += filter_meas(run="40")
# of_interest += filter_meas(run="42", wrappings=7)
# of_interest += filter_meas(rad_um=400, run="41")
# of_interest += filter_meas(run="40", rad_um=800, couplings=18, wrappings=5)
# of_interest += filter_meas(run="41", viable_inverter=True)
# of_interest += filter_meas(run="42", rad_um=400, couplings=4)
# of_interest += filter_meas(run="42", rad_um=400, couplings=9)
# of_interest += filter_meas(run="42", rad_um=400, couplings=2)
# of_interest += filter_meas(run="42", rad_um=400, couplings=6)
# of_interest += filter_meas(run="41")
# of_interest += filter_meas(run="48")
# of_interest += filter_meas(run="48", coupling_cond=1e4, drive=1e10)
# of_interest += filter_meas(run="48", coupling_cond=1e4, drive=5e9)
# of_interest += inverters_with_steepest_slope0
# of_interest += _47xx_all
# of_interest += _48xx_study
# of_interest += _49xx_study
of_interest += _51xx_study
# plot cascaded inverter -> buffer
# for (inv_p, inv_curve) in filter_meas(is_inverter=True):
# for (fwd_p, fwd_curve) in filter_meas(rad_um=400, is_inverter=False):
# of_interest += [ (SimParamsCascaded(inv_p, fwd_p), inv_curve.cascaded(fwd_curve)) ]
# plot cascaded buffer -> inverter
# for (fwd_p, fwd_curve) in filter_meas(run="41", is_inverter=False):
# for (inv_p, inv_curve) in filter_meas(is_inverter=True):
# of_interest += [ (SimParamsCascaded(fwd_p, inv_p), fwd_curve.cascaded(inv_curve)) ]
# of_interest += filter_meas(is_inverter=False)
# of_interest += filter_meas(is_inverter=True)
# of_interest.sort(key = lambda i: -i[1].max_abs_slope())
# of_interest.sort(key = lambda i: -i[1].get_range()) # output range
# of_interest.sort(key = lambda i: i[1].get(0.5) - i[1].get(1.0)) # delayed output swing
# of_interest.sort(key = lambda i: i[1].get(0.5) - i[1].get(0.0)) # early output swing
# of_interest.sort(key = lambda i: i[1].get_repeated(1.0) - i[1].get_repeated(0.0)) # inverter strength
for (params, curve) in of_interest:
curve = curve.flat_extrapolation()
fwd = curve.logically_inverted() if params.is_inverter else curve
fwd.plot(title = f"{params.human_name} mapping")
fwd.plot_slope(title = f"{params.human_name} slope")
inv = fwd.logically_inverted()
# if params.is_inverter or True:
# inv.plot_equilibrium(title = f"{params.human_name} equilibrium")


@@ -9,48 +9,67 @@ import re
 from stacked_cores import load_csv, labeled_rows, last_row_before_t, extract_m
-def extract_one(path: str, t_first: float, t_last: float):
+def extract_one(path: str, t_first: float, t_last: float, t_mid: float = None):
     header, raw_rows = load_csv(path)
     rows = labeled_rows(header, raw_rows)
     tx_init = last_row_before_t(rows, t_first)
     tx_fini = last_row_before_t(rows, t_last)
-    m_init = extract_m(tx_init)
-    m_fini = extract_m(tx_fini)
-    return m_init[0], m_fini[-1]
+    tx_mid = last_row_before_t(rows, t_mid) if t_mid is not None else None
+    if tx_fini and float(tx_fini["time"]) < 0.95 * t_last:
+        tx_fini = None
+    m_init = extract_m(tx_init) if tx_init is not None else [None]
+    m_fini = extract_m(tx_fini) if tx_fini is not None else [None]
+    m_mid = extract_m(tx_mid)[1:-1] if tx_mid is not None else []
+    return m_init[0], m_fini[-1], m_mid
 def extract_polarity(stem: str) -> float:
     s = None
-    if re.search("-p\d\d\d", stem):
-        s = re.search("-p\d\d\d", stem).group(0)
-    if re.search("-n\d\d\d", stem):
-        s = re.search("-n\d\d\d", stem).group(0)
+    if re.search("-p\d\d\d\d?", stem):
+        s = re.search("-p\d\d\d\d?", stem).group(0)
+    if re.search("-n\d\d\d\d?", stem):
+        s = re.search("-n\d\d\d\d?", stem).group(0)
     if s:
         sign = {'n': -1, 'p': 1}[s[1]]
         mag = int(s[2:])
-        return sign * mag * 0.01
+        max_mag = 10**(len(s[2:]) - 1)
+        return sign * mag / max_mag
     if "-000" in stem:
         return 0.00
-def extract_39xx(base_path: str, t_first: str = "2e-9", t_last: str = "3e-9"):
+def extract_39xx(base_path: str, t_first: str = "2e-9", t_last: str = "3e-9", t_mid: str = None):
+    t_first = float(t_first)
+    t_last = float(t_last)
+    t_mid = float(t_mid) if t_mid is not None else None
     base_dir, prefix = os.path.split(base_path)
     mappings = {}
     for entry in os.listdir(base_dir):
         if entry.startswith(prefix):
-            (input_, output) = extract_one(os.path.join(base_dir, entry, "meas.csv"), float(t_first), float(t_last))
+            (input_, output, mid) = extract_one(os.path.join(base_dir, entry, "meas.csv"), t_first, t_last, t_mid)
             polarity = extract_polarity(entry)
-            mappings[int(round(input_))] = (int(round(output)), polarity)
+            if input_ is not None and output is not None:
+                mappings[int(round(input_))] = (int(round(output)), polarity, [round(m) for m in mid])
-    print("Piecewise(")
-    print("    [")
-    for i, (o, polarity) in sorted(mappings.items()):
-        comment = f"  # {polarity:.2}" if polarity is not None else ""
-        print(f"        [ {i:6}, {o:6} ],{comment}")
-    print("    ]")
-    print(")")
+    if mappings:
+        print("Piecewise(")
+        print("    [")
+        for i, (o, polarity, mid) in sorted(mappings.items()):
+            comments = []
+            if polarity is not None:
+                comments += [f"{polarity:= 05.3f}"]
+            for core, val in enumerate(mid):
+                comments += [f"M{core+1}={val:5}"]
+            comment = "  # " + ", ".join(comments) if comments else ""
+            print(f"        [ {i:6}, {o:6} ],{comment}")
+        print("    ]")
+        print(")")
 if __name__ == '__main__':
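the updated polarity parsing above, assembled into a standalone runnable sketch (indentation reconstructed; it mirrors the new side of the diff, which accepts 3- or 4-digit magnitudes and scales by the appropriate power of ten):

```python
import re

def extract_polarity(stem: str):
    # match signed polarity suffixes like "-p050" (3 digits) or "-n1000" (4 digits)
    s = None
    if re.search(r"-p\d\d\d\d?", stem):
        s = re.search(r"-p\d\d\d\d?", stem).group(0)
    if re.search(r"-n\d\d\d\d?", stem):
        s = re.search(r"-n\d\d\d\d?", stem).group(0)
    if s:
        sign = {'n': -1, 'p': 1}[s[1]]
        mag = int(s[2:])
        max_mag = 10**(len(s[2:]) - 1)  # 100 for 3 digits, 1000 for 4
        return sign * mag / max_mag
    if "-000" in stem:
        return 0.00
```

e.g. `"sim-p050"` parses to 0.5 and `"sim-n1000"` to -1.0, so old 3-digit and new 4-digit stems map onto the same [-1, 1] scale.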


@@ -0,0 +1,377 @@
from inverter_characteristics import Piecewise
fwd_17_4_0_8e10 = Piecewise(
[
[ -16381, 6688 ],
[ -15885, 6778 ],
[ -14831, 6878 ],
[ -13622, 7004 ],
[ -883, 8528 ],
[ 6252, 9496 ],
[ 7846, 9703 ],
[ 8148, 9766 ],
[ 8425, 9831 ],
[ 8705, 9892 ],
[ 8988, 9916 ],
[ 9866, 10114 ],
[ 11179, 10234 ],
[ 12033, 10382 ],
[ 12491, 10422 ],
[ 13135, 10494 ],
[ 14363, 10649 ],
]
).normalized(17000)
fwd_18 = Piecewise(
[
[ -16206, -1131 ],
[ -15192, -746 ],
[ -12827, 33 ],
[ -642, 4990 ],
[ 13082, 9652 ],
[ 16696, 10600 ],
]
).normalized(17000)
fwd_24_5_1_2e10 = Piecewise(
[
[ -12912, -8487 ],
[ -4754, -6045 ],
[ 2687, -2560 ],
[ 3936, -1774 ],
[ 4267, -1517 ],
[ 4504, -1314 ],
[ 4710, -1132 ],
[ 4820, -1075 ],
[ 4884, -1042 ],
[ 4948, -1012 ],
[ 5046, -968 ],
[ 5205, -897 ],
[ 5364, -829 ],
[ 5525, -760 ],
[ 5843, -622 ],
[ 6764, -197 ],
[ 9467, 788 ],
]
).normalized(15000)
fwd_24_5_1_5e10 = Piecewise(
[
[ -15208, -6303 ],
[ -13396, -5388 ],
[ -11992, -4516 ],
[ -11991, -4499 ],
[ -9379, -2953 ],
[ -4757, 531 ],
[ -2, 4734 ],
[ 3074, 7760 ],
[ 4854, 9784 ],
[ 5611, 10736 ],
[ 5994, 11126 ],
[ 6298, 11404 ],
[ 6678, 11757 ],
[ 7196, 12200 ],
[ 7667, 12589 ],
[ 8238, 13048 ],
[ 8239, 13046 ],
[ 9613, 14027 ],
[ 10585, 14622 ],
[ 12048, 15346 ],
]
).normalized(17000)
fwd_24_5_1_8e10 = Piecewise(
[
[ -16412, -3392 ],
[ -15266, -2681 ],
[ -14036, -1897 ],
[ -12789, -1110 ],
[ -8766, 1588 ],
[ -2052, 6544 ],
[ 2389, 9989 ],
[ 4225, 11437 ],
[ 5194, 12182 ],
[ 5971, 12438 ],
[ 6901, 12937 ],
[ 8308, 13632 ],
[ 9910, 14365 ],
[ 10583, 14662 ],
[ 11240, 14850 ],
[ 12114, 15171 ],
[ 13862, 15600 ],
]
).normalized(17000)
fwd_26 = Piecewise(
[
[ -14687, -7326 ],
[ -13049, -6503 ],
[ -11785, -5833 ],
[ -4649, -1447 ],
[ 4961, 7059 ],
[ 11283, 11147 ],
]
).normalized(17000)
fwd_38_1_0 = Piecewise(
[
[ -12817, -8131 ], # -1.00
[ -12239, -7798 ], # -0.80
[ -4859, -2587 ], # -0.50
[ 1490, 3012 ], # -0.30
[ 3866, 5327 ], # -0.20
[ 6030, 7237 ], # -0.10
[ 7747, 8357 ], # 0.00
[ 9494, 9202 ], # +0.10
[ 11261, 10011 ], # +0.20
[ 12941, 10808 ], # +0.30
[ 15415, 11986 ], # +0.50
[ 16196, 12375 ], # +0.80
# [ 16182, 12352 ], # +1.00
]
).normalized(17000)
fwd_38_1_0_5e10 = Piecewise(
[
[ -16180, -7079 ], # -1.00
[ -14443, -5965 ], # -0.50
[ -5579, -13 ], # -0.20
[ 10033, 7676 ], # 0.00
[ 14986, 9375 ], # +0.20
[ 15149, 9606 ], # +0.50
[ 15801, 9924 ], # +1.00
]
).normalized(17000)
fwd_38_2_0 = Piecewise(
[
[ (-13745 + -13012)/2, -6222 ], # -1.00
[ (-13097 + -12338)/2, -5662 ], # -0.80
[ (-4969 + -4744)/2, 2373 ], # -0.50
[ (535 + 611)/2, 8793 ], # -0.30
[ (1772 + 2070)/2, 10467 ], # -0.20
[ (3143 + 3200)/2, 11906 ], # -0.10
[ (4472 + 4114)/2, 12921 ], # 0.00
[ (5838 + 5144)/2, 13788 ], # +0.10
[ (7221 + 6291)/2, 14530 ], # +0.20
[ (8558 + 7644)/2, 15127 ], # +0.30
[ (11159 + 10397)/2, 15865 ], # +0.50
[ (12778 + 14243)/2, 16162 ], # +0.80
[ (12430 + 15653)/2, 16202 ], # +1.00
]
).normalized(17000)
fwd_38_2_0_5e10 = Piecewise(
[
[ (-16386 + -16170)/2, -3490 ], # -1.00
[ (-16107 + -15529)/2, -3035 ], # -0.80
[ (-15075 + -14122)/2, -1827 ], # -0.50
[ (-13387 + -12396)/2, -63 ], # -0.30
[ (-5358 + -5201)/2, 7423 ], # -0.20
[ (2355 + 1719)/2, 12039 ], # -0.10
[ (7563 + 5962)/2, 13479 ], # 0.00
[ (10617 + 9318)/2, 14282 ], # +0.10
[ (12779 + 12447)/2, 14796 ], # +0.20
[ (12649 + 15269)/2, 15034 ], # +0.30
[ (13077 + 16320)/2, 15140 ], # +0.50
[ (14410 + 16557)/2, 15260 ], # +0.80
[ (15281 + 16623)/2, 15331 ], # +1.00
]
).normalized(17000)
fwd_38_3_0 = Piecewise(
[
[ (-13956 + -13890 + -13077)/3, -5203 ], # -1.00
[ (-13292 + -13161 + -12374)/3, -4518 ], # -0.80
[ (-4979 + -4885 + -4717)/3, 5051 ], # -0.50
[ (381 + -153 + 31)/3, 11264 ], # -0.30
[ (1531 + 503 + 1006)/3, 12509 ], # -0.20
[ (2862 + 1120 + 1743)/3, 13549 ], # -0.10
[ (4180 + 1821 + 2239)/3, 14386 ], # 0.00
[ (5560 + 2564 + 2899)/3, 15033 ], # +0.10
[ (6986 + 3436 + 3701)/3, 15451 ], # +0.20
[ (8358 + 4396 + 4732)/3, 15738 ], # +0.30
[ (10482 + 6644 + 7735)/3, 16081 ], # +0.50
[ (11246 + 12478 + 12663)/3, 16343 ], # +0.80
[ (11436 + 13343 + 14411)/3, 16380 ], # +1.00
]
).normalized(17000)
fwd_38_3_0_5e10 = Piecewise(
[
[ (-16403 + -16389 + -16152)/3, -1175 ], # -1.00
[ (-16134 + -16084 + -15471)/3, -701 ], # -0.80
[ (-15192 + -14891 + -14016)/3, 777 ], # -0.50
[ (-13512 + -13089 + -12278)/3, 2939 ], # -0.30
[ (-5248 + -5187 + -5032)/3, 11125 ], # -0.20
[ (2099 + 645 + 708)/3, 14046 ], # -0.10
[ (7045 + 3536 + 3757)/3, 14557 ], # 0.00
[ (9729 + 6054 + 6543)/3, 14967 ], # +0.10
[ (11453 + 9238 + 10081)/3, 15393 ], # +0.20
[ (11572 + 13274 + 13839)/3, 15759 ], # +0.30
[ (12534 + 15192 + 16090)/3, 15925 ], # +0.50
[ (14013 + 16353 + 16508)/3, 16007 ], # +0.80
[ (14944 + 16565 + 16606)/3, 16033 ], # +1.00
]
).normalized(17000)
fwd_38_4_0 = Piecewise(
[
[ (-14020 + -14112 + -13935 + -13091)/4, -4701 ], # -1.00
[ (-13353 + -13363 + -13185 + -12381)/4, -3947 ], # -0.80
[ (-4982 + -4912 + -4870 + -4696)/4, 6398 ], # -0.50
[ (338 + -243 + -352 + -254)/4, 12205 ], # -0.30
[ (1469 + 303 + 107 + 510)/4, 13165 ], # -0.20
[ (2789 + 839 + 443 + 1089)/4, 13989 ], # -0.10
[ (4150 + 1560 + 653 + 1416)/4, 14727 ], # 0.00
[ (5562 + 2421 + 979 + 1899)/4, 15224 ], # +0.10
[ (7027 + 3336 + 1460 + 2518)/4, 15551 ], # +0.20
[ (8402 + 4357 + 2093 + 3293)/4, 15802 ], # +0.30
[ (10385 + 7210 + 5554 + 5998)/4, 16123 ], # +0.50
[ (11301 + 12558 + 13004 + 12508)/4, 16388 ], # +0.80
[ (11462 + 13429 + 13829 + 13370)/4, 16408 ], # +1.00
]
).normalized(17000)
fwd_38_4_0_5e10 = Piecewise(
[
[ (-16395 + -16418 + -16377 + -16134)/4, 346 ], # -1.00
[ (-16128 + -16140 + -16066 + -15431)/4, 843 ], # -0.80
[ (-15217 + -15075 + -14834 + -13958)/4, 2407 ], # -0.50
[ (-13556 + -13263 + -13009 + -12226)/4, 4772 ], # -0.30
[ (-5186 + -5134 + -5083 + -4922)/4, 12813 ], # -0.20
[ (1894 + 537 + -149 + 331)/4, 14556 ], # -0.10
[ (6845 + 3237 + 1699 + 2725)/4, 14936 ], # 0.00
[ (9591 + 5649 + 3599 + 5023)/4, 15250 ], # +0.10
[ (11255 + 8605 + 7601 + 8571)/4, 15647 ], # +0.20
[ (11268 + 12837 + 13307 + 13021)/4, 15944 ], # +0.30
[ (12388 + 14744 + 15209 + 15530)/4, 16104 ], # +0.50
[ (13871 + 16190 + 16345 + 16412)/4, 16178 ], # +0.80
[ (14774 + 16497 + 16563 + 16559)/4, 16197 ], # +1.00
]
).normalized(17000)
inv_39_2_0_2e10 = Piecewise(
[
[ -12902, 10759 ],
[ -12339, 11336 ],
[ -8581, 11274 ],
[ -4821, 10571 ],
[ -822, 9463 ],
[ 3117, 8265 ],
[ 4938, 7704 ],
[ 6441, 7221 ],
[ 7234, 6912 ],
[ 7844, 6662 ],
[ 8282, 6551 ],
[ 8674, 6443 ],
[ 9071, 6325 ],
[ 9479, 6191 ],
[ 10311, 5885 ],
[ 11153, 5541 ],
[ 12833, 4788 ],
[ 14097, 4071 ],
[ 14561, 3816 ],
]
).normalized(15000)
inv_39_2_0_5e10 = Piecewise(
[
[ -15691, 9609 ],
[ -15154, 9450 ],
[ -14498, 9327 ],
[ -14086, 9217 ],
[ -13501, 9113 ],
[ -12664, 8923 ],
[ -9677, 7937 ],
[ -4868, 5948 ],
[ 222, 3390 ],
[ 5223, 610 ],
[ 9175, -1732 ],
[ 11286, -2820 ],
[ 12505, -3439 ],
[ 13504, -3957 ],
[ 14679, -4588 ],
[ 15127, -4830 ],
[ 15667, -5110 ],
[ 16156, -5353 ],
[ 16350, -5450 ],
]
).normalized(17000)
inv_39_2_0_8e10 = Piecewise(
[
[ -16465, 6854 ],
[ -16318, 6905 ],
[ -16079, 6824 ],
[ -15789, 6623 ],
[ -15296, 6435 ],
[ -14593, 6212 ],
[ -14052, 5989 ],
[ -13259, 5563 ],
[ -8825, 3572 ],
[ -1149, -218 ],
[ 6004, -3851 ],
[ 10704, -6032 ],
[ 13131, -6986 ],
[ 14268, -7421 ],
[ 14894, -7683 ],
[ 15301, -7829 ],
[ 15839, -8028 ],
[ 16356, -8243 ],
[ 16507, -8292 ],
]
).normalized(17000)
inv_39_2_0_1e11 = Piecewise(
[
[ -16651, 5123 ],
[ -16567, 5111 ],
[ -16429, 5092 ],
[ -16312, 5120 ],
[ -16102, 5078 ],
[ -15572, 4837 ],
[ -15109, 4545 ],
[ -14393, 4298 ],
[ -13129, 3647 ],
[ -5324, -167 ],
[ 3762, -4392 ],
[ 10248, -7171 ],
[ 13522, -8341 ],
[ 14221, -8595 ],
[ 14851, -8807 ],
[ 15280, -8964 ],
[ 15864, -9130 ],
[ 16440, -9363 ],
[ 16585, -9409 ],
]
).normalized(17000)
inv_39_2_0_15e10 = Piecewise(
[
[ -16854, 1899 ],
[ -16811, 1926 ],
[ -16759, 1908 ],
[ -16723, 1910 ],
[ -16670, 1908 ],
[ -16569, 1907 ],
[ -16466, 1877 ],
[ -16269, 1775 ],
[ -15731, 1520 ],
[ -13797, 601 ],
[ -1756, -4314 ],
[ 9395, -7804 ],
[ 13461, -8670 ],
[ 14026, -8763 ],
[ 14766, -8878 ],
[ 15279, -9140 ],
[ 16084, -9300 ],
[ 16568, -9413 ],
[ 16672, -9386 ],
]
).normalized(17000)

File diff suppressed because it is too large


@@ -0,0 +1,89 @@
#!/usr/bin/env python3
import os
from natsort import natsorted
from extract_meas import extract_parameterized_meas, indented
from stacked_cores_52xx_db import DB
## CONSTANTS/CONFIGURATION
# list of sims to extract details for
PREFIXES = { "52", "53", "54", "55", "56", "57", "58", "59", "60", "61" }
def times_of_interest(sim_name: str) -> list:
# could be more intelligent, extracting e.g. the clock duration from the name
if sim_name.startswith("52-"):
return [2e-9, 4e-9, 8e-9]
if sim_name.startswith("53-"):
return [2e-9, 4e-9]
if sim_name.startswith("54-"):
return [2e-9, 4e-9, 8e-9]
if sim_name.startswith("55-"):
return [4e-9, 6e-9, 10e-9]
if sim_name.startswith("56-"):
return [4e-9, 6e-9]
if sim_name.startswith("57-"):
return [4e-9, 6e-9]
if sim_name.startswith("58-"):
return [4e-9, 6e-9]
if sim_name.startswith("59-buf-inner_input-"):
return [2e-9, 4e-9]
if sim_name.startswith("59-buf-edge_input-"):
return [4e-9, 6e-9]
if sim_name.startswith("60-"):
return [4e-9, 6e-9]
if sim_name.startswith("61-"):
return [4e-9, 6e-9]
## USER-FACING FUNCTIONS
def read_db(name_filter=lambda name: True, min_meas: int=0) -> dict:
return {
name: meas for (name, meas) in DB.items()
if name_filter(name) \
and meas.num_runs() >= min_meas
}
def update_db():
db = compute_db()
dump("stacked_cores_52xx_db.py", db)
## IMPLEMENTATION DETAILS
def compute_db():
here, _ = os.path.split(__file__)
toplevel_out = f"{here}/../../../../out/applications/stacked_cores"
stems = extract_stems(os.listdir(toplevel_out))
return {
s: extract_parameterized_meas(os.path.join(toplevel_out, s), times_of_interest(s))
for s in stems
}
def extract_stems(dirlist: list) -> list:
stems = set()
TERM = "-drive-"
for d in dirlist:
print(d)
header = d.split('-')[0]
if header not in PREFIXES: continue
if TERM not in d: continue
stem = d[:d.find(TERM) + len(TERM)]
stems.add(stem)
return stems
def dump(path: str, db: dict):
with open(path, "w") as f:
f.write("from extract_meas import MeasRow, ParameterizedMeas\n\n")
f.write("DB = {")
for k, v in natsorted(db.items()):
f.write(indented(f"\n{k!r}: {v},"))
f.write("\n}")
if __name__ == '__main__': update_db()

File diff suppressed because it is too large


@@ -0,0 +1,176 @@
from math import sqrt
import plotly.express as px
from pandas import DataFrame
import scipy.optimize as opt
unit_to_m = lambda u: -18000 + 36000 * u
sweep_1d = lambda points=101: [unit_to_m(x/(points-1)) for x in range(points)]
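a quick endpoint check of the sweep mapping (restated here so the snippet is self-contained): `unit_to_m` maps the unit interval onto the full ±18000 M range, so a 3-point sweep hits the two extremes and the midpoint.

```python
unit_to_m = lambda u: -18000 + 36000 * u
sweep_1d = lambda points=101: [unit_to_m(x/(points-1)) for x in range(points)]

# 3 points: u = 0, 0.5, 1 -> M = -18000, 0, 18000
assert sweep_1d(3) == [-18000.0, 0.0, 18000.0]
```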
def plot(name: str, x_name: str, y_series: list):
""" plot y(x), where y values are specified by `y_series` and x is inferred """
df = DataFrame(data={ x_name: sweep_1d(len(y_series)), "y": y_series })
fig = px.line(df, x=x_name, y="y", title=name)
fig.show()
def plot_slope(name: str, x_name: str, y_series: list):
slope = extract_slope(y_series)
plot(name, x_name, slope)
def extract_slope(y_series: list):
dist = 2 * 36000 / (len(y_series) - 1)
known = [ (next - prev)/dist for (prev, next) in zip(y_series[:-2], y_series[2:]) ]
return [known[0]] + known + [known[-1]]
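a sanity check on the central-difference slope above (restated standalone): the implicit x-axis spans -18000..18000, so a linear ramp across that range should have unit slope at every sample, with the endpoints padded so the output length matches the input.

```python
def extract_slope(y_series):
    # central difference over two grid steps of the implicit +/-18000 x-axis
    dist = 2 * 36000 / (len(y_series) - 1)
    known = [(nxt - prev) / dist for (prev, nxt) in zip(y_series[:-2], y_series[2:])]
    # repeat the edge values so len(output) == len(input)
    return [known[0]] + known + [known[-1]]

ramp = [-18000 + 36000 * x / 100 for x in range(101)]
slope = extract_slope(ramp)
```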
def eval_series(meas: 'ParameterizedMeas', points: list, extract_tx, y_idx: int = -1) -> list:
"""
extract a list of y-value floats from `meas`.
each x value is a tuple of desired M values at which to sample the curve.
e.g. points = [ (None, 1000.0, 2000.0, None) ] samples at M1=1000.0, M2=2000.0,
treating M0 and M3 as dependent values.
`y_idx` specifies which M value should be treated as the dependent value to be computed.
e.g. `y_idx=0` to compute M0.
`extract_tx` is a function mapping one run (list[list[float]] of M values)
to a measured point of the transfer function. e.g. [15000, -15000, 14000] for a 3-core OR gate.
"""
return [sample_all(meas, p, extract_tx)[y_idx] for p in points]
def sample_all(meas: 'ParameterizedMeas', at: tuple, extract_tx) -> tuple:
"""
computes the interpolated M values at the provided `at` coordinate;
effectively filling in whichever items in `at` are left as `None`
"""
runs = [extract_tx(r) for r in meas.runs()]
distances = [(distance_to_sq(m, at), m) for m in runs]
# interpolated = weighted_sum_of_neighbors_by_inv_distance(distances)
interpolated = interpolate_minl1(at, runs, distances)
print(at, interpolated)
return interpolated
def extract_52xx_tx(meas_rows: list) -> tuple:
"""
extracts a flat list of input/output M mappings from a 52xx run
"""
return (meas_rows[0].m[0], meas_rows[0].m[1], meas_rows[1].m[2], meas_rows[2].m[3])
def interpolate_minl1(at: tuple, runs: list, distances: list) -> tuple:
# let R = `runs`, A = `at`, D = `distances`, x be the weight of each run
# such that the result is R0 x0 + R1 x1 + ...
#
# solve for x
# subject to R0 x0 + R1 x1 + ... = A for the elements of A != None
# minimize D0 x0 + D1 x1 + ...
#
# relevant scipy docs:
# - <https://docs.scipy.org/doc/scipy/tutorial/optimize.html#trust-region-constrained-algorithm-method-trust-constr>
# - <https://docs.scipy.org/doc/scipy-0.18.1/reference/generated/scipy.optimize.minimize.html>
fixed_coords = [(i, a) for (i, a) in enumerate(at) if a is not None]
num_fixed_coords = len(fixed_coords)
num_runs = len(runs)
# create a matrix E, where E*x = A for Ai != None
eq_constraints = [[0]*num_runs for _ in range(num_fixed_coords)]
for run_idx, run in enumerate(runs):
for constraint_idx, (coord_idx, _a) in enumerate(fixed_coords):
eq_constraints[constraint_idx][run_idx] = run[coord_idx]
eq_rhs = [a for (i, a) in fixed_coords]
eq_constraint = opt.LinearConstraint(eq_constraints, eq_rhs, eq_rhs)
# constrain the sum of weights to be 1.0
weights_sum_to_1_constraint = opt.LinearConstraint([[1] * num_runs], [1], [1])
# constrain the weights to be positive
bounds = opt.Bounds([0]*num_runs, [float("Inf")]*num_runs)
def score(weights: list) -> float:
# function to minimize: D0 x0 + D1 x1 + ...
return sum(w*d[0] for w, d in zip(weights, distances))
# compute the weight of each run
init = [0]*num_runs
constraints = [eq_constraint, weights_sum_to_1_constraint]
try:
res = opt.minimize(score, init, method='trust-constr', constraints=constraints, bounds=bounds)
except ValueError as e:
print(f"failed to interpolate point {e}")
return [0] * len(at)
run_weights = res.x
# sum the weighted runs
return element_sum([weighted(run, weight) for run, weight in zip(runs, run_weights)])
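a toy illustration (with hypothetical run data) of the equality-constraint matrix that `interpolate_minl1` hands to the optimizer: one row per pinned coordinate, one column per run, so that the matrix applied to the weight vector reproduces the pinned values. no scipy needed to see the structure; here a feasible weighting is checked by hand.

```python
# three hypothetical runs of (M0, M1) pairs
runs = [(-18000.0, -17000.0), (0.0, 500.0), (18000.0, 16500.0)]
at = (9000.0, None)  # pin M0 = 9000; M1 is the dependent output

fixed_coords = [(i, a) for (i, a) in enumerate(at) if a is not None]
eq_constraints = [[0.0] * len(runs) for _ in fixed_coords]
for run_idx, run in enumerate(runs):
    for c_idx, (coord_idx, _a) in enumerate(fixed_coords):
        eq_constraints[c_idx][run_idx] = run[coord_idx]

# weights (0, 0.5, 0.5) satisfy the constraint: 0.5*0 + 0.5*18000 == 9000;
# the interpolated dependent value is then 0.5*500 + 0.5*16500 == 8500
weights = [0.0, 0.5, 0.5]
m0 = sum(w * r[0] for w, r in zip(weights, runs))
m1 = sum(w * r[1] for w, r in zip(weights, runs))
```

the optimizer's job is just to pick, among all such feasible weightings, the one minimizing the distance-weighted sum.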
def interpolate(meas: 'ParameterizedMeas', a0: float, a1: float) -> tuple:
"""
this interpolates a point among four neighboring points in 2d.
the implementation only supports 2d, but the technique is extendable to N dim.
"""
rows = [r.m for r in meas.all_rows()]
distances = [(distance_to(m, (a0, a1)), m) for m in rows]
# a0_below_dist, a0_below_val = min(d for d in distances if d[1][0] <= a0)
# a0_above_dist, a0_above_val = min(d for d in distances if d[1][0] >= a0)
# a1_below_dist, a1_below_val = min(d for d in distances if d[1][1] <= a1)
# a1_above_dist, a1_above_val = min(d for d in distances if d[1][1] >= a1)
a0_below = min((d for d in distances if d[1][0] <= a0), default=None)
a0_above = min((d for d in distances if d[1][0] >= a0), default=None)
a1_below = min((d for d in distances if d[1][1] <= a1), default=None)
a1_above = min((d for d in distances if d[1][1] >= a1), default=None)
neighbors = [a for a in [a0_below, a0_above, a1_below, a1_above] if a is not None]
return weighted_sum_of_neighbors_by_inv_distance(neighbors)
def weighted_sum_of_neighbors_by_inv_distance(neighbors: list) -> tuple:
"""
each neighbor is (distance, value).
return a weighted sum of these neighbors, where lower-distance neighbors are more strongly weighted.
"""
D = sum(a[0] for a in neighbors)
weight_n = lambda n: 1/max(n[0], 1e-3) # non-normalized weight for neighbor
W = sum(weight_n(n) for n in neighbors)
weighted_n = lambda n: weighted(n[1], weight_n(n)/W) # normalized weighted contribution for neighbor
return element_sum([weighted_n(n) for n in neighbors])
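a minimal worked instance of the inverse-distance weighting above (toy neighbor data): a neighbor at distance 1 counts three times as much as one at distance 3, so the result sits three-quarters of the way toward the closer value.

```python
# each neighbor is (distance, value-vector)
neighbors = [(1.0, [0.0, 10.0]), (3.0, [4.0, 20.0])]
weight_n = lambda n: 1 / max(n[0], 1e-3)          # clamp to avoid div-by-zero
W = sum(weight_n(n) for n in neighbors)           # normalization constant
result = [
    sum(n[1][i] * weight_n(n) / W for n in neighbors)
    for i in range(len(neighbors[0][1]))
]
# normalized weights are 0.75 and 0.25, giving [1.0, 12.5]
```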
def weighted_sum_of_neighbors(neighbors: list) -> tuple:
"""
each neighbor is (distance, value).
return a weighted sum of these neighbors, where lower-distance neighbors are more strongly weighted.
"""
D = sum(a[0] for a in neighbors)
weight_n = lambda n: D - n[0] # non-normalized weight for neighbor
W = sum(weight_n(n) for n in neighbors)
weighted_n = lambda n: weighted(n[1], weight_n(n)/W) # normalized weighted contribution for neighbor
return element_sum([weighted_n(n) for n in neighbors])
def distance_to(p0: tuple, p1: tuple) -> float:
"""
return the L2-norm distance from p0 to p1.
any coordinates set to `None` are ignored.
e.g. `distance_to((1, 2, 3), (None, 4, 5))` is the same as `distance_to((2, 3), (4, 5))`
"""
return sqrt(distance_to_sq(p0, p1))
def distance_to_sq(p0: tuple, p1: tuple) -> float:
return sum((x0-x1)*(x0-x1) for (x0, x1) in zip(p0, p1) if x0 is not None and x1 is not None)
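the docstring's claim about `None` coordinates can be checked directly (restated standalone): pairs containing a `None` drop out of the sum, so the 3-tuple with one masked axis reduces to the 2-tuple distance.

```python
from math import sqrt

def distance_to_sq(p0, p1):
    # coordinate pairs containing None are simply skipped
    return sum((x0 - x1) ** 2 for (x0, x1) in zip(p0, p1)
               if x0 is not None and x1 is not None)

assert distance_to_sq((1, 2, 3), (None, 4, 5)) == distance_to_sq((2, 3), (4, 5)) == 8
assert sqrt(distance_to_sq((0, 3), (4, 0))) == 5.0  # the 3-4-5 triangle
```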
def element_sum(lists: list) -> list:
"""
given a list[list[float]] where each inner list has identical length,
returns a list[float] by summing along each axis.
e.g. element_sum([[1, 2], [3, 4], [5, 6]]) gives `[1+3+5, 2+4+6]`
"""
elems = lists[0]
for l in lists[1:]:
for i, e in enumerate(l):
elems[i] += e
return elems
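note that `element_sum` above accumulates into (and returns) `lists[0]`, mutating its input; the same axis-wise sum can be written without mutation, as a sketch:

```python
def element_sum_pure(lists):
    # zip(*lists) transposes, so each `col` is one axis across all inner lists
    return [sum(col) for col in zip(*lists)]

assert element_sum_pure([[1, 2], [3, 4], [5, 6]]) == [9, 12]
```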
def weighted(l: list, scale: float) -> list:
"""
given list[float], returns a new list[float] where each element is multiplied by `scale`
"""
return [e*scale for e in l]

File diff suppressed because it is too large


@@ -14,7 +14,7 @@ use coremem::sim::spirv::{self, SpirvSim};
 use coremem::stim::{Fields, ModulatedVectorField, Pulse, RegionGated};
 use coremem::cross::vec::Vec3;
-type Mat = mat::FullyGenericMaterial<f32>;
+type Mat = mat::GenericMaterial<f32>;
 fn main() {
     coremem::init_logging();

@@ -37,9 +37,9 @@ wgpu = "0.12"
 # TODO: update to 0.13
 # wgpu = { version = "0.13", features = ["spirv", "vulkan-portability"] } # MIT or Apache 2.0
 # spirv-* is MIT or Apache 2.0
-spirv-builder = { git = "https://github.com/EmbarkStudios/rust-gpu", features = ["use-compiled-tools"] }
-spirv-std = { git = "https://github.com/EmbarkStudios/rust-gpu" }
-spirv-std-macros = { git = "https://github.com/EmbarkStudios/rust-gpu" }
+spirv-builder = { git = "https://github.com/Rust-GPU/rust-gpu", rev = "d78c301799e9d254aab3156a230c9a59efd94122", features = ["use-compiled-tools"] }
+spirv-std = { git = "https://github.com/Rust-GPU/rust-gpu", rev = "d78c301799e9d254aab3156a230c9a59efd94122" }
+spirv-std-macros = { git = "https://github.com/Rust-GPU/rust-gpu", rev = "d78c301799e9d254aab3156a230c9a59efd94122" }
 spirv_backend = { path = "../spirv_backend" }
 spirv_backend_runner = { path = "../spirv_backend_runner" }
 coremem_cross = { path = "../cross", features = ["iter", "fmt", "serde", "std"] }


@@ -1,12 +1,12 @@
 use coremem::Driver;
 use coremem::geom::Index;
-use coremem::mat::{Ferroxcube3R1MH, IsoConductorOr, FullyGenericMaterial};
+use coremem::mat::{Ferroxcube3R1MH, IsoConductorOr, GenericMaterial};
 use coremem::sim::spirv::{SpirvSim, WgpuBackend};
 use criterion::{BenchmarkId, criterion_group, criterion_main, Criterion};
 pub fn bench_step_spirv(c: &mut Criterion) {
-    type Mat = FullyGenericMaterial<f32>;
+    type Mat = GenericMaterial<f32>;
     for size in &[10, 20, 40, 80, 160] {
         let sim = SpirvSim::<f32, Mat, WgpuBackend>::new(Index::new(*size, *size, *size), 1e-5);
         c.bench_with_input(BenchmarkId::new("Driver::step_spirv", size), &sim, |b, sim| {


@@ -1,7 +1,7 @@
 use coremem::{self, Driver, AbstractSim};
 use coremem::sim::spirv::{SpirvSim, WgpuBackend};
 use coremem::sim::units::Frame;
-use coremem::cross::mat::FullyGenericMaterial;
+use coremem::cross::mat::GenericMaterial;
 use coremem::geom::Index;
 use std::time::{Instant, Duration};
@@ -25,16 +25,16 @@ fn measure_steps<S: AbstractSim + Clone + Default + Send + 'static>(name: &str,
 fn main() {
     coremem::init_logging();
     measure_steps("spirv/80", 1, Driver::new(
-        SpirvSim::<f32, FullyGenericMaterial<f32>, WgpuBackend>::new(Index::new(80, 80, 80), 1e-3)
+        SpirvSim::<f32, GenericMaterial<f32>, WgpuBackend>::new(Index::new(80, 80, 80), 1e-3)
     ));
     measure_steps("spirv/80 step(2)", 2, Driver::new(
-        SpirvSim::<f32, FullyGenericMaterial<f32>, WgpuBackend>::new(Index::new(80, 80, 80), 1e-3)
+        SpirvSim::<f32, GenericMaterial<f32>, WgpuBackend>::new(Index::new(80, 80, 80), 1e-3)
     ));
     measure_steps("spirv/80 step(10)", 10, Driver::new(
-        SpirvSim::<f32, FullyGenericMaterial<f32>, WgpuBackend>::new(Index::new(80, 80, 80), 1e-3)
+        SpirvSim::<f32, GenericMaterial<f32>, WgpuBackend>::new(Index::new(80, 80, 80), 1e-3)
     ));
     measure_steps("spirv/80 step(100)", 100, Driver::new(
-        SpirvSim::<f32, FullyGenericMaterial<f32>, WgpuBackend>::new(Index::new(80, 80, 80), 1e-3)
+        SpirvSim::<f32, GenericMaterial<f32>, WgpuBackend>::new(Index::new(80, 80, 80), 1e-3)
     ));
 }


@@ -22,6 +22,8 @@ pub struct Diagnostics {
     time_reading_device: Duration,
     /// time during which CPU was transferring data to GPU
     time_writing_device: Duration,
+    /// time during which the GPU was actively computing steps
+    time_stepping_device: Duration,
     start_time: Instant,
 }
@@ -48,6 +50,7 @@ impl Diagnostics {
             time_blocked_on_render: Default::default(),
             time_reading_device: Default::default(),
             time_writing_device: Default::default(),
+            time_stepping_device: Default::default(),
             start_time: Instant::now(),
         }
     }
@@ -78,9 +81,11 @@ impl Diagnostics {
             other_time,
         );
+        let device_step_time = self.time_stepping_device.as_secs_f64();
         let device_write_time = self.time_writing_device.as_secs_f64();
         let device_read_time = self.time_reading_device.as_secs_f64();
-        let device_line = format!("> gpu\twrite: {:.1}s, read: {:.1}s",
+        let device_line = format!("> gpu\tstep: {:.1}s, write: {:.1}s, read: {:.1}s",
+            device_step_time,
             device_write_time,
             device_read_time,
         );
@@ -166,6 +171,9 @@ impl SyncDiagnostics {
         ret
     }
+    pub fn record_step_device(&self, t: Duration) {
+        self.0.lock().unwrap().time_stepping_device += t;
+    }
     pub fn instrument_read_device<R, F: FnOnce() -> R>(&self, f: F) -> R {
         let (elapsed, ret) = Self::measure(f);
         self.0.lock().unwrap().time_reading_device += elapsed;
@@ -176,6 +184,5 @@ impl SyncDiagnostics {
         self.0.lock().unwrap().time_writing_device += elapsed;
         ret
     }
 }


@@ -14,7 +14,7 @@ use coremem_cross::vec::Vec3;
 /// it's a torus, but elongated around its axis to resemble a pill shape.
 ///
-/// ```
+/// ```notrust
 ///  _______
 /// /       \
 /// |       |


@@ -5,6 +5,7 @@ use coremem_cross::vec::Vec3;
 use rayon::prelude::*;
 use serde::{Serialize, Deserialize};
 use std::collections::BTreeSet;
+use std::ops::Deref;
 use std::sync::Arc;
 mod constructed;
@@ -171,6 +172,9 @@ impl<R1, R2> Intersection<R1, R2> {
     pub fn new2(r1: R1, r2: R2) -> Self {
         Self(r1, r2)
     }
+    pub fn new3<R3: Region>(r1: R1, r2: R2, r3: R3) -> Intersection<Self, R3> {
+        Intersection::new2(r1, r2).and(r3)
+    }
     pub fn region0_of_2(&self) -> &R1 {
         &self.0
     }
@@ -197,6 +201,13 @@ impl<R> Translate<R> {
     }
 }
+impl<R> Deref for Translate<R> {
+    type Target = R;
+    fn deref(&self) -> &Self::Target {
+        &self.inner
+    }
+}
 impl<R: Region> Region for Translate<R> {
     fn contains(&self, p: Meters) -> bool {
         self.inner.contains(p - self.shift)
@@ -356,6 +367,13 @@ impl<R> Rotate<R> {
     }
 }
+impl<R> Deref for Rotate<R> {
+    type Target = R;
+    fn deref(&self) -> &Self::Target {
+        &self.region
+    }
+}
 impl<R: Region> Region for Rotate<R> {
     fn contains(&self, p: Meters) -> bool {
         self.region.contains(Meters(self.rotate_into_region(p.0)))
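The added `Deref` impls let callers reach the wrapped region's own accessors through `Translate`/`Rotate` without manually unwrapping. A minimal sketch of that wrapper pattern (`Sphere` and the field names here are illustrative, not the crate's types):

```rust
use std::ops::Deref;

struct Sphere { radius: f64 }
impl Sphere {
    fn radius(&self) -> f64 { self.radius }
}

// wrapper that shifts a region, in the spirit of the crate's Translate<R>
struct Translate<R> { inner: R, shift: f64 }

impl<R> Deref for Translate<R> {
    type Target = R;
    fn deref(&self) -> &R { &self.inner }
}

fn main() {
    let t = Translate { inner: Sphere { radius: 2.0 }, shift: 5.0 };
    // deref coercion: Sphere::radius is callable directly on the wrapper
    assert_eq!(t.radius(), 2.0);
    assert_eq!(t.shift, 5.0);
}
```

The tradeoff of `Deref` on a non-pointer type is that inherent methods on the wrapper shadow same-named methods on the inner type, so it works best when the wrapper's own API is small.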


@@ -1,6 +1,6 @@
 use crate::diagnostics::SyncDiagnostics;
 use crate::geom::{Coord, Cube, Index, InvertedRegion, Region};
-use crate::cross::mat::{FullyGenericMaterial, Material};
+use crate::cross::mat::{GenericMaterial, Material};
 use crate::cross::real::Real;
 use crate::cross::step::SimMeta;
 use crate::cross::vec::{Vec3, Vec3u};
@@ -14,7 +14,7 @@ pub mod units;
 use spirv::{CpuBackend, SpirvSim};
-pub type GenericSim<R> = SpirvSim<R, FullyGenericMaterial<R>, CpuBackend>;
+pub type GenericSim<R> = SpirvSim<R, GenericMaterial<R>, CpuBackend>;
 /// Conceptually, one cell looks like this (in 2d):


@@ -2,6 +2,7 @@ use futures::FutureExt as _;
 use log::info;
 use std::borrow::Cow;
 use std::num::NonZeroU64;
+use std::time::Duration;
 use wgpu;
 use wgpu::util::DeviceExt as _;
@@ -282,17 +283,28 @@ impl<R: Copy, M: Send + Sync + HasEntryPoints<R>> SimBackend<R, M> for WgpuBacke
 });
-// let timestamp_period = queue.get_timestamp_period();
+let timestamp_period = queue.get_timestamp_period();
 let timestamp_readback_slice = timestamp_buffer.slice(..);
 let timestamp_readback_future = timestamp_readback_slice.map_async(wgpu::MapMode::Read).then(|_| async {
-    {
+    let (e_time, h_time) = {
         let mapped = timestamp_readback_slice.get_mapped_range();
         let timings: &[u64] = unsafe {
             from_bytes(mapped.as_ref())
         };
-        println!("timings: {:?}", timings);
-    }
+        let (mut e_time, mut h_time) = (0, 0);
+        for frame in timings.chunks(4) {
+            e_time += frame[1] - frame[0];
+            h_time += frame[3] - frame[2];
+        }
+        (
+            Duration::from_nanos((e_time as f64 * timestamp_period as f64) as u64),
+            Duration::from_nanos((h_time as f64 * timestamp_period as f64) as u64),
+        )
+    };
     timestamp_buffer.unmap();
+    diag.record_step_device(e_time + h_time);
 });
 // optimization note: it may be possible to use `WaitForSubmission`
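The readback future above converts raw timestamp ticks to wall time by multiplying tick deltas by `queue.get_timestamp_period()` (nanoseconds per tick), assuming each frame records four timestamps: E-step begin/end, then H-step begin/end. That arithmetic in isolation:

```rust
use std::time::Duration;

// convert a tick delta to wall time, as in the diff:
// nanos = ticks * period, where period is nanoseconds-per-tick from the queue
fn ticks_to_duration(ticks: u64, period_ns_per_tick: f32) -> Duration {
    Duration::from_nanos((ticks as f64 * period_ns_per_tick as f64) as u64)
}

// sum per-frame E and H step times from [e_begin, e_end, h_begin, h_end] frames
fn step_times(timings: &[u64], period: f32) -> (Duration, Duration) {
    let (mut e, mut h) = (0u64, 0u64);
    for frame in timings.chunks(4) {
        e += frame[1] - frame[0];
        h += frame[3] - frame[2];
    }
    (ticks_to_duration(e, period), ticks_to_duration(h, period))
}

fn main() {
    // two frames; a hypothetical period of 2ns per tick
    let timings = [100, 150, 150, 180, 200, 240, 240, 260];
    let (e, h) = step_times(&timings, 2.0);
    assert_eq!(e, Duration::from_nanos(180)); // (50 + 40) ticks * 2ns
    assert_eq!(h, Duration::from_nanos(100)); // (30 + 20) ticks * 2ns
}
```

Summing ticks first and converting once at the end avoids accumulating per-frame rounding from the nanosecond truncation.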


@@ -8,7 +8,7 @@ use crate::geom::Index;
 use crate::real::Real;
 use crate::sim::{AbstractSim, Fields, GenericSim};
 use crate::stim::{RenderedStimulus, Stimulus};
-use coremem_cross::mat::{FullyGenericMaterial, Material};
+use coremem_cross::mat::{GenericMaterial, Material};
 use coremem_cross::step::SimMeta;
 use coremem_cross::vec::Vec3;
@@ -41,7 +41,7 @@ pub trait SimBackend<R, M> {
 /// Wrapper around an inner state object which offloads stepping onto a spirv backend (e.g. GPU).
 #[derive(Default, Serialize, Deserialize)]
 // TODO: remove default R/M
-pub struct SpirvSim<R=f32, M=FullyGenericMaterial<R>, B=WgpuBackend>
+pub struct SpirvSim<R=f32, M=GenericMaterial<R>, B=WgpuBackend>
 where M: 'static
 {
     meta: SimMeta<R>,
@@ -78,7 +78,7 @@ impl<R: Clone, M: Clone, B: Default> Clone for SpirvSim<R, M, B> {
 impl<R, M, B> AbstractSim for SpirvSim<R, M, B>
 where
     R: Real,
-    M: Send + Sync + Material<R> + Clone + Into<FullyGenericMaterial<R>>,
+    M: Send + Sync + Material<R> + Clone + Into<GenericMaterial<R>>,
     B: Send + Sync + SimBackend<R, M>,
 {
     type Real = R;
@@ -204,7 +204,7 @@ impl<R, M, B> SpirvSim<R, M, B> {
 impl<R, M, B> SpirvSim<R, M, B>
 where
     R: Real,
-    M: Send + Sync + Material<R> + Clone + Into<FullyGenericMaterial<R>>,
+    M: Send + Sync + Material<R> + Clone + Into<GenericMaterial<R>>,
     B: Send + Sync + SimBackend<R, M>,
 {
     fn eval_stimulus<'a, S: Stimulus<R>>(&self, stim: &'a S)
@@ -366,7 +366,7 @@ mod test {
     #[test]
     fn accessors() {
         let size = Index::new(15, 17, 19);
-        let mut state = SpirvSim::<R32, FullyGenericMaterial<R32>, $backend>::new(size, 1e-6);
+        let mut state = SpirvSim::<R32, GenericMaterial<R32>, $backend>::new(size, 1e-6);
         assert_eq!(state.width(), 15);
         assert_eq!(state.height(), 17);
         assert_eq!(state.depth(), 19);
@@ -379,7 +379,7 @@ mod test {
     #[test]
     fn energy_conservation_over_time() {
-        let mut state = SpirvSim::<R32, FullyGenericMaterial<R32>, $backend>::new(
+        let mut state = SpirvSim::<R32, GenericMaterial<R32>, $backend>::new(
             Index((2001, 1, 1).into()), 1e-6
         );
         let stim = smooth_pulse_at(state.timestep(), state.feature_size(), Index::new(1000, 0, 0), 100);
@@ -393,7 +393,7 @@ mod test {
     #[test]
     fn sane_boundary_conditions() {
-        let mut state = SpirvSim::<R32, FullyGenericMaterial<R32>, $backend>::new(
+        let mut state = SpirvSim::<R32, GenericMaterial<R32>, $backend>::new(
             Index((21, 21, 21).into()), 1e-6
         );
         let stim = smooth_pulse_at(state.timestep(), state.feature_size(), Index::new(10, 10, 10), 40);
@@ -410,8 +410,8 @@ mod test {
     /// Fill the world with the provided material and a stimulus.
     /// Measure energy at the start, and then again after advancing many steps.
     /// Return these two measurements (energy(t=0), energy(t=~=1000))
-    fn conductor_test<M: Into<FullyGenericMaterial<R32>>>(mat: M) -> (f32, f32) {
-        let mut state = SpirvSim::<R32, FullyGenericMaterial<R32>, $backend>::new(
+    fn conductor_test<M: Into<GenericMaterial<R32>>>(mat: M) -> (f32, f32) {
+        let mut state = SpirvSim::<R32, GenericMaterial<R32>, $backend>::new(
             Index::new(201, 1, 1), 1e-6
         );
         state.fill_region(&WorldRegion, mat.into());
@@ -458,56 +458,56 @@ mod test {
     #[test]
     fn smoke_small() {
-        let mut state = SpirvSim::<R32, FullyGenericMaterial<R32>, $backend>::new(
+        let mut state = SpirvSim::<R32, GenericMaterial<R32>, $backend>::new(
             Index::new(8, 8, 8), 1e-3
         );
         state.step();
     }
     #[test]
     fn smoke_med_7bit() {
-        let mut state = SpirvSim::<R32, FullyGenericMaterial<R32>, $backend>::new(
+        let mut state = SpirvSim::<R32, GenericMaterial<R32>, $backend>::new(
             Index::new(124, 124, 124), 1e-3
         );
         state.step();
     }
     #[test]
     fn smoke_med128() {
-        let mut state = SpirvSim::<R32, FullyGenericMaterial<R32>, $backend>::new(
+        let mut state = SpirvSim::<R32, GenericMaterial<R32>, $backend>::new(
             Index::new(128, 128, 128), 1e-3
         );
         state.step();
     }
     #[test]
     fn smoke_med_23bit() {
-        let mut state = SpirvSim::<R32, FullyGenericMaterial<R32>, $backend>::new(
+        let mut state = SpirvSim::<R32, GenericMaterial<R32>, $backend>::new(
             Index::new(127, 256, 256), 1e-3
         );
         state.step();
     }
     #[test]
     fn smoke_med_0x800000_indexing() {
-        let mut state = SpirvSim::<R32, FullyGenericMaterial<R32>, $backend>::new(
+        let mut state = SpirvSim::<R32, GenericMaterial<R32>, $backend>::new(
             Index::new(128, 256, 256), 1e-3
         );
         state.step();
     }
     #[test]
     fn smoke_med_0x800000_address_space() {
-        let mut state = SpirvSim::<R32, FullyGenericMaterial<R32>, $backend>::new(
+        let mut state = SpirvSim::<R32, GenericMaterial<R32>, $backend>::new(
             Index::new(170, 256, 256), 1e-3
        );
         state.step();
     }
     #[test]
     fn smoke_large() {
-        let mut state = SpirvSim::<R32, FullyGenericMaterial<R32>, $backend>::new(
+        let mut state = SpirvSim::<R32, GenericMaterial<R32>, $backend>::new(
             Index::new(326, 252, 160), 1e-3
         );
         state.step();
     }
     #[test]
     fn smoke_not_multiple_of_4() {
-        let mut state = SpirvSim::<R32, FullyGenericMaterial<R32>, $backend>::new(
+        let mut state = SpirvSim::<R32, GenericMaterial<R32>, $backend>::new(
             Index::new(3, 2, 5), 1e-3
         );
         state.step();
@@ -519,7 +519,7 @@ mod test {
     // XXX This doesn't do anything, except make sure we don't crash!
     use rand::{Rng as _, SeedableRng as _};
     let mut rng = rand::rngs::StdRng::seed_from_u64(seed);
-    let mut dut_state = SpirvSim::<R32, FullyGenericMaterial<R32>, $backend>::new(size, 1e-3);
+    let mut dut_state = SpirvSim::<R32, GenericMaterial<R32>, $backend>::new(size, 1e-3);
     for z in 0..size.z() {
         for y in 0..size.y() {
@@ -559,32 +559,31 @@ mod test {
     ////////// APPLICATION TESTS //////////
     // these test that the simulation accurately models *specific* applications
-    #[test]
-    fn ray_propagation() {
-        let size = Index::new(1, 1, 1536);
-        let feat_size = 1e-3;
-        let mut sim = SpirvSim::<R32, FullyGenericMaterial<R32>, $backend>::new(size, feat_size);
-        let time_step = sim.timestep();
-        let wave = |z, t| {
-            let amp_e = R32::one();
-            let amp_h = R32::c_inv() * R32::mu0_inv();
-            // this describes a rightward-traveling wave that starts at z=512,
-            // with a wavelength of 512, and we only render one cycle of it.
-            let t_shift = t as f32 * time_step/feat_size * f32::c();
-            let idx = ((z as f32 - 512.0 - t_shift) / 256.0);
-            if idx < 0.0 || idx > 2.0 {
-                return (R32::zero(), R32::zero());
-            }
-            let phase = idx.cast::<R32>() * R32::pi();
+    /// returns the (E, H) value of the first cycle of a rightward-traveling ray at a given point in space/time.
+    /// E amplitude of the ray is assumed to be 1; H is scaled appropriately to ensure that.
+    fn ray(z_origin: i32, wavelength_cells: u32, sim: &SpirvSim<R32, GenericMaterial<R32>, $backend>, z: u32, frame: u32) -> (R32, R32) {
+        let time_step = sim.timestep().cast::<R32>();
+        let feat_size = sim.feature_size().cast::<R32>();
+        let amp_e = R32::one();
+        let amp_h = R32::c_inv() * R32::mu0_inv();
+        let t_shift = R32::from_primitive(frame) * time_step/feat_size * R32::c();
+        let idx = (z as f32 - z_origin as f32 - t_shift.cast::<f32>()) / (wavelength_cells as f32);
+        if idx < 0.0 || idx > 1.0 {
+            return (R32::zero(), R32::zero());
+        }
+        let phase = idx.cast::<R32>() * R32::two_pi();
         let a = phase.sin();
         (amp_e * a, amp_h * a)
-    };
+    }
+    /// inject one cycle of a rightward-traveling ray with the provided parameters into
+    /// the sim.
+    fn inject_ray(z_origin: i32, wavelength_cells: u32, sim: &mut SpirvSim<R32, GenericMaterial<R32>, $backend>) {
        // inject the wave at t=0
-        for z in 0..size.z() {
-            let (e, h) = wave(z, 0);
+        for z in 0..sim.size().z() {
+            let (e, h) = ray(z_origin, wavelength_cells, &sim, z, 0);
            // due to the staggered nature of stepping (step_e, then step_h),
            // populating the initial E *and* H fields through a Stimulus is
            // nontrivial. we just explicitly set the fields instead.
@@ -594,6 +593,16 @@ mod test {
                Vec3::default(),
            ));
        }
+    }
+    /// verify that an EM ray propagates through the vacuum as expected
+    #[test]
+    fn ray_propagation() {
+        let size = Index::new(1, 1, 1536);
+        let feat_size = 1e-3;
+        let mut sim = SpirvSim::<R32, GenericMaterial<R32>, $backend>::new(size, feat_size);
+        inject_ray(512, 512, &mut sim);
        // advance the simulation N steps.
        // at each step, the wave we injected should have been shifted by some known amount,
@@ -606,12 +615,51 @@ mod test {
            (f.e().x().cast::<f32>(), f.h().y().cast::<f32>())
        }).collect();
        for (z, (e, h)) in eh.iter().enumerate() {
-            let (exp_e, exp_h) = wave(z as u32, t);
+            let (exp_e, exp_h) = ray(512, 512, &sim, z as u32, t);
            assert_float_eq!(*e, exp_e.cast::<f32>(), abs <= 1e-2, "t={}, z={} ... {:?}", t, z, eh);
            assert_float_eq!(*h, exp_h.cast::<f32>(), abs <= 1e-4, "t={}, z={} ... {:?}", t, z, eh);
        }
    }
 }
+    /// reflect an EM ray off a conductor and make sure the result is as expected
+    #[test]
+    fn conductor_reflection() {
+        let size = Index::new(1, 1, 2048);
+        let feat_size = 1e-3;
+        let mut sim = SpirvSim::<R32, GenericMaterial<R32>, $backend>::new(size, feat_size);
+        inject_ray(512, 512, &mut sim);
+        sim.put_material(Index::new(0, 0, 1536), mat::IsomorphicConductor::new(1e9.cast()));
+        // 512 cells for the ray to propagate *toward* the conductor,
+        // 512 cells for it to reflect,
+        // 512 cells for it to propagate back from the conductor.
+        // the wave crosses 0.577 cells per step (Courant value), so roughly 2662 steps
+        // for the wave to return home.
+        for _t in 0..2662 {
+            sim.step();
+        }
+        // now the wave should have returned to the starting point
+        let eh: Vec<_> = (0..size.z()).into_iter().map(|z| {
+            let f = sim.fields_at_index(Index::new(0, 0, z));
+            (f.e().x().cast::<f32>(), f.h().y().cast::<f32>())
+        }).collect();
+        for (z, (e, h)) in eh.iter().enumerate() {
+            let (ray_e, ray_h) = ray(512, 512, &sim, z as u32, 0);
+            // a conductor forces E=0, hence the head of the return E wave is the
+            // negative of the head of the origin E wave.
+            // because of the opposite orientation, this actually negates again: the
+            // return E wave evaluates to the same as the origin E wave.
+            // however, the return H wave is only flipped once, because of the
+            // orientation.
+            let (exp_e, exp_h) = (ray_e, -ray_h);
+            assert_float_eq!(*e, exp_e.cast::<f32>(), abs <= 2e-2, "z={} ... {:?}", z, eh);
+            assert_float_eq!(*h, exp_h.cast::<f32>(), abs <= 2e-4, "z={} ... {:?}", z, eh);
+        }
+    }
 }
 }
 }
@@ -624,8 +672,8 @@ mod test {
     use super::*;
     use crate::stim::{NoopStimulus, RngStimulus};
     fn test_same_explicit<S: Stimulus<R32>>(
-        mut cpu_state: SpirvSim<R32, FullyGenericMaterial<R32>, CpuBackend>,
-        mut wgpu_state: SpirvSim<R32, FullyGenericMaterial<R32>, WgpuBackend>,
+        mut cpu_state: SpirvSim<R32, GenericMaterial<R32>, CpuBackend>,
+        mut wgpu_state: SpirvSim<R32, GenericMaterial<R32>, WgpuBackend>,
         stim: &S,
         step_iters: u64,
         steps_per_iter: u32
@@ -759,46 +807,12 @@ mod test {
     test_same_mb_ferromagnet(0x1234, 10, 5, Index::new(96, 16, 8));
 }
-// XXX these tests probably failed because they were allowing negative mu_r values
-#[test]
-fn mb_ferromagnet_diff_repro() {
-    let size = Index::new(3, 3, 3);
-    let seed = 0; // 0 => 28, 4 => 28
-    let steps = 1000;
-    let mat = mat::MBPgram::new(0e-4.to_r32(), 9.02e-3.to_r32(), 1e4.to_r32());
-    let mut ref_state = SpirvSim::<R32, FullyGenericMaterial<R32>, CpuBackend>::new(size, 1e-3);
-    let mut dut_state = SpirvSim::<R32, FullyGenericMaterial<R32>, WgpuBackend>::new(size, 1e-3);
-    ref_state.put_material(Index::new(1, 1, 1), mat.clone());
-    dut_state.put_material(Index::new(1, 1, 1), mat.clone());
-    let stim = RngStimulus::new(seed);
-    test_same_explicit(ref_state, dut_state, &stim, steps, 1);
-}
-#[test]
-fn mb_ferromagnet_diff_minimal_repro() {
-    let size = Index::new(3, 3, 1);
-    let seed = 0; // 0 => 28, 4 => 28
-    let steps = 1000;
-    let mat = mat::MBPgram::new(0e-4.to_r32(), 9.02e-3.to_r32(), 1e4.to_r32());
-    let mut ref_state = SpirvSim::<R32, FullyGenericMaterial<R32>, CpuBackend>::new(size, 1e-3);
-    let mut dut_state = SpirvSim::<R32, FullyGenericMaterial<R32>, WgpuBackend>::new(size, 1e-3);
-    ref_state.put_material(Index::new(1, 1, 0), mat.clone());
-    dut_state.put_material(Index::new(1, 1, 0), mat.clone());
-    let stim = RngStimulus::new(seed);
-    test_same_explicit(ref_state, dut_state, &stim, steps, 1);
-}
 #[test]
 fn step_multiple_with_stim() {
     let size = Index::new(4, 12, 8);
-    let mut cpu_state: SpirvSim<R32, FullyGenericMaterial<R32>, CpuBackend> =
+    let mut cpu_state: SpirvSim<R32, GenericMaterial<R32>, CpuBackend> =
         SpirvSim::new(size, 1e-3);
-    let mut wgpu_state: SpirvSim<R32, FullyGenericMaterial<R32>, WgpuBackend> =
+    let mut wgpu_state: SpirvSim<R32, GenericMaterial<R32>, WgpuBackend> =
         SpirvSim::new(size, 1e-3);
     let stim = stim::Fields::new_e(Vec3::new(1.0e15, 2.0e15, -3.0e15).cast::<R32>());
     for _ in 0..5 {
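The 2662-step figure in `conductor_reflection` follows from the 3-D FDTD Courant limit: the timestep is chosen so the wavefront advances 1/√3 ≈ 0.577 cells per step, and the round trip covers 3 × 512 cells. The arithmetic, checked in isolation:

```rust
// how many simulation steps a wavefront needs to cross `cells` grid cells,
// given the 3-D FDTD Courant limit: c*dt = feat_size / sqrt(3),
// i.e. the wave advances 1/sqrt(3) (~0.577) cells per step
fn steps_for_cells(cells: f64) -> u64 {
    let cells_per_step = 1.0 / 3.0f64.sqrt();
    (cells / cells_per_step).ceil() as u64
}

fn main() {
    // 512 toward the conductor + 512 through the reflection + 512 back home
    let steps = steps_for_cells(3.0 * 512.0);
    assert_eq!(steps, 2661); // the test rounds this up slightly, to 2662
}
```

Since the test only needs the wave back "roughly" at its starting point within the assertion tolerances, rounding the step count up by one is harmless.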


@@ -11,6 +11,17 @@ pub trait VectorField<R> {
     fn at(&self, feat_size: R, loc: Index) -> Fields<R>;
 }
+// a vec of VectorFields is the sum of those fields
+impl<R: Real, V: VectorField<R>> VectorField<R> for Vec<V> {
+    fn at(&self, feat_size: R, loc: Index) -> Fields<R> {
+        let mut acc = Fields::default();
+        for v in self {
+            acc += v.at(feat_size, loc);
+        }
+        acc
+    }
+}
 // uniform vector field
 impl<R: Real> VectorField<R> for Fields<R> {
     fn at(&self, _feat_size: R, _loc: Index) -> Fields<R> {
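The new `Vec<V>` impl makes a collection of vector fields behave as their pointwise sum. The same idea in a reduced, self-contained form (the trait here is a scalar stand-in for the crate's `VectorField`):

```rust
// simplified stand-in for VectorField<R>: a field sampled at a 1-D location
trait Field {
    fn at(&self, z: u32) -> f64;
}

// a uniform field, like the crate's `impl VectorField for Fields`
struct Uniform(f64);
impl Field for Uniform {
    fn at(&self, _z: u32) -> f64 { self.0 }
}

// a Vec of fields evaluates to the sum of those fields
impl<V: Field> Field for Vec<V> {
    fn at(&self, z: u32) -> f64 {
        self.iter().map(|v| v.at(z)).sum()
    }
}

fn main() {
    let fields = vec![Uniform(1.5), Uniform(2.0), Uniform(-0.5)];
    assert_eq!(fields.at(7), 3.0);
}
```

Implementing the trait on the container itself means any code written against a single field (here, `Field`) composes with any number of fields for free.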


@@ -136,7 +136,7 @@ mod test {
 let mut pool: JobPool<u32, u32> = JobPool::new(0);
 pool.spawn_workers(2, move |x| {
     // wait until caller unlocks us
-    let _ = worker_mutex.lock().unwrap();
+    let _lock = worker_mutex.lock().unwrap();
     x*2
 });
 pool.send(1);


@@ -2,6 +2,13 @@ use crate::compound::peano::{P0, Peano, PNext};
 use crate::compound::list::{self, Indexable, IntoList, List};
 // TODO: we can probably simplify a lot of this by using the newer List traits
+// for example:
+// - FoldOp<CallOn<P0>, V> calls the user function on the next item fed.
+// - FoldOp<CallOn<PNext<P>>, V> returns CallOn<P> as the next state.
+// - therefore we have a way to invoke a function on an arbitrary index for any list which is : Fold<CallOn<P>>
+// - use DiscrDispatch to instantiate the right CallOn<P>
+//
+// doing this well will require benchmarking before/after
 #[cfg(feature = "serde")]
 use serde::{Serialize, Deserialize};
@@ -48,6 +55,11 @@ impl<P: Peano> Discr<P> {
     }
 }
+/// given some Discr<P>, which is a runtime integer < P, invoke the handler H as H<N> where N is
+/// the Peano number equivalent to the runtime integer encoded by the discriminant.
+///
+/// this is just how we dispatch a bounded discriminant while ensuring the receiver can handle
+/// every value we might call it with.
 pub trait DiscrDispatch<P: Peano> {
     fn dispatch<H: DiscrHandler<P, Output>, Output>(&self, h: H) -> Output;
 }
@@ -145,6 +157,9 @@
     }
 }
+/// discriminated enum. D encodes the discriminant, while L encodes the data. if D is the unit
+/// tuple, the discriminant is assumed to be encoded in the data itself (see
+/// `InternallyDiscriminated`)
 #[cfg_attr(feature = "serde", derive(Deserialize, Serialize))]
 #[cfg_attr(feature = "fmt", derive(Debug))]
 #[derive(Copy, Clone, Default, PartialEq)]
@@ -180,6 +195,7 @@ pub trait EnumRequirements {
     fn encode_discr(&mut self, d: Discr<Self::NumVariants>);
 }
+// externally discriminated
 impl<D, L> EnumRequirements for Enum<(D,), L>
 where
     D: DiscriminantCodable<<L as list::Meta>::Length>,
@@ -194,6 +210,7 @@
     }
 }
+// internally discriminated
 impl<L> EnumRequirements for Enum<(), L>
 where
     L: list::Meta + Indexable<P0>,
@@ -233,6 +250,7 @@
     self.decode_discr().dispatch(DispatchIndexable::new(&mut self.1, f))
 }
+/// assign the enum to the variant `P` with value `value`.
 pub fn set<P>(&mut self, value: L::Element)
 where
     P: Peano,
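The `Discr`/`DiscrDispatch` machinery hinges on Peano-encoded type-level integers: a runtime discriminant is statically bounded by a type-level count. A much-reduced sketch of that encoding (names are illustrative; the real crate additionally dispatches a handler per variant):

```rust
use std::marker::PhantomData;

// type-level Peano numbers: P0 is zero, PNext<P> is P + 1
struct P0;
struct PNext<P>(PhantomData<P>);

trait Peano {
    const VALUE: u32;
}
impl Peano for P0 {
    const VALUE: u32 = 0;
}
impl<P: Peano> Peano for PNext<P> {
    const VALUE: u32 = P::VALUE + 1;
}

// a discriminant bounded by P: the runtime value is always < P::VALUE
struct Discr<P: Peano> {
    value: u32,
    _bound: PhantomData<P>,
}

impl<P: Peano> Discr<P> {
    fn new(value: u32) -> Option<Self> {
        (value < P::VALUE).then(|| Discr { value, _bound: PhantomData })
    }
    fn get(&self) -> u32 {
        self.value
    }
}

type P3 = PNext<PNext<PNext<P0>>>;

fn main() {
    assert_eq!(P3::VALUE, 3);
    assert_eq!(Discr::<P3>::new(2).unwrap().get(), 2);
    assert!(Discr::<P3>::new(3).is_none()); // out of bounds for P3
}
```

Because the bound lives in the type, code that receives a `Discr<P3>` knows at compile time it only has to handle three variants, which is what makes the exhaustive-dispatch trait possible.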


@@ -49,7 +49,7 @@ impl IntoList for () {
 }
 /// expands to the type name for a list with the provided types
-/// ```
+/// ```ignore
 /// list_for!(E0, E1, E2, T) => Node<E0, Node<E1, Node<E2, T>>>>
 /// ```
 macro_rules! list_for {
@@ -58,7 +58,7 @@ macro_rules! list_for {
 }
 /// given N idents, return P`N`.
-/// ```
+/// ```ignore
 /// peano_for!(P0 a, b, c) => P3
 /// ```
 macro_rules! peano_for {
@@ -67,7 +67,7 @@ macro_rules! peano_for {
 }
 /// expands to the last item in the sequence
-/// ```
+/// ```ignore
 /// last!(a, bc, d) => d
 /// ```
 macro_rules! last {
@@ -77,7 +77,7 @@ macro_rules! last {
 /// transforms a list of idents into a `self.tail.[...].tail.head` pattern
 /// of the same length.
-/// ```
+/// ```ignore
 /// member_index!(self tail head a, b, c, d, e) => self.tail.tail.tail.head
 /// ```
 macro_rules! member_index {
@@ -98,7 +98,7 @@ macro_rules! member_index {
 }
 /// implements the Indexable trait for the last element provided of any prefix list.
-/// ```
+/// ```ignore
 /// impl_indexable!(E0, E1, E2)
 /// => impl<E0, E1, E2, T> Indexable<P2> for Node<E0, Node<E1, Node<E1, T>>> { ... }
 /// ```
@@ -123,7 +123,7 @@ macro_rules! impl_indexable {
 }
 /// implements the IntoList trait for the tuple of the provided elements.
-/// ```
+/// ```ignore
 /// impl_into_list!(E0, E1, E2)
 /// => impl<E0, E1, E2> IntoList for (E0, E1, E2) { ... }
 /// ```


@@ -1,4 +1,5 @@
 use core::convert::{AsMut, AsRef};
+#[cfg(feature = "iter")]
 use core::iter::Zip;
 use core::ops::{Index, IndexMut};
@@ -123,7 +124,10 @@ impl<T: IntoIterator> IntoIterator for DimSlice<T> {
 }
 pub struct DimIter {
+    // fields are unused if `iter` feature is disabled
+    #[allow(unused)]
     idx: Vec3u,
+    #[allow(unused)]
     dim: Vec3u,
 }


@@ -1,4 +1,5 @@
 use core::convert::{AsMut, AsRef};
+#[cfg(feature = "iter")]
 use core::iter::Zip;
 use core::ops::{Index, IndexMut};
@@ -81,7 +82,10 @@ impl<T: IntoIterator> IntoIterator for OffsetDimSlice<T> {
 }
 pub struct OffsetDimIter {
+    // fields are unused if `iter` feature is disabled
+    #[allow(unused)]
     offset: Vec3u,
+    #[allow(unused)]
     inner: DimIter,
 }


@@ -65,12 +65,12 @@ impl<P: Peano, T: Into<I>, I> Visitor<P, T, I> for IntoDispatcher {
     }
 }
-impl<R, M0, M1> Into<FullyGenericMaterial<R>> for DiscrMat2<M0, M1>
+impl<R, M0, M1> Into<GenericMaterial<R>> for DiscrMat2<M0, M1>
 where
-    M0: DiscriminantCodable<P2> + Into<FullyGenericMaterial<R>> + Copy,
-    M1: Into<FullyGenericMaterial<R>> + Copy,
+    M0: DiscriminantCodable<P2> + Into<GenericMaterial<R>> + Copy,
+    M1: Into<GenericMaterial<R>> + Copy,
 {
-    fn into(self) -> FullyGenericMaterial<R> {
+    fn into(self) -> GenericMaterial<R> {
         self.0.dispatch(IntoDispatcher)
     }
 }
@@ -88,13 +88,13 @@ where
     }
 }
-impl<R, M0, M1, M2> Into<FullyGenericMaterial<R>> for DiscrMat3<M0, M1, M2>
+impl<R, M0, M1, M2> Into<GenericMaterial<R>> for DiscrMat3<M0, M1, M2>
 where
-    M0: DiscriminantCodable<P3> + Into<FullyGenericMaterial<R>> + Copy,
-    M1: Into<FullyGenericMaterial<R>> + Copy,
-    M2: Into<FullyGenericMaterial<R>> + Copy,
+    M0: DiscriminantCodable<P3> + Into<GenericMaterial<R>> + Copy,
+    M1: Into<GenericMaterial<R>> + Copy,
+    M2: Into<GenericMaterial<R>> + Copy,
 {
-    fn into(self) -> FullyGenericMaterial<R> {
+    fn into(self) -> GenericMaterial<R> {
         self.0.dispatch(IntoDispatcher)
     }
 }
@@ -195,38 +195,38 @@ impl<R: Real> From<Vacuum> for GenericMagnetic<R> {
 /// "Fully Generic" in that one can set both the conductivity,
 /// and set any of the well-known magnetic materials, simultaneously.
-pub type FullyGenericMaterial<R> = DualMaterial<
+pub type GenericMaterial<R> = DualMaterial<
     AnisomorphicConductor<R>,
     GenericMagnetic<R>,
 >;
-impl<R: Real> From<AnisomorphicConductor<R>> for FullyGenericMaterial<R> {
+impl<R: Real> From<AnisomorphicConductor<R>> for GenericMaterial<R> {
     fn from(mat: AnisomorphicConductor<R>) -> Self {
         Self::new(mat, Default::default())
     }
 }
-impl<R: Real> From<MBPgram<R>> for FullyGenericMaterial<R> {
+impl<R: Real> From<MBPgram<R>> for GenericMaterial<R> {
     fn from(mat: MBPgram<R>) -> Self {
         Self::new(Default::default(), mat.into())
     }
 }
-impl<R: Real> From<MHPgram<R>> for FullyGenericMaterial<R> {
+impl<R: Real> From<MHPgram<R>> for GenericMaterial<R> {
     fn from(mat: MHPgram<R>) -> Self {
         Self::new(Default::default(), mat.into())
     }
 }
-impl<R: Real> From<Vacuum> for FullyGenericMaterial<R> {
+impl<R: Real> From<Vacuum> for GenericMaterial<R> {
     fn from(mat: Vacuum) -> Self {
         Self::new(Default::default(), mat.into())
     }
 }
-impl<R: Real> From<IsomorphicConductor<R>> for FullyGenericMaterial<R> {
+impl<R: Real> From<IsomorphicConductor<R>> for GenericMaterial<R> {
     fn from(mat: IsomorphicConductor<R>) -> Self {
         let mat: AnisomorphicConductor<R> = mat.into();
         mat.into()
     }
 }
-impl<R: Real> From<Ferroxcube3R1MH> for FullyGenericMaterial<R> {
+impl<R: Real> From<Ferroxcube3R1MH> for GenericMaterial<R> {
     fn from(mat: Ferroxcube3R1MH) -> Self {
         let mat: MHPgram<R> = mat.into();
         mat.into()
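The `IsomorphicConductor` and `Ferroxcube3R1MH` impls show the pattern the README change documents for compound materials: convert to an intermediate material first, then reuse that material's existing `From` impl. A reduced sketch of the chained conversions (types simplified):

```rust
// simplified stand-ins: an isotropic conductor is one conductivity,
// an anisotropic one stores a value per axis
#[derive(Debug, PartialEq)]
struct Iso(f64);
#[derive(Debug, PartialEq)]
struct Aniso([f64; 3]);
#[derive(Debug, PartialEq)]
struct Generic {
    conductivity: [f64; 3],
}

impl From<Iso> for Aniso {
    fn from(m: Iso) -> Self {
        Aniso([m.0; 3])
    }
}
impl From<Aniso> for Generic {
    fn from(m: Aniso) -> Self {
        Generic { conductivity: m.0 }
    }
}
// the two-step pattern: lower Iso to Aniso, then reuse Aniso's conversion
impl From<Iso> for Generic {
    fn from(m: Iso) -> Self {
        let m: Aniso = m.into();
        m.into()
    }
}

fn main() {
    let g: Generic = Iso(2.5).into();
    assert_eq!(g.conductivity, [2.5, 2.5, 2.5]);
}
```

Each specific material only needs a conversion to its nearest general form; the final `From` impl composes them rather than duplicating the lowering logic.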


@@ -10,11 +10,11 @@ use serde::{Serialize, Deserialize};
 /// M(B) parallelogram
 ///
 ///```ignore
-///     ____________
+///     ____________ (P1)
 ///    /           /
 ///   /     .     /
 ///  /           /
-/// /___________/
+/// /___________/ (P0)
 /// ```
 ///
 /// The `.` depicts (0, 0). X axis is B; y axis is M.
@@ -26,15 +26,19 @@ use serde::{Serialize, Deserialize};
 #[derive(Copy, Clone, Default, PartialEq)]
 pub struct MBPgram<R> {
     /// X coordinate at which the upward slope starts
+    /// X coordinate for point P0 in the diagram above.
     pub b_start: R,
     /// X coordinate at which the upward slope ends
+    /// X coordinate for point P1 in the diagram above.
     pub b_end: R,
     /// Vertical range of the graph
+    /// Y coordinate for point P1, negative Y coordinate for point P0, in the diagram above.
     pub max_m: R,
 }
-/// f(x0) = -ymax
-/// f(x1) = ymax
+/// evaluate `f(x)` at the provided `x`, where `f` is defined by these constraints:
+/// 1. f(x0) = -ymax
+/// 2. f(x1) = ymax
 fn eval_bounded_line<R: Real>(x: R, x0: R, x1: R, ymax: R) -> R {
     let x = x - x0;
     let x1 = x1 - x0;
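The two constraints in the rewritten doc comment pin down a line through `(x0, -ymax)` and `(x1, ymax)`. The hunk truncates before the function body ends, so the rest is an assumption: given the parallelogram shape, the value is presumably clamped to the `[-ymax, ymax]` band outside the sloped region. A plain-`f32` sketch under that assumption:

```rust
/// Hypothetical sketch of `eval_bounded_line`: a line satisfying
/// f(x0) = -ymax and f(x1) = ymax, clamped to [-ymax, ymax].
/// Assumes x1 > x0 (a non-degenerate sloped region).
fn eval_bounded_line(x: f32, x0: f32, x1: f32, ymax: f32) -> f32 {
    let x = x - x0;        // shift so the slope starts at 0
    let x1 = x1 - x0;      // width of the sloped region
    let y = ymax * (2.0 * x / x1 - 1.0); // -ymax at x=0, +ymax at x=x1
    y.clamp(-ymax, ymax)   // flat outside the sloped region (assumed)
}

fn main() {
    assert_eq!(eval_bounded_line(1.0, 1.0, 3.0, 5.0), -5.0); // f(x0) = -ymax
    assert_eq!(eval_bounded_line(3.0, 1.0, 3.0, 5.0), 5.0);  // f(x1) = ymax
    assert_eq!(eval_bounded_line(2.0, 1.0, 3.0, 5.0), 0.0);  // midpoint crosses 0
    assert_eq!(eval_bounded_line(10.0, 1.0, 3.0, 5.0), 5.0); // clamped past x1
}
```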

View File

@@ -5,7 +5,7 @@ mod compound;
 mod conductor;
 mod mb_pgram;
 mod mh_pgram;
-pub use compound::{FullyGenericMaterial, IsoConductorOr};
+pub use compound::{GenericMaterial, IsoConductorOr};
 pub use conductor::{AnisomorphicConductor, IsomorphicConductor};
 pub use mb_pgram::MBPgram;
 pub use mh_pgram::{Ferroxcube3R1MH, MHPgram};

View File

@@ -7,5 +7,5 @@ edition = "2021"
 crate-type = ["dylib", "lib"]
 [dependencies]
-spirv-std = { git = "https://github.com/EmbarkStudios/rust-gpu", features = ["glam"] } # MIT or Apache 2.0
+spirv-std = { git = "https://github.com/Rust-GPU/rust-gpu", rev = "d78c301799e9d254aab3156a230c9a59efd94122", features = ["glam"] } # MIT or Apache 2.0
 coremem_cross = { path = "../cross" }

View File

@@ -1,25 +1,23 @@
 #![cfg_attr(
     target_arch = "spirv",
-    feature(register_attr),
-    register_attr(spirv),
     no_std
 )]
 #![feature(const_fn_floating_point_arithmetic)]
 extern crate spirv_std;
-use spirv_std::{glam, RuntimeArray};
-#[cfg(not(target_arch = "spirv"))]
-use spirv_std::macros::spirv;
+use spirv_std::{glam, RuntimeArray, spirv};
 mod adapt;
 mod support;
-use coremem_cross::mat::{Ferroxcube3R1MH, FullyGenericMaterial, IsoConductorOr};
+use coremem_cross::mat::{Ferroxcube3R1MH, GenericMaterial, IsoConductorOr};
 use coremem_cross::real::R32;
 use coremem_cross::step::SimMeta;
 use coremem_cross::vec::{Vec3, Vec3u};
+use core::stringify;
 type Iso3R1<R> = IsoConductorOr<R, Ferroxcube3R1MH>;
 fn glam_vec_to_internal(v: glam::UVec3) -> Vec3u {
@@ -80,15 +78,15 @@ macro_rules! steps {
     };
 }
-steps!(f32, FullyGenericMaterial<f32>, step_h_generic_material_f32, step_e_generic_material_f32);
+steps!(f32, GenericMaterial<f32>, step_h_generic_material_f32, step_e_generic_material_f32);
 steps!(f32, Iso3R1<f32>, step_h_iso_3r1_f32, step_e_iso_3r1_f32);
-steps!(R32, FullyGenericMaterial<R32>, step_h_generic_material_r32, step_e_generic_material_r32);
+steps!(R32, GenericMaterial<R32>, step_h_generic_material_r32, step_e_generic_material_r32);
 steps!(R32, Iso3R1<R32>, step_h_iso_3r1_r32, step_e_iso_3r1_r32);
 // these should work, but require OpCapability Float64
 // we disable them for compatibility concerns: use the Cpu if you need f64 or temporarily uncomment
 // this and add the capability to the WgpuBackend driver.
-// steps!(f64, FullyGenericMaterial<f64>, step_h_generic_material_f64, step_e_generic_material_f64);
+// steps!(f64, GenericMaterial<f64>, step_h_generic_material_f64, step_e_generic_material_f64);
 // steps!(f64, Iso3R1<f64>, step_h_iso_3r1_f64, step_e_iso_3r1_f64);
-// steps!(R64, FullyGenericMaterial<R64>, step_h_generic_material_r64, step_e_generic_material_r64);
+// steps!(R64, GenericMaterial<R64>, step_h_generic_material_r64, step_e_generic_material_r64);
 // steps!(R64, Iso3R1<R64>, step_h_iso_3r1_r64, step_e_iso_3r1_r64);

View File

@@ -1,3 +1,4 @@
+use core::debug_assert;
 use core::ops::{Index, IndexMut};
 use spirv_std::RuntimeArray;

View File

@@ -5,7 +5,7 @@ authors = ["Colin <colin@uninsane.org>"]
 edition = "2021"
 [dependencies]
-spirv-builder = { git = "https://github.com/EmbarkStudios/rust-gpu", features = ["use-compiled-tools"] }
+spirv-builder = { git = "https://github.com/Rust-GPU/rust-gpu", rev = "d78c301799e9d254aab3156a230c9a59efd94122", features = ["use-compiled-tools"] }
 # these deps are to satisfy internal rustc stuff, since spirv-builder links into rustc internals (HACK).
 # this needs to match the rustc lock file -- not just its Cargo.toml -- so we pin down to the patch level
@@ -14,13 +14,13 @@ spirv-builder = { git = "https://github.com/EmbarkStudios/rust-gpu", features =
 # these are very likely to break during rustc updates, but they should be noisy.
 # just keep filling things in here based on rustc's Cargo.lock until it's satisfied.
 # then run `cargo update` inside `nix develop` to sync our own lock file, and `nix build` to test.
-cc = "=1.0.73"
-cfg-if = "=0.1.10"
-compiler_builtins = "=0.1.79"
-dlmalloc = "=0.2.3"
-fortanix-sgx-abi = "=0.5.0"
-getopts = "=0.2.21"
-hashbrown = "=0.12.3"
-hermit-abi = "=0.2.0"
-libc = "=0.2.131"
-unicode-width = "=0.1.8"
+# cc = "=1.0.73"
+# cfg-if = "=0.1.10"
+# compiler_builtins = "=0.1.79"
+# dlmalloc = "=0.2.3"
+# fortanix-sgx-abi = "=0.5.0"
+# getopts = "=0.2.21"
+# hashbrown = "=0.12.3"
+# hermit-abi = "=0.2.0"
+# libc = "=0.2.131"
+# unicode-width = "=0.1.8"

flake.lock generated
View File

@@ -1,27 +1,15 @@
 {
   "nodes": {
     "flake-utils": {
-      "locked": {
-        "lastModified": 1659877975,
-        "narHash": "sha256-zllb8aq3YO3h8B/U0/J1WBgAL8EX5yWf5pMj3G0NAmc=",
-        "owner": "numtide",
-        "repo": "flake-utils",
-        "rev": "c0e246b9b83f637f4681389ecabcb2681b4f3af0",
-        "type": "github"
+      "inputs": {
+        "systems": "systems"
       },
-      "original": {
-        "owner": "numtide",
-        "repo": "flake-utils",
-        "type": "github"
-      }
-    },
-    "flake-utils_2": {
       "locked": {
-        "lastModified": 1656928814,
-        "narHash": "sha256-RIFfgBuKz6Hp89yRr7+NR5tzIAbn52h8vT6vXkYjZoM=",
+        "lastModified": 1731533236,
+        "narHash": "sha256-l0KFg5HjrsfsO/JpG+r7fRrqm12kzFHyUHqHCVpMMbI=",
         "owner": "numtide",
         "repo": "flake-utils",
-        "rev": "7e2a3b3dfd9af950a856d66b0a7d01e3c18aa249",
+        "rev": "11707dc2f618dd54ca8739b309ec4fc024de578b",
         "type": "github"
       },
       "original": {
@@ -32,26 +20,26 @@
     },
     "nixpkgs": {
       "locked": {
-        "lastModified": 1664029467,
-        "narHash": "sha256-ir7JbsLp2mqseCs3qI+Z/pkt+Gh+GfANbYcI5I+Gvnk=",
+        "lastModified": 1720535198,
+        "narHash": "sha256-zwVvxrdIzralnSbcpghA92tWu2DV2lwv89xZc8MTrbg=",
         "owner": "NixOS",
         "repo": "nixpkgs",
-        "rev": "893b6b9f6c4ed0c7efdb84bd300a499a2da9fa51",
+        "rev": "205fd4226592cc83fd4c0885a3e4c9c400efabb5",
         "type": "github"
       },
       "original": {
         "id": "nixpkgs",
-        "ref": "nixos-22.05",
+        "ref": "nixos-23.11",
         "type": "indirect"
       }
     },
     "nixpkgs_2": {
       "locked": {
-        "lastModified": 1659102345,
-        "narHash": "sha256-Vbzlz254EMZvn28BhpN8JOi5EuKqnHZ3ujFYgFcSGvk=",
+        "lastModified": 1736320768,
+        "narHash": "sha256-nIYdTAiKIGnFNugbomgBJR+Xv5F1ZQU+HfaBqJKroC0=",
         "owner": "NixOS",
         "repo": "nixpkgs",
-        "rev": "11b60e4f80d87794a2a4a8a256391b37c59a1ea7",
+        "rev": "4bc9c909d9ac828a039f288cf872d16d38185db8",
        "type": "github"
       },
       "original": {
@@ -70,22 +58,37 @@
     },
     "rust-overlay": {
       "inputs": {
-        "flake-utils": "flake-utils_2",
         "nixpkgs": "nixpkgs_2"
       },
       "locked": {
-        "lastModified": 1664074880,
-        "narHash": "sha256-/V1TX4HLADElvi3MuuIbNdvzR/HmNzbYRemKBjX/5YY=",
+        "lastModified": 1736572187,
+        "narHash": "sha256-it8mU8UkbaeVup7GpCI6n2cWPJ/O4U980CxKAMKUGF0=",
         "owner": "oxalica",
         "repo": "rust-overlay",
-        "rev": "45140fa526b1cb85498f717e355c79a54367cb1d",
+        "rev": "06871d5c5f78b0ae846c5758702531b4cabfab9b",
         "type": "github"
       },
       "original": {
         "owner": "oxalica",
+        "ref": "snapshot/2025-01-11",
         "repo": "rust-overlay",
         "type": "github"
       }
+    },
+    "systems": {
+      "locked": {
+        "lastModified": 1681028828,
+        "narHash": "sha256-Vy1rq5AaRuLzOxct8nz4T6wlgyUR7zLU309k9mBC768=",
+        "owner": "nix-systems",
+        "repo": "default",
+        "rev": "da67096a3b9bf56a91d16901293e51ba5b49a27e",
+        "type": "github"
+      },
+      "original": {
+        "owner": "nix-systems",
+        "repo": "default",
+        "type": "github"
+      }
     }
   },
   "root": "root",

View File

@@ -1,9 +1,9 @@
 {
   description = "Finite Difference Time Domain simulation binaries";
   inputs = {
-    nixpkgs.url = "nixpkgs/nixos-22.05";
-    flake-utils.url = github:numtide/flake-utils;
-    rust-overlay.url = github:oxalica/rust-overlay;
+    nixpkgs.url = "nixpkgs/nixos-23.11";
+    flake-utils.url = "github:numtide/flake-utils";
+    rust-overlay.url = "github:oxalica/rust-overlay/snapshot/2025-01-11"; #< TODO: update/un-pin (last commit before rust toolchain < 2024-01-01 was removed)
   };
   outputs = { self, nixpkgs, flake-utils, rust-overlay }:
@@ -15,8 +15,10 @@
       };
       rust-toolchain = pkgs.rust-bin.fromRustupToolchainFile ./rust-toolchain.toml;
       python-packages = pypkg: with pypkg; [
+        natsort
         pandas
         plotly
+        scipy
       ];
       python3 = pkgs.python3.withPackages python-packages;
     in

View File

@@ -1,5 +1,5 @@
 [toolchain]
-channel = "nightly-2022-08-29"
+channel = "nightly-2023-01-21"
 components = [ "rust-src", "rustc-dev", "llvm-tools-preview" ]
 targets = [ "x86_64-unknown-linux-gnu" ]
 profile = "default"