Commit Graph

257 Commits

Author SHA1 Message Date
e5c8bcff95 Driver: remove dead add_classical_boundary_explicit function 2022-08-22 00:51:53 -07:00
ff13a1e96c driver: address a TODO 2022-08-22 00:43:07 -07:00
24b82037b4 Stimulus: parameterize over R.
this saves us from a `mem::transmute` in the sim code to get
`Fields<R>`.
2022-08-22 00:37:34 -07:00
f0fc324188 Stimulus: remove unused eval_into trait method 2022-08-21 21:41:27 -07:00
c02e5427d4 spirv tests: port to R32
this gives better debug info
2022-08-21 20:46:57 -07:00
527e2746ed driver: TODO about optimization 2022-08-21 19:25:20 -07:00
75a5041ed6 diagnostics: nicer formatting 2022-08-21 19:18:42 -07:00
98d6a5b34f spirv: instrument the device read/write operations 2022-08-21 18:51:51 -07:00
6c9a6e1ffa driver: allow the user to configure the number of steps to go between stimulus application 2022-08-21 18:22:11 -07:00
a414bd77d4 diagnostics: break out a variable to make this code cleaner 2022-08-21 18:12:31 -07:00
850a7e773f diagnostics: rename time_spent_{foo} -> time_{foo} 2022-08-21 18:11:03 -07:00
a38734a1ed diagnostics: instrument the stimulus and stimulus blocked time 2022-08-21 18:10:09 -07:00
18dd66530a driver: evaluate stimulus in a background thread
this boosts fps from 920 to roughly 1150
2022-08-21 16:20:46 -07:00
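A minimal sketch of the idea in 18dd66530a, evaluating the stimulus on a background thread so the simulation loop rarely waits for it. This is not the repository's code: `Rendered`, `render_stimulus`, and `spawn_prefetch` are hypothetical names, and the bounded channel is just one assumed way to limit how far ahead the worker runs.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Hypothetical stand-in for a stimulus rendered for one time step.
pub struct Rendered {
    pub step: u64,
    pub values: Vec<f32>,
}

/// Hypothetical, deliberately cheap stimulus evaluation.
pub fn render_stimulus(step: u64, cells: usize) -> Rendered {
    let values = (0..cells)
        .map(|i| step as f32 * 0.5 + i as f32)
        .collect();
    Rendered { step, values }
}

/// Spawn a worker that renders steps `0..steps` ahead of the consumer.
/// `depth` bounds the queue, so the worker can only run that far ahead.
pub fn spawn_prefetch(steps: u64, cells: usize, depth: usize) -> Receiver<Rendered> {
    let (tx, rx) = sync_channel(depth);
    thread::spawn(move || {
        for step in 0..steps {
            // `send` blocks while the queue is full and fails once the
            // receiver has been dropped, which ends the worker.
            if tx.send(render_stimulus(step, cells)).is_err() {
                break;
            }
        }
    });
    rx
}
```

The consumer can iterate the returned `Receiver` in step order. Iteration ends once the worker has sent its last step and dropped the sender.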
7b848bcd16 driver: hide more behind the StimAccess type 2022-08-20 19:08:15 -07:00
c5cede6c6e driver: hide the stimulus stuff behind a wrapper
this will make prefetch cleaner to implement
2022-08-20 18:58:31 -07:00
053943df01 add Stimulus::render() and use it inside the driver and SpirvSim 2022-08-20 17:36:23 -07:00
4f229a51b1 rename RenderedVectorField -> RenderedStimulus 2022-08-20 17:08:26 -07:00
cd2917c8a5 driver, sim: use RenderedVectorField to simplify/optimize sim-internal rendering 2022-08-20 17:07:25 -07:00
69a603920f add a RenderedVectorField. maybe more accurately called a rendered stimulus?
used to represent a stimulus which has been rendered for a specific time with specific simulation parameters.
2022-08-20 17:05:58 -07:00
ff4209ce78 SpirvSim: remove a Vec copy from the stimulus evaluation
this boosts perf from 562 fps -> 900-ish for the multi-core-inverter
```
t=2.73e-9 frame 141700 fps: 901.93 (sim: 154.0s, stim: 30.6s, [render: 92.7s], blocked: 0.0s, render_prep: 0.7s, other: 2.5s)
```

we're now spending more CPU time rendering the measurements
than computing the stimulus
2022-08-19 04:55:50 -07:00
87c24c739c spirv: call Stimulus::at instead of Stimulus::eval_into
this *lowers* perf from 595 fps -> 562 fps
2022-08-19 04:49:18 -07:00
570917cae5 stim: Gated: no longer a Stimulus 2022-08-19 04:34:20 -07:00
8df001773f eval_into: remove the scale parameter
this actually seems to drop perf from 637 -> 595 ish?

i suppose the compiler was able to fold the time multiplier in with the
scale multiplier? idk, somewhat surprised.
2022-08-19 04:26:58 -07:00
ad5f064584 stim: Simplify the Exp implementation. it's no longer a Stimulus 2022-08-19 04:14:33 -07:00
77124fcdaf driver: implement an optimized stimulus adapter over ModulatedVectorField
this boosts perf from 520fps -> 632fps.

it does some unnecessary clones.
but it looks like the bulk of the inefficiency resides inside
the sim/spirv/ code though.
it might be that this is nearly memory-bottlenecked.
if so, backgrounding it might be sensible.
2022-08-19 03:54:43 -07:00
35dbdffda7 driver: lay some scaffolding to allow us to optimize the stimulus in future 2022-08-18 22:19:50 -07:00
ffda00b796 stim: convert CurlStimulus to a CurlVectorField and use ModulatedVectorField
this opens the door to caching the vector field stuff.
2022-08-18 20:47:36 -07:00
9461cc7781 stim: introduce a VectorField trait which we'll use to build a more structured approach to Stimulus 2022-08-18 17:08:44 -07:00
cf2d21f975 Stimulus: change at method to accept feat_size: f32, loc: Index 2022-08-18 16:21:21 -07:00
6750feef8d stim: remove TimeVarying3
`TimeVarying`(1) is enough for what we want.
2022-08-18 15:51:54 -07:00
570f058ee1 rename AbstractStimulus -> Stimulus 2022-08-18 15:27:18 -07:00
60e44d6d4d rename Stimulus -> RegionGated 2022-08-18 15:22:28 -07:00
eb406ea46f UniformStimulus: use Fields internally 2022-08-18 15:19:22 -07:00
6e7ae48d86 stim: remove the extra norm call in CurlStimulus application
we call `with_mag` after, making it redundant.
2022-08-18 14:28:33 -07:00
454307325b stim: add a scale parameter to AbstractStimulus::eval_into
this boosts perf from 571fps -> 620-ish.
2022-08-18 04:33:00 -07:00
b9581b79b2 sim: add AbstractStimulus::eval_into for bulk evaluation 2022-08-18 04:11:04 -07:00
07fa4042a3 cross: list: add IntoVec trait 2022-08-18 04:00:41 -07:00
300c11f5ca stim: use a Visitor instead of a FoldOp for eval
boosts perf from 420 -> 530 fps
2022-08-18 02:52:57 -07:00
2a9c065cb0 cross: list: allow visit to be mutable 2022-08-18 02:45:15 -07:00
5cc1c310b5 AbstractStimulus: add bulk eval_into operation. 2022-08-18 01:45:09 -07:00
a247b861e1 cross: hide the iteration features behind a flag
they don't compile on spirv due to the inherent use of Options,
but they'll be useful in the CPU-side code.
2022-08-17 21:14:21 -07:00
ee98e1a060 stim: re-express the AbstractStimulus list op as a fold
this gives a big perf boost: 10.5 fps -> 446 fps.

still far lower than the 720 fps we got on an ordinary Vec<Box<dyn
AbstractRegion>>. i think we had achieved 730 using the old
ListVisitor.

it's probably not worth list-ifying the stimuli; at least not at this
level. at the least, we probably want only 5 stimuli: one per core.
if we did that, the stimuli could even have all the same typename,
and be put into a plain old array; no boxing.
2022-08-17 03:28:52 -07:00
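A rough illustration of the closing remark in ee98e1a060: if every stimulus has the same concrete type, five of them fit in a plain array and need no boxing. The names below (`Stimulus::at` taking only a time, `CoreStimulus`, `five_cores`) are simplified, hypothetical stand-ins, not the repository's types, and the sketch makes no performance claim.

```rust
/// Simplified, hypothetical stimulus trait: amplitude at time `t`.
pub trait Stimulus {
    fn at(&self, t: f32) -> f32;
}

/// Hypothetical per-core stimulus: constant amplitude inside a time window.
#[derive(Clone, Copy)]
pub struct CoreStimulus {
    pub amplitude: f32,
    pub start: f32,
    pub end: f32,
}

impl Stimulus for CoreStimulus {
    fn at(&self, t: f32) -> f32 {
        if t >= self.start && t < self.end {
            self.amplitude
        } else {
            0.0
        }
    }
}

/// Boxed trait objects: each call to `at` goes through dynamic dispatch.
pub fn sum_boxed(stims: &[Box<dyn Stimulus>], t: f32) -> f32 {
    stims.iter().map(|s| s.at(t)).sum()
}

/// Plain array of one concrete type: no boxing, statically dispatched.
pub fn sum_array(stims: &[CoreStimulus; 5], t: f32) -> f32 {
    stims.iter().map(|s| s.at(t)).sum()
}

/// Build five overlapping windows, one per (hypothetical) core.
pub fn five_cores() -> [CoreStimulus; 5] {
    std::array::from_fn(|i| CoreStimulus {
        amplitude: 1.0,
        start: i as f32,
        end: i as f32 + 2.0,
    })
}
```

Both functions compute the same sum. The array version only differs in that the compiler knows the concrete type and the element count.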
4c5c978053 whitespace nits 2022-08-16 01:29:35 -07:00
fad70f45c1 stim: use Map + Sum for evaluating stimuli lists 2022-08-16 01:11:46 -07:00
e2728e0303 stim: impl Add for Fields to simplify some of this code 2022-08-16 00:17:23 -07:00
1a86fb5ca3 cross: list: fold MaybeMeta and Meta into one trait 2022-08-15 02:32:47 -07:00
19893157fa port: legacy sim accessors test to spirv 2022-08-14 19:16:09 -07:00
ee93c22f4a app: multi_core_inverter: perf: move the stimulus Gating to outside the CurlStimulus
the region.contains() logic is much more expensive than the time bounds
check.
this gets an easy 50% perf boost to the ENTIRE simulation
2022-08-13 15:00:56 -07:00
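A minimal sketch of the reordering described in ee93c22f4a: test the cheap time bounds once per step, and only run the per-cell `contains` check when the gate is open. `TimeGate`, `Sphere`, `GatedStimulus`, and `apply` are hypothetical names chosen for illustration; the real region and stimulus types are assumed to be more involved.

```rust
/// Hypothetical time window during which a stimulus is active.
pub struct TimeGate {
    pub start: f32,
    pub end: f32,
}

impl TimeGate {
    pub fn is_active(&self, t: f32) -> bool {
        t >= self.start && t < self.end
    }
}

/// Hypothetical region; stands in for a more expensive `contains` test.
pub struct Sphere {
    pub center: [f32; 3],
    pub radius: f32,
}

impl Sphere {
    pub fn contains(&self, p: [f32; 3]) -> bool {
        let d2: f32 = (0..3).map(|i| (p[i] - self.center[i]).powi(2)).sum();
        d2 <= self.radius * self.radius
    }
}

/// The gate wraps the region-based stimulus instead of living inside it.
pub struct GatedStimulus {
    pub gate: TimeGate,
    pub region: Sphere,
    pub amplitude: f32,
}

/// Adds the stimulus into `out` and returns how many `contains` calls ran.
pub fn apply(stim: &GatedStimulus, t: f32, points: &[[f32; 3]], out: &mut [f32]) -> usize {
    // One cheap check per step; skips every region test when inactive.
    if !stim.gate.is_active(t) {
        return 0;
    }
    let mut checks = 0;
    for (p, o) in points.iter().zip(out.iter_mut()) {
        checks += 1;
        if stim.region.contains(*p) {
            *o += stim.amplitude;
        }
    }
    checks
}
```

The returned count is only there to make the saving visible: outside the time window no region test runs at all.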
bbb8b2b9ae driver: better APIs around list-based stimuli 2022-08-13 03:51:01 -07:00
858e787c19 driver: allow preserving the Stimuli as a concrete List 2022-08-12 18:03:10 -07:00