package irmin-bench
Module Irmin_traces.Trace_stat_summary
Conversion of a Stat_trace to a summary that is both pretty-printable and exportable to JSON.
The main type t here isn't versioned like a Stat_trace.t is.
Computing a summary may take a long time if the input Stat_trace is long: expect a throughput of roughly 1000 commits per second.
This file is NOT meant to be used from Tezos, as opposed to some other "trace_*" files.
A stat trace can be chunked into blocks. A block is made of two phases: first the buildup, then the commit.
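The chunking described above can be illustrated with a minimal sketch. The `row` type and the `blocks` function below are hypothetical and much simpler than the real trace rows; the sketch only assumes that a commit row closes the block whose buildup precedes it, and that trailing operations without a commit do not form a block.

```ocaml
(* Hypothetical, simplified row type: the real stat trace rows carry
   timestamps and statistics, which are omitted here. *)
type row = Op of string | Commit

(* Split a list of rows into blocks. Each block is represented by the
   names of the operations of its buildup phase; the [Commit] row closes
   the block. Operations after the last commit are dropped. *)
let blocks (rows : row list) : string list list =
  let _, closed =
    List.fold_left
      (fun (buildup, closed) row ->
        match row with
        | Op name -> (name :: buildup, closed)
        | Commit -> ([], List.rev buildup :: closed))
      ([], []) rows
  in
  List.rev closed
```

For instance, `blocks [ Op "add"; Op "find"; Commit; Op "remove"; Commit; Op "mem" ]` yields two blocks, since the trailing `Op "mem"` is not followed by a commit.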
type bag_stat = {
value_before_commit : Vs.t;
value_after_commit : Vs.t;
diff_per_block : Vs.t;
diff_per_buildup : Vs.t;
diff_per_commit : Vs.t;
}
Summary of an entry contained in Def.bag_of_stat.
Properties of such a variable:
- Is sampled before each commit operation.
- Is sampled after each commit operation.
- Is sampled in the header.
- Most of these entries are expected to grow linearly. This implies that no smoothing is necessary for the downsampled curve in these cases, and that the histogram is best viewed on a linear scale, as opposed to a log scale. The other entries are summarised using ~is_linearly_increasing:false.
The value_after_commit is initially fed with the value in the header (i.e. the value recorded just before the start of the play).
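To make the relationship between the five fields of bag_stat more concrete, here is a minimal sketch. It rests on assumptions that are not stated on this page: that the per-buildup difference is the value before a commit minus the value after the previous commit, that the per-commit difference is the value after minus the value before the same commit, and that the per-block difference is their sum. The function `diffs` and its arguments are hypothetical; the real module aggregates such values into `Vs.t` summaries rather than returning a list.

```ocaml
(* [header] is the value recorded in the header, used as the first
   "after commit" value. [samples] holds one (before, after) pair per
   commit. The result holds one (block, buildup, commit) difference
   triple per block, in block order. *)
let diffs ~(header : float) (samples : (float * float) list) :
    (float * float * float) list =
  let _, acc =
    List.fold_left
      (fun (prev_after, acc) (before, after) ->
        let buildup = before -. prev_after in
        let commit = after -. before in
        let block = after -. prev_after in
        (after, (block, buildup, commit) :: acc))
      (header, []) samples
  in
  List.rev acc
```

With `~header:10.` and the samples `[ (12., 15.); (15., 21.) ]`, the first block grows by 2 during the buildup and by 3 during the commit, and the second block grows only during its commit.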
type pack = {
finds : finds;
appended_hashes : bag_stat;
appended_offsets : bag_stat;
inode_add : bag_stat;
inode_remove : bag_stat;
inode_of_seq : bag_stat;
inode_of_raw : bag_stat;
inode_rec_add : bag_stat;
inode_rec_remove : bag_stat;
inode_to_binv : bag_stat;
inode_decode_bin : bag_stat;
inode_encode_bin : bag_stat;
}
type t = {
summary_timeofday : float;
summary_hostname : string;
curves_sample_count : int;
moving_average_half_life_ratio : float;
config : Def.config;
hostname : string;
word_size : int;
timeofday : float;
timestamp_wall0 : float;
timestamp_cpu0 : float;
elapsed_wall : float;
elapsed_wall_over_blocks : Utils.curve;
elapsed_cpu : float;
elapsed_cpu_over_blocks : Utils.curve;
op_count : int;
span : Span.map;
block_count : int;
cpu_usage : Vs.t;
index : index;
pack : pack;
tree : tree;
gc : gc;
disk : disk;
store : store;
}
Accumulator for the span field of t.
Summary computation for statistics recorded in Def.bag_of_stat.
Accumulator for the store field of t.
val major_heap_top_bytes_folder :
'a Def.header_base ->
int ->
([> `Commit of 'b Def.commit_base ], Utils.Resample.acc, float list)
Utils.Parallel_folders.folder
Build a resampled curve of gc.top_heap_words.
val elapsed_wall_over_blocks_folder :
'a Def.header_base ->
int ->
([> `Commit of 'b Def.commit_base ], Utils.Resample.acc, float list)
Utils.Parallel_folders.folder
Build a resampled curve of timestamps.
val elapsed_cpu_over_blocks_folder :
'a Def.header_base ->
int ->
([> `Commit of 'b Def.commit_base ], Utils.Resample.acc, float list)
Utils.Parallel_folders.folder
Build a resampled curve of timestamps.
val merge_durations_folder :
(Def.pack Def.row_base, float list, float list) Utils.Parallel_folders.folder
Build a list of all the merge durations.
val cpu_usage_folder :
'a Def.header_base ->
int ->
([> `Commit of 'b Def.commit_base ], float * float * Vs.acc, Vs.t)
Utils.Parallel_folders.folder
val misc_stats_folder :
'a Def.header_base ->
([> `Commit of 'b Def.commit_base ],
float * float * int,
float * float * int)
Utils.Parallel_folders.folder
Subtract the first timestamp from the last one and count the number of spans.
Fold over row_seq and produce the summary.
Parallel Folders
Almost all entries in t require an independent fold over the rows of the stat trace, but we want:
- not to fully load the trace in memory,
- not to reread the trace from disk once for each entry,
- this current file to be verbose and simple,
- to have fun with GADTs and avoid mutability.
All the boilerplate is hidden behind Utils.Parallel_folders, a data structure that holds all folder functions, takes care of feeding the rows to those folders, and preserves the types.
In the code below, pf0 is the initial parallel folder, before the first accumulation. Each |+ ... statement declares an acc, accumulate, finalise triplet, i.e. a folder.
val acc : acc is the initial empty accumulation of a folder.
val accumulate : acc -> row -> acc needs to be folded over all rows of the stat trace. Calling Parallel_folders.accumulate pf row will feed row to every folder.
val finalise : acc -> v has to be applied to the final acc of a folder in order to produce the final value of that folder, which is meant to be stored in Trace_stat_summary.t. Calling Parallel_folders.finalise pf will finalise all folders and pass their results to construct.
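The idea of typed parallel folders can be sketched in a few lines of OCaml. The code below is not the implementation of Utils.Parallel_folders: it is a simplified illustration that chains folders with a plain `Cons` constructor instead of the `|+` operator, and every name in it (`folders`, `summary`, `pf0`, `summarise`, the three example folders) is hypothetical. It shows how a GADT lets a single pass feed each row to every folder while keeping track of the type of each result, which is then passed to a `construct` function.

```ocaml
(* A folder is an [acc], [accumulate], [finalise] triplet. *)
type ('row, 'acc, 'v) folder = {
  acc : 'acc;
  accumulate : 'acc -> 'row -> 'acc;
  finalise : 'acc -> 'v;
}

(* A heterogeneous list of folders. ['f] is the type of the function
   that consumes the folders' results and ['res] is what it returns. *)
type ('row, 'f, 'res) folders =
  | Nil : ('row, 'res, 'res) folders
  | Cons :
      ('row, 'acc, 'v) folder * ('row, 'f, 'res) folders
      -> ('row, 'v -> 'f, 'res) folders

(* Feed one row to every folder, without mutation. *)
let rec accumulate :
    type row f res. (row, f, res) folders -> row -> (row, f, res) folders =
 fun pf row ->
  match pf with
  | Nil -> Nil
  | Cons (fd, rest) ->
      Cons ({ fd with acc = fd.accumulate fd.acc row }, accumulate rest row)

(* Finalise every folder and pass the results to [construct]. *)
let rec finalise : type row f res. (row, f, res) folders -> f -> res =
 fun pf construct ->
  match pf with
  | Nil -> construct
  | Cons (fd, rest) -> finalise rest (construct (fd.finalise fd.acc))

(* Example: three independent statistics computed in a single pass. *)
type summary = { count : int; total : float; maximum : float }

let count_folder = { acc = 0; accumulate = (fun n _ -> n + 1); finalise = Fun.id }
let total_folder = { acc = 0.; accumulate = ( +. ); finalise = Fun.id }

let max_folder =
  {
    acc = neg_infinity;
    accumulate = (fun m x -> if x > m then x else m);
    finalise = Fun.id;
  }

(* The initial parallel folder, before the first accumulation. *)
let pf0 = Cons (count_folder, Cons (total_folder, Cons (max_folder, Nil)))

let summarise (rows : float list) : summary =
  let pf = List.fold_left accumulate pf0 rows in
  finalise pf (fun count total maximum -> { count; total; maximum })
```

In this sketch the rows are read once and no folder needs to know about the others. Adding a statistic means adding one `Cons` cell and one argument to the function passed to `finalise`, and the type checker rejects any mismatch between the two.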
Turn a stat trace into a summary.
The number of blocks to consider may be provided in order to truncate the summary.