package statocaml_changelog
Sources
md5=df68d4831c73322a834d4483c2429bae
sha512=7a714c81bf552d04deb61ac4fd7fb0ba8a9afe811762dcc9f154c5bad4009ea196c7e1f21cb342f657e21d1b554644c469d2853351796685461f641ce4eff705
Description
Build a typed changelog from json
Published: 27 Jun 2025
README
Statocaml
Statocaml is a tool to gather development data from a GitHub repository and generate statistics.
It was initially developed to study the development of OCaml, but it can be used on any GitHub repository.
A presentation (in French) is available here.
Development is hosted here.
Installation
opam install statocaml_go statocaml_fetch
or
opam pin add https://gitlab.inria.fr/guesdon/statocaml.git
Statocaml uses some external tools which must be installed too:
Usage
You will need a valid GitHub token to fetch information from GitHub.
Configuration file
Create a configuration file conf.json (in JSON format) looking like this one (here the ocaml/conf.json file for the ocaml/ocaml GitHub repository):
{
  "data_dir": "data",       // where to store data
  "github": {
    "token": "...",         // put your token here
    "cache_dir": "cache",   // directory used for caching
    "user": "ocaml",        // github user the repository belongs to
    "repo": "ocaml",        // the repository to fetch
    "fetch_gh_users": [
      "bnigito", "bstarynk", "cagdasbozman", "claudemarche", "enaudon",
      "Lucccyo", "clappski", "djs55", "dweil", "flindgren", "gnecula",
      "Hirrolot", "jaked", "jeffsco", "jessicah", "jserot", "JuliaLawall",
      "kerneis", "khooyp", "klartext", "kyleheadley", "lebotlan", "letouzey",
      "MadCoder", "maverickwoo", "mkoconnor", "monate", "mrvn",
      "Nick-Chapman", "nickgian", "pdenys", "pocarist", "revskill10",
      "rdicosmo", "roshanjames", "sacerdot", "signoles", "smimram", "strub",
      "tertium", "TheAspiringHacker", "vog", "zoep"
    ] // additional users to fetch
  }
}
The fetch_gh_users field is required when a changelog is provided and it refers to contributions made before the code was imported to GitHub. In that case some contributors are not associated with their GitHub accounts; this field forces these accounts to be fetched, so that they can be referenced in the file given with the --gh-users command line option of the statocaml_go tool.
The configuration file will be overwritten by the statocaml_fetch tool so that it can perform incremental fetches, i.e. avoid downloading all the data again every time the local data is updated.
The GitHub token can be indicated in three ways in the token field:
as a string value of the field:
{ ... "token": "ghp_Km...", ... }
in an environment variable whose name is given as a string value of the field, beginning with $:
{ ... "token": "$GITHUB_TOKEN", ... }
on the first line of a file whose name is given as a string value of the field, beginning with . (relative filename) or / (absolute filename):
{ ... "token": "./github.token", ... }
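The three forms above can be told apart by the first character of the field value. The following Python sketch illustrates that rule only; it is not taken from Statocaml's code. The function name resolve_token and its env parameter are hypothetical, and stripping whitespace from the first line of the token file is an assumption.

```python
import os


def resolve_token(value, env=os.environ):
    """Resolve the "token" field following the three documented forms."""
    if value.startswith("$"):
        # "$NAME": the token is read from the environment variable NAME.
        return env[value[1:]]
    if value.startswith(".") or value.startswith("/"):
        # "./file" or "/abs/file": the token is the first line of that file.
        # Stripping surrounding whitespace is an assumption of this sketch.
        with open(value, encoding="utf-8") as f:
            return f.readline().strip()
    # Any other string is taken to be the token itself.
    return value
```

With this reading, "$GITHUB_TOKEN" looks up the GITHUB_TOKEN environment variable, while "./github.token" reads a file relative to the current directory.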
Fetching data
In the directory where your conf.json is, run:
$ statocaml_fetch
Use the -c option to indicate a different configuration file. This operation can take many hours (or days), depending on the size of the repository (number of commits, issues, pull requests, ...).
To update the locally fetched data, run the same command. It fetches only the data created since the last fetch, based on a last_event_date field that is automatically added to the configuration file.
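The idea of an incremental fetch driven by a stored date can be sketched as follows. This is a generic illustration in Python, not Statocaml's implementation: the event records, the ISO 8601 date format, and the function name new_events are all assumptions, since the actual format of last_event_date is not described here.

```python
from datetime import datetime


def new_events(events, last_event_date):
    """Return the events strictly newer than last_event_date and the new marker.

    `events` is a hypothetical list of dicts with an ISO 8601 "date" string.
    """
    cutoff = datetime.fromisoformat(last_event_date)
    fresh = [e for e in events if datetime.fromisoformat(e["date"]) > cutoff]
    # The marker advances to the newest event seen; it is unchanged if nothing is new.
    newest = max(
        (e["date"] for e in fresh),
        key=datetime.fromisoformat,
        default=last_event_date,
    )
    return fresh, newest
```

Running the same step twice in a row returns no events the second time, which is what makes repeated updates cheap.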
Generating the statistics
Run
$ statocaml_go
By default, the configuration file used is conf.json, but another one can be specified with the -c option. Beware that relative directories in the configuration file are resolved from the current directory, not from the directory containing the configuration file.
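The difference between the two resolution rules can be made concrete with a small Python sketch. Both helper names are hypothetical and the paths are made-up examples; only the first rule corresponds to the behaviour described above.

```python
import os


def resolve_from_cwd(path, cwd):
    """Rule described above: a relative path is resolved from the current directory."""
    return os.path.normpath(os.path.join(cwd, path))


def resolve_from_config(path, config_file):
    """Rule that is NOT applied: resolving from the configuration file's directory."""
    return os.path.normpath(os.path.join(os.path.dirname(config_file), path))


# Example: the tool is run from /home/me/stats with -c ocaml/conf.json,
# and the configuration contains "data_dir": "data".
print(resolve_from_cwd("data", "/home/me/stats"))
print(resolve_from_config("data", "/home/me/stats/ocaml/conf.json"))
```

In this example the data directory is /home/me/stats/data, so running the tools from the directory that holds conf.json avoids surprises.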
statocaml_go can use a changelog in JSON format (with option --changelog) to get the releases and contributors of each issue or pull request. For OCaml, Octachron's ocaml-changelog-analyzer is used to convert the OCaml Changes file to a JSON file.
Main command line options are:
--html-outdir to indicate the directory where the web site is generated; default is statocaml-output.
--events to indicate a file with additional events to display in some visualizations. The file is in JSON format, of the form
[
  { "label": "my event 1", "date": (2025, 12, 31)},
  { "label": "my event 2", "date": (2025, 07, 14)},
  ...
]
--gh-users to indicate a JSON file with information about contributors. This file is used to merge contributors found under different names or emails. It has the following form:
[
  { "names": ["Alan Turing", "Turing Alan"],
    "gh_login": "alanturing",
    "emails": ["alan.turing@foo.bar", "alan@turing.uk"]
  },
  ...
]
--groups to indicate a JSON file containing the definition of groups. This produces additional statistics for these groups, by merging the statistics of their members. A group may have a GitHub account indicated. The file has the following form:
{
  "Group 1": {
    "members": {
      "githublogin1": [{"start":"2023-01-01T00:00:00-00:00"}],
      "githublogin2": [{"start":"2013-11-01T00:00:00-00:00","stop":"2017-11-01T00:00:00-00:00"}],
      "githublogin3": [],
      ...
    },
    "gh_account": "githubloginofgroup"
  },
  ...
}
--gui to launch a graphical user interface to create plots according to entered parameters; these parameters can be copied into a JSON file. When this file is given with option --plots, these additional plots are integrated into the generated website.
--subsystems to indicate a file containing subsystem definitions. Subsystems are defined by a name and an id; regular expressions are used to associate filenames in the git repository with subsystems. See ocaml/subsystems.json for an example of definitions used for OCaml.
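To illustrate how a --gh-users file can merge contributors that appear under several names or emails, here is a minimal Python sketch. It only demonstrates the lookup idea on the example entry shown above; it is not Statocaml's implementation. The function names are hypothetical, and matching case-insensitively and trying the email before the name are assumptions of this sketch.

```python
import json


def build_identity_index(gh_users):
    """Map every listed name and email (lowercased) to the entry's gh_login."""
    index = {}
    for entry in gh_users:
        for key in entry.get("names", []) + entry.get("emails", []):
            index[key.lower()] = entry["gh_login"]
    return index


def canonical_login(index, name=None, email=None):
    """Return the gh_login for an author, or None if neither key is known."""
    for key in (email, name):
        if key and key.lower() in index:
            return index[key.lower()]
    return None


gh_users = json.loads("""
[ { "names": ["Alan Turing", "Turing Alan"],
    "gh_login": "alanturing",
    "emails": ["alan.turing@foo.bar", "alan@turing.uk"] } ]
""")
index = build_identity_index(gh_users)
print(canonical_login(index, name="Turing Alan"))
print(canonical_login(index, email="alan@turing.uk"))
```

Both calls resolve to the same login, so contributions recorded under either identity would be counted for a single contributor.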