From test runner to test reporter (PIP Cockpit Solutions)

j4n_bur53 · May 2, 2026, 9:23am

I want to lift this discussion to the automation issue:

Some thoughts about a test framework for PIPS
https://discourse.prolog-lang.org/t/some-thoughts-about-a-test-framework-for-pips/108

In the past, the gold standard for a testing solution was to be
happy with cursory tests of multiple Prolog systems, done by
manually inspecting some logs on your local machine.

The challenges in the current world could be much more complex.
The problem is usually multiple platforms: not only Linux, Mac and
Windows, but also different CPU architectures such as Intel, AMD and ARM.

This gives huge testing matrices, with distributed results. A
cockpit might then create a Single Pane of Glass (SPOG), giving
you that feeling of control, like Commander Spock looking into his scope.

Logtalk has some Allure-based tooling, but there is a gap
between offering some tooling and practicing the tooling itself:
I can’t find any published reports. Also, other Prolog systems

might prefer more lightweight solutions that don’t depend
on 3rd party tooling. For example, this proposal creates two
dependencies, namely Logtalk and Allure:

Some thoughts about a test framework for PIPS

Do you want a nice report?
$ logtalk_tester -p eclipse -f xunit
$ logtalk_allure_report
$ allure open
Similar on Windows. Just add the .ps1 extension.

But it lacks the galactic matrix automation! How do you scale it so
that the testing gets fully automated? There is no suggestion
of what scripting to use when platforms are mixed, like Linux,

Mac and Windows! Instead, via the .ps1 extension, it makes the
rather repelling suggestion to duplicate scripting for different platforms, since
the .ps1 extension refers to PowerShell. Jan W. has repeatedly

demonstrated reporting with graphical output. I recently tried the
same, with an SVG generator written 100% in Prolog. What is missing
is a bar chart color legend. It’s actually a benchmark, not a compliance test:

But under the hood there is also an automation framework
based on Java Ant Tasks, which has the advantage that it is
portable across Linux, Mac and Windows and only requires

an installed JVM. But it might not be to everybody’s taste.
From seeing Ulrich Neumerkel’s test cases, I also remember
some cross Prolog system testing. But the issue usually starts

with the question of the test result format. Allure suggests JSON-based
test result files. But you can also use a Prolog database format,
just the usual Prolog facts inside a plain text file, as the test result.
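
To make the "Prolog facts in a plain text file" idea concrete, here is a minimal sketch of how a result collector might write such facts and fold them into matrix cells. It is written in Python only for illustration. The fact schema test_result(System, OS, Arch, Test, Status) and all sample names are hypothetical and not taken from Logtalk, Allure or any existing tool, and the quoting helper assumes names without control characters.

```python
import re
from collections import Counter

# Hypothetical fact schema, invented for this sketch:
#   test_result(System, OS, Arch, Test, Status).

# Atoms that need no quotes: lowercase letter, then letters, digits, underscores.
_PLAIN_ATOM = re.compile(r"[a-z][A-Za-z0-9_]*\Z")


def prolog_atom(text):
    """Return text as a Prolog atom, quoting it when it is not a plain atom."""
    if _PLAIN_ATOM.match(text):
        return text
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def to_fact(system, os_name, arch, test, status):
    """Render one result record as a single Prolog fact."""
    args = ", ".join(prolog_atom(a) for a in (system, os_name, arch, test, status))
    return "test_result(" + args + ")."


def summarize(results):
    """Count statuses per (system, os, arch) cell of the testing matrix."""
    cells = {}
    for system, os_name, arch, _test, status in results:
        cells.setdefault((system, os_name, arch), Counter())[status] += 1
    return cells


# Invented sample records, not real measurements.
results = [
    ("system_a", "linux", "x86_64", "append/3 #1", "pass"),
    ("system_a", "linux", "x86_64", "append/3 #2", "fail"),
    ("system_b", "windows", "arm64", "append/3 #1", "pass"),
]

facts = [to_fact(*record) for record in results]
summary = summarize(results)

for fact in facts:
    print(fact)
```

Each platform in the matrix could append its facts to such a file, and the cockpit side would only have to consult or parse one plain text format.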

j4n_bur53 · May 2, 2026, 10:26am

A format for a test result snapshot matrix leads more
or less immediately to historic data. If combined with
Continuous Integration and Continuous Delivery/Deployment

(CI/CD) pipelines, either main pipelines or special farming factories,
you can get historic reports. For test results you might
see the number of tests and their composition growing

over time. For benchmarks, here is an example of a
novel pipeline that recently went online for a CPython JIT project:

https://www.doesjitgobrrr.com/

Pretty cool with the rich charts and all the interactive knobs!
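
As a rough sketch of how dated snapshots turn into a historic report, the following Python fragment computes the number of tests and the number of passes per snapshot. The snapshot layout (an ISO date mapped to a list of test and status pairs) and the sample data are assumptions made up for illustration, not the format of any particular pipeline.

```python
from collections import Counter


def trend(snapshots):
    """Return (date, total, passed) rows, oldest snapshot first."""
    rows = []
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    for date in sorted(snapshots):
        statuses = Counter(status for _test, status in snapshots[date])
        rows.append((date, sum(statuses.values()), statuses["pass"]))
    return rows


# Invented sample snapshots, not real results.
snapshots = {
    "2026-05-02": [("t1", "pass"), ("t2", "fail"), ("t3", "pass")],
    "2026-05-01": [("t1", "pass"), ("t2", "fail")],
}

history = trend(snapshots)
for date, total, passed in history:
    print(f"{date} total={total} passed={passed}")
```

The rows are the kind of series a chart of test growth over time could be drawn from.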